Skip to content

Data model

To understand the data model, please first consult the general introduction on the specification website and the technical description of the metadata model.

The main principles are:

  • For each patient and each singular (i.e., basic or datestamped) item reference, there may exist zero or one item value.
  • For datestamped item references, the datestamp is included in the data model.
  • For each patient and each longitudinal item reference, there may exist any number of item values, and the respective datestamp is included in the data model.
  • For each patient and each record, there may exist any number of record instances. Depending on the record type, the record instance contains various dates (datestamp, start/stop dates etc.; see the introduction for details).
  • For each record instance and each record item reference belonging to that record, there may exist zero or one item value.

Currently, data is submitted by registries in a tabular format for which an Excel template is provided in the section "Downloads" section on the website (see, for instance, the SMA downloads). This Excel file contains the following tables:

  • One table for all singular items. It contains exactly one row for each patient. For every basic item reference, there exists one column for the item value. For each datestamped item reference, there exists one column for the value and one for the datestamp.
  • One table for all longitudinal items references. It contains one row for each patient and unique datestamp, so that every patient generally has multiple rows. For every longitudinal item reference, there is one column for the value.
  • One table for each record. In each record table, there is one row per record instance, and one column for every item reference belonging to that record.

This format is centered around the multiplicity of the values and ensures that there are no redunancies while each row makes sense by itself.

The following UML diagram illustrates the conceptual model (although it currently does not convey the multiplicity constraints described above):

UML model