Detailed longitudinal data
This page was last updated for R11.
Service sector data contains detailed longitudinal data with additional columns.
Sandbox directory
Detailed longitudinal data is available in the following Sandbox directory:
/finngen/library-red/finngen_R[RELEASE]/phenotype_1.0/
Note that this data is also available in Atlas and in the OMOP common data model.
Data files
The data is available in the following file:
data/finngen_R[RELEASE]_detailed_longitudinal_1.0.txt.gz
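As a small sketch, the directory and file name templates above can be combined into a full path for a given release. The helper name `detailed_longitudinal_path` is hypothetical, and the sketch assumes that the `data/` path is relative to the `phenotype_1.0` directory.

```python
from pathlib import PurePosixPath

# Templates copied from this page; [RELEASE] is the placeholder to fill in.
LIBRARY_DIR = "/finngen/library-red/finngen_R[RELEASE]/phenotype_1.0/"
DATA_FILE = "data/finngen_R[RELEASE]_detailed_longitudinal_1.0.txt.gz"


def detailed_longitudinal_path(release: int) -> PurePosixPath:
    """Return the full Sandbox path for a release number (hypothetical helper)."""
    directory = LIBRARY_DIR.replace("[RELEASE]", str(release))
    filename = DATA_FILE.replace("[RELEASE]", str(release))
    # Assumption: the data/ path is relative to the phenotype_1.0 directory.
    return PurePosixPath(directory) / filename


print(detailed_longitudinal_path(11))
```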
The data file has eleven columns:
Column | Description |
FINNGEN ID | Sample ID |
SOURCE | Register source |
EVENT_AGE | Individual's age at the event to two decimals |
APPROX_EVENT_DAY | Approximate event date: the confidential exact event date shifted by a random offset of 1-15 days in either direction |
CODE1 - CODE4 | Register source specific codes and other information |
ICDVER | ICD-code version: ICD8/9/10, ICD-O-3 |
CATEGORY | Register code sets (vocabularies) |
INDEX | Register index number. The same INDEX value within a register means that the codes have been given in the same hospital visit, or are, for example, from the same drug purchase event. |
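The following is a minimal sketch of reading the file and checking that the eleven columns listed above are present, using only the Python standard library. It assumes that the file is tab-delimited with a header row and that the sample ID column is spelled `FINNGENID` in the header; neither is stated on this page, so check the readme file first. The function `read_events` and every value in the mock rows (IDs, dates, codes, the `NA` marker) are invented for illustration.

```python
import csv
import gzip
import io

# Column names from the table above; the "FINNGENID" spelling is an assumption.
EXPECTED_COLUMNS = [
    "FINNGENID", "SOURCE", "EVENT_AGE", "APPROX_EVENT_DAY",
    "CODE1", "CODE2", "CODE3", "CODE4", "ICDVER", "CATEGORY", "INDEX",
]


def read_events(source):
    """Yield one dict per row from a gzip-compressed, delimited file.

    `source` may be a file path or a binary file object.
    """
    with gzip.open(source, mode="rt", newline="") as handle:
        reader = csv.DictReader(handle, delimiter="\t")  # assumption: tab-delimited
        missing = set(EXPECTED_COLUMNS) - set(reader.fieldnames or [])
        if missing:
            raise ValueError(f"missing columns: {sorted(missing)}")
        for row in reader:
            row["EVENT_AGE"] = float(row["EVENT_AGE"])  # age in years, two decimals
            yield row


# Invented two-row file, compressed in memory, to exercise the reader.
mock_text = (
    "\t".join(EXPECTED_COLUMNS) + "\n"
    "FG0000001\tINPAT\t45.12\t2005-03-14\tI10\tNA\tNA\t3\t10\tICD0\t17\n"
    "FG0000001\tPURCH\t46.50\t2006-08-02\tC09AA02\tNA\t123456\t1\tNA\tNA\t18\n"
)
mock_gz = io.BytesIO(gzip.compress(mock_text.encode("utf-8")))
events = list(read_events(mock_gz))
print(len(events), events[0]["SOURCE"], events[0]["EVENT_AGE"])
```

Reading row by row keeps memory use low, which may matter because the real file is large.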
Detailed information about the columns is available in the following file:
finngen_R[RELEASE]_detailed_longitudinal_readme_1.0.txt
This is the main register data in FinnGen and contains health register codes from different sources. The data is called longitudinal because it contains several events/entries of codes for the same individual recorded at different times.
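To illustrate the longitudinal structure, the sketch below merges rows that belong to the same register event and orders each individual's events by age. It is one possible approach, not the way FinnGen tools process the data. The rows are invented, the `FINNGENID` spelling is an assumption, and `timelines` is a hypothetical helper. The individual's ID is included in the grouping key as a cautious choice, since this page only says that INDEX identifies an event within a register.

```python
from collections import defaultdict

# Invented rows; only the columns needed for this example are included.
rows = [
    {"FINNGENID": "FG0000001", "SOURCE": "INPAT", "EVENT_AGE": 52.30, "CODE1": "I21", "INDEX": "7"},
    {"FINNGENID": "FG0000001", "SOURCE": "PURCH", "EVENT_AGE": 45.10, "CODE1": "C10AA01", "INDEX": "3"},
    {"FINNGENID": "FG0000001", "SOURCE": "INPAT", "EVENT_AGE": 52.30, "CODE1": "E11", "INDEX": "7"},
    {"FINNGENID": "FG0000002", "SOURCE": "OUTPAT", "EVENT_AGE": 30.75, "CODE1": "J45", "INDEX": "1"},
]


def timelines(rows):
    """Map each individual to a list of events ordered by age at the event.

    Rows sharing FINNGENID, SOURCE and INDEX are merged into a single event.
    """
    grouped = defaultdict(list)
    for row in rows:
        grouped[(row["FINNGENID"], row["SOURCE"], row["INDEX"])].append(row)

    per_person = defaultdict(list)
    for (person, source, index), members in grouped.items():
        per_person[person].append({
            "age": min(m["EVENT_AGE"] for m in members),
            "source": source,
            "index": index,
            "codes": [m["CODE1"] for m in members],
        })
    for person_events in per_person.values():
        person_events.sort(key=lambda event: event["age"])
    return dict(per_person)


result = timelines(rows)
for person, person_events in result.items():
    print(person, [(e["age"], e["source"], e["codes"]) for e in person_events])
```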
Detailed longitudinal data was presented in the FinnGen data users meeting on 12th January 2021 and can be explored using Atlas in the Sandbox.
The data is used as input for the Endpointter to determine which individuals are included in each of FinnGen's phenotypic endpoints:
A mock example of the detailed longitudinal data file is shown below:
Sources
The detailed longitudinal data file contains codes from the following registries:
Source | Description | Types of codes |
PURCH | Kela drug purchase register | Medication codes |
REIMB | Kela drug reimbursement register | Medication codes |
INPAT | Inpatient Hilmo register | Diagnosis codes |
OPER_IN | Inpatient Hilmo register - operations | Operation codes |
OUTPAT | Specialist outpatient Hilmo register | Diagnosis codes |
OPER_OUT | Specialist outpatient Hilmo register - operations | Operation codes |
PRIM_OUT | Primary health care outpatient visits | Diagnosis and operation codes |
CANC | Cancer register | Cancer codes |
DEATH | Cause of death register | Cause of death codes |
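When filtering rows by register, it can help to keep the table above as a lookup. The sketch below transcribes the SOURCE values and their code types into a dictionary; the helper `sources_with` is a hypothetical name introduced here.

```python
# SOURCE values and code types, transcribed from the table above.
SOURCE_CODE_TYPES = {
    "PURCH": "Medication codes",
    "REIMB": "Medication codes",
    "INPAT": "Diagnosis codes",
    "OPER_IN": "Operation codes",
    "OUTPAT": "Diagnosis codes",
    "OPER_OUT": "Operation codes",
    "PRIM_OUT": "Diagnosis and operation codes",
    "CANC": "Cancer codes",
    "DEATH": "Cause of death codes",
}


def sources_with(keyword):
    """Return the sources whose code type mentions the keyword (case-insensitive)."""
    keyword = keyword.lower()
    return [source for source, kind in SOURCE_CODE_TYPES.items() if keyword in kind.lower()]


diagnosis_sources = sources_with("diagnosis")
operation_sources = sources_with("operation")
print(diagnosis_sources)
print(operation_sources)
```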
The image below shows how variables in the national health registries map to the columns in the detailed longitudinal data.
Codes
The information stored in the CODE1 - CODE4 columns depends on the register source:
Source | Description | Codes |
PURCH | Kela drug purchase register | CODE1: ATC code; CODE2: Kela reimbursement code; CODE3: Product number; CODE4: Number of packages |
REIMB | Kela drug reimbursement register | CODE1: Kela reimbursement code; CODE2: ICD code |
INPAT, OUTPAT | Inpatient Hilmo register, Specialist outpatient Hilmo register | CODE1: symptom code; CODE2: cause code (e.g. CODE1 could be dementia associated with Alzheimer's disease with CODE2 as Alzheimer's disease); CODE3: ATC code for drug's adverse effect; CODE4: duration of stay |
OPER_IN, OPER_OUT | Inpatient Hilmo register - operations, Specialist outpatient Hilmo register - operations | CODE1: operation code |
PRIM_OUT | Primary health care outpatient visits | CODE1: diagnosis or operation code; CODE2: symptom code; CODE3: ATC code for drug's adverse effect |
CANC | Cancer register | CODE1: ICD-O-3 topography; CODE2: ICD-O-3 morphology; CODE3: ICD-O-3 behaviour |
DEATH | Cause of death register | CODE1: cause of death |
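Because the meaning of CODE1 - CODE4 changes with SOURCE, a per-source lookup can make rows easier to read. The sketch below transcribes the table above; `label_codes` is a hypothetical helper and the example rows are invented. How unused code columns are filled in the real file is not stated on this page, so the sketch simply passes values through unchanged.

```python
# Meaning of each code column per register source, transcribed from the table above.
_HILMO = {
    "CODE1": "symptom code",
    "CODE2": "cause code",
    "CODE3": "ATC code for drug's adverse effect",
    "CODE4": "duration of stay",
}
_OPERATION = {"CODE1": "operation code"}

CODE_MEANINGS = {
    "PURCH": {
        "CODE1": "ATC code",
        "CODE2": "Kela reimbursement code",
        "CODE3": "Product number",
        "CODE4": "Number of packages",
    },
    "REIMB": {"CODE1": "Kela reimbursement code", "CODE2": "ICD code"},
    "INPAT": _HILMO,
    "OUTPAT": _HILMO,
    "OPER_IN": _OPERATION,
    "OPER_OUT": _OPERATION,
    "PRIM_OUT": {
        "CODE1": "diagnosis or operation code",
        "CODE2": "symptom code",
        "CODE3": "ATC code for drug's adverse effect",
    },
    "CANC": {
        "CODE1": "ICD-O-3 topography",
        "CODE2": "ICD-O-3 morphology",
        "CODE3": "ICD-O-3 behaviour",
    },
    "DEATH": {"CODE1": "cause of death"},
}


def label_codes(row):
    """Return {meaning: value} for the code columns defined for the row's source."""
    meanings = CODE_MEANINGS[row["SOURCE"]]
    return {meaning: row[column] for column, meaning in meanings.items() if column in row}


# Invented rows for illustration.
purchase = {"SOURCE": "PURCH", "CODE1": "C10AA01", "CODE2": "NA", "CODE3": "123456", "CODE4": "2"}
death = {"SOURCE": "DEATH", "CODE1": "I21", "CODE2": "NA", "CODE3": "NA", "CODE4": "NA"}
print(label_codes(purchase))
print(label_codes(death))
```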
Categories
The register code sets (vocabularies) are stored in the CATEGORY column:
Detailed information about register code sets is available from:
For diagnosis codes:
A 0 at the end of the CATEGORY value marks the main diagnosis code (e.g. the 0 in ICD0 and NOM0)
A number from 1 to N (written 1:N) at the end of the CATEGORY value marks a side diagnosis (e.g. ICD1:N, NOM1:N)
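The suffix rule for diagnosis codes can be sketched as a small parser. This assumes that each CATEGORY value is a vocabulary prefix followed by an optional integer suffix, as in the ICD0 and NOM0 examples; the function names are hypothetical, and values without a numeric suffix are returned unchanged with `None` as the rank.

```python
import re

# Shortest possible prefix followed by one or more trailing digits.
_SUFFIXED = re.compile(r"^(.*?)(\d+)$")


def split_category(category):
    """Split e.g. 'ICD0' into ('ICD', 0); return (category, None) without a numeric suffix."""
    match = _SUFFIXED.match(category)
    if match is None:
        return category, None
    return match.group(1), int(match.group(2))


def is_main_diagnosis(category):
    """True when the CATEGORY suffix is 0, i.e. the main diagnosis code."""
    return split_category(category)[1] == 0


def is_side_diagnosis(category):
    """True when the CATEGORY suffix is 1 or greater, i.e. a side diagnosis."""
    rank = split_category(category)[1]
    return rank is not None and rank >= 1


for value in ["ICD0", "ICD2", "NOM0", "NOM13"]:
    print(value, split_category(value), is_main_diagnosis(value))
```

Note that the rule applies to diagnosis codes only, so the rank should be interpreted together with the SOURCE of the row.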
Register data availability dates
Data is available from the register start date until the register-specific end of follow-up. The follow-up dates are available here and in the following file:
finngen_R[RELEASE]_detailed_longitudinal_readme_1.0.txt
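One way to account for register-specific availability is to keep only the events that fall inside each register's window. The sketch below is illustrative only: the dates in `EXAMPLE_WINDOWS` are placeholders and not real register dates (take the real ones from the readme file), the helper `within_follow_up` is hypothetical, and APPROX_EVENT_DAY is assumed to be formatted as YYYY-MM-DD.

```python
from datetime import date

# PLACEHOLDER windows, not real register dates; replace with values from the readme.
EXAMPLE_WINDOWS = {
    "INPAT": (date(1970, 1, 1), date(2020, 12, 31)),
    "PURCH": (date(1995, 1, 1), date(2020, 12, 31)),
}


def within_follow_up(row, windows=EXAMPLE_WINDOWS):
    """True when the row's approximate event date lies inside its register's window."""
    start, end = windows[row["SOURCE"]]
    # Assumption: APPROX_EVENT_DAY uses the ISO format YYYY-MM-DD.
    event_day = date.fromisoformat(row["APPROX_EVENT_DAY"])
    return start <= event_day <= end


# Invented rows for illustration.
rows = [
    {"SOURCE": "INPAT", "APPROX_EVENT_DAY": "2019-05-01"},
    {"SOURCE": "INPAT", "APPROX_EVENT_DAY": "2021-03-01"},
    {"SOURCE": "PURCH", "APPROX_EVENT_DAY": "1994-06-15"},
]
kept = [row for row in rows if within_follow_up(row)]
print(len(kept))
```

Because APPROX_EVENT_DAY is randomized by up to 15 days, events close to a window boundary may be classified differently than they would be with the exact date.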
The start and follow-up dates differ between registries and between FinnGen data releases, so take this into account in your analyses. Register start and follow-up dates are shown below for Data Freeze 6:
Further information