Detailed longitudinal data
This page was last updated for R11.
Service sector data contains detailed longitudinal data with additional columns.
Sandbox directory
Detailed longitudinal data is available in the following Sandbox directory:
/finngen/library-red/finngen_R[RELEASE]/phenotype_1.0/
Note that this data is also available in Atlas and in the OMOP common data model.
Data files
The data is available in the following file:
data/finngen_R[RELEASE]_detailed_longitudinal_1.0.txt.gz
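As a minimal sketch, the directory and file patterns above can be combined into a full path by substituting the release number for the `[RELEASE]` placeholder. The function name `detailed_longitudinal_path` is hypothetical and only illustrates the substitution.

```python
from pathlib import PurePosixPath

# Patterns copied from this page; "[RELEASE]" is the placeholder to fill in.
SANDBOX_DIR = "/finngen/library-red/finngen_R[RELEASE]/phenotype_1.0/"
DATA_FILE = "data/finngen_R[RELEASE]_detailed_longitudinal_1.0.txt.gz"


def detailed_longitudinal_path(release: int) -> PurePosixPath:
    """Return the full Sandbox path of the data file for one release (hypothetical helper)."""
    directory = SANDBOX_DIR.replace("[RELEASE]", str(release))
    filename = DATA_FILE.replace("[RELEASE]", str(release))
    return PurePosixPath(directory) / filename


print(detailed_longitudinal_path(11))
```

Check in the Sandbox that the resulting path exists for the release you are working with before relying on it.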
The data file has eleven columns:
| Column | Description |
| --- | --- |
| FINNGENID | Sample ID |
| SOURCE | Register source |
| EVENT_AGE | Individual's age at the event, to two decimals |
| APPROX_EVENT_DAY | Approximate event date: the confidential exact event date shifted by a random +/- 1-15 days |
| CODE1 - CODE4 | Register source specific codes and other information |
| ICDVER | ICD code version: ICD8/9/10, ICD-O-3 |
| CATEGORY | Register code sets (vocabularies) |
| INDEX | Register index number. The same INDEX value within a register means that the codes were given during the same hospital visit or come, for example, from the same drug purchase event. |
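The column layout and the INDEX rule can be sketched as follows. This is an illustration under stated assumptions, not the official loading procedure: it assumes the file is tab-separated with a header row, that the sample ID column is named `FINNGENID`, and that missing values are written as `NA`. The rows are mock data invented for the example, and `read_events` and `group_by_visit` are hypothetical helper names.

```python
import csv
import gzip
import io
from collections import defaultdict

# The eleven columns described above (FINNGENID spelling is an assumption).
COLUMNS = ["FINNGENID", "SOURCE", "EVENT_AGE", "APPROX_EVENT_DAY",
           "CODE1", "CODE2", "CODE3", "CODE4", "ICDVER", "CATEGORY", "INDEX"]


def read_events(binary_stream):
    """Yield one dict per row from a gzip-compressed, tab-separated stream."""
    with gzip.open(binary_stream, mode="rt", newline="") as text:
        yield from csv.DictReader(text, delimiter="\t")


def group_by_visit(rows):
    """Group rows that share the same individual, register and INDEX value."""
    visits = defaultdict(list)
    for row in rows:
        visits[(row["FINNGENID"], row["SOURCE"], row["INDEX"])].append(row)
    return dict(visits)


# Mock rows, invented for illustration only (not real FinnGen data).
mock_rows = [
    ["FG0000001", "INPAT", "45.12", "2005-03-14", "I10", "NA", "NA", "3", "10", "ICD0", "1"],
    ["FG0000001", "INPAT", "45.12", "2005-03-14", "E11", "NA", "NA", "3", "10", "ICD1", "1"],
    ["FG0000001", "PURCH", "46.50", "2006-08-01", "C09AA02", "NA", "NA", "1", "NA", "NA", "1"],
]

# Write the mock rows to an in-memory gzip file so the example is self-contained.
buffer = io.BytesIO()
with gzip.open(buffer, mode="wt", newline="") as out:
    writer = csv.writer(out, delimiter="\t", lineterminator="\n")
    writer.writerow(COLUMNS)
    writer.writerows(mock_rows)
buffer.seek(0)

visits = group_by_visit(read_events(buffer))
for key, rows in visits.items():
    print(key, [row["CODE1"] for row in rows])
```

The grouping key includes SOURCE because, as described above, an INDEX value is only meaningful within one register.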
Detailed information about the columns is available in the following file:
finngen_R[RELEASE]_detailed_longitudinal_readme_1.0.txt
This is the main register data in FinnGen and contains health register codes from different sources. The data is called longitudinal because it contains several events/entries of codes for the same individual recorded at different times.
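To make the longitudinal structure concrete, the sketch below collects the events of each individual into a timeline ordered by EVENT_AGE. It assumes rows are already available as dictionaries keyed by column name, with the sample ID column named `FINNGENID`. The events are mock data, and `timelines` is a hypothetical helper name.

```python
from collections import defaultdict

# Mock events, invented for illustration only (not real FinnGen data).
events = [
    {"FINNGENID": "FG0000001", "SOURCE": "PURCH", "EVENT_AGE": "46.50", "CODE1": "C09AA02"},
    {"FINNGENID": "FG0000002", "SOURCE": "OUTPAT", "EVENT_AGE": "30.02", "CODE1": "J45"},
    {"FINNGENID": "FG0000001", "SOURCE": "INPAT", "EVENT_AGE": "45.12", "CODE1": "I10"},
]


def timelines(rows):
    """Return {individual: events sorted by age at event}."""
    by_person = defaultdict(list)
    for row in rows:
        by_person[row["FINNGENID"]].append(row)
    for person_rows in by_person.values():
        # EVENT_AGE is read as text, so convert it before comparing.
        person_rows.sort(key=lambda r: float(r["EVENT_AGE"]))
    return dict(by_person)


for person, person_rows in timelines(events).items():
    print(person, [(r["EVENT_AGE"], r["SOURCE"], r["CODE1"]) for r in person_rows])
```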
Detailed longitudinal data was presented in the FinnGen data users meeting on 12th January 2021 and can be explored using Atlas in the Sandbox.
The data is used as input for the Endpointter to determine which individuals are included in each of FinnGen's phenotypic endpoints.
A mock example of the detailed longitudinal data file is shown below:
Sources
The detailed longitudinal data file contains codes from the following registries:
| Source | Description | Types of codes |
| --- | --- | --- |
| PURCH | Kela drug purchase register | Medication codes |
| REIMB | Kela drug reimbursement register | Medication codes |
| INPAT | Inpatient Hilmo register | Diagnosis codes |
| OPER_IN | Inpatient Hilmo register - operations | Operation codes |
| OUTPAT | Specialist outpatient Hilmo register | Diagnosis codes |
| OPER_OUT | Specialist outpatient Hilmo register - operations | Operation codes |
| PRIM_OUT | Primary health care outpatient visits | Diagnosis and operation codes |
| CANC | Cancer register | Cancer codes |
| DEATH | Cause of death register | Cause of death codes |
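The source table can be kept as a small lookup when tallying or validating rows. The sketch below counts mock rows per type of code and rejects SOURCE values that are not listed on this page. The names `SOURCES` and `count_by_code_type` are hypothetical.

```python
from collections import Counter

# SOURCE value -> (register description, type of codes), as listed on this page.
SOURCES = {
    "PURCH": ("Kela drug purchase register", "Medication codes"),
    "REIMB": ("Kela drug reimbursement register", "Medication codes"),
    "INPAT": ("Inpatient Hilmo register", "Diagnosis codes"),
    "OPER_IN": ("Inpatient Hilmo register - operations", "Operation codes"),
    "OUTPAT": ("Specialist outpatient Hilmo register", "Diagnosis codes"),
    "OPER_OUT": ("Specialist outpatient Hilmo register - operations", "Operation codes"),
    "PRIM_OUT": ("Primary health care outpatient visits", "Diagnosis and operation codes"),
    "CANC": ("Cancer register", "Cancer codes"),
    "DEATH": ("Cause of death register", "Cause of death codes"),
}


def count_by_code_type(rows):
    """Count rows per type of code, failing loudly on an unlisted SOURCE."""
    counts = Counter()
    for row in rows:
        source = row["SOURCE"]
        if source not in SOURCES:
            raise ValueError(f"Unknown SOURCE value: {source!r}")
        counts[SOURCES[source][1]] += 1
    return counts


# Mock rows, invented for illustration only.
mock_rows = [{"SOURCE": "PURCH"}, {"SOURCE": "REIMB"}, {"SOURCE": "INPAT"}]
print(count_by_code_type(mock_rows))
```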
The image below shows how variables in the national health registries map to the columns in the detailed longitudinal data.
Codes
The information stored in the CODE1 - CODE4 columns depends on the register source:
| Source | Description | Codes |
| --- | --- | --- |
| PURCH | Kela drug purchase register | CODE1: ATC code; CODE2: Kela reimbursement code; CODE3: Product number; CODE4: Number of packages |
| REIMB | Kela drug reimbursement register | CODE1: Kela reimbursement code; CODE2: ICD code |
| INPAT, OUTPAT | Inpatient Hilmo register; Specialist outpatient Hilmo register | CODE1: symptom code; CODE2: cause code (e.g. CODE1 could be dementia associated with Alzheimer's disease with CODE2 as Alzheimer's disease); CODE3: ATC code for drug's adverse effect; CODE4: duration of stay |
| OPER_IN, OPER_OUT | Inpatient Hilmo register - operations; Specialist outpatient Hilmo register - operations | CODE1: operation code |
| PRIM_OUT | Primary health care outpatient visits | CODE1: diagnosis or operation code; CODE2: symptom code; CODE3: ATC code for drug's adverse effect |
| CANC | Cancer register | CODE1: ICD-O-3 topography; CODE2: ICD-O-3 morphology; CODE3: ICD-O-3 behaviour |
| DEATH | Cause of death register | CODE1: cause of death |
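Because the meaning of each CODE column changes with SOURCE, it can help to label the codes of a row before using them. The sketch below encodes the table above as a lookup and returns only the columns that have a documented meaning for the row's register. `CODE_MEANINGS` and `label_codes` are hypothetical names, and the row is mock data.

```python
_HILMO = ("symptom code", "cause code",
          "ATC code for drug's adverse effect", "duration of stay")

# SOURCE -> meanings of CODE1, CODE2, ... in order, as listed on this page.
CODE_MEANINGS = {
    "PURCH": ("ATC code", "Kela reimbursement code",
              "Product number", "Number of packages"),
    "REIMB": ("Kela reimbursement code", "ICD code"),
    "INPAT": _HILMO,
    "OUTPAT": _HILMO,
    "OPER_IN": ("operation code",),
    "OPER_OUT": ("operation code",),
    "PRIM_OUT": ("diagnosis or operation code", "symptom code",
                 "ATC code for drug's adverse effect"),
    "CANC": ("ICD-O-3 topography", "ICD-O-3 morphology", "ICD-O-3 behaviour"),
    "DEATH": ("cause of death",),
}


def label_codes(row):
    """Map each documented CODE column of a row to its register-specific meaning."""
    meanings = CODE_MEANINGS[row["SOURCE"]]
    return {meaning: row[f"CODE{i}"] for i, meaning in enumerate(meanings, start=1)}


# Mock row, invented for illustration only.
row = {"SOURCE": "PURCH", "CODE1": "C09AA02", "CODE2": "NA",
       "CODE3": "123456", "CODE4": "2"}
print(label_codes(row))
```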
Categories
The register code sets (vocabularies) are stored in the CATEGORY column:
Detailed information about register code sets is available from:
For diagnosis codes:
0 at the end of the CATEGORY variable means the main diagnosis code (e.g. 0 in ICD0 and NOM0)
1:N at the end of the CATEGORY variable refers to side diagnoses (e.g. 1:N in ICD1:N, NOM1:N)
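The suffix rule for diagnosis codes can be sketched as a small parser that splits a CATEGORY value into its code set and its position, where position 0 is the main diagnosis and positions 1 to N are side diagnoses. This assumes the rule is applied only to diagnosis code sets such as ICD and NOM. Other CATEGORY values may end in digits for different reasons, so check the readme before applying it more widely. The function names are hypothetical.

```python
import re

# Code set name followed by a trailing run of digits, e.g. "ICD0" or "NOM2".
_CATEGORY_PATTERN = re.compile(r"^(?P<code_set>.*?)(?P<position>\d+)$")


def parse_category(category):
    """Split a CATEGORY value into (code_set, position); position is None if absent."""
    match = _CATEGORY_PATTERN.match(category)
    if match is None:
        return category, None
    return match.group("code_set"), int(match.group("position"))


def is_main_diagnosis(category):
    """True when the CATEGORY value ends in 0, i.e. marks the main diagnosis."""
    return parse_category(category)[1] == 0


for value in ["ICD0", "ICD2", "NOM0", "NA"]:
    print(value, parse_category(value), is_main_diagnosis(value))
```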
Register data availability dates
Data is available from the register start date until the register-specific end of follow-up. The follow-up dates are available here and in the following file:
finngen_R[RELEASE]_detailed_longitudinal_readme_1.0.txt
The start and follow-up dates differ between registries and between FinnGen data releases, so take this into account in your analyses. Register start and follow-up dates are shown below for Data Freeze 6:
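One way to take the register-specific dates into account is to keep only events that fall inside each register's availability window. The sketch below shows the idea. The window dates in `example_windows` are invented placeholders, not actual FinnGen dates, so replace them with the values from the readme file of your release. It also assumes that APPROX_EVENT_DAY is stored as `YYYY-MM-DD`, and `in_window` is a hypothetical helper name.

```python
from datetime import date

# PLACEHOLDER windows, invented for illustration: SOURCE -> (start, end of follow-up).
# Take the real dates from the readme file of the release you are using.
example_windows = {
    "INPAT": (date(2000, 1, 1), date(2010, 12, 31)),
    "PURCH": (date(2005, 1, 1), date(2010, 12, 31)),
}


def in_window(row, windows):
    """True if the row's approximate event date lies within its register's window."""
    start, end = windows[row["SOURCE"]]
    event_day = date.fromisoformat(row["APPROX_EVENT_DAY"])
    return start <= event_day <= end


# Mock rows, invented for illustration only.
rows = [
    {"SOURCE": "INPAT", "APPROX_EVENT_DAY": "2005-03-14"},
    {"SOURCE": "PURCH", "APPROX_EVENT_DAY": "2003-06-01"},
]
print([in_window(row, example_windows) for row in rows])
```

Because APPROX_EVENT_DAY is shifted by up to 15 days from the exact date, events close to a window boundary can fall on either side of it.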