Red Library Data (individual-level data)

Types of red data available in FinnGen

Clinical codes are recorded in nationwide registries, which FinnGen harmonizes into a single table.

Code sets in detailed longitudinal data

  • Kela (Social Insurance Institution)

    • The REIMB/KELA codes are unique to Finland. They are long-term reimbursement codes for a class of drugs and imply a robust review of the diagnosis. For example, KELA code 307 provides reimbursement for donepezil, galantamine, memantine and rivastigmine, and is granted only after physician review.

      • ATC – drug identifiers

      • REIMB – drug reimbursements

      • VNRO – drug product number including dosage and number in package

  • Hilmo/Hospital codes

    • Diagnoses in ICD-10, ICD-9 and ICD-8

    • Procedures (Nomesco, Finnish Hospital League*, Heart patient**)

  • Cancer registry

    • ICD-O-3 (topography, morphology and behavior codes; note that there is no specific information on grade or metastatic spread)

  • Avohilmo

    • Diagnoses in ICD-10 and nurse codes in ICPC-2

    • Procedures (SPAT, Nomesco)

    • Dental codes

*) Finnish-specific procedure codes, **) Finnish-specific procedure codes for demanding heart patients
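Because the registries are harmonized into one table, selecting events from a given registry comes down to filtering on the registry source and the code. The following is a minimal sketch of that idea in Python. The column names (FINNGENID, SOURCE, EVENT_AGE, CODE1), the source labels and the rows are all assumptions made for illustration; the actual table layout may differ.

```python
import csv
import io

# Toy stand-in for the harmonized longitudinal table. Column names, source
# labels and rows are assumptions made for illustration, not the real schema.
TOY_TABLE = """FINNGENID,SOURCE,EVENT_AGE,CODE1
FG0001,INPAT,71.2,G30
FG0001,REIMB,72.0,307
FG0002,PURCH,65.4,N06DA02
FG0003,PRIM_OUT,58.9,G30
"""


def select_events(rows, source, code_prefix):
    """Return rows from one registry source whose code starts with code_prefix."""
    return [
        row for row in rows
        if row["SOURCE"] == source and row["CODE1"].startswith(code_prefix)
    ]


rows = list(csv.DictReader(io.StringIO(TOY_TABLE)))

# Individuals holding the KELA reimbursement code 307 used as an example above.
reimb_307 = select_events(rows, "REIMB", "307")
reimb_ids = sorted({row["FINNGENID"] for row in reimb_307})
print(reimb_ids)
```

Matching on a code prefix rather than the full code is a design choice of this sketch: it lets one call cover a code and its subcodes, at the cost of needing care with short prefixes.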

Additional registries

  • Vaccination registry from primary health care (vaccination, type of vaccination, disease, method and site, approximate date; also contains COVID vaccinations)

  • Kidney registry (only dialysis/transplantation patients, set of diagnosis codes)

  • Digital and Population Data Services Agency, birth registry (reproductive history data - detailed information on births among FinnGen participants)

  • Visual impairment registry (data on legally blind patients: diagnosis, visual acuity, diameter of visual field, homonymous hemianopsia)

  • Cancer screening registries (cervical and breast)

  • Parental causes of death (ICD-8/9/10)

  • Infectious diseases (COVID positive patients and their treatment)

  • Congenital malformations (congenital chromosomal and structural anomalies, congenital hypothyroidism, teratomas)

  • Service sector data: service sector (e.g. elderly home), contact type, specialty, hospital type, urgency, professional type, Kela drug reimbursement costs, Kela drug reimbursement category

Minimum phenotype data (comes with the samples from the biobanks)

  • Age

  • Smoking status (recorded as yes/no, current/former/never, or current/occasional/quitter/former/never)

  • BMI

Genetic data (germline)

  • GWAS array genotypes (500,000)

    • Over 300,000 samples genotyped with the ThermoFisher Axiom custom array v2

      • Total of 664,510 markers including:

        • About 500,000 core GWAS markers

        • 116,000 coding variants enriched in Finland

        • >10,000 specific markers for the HLA/KIR region

        • about 15,000 ClinVar variants

        • about 4,600 pharmacogenomic variants

        • 57,000 selected markers of special interest for partners

    • 80,000-90,000 samples genotyped with various other GWAS arrays (legacy genotypes)

  • Imputed genotypes (500,000)

    • ~21.3M variants per individual

    • Imputed using a Finnish-specific WGS reference panel of ~9,000 individuals

In this section we will share the following:

  • Genotype data

  • Phenotype data

  • Expansion Area 3 (EA3) studies
