FinnGen clinical endpoints

FinnGen clinical endpoints, or simply FinnGen endpoints, are indicators that a certain medical condition has occurred. This could be an acute event, such as a Covid-19 infection, or the onset of a chronic disease, such as cardiovascular disease. We use multiple registers for endpoint definitions, including the Care Register for Health Care (hospital discharges, specialist outpatient visits, operations and procedures), causes of death from Statistics Finland, medicine purchases and drug reimbursements from Kela, and specific cancer diagnoses from the Finnish Cancer Registry. For example, if a person is discharged from a hospital with the ICD-10 code G43.1 (Migraine with aura), a migraine endpoint is assigned to that person in the FinnGen data. Many of these registers cover almost half a century of data. This means that different versions of, e.g., the Finnish-specific ICD codes (versions 8, 9 and 10) have been used, which requires harmonization of the data and the endpoint definitions.
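The idea of assigning one endpoint from codes recorded under different ICD versions can be sketched roughly as follows. This is a minimal illustration, not the FinnGen pipeline: the record layout, the `MIGRAINE_RULES` table, the function names and the person identifiers are all hypothetical, and the ICD-8/9 prefix `346` is used only as an illustrative assumption rather than taken from the FinnGen endpoint definitions.

```python
import re
from dataclasses import dataclass
from typing import Iterable, Set


@dataclass(frozen=True)
class RegisterRecord:
    """One diagnosis row from a register (hypothetical layout)."""
    person_id: str
    icd_version: int  # 8, 9 or 10
    code: str         # code as recorded, e.g. "G43.1"


# Hypothetical endpoint definition: one code pattern per ICD version.
# Only the ICD-10 code G43.1 comes from the text; the rest is assumed.
MIGRAINE_RULES = {
    10: re.compile(r"G43"),
    9: re.compile(r"346"),
    8: re.compile(r"346"),
}


def normalize(code: str) -> str:
    """Drop dots and whitespace so 'G43.1' and 'g431' compare equal."""
    return code.strip().replace(".", "").upper()


def persons_with_endpoint(records: Iterable[RegisterRecord], rules) -> Set[str]:
    """Return the IDs of persons with at least one record matching the rules."""
    hits: Set[str] = set()
    for rec in records:
        pattern = rules.get(rec.icd_version)
        # match() anchors at the start, so the pattern acts as a code prefix.
        if pattern is not None and pattern.match(normalize(rec.code)):
            hits.add(rec.person_id)
    return hits


records = [
    RegisterRecord("FG0001", 10, "G43.1"),  # ICD-10, migraine with aura
    RegisterRecord("FG0002", 9, "3460A"),   # assumed ICD-9 style code
    RegisterRecord("FG0003", 10, "I21.0"),  # unrelated diagnosis
]
print(sorted(persons_with_endpoint(records, MIGRAINE_RULES)))
```

Keeping a separate pattern per ICD version means a code is only ever compared against the rule written for the coding system it was recorded in, which is one simple way to handle the harmonization problem.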

Within the FinnGen project, we set out to create a comprehensive library of health and disease endpoint definitions (available here) that covers all 21 disease chapters of ICD-10 in a hierarchical manner. Our library generally follows the ICD-10 hierarchy, in which chapters are divided into blocks of similar diseases; the blocks contain three-character categories, usually single diseases, and four-character subcategories, usually disease subtypes. Some small changes were made to the hierarchy, mainly where the ICD-8/9 codes made them necessary. For some diseases, medicine purchases or reimbursements are a valuable source of information, and for cancers we use the Finnish Cancer Registry data, which is coded in ICD-O-3.

In addition to the ICD-10-based hierarchy, our endpoint library contains more specific endpoints of interest within FinnGen, which may overlap with each other or with the hierarchical endpoints. Currently we provide 4 648 endpoint definitions, including some for diseases that are quite rare in the population. Alongside the endpoint definitions, we provide suggested control exclusion criteria for each endpoint; these are used in the FinnGen GWAS analyses.
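To illustrate what control exclusion criteria do in a case-control analysis, here is a small sketch. It assumes that controls are the participants who have neither the endpoint itself nor any endpoint listed in its exclusion criteria; the actual FinnGen rules may differ in detail. The endpoint names, the function `split_cases_controls` and the identifiers are hypothetical.

```python
from typing import Dict, Iterable, Set, Tuple


def split_cases_controls(
    all_ids: Set[str],
    members: Dict[str, Set[str]],
    endpoint: str,
    exclusions: Iterable[str],
) -> Tuple[Set[str], Set[str]]:
    """Split participants into cases and controls for one endpoint.

    members maps an endpoint name to the IDs of persons who have it.
    exclusions lists endpoints whose carriers are dropped from controls.
    """
    cases = members.get(endpoint, set()) & all_ids
    excluded: Set[str] = set()
    for name in exclusions:
        excluded |= members.get(name, set())
    # Controls have neither the endpoint nor any excluded endpoint.
    controls = all_ids - cases - excluded
    return cases, controls


# Hypothetical data: endpoint names and IDs are made up for illustration.
all_ids = {"FG0001", "FG0002", "FG0003", "FG0004"}
members = {
    "MIGRAINE": {"FG0001"},
    "OTHER_HEADACHE": {"FG0002"},
}
cases, controls = split_cases_controls(
    all_ids, members, "MIGRAINE", ["OTHER_HEADACHE"]
)
print(sorted(cases), sorted(controls))
```

In this sketch the person with only a related headache endpoint ends up in neither group, so closely related conditions do not dilute the control set.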

To make the endpoints more comparable with other disease endpoint efforts and more easily searchable, we have matched our endpoints with disease ontology coding systems, using DOID, EFO, and MeSH codes. The matching was done with a customized automatic algorithm, followed by manual curation.

We have also created a Risteys portal to browse the endpoints. Find out how to use Risteys here.

FinnGen will ultimately contain 500 000 participants (~10% of the Finnish population). Compared to the whole Finnish population, FinnGen is enriched with disease cases from hospital biobanks and from disease-specific 'legacy' cohorts deposited in Finnish biobanks. FinnGen is therefore well suited for genetic epidemiological studies but, because of this selection bias, it is not suitable for accurately estimating the population prevalence of diseases.

FinnGen corrects endpoint definitions and control exclusion criteria when necessary and creates new ones with each Data Freeze. You can access the current and previous endpoint and control exclusion criteria libraries here. The libraries are also available in Sandbox here.
