Complete follow-up time of the FinnGen registries – primary endpoint data

Follow-up times vary between the Finnish registers. Some registers begin as early as the 1950s and 1960s (Cancer register 1953; KELA drug reimbursement register 1964; inpatient Hilmo register 1969; Causes-of-death register 1969), and follow-up may continue up to the current year (Hilmo register, primary care register). Records in other registers begin later (e.g. primary care register 2011), and follow-up may end a couple of years before the current date (cancer register, cause-of-death register).

The section Finnish health registers shows the beginning of follow-up for each register, and the section detailed longitudinal data explains the registers with detailed longitudinal and endpoint data.

The follow-up of each register is kept for as long as the data received from the registry extend. This allows analyses to use the most recent data from each register, which is important, for example, for COVID-19 research.

However, a single “common follow-up end date” (FU_END_DATE) for all registers in the endpoint data is needed in order to compute the pre-calculated variable ENDPOINT_AGE for each endpoint in the Endpoint data file.

The common follow-up end date is specific to each data freeze. It is decided based on the additional follow-up years in the register data received by the register team.

Follow-up end dates of the registers belonging to the detailed longitudinal data file, and the common follow-up end date used for the endpoint data:

| Data Freeze | Hilmo | Avohilmo | Kela drug purchases | Kela drug reimbursements | Death | Cancer | Common FU_END_DATE |
|---|---|---|---|---|---|---|---|
| DF12 | 28.4.2023 | 28.4.2023 | 31.12.2022 | 31.12.2022 | 31.12.2021 | 31.12.2021 | 28.4.2023 |
| DF11 | 02.11.2022 | 02.11.2022 | 31.12.2021 | 31.12.2021 | 31.12.2020 | 31.12.2021 | 02.11.2022 |
| DF10 | 31.12.2021 | 31.12.2021 | 31.12.2021 | 31.12.2021 | 31.12.2020 | 31.12.2019 | 31.12.2021 |
| DF9 | 11.10.2021 | 11.10.2021 | 31.12.2020 | 31.12.2020 | 31.12.2019 | 31.12.2019 | 11.10.2021 |
| DF8 | 24.3.2021 | 24.3.2021 | 31.12.2019 | 31.12.2019 | 31.12.2019 | 31.12.2018 | 31.12.2019 |
| DF7 | 31.12.2019 | 31.12.2020 | 31.12.2019 | 31.12.2019 | 31.12.2018 | 31.12.2018 | 31.12.2019 |
| DF6 | 31.12.2018 | 31.12.2018 | 31.12.2019 | 31.12.2019 | 31.12.2018 | 31.12.2018 | 31.12.2019 |

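For DF5 and DF4 only six values are recorded; they are listed in their original order:

DF5: 31.12.2018 | 31.12.2018 | 31.12.2018 | 31.12.2017 | not included | 31.12.2017

DF4: 31.12.2017 | 31.12.2017 | 31.12.2017 | 31.12.2018 | not included | 31.12.2017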

In the endpoint data the ENDPOINT_AGE variable contains the age for each individual at:

Cases:

  • first EVENT_AGE

Controls:

  • FU_END_DATE (the common follow-up end date, i.e. the latest registry follow-up end date; 11.10.2021 in DF9)

OR

  • Age at death (if deceased - and even if emigrated at some point)

OR

  • Age at emigration (if moved abroad and not deceased).
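The rules above can be sketched in code. This is only an illustration and not the FinnGen pipeline: the function names, the argument names, and the conversion of dates to fractional years (365.25 days per year) are assumptions made for the example. It reads the control rules as a priority order, with death first, then emigration, then FU_END_DATE.

```python
from datetime import date
from typing import Optional

DAYS_PER_YEAR = 365.25  # assumption: ages are expressed in fractional years


def age_at(birth: date, when: date) -> float:
    """Age in years at a given date (hypothetical helper)."""
    return (when - birth).days / DAYS_PER_YEAR


def endpoint_age(
    birth: date,
    fu_end: date,
    first_event: Optional[date] = None,
    death: Optional[date] = None,
    emigration: Optional[date] = None,
) -> float:
    """ENDPOINT_AGE following the rules listed above.

    Cases: age at the first event.
    Controls: age at death if deceased (even if the person emigrated
    at some point), else age at emigration, else age at FU_END_DATE.
    """
    if first_event is not None:   # case
        return age_at(birth, first_event)
    if death is not None:         # deceased control
        return age_at(birth, death)
    if emigration is not None:    # emigrated control, not deceased
        return age_at(birth, emigration)
    return age_at(birth, fu_end)  # control followed to the common end date


# Example: a DF9 control who emigrated in 2005 and died in 2015.
born = date(1950, 1, 1)
df9_fu_end = date(2021, 10, 11)
age = endpoint_age(born, df9_fu_end,
                   death=date(2015, 6, 1), emigration=date(2005, 3, 1))
print(round(age, 2))
```

In this example the age at death is used rather than the age at emigration, matching the "even if emigrated at some point" rule.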

In the DF9 endpoint data, the common follow-up end date was set to 11.10.2021, which is the “latest follow-up end date” available in any of the registries in the endpoint data (in this case the Hilmo and Avohilmo registries) (Figure 1).
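As a small worked example, the DF9 choice can be reproduced by taking the latest of the register end dates. The dictionary and function names are hypothetical, and the dates are the DF9 values given on this page. The sketch describes DF9 only: the common date is decided separately for each data freeze, and the tabulated values remain authoritative.

```python
from datetime import datetime, date

# DF9 follow-up end dates as given on this page (day.month.year).
DF9_REGISTER_END = {
    "Hilmo": "11.10.2021",
    "Avohilmo": "11.10.2021",
    "Kela drug purchases": "31.12.2020",
    "Kela drug reimbursements": "31.12.2020",
    "Death": "31.12.2019",
    "Cancer": "31.12.2019",
}


def parse_fi_date(text: str) -> date:
    """Parse a date written as day.month.year, e.g. '11.10.2021'."""
    return datetime.strptime(text, "%d.%m.%Y").date()


def latest_end(register_end: dict) -> tuple:
    """Return the latest end date and the registers that reach it."""
    parsed = {name: parse_fi_date(value) for name, value in register_end.items()}
    latest = max(parsed.values())
    registers = sorted(name for name, d in parsed.items() if d == latest)
    return latest, registers


fu_end_date, defining_registers = latest_end(DF9_REGISTER_END)
print(fu_end_date.isoformat(), defining_registers)
# prints: 2021-10-11 ['Avohilmo', 'Hilmo']
```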

Keeping the follow-up in each registry for as long as its data extend introduces some features and potential biases that are good to keep in mind. For example, the follow-up in the “causes-of-death registry” ends on 31.12.2019 in DF9, which is long before the end of follow-up in the Hilmo and Avohilmo registries (11.10.2021). Thus, we cannot be sure that an individual is still alive in 2021 if there are no records after 2019 in any other register (Figure 1).
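One way to make this uncertainty concrete is to compute, for a person with no recorded death, the last date on which they are known to be alive and the length of the unverified period up to FU_END_DATE. This is a hedged sketch: the function names are hypothetical, and it assumes that the death register is complete up to its end date and that any later record in another register implies the person was alive on that day.

```python
from datetime import date

# DF9 values given on this page.
DEATH_REGISTER_END = date(2019, 12, 31)
FU_END_DATE = date(2021, 10, 11)


def last_confirmed_alive(last_other_record: date,
                         death_register_end: date = DEATH_REGISTER_END) -> date:
    """Latest date on which a person without a death record is known to be alive.

    Assumption: no death before death_register_end means the person was
    alive at that date; a later record in any other register extends it.
    """
    return max(last_other_record, death_register_end)


def unverified_days(last_other_record: date,
                    fu_end: date = FU_END_DATE) -> int:
    """Days before FU_END_DATE during which vital status is unverified."""
    gap = (fu_end - last_confirmed_alive(last_other_record)).days
    return max(gap, 0)


# No register record since mid-2018: status is known only up to the end
# of the death register follow-up.
print(last_confirmed_alive(date(2018, 6, 1)))
# A Hilmo visit in May 2021 shortens the unverified period.
print(unverified_days(date(2021, 5, 3)))
```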

It is important to take into account the differences in the follow-up start and end dates of the registers. This matters, for example, in survival analyses, where the goal is to investigate the survival or disease progression of an individual using information from several different registers (e.g. using endpoint data; many endpoints are created by combining information from several registries).
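As an illustration of why these differences matter, the period in which a chosen set of registers all have coverage can be computed from their start and end dates. This sketch is not a description of how the truncated endpoint data are produced. The names are hypothetical, the start years come from this page with 1 January assumed as the start day, "Avohilmo" is taken to be the primary care register, and the end dates are the DF9 values.

```python
from datetime import date

# Start years from this page; the 1 January start day is an assumption.
REGISTER_START = {
    "Cancer": date(1953, 1, 1),
    "Kela drug reimbursements": date(1964, 1, 1),
    "Hilmo": date(1969, 1, 1),
    "Death": date(1969, 1, 1),
    "Avohilmo": date(2011, 1, 1),
}

# DF9 follow-up end dates from this page.
DF9_REGISTER_END = {
    "Cancer": date(2019, 12, 31),
    "Kela drug reimbursements": date(2020, 12, 31),
    "Hilmo": date(2021, 10, 11),
    "Death": date(2019, 12, 31),
    "Avohilmo": date(2021, 10, 11),
}


def common_window(registers: list) -> tuple:
    """Period during which every listed register has coverage."""
    start = max(REGISTER_START[name] for name in registers)
    end = min(DF9_REGISTER_END[name] for name in registers)
    if start > end:
        raise ValueError("the registers have no overlapping follow-up")
    return start, end


# An analysis combining hospital discharges and causes of death is
# covered by both registers only until the death register ends.
start, end = common_window(["Hilmo", "Death"])
print(start.isoformat(), end.isoformat())
# prints: 1969-01-01 2019-12-31
```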

Take a look at the section Truncated endpoint data for survival analysis for information about the additional endpoint data to be used for the survival analysis.

Related FAQs:

I found BL_AGE after FU_END_AGE in endpoint data, how is it possible?

Why individuals who are not dead have death age in endpoint data?

I found EVENT_AGE after FU_END_AGE in endpoint data, how is it possible?
