FinnGen Data Freezes and Releases
During the active sample collection phase of FinnGen 1 and 2, a Data Freeze (DF) took place twice a year, in February and August. At each freeze, FinnGen produced a Data Release (R) of updated genotype and phenotype data files to the Sandbox.
During FinnGen 3 we no longer increase the number of samples, but plan to update the health register data three times: in Feb 2025 (R13), Feb 2026 (R14), and Feb 2027 (R15).
About 3 months after each release, the FinnGen Core Analysis team releases a set of FinnGen core analysis results to the Google Cloud Storage green bucket (gs://finngen-production-library-green). These results include, but are not limited to, GWAS summary statistics, fine-mapping, colocalization, and autoreporting results.
Note: you will not be able to run GWAS immediately after the first genotype and phenotype files are released, because the analysis team first needs to generate the covariate files for the new release. The covariate files are usually ready about midway between releases; their availability is announced at the Users' meeting and on the FinnGen Community Slack.
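The planned FinnGen 3 update months and the roughly three-month lag before core analysis results can be combined into a small lookup. The following is a minimal sketch, not an official tool: the names `PLANNED_RELEASES`, `add_months`, `approx_core_results_month`, and `next_planned_release` are hypothetical, and because the text gives only a month for each release, the first day of the month is used as a placeholder.

```python
from datetime import date

# Planned register data updates during FinnGen 3, as listed above.
# Only the month is stated in the text; day 1 is a placeholder assumption.
PLANNED_RELEASES = {
    "R13": date(2025, 2, 1),
    "R14": date(2026, 2, 1),
    "R15": date(2027, 2, 1),
}

# "About 3 months" in the text, so results computed from this are approximate.
CORE_RESULTS_LAG_MONTHS = 3


def add_months(d: date, months: int) -> date:
    """Shift a date forward by whole months, returning the first of that month."""
    index = d.year * 12 + (d.month - 1) + months
    return date(index // 12, index % 12 + 1, 1)


def approx_core_results_month(release: str) -> date:
    """Approximate month in which core analysis results follow a release."""
    return add_months(PLANNED_RELEASES[release], CORE_RESULTS_LAG_MONTHS)


def next_planned_release(today: date):
    """Return (name, date) of the first planned release on or after `today`, else None."""
    upcoming = [(d, name) for name, d in PLANNED_RELEASES.items() if d >= today]
    if not upcoming:
        return None
    d, name = min(upcoming)
    return name, d
```

For example, `approx_core_results_month("R13")` falls in May 2025, and `next_planned_release(date(2025, 6, 15))` points at R14. Actual dates depend on the announcements made at the Users' meeting and on Slack.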
Timeline of data releases: sample size and endpoints
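| Release | | | Sample size | Endpoints [1] | Endpoints [2] |
|---|---|---|---|---|---|
| R2 | Q4 2018 | Q1 2020 | 96,499 | 1,485 | 1,485 |
| R3 | Q2 2019 | Q2 2020 | 135,638 | 2,737 | 1,801 |
| R4 | Q4 2019 | Q4 2020 | 176,899 | 3,452 | 2,444 |
| R5 | Q2 2020 | Q2 2021 | 218,792 | 3,858 | 2,803 |
| R6 | Q3 2020 | Q3 2021 | 260,405 | 3,995 | 2,861 |
| R7 | Q1 2021 | Q1 2022 | 309,154 | 4,149 | 3,095 |
| R8 | Q3 2021 | Q3 2022 | 342,499 | 4,431 | 2,202 |
| R9 | Q1 2022 | Q2 2023 | 377,277 | 4,526 | 2,272 |
| R10 | Q3 2022 | Q4 2023 | 412,181 | 4,519 | 2,408 |
| R11 | Q1 2023 | ~Q2 2024 | ~445,000 | 4,415 | 2,444 |
| R12 | Q3 2023 | ~Q4 2024 | ~480,000 | 4,421 | 2,469 |
| R13 | Q1 2024 | ~Q1 2025 | ~500,000 | NA | |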
[1] total endpoint definitions [2] endpoints used for core GWAS and PheWAS.
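To make the growth between releases easier to read off, the exact (non-approximate) rows of the table above can be turned into a small worked calculation. This is an illustrative sketch only: the names `RELEASES`, `sample_growth`, and `core_fraction` are hypothetical, and releases whose sample size is given only approximately (R11 onward) are left out.

```python
# (release, sample size, total endpoint definitions [1],
#  endpoints used for core GWAS and PheWAS [2]), copied from the table above.
RELEASES = [
    ("R2", 96_499, 1_485, 1_485),
    ("R3", 135_638, 2_737, 1_801),
    ("R4", 176_899, 3_452, 2_444),
    ("R5", 218_792, 3_858, 2_803),
    ("R6", 260_405, 3_995, 2_861),
    ("R7", 309_154, 4_149, 3_095),
    ("R8", 342_499, 4_431, 2_202),
    ("R9", 377_277, 4_526, 2_272),
    ("R10", 412_181, 4_519, 2_408),
]


def sample_growth(rows):
    """Samples added in each release relative to the previous one."""
    return {cur[0]: cur[1] - prev[1] for prev, cur in zip(rows, rows[1:])}


def core_fraction(rows):
    """Share of endpoint definitions [1] that were used in core analyses [2]."""
    return {name: core / total for name, _, total, core in rows}


if __name__ == "__main__":
    growth = sample_growth(RELEASES)
    fractions = core_fraction(RELEASES)
    for name, _, _, _ in RELEASES[1:]:
        print(f"{name}: +{growth[name]:,} samples, "
              f"{fractions[name]:.0%} of endpoints in core analysis")
```

Running the script prints one line per release from R3 to R10, showing how many samples each release added and what share of the defined endpoints entered the core analysis.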
Timeline of data releases: N in imputation, endpoints, and registry data
| Release | | | | | |
|---|---|---|---|---|---|
| R1 | 52,295 | 53,866 | 53,866 | - | v4.0 |
| R2 | 102,739 | 103,695 | 103,695 | - | v2.0 |
| R3 | 146,630 | 152,796 | 152,796 | - | v1.0 |
| R4 | 183,694 | 196,849 | 198,328 | 211,154 | v2.0 |
| R5 | 224,737 | 224,650 | 224,566 | 224,438 | v3.0 |
| R6 | 271,341 | 271,343 | 269,718 | 271,123 | v2.1 |
| R7 | 321,464 | 321,464 | 321,302 | 320,953 | v4.0 and v2.0 |
| R8 | 356,213 | 356,138 | 356,077 | 356,082 | v4.0 and v3.0 |
| R9 | 392,649 | 392,649 | 392,423 | 329,539 | v1.0 |
| R10 | 430,897 | 430,885 | 429,209 | 429,861 | v1.0 |
| R11 | 473,681 | 473,681 | 473,681 | 473,580 | v1.0 |
| R12 | 520,210 | 520,210 | 520,210 | 520,105 | v1.0 |
[1] total endpoint definitions [2] endpoints used for core GWAS and PheWAS.
Number of individuals with genotypes and phenotypes in FinnGen Data Releases. Starting from R5, the phenotype data has been restricted to genotyped individuals. Endpoint data includes individuals who have baseline data. Detailed longitudinal data includes only those individuals who have register data available. Some individuals in the phenotype data have been removed in QC steps. The data files can be found in the Sandbox at /finngen/library-red/
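To see what is currently available under that path from inside the Sandbox, a short directory listing is enough. The sketch below assumes only the path given above; it makes no assumption about how the folders inside are named, and the function name `list_top_level_dirs` is hypothetical.

```python
from pathlib import Path

# Path taken from the text above; it exists only inside the Sandbox.
LIBRARY_RED = Path("/finngen/library-red")


def list_top_level_dirs(root: Path = LIBRARY_RED) -> list:
    """Return the sorted names of the directories directly under `root`.

    Returns an empty list when `root` is missing, for example when the
    code is run outside the Sandbox.
    """
    if not root.is_dir():
        return []
    return sorted(p.name for p in root.iterdir() if p.is_dir())


if __name__ == "__main__":
    for name in list_top_level_dirs():
        print(name)
```

Outside the Sandbox the function simply returns an empty list, so the script can be run safely anywhere.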
The following data files are released to the Sandbox for each Data Release
Other registry data files (periodically updated and released to the Sandbox)
Core Analysis Results files (released to FinnGen Production Library Green for each Data Release)
Here is the expected schedule for the next data freeze file releases.