FinnGen Data Freezes and Releases
FinnGen has a Data Freeze (DF) twice a year, in February and August. At each freeze, FinnGen produces a Data Release (R) of updated genotype and phenotype data files and makes it available in the Sandbox.
About 3 months after each release, the FinnGen Core Analysis team releases a set of FinnGen core analysis results to the Google Cloud Storage green bucket (gs://finngen-production-library-green). These include, but are not limited to, GWAS summary statistics, fine-mapping, colocalization, and autoreporting results.
Note: you will not be able to run GWAS immediately after the first genotype and phenotype files are released in February and August, because the analysis team first needs to generate the covariate files for the new release. These are usually ready about midway between releases, and their availability is announced at the Users' meeting and on the FinnGen Community Slack.
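The schedule described above can be sketched as a small date calculation. This is only an illustration under stated assumptions: the function names are hypothetical, freezes are tracked at month granularity (February and August), and the "about 3 months" lag for core analysis results is taken literally. Actual dates are announced by FinnGen and may differ.

```python
from datetime import date
from typing import Tuple

FREEZE_MONTHS = (2, 8)        # February and August, as stated above
CORE_RESULTS_LAG_MONTHS = 3   # "about 3 months"; an approximation, not a guarantee


def next_freeze(today: date) -> Tuple[int, int]:
    """Return (year, month) of the first freeze month in or after today's month."""
    for month in FREEZE_MONTHS:
        if today.month <= month:
            return today.year, month
    # Past August: the next freeze is February of the following year.
    return today.year + 1, FREEZE_MONTHS[0]


def add_months(year: int, month: int, lag: int) -> Tuple[int, int]:
    """Shift a (year, month) pair forward by `lag` months."""
    index = year * 12 + (month - 1) + lag
    return index // 12, index % 12 + 1


def approx_core_results(year: int, month: int) -> Tuple[int, int]:
    """Approximate (year, month) when core analysis results follow a release."""
    return add_months(year, month, CORE_RESULTS_LAG_MONTHS)


if __name__ == "__main__":
    freeze = next_freeze(date.today())
    print("Next freeze (year, month):", freeze)
    print("Core results expected around:", approx_core_results(*freeze))
```

Working at month granularity avoids implying a specific day, which the schedule does not state.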
Timeline for Data Releases:
Data Release | Date of release to partners | Date of release to public | Total sample size [1] | Total endpoints | Core endpoints [2] |
---|---|---|---|---|---|
R2 | Q4 2018 | Q1 2020 | 96,499 | 1,485 | 1,485 |
R3 | Q2 2019 | Q2 2020 | 135,638 | 2,737 | 1,801 |
R4 | Q4 2019 | Q4 2020 | 176,899 | 3,452 | 2,444 |
R5 | Q2 2020 | Q2 2021 | 218,792 | 3,858 | 2,803 |
R6 | Q3 2020 | Q3 2021 | 260,405 | 3,995 | 2,861 |
R7 | Q1 2021 | Q1 2022 | 309,154 | 4,149 | 3,095 |
R8 | Q3 2021 | Q3 2022 | 342,499 | 4,431 | 2,202 |
R9 | Q1 2022 | ~Q2 2023 | 377,277 | 4,526 | 2,272 |
R10 | Q3 2022 | ~Q4 2023 | 412,181 | 4,519 | 2,408 |
R11 | Q1 2023 | ~Q2 2024 | ~445,000 | 4,415 | 2,444 |
R12 | Q3 2023 | ~Q4 2024 | ~480,000 | 4,421 | 2,469 |
R13 | Q1 2024 | ~Q1 2025 | ~500,000 | NA | NA |
[1] samples used for PheWAS; [2] endpoints used for PheWAS.
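As an illustration of working with the timeline table, the following sketch parses a few rows and computes how much the PheWAS sample size grew from one release to the next. The rows are copied from the table above (R2 to R10 only, since later sample sizes are approximate); the helper names `parse_sample_sizes` and `growth_per_release` are hypothetical.

```python
# Rows copied from the timeline table above (exact sample sizes only).
ROWS = """
R2 | Q4 2018 | Q1 2020 | 96,499 | 1,485 | 1,485 |
R3 | Q2 2019 | Q2 2020 | 135,638 | 2,737 | 1,801 |
R4 | Q4 2019 | Q4 2020 | 176,899 | 3,452 | 2,444 |
R5 | Q2 2020 | Q2 2021 | 218,792 | 3,858 | 2,803 |
R6 | Q3 2020 | Q3 2021 | 260,405 | 3,995 | 2,861 |
R7 | Q1 2021 | Q1 2022 | 309,154 | 4,149 | 3,095 |
R8 | Q3 2021 | Q3 2022 | 342,499 | 4,431 | 2,202 |
R9 | Q1 2022 | ~Q2 2023 | 377,277 | 4,526 | 2,272 |
R10 | Q3 2022 | ~Q4 2023 | 412,181 | 4,519 | 2,408 |
"""


def parse_sample_sizes(text):
    """Map release name -> total sample size (4th column of each row)."""
    sizes = {}
    for line in text.strip().splitlines():
        cells = [cell.strip() for cell in line.split("|")]
        sizes[cells[0]] = int(cells[3].replace(",", ""))
    return sizes


def growth_per_release(sizes):
    """Map each release (after the first) -> increase over the previous release."""
    releases = list(sizes)  # dicts preserve insertion order
    return {
        current: sizes[current] - sizes[previous]
        for previous, current in zip(releases, releases[1:])
    }


if __name__ == "__main__":
    for release, added in growth_per_release(parse_sample_sizes(ROWS)).items():
        print(f"{release}: +{added:,} samples")
```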
Release | Individuals with imputed genotypes | Individuals with minimum phenotypes | Individuals with endpoints | Individuals with detailed longitudinal data | Latest version (endpoint and detailed longitudinal data) |
---|---|---|---|---|---|
R1 | 52,295 | 53,866 | 53,866 | - | v4.0 |
R2 | 102,739 | 103,695 | 103,695 | - | v2.0 |
R3 | 146,630 | 152,796 | 152,796 | - | v1.0 |
R4 | 183,694 | 196,849 | 198,328 | 211,154 | v2.0 |
R5 | 224,737 | 224,650 | 224,566 | 224,438 | v3.0 |
R6 | 271,341 | 271,343 | 269,718 | 271,123 | v2.1 |
R7 | 321,464 | 321,464 | 321,302 | 320,953 | v4.0 and v2.0 |
R8 | 356,213 | 356,138 | 356,077 | 356,082 | v4.0 and v3.0 |
R9 | 392,649 | 392,649 | 392,423 | 329,539 | v1.0 |
R10 | 430,897 | 430,885 | 429,209 | 429,861 | v1.0 |
R11 | 473,681 | 473,681 | 473,681 | 473,580 | v1.0 |
R12 | 520,210 | 520,210 | 520,210 | 520,105 | v1.0 |
Number of individuals with genotypes and phenotypes in FinnGen Data Releases. Starting from R5, the phenotype data have been restricted to genotyped individuals. Endpoint data include individuals who have baseline data. Detailed longitudinal data include only those individuals who have register data available. Some individuals in the phenotype data have been removed in QC steps. Data files can be found in Sandbox under /finngen/library-red/
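To see what is available under /finngen/library-red/ from inside the Sandbox, a minimal sketch such as the following can list the top-level entries. It assumes only that the path mentioned above exists in your Sandbox session; the layout below that directory is not described here, so the code does not assume any particular folder names. The function name `list_entries` is hypothetical.

```python
from pathlib import Path

# Path quoted in the text above; only meaningful inside the Sandbox.
LIBRARY_RED = Path("/finngen/library-red")


def list_entries(root: Path = LIBRARY_RED):
    """Return the sorted names of entries directly under `root`.

    Returns an empty list if `root` is not an existing directory,
    for example when run outside the Sandbox.
    """
    if not root.is_dir():
        return []
    return sorted(entry.name for entry in root.iterdir())


if __name__ == "__main__":
    entries = list_entries()
    if not entries:
        print(f"No entries found under {LIBRARY_RED} (are you in the Sandbox?)")
    for name in entries:
        print(name)
```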
The following data files are released to Sandbox with each Data Release
Other registry data files (periodically updated and released to Sandbox)
Core Analysis Results files (released to FinnGen Production Library Green with each Data Release)
Here is the expected schedule for the next data freeze file releases.