Why are there differences in the GWAS results between Data Freezes/Releases?

Firstly, not all Data Freezes (DFs) or releases (Rs) are equal in terms of size and study subjects. The number of individuals in different DFs ranges from approximately 30,000 to 55,000, so the gain in statistical power is larger in some DFs than in others. Additionally, the proportion of legacy cohorts varies across DFs. This can cause differences in results, particularly for certain diseases, as some legacy cohorts include disease-specific sample sets. The section "What's new in DF[XX] endpoints" in the Handbook gives an overview of the legacy samples in each release from DF8 onwards. The same section also describes the changes and updates made to the endpoints. Details of the endpoint changes and updates are also available in the changelog files at https://www.finngen.fi/en/researchers/clinical-endpoints.
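To illustrate why sample size alone can change which associations reach significance, the following is a minimal sketch of an approximate power calculation. It is not part of the FinnGen pipeline. It assumes a standardized quantitative trait and an additive variant effect, which is a simplification because most FinnGen endpoints are binary. The function name `gwas_power` and all sample sizes, allele frequencies, and effect sizes below are hypothetical values chosen for illustration.

```python
from statistics import NormalDist


def gwas_power(n, maf, beta, alpha=5e-8):
    """Approximate power of a two-sided 1-df test for an additive variant effect.

    Assumption: standardized quantitative trait, so the standard error of the
    effect estimate is roughly 1 / sqrt(n * 2 * maf * (1 - maf)).
    """
    std_normal = NormalDist()
    # Critical z-value for the genome-wide significance threshold
    z_crit = std_normal.inv_cdf(1 - alpha / 2)
    # Expected z-score of the variant at this sample size
    expected_z = abs(beta) * (n * 2 * maf * (1 - maf)) ** 0.5
    # Probability that the observed |z| exceeds the critical value
    return std_normal.cdf(expected_z - z_crit) + std_normal.cdf(-expected_z - z_crit)


# Hypothetical sample sizes, allele frequency, and effect size
for n in (100_000, 150_000):
    print(n, round(gwas_power(n, maf=0.05, beta=0.05), 3))
```

Under these assumptions, the same variant has a higher probability of reaching genome-wide significance in the larger sample, which is one reason a signal can be absent in one release and present in another.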

Secondly, we have continuously improved the analysis pipelines throughout FinnGen. Major changes include the transition from SAIGE to Regenie in R7 and the update of the imputation panel from Sisu v3 to Sisu v4 in R8, and then to Sisu v4.2 in R10. The genotype QC process was also updated for R10. For detailed information on the tool and pipeline versions used, we recommend referring to the core analysis documents available alongside the GWAS results of each R in the green and red libraries, as well as in the FinnGen members area on finngen.fi (Documents/Analysis).

We recommend always using the latest dataset, as it reflects the most up-to-date information.
