Meta-analysis of FinnGen with UK and Estonian Biobanks

Overview

As with FinnGen, genotyping populations from specific geographical regions can provide insight into their genetic architecture. The UK Biobank (UKBB) and the Estonian Biobank (EstBB) are two such resources, and we use them to strengthen the analytical results from FinnGen.

GWAS summary statistics from matching phenotypes (assessed by ICD-10 code overlap) between studies, i.e. FinnGen-UKBB and FinnGen-UKBB-EstBB, are meta-analyzed with a custom workflow using the inverse-variance weighted method. Heterogeneity between studies is assessed with Cochran's Q test, and its p-value is reported for each variant. The meta-analysis summary statistics and the LD-clumping-based autoreporting outputs (see the topic Autoreporting - information of overlaps) are released in the green library, in the analysis data for a given release. The files are in .tsv (tab-separated values) format.
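The inverse-variance weighted estimate and Cochran's Q can be sketched for a single variant as follows. This is a minimal illustration of the textbook fixed-effect formulas, not the custom workflow itself. The function name `ivw_meta` and the returned keys are hypothetical, and the heterogeneity p-value is only implemented for two or three studies (1 or 2 degrees of freedom), which covers the two meta-analyses described here.

```python
import math


def ivw_meta(betas, ses):
    """Fixed-effect inverse-variance weighted meta-analysis of one variant.

    betas, ses: per-study effect sizes and standard errors, all for the
    same effect allele. Hypothetical helper, for illustration only.
    """
    weights = [1.0 / se ** 2 for se in ses]
    w_sum = sum(weights)

    # Pooled effect: weighted mean of the per-study betas.
    beta = sum(w * b for w, b in zip(weights, betas)) / w_sum
    se = math.sqrt(1.0 / w_sum)

    # Two-sided p-value from the normal distribution.
    p = math.erfc(abs(beta / se) / math.sqrt(2))

    # Cochran's Q: weighted squared deviations from the pooled effect,
    # chi-square distributed with (number of studies - 1) degrees of freedom.
    q = sum(w * (b - beta) ** 2 for w, b in zip(weights, betas))
    df = len(betas) - 1
    if df == 1:
        het_p = math.erfc(math.sqrt(q / 2))  # chi-square survival, df = 1
    elif df == 2:
        het_p = math.exp(-q / 2)             # chi-square survival, df = 2
    else:
        raise ValueError("this sketch supports two or three studies only")

    return {"beta": beta, "se": se, "p": p, "q": q, "het_p": het_p}


two_way = ivw_meta([0.10, 0.20], [0.05, 0.05])
three_way = ivw_meta([0.10, 0.10, 0.10], [0.05, 0.04, 0.08])
```

A small heterogeneity p-value suggests that the per-study effects differ more than sampling error alone would explain.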

Fine-mapping is not performed because the methods cannot handle heterogeneity between studies (e.g. different LD structure or different imputation quality of variants).

Summary statistics preprocessing

Prior to meta-analysis, the summary statistics are first filtered to remove variants with missing or unrealistic values (e.g. beta, standard error or allele frequency).
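A filter of this kind might look like the sketch below. The function name `is_valid_variant` and the `max_abs_beta` cutoff are assumptions made for illustration; the actual criteria and thresholds used in the workflow are not stated here.

```python
import math


def is_valid_variant(beta, se, af, max_abs_beta=10.0):
    """Return True if a variant's summary statistics look usable.

    max_abs_beta is a hypothetical cutoff, not the workflow's real value.
    """
    values = (beta, se, af)
    # Reject missing or non-finite values (None, NaN, +/-inf).
    if any(v is None or not math.isfinite(v) for v in values):
        return False
    # A standard error must be strictly positive.
    if se <= 0:
        return False
    # An allele frequency must lie strictly between 0 and 1.
    if not 0.0 < af < 1.0:
        return False
    # Reject implausibly large effect sizes.
    return abs(beta) <= max_abs_beta


rows = [
    (0.10, 0.02, 0.30),
    (float("nan"), 0.02, 0.30),
    (0.10, 0.00, 0.30),
    (0.10, 0.02, 1.20),
]
kept = [row for row in rows if is_valid_variant(*row)]
```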

Second, if needed, variants are lifted to genome build GRCh38 to match the FinnGen variants. Liftover is performed with Picard which, in addition to standard genome position liftover, checks that variants match the reference genome and flips alleles/strand if necessary. Variants that fail to liftover for any reason (e.g. variant doesn’t match reference, or no target for variant in new build) are discarded.

Third, variants are aligned with the gnomAD reference, and their allele frequencies are compared with those of the matching gnomAD population.
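The alignment step can be illustrated with the sketch below, which expresses a variant's effect and allele frequency in terms of the reference panel's alternate allele. The function names and the logic are assumptions for illustration, not the workflow's implementation. In particular, strand-ambiguous (palindromic) variants such as A/T and C/G are not treated specially here, although a real pipeline has to decide how to handle them.

```python
COMPLEMENT = {"A": "T", "C": "G", "G": "C", "T": "A"}


def reverse_complement(allele):
    """Return the allele as read from the opposite strand."""
    return "".join(COMPLEMENT[base] for base in reversed(allele))


def align_to_reference(ref, alt, beta, af, panel_ref, panel_alt):
    """Return (beta, af) for the panel's alt allele, or None if no match.

    Hypothetical helper: tries the variant as given, then strand-flipped,
    and in each case checks for swapped ref/alt alleles.
    """
    for flip in (False, True):
        r = reverse_complement(ref) if flip else ref
        a = reverse_complement(alt) if flip else alt
        if (r, a) == (panel_ref, panel_alt):
            return beta, af
        if (r, a) == (panel_alt, panel_ref):
            # Alleles are swapped: the effect changes sign and the
            # frequency refers to the other allele.
            return -beta, 1.0 - af
    return None


def af_difference(af, panel_af):
    """Absolute allele frequency difference from the reference population."""
    return abs(af - panel_af)


same = align_to_reference("A", "G", 0.2, 0.3, "A", "G")
swapped = align_to_reference("G", "A", 0.2, 0.3, "A", "G")
flipped = align_to_reference("T", "C", 0.2, 0.3, "A", "G")
unmatched = align_to_reference("A", "C", 0.2, 0.3, "A", "G")
```

A large `af_difference` after alignment can indicate an allele mix-up or a population mismatch and is worth inspecting before the variant enters the meta-analysis.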

FinnGen-UKBB meta-analysis

Endpoints from the pan-UKBB study are matched to FinnGen endpoints based on ICD-10 code overlap and meta-analysed together. Only perfectly matching endpoints are analysed, and the European subset of the pan-UKBB study is used. Most analysed endpoints are ones that already match, but we are continuously creating and analysing custom endpoints from the pan-UKBB study to match our FinnGen endpoints.
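Perfect matching by ICD-10 codes can be pictured as set equality between endpoint definitions, as in the sketch below. The function name `match_endpoints`, the endpoint names and the code sets are all made up for illustration; the real matching procedure may differ in detail.

```python
def match_endpoints(finngen, other):
    """Pair endpoints whose ICD-10 code sets are identical.

    finngen, other: dicts mapping an endpoint name to a set of ICD-10 codes.
    Returns a list of (finngen_name, other_name) pairs.
    """
    # Index the other study's endpoints by their exact code set.
    by_codes = {}
    for name, codes in other.items():
        by_codes.setdefault(frozenset(codes), []).append(name)

    matches = []
    for name, codes in finngen.items():
        for other_name in by_codes.get(frozenset(codes), []):
            matches.append((name, other_name))
    return matches


# Hypothetical endpoint definitions.
finngen_endpoints = {"EP_A": {"E11"}, "EP_B": {"J45", "J46"}}
ukbb_endpoints = {"pheno_1": {"E11"}, "pheno_2": {"J45"}}

pairs = match_endpoints(finngen_endpoints, ukbb_endpoints)
```

Here `EP_B` is left unmatched because its code set only partially overlaps with `pheno_2`, which mirrors the rule that only perfectly matching endpoints are analysed.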

Note: In the FinnGen R6 - UKBB meta-analysis, a simpler liftover (UCSC LiftOver) was used, with no reference alignment.

FinnGen R12 - UKBB meta-analysis results:

  • in the Sandbox:

/finngen/library-green/finngen_R12/finngen_R12_analysis_data/meta_analysis/ukbb/

FinnGen-UKBB-EstBB meta-analysis

Selected FinnGen endpoints are matched by ICD codes in the Estonian Biobank, and GWAS summary statistics are produced for them. A 3-way meta-analysis is then performed with the matching endpoints from FinnGen and pan-UKBB.

FinnGen R12 - UKBB - EstBB meta-analysis results:

  • in the Sandbox:

/finngen/library-green/finngen_R12/finngen_R12_analysis_data/meta_analysis/ukbb_estbb/

Read more about the file formats of the FinnGen-UKBB and FinnGen-UKBB-EstBB meta-analysis results.
