Fine-mapping

We used two state-of-the-art methods, FINEMAP (Benner, C. et al., 2016; Benner, C. et al., 2018) and SuSiE (Wang, G. et al., 2020), to fine-map genome-wide significant loci in FinnGen endpoints.

Briefly, there are three main steps:

1. Preprocessing

For each genome-wide significant locus (default configuration: P < 5e-8), we define a fine-mapping region by taking a 3 Mb window around the lead variant, and we merge regions that overlap. If a merged window exceeds 10 Mb, we iteratively shrink the window by 10% until the merged window either fits within 10 Mb or splits into merged windows that each fit within 10 Mb. We then split the input GWAS summary statistics into separate files, one per region, for the following steps.
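The region definition above can be sketched in a few lines of Python. This is an illustrative sketch, not the pipeline's actual code: it assumes the 3 Mb window is centred on the lead variant (1.5 Mb on each side), that shrinking is applied to every window on the chromosome at once, and that all positions belong to a single chromosome. The function names `merge_windows` and `define_regions` are hypothetical.

```python
MB = 1_000_000


def merge_windows(leads, window):
    """Centre a window of `window` bp on each lead position and merge overlaps."""
    half = window // 2
    intervals = sorted((max(0, pos - half), pos + half) for pos in leads)
    merged = []
    for start, end in intervals:
        if merged and start <= merged[-1][1]:
            # Overlaps (or touches) the previous region: extend it.
            merged[-1] = (merged[-1][0], max(merged[-1][1], end))
        else:
            merged.append((start, end))
    return merged


def define_regions(leads, window=3 * MB, max_size=10 * MB, shrink_pct=10):
    """Shrink the window by `shrink_pct` percent until every merged region fits."""
    regions = merge_windows(leads, window)
    while any(end - start > max_size for start, end in regions):
        # Integer arithmetic keeps the window sizes exact and reproducible.
        window = window * (100 - shrink_pct) // 100
        regions = merge_windows(leads, window)
    return regions


# Two nearby leads merge into one region; a distant lead stays separate.
print(define_regions([5 * MB, 6 * MB, 20 * MB]))

# Leads spaced every 2 Mb chain into one 17 Mb region at the default window,
# so the window is shrunk until the chain breaks apart.
print(define_regions([p * MB for p in range(10, 25, 2)]))
```

In the second call, the window has to shrink four times (to 1,968,300 bp) before neighbouring windows stop overlapping, at which point each lead variant gets its own region.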

2. LD computation

We compute in-sample dosage LD using LDstore2 for each fine-mapping region.
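To make "dosage LD" concrete: the LD between two variants is taken here to be the Pearson correlation between their per-sample genotype dosages (values between 0 and 2). The sketch below computes this in plain Python for a handful of samples. It is only an illustration of the quantity, under that assumption; it does not reproduce LDstore2, and the function names are hypothetical.

```python
import math


def pearson_r(x, y):
    """Pearson correlation between two equal-length dosage vectors."""
    n = len(x)
    mean_x, mean_y = sum(x) / n, sum(y) / n
    sxy = sum((a - mean_x) * (b - mean_y) for a, b in zip(x, y))
    sxx = sum((a - mean_x) ** 2 for a in x)
    syy = sum((b - mean_y) ** 2 for b in y)
    # Note: a monomorphic variant (zero variance) would divide by zero here.
    return sxy / math.sqrt(sxx * syy)


def ld_matrix(dosages):
    """dosages: one list of per-sample dosages per variant. Returns an r matrix."""
    m = len(dosages)
    return [[pearson_r(dosages[i], dosages[j]) for j in range(m)] for i in range(m)]


# Three variants, four samples (toy data, not FinnGen data).
toy = [
    [0.0, 1.0, 2.0, 1.5],
    [0.0, 1.0, 2.0, 1.5],  # identical to the first variant
    [2.0, 1.0, 0.0, 0.5],  # perfectly anti-correlated with the first
]
for row in ld_matrix(toy):
    print([round(r, 3) for r in row])
```

Fine-mapping methods take the signed correlation r as input, while the PheWeb table reports r squared.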

3. Fine-mapping

Using the summary statistics and in-sample LD from steps 1 and 2, we conduct fine-mapping with FINEMAP and SuSiE, allowing at most L = 10 causal variants per locus.
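As a rough illustration of how SuSiE output relates to the PIPs and credible sets used later on this page: SuSiE models up to L single effects, and for each effect it gives every variant a posterior probability of being that effect's causal variant. The sketch below, which assumes such a matrix `alpha` (L rows, one column per variant) is already available, derives per-variant PIPs as 1 minus the product over effects of (1 - alpha), and builds a credible set for one effect by accumulating the largest probabilities. The 95% coverage and the function names are assumptions for the example; SuSiE's additional purity filtering is not shown.

```python
def pips(alpha):
    """Posterior inclusion probability per variant from single-effect posteriors."""
    n_variants = len(alpha[0])
    result = []
    for j in range(n_variants):
        prob_in_no_effect = 1.0
        for effect in alpha:
            prob_in_no_effect *= 1.0 - effect[j]
        result.append(1.0 - prob_in_no_effect)
    return result


def credible_set(alpha_row, coverage=0.95):
    """Smallest set of variant indices whose probabilities sum to >= coverage."""
    order = sorted(range(len(alpha_row)), key=lambda j: alpha_row[j], reverse=True)
    selected, total = [], 0.0
    for j in order:
        selected.append(j)
        total += alpha_row[j]
        if total >= coverage:
            break
    return selected


# Toy example with L = 2 effects and 3 variants (made-up numbers).
alpha = [
    [0.9, 0.1, 0.0],
    [0.0, 0.5, 0.5],
]
print(pips(alpha))
print(credible_set(alpha[0]))
```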

Notes

Because they had too many variants and overly large LD regions, the following endpoints were run with a refined region-selection algorithm that caps each region at a set maximum length:

  • G6_MUSDYST

  • G6_MYONEU

  • Q17_CONGEN_MALFO_GALLB_BILE_DUCTS_LIVER

  • Q17_OTHER_CONGEN_MALFO_DIGES_SYSTEM1

  • Q17_OTHER_CONGEN_MALFO_SKIN

  • RX_STATIN

  • BMI_IRN

  • HEIGHT_IRN

Integration to PheWeb

The "Credible Sets" table on a phenotype page in the R10 PheWeb browser shows the SuSiE fine-mapped credible sets for that phenotype. The variant shown for each credible set is the variant with the maximum PIP (posterior inclusion probability) in that set. In addition to the credible set variants, variants that were in sufficient LD with the lead variant (Pearson r^2 > 0.05), had a small enough p-value (pval < 0.01), and were close enough to the lead variant (distance to lead variant < 1.5 megabases) were clumped together with the credible set. Variants were annotated and compared against GWAS Catalog. The LD grouping, annotation and GWAS Catalog comparison were done using the autoreporting pipeline.
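The selection of the displayed variant and the clumping thresholds described above can be expressed as a short filter. This is a minimal sketch under stated assumptions, not the autoreporting pipeline's code: the `Variant` record, its field names, and the two helper functions are hypothetical, and the r^2 values to the lead variant are assumed to be precomputed.

```python
from dataclasses import dataclass


@dataclass
class Variant:
    id: str
    pos: int            # base-pair position on the chromosome
    pval: float         # GWAS p-value
    r2_to_lead: float   # Pearson r^2 with the lead (top PIP) variant
    pip: float = 0.0    # posterior inclusion probability
    in_cs: bool = False # True if the variant belongs to the credible set


def top_pip_variant(credible_set):
    """The variant displayed for a credible set: the one with the largest PIP."""
    return max(credible_set, key=lambda v: v.pip)


def ld_partners(lead, candidates, min_r2=0.05, max_pval=0.01, max_dist=1_500_000):
    """Non-credible-set variants clumped with the set, using the thresholds above."""
    return [
        v for v in candidates
        if not v.in_cs
        and v.r2_to_lead > min_r2
        and v.pval < max_pval
        and abs(v.pos - lead.pos) < max_dist
    ]


# Toy variants (made-up identifiers and values).
cs = [
    Variant("v1", 1_000_000, 1e-12, 1.0, pip=0.7, in_cs=True),
    Variant("v2", 1_000_500, 1e-11, 0.9, pip=0.3, in_cs=True),
]
others = [
    Variant("v3", 1_200_000, 1e-4, 0.20),  # passes all three thresholds
    Variant("v4", 1_300_000, 0.05, 0.30),  # p-value too large
    Variant("v5", 3_000_000, 1e-6, 0.10),  # too far from the lead variant
    Variant("v6", 1_100_000, 1e-5, 0.01),  # LD too weak
]
lead = top_pip_variant(cs)
print(lead.id, [v.id for v in ld_partners(lead, cs + others)])
```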

The columns of the table are explained below:

| Column name | Explanation |
| --- | --- |
| top PIP variant | Variant with the largest PIP in the credible set. Click the arrow to the left of the variant to show the credible set variants. |
| CS quality | Shows whether the credible set is well-formed. A 'true' value means that the credible set is likely trustworthy, and a 'false' value means that it is likely not trustworthy. |
| chromosome | The chromosome in which the credible set lies. |
| position | The position of the lead variant. |
| p-value | p-value of the top PIP variant. |
| -log10(p) | -log10(p-value) of the top PIP variant. |
| effect size (beta) | Effect size of the top PIP variant. |
| Finnish Enrichment | Finnish enrichment of the top PIP variant. |
| Alternate allele frequency | Alternate allele frequency of the top PIP variant. |
| Lead Variant Gene | A probable gene of the top PIP variant. |
| # coding in cs | Number of coding variants in the credible set. Hover over the number to see the variant, the consequence, and the correlation (Pearson r squared) to the lead variant. |
| # credible variants | Number of variants in the credible set. |
| Credible set bayes factor (log10) | The Bayes factor related to the credible set. |
| CS matching Traits | Number of matches found in GWAS Catalog for the credible set variants. Hover over the number to see the trait, as well as the associated variant's LD (Pearson r squared) to the lead variant. |
| LD Partner Traits | Number of matches found in GWAS Catalog for the group of credible variants and variants in LD with the top PIP variant. Hover over the number to see the trait, as well as the associated variant's LD (Pearson r squared) to the lead variant. |
