Finemapping results format

The finemapping results come from two different finemapping methods: FINEMAP and SuSiE.

The purpose of finemapping is to find the set of one or more variants most likely to be responsible for the association at a locus. This set of likely variants is referred to as a "credible set". You can read more about the motivations for finemapping in the main concepts section: Finemapping.

The most severe transcript is chosen by first taking the most severe consequence among canonical protein-coding transcripts. If no canonical transcript exists, the first (effectively arbitrary) other protein-coding transcript carrying the most severe annotation is chosen. Severity precedence follows the Ensembl Variant Effect Predictor (VEP) default order.
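
The selection rule above can be sketched as follows. This is only an illustration under stated assumptions, not the pipeline's actual code: the dictionary keys (biotype, canonical, consequence), the function name and the example transcripts are hypothetical, and SEVERITY is a shortened subset of the VEP consequence order.

```python
# Shortened subset of VEP consequence terms, most severe first (illustrative only).
SEVERITY = [
    "transcript_ablation",
    "splice_acceptor_variant",
    "splice_donor_variant",
    "stop_gained",
    "frameshift_variant",
    "stop_lost",
    "start_lost",
    "missense_variant",
    "synonymous_variant",
    "intron_variant",
]
RANK = {name: i for i, name in enumerate(SEVERITY)}


def pick_most_severe(transcripts):
    """Return the chosen transcript, or None if there is no protein-coding transcript."""
    coding = [t for t in transcripts if t["biotype"] == "protein_coding"]
    canonical = [t for t in coding if t["canonical"]]
    # Prefer canonical transcripts; fall back to the other protein-coding ones.
    pool = canonical if canonical else coding
    if not pool:
        return None
    # min() keeps the first transcript among those with the most severe consequence.
    return min(pool, key=lambda t: RANK[t["consequence"]])


# Hypothetical annotations for a single variant.
transcripts = [
    {"id": "T1", "gene": "GENE_A", "biotype": "protein_coding",
     "canonical": False, "consequence": "stop_gained"},
    {"id": "T2", "gene": "GENE_A", "biotype": "protein_coding",
     "canonical": True, "consequence": "missense_variant"},
    {"id": "T3", "gene": "GENE_A", "biotype": "lncRNA",
     "canonical": False, "consequence": "intron_variant"},
]
chosen = pick_most_severe(transcripts)
print(chosen["id"], chosen["consequence"])
```

In this example the canonical transcript T2 is chosen even though the non-canonical T1 carries a more severe consequence, because canonical protein-coding transcripts take precedence.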

SuSiE outputs

Both 95% and 99% credible sets are provided. Files with _99 in their name contain the 99% credible sets and use the same columns as described below. The SuSiE outputs have been annotated using the variant annotation file.
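
To illustrate the difference between the two coverage levels, here is a minimal sketch of how a credible set can be built for a single signal: rank variants by posterior probability and add them until the cumulative probability reaches the requested coverage. The function name and the probabilities are made up for the example; SuSiE's own procedure also applies purity filtering, which is not shown.

```python
def credible_set(probs, coverage):
    """Smallest set of top-ranked variants whose summed probability reaches `coverage`."""
    ranked = sorted(probs.items(), key=lambda kv: kv[1], reverse=True)
    chosen, total = [], 0.0
    for variant, prob in ranked:
        chosen.append(variant)
        total += prob
        if total >= coverage:
            break
    return chosen


# Hypothetical posterior probabilities for one signal (they sum to 1).
probs = {"v1": 0.70, "v2": 0.20, "v3": 0.06, "v4": 0.035, "v5": 0.005}
cs95 = credible_set(probs, 0.95)
cs99 = credible_set(probs, 0.99)
print(len(cs95), len(cs99))
```

With these numbers the 99% set needs one more variant than the 95% set, which is why the _99 files generally report larger credible sets.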

PHENONAME.SUSIE.cred.bgz and PHENONAME.SUSIE_99.cred.bgz

Contains credible set summaries from SuSiE fine-mapping for all genome-wide significant regions.

| Column | Description |
| --- | --- |
| region | Region for which the fine-mapping was run |
| cs | Running number for independent credible sets in a region |
| cs_log10bf | Log10 Bayes factor comparing the solution of this model (cs independent credible sets) to cs - 1 credible sets |
| cs_avg_r2 | Average correlation R2 between variants in the credible set |
| cs_min_r2 | Minimum R2 between variants in the credible set |
| cs_size | How many SNPs the credible set contains |
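
A minimal sketch for loading such a file with the Python standard library. It assumes the .bgz files are bgzip-compressed (and therefore readable as gzip), tab-delimited, and start with a header row; none of this is stated above, so check it against your files. The helper name and the sample rows, including the region string, are illustrative.

```python
import csv
import gzip
import io


def read_bgz_table(source):
    """Parse a gzip/bgzip-compressed, tab-delimited table that has a header row.

    `source` may be a file path or a binary file-like object.
    """
    with gzip.open(source, mode="rt", newline="") as handle:
        return list(csv.DictReader(handle, delimiter="\t"))


# In-memory stand-in for a PHENONAME.SUSIE.cred.bgz file (made-up values).
sample = (
    "region\tcs\tcs_log10bf\tcs_avg_r2\tcs_min_r2\tcs_size\n"
    "chr1:1000-2000\t1\t12.5\t0.98\t0.91\t4\n"
    "chr1:1000-2000\t2\t3.1\t0.40\t0.12\t57\n"
)
compressed = io.BytesIO(gzip.compress(sample.encode()))

rows = read_bgz_table(compressed)
# csv returns strings, so convert before comparing numerically.
small_sets = [r for r in rows if int(r["cs_size"]) <= 10]
print(len(rows), len(small_sets))
```

For a file on disk, pass its path instead of the in-memory buffer, for example read_bgz_table("PHENONAME.SUSIE.cred.bgz").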

PHENONAME.SUSIE.cred.summary.tsv and PHENONAME.SUSIE_99.cred.summary.tsv

Summary of credible sets where the top variant for each CS is included.

| Column | Description |
| --- | --- |
| trait | Phenotype |
| region | Region for which the fine-mapping was run |
| cs | Running number for independent credible sets in a region |
| cs_log10bf | Log10 Bayes factor comparing the solution of this model (cs independent credible sets) to cs - 1 credible sets |
| cs_avg_r2 | Average correlation R2 between variants in the credible set |
| cs_min_r2 | Minimum R2 between variants in the credible set |
| low_purity | Boolean (TRUE, FALSE) indicating whether the CS has low purity (low minimum R2) |
| cs_size | How many SNPs the credible set contains |
| good_cs | Boolean (TRUE, FALSE) indicating whether this CS is considered reliable. If FALSE, the top variant reported for the CS is chosen by minimum p-value in the credible set; otherwise it is chosen by maximum PIP |
| cs_id | Credible set ID |
| v | Top variant (chr:pos:ref:alt) |
| p | Top variant p-value |
| beta | Top variant beta |
| sd | Top variant standard deviation |
| prob | Overall PIP of the variant in the region |
| cs_specific_prob | PIP of the variant in the current credible set (this and prob are typically almost identical) |
| 0..n | Configured annotation columns. Typical defaults are most_severe and gene_most_severe, giving the consequence and gene of the top variant |
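
The top-variant rule described for good_cs can be sketched as below. This is an illustration of the stated rule, not the pipeline's code; the function name, the pip key and the example variants are hypothetical.

```python
def top_variant(variants, good_cs):
    """Pick the reported top variant of one credible set.

    good_cs True  -> variant with the maximum PIP
    good_cs False -> variant with the minimum p-value
    """
    if good_cs:
        return max(variants, key=lambda r: r["pip"])
    return min(variants, key=lambda r: r["p"])


# Hypothetical members of a single credible set.
variants = [
    {"v": "1:100:A:G", "p": 1e-12, "pip": 0.30},
    {"v": "1:200:C:T", "p": 5e-10, "pip": 0.65},
]
reliable_top = top_variant(variants, good_cs=True)
unreliable_top = top_variant(variants, good_cs=False)
print(reliable_top["v"], unreliable_top["v"])
```

The two rules can disagree, as they do here, so it is worth checking good_cs before interpreting the v, p and beta columns.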

PHENONAME.SUSIE.snp.bgz and PHENONAME.SUSIE_99.snp.bgz

Contains variant summaries with credible set information.

| Column | Description |
| --- | --- |
| trait | Phenotype |
| region | Region for which the fine-mapping was run |
| v, rsid | Variant IDs |
| chromosome | Chromosome no. |
| position | Position on the chromosome |
| allele1 | Major allele |
| allele2 | Minor allele |
| maf | Minor allele frequency |
| beta | Original marginal beta |
| se | Original standard error |
| p | Original p-value |
| mean | Posterior mean beta after fine-mapping |
| sd | Posterior standard deviation after fine-mapping |
| prob | Posterior inclusion probability |
| cs | Credible set index within region |
| lead_r2 | R2 between the variant and the lead variant (the one with maximum PIP) of its credible set |
| alphax | Posterior inclusion probability for the xth single effect (x = 1..L, where L is the number of single effects/causal variants specified; default L = 10) |
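
As a rough guide to how the per-effect alphax columns relate to an overall posterior inclusion probability: in the SuSiE model a variant is included if at least one single effect selects it, so the overall PIP is one minus the product of (1 - alpha) over the single effects. The sketch below assumes this relationship; the function name and values are illustrative, and the reported prob column may differ slightly if the software excludes some single effects from the product.

```python
import math


def pip_from_alphas(alphas):
    """Overall PIP = 1 - prod(1 - alpha_x) over the single effects."""
    return 1.0 - math.prod(1.0 - a for a in alphas)


# Hypothetical alpha1..alpha3 values for one variant.
alphas = [0.5, 0.2, 0.0]
pip = pip_from_alphas(alphas)
print(round(pip, 6))
```

A variant with a sizeable alpha in more than one single effect therefore gets a PIP higher than any of its individual alpha values.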

PHENONAME.SUSIE.snp.filter.tsv and PHENONAME.SUSIE_99.snp.filter.tsv

SNPs that are part of good-quality credible sets, as reported in the PHENONAME.SUSIE.cred.summary.tsv file.

| Column | Description |
| --- | --- |
| trait | Phenotype |
| region | Region for which the fine-mapping was run |
| v | Variant ID (chr:pos:ref:alt) |
| cs | Running credible set ID within region |
| cs_specific_prob | Posterior inclusion probability for this CS |
| chromosome | Chromosome no. |
| position | Position on the chromosome |
| allele1 | Major allele |
| allele2 | Minor allele |
| maf | Minor allele frequency |
| beta | Original association beta |
| p | Original p-value |
| se | Original standard error |
| most_severe | Most severe consequence of the variant |
| gene_most_severe | Gene corresponding to most severe consequence |
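
Because cs is a running number within a region, a credible set is only uniquely identified by the combination of trait, region and cs. A minimal sketch of grouping rows of this table into credible sets, assuming the rows have already been parsed into dictionaries keyed by column name (the trait name, regions and variants below are made up):

```python
from collections import defaultdict


def group_by_credible_set(rows):
    """Map (trait, region, cs) -> list of variant IDs in that credible set."""
    groups = defaultdict(list)
    for row in rows:
        groups[(row["trait"], row["region"], row["cs"])].append(row["v"])
    return dict(groups)


# Hypothetical parsed rows; only the columns needed for grouping are shown.
rows = [
    {"trait": "PHENO", "region": "chr1:1000-2000", "cs": "1", "v": "1:1100:A:G"},
    {"trait": "PHENO", "region": "chr1:1000-2000", "cs": "1", "v": "1:1200:C:T"},
    {"trait": "PHENO", "region": "chr2:5000-6000", "cs": "1", "v": "2:5100:G:A"},
]
groups = group_by_credible_set(rows)
print(len(groups))
```

Grouping on cs alone would wrongly merge the two sets above, since both carry cs = 1.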

FINEMAP outputs

PHENONAME.FINEMAP.config.bgz

Summary of fine-mapping variant configurations from the FINEMAP method.

| Column | Description |
| --- | --- |
| trait | Phenotype |
| region | Region for which the fine-mapping was run |
| rank | Rank of this configuration within a region |
| config | Causal variants in this configuration |
| prob | Probability across all n independent signal configurations |
| log10bf | Log10 Bayes factor for this configuration |
| odds | Odds for this configuration |
| k | How many independent signals are in this configuration |
| prob_norm_k | Probability of this configuration within k independent signals solution |
| h2 | SNP heritability of this solution |
| #NAME? | 95% confidence interval limits of SNP heritability of this solution |
| mean | Marginalized shrinkage estimates of the posterior effect size mean |
| sd | Marginalized shrinkage estimates of the posterior effect standard deviation |
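
One way to read prob_norm_k is as prob renormalised among configurations that share the same number of signals k. The sketch below shows that interpretation on made-up values; it is an assumption about how the column relates to prob, and values recomputed from a file will only match if the file lists every configuration for a given k.

```python
from collections import defaultdict


def normalise_within_k(configs):
    """Add prob_norm_k: each prob divided by the total prob of configs with the same k."""
    totals = defaultdict(float)
    for c in configs:
        totals[c["k"]] += c["prob"]
    return [{**c, "prob_norm_k": c["prob"] / totals[c["k"]]} for c in configs]


# Hypothetical configurations for one region.
configs = [
    {"config": "rs1", "k": 1, "prob": 0.2},
    {"config": "rs2", "k": 1, "prob": 0.2},
    {"config": "rs1,rs3", "k": 2, "prob": 0.6},
]
normalised = normalise_within_k(configs)
for c in normalised:
    print(c["config"], c["k"], c["prob_norm_k"])
```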

PHENONAME.FINEMAP.region.bgz

Summary statistics on the number of independent signals in each region.

| Column | Description |
| --- | --- |
| trait | Phenotype |
| region | Region for which the fine-mapping was run |
| h2g_snp or h2g | SNP heritability of this region |
| h2g_sd | Standard deviation of SNP heritability of this region |
| h2g_lower95 | Lower limit of 95% CI for SNP heritability |
| h2g_upper95 | Upper limit of 95% CI for SNP heritability |
| log10bf | Log10 Bayes factor compared against null (no signals in the region) |
| prob_xSNP | x columns for probabilities of different numbers of independent signals |
| expectedvalue | Expectation (average) of the number of signals |
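
The expectedvalue column can be understood as the probability-weighted average of the number of signals, that is, the sum over x of x times prob_xSNP. A minimal sketch, assuming the probability columns are literally named prob_1SNP, prob_2SNP and so on (the exact names and the range of x in your files may differ; the row below is made up):

```python
def expected_signals(row):
    """Sum of x * P(x signals) over all prob_<x>SNP columns in a parsed row."""
    total = 0.0
    for name, value in row.items():
        if name.startswith("prob_") and name.endswith("SNP"):
            x = int(name[len("prob_"):-len("SNP")])
            total += x * float(value)
    return total


# Hypothetical parsed row; values are strings, as they would come from a TSV reader.
row = {
    "region": "chr1:1000-2000",
    "prob_1SNP": "0.25",
    "prob_2SNP": "0.5",
    "prob_3SNP": "0.25",
}
expected = expected_signals(row)
print(expected)
```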

PHENONAME.FINEMAP.snp.bgz

Summary statistics of variants and what credible set they are most likely to belong to.

| Column | Description |
| --- | --- |
| trait | Phenotype |
| region | Region for which the fine-mapping was run |
| v | Variant |
| index | Running index |
| rsid | Variant ID |
| chromosome | Chromosome no. |
| position | Position on the chromosome |
| allele1 | Major allele |
| allele2 | Minor allele |
| maf | Minor allele frequency |
| beta | Original marginal beta (effect size) |
| se | Original standard error |
| z | Original z-score |
| prob | Posterior inclusion probability |
| log10bf | Log10 Bayes factor |
| mean | Marginalized shrinkage estimates of the posterior effect size mean |
| sd | Marginalized shrinkage estimates of the posterior effect standard deviation |
| mean_incl | Conditional estimates of the posterior effect size mean |
| sd_incl | Conditional estimates of the posterior effect size standard deviation |
| p | Original p-value |
| csx | Credible set index for given number of causal variants x |
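
The beta, se, z and p columns are related in the usual way for marginal association statistics. The sketch below assumes a Wald z-score (beta divided by se) and a two-sided p-value under a standard normal approximation; the p-values in the file come from the original association test and may not match this approximation exactly. Function names and input values are illustrative.

```python
import math


def z_score(beta, se):
    """Wald z-score from an effect size and its standard error."""
    return beta / se


def two_sided_p(z):
    """Two-sided p-value for a z-score under a standard normal distribution."""
    return math.erfc(abs(z) / math.sqrt(2.0))


# Hypothetical marginal estimates for one variant.
z = z_score(0.12, 0.02)
p = two_sided_p(z)
print(z, p)
```

math.erfc is used instead of 1 minus a normal CDF because it stays accurate for the very small p-values typical of genome-wide significant variants.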
