Autoreporting results format

Autoreporting outputs two types of tab-separated results: variant reports and group reports. Group reports describe the groups built around credible sets, with one credible set per row. Each row carries information about the credible set along with a set of annotations.

The variant reports list all of the variants in the group reports' groups. These include the credible set variants as well as the variants that were LD clumped together with the group lead variant. The variants are also annotated with a set of annotations. For more information about the Autoreporting tool and how it works, see Autoreporting in FinnGen.

Group reports

The columns are:

| Column | Description |
| --- | --- |
| phenotype | Phenotype description |
| phenotype_abbreviation | Phenotype name |
| locus_id | Locus id (credible set max PIP variant), formatted as chrCHROM_POS_REF_ALT |
| rsids | rsids matching the lead variant |
| Cases | Number of cases in the phenotype |
| Controls | Number of controls in the phenotype |
| chrom | Chromosome |
| pos | Position of the lead variant |
| ref | Reference allele of the lead variant |
| alt | Alternate allele of the lead variant |
| pval | p-value of the lead variant |
| lead_r2_threshold | r2 threshold used for LD clumping variants to the lead variant |
| lead_beta_previous_release | Lead variant effect size in the previous release |
| lead_pval_previous_release | Lead variant p-value in the previous release |
| lead_most_severe_consequence | Lead variant most severe consequence, taken from the FinnGen variant annotation |
| lead_most_severe_gene | Gene of the lead variant's most severe consequence, taken from the FinnGen variant annotation |
| lead_mlogp | Lead variant p-value, transformed to -log10(pval) |
| lead_beta | Lead variant effect size |
| lead_sebeta | Lead variant standard error of the effect size |
| lead_af_alt | Lead variant alternate allele frequency |
| lead_af_alt_cases | Lead variant alternate allele frequency in cases |
| lead_af_alt_controls | Lead variant alternate allele frequency in controls |
| gnomAD_functional_category | Functional category of the lead variant (LoF, pLoF etc.), reported only if the variant has a functional effect |
| gnomAD_enrichment_nfsee | Lead variant Finnish enrichment against the non-Finnish, non-Swedish, non-Estonian European population, from gnomAD 2.1 exome data |
| gnomAD_fin.AF | Finnish allele frequency in gnomAD 2.1 exome data |
| gnomAD_fin.AN | Finnish allele number in gnomAD 2.1 exome data |
| gnomAD_fin.AC | Finnish allele count in gnomAD 2.1 exome data |
| gnomAD_fin.homozygote_count | Finnish homozygote count in gnomAD 2.1 exome data |
| gnomAD_fet_nfsee.odds_ratio | Odds ratio from Fisher's exact test for enrichment in the Finnish vs the non-Finnish, non-Estonian, non-Swedish European population, from gnomAD 2.1 exome data |
| gnomAD_fet_nfsee.p_value | p-value from Fisher's exact test for enrichment in the Finnish vs the non-Finnish, non-Estonian, non-Swedish European population, from gnomAD 2.1 exome data |
| gnomad_nfsee.AC | gnomAD 2.1 non-Finnish, non-Swedish, non-Estonian European population allele count |
| gnomAD_nfsee.AN | gnomAD 2.1 non-Finnish, non-Swedish, non-Estonian European population allele number |
| gnomAD_nfsee.AF | gnomAD 2.1 non-Finnish, non-Swedish, non-Estonian European population allele frequency |
| cs_id | Credible set identifier |
| cs_size | Credible set size |
| cs_log_bayes_factor | Credible set log10 Bayes factor |
| cs_number | Credible set number |
| cs_region | Genomic region that was fine-mapped |
| good_cs | See the finemapping documentation |
| credible_set_min_r2_value | Minimum r2 to the max PIP variant in the credible set |
| start | First variant in the group (credible set + LD partners) |
| end | Last variant in the group |
| found_associations_strict | Associations that the credible set variants have in the GWAS catalog. Associations are in the format ASSOCIATION_NAME\|r2_to_lead;ASSOCIATION_NAME\|r2_to_lead |
| found_associations_relaxed | Associations that the group variants have in the GWAS catalog. Associations are in the format ASSOCIATION_NAME\|r2_to_lead;ASSOCIATION_NAME\|r2_to_lead |
| specific_efo_trait_associations_strict | Associations that the credible set variants have in the GWAS catalog. If the endpoint was matched to any specific EFO codes beforehand, those associations are shown here. Associations are in the format ASSOCIATION_NAME\|r2_to_lead;ASSOCIATION_NAME\|r2_to_lead |
| specific_efo_trait_associations_relaxed | Associations that the group variants have in the GWAS catalog. If the endpoint was matched to any specific EFO codes beforehand, those associations are shown here. Associations are in the format ASSOCIATION_NAME\|r2_to_lead;ASSOCIATION_NAME\|r2_to_lead |
| n_ld_partners_0_8 | Number of LD partners with r2 > 0.8 |
| n_ld_partners_0_6 | Number of LD partners with r2 > 0.6 |
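Several of the column descriptions above define small text formats: locus_id is chrCHROM_POS_REF_ALT, the association columns hold ASSOCIATION_NAME|r2_to_lead entries separated by semicolons, and lead_mlogp is -log10 of the p-value. The following is a minimal Python sketch of how these could be parsed. The function names are hypothetical and not part of Autoreporting, and the handling of empty or "NA" cells is an assumption about how missing values appear in the files.

```python
import math
from typing import List, NamedTuple, Tuple


class Locus(NamedTuple):
    chrom: str
    pos: int
    ref: str
    alt: str


def parse_locus_id(locus_id: str) -> Locus:
    """Split a locus_id of the form chrCHROM_POS_REF_ALT into its parts."""
    chrom, pos, ref, alt = locus_id.split("_")
    if chrom.startswith("chr"):
        chrom = chrom[3:]
    return Locus(chrom, int(pos), ref, alt)


def parse_associations(field: str) -> List[Tuple[str, float]]:
    """Parse 'NAME|r2;NAME|r2' into a list of (name, r2_to_lead) pairs.

    Assumption: an empty or 'NA' cell means that no associations were found.
    """
    field = field.strip()
    if not field or field == "NA":
        return []
    pairs = []
    for item in field.split(";"):
        if not item:
            continue  # tolerate a trailing semicolon
        # rpartition splits on the last '|', so a '|' inside the name survives.
        name, _, r2 = item.rpartition("|")
        pairs.append((name, float(r2)))
    return pairs


def mlogp(pval: float) -> float:
    """Transform a p-value to -log10(pval), as described for lead_mlogp."""
    return -math.log10(pval)
```

For example, `parse_locus_id("chr1_12345_A_G")` yields a `Locus` with chromosome "1", position 12345, ref "A" and alt "G". The values here are made up for illustration.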

The variant reports contain variants that are

  • annotated using summary statistics

  • annotated using the FinnGen variant annotation

  • annotated with assorted additional annotations

  • compared against the GWAS catalog

They are TSV files and contain one row per variant.

The group reports contain information per credible set. They are aggregated from the variant reports, with one credible set per row.
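To illustrate the idea of collapsing variant rows into one row per group, here is a small Python sketch. It is not the Autoreporting implementation. The variant report column names used here (`locus_id` as the grouping column and `cs_prob` as the PIP column) are assumptions and may not match the actual files. The lead row is picked as the row with the highest PIP, mirroring the description of locus_id as the credible set max PIP variant.

```python
import math
from collections import defaultdict
from typing import Dict, Iterable, List


def to_float(value: str, default: float = 0.0) -> float:
    """Convert a TSV cell to float; blanks, 'NA' and NaN fall back to default."""
    try:
        number = float(value)
    except (TypeError, ValueError):
        return default
    return default if math.isnan(number) else number


def summarize_groups(
    variant_rows: Iterable[Dict[str, str]],
    group_col: str = "locus_id",  # assumed grouping column (hypothetical)
    pip_col: str = "cs_prob",     # assumed PIP column (hypothetical)
) -> List[Dict[str, object]]:
    """Return one summary dict per group: its id, lead row and variant count."""
    groups: Dict[str, List[Dict[str, str]]] = defaultdict(list)
    for row in variant_rows:
        groups[row[group_col]].append(row)

    summaries = []
    for group_id, rows in groups.items():
        # The lead is the row with the highest PIP; missing PIPs count as 0.
        lead = max(rows, key=lambda r: to_float(r.get(pip_col, "")))
        summaries.append(
            {group_col: group_id, "lead_row": lead, "n_variants": len(rows)}
        )
    return summaries
```

The real group reports carry many more aggregated fields, as listed in the group report columns, so this sketch only shows the grouping step.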

For more information, see the release documentation:

/library-green/finngen_R8_analysis_documentation/

Columns can and do vary between releases.
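Because the columns vary between releases, code that reads the reports can check for the columns it needs before using them. The sketch below reads a tab-separated report with Python's standard `csv` module and fails early if a required column is missing. The function names and the file path passed in are hypothetical.

```python
import csv
from typing import Dict, Iterable, List


def missing_columns(header: Iterable[str], required: Iterable[str]) -> List[str]:
    """Return the required column names that are absent from the header, sorted."""
    present = set(header)
    return sorted(col for col in required if col not in present)


def read_report(path: str, required: Iterable[str]) -> List[Dict[str, str]]:
    """Read a tab-separated report into a list of row dicts.

    Raises KeyError if any of the required columns is not in the header.
    """
    with open(path, newline="", encoding="utf-8") as handle:
        reader = csv.DictReader(handle, delimiter="\t")
        missing = missing_columns(reader.fieldnames or [], required)
        if missing:
            raise KeyError(f"{path} lacks columns: {', '.join(missing)}")
        return list(reader)
```

A call such as `read_report(path, ["locus_id", "pval", "cs_id"])` then either returns the rows or names the columns that the given release does not provide.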

Both the HGNC symbols in the autoreporting files and the VEP cache are version 38. The "most_severe_gene" and "most_severe" columns in the autoreporting file come from the FinnGen annotation file. The analysis team will update the release documentation to include this information (including those versions) for future releases.

Read more about Autoreporting in FinnGen, and see also the FAQ "Do the autoreports report the 95% or 99% credible sets?"
