Autoreporting results format
Autoreporting outputs two types of tab-separated results: variant reports and group reports. Group reports describe the groups built around credible sets, with one credible set per row. Each row carries information about the credible set together with a set of annotations.
The variant reports list all of the variants in the group reports' groups. These include the credible set variants, as well as variants that were LD clumped together with the group lead variant. The variants are also annotated using a set of annotations. For more information about the Autoreporting tool and how it works, see Autoreporting in FinnGen.
Group reports
The columns are:
Column | Description |
---|---|
phenotype | Phenotype description |
phenotype_abbreviation | Phenotype name |
locus_id | Locus id (credible set max PIP variant), formatted as: chrCHROM_POS_REF_ALT |
rsids | rsids matching lead variant |
Cases | \# of cases in phenotype |
Controls | \# of controls in phenotype |
chrom | chromosome |
pos | position of lead variant |
ref | ref allele of lead variant |
alt | alt allele of lead variant |
pval | p-value of lead variant |
lead_r2_threshold | r2 threshold used for LD clumping variants to lead variant. |
lead_beta_previous_release | lead variant effect size in previous release |
lead_pval_previous_release | lead variant p-value in previous release |
lead_most_severe_consequence | lead variant most severe consequence. Taken from FinnGen variant annotation |
lead_most_severe_gene | lead variant gene of the most severe consequence. Taken from FinnGen variant annotation |
lead_mlogp | lead pval, transformed to -log10(pval) |
lead_beta | lead effect size |
lead_sebeta | lead std error of effect |
lead_af_alt | lead alt allele frequency |
lead_af_alt_cases | lead alt allele frequency for cases |
lead_af_alt_controls | lead alt allele frequency for controls |
gnomAD_functional_category | if the lead variant has a functional effect (LoF, pLoF etc.), it is reported here. |
gnomAD_enrichment_nfsee | lead variant Finnish enrichment against the non-Finnish, non-Swedish, non-Estonian European population, from gnomAD 2.1 exome data. |
gnomAD_fin.AF | Finnish AF in gnomAD 2.1 exome data |
gnomAD_fin.AN | Finnish allele number in gnomAD 2.1 exome data |
gnomAD_fin.AC | Finnish allele count in gnomAD 2.1 exome data |
gnomAD_fin.homozygote_count | Finnish homozygote count in gnomAD 2.1 exome data |
gnomAD_fet_nfsee.odds_ratio | Odds ratio from Fisher's exact test for enrichment in the Finnish vs non-Finnish, non-Estonian, non-Swedish European population, from gnomAD 2.1 exome data. |
gnomAD_fet_nfsee.p_value | p-value from Fisher's exact test for enrichment in the Finnish vs non-Finnish, non-Estonian, non-Swedish European population, from gnomAD 2.1 exome data. |
gnomad_nfsee.AC | gnomAD 2.1 non-Finnish, -Swedish, -Estonian European population allele count |
gnomAD_nfsee.AN | gnomAD 2.1 non-Finnish, -Swedish, -Estonian European population allele number |
gnomAD_nfsee.AF | gnomAD 2.1 non-Finnish, -Swedish, -Estonian European population allele frequency |
cs_id | credible set identifier |
cs_size | credible set size |
cs_log_bayes_factor | credible set log10 Bayes factor |
cs_number | credible set number |
cs_region | Genomic region that was fine-mapped |
good_cs | see finemapping documentation |
credible_set_min_r2_value | minimum r2 to max PIP variant in credible set |
start | first variant in group (credible set + LD partners) |
end | last variant in group |
found_associations_strict | Associations that credible set variants have in the GWAS catalog. Associations are in the format ASSOCIATION_NAME\|r2_to_lead;ASSOCIATION_NAME\|r2_to_lead |
found_associations_relaxed | Associations that group variants have in the GWAS catalog. Associations are in the format ASSOCIATION_NAME\|r2_to_lead;ASSOCIATION_NAME\|r2_to_lead |
specific_efo_trait_associations_strict | Associations that credible set variants have in the GWAS catalog. If the endpoint was matched to any specific EFO codes beforehand, those associations are shown here. Associations are in the format ASSOCIATION_NAME\|r2_to_lead;ASSOCIATION_NAME\|r2_to_lead |
specific_efo_trait_associations_relaxed | Associations that group variants have in the GWAS catalog. If the endpoint was matched to any specific EFO codes beforehand, those associations are shown here. Associations are in the format ASSOCIATION_NAME\|r2_to_lead;ASSOCIATION_NAME\|r2_to_lead |
n_ld_partners_0_8 | number of LD partners with r2 > 0.8 |
n_ld_partners_0_6 | number of LD partners with r2 > 0.6 |
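
Several of the columns above pack structured values into a single string. The sketch below shows one way to unpack them in Python: the `locus_id` format (chrCHROM_POS_REF_ALT), the association lists (ASSOCIATION_NAME|r2_to_lead pairs separated by semicolons), and the -log10 transform behind `lead_mlogp`. The function names are hypothetical helpers, not part of the Autoreporting tool, and treating an empty string or "NA" as "no associations" is an assumption about how missing values are written.

```python
import math
from typing import List, NamedTuple, Tuple


class Locus(NamedTuple):
    chrom: str
    pos: int
    ref: str
    alt: str


def parse_locus_id(locus_id: str) -> Locus:
    """Split a locus_id of the form chrCHROM_POS_REF_ALT into its parts."""
    chrom, pos, ref, alt = locus_id.split("_")
    if chrom.startswith("chr"):
        chrom = chrom[3:]
    return Locus(chrom, int(pos), ref, alt)


def parse_associations(field: str) -> List[Tuple[str, float]]:
    """Parse 'NAME|r2;NAME|r2' into a list of (name, r2_to_lead) pairs."""
    # Assumption: an empty field or "NA" means no associations were found.
    if field.strip() in ("", "NA"):
        return []
    pairs = []
    for item in field.split(";"):
        item = item.strip()
        if not item:
            continue
        # Split on the last '|' so that names containing '|' stay intact.
        name, _, r2 = item.rpartition("|")
        pairs.append((name, float(r2)))
    return pairs


def mlogp(pval: float) -> float:
    """Transform a p-value to -log10(pval), as in lead_mlogp."""
    return -math.log10(pval)
```

For example, `parse_locus_id("chr1_12345_A_G")` yields a `Locus` with chromosome "1" and position 12345, and `parse_associations` turns a cell into pairs that can be filtered by r2.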
The variant reports contain credible set variants. They are annotated using summary statistics, FinnGen and assorted additional annotations, and are compared against the GWAS catalog. They are TSV files and contain one row per variant.
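
Because the reports are plain TSV with one row per variant, they can be read with the Python standard library alone. The following is a minimal sketch: `read_report` is a hypothetical helper, and the two-column sample is made up for illustration (real reports have many more columns, and the variant report column names are not listed here).

```python
import csv
import io
from typing import Dict, Iterable, List


def read_report(handle: Iterable[str]) -> List[Dict[str, str]]:
    """Read a tab-separated report into a list of rows keyed by column name."""
    reader = csv.DictReader(handle, delimiter="\t")
    return list(reader)


# Made-up sample standing in for an opened report file.
sample = io.StringIO(
    "locus_id\tpval\n"
    "chr1_1000_A_G\t1e-9\n"
    "chr1_1050_C_T\t3e-8\n"
)
rows = read_report(sample)
print(len(rows), rows[0]["locus_id"])
```

All values come back as strings, so numeric columns such as `pval` need an explicit `float(...)` conversion before comparison.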
The group reports contain information per credible set. They are aggregated from the variant reports, with one credible set per row.
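
To illustrate what "aggregated from the variant reports" means in practice, the sketch below collapses variant rows into one row per credible set. It is not the Autoreporting implementation: the input column name `pip` is hypothetical, the grouping ignores LD partners, and only a few of the group report columns (`cs_id`, `pos`, `cs_size`, `start`, `end`) are derived.

```python
from collections import defaultdict
from typing import Dict, List


def summarize_credible_sets(variants: List[Dict[str, str]]) -> List[Dict[str, object]]:
    """Collapse variant rows into one summary row per credible set."""
    by_cs: Dict[str, List[Dict[str, str]]] = defaultdict(list)
    for row in variants:
        by_cs[row["cs_id"]].append(row)

    groups = []
    for cs_id, members in by_cs.items():
        # Lead variant: highest posterior inclusion probability (PIP).
        # "pip" is a hypothetical column name used only in this sketch.
        lead = max(members, key=lambda r: float(r["pip"]))
        positions = [int(r["pos"]) for r in members]
        groups.append({
            "cs_id": cs_id,
            "pos": int(lead["pos"]),
            "cs_size": len(members),
            "start": min(positions),
            "end": max(positions),
        })
    return groups


# Made-up variant rows: two credible sets, three variants in total.
variants = [
    {"cs_id": "cs1", "pos": "1000", "pip": "0.7"},
    {"cs_id": "cs1", "pos": "1050", "pip": "0.2"},
    {"cs_id": "cs2", "pos": "5000", "pip": "0.9"},
]
groups = summarize_credible_sets(variants)
```

Here `groups` holds one dictionary per credible set, mirroring the one-row-per-credible-set layout of the group reports.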
For more information, see the release documentation:
/library-green/finngen_R8_analysis_documentation/
Columns can and do vary between releases.
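
Since the column set changes between releases, scripts that read the reports may want to check the header before relying on specific columns. A minimal sketch, where `missing_columns` is a hypothetical helper and the header line is made up:

```python
from typing import Iterable, List


def missing_columns(header: Iterable[str], required: Iterable[str]) -> List[str]:
    """Return the required column names that are absent from the header."""
    present = set(header)
    return [name for name in required if name not in present]


# Made-up header line standing in for the first line of a report file.
header = "phenotype\tlocus_id\tpval\tlead_mlogp".split("\t")
print(missing_columns(header, ["locus_id", "pval", "cs_id"]))  # ['cs_id']
```

Failing early with the list of missing names tends to be easier to debug than a `KeyError` deep inside an analysis script.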
The HGNC symbols in the autoreporting files are version 38, as is the VEP cache. The "most_severe_gene" and "most_severe" columns in the autoreporting files come from the FinnGen annotation file. The analysis team will update the release documentation to include this information (including those versions) for future releases.
Read more about Autoreporting in FinnGen, and see also the FAQ "Do the autoreports report the 95% or 99% credible sets?"