Data description
File naming pattern and file structure
GWAS summary statistics (tab-delimited, bgzipped, genome build 38, tabix index files included) are named {endpoint}.gz. For example, endpoint I9_CHD has the files I9_CHD.gz and I9_CHD.gz.tbi. Note that the results are based on imputed genotype data and were produced using SAIGE, which is why some values that would normally be integer counts are not presented as integers and may contain decimals.
To learn more about the methods used, see section GWAS.
The {endpoint}.gz files have the following structure:
*) Note that the results are based on imputed genotype dosages and were produced using SAIGE, which is why these counts are not presented as integers and may contain decimals.
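As a rough illustration of the file format described above, the following sketch reads a summary statistics file with Python's standard library. It assumes that a bgzipped file can be read as an ordinary gzip stream and that the header line starts with `#chrom`; the function names `parse_sumstats` and `read_sumstats` are hypothetical, and the demo row contains made-up values.

```python
import csv
import gzip
import io


def parse_sumstats(handle):
    """Yield one dict per variant from an open, tab-delimited text handle."""
    reader = csv.DictReader(handle, delimiter="\t")
    for row in reader:
        # Convert the columns most often used numerically; the rest stay strings.
        row["pos"] = int(row["pos"])
        for name in ("pval", "beta", "sebeta"):
            row[name] = float(row[name])
        yield row


def read_sumstats(path):
    """Read {endpoint}.gz; bgzip output is a valid multi-member gzip stream."""
    with gzip.open(path, mode="rt", newline="") as handle:
        yield from parse_sumstats(handle)


# Demo on an in-memory table with made-up values (not real results).
demo = io.StringIO(
    "#chrom\tpos\tref\talt\tpval\tbeta\tsebeta\n"
    "1\t12345\tA\tG\t0.5\t0.25\t0.125\n"
)
for variant in parse_sumstats(demo):
    print(variant["#chrom"], variant["pos"], variant["alt"], variant["beta"])
```

Reading the whole file this way ignores the tabix index; for region queries, a tabix-aware tool is likely the better choice.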
Two fine-mapping methods were used: SuSiE and FINEMAP.
Fine-mapping results are tab-delimited and bgzipped.
SuSiE results have the following filename patterns:
{endpoint}.SUSIE.cred.bgz
{endpoint}.SUSIE.snp.bgz
FINEMAP results have the following filename patterns:
{endpoint}.FINEMAP.region.bgz
{endpoint}.FINEMAP.snp.bgz
{endpoint}.FINEMAP.config.bgz
To learn more about the methods used, see section Fine-mapping.
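The naming patterns above can be expanded for a given endpoint with a small helper. This is only a sketch: the function name `expected_files` and the grouping keys are hypothetical, while the suffixes are taken from the patterns listed above.

```python
# File name suffixes taken from the naming patterns listed above.
GWAS_SUFFIXES = (".gz", ".gz.tbi")
SUSIE_SUFFIXES = (".SUSIE.cred.bgz", ".SUSIE.snp.bgz")
FINEMAP_SUFFIXES = (
    ".FINEMAP.region.bgz",
    ".FINEMAP.snp.bgz",
    ".FINEMAP.config.bgz",
)


def expected_files(endpoint):
    """Return the expected file names for one endpoint, grouped by result type."""
    return {
        "gwas": [endpoint + suffix for suffix in GWAS_SUFFIXES],
        "susie": [endpoint + suffix for suffix in SUSIE_SUFFIXES],
        "finemap": [endpoint + suffix for suffix in FINEMAP_SUFFIXES],
    }


for group, names in expected_files("I9_CHD").items():
    print(group, names)
```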
SuSiE output files {endpoint}.SUSIE.snp.bgz have the following structure:
Linkage disequilibrium (LD) was estimated from SISU v3 for each chromosome. To work with the bcor files, use the tool LDstore (v1.1), for example:
ldstore --bcor FG_LD_chr1.bcor --incl-range 20000000-50000000 --table output_file_name.table
To learn more about the methods used, see section LD estimation.
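When the same LDstore call is needed for several regions, the argument list can be assembled programmatically. The sketch below uses only the flags that appear in the command above; the helper name `ldstore_table_command` is hypothetical, and actually running the command assumes that `ldstore` (v1.1) is installed and on the PATH.

```python
def ldstore_table_command(bcor_path, start, end, table_path):
    """Build the argument list for the LDstore call shown above."""
    if start > end:
        raise ValueError("start must not be greater than end")
    return [
        "ldstore",
        "--bcor", bcor_path,
        "--incl-range", f"{start}-{end}",
        "--table", table_path,
    ]


cmd = ldstore_table_command(
    "FG_LD_chr1.bcor", 20000000, 50000000, "output_file_name.table"
)
print(" ".join(cmd))
# To execute it (requires ldstore to be installed):
#   import subprocess
#   subprocess.run(cmd, check=True)
```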
The variant annotation has measures (HWE, INFO, ...) listed per batch.
Columns of the GWAS summary statistics files ({endpoint}.gz):

| Column name | Description |
| --- | --- |
| #chrom | chromosome on build GRCh38 (1-22, X) |
| pos | position in base pairs on build GRCh38 |
| ref | reference allele |
| alt | alternative allele (effect allele) |
| rsids | variant identifier |
| nearest_genes | nearest gene name from variant |
| pval | p-value from SAIGE |
| beta | effect size estimated with SAIGE for the alternative allele |
| sebeta | standard error of effect size estimated with SAIGE |
| maf | alternative (effect) allele frequency |
| maf_cases | alternative (effect) allele frequency among cases |
| maf_controls | alternative (effect) allele frequency among controls |
| n_hom_cases | number of homozygous cases* |
| n_het_cases | number of heterozygous cases* |
| n_hom_controls | number of homozygous controls* |
| n_het_controls | number of heterozygous controls* |
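The sketch below shows one way the summary statistics columns might be used once rows are loaded as dictionaries keyed by column name. The helper names are hypothetical, the rows are made-up values, and the default threshold of 5e-8 is the conventional genome-wide significance level rather than a value stated in this document. The ratio `beta / sebeta` is a simple Wald-type z statistic and may not reproduce the SAIGE p-value exactly.

```python
def variant_key(row):
    """Build a chrom:pos:ref:alt identifier from the first four columns."""
    return f'{row["#chrom"]}:{row["pos"]}:{row["ref"]}:{row["alt"]}'


def z_score(row):
    """Effect size divided by its standard error, for the alternative allele."""
    return float(row["beta"]) / float(row["sebeta"])


def significant_variants(rows, threshold=5e-8):
    """Keep rows whose p-value is below the threshold (assumed default: 5e-8)."""
    return [row for row in rows if float(row["pval"]) < threshold]


# Made-up rows for illustration only (not real results).
rows = [
    {"#chrom": "1", "pos": "100", "ref": "A", "alt": "G",
     "pval": "1e-9", "beta": "0.5", "sebeta": "0.25"},
    {"#chrom": "2", "pos": "200", "ref": "C", "alt": "T",
     "pval": "0.2", "beta": "0.01", "sebeta": "0.02"},
]
for row in significant_variants(rows):
    print(variant_key(row), z_score(row))
```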
Columns of the SuSiE output files ({endpoint}.SUSIE.snp.bgz):

| Column name | Description |
| --- | --- |
| trait | endpoint name |
| region | chr:start-end |
| v | variant identifier |
| rsid | rs variant identifier |
| chromosome | chromosome on build GRCh38 (1-22, X) |
| position | position in base pairs on build GRCh38 |
| allele1 | reference allele |
| allele2 | alternative allele (effect allele) |
| maf | minor allele frequency |
| beta | effect size GWAS |
| se | standard error GWAS |
| p | p-value GWAS |
| mean | posterior expectation of true effect size |
| sd | posterior standard deviation of true effect size |
| prob | posterior probability of association |
| cs | identifier of 95% credible set (-1 = variant is not part of credible set) |
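To illustrate how the `cs` and `prob` columns might be used together, the sketch below groups SuSiE rows into credible sets and picks the variant with the highest posterior probability in each. It assumes that credible set identifiers are only unique within a trait and region, so the grouping key combines all three; the function names and the rows are hypothetical.

```python
from collections import defaultdict


def credible_sets(rows):
    """Group rows by (trait, region, cs), skipping variants with cs == -1."""
    sets = defaultdict(list)
    for row in rows:
        cs = int(row["cs"])
        if cs == -1:
            continue  # variant is not part of any 95% credible set
        sets[(row["trait"], row["region"], cs)].append(row)
    return dict(sets)


def lead_variant(members):
    """Return the member with the highest posterior probability of association."""
    return max(members, key=lambda row: float(row["prob"]))


# Made-up rows for illustration only (not real results).
rows = [
    {"trait": "I9_CHD", "region": "chr1:100-200", "v": "1:150:A:G", "prob": "0.7", "cs": "1"},
    {"trait": "I9_CHD", "region": "chr1:100-200", "v": "1:160:C:T", "prob": "0.3", "cs": "1"},
    {"trait": "I9_CHD", "region": "chr1:100-200", "v": "1:170:G:A", "prob": "0.001", "cs": "-1"},
]
for key, members in credible_sets(rows).items():
    print(key, len(members), lead_variant(members)["v"])
```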