Data description
File naming pattern and file structure
GWAS summary statistics (tab-delimited, bgzipped, genome build 38, tabix index files included) are named {endpoint}.gz. For example, endpoint I9_CHD has the files I9_CHD.gz and I9_CHD.gz.tbi.
To learn more about the methods used, see section GWAS.
The {endpoint}.gz files have the following structure (see the column table starting with #chrom below):
Two fine-mapping methods were used: SuSiE and FINEMAP.
Fine-mapping results are tab-delimited and bgzipped.
SuSiE results have the following filename pattern:
{endpoint}.SUSIE.cred.bgz
{endpoint}.SUSIE.snp.bgz
FINEMAP results have the following filename pattern:
{endpoint}.FINEMAP.region.bgz
{endpoint}.FINEMAP.snp.bgz
{endpoint}.FINEMAP.cred.bgz
To learn more about the methods used, see section fine-mapping.
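The naming patterns above can be collected into a small helper. This is only a sketch: the function name `expected_file_names` and the grouping keys are hypothetical, and only the file name patterns themselves come from this page.

```python
def expected_file_names(endpoint: str) -> dict:
    """Return the file names expected for one endpoint, grouped by data type.

    The grouping keys ("gwas", "susie", "finemap") are illustrative labels,
    not names used by the data release.
    """
    return {
        # summary statistics plus their tabix index
        "gwas": [f"{endpoint}.gz", f"{endpoint}.gz.tbi"],
        # SuSiE fine-mapping results
        "susie": [f"{endpoint}.SUSIE.cred.bgz", f"{endpoint}.SUSIE.snp.bgz"],
        # FINEMAP fine-mapping results
        "finemap": [
            f"{endpoint}.FINEMAP.region.bgz",
            f"{endpoint}.FINEMAP.snp.bgz",
            f"{endpoint}.FINEMAP.cred.bgz",
        ],
    }


names = expected_file_names("I9_CHD")
print(names["gwas"])
```

A helper like this can be used to check that a download is complete before starting an analysis.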
SuSiE output files {endpoint}.SUSIE.snp.bgz have the following structure (see the column table starting with trait below):
Linkage disequilibrium (LD) was estimated from SISU v3 separately for each chromosome. To work with the resulting bcor files, use LDstore (v1.1), for example:
ldstore --bcor FG_LD_chr1.bcor --incl-range 20000000-50000000 --table output_file_name.table
To learn more about the methods used, see section LD estimation.
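When the same LDstore call is needed for several regions, the command line can be assembled programmatically. The sketch below only builds the argument list and does not run LDstore. The function name `ldstore_table_command` is hypothetical, and the only flags used are the three shown in the example command above.

```python
import shlex


def ldstore_table_command(bcor_path, start, end, table_path):
    """Build the argument list for the LDstore call shown on this page.

    Only the flags --bcor, --incl-range and --table from the example
    command are used; nothing else about LDstore is assumed here.
    """
    if start > end:
        raise ValueError("start must not be greater than end")
    return [
        "ldstore",
        "--bcor", bcor_path,
        "--incl-range", f"{start}-{end}",
        "--table", table_path,
    ]


cmd = ldstore_table_command(
    "FG_LD_chr1.bcor", 20_000_000, 50_000_000, "output_file_name.table"
)
# Print the command as a single shell-style string for inspection
print(shlex.join(cmd))
```

The resulting list can be passed to `subprocess.run(cmd, check=True)` on a machine where LDstore is installed.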
Columns of {endpoint}.gz:

| Column name | Description |
| --- | --- |
| #chrom | chromosome on build GRCh38 (1-22, X) |
| pos | position in base pairs on build GRCh38 |
| ref | reference allele |
| alt | alternative allele (effect allele) |
| rsids | variant identifier |
| nearest_genes | name of the gene nearest to the variant |
| pval | p-value from SAIGE |
| beta | effect size estimated with SAIGE for the alternative allele |
| sebeta | standard error of the effect size estimated with SAIGE |
| maf | alternative (effect) allele frequency |
| maf_cases | alternative (effect) allele frequency among cases |
| maf_controls | alternative (effect) allele frequency among controls |
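As an illustration of how the columns above can be used, the following sketch reads a {endpoint}.gz file and keeps the rows whose `pval` is below a threshold. The function name `significant_variants` is hypothetical, and the default threshold of 5e-8 is the conventional genome-wide significance level, not a value taken from this page. The sketch relies on the fact that bgzip output is valid gzip, so Python's `gzip` module can read it sequentially. It also assumes every `pval` entry parses as a number.

```python
import csv
import gzip


def significant_variants(path, p_threshold=5e-8):
    """Yield rows (as dicts keyed by column name) with pval < p_threshold.

    The header line starts with "#chrom", so the chromosome column is
    available under the key "#chrom".
    """
    # bgzip files are multi-member gzip files, which gzip.open can read
    with gzip.open(path, "rt", newline="") as handle:
        reader = csv.DictReader(handle, delimiter="\t")
        for row in reader:
            # Assumption: pval is always numeric (no missing-value markers)
            if float(row["pval"]) < p_threshold:
                yield row
```

For example, `for row in significant_variants("I9_CHD.gz"): print(row["rsids"], row["pval"])` would list the variants passing the threshold. To query a single genomic region, the bundled tabix index is the more efficient route than scanning the whole file this way.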
Columns of {endpoint}.SUSIE.snp.bgz:

| Column name | Description |
| --- | --- |
| trait | endpoint name |
| region | chr:start-end |
| v | variant identifier |
| rsid | rs variant identifier |
| chromosome | chromosome on build GRCh38 (1-22, X) |
| position | position in base pairs on build GRCh38 |
| allele1 | reference allele |
| allele2 | alternative allele (effect allele) |
| maf | minor allele frequency |
| beta | effect size GWAS |
| se | standard error GWAS |
| p | p-value GWAS |
| mean | posterior expectation of true effect size |
| sd | posterior standard deviation of true effect size |
| prob | posterior probability of association |
| cs | identifier of 95% credible set (-1 = variant is not part of credible set) |
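A common use of the `cs` and `prob` columns is to pick the most likely variant in each credible set. The sketch below is one way to do that, under two assumptions that this page does not confirm: that `cs` is stored as an integer, and that credible set identifiers are only unique within a trait and region, which is why the grouping key combines `trait`, `region` and `cs`. The function name `top_variant_per_credible_set` is hypothetical.

```python
import csv
import gzip


def top_variant_per_credible_set(path):
    """Map (trait, region, cs) to the row with the highest prob in that set.

    Rows with cs == -1 are skipped because they are not part of any
    95% credible set.
    """
    best = {}
    # The .bgz file is bgzipped, which the gzip module can read
    with gzip.open(path, "rt", newline="") as handle:
        for row in csv.DictReader(handle, delimiter="\t"):
            cs = int(row["cs"])  # assumption: cs is stored as an integer
            if cs == -1:
                continue
            key = (row["trait"], row["region"], cs)
            current = best.get(key)
            if current is None or float(row["prob"]) > float(current["prob"]):
                best[key] = row
    return best
```

Calling `top_variant_per_credible_set("I9_CHD.SUSIE.snp.bgz")` would return one row per credible set; on ties in `prob`, the first row encountered is kept.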