Conditional analysis of Custom GWAS analyses

This page explains the following:

How is conditional analysis performed for Custom GWAS analyses?
How to get your endpoint analysed?
How to access the data?
What data is available and how is it structured?

Conditional analysis

In short, the pipeline selects regions containing variants with p-values smaller than 5e-8 and runs conditional analysis on them using regenie. The results are collated and copied to the green library.

Conditional analysis is done by running association tests on a region with the conditioning variants as covariates in the model. If any variant in the region has a p-value smaller than 1e-6, the analysis is run again with the lowest-p-value variant added to the conditioning variants. This is repeated until either the maximum number of conditioning variants (10) is reached or no variant in the region has a p-value below that threshold.
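The stepwise procedure described above can be sketched as a loop. This is a minimal illustration, not the pipeline's actual code: `run_association` is a hypothetical callable standing in for a regenie run on one region, and starting from the region's lead variant is an assumption. Only the 1e-6 threshold and the limit of 10 conditioning variants come from the description.

```python
from typing import Callable, Dict, List

P_THRESHOLD = 1e-6              # from the description above
MAX_CONDITIONING_VARIANTS = 10  # from the description above

# Hypothetical interface: takes the conditioning variants and returns
# {variant_id: conditional p-value} for the variants in the region.
AssocTest = Callable[[List[str]], Dict[str, float]]


def stepwise_conditioning(run_association: AssocTest, lead_variant: str) -> List[str]:
    """Return the conditioning variants chosen for one region."""
    conditioned = [lead_variant]  # assumption: start from the lead variant
    while len(conditioned) < MAX_CONDITIONING_VARIANTS:
        pvals = run_association(conditioned)
        # Variants already in the model cannot be selected again.
        candidates = {v: p for v, p in pvals.items() if v not in conditioned}
        if not candidates:
            break
        best = min(candidates, key=candidates.get)
        if candidates[best] >= P_THRESHOLD:
            break  # nothing in the region is significant any more
        conditioned.append(best)
    return conditioned
```

The loop has the two stopping conditions named in the text: the cap on conditioning variants and the absence of any remaining variant below the threshold.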

The conditional analysis pipeline is available here: https://github.com/FINNGEN/regenie-pipelines/blob/master/wdl/conditional-analysis/custom_gwas_pipeline/regenie_conditional_custom_gwas.wdl

How to get your endpoint analysed?

You can request conditional analysis for your endpoint in the same way you request finemapping. Send an email to our servicedesk (finngen-servicedesk(at)helsinki.fi) with the following information:

Request the analysis for your endpoint
endpoint name
finngen release
URL to the endpoint in user results Pheweb

For example:

to: finngen-servicedesk(at)helsinki.fi
subject: FinnGen Custom GWAS Conditional analysis request

Dear Service Desk,
Could you perform conditional analysis on the following endpoint:
Endpoint: ENDPOINT
Release: 11
GWAS results at FinnGen User Results Browser: https://userresults.finngen.fi/pheno/ENDPOINT

Best Regards,

User

For now, only release 11 endpoints are available for conditional analysis.

Data access

The data will be available in two places: the green library folder containing your imported GWAS endpoint summary statistics, and the userresults pheweb browser. Conditional analysis results will be in the bucket /green_library/finngen_R11/sandbox_custom_gwas/PHENOTYPE/conditional/.

Conditional analysis results are automatically uploaded to the userresults pheweb browser.

The conditional analysis results are shown in the region view. To get to a region view, go to your endpoint, and either click on a genome-wide significant variant in the Manhattan plot, or on the table below.

Available files

Conditional analysis results will be in the bucket /green_library/finngen_R11/sandbox_custom_gwas/PHENOTYPE/conditional/

Here is a table describing those files or directories:

| Filename | Description |
| --- | --- |
| had_results | Contains "True" if conditional analysis resulted in analysis results. Contains "False" if there were no results (i.e. no region outside MHC had any variants with p-value < 5e-8) |
| CUSTOM_GWAS_sql.merged.txt | Data file used to import conditional regions to a database used by pheweb |
| PHENOTYPE.independent_snps.txt | File containing the independent signals in the endpoint |
| data/PHENOTYPE_LOCUS_STEP.conditional | Folder that contains the independent signals in individual regions. |

CUSTOM_GWAS_sql.merged.txt

This file is used when importing conditional analysis results to a pheweb database. It is a comma-separated file where every value is quoted. It does not have a header. The columns in the file are:

| Column name | Column description |
| --- | --- |
| rel | FinnGen release, number |
| type | Type of finemapping data: one of finemap, conditional, susie |
| phenocode | Endpoint name |
| chr | Chromosome, number |
| start | Start of finemapped region, position in basepairs |
| end | End of finemapped region, position in basepairs |
| n_signals | Number of signals in region |
| n_signals_prob | Probability of this number of signals: not applicable to conditional analysis |
| variants | Variants that the region was conditioned with |
| path | Path to data file in pheweb file storage |
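As an illustration of the format described above (comma-separated, every value quoted, no header), the file could be read with Python's standard `csv` module. This is a sketch under the assumption that the columns appear in the order they are listed here. The function name `read_merged_sql` and the sample row are invented for the example.

```python
import csv
import io

# Assumption: column order matches the listing above (the file has no header).
SQL_COLUMNS = [
    "rel", "type", "phenocode", "chr", "start", "end",
    "n_signals", "n_signals_prob", "variants", "path",
]


def read_merged_sql(handle):
    """Parse the quoted, comma-separated, header-less file into dicts."""
    records = []
    for fields in csv.reader(handle, delimiter=",", quotechar='"'):
        if not fields:
            continue  # skip blank lines
        if len(fields) != len(SQL_COLUMNS):
            raise ValueError(
                f"expected {len(SQL_COLUMNS)} fields, got {len(fields)}"
            )
        records.append(dict(zip(SQL_COLUMNS, fields)))
    return records


# Invented row for illustration only; real values will differ.
sample = io.StringIO(
    '"11","conditional","ENDPOINT","1","1000","2000","2","","VARIANTS","PATH"\n'
)
records = read_merged_sql(sample)
print(records[0]["phenocode"], records[0]["n_signals"])
```

All values are kept as strings. Convert `start`, `end` and `n_signals` to integers only after confirming how the real file encodes them.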

PHENOTYPE.independent_snps.txt

This is a tab-separated file without a header. It contains the lead SNPs of the independent signals in the endpoint. The columns in the file are:

| Column name | Column description |
| --- | --- |
| variant id | Variant id in format "chrom_pos_ref_alt" |
| beta | Effect size of variant |
| sebeta | Standard error of effect size |
| mlog10p | P-value of variant, transformed by -log10(p-value) |
| beta_conditioned | Conditioned effect size |
| sebeta_conditioned | Conditioned standard error |
| mlog10p_conditioned | Conditioned transformed p-value |
| conditioning_variants | Variants the model is conditioned on |
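A short sketch of how this file might be loaded, assuming the columns appear in the order listed above and that all numeric fields parse as floats. The helper names and the sample line are invented for illustration. The p-value is recovered from `mlog10p` as 10 to the power of minus that value.

```python
import csv
import io

# Assumption: column order matches the listing above (the file has no header).
SNP_COLUMNS = [
    "variant_id", "beta", "sebeta", "mlog10p",
    "beta_conditioned", "sebeta_conditioned", "mlog10p_conditioned",
    "conditioning_variants",
]
NUMERIC_COLUMNS = SNP_COLUMNS[1:7]


def split_variant_id(variant_id):
    """Split a "chrom_pos_ref_alt" identifier into its four parts."""
    chrom, pos, ref, alt = variant_id.split("_")
    return chrom, int(pos), ref, alt


def read_independent_snps(handle):
    """Parse the tab-separated, header-less file into dicts."""
    rows = []
    for fields in csv.reader(handle, delimiter="\t"):
        if not fields:
            continue  # skip blank lines
        row = dict(zip(SNP_COLUMNS, fields))
        for name in NUMERIC_COLUMNS:
            row[name] = float(row[name])
        # Invert the -log10 transform; very large values underflow to 0.0.
        row["pval"] = 10 ** -row["mlog10p"]
        rows.append(row)
    return rows


# Invented line for illustration only; real values will differ.
sample = io.StringIO(
    "chr1_12345_A_G\t0.5\t0.1\t8.0\t0.4\t0.1\t7.0\tchr1_11111_C_T\n"
)
rows = read_independent_snps(sample)
print(split_variant_id(rows[0]["variant_id"]), rows[0]["pval"])
```

Because very small p-values underflow in floating point, it is usually safer to filter and sort on `mlog10p` directly rather than on the back-transformed value.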

data/PHENOTYPE_LOCUS_STEP.conditional

These files contain the regenie output for each conditioning step. Each file is tab-separated and has a header. The columns are the following:

| Column name | Column description |
| --- | --- |
| SNPID | Variant identifier |
| CHR | Chromosome |
| rsid | Variant identifier |
| POS | Variant position, in basepairs |
| Allele1 | Reference allele |
| Allele2 | Alternate allele |
| AF_Allele2 | Alternate allele frequency |
| p.value_cond | Conditioned p-value |
| BETA_cond | Conditioned effect size |
| SE_cond | Conditioned standard error |
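Because these files have a header, they can be read by column name. The sketch below finds the variant with the smallest conditioned p-value in one step file. The function name and the sample rows are invented, and skipping values that do not parse as numbers is an assumption about how missing results might be encoded.

```python
import csv
import io
import math


def top_conditional_variant(handle):
    """Return the row with the smallest p.value_cond, or None if none parse."""
    best_row, best_p = None, math.inf
    for row in csv.DictReader(handle, delimiter="\t"):
        try:
            p = float(row["p.value_cond"])
        except (TypeError, ValueError):
            continue  # assumption: unparsable values mean "no result"
        if math.isnan(p):
            continue
        if p < best_p:
            best_row, best_p = row, p
    return best_row


# Invented rows for illustration only; real values will differ.
header = "SNPID\tCHR\trsid\tPOS\tAllele1\tAllele2\tAF_Allele2\tp.value_cond\tBETA_cond\tSE_cond\n"
sample = io.StringIO(
    header
    + "chr1_100_A_G\t1\tchr1_100_A_G\t100\tA\tG\t0.10\t0.02\t0.1\t0.05\n"
    + "chr1_200_C_T\t1\tchr1_200_C_T\t200\tC\tT\t0.25\t1e-7\t0.3\t0.05\n"
    + "chr1_300_G_A\t1\tchr1_300_G_A\t300\tG\tA\t0.05\tNA\tNA\tNA\n"
)
top = top_conditional_variant(sample)
print(top["SNPID"], top["p.value_cond"])
```

Reading by header name means the example does not depend on the order of the columns in the real files.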
