Conditional analysis of Custom GWAS analyses

This page explains the following:

  1. How is conditional analysis performed for Custom GWAS analyses?

  2. How to get your endpoint analysed?

  3. How to access the data?

  4. What data is available and how is it structured?

Conditional analysis

In short, the pipeline selects regions that contain variants with p-values below 5e-8 and runs conditional analysis on them using regenie. The results are collated and copied to the green library.

Conditional analysis runs association tests on a region with the conditioning variants included as covariates in the model. If any variant in the region still has a p-value below 1e-6, the analysis is run again with the lowest-p-value variant added to the conditioning variants. This repeats until either the maximum number of conditioning variants (10) is reached or no variant in the region has a p-value below that threshold.
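The stepwise procedure described above can be sketched roughly as follows. This is only an illustration of the loop, not the pipeline's actual code: `run_association` is a hypothetical stand-in for a regenie run on one region, and starting from the region's lead variant is an assumption.

```python
from typing import Callable, Dict, List

# Thresholds taken from the description above.
P_THRESHOLD = 1e-6      # keep conditioning while some p-value is below this
MAX_CONDITIONING = 10   # maximum number of conditioning variants


def stepwise_conditioning(
    lead_variant: str,
    run_association: Callable[[List[str]], Dict[str, float]],
) -> List[str]:
    """Return the conditioning variants selected for one region.

    run_association is hypothetical: given the current conditioning
    variants, it returns {variant_id: conditional p-value} for the
    remaining variants in the region.
    """
    conditioned = [lead_variant]  # assumption: start from the lead variant
    while len(conditioned) < MAX_CONDITIONING:
        pvalues = run_association(conditioned)
        if not pvalues:
            break
        best = min(pvalues, key=pvalues.get)
        if pvalues[best] >= P_THRESHOLD:
            break  # no variant in the region is significant any more
        conditioned.append(best)
    return conditioned
```

The loop stops for exactly the two reasons named in the text: the cap of 10 conditioning variants, or no remaining variant below 1e-6.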

The conditional analysis pipeline is available here: https://github.com/FINNGEN/regenie-pipelines/blob/master/wdl/conditional-analysis/custom_gwas_pipeline/regenie_conditional_custom_gwas.wdl

How to get your endpoint analysed?

You can request conditional analysis for your endpoint in the same way you can request finemapping. Send an email to our service desk (finngen-servicedesk(at)helsinki.fi) with the following information:

  1. A request for conditional analysis of your endpoint

  2. Endpoint name

  3. FinnGen release

  4. URL to the endpoint in user results Pheweb

For example:

to: finngen-servicedesk(at)helsinki.fi
subject: FinnGen Custom GWAS Conditional analysis request

Dear Service Desk,
Could you perform conditional analysis on the following endpoint:
Endpoint: ENDPOINT
Release: 11
GWAS results at FinnGen User Results Browser: https://userresults.finngen.fi/pheno/ENDPOINT

Best Regards,

User

For now, only release 11 endpoints are available for conditional analysis.

Data access

The data will be available in two places: the green library folder containing your imported GWAS endpoint summary statistics, and the userresults pheweb browser. In the green library, conditional analysis results will be in the bucket /green_library/finngen_R11/sandbox_custom_gwas/PHENOTYPE/conditional/.

Conditional analysis results are automatically uploaded to the userresults pheweb browser.

The conditional analysis results are shown in the region view. To open a region view, go to your endpoint and click on a genome-wide significant variant either in the Manhattan plot or in the table below it.

Available files

Conditional analysis results will be in the bucket /green_library/finngen_R11/sandbox_custom_gwas/PHENOTYPE/conditional/

Here is a table describing those files or directories:

| Filename | Description |
| --- | --- |
| had_results | Contains "True" if conditional analysis resulted in analysis results. Contains "False" if there were no results (i.e. no region outside MHC had any variants with p-value < 5e-8) |
| CUSTOM_GWAS_sql.merged.txt | Data file used to import conditional regions to a database used by pheweb |
| PHENOTYPE.independent_snps.txt | File containing the independent signals in the endpoint |
| data/PHENOTYPE_LOCUS_STEP.conditional | Folder that contains the independent signals in individual regions. |
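Before reading the other files, it can be useful to check had_results first. The following is a minimal sketch, assuming the bucket is mounted as a regular file system path and that had_results holds just the text "True" or "False"; the function names are illustrative and not part of any FinnGen tool.

```python
from pathlib import Path


def conditional_dir(phenotype: str, release: int = 11,
                    root: str = "/green_library") -> Path:
    """Build the results folder path following the pattern given above."""
    return (Path(root) / f"finngen_R{release}" / "sandbox_custom_gwas"
            / phenotype / "conditional")


def had_results(directory: Path) -> bool:
    """Return True if the had_results file in the folder contains "True"."""
    text = (directory / "had_results").read_text().strip()
    return text == "True"
```

For example, `had_results(conditional_dir("ENDPOINT"))` would tell you whether any region was conditionally analysed for the endpoint named ENDPOINT.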

CUSTOM_GWAS_sql.merged.txt

This file is used when importing conditional analysis results to a pheweb database. It is a comma-separated file where every value is quoted. It does not have a header. The columns in the file are:

| Column name | Column number | Column description |
| --- | --- | --- |
| rel | 1 | FinnGen release, number |
| type | 2 | type of finemapping data: one of finemap, conditional, susie |
| phenocode | 3 | endpoint name |
| chr | 4 | Chromosome, number |
| start | 5 | Start of finemapped region, position in basepairs |
| end | 6 | End of finemapped region, position in basepairs |
| n_signals | 7 | Number of signals in region |
| n_signals_prob | 8 | Probability of this number of signals: Not applicable to conditional analysis |
| variants | 9 | variants that the region was conditioned with |
| path | 10 | path to data file in pheweb file storage |
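Because the file has no header, the column names have to be supplied when reading it. A minimal sketch using Python's standard csv module is shown below. The example row is invented purely for illustration (including the "NA" placeholder and the path), and the separator used inside the variants column is not specified here, so that field is left as a plain string.

```python
import csv
import io

# Column names in file order, as listed in the table above.
MERGED_COLUMNS = [
    "rel", "type", "phenocode", "chr", "start", "end",
    "n_signals", "n_signals_prob", "variants", "path",
]


def read_merged(handle):
    """Parse the headerless, fully quoted CSV into a list of dicts."""
    reader = csv.reader(handle, delimiter=",", quotechar='"')
    rows = []
    for fields in reader:
        if not fields:
            continue  # skip blank lines
        if len(fields) != len(MERGED_COLUMNS):
            raise ValueError(f"expected {len(MERGED_COLUMNS)} columns, "
                             f"got {len(fields)}")
        rows.append(dict(zip(MERGED_COLUMNS, fields)))
    return rows


# Invented example row, not real data.
example = io.StringIO(
    '"11","conditional","ENDPOINT","1","1000","2000","2","NA",'
    '"1_1500_A_G","some/path"\n'
)
rows = read_merged(example)
conditional_rows = [r for r in rows if r["type"] == "conditional"]
```

All values are kept as strings; convert start, end and n_signals with `int()` if you need numbers.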

PHENOTYPE.independent_snps.txt

This file is a tab-separated file without a header. It contains the lead SNPs of the independent signals in the endpoint. The columns in the file are:

| Column name | Column number | Column description |
| --- | --- | --- |
| variant id | 1 | Variant id in format "chrom_pos_ref_alt" |
| beta | 2 | Effect size of variant |
| sebeta | 3 | Standard error of effect size |
| mlog10p | 4 | P-value of variant, transformed by -log10(p-value) |
| beta_conditioned | 5 | Conditioned effect size |
| sebeta_conditioned | 6 | Conditioned standard error |
| mlog10p_conditioned | 7 | Conditioned transformed p-value |
| conditioning_variants | 8 | Variants the model is conditioned on |
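As a rough sketch, the file can be read with the standard csv module, and the -log10 transformed p-values converted back with p = 10^(-mlog10p). The helper names below are illustrative. Values are kept as strings because how missing values are encoded is not specified here, and the variant id is assumed to contain exactly three underscores.

```python
import csv

# Column names in file order, as listed in the table above.
SNP_COLUMNS = [
    "variant_id", "beta", "sebeta", "mlog10p",
    "beta_conditioned", "sebeta_conditioned", "mlog10p_conditioned",
    "conditioning_variants",
]


def read_independent_snps(handle):
    """Parse the headerless tab-separated file into a list of dicts."""
    reader = csv.reader(handle, delimiter="\t")
    return [dict(zip(SNP_COLUMNS, fields)) for fields in reader if fields]


def mlog10p_to_p(mlog10p: str) -> float:
    """Convert a -log10(p-value) back to a p-value."""
    return 10.0 ** -float(mlog10p)


def split_variant_id(variant_id: str):
    """Split "chrom_pos_ref_alt" into its four parts."""
    chrom, pos, ref, alt = variant_id.split("_")
    return chrom, int(pos), ref, alt
```

Note that very large mlog10p values underflow to 0.0 when converted, which is one reason to keep working on the -log10 scale where possible.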

data/PHENOTYPE_LOCUS_STEP.conditional

This file contains the regenie output for each of the steps in which the data was conditioned. It is a tab-separated file with a header. The columns in the file are the following:

| Column name | Column description |
| --- | --- |
| SNPID | variant identifier |
| CHR | chromosome |
| rsid | variant identifier |
| POS | variant position, in basepairs |
| Allele1 | reference allele |
| Allele2 | alternate allele |
| AF_Allele2 | alternate allele frequency |
| p.value_cond | Conditioned p-value |
| BETA_cond | Conditioned effect size |
| SE_cond | Conditioned standard error |
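Since these files have a header, they can be read by column name. The sketch below finds the variant with the smallest conditioned p-value in one file. It assumes the header uses exactly the column names listed above and simply skips rows whose p-value is not numeric; the function name is illustrative.

```python
import csv


def lowest_conditional_pvalue(handle):
    """Return the row (as a dict) with the smallest p.value_cond, or None."""
    reader = csv.DictReader(handle, delimiter="\t")
    best_row = None
    best_p = None
    for row in reader:
        try:
            p = float(row["p.value_cond"])
        except (TypeError, ValueError):
            continue  # skip rows without a numeric p-value
        if best_p is None or p < best_p:
            best_row, best_p = row, p
    return best_row
```

Pass it an open file handle, for example `with open(path, newline="") as f: top = lowest_conditional_pvalue(f)`, and read fields such as `top["SNPID"]` or `top["BETA_cond"]` from the result.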
