How to run GWAS using the regenie unmodifiable pipeline

Unmodifiable pipelines are predefined workflows that cannot be modified by the user. The advantage of running an unmodifiable pipeline rather than a standard modifiable pipeline is that the results are delivered directly to the green library and the User results PheWeb browser. No download requests are needed, because the results of unmodifiable pipelines have been verified not to contain any individual-level data.

Running the regenie unmodifiable pipeline is very similar to running regenie in the custom GWAS tool, with just a few small changes. The unmodifiable regenie pipeline can be accessed in the sandbox from

The workflow for running the pipeline is:

First, prepare your phenotype (and, if needed, covariate) files. The phenotype file should have one line per individual and contain columns named FID and IID, plus one column for each phenotype included. Both the FID and IID columns should contain the FINNGENID. If you do not use any custom covariates, there is no need to edit the default covariate file; otherwise, see how to create covariate + phenotype file for GWAS pipeline for more information. Note that you do not need to add covariates to the phenotype file.
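The phenotype file layout described above can be sketched in Python as follows. This is only an illustration: the tab delimiter, the helper name write_phenotype_file, and the IDs and values are assumptions made up for the example, not something this page specifies.

```python
import csv
import io


def write_phenotype_file(handle, rows, phenotypes):
    """Write FID, IID and one column per phenotype, one line per individual."""
    # Assumption: tab-separated output; check the delimiter the pipeline expects.
    writer = csv.writer(handle, delimiter="\t", lineterminator="\n")
    writer.writerow(["FID", "IID", *phenotypes])
    seen = set()
    for finngen_id, values in rows:
        if finngen_id in seen:
            raise ValueError(f"duplicate individual: {finngen_id}")
        seen.add(finngen_id)
        # FID and IID both carry the FINNGENID.
        writer.writerow([finngen_id, finngen_id, *(values[p] for p in phenotypes)])


# Made-up placeholder IDs and phenotype values, for illustration only.
rows = [
    ("FG0000001", {"endpoint1": 1, "endpoint2": 0}),
    ("FG0000002", {"endpoint1": 0, "endpoint2": 1}),
]
buffer = io.StringIO()
write_phenotype_file(buffer, rows, ["endpoint1", "endpoint2"])
pheno_text = buffer.getvalue()
print(pheno_text)
```

To write a real file, pass an open file handle instead of the in-memory buffer.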

Second, create a file with the phenotype descriptions, one per row. A phenotype description is a short text description of a phenotype in your phenotype file. The descriptions are used when the endpoint is uploaded to the User results PheWeb browser. The file should have one line per phenotype column in the phenotype file (described above), and the descriptions should be in the same order as the phenotype columns.
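One way to keep the description file in step with the phenotype columns is to generate it from the same ordered list of phenotype names. The sketch below assumes a hypothetical helper, build_description_text, and made-up description strings.

```python
def build_description_text(phenotypes, descriptions):
    """Return the file content: one description per line, in phenotype column order."""
    missing = [p for p in phenotypes if p not in descriptions]
    if missing:
        raise ValueError(f"no description for: {missing}")
    lines = []
    for name in phenotypes:
        text = descriptions[name].strip()
        # Each description must stay on a single line, one line per phenotype.
        if "\n" in text:
            raise ValueError(f"description for {name} spans several lines")
        lines.append(text)
    return "\n".join(lines) + "\n"


# Same order as the phenotype columns in the phenotype file.
phenotypes = ["endpoint1", "endpoint2"]
descriptions = {
    "endpoint2": "Example description of the second endpoint",
    "endpoint1": "Example description of the first endpoint",
}
description_text = build_description_text(phenotypes, descriptions)
print(description_text)
```

The order of the dictionary does not matter here; the output order follows the phenotypes list.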

Third, fill in the JSON for the workflow. The options that you are required to edit are:

  • regenie_unmod.pheno_file: This field should state the full path of your phenotype file, created in the first step.

  • regenie_unmod.phenolist: A comma-separated list of your phenotypes. Example value: "endpoint1,endpoint2"

  • regenie_unmod.phenodescriptionlist: This should be changed to the full path of the phenotype description file you created in the second step.
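As a rough sketch, the three required options could be filled in programmatically like this. The file paths are hypothetical placeholders, the helper set_required_inputs is invented for the example, and the empty template stands in for the workflow's real JSON, which contains further keys not shown here.

```python
import json


def set_required_inputs(template, pheno_file, phenotypes, description_file):
    """Return a copy of the workflow inputs with the three required options set."""
    inputs = dict(template)
    inputs["regenie_unmod.pheno_file"] = pheno_file
    # phenolist is a comma-separated list of phenotype names.
    inputs["regenie_unmod.phenolist"] = ",".join(phenotypes)
    inputs["regenie_unmod.phenodescriptionlist"] = description_file
    return inputs


# Hypothetical paths; replace them with the full paths of your own files.
inputs = set_required_inputs(
    template={},
    pheno_file="/path/to/my_phenotypes.tsv",
    phenotypes=["endpoint1", "endpoint2"],
    description_file="/path/to/my_descriptions.txt",
)
print(json.dumps(inputs, indent=2))
```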

In addition, there are several other options you may wish to edit, depending on your covariates, phenotype or the GWAS model you plan to use:

  • regenie_unmod.covariates: These are the covariates included in the analysis, as a comma-separated list. Only modify this variable if you want to include extra covariates or don't want to include all of the covariates used in the core analyses.

  • regenie_unmod.covariate_file: If you want to include custom covariates that are not available in the regular covariate file, give the path of your custom covariate file here. This can be the same file as your pheno_file if your phenotypes and analysis covariates are contained in one file.

  • regenie_unmod.test: This should be either "additive", "recessive" or "dominant", depending on the type of test you want to use.

  • regenie_unmod.is_binary: This input should be set to "true" if your endpoint is a binary case-control endpoint, and "false" if it is quantitative.
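The optional options can be checked before submitting, for example with a small validator like the one below. This is a sketch under assumptions: the helper optional_inputs is hypothetical, the covariate names and path are placeholders, and is_binary is written as a JSON boolean, which this page does not confirm (check how the value appears in the workflow's JSON template and adapt if it is a quoted string).

```python
import json

ALLOWED_TESTS = ("additive", "recessive", "dominant")


def optional_inputs(test, is_binary, covariates=None, covariate_file=None):
    """Build the optional workflow inputs, rejecting unsupported values."""
    if test not in ALLOWED_TESTS:
        raise ValueError(f"test must be one of {ALLOWED_TESTS}, got {test!r}")
    if not isinstance(is_binary, bool):
        raise TypeError("is_binary must be True (case-control) or False (quantitative)")
    inputs = {
        "regenie_unmod.test": test,
        # Assumption: serialised as a JSON boolean (true/false).
        "regenie_unmod.is_binary": is_binary,
    }
    if covariates is not None:
        # Covariates are given as a comma-separated list.
        inputs["regenie_unmod.covariates"] = ",".join(covariates)
    if covariate_file is not None:
        inputs["regenie_unmod.covariate_file"] = covariate_file
    return inputs


# Placeholder covariate names and a hypothetical file path.
extra = optional_inputs(
    test="additive",
    is_binary=True,
    covariates=["covariate_a", "covariate_b"],
    covariate_file="/path/to/my_phenotypes.tsv",
)
print(json.dumps(extra, indent=2))
```

Leaving covariates and covariate_file unset keeps those keys out of the result, so the values already in the template are not overwritten.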

On completion, the results will be automatically copied to the User results PheWeb browser and the green library bucket specific for each data release: /finngen/library-green/finngen_R[RELEASE]/unmodifiable_pipelines/UnmodifiableRegenieDF[RELEASE]/workflow_id

E.g. for R12, to /finngen/library-green/finngen_R12/unmodifiable_pipelines/UnmodifiableRegenieDF12/workflow_id. The workflow_id will be the pipelines app ID for your job.
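The path pattern above can be expressed as a small helper. The function name results_path and the example workflow ID are made up for illustration.

```python
def results_path(release, workflow_id):
    """Return the green library results directory for a release and workflow ID."""
    return (
        f"/finngen/library-green/finngen_R{release}"
        f"/unmodifiable_pipelines/UnmodifiableRegenieDF{release}/{workflow_id}"
    )


# "my-workflow-id" is a placeholder for the pipelines app ID of your job.
print(results_path(12, "my-workflow-id"))
```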
