Launch custom GWAS with CO

A tutorial video about Cohort Operations (CO), including how to run a custom GWAS, is available from the FinnGen data users' meeting of 28th June 2022 (at 28' 55"). The instructions below are transcribed from that video.

Custom GWAS in CO is run in binary mode with an additive model and uses the same pipeline as the custom GWAS GUI and custom GWAS CLI tools.

Open the GWAS page in Cohort Operations. Select case and control cohorts from the drop-down menu.

After the cohorts are selected, CO automatically checks the number of patients having phenotypic and genotypic information and the overlap of the case and control cohorts.
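The checks above amount to simple set operations. The following is a minimal sketch of that idea and not CO's actual implementation; the function name, the argument names, and the sample IDs are all hypothetical.

```python
def cohort_checks(cases, controls, with_phenotype, with_genotype):
    """Count usable individuals per cohort and the case/control overlap.

    Hypothetical helper: CO's real checks are not documented here.
    """
    cases, controls = set(cases), set(controls)
    # Individuals that have both phenotypic and genotypic information.
    usable = set(with_phenotype) & set(with_genotype)
    return {
        "cases_usable": len(cases & usable),
        "controls_usable": len(controls & usable),
        # Individuals that appear in both cohorts.
        "case_control_overlap": len(cases & controls),
    }


# Made-up IDs, for illustration only.
result = cohort_checks(
    cases={"ID1", "ID2", "ID3"},
    controls={"ID3", "ID4", "ID5"},
    with_phenotype={"ID1", "ID2", "ID3", "ID4"},
    with_genotype={"ID1", "ID3", "ID4", "ID5"},
)
print(result)
```

A non-zero overlap means that some individuals are listed as both cases and controls, which is worth resolving before launching the run.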

CO also compares the user-defined case and control cohorts to existing FinnGen endpoints. The similarity is shown separately for cases and controls, with bar graphs and the Jaccard Index. The bars show the user-defined cohort in green, the FinnGen endpoint in blue, and the overlap between the two in dark green. The Jaccard Index measures the similarity between the user-defined cohort and each nearby FinnGen endpoint: the number of individuals in both, divided by the number of individuals in either, shown as a percentage.

Hovering the mouse over the bar will also show exact numbers for the comparison.
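The Jaccard Index used in the comparison is the size of the intersection of two sets divided by the size of their union. The sketch below uses made-up IDs; the function name is hypothetical, and how CO computes the value internally is not shown in this tutorial.

```python
def jaccard_index(cohort_a, cohort_b):
    """Return |A ∩ B| / |A ∪ B| for two collections of IDs."""
    a, b = set(cohort_a), set(cohort_b)
    union = a | b
    if not union:
        # Convention chosen for this sketch: two empty cohorts score 0.
        return 0.0
    return len(a & b) / len(union)


# Made-up IDs, for illustration only.
user_cohort = {"ID1", "ID2", "ID3", "ID4"}
endpoint = {"ID3", "ID4", "ID5", "ID6"}

score = jaccard_index(user_cohort, endpoint)
print(f"{score:.0%}")  # 2 shared of 6 distinct IDs -> prints 33%
```

A value near 0% means the two cohorts share almost no individuals, and a value near 100% means they are almost identical.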

In the above example, no FinnGen endpoint is similar to our case phenotype (excessive earwax): the Jaccard Index for the cases is less than 12%. It therefore probably makes sense to run a custom GWAS for excessive earwax. The control group is much more similar to the control groups of FinnGen endpoints (Jaccard Index 75-85%). Similarity of controls is expected, though, because controls are usually all individuals not listed as cases, which leads to a large overlap even between the controls of unrelated endpoints. Still, if your controls overlap almost completely with those of a FinnGen endpoint, it is worth considering why.

After exploring the cohort comparisons, if you decide to run a new GWAS, fill in the form describing your GWAS run. Cohort Operations pre-fills the fields with automatic descriptions. Check the fields and edit them if needed. The Analysis name should preferably be short and representative of the cohorts. In some cases, the automatic Analysis name may be too complicated for the pipeline and must be simplified before the custom GWAS pipeline can be launched. Finally, click Run GWAS analysis to start the custom GWAS run.

Once the GWAS has completed or failed, the run can be found in several places. A successful GWAS appears in PheWeb a day after it completes.

Summary files and plots for custom GWAS runs are also available in the green library at https://console.cloud.google.com/storage/browser/finngen-production-library-green/finngen_R<no>/sandbox_custom_gwas/<phenotype_name>, where <no> is the data freeze number and <phenotype_name> is the Analysis name. From Sandbox v 11.0 onwards, the metadata file is exported to the green library together with the summary data. The final number of cases and controls in the GWAS run can be checked from the metadata file.
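To avoid typing the address by hand, the URL pattern above can be filled in with a small helper. This is only a sketch: the function name is hypothetical, and the freeze number and Analysis name in the usage line are made-up placeholders, not real values.

```python
# Base address copied from the URL pattern given above.
GREEN_LIBRARY_BASE = (
    "https://console.cloud.google.com/storage/browser/"
    "finngen-production-library-green"
)


def green_library_url(freeze_number, analysis_name):
    """Fill <no> and <phenotype_name> into the green library URL pattern."""
    return (
        f"{GREEN_LIBRARY_BASE}/finngen_R{freeze_number}"
        f"/sandbox_custom_gwas/{analysis_name}"
    )


# Placeholder values, for illustration only.
print(green_library_url(12, "MY_ANALYSIS_NAME"))
```

The Analysis name must match the name entered in the GWAS form exactly.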

All custom GWAS results tables are saved in the Cromwell jobs directory in Sandbox, under /finngen/pipeline/cromwell/workflows/[workflow_name]/[workflow_ID]. For example, /finngen/pipeline/cromwell/workflows/regenie/e76cf4f0-d5d0-49d6-98d6...
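The directory layout above can be assembled in code as follows. This is a sketch that only builds the path string; the function name is hypothetical, and the workflow ID in the usage line is a dummy placeholder, not a real run.

```python
from pathlib import PurePosixPath

# Root of the Cromwell jobs directory, as given above.
CROMWELL_ROOT = PurePosixPath("/finngen/pipeline/cromwell/workflows")


def workflow_dir(workflow_name, workflow_id):
    """Return [root]/[workflow_name]/[workflow_ID] as a POSIX path."""
    return CROMWELL_ROOT / workflow_name / workflow_id


# "regenie" is the workflow name from the example above;
# the ID is a dummy placeholder to be replaced with your own.
print(workflow_dir("regenie", "00000000-0000-0000-0000-000000000000"))
```

Inside Sandbox, the resulting path can be passed to ordinary file tools to list the result tables of a run.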

Custom GWAS runs also appear as jobs in the pipelines tool and in the recents section of the Custom GWAS application. Save the pipeline ID of your run from the front page of the pipelines tool. This is useful if you later wish to run, for example, a finemapping pipeline on those specific custom GWAS results. See also Tips on how to find a pipeline job ID.
