Combine cohorts with CO

Cohorts imported from Atlas, Genotype Browser, or a text file can be combined in flexible ways using the Operate Cohorts method in the Cohort Operations (CO) tool.

See also the tutorial videos about CO in the recordings of the FinnGen data users' meetings of 25th Jan 2022 (at 16 min 35 s) and 28th June 2022 (at 28 min 55 s).

Tip: Cohorts created with CO can be uploaded to the Trajectory Visualization Tool (TVT) to visualize timeline trajectories per patient, to cluster patients for HeatMap presentation, and to create overview plots. See an Example workflow for survival time/endpoint analysis.

Combine two heterozygote files into one file

First, use Genotype Browser to build genotype variant cohorts. Then upload the cohorts into the Cohort Operations tool.

The example files here are the same cohorts used when uploading cohorts to the Cohort Operations tool. Files contain homozygotes (alternative 1|1 and reference 0|0) and heterozygotes (1|0 and 0|1) for variant 20:46811367:C:G. Here we will combine phased heterozygotes (0|1 and 1|0) to form one file of unphased heterozygotes (either 0|1 or 1|0).

The Operate Cohorts page is under the Cohorts workbench view. The first option "Select a new cohort's entry and exit" is available for cohorts with cohort start and end dates, such as cohorts originating from Atlas (see Operate Atlas cohort with CO). We'll concentrate on the second option to select new cohorts by patients' FinnGen IDs.

To start combining the files, drag the first heterozygote file from the right side of the screen to the left. Then select the "or in" operator, which allows a person in either file to be in the combined cohort. Then drag the second heterozygote file from right to left to complete the operation. Once finished, click Create New Cohort.
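The "or in" operator behaves like a set union over person IDs. The following is a minimal sketch of that logic, not CO's actual implementation; the function name and the person IDs are hypothetical and used only for illustration.

```python
def combine_or_in(first_ids, second_ids):
    """Return persons found in either cohort (set union), without duplicates."""
    return set(first_ids) | set(second_ids)


# Hypothetical person IDs, for illustration only (not real FinnGen IDs)
het_0_1 = ["ID_A", "ID_B", "ID_C"]  # phased heterozygotes 0|1
het_1_0 = ["ID_D", "ID_E"]          # phased heterozygotes 1|0

# Unphased heterozygotes: a person in either file is included
unphased_het = combine_or_in(het_0_1, het_1_0)
print(sorted(unphased_het))
```

Because the result is a set, a person who happened to appear in both input files would be counted only once.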

The new combined cohort will appear in the Result cohort view. Here you can see the same summary statistics as when uploading cohorts from Atlas or from a file: the number of cases, and blue and red bars for males and females. If you also need the cohort entry and exit dates, see Operate on Atlas cohorts with CO. You can upload the new cohort directly to the workbench of the same session by clicking the Copy new cohort to workbench button.

The upset plot visualizes the distribution of persons in the new cohort. The bars representing persons included in the new cohort are shown in black.

The upset plot is meant to enable visual validation of the newly formed cohort. It can also be used for viewing the data in general.

Here, for example, we can see that for position 20:46811367 there are 26 C|C, 43 C|G, 51 G|C, and 119 G|G genotypes that are in neither the excessive earwax cases nor the controls. For all genotypes, there are more persons in the control cohort than in the case cohort, with similar proportions of cases for each genotype (7.2-8.4 %).

Combine genotype and phenotype cohorts

Complicated cohort operations and merging of cohorts are easily done with CO. Here we will combine the genotype cohorts from Genotype Browser with the phenotype cohorts from Atlas. We will also use the unphased heterozygote cohort created above by combining the two heterozygote files with CO.

For example, let's exclude all genotyped persons who are in neither our excessive earwax phenotype cohort nor the control group. To create a heterozygote cohort (0|1 or 1|0) in which all persons are in the case or control group, use the following settings.

The new cohort has 168363 persons, which is the number remaining after the exclusion of 91 heterozygous persons without phenotype information (see above).
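The exclusion described here can be thought of as intersecting the genotype cohort with the union of the case and control cohorts. The sketch below illustrates that set logic under stated assumptions; it is not CO's implementation, and the function name and all person IDs are hypothetical.

```python
def keep_phenotyped(genotype_ids, case_ids, control_ids):
    """Split a genotype cohort into persons with and without phenotype information.

    Returns (kept, excluded): kept persons are in cases or controls,
    excluded persons are in neither.
    """
    genotype = set(genotype_ids)
    phenotyped = set(case_ids) | set(control_ids)
    return genotype & phenotyped, genotype - phenotyped


# Hypothetical person IDs, for illustration only
heterozygotes = {"ID_A", "ID_B", "ID_C", "ID_D"}
cases = {"ID_A", "ID_X"}
controls = {"ID_B", "ID_C", "ID_Y"}

kept, excluded = keep_phenotyped(heterozygotes, cases, controls)
print(len(kept), "kept;", len(excluded), "excluded")
```

The sizes of the two returned sets always add up to the size of the original genotype cohort, which is a quick way to check the result against the counts shown in the upset plot.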

The settings for the homozygous cohorts, excluding persons who are in neither the excessive earwax cases cohort nor the controls cohort, are as follows.

We are also interested in exploring the homo- and heterozygotes for the variant 20:46811367 among persons with an excessive earwax diagnosis. Therefore, we create genotype cohorts similar to those above but include only persons who are in the excessive earwax cases cohort.
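Restricting each genotype cohort to diagnosed persons amounts to a set intersection with the cases cohort. A minimal sketch of that idea follows; it is not CO's implementation, and the function name, cohort labels, and person IDs are all hypothetical.

```python
def restrict_to_cases(genotype_cohorts, case_ids):
    """Keep, for each named genotype cohort, only persons who are also cases."""
    cases = set(case_ids)
    return {name: set(ids) & cases for name, ids in genotype_cohorts.items()}


# Hypothetical cohort labels and person IDs, for illustration only
genotype_cohorts = {
    "heterozygotes": {"ID_A", "ID_B"},
    "homozygotes_1_1": {"ID_C", "ID_D"},
    "homozygotes_0_0": {"ID_E", "ID_F"},
}
earwax_cases = {"ID_A", "ID_C", "ID_E", "ID_X"}

case_only = restrict_to_cases(genotype_cohorts, earwax_cases)
for name, ids in case_only.items():
    print(name, sorted(ids))
```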

For the cohort including homo- and heterozygotes for the variant 20:46811367 C and having an excessive earwax diagnosis

For the cohort including homozygotes for 20:46811367 G (wild type) and having an excessive earwax diagnosis

In the workbench view, we may now delete the unneeded genotypes and continue with the newly formed genotype cohorts. Exporting cohorts is also possible from the workbench view.
