Rare Variant Calling in V3C

The main purpose of the V3C Cluster Plot editing tool is to improve on the automated calling of rare variants. The Affymetrix/Thermo Fisher software calls genotypes batch by batch, but for rare variants a single batch often does not contain enough data for accurate calls.

From Sandbox v10.3 onwards, the Cluster Plots viewer V3C is also provided within Genotype Browser's Interactive Filters view. It can also be opened on a separate browser page in Sandbox.

To use V3C on a separate browser page, you first need to locate the intensity file for your variant of interest. You can read the intensity file directly from the /finngen/red library into V3C, or you can copy it to your IVM. In either case, look up the exact chromosomal position of your variant first: there are millions of variant files in the red library, and browsing for a specific one there is next to impossible.

NB: intensity data is sampled. For each genotype, at most 100 individuals from each genotyping batch are included in the intensity file. Sampling keeps the size and display time of the files manageable, but it also means that if you use the V3C tool to select individuals with the major genotype, you will miss many of them.
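The effect of sampling can be illustrated with a small sketch. It assumes the cap is applied per (batch, genotype) pair and that individuals are picked at random; the source only states the limit of 100, so the selection method, the function name sample_intensities and the keys 'batch' and 'genotype' are hypothetical.

```python
import random
from collections import Counter, defaultdict

MAX_PER_GROUP = 100  # cap stated in the text: 100 individuals per genotype per batch


def sample_intensities(rows, cap=MAX_PER_GROUP, seed=0):
    """Keep at most `cap` rows for every (batch, genotype) pair.

    `rows` is an iterable of dicts with the hypothetical keys
    'batch' and 'genotype'. Random selection is an assumption.
    """
    groups = defaultdict(list)
    for row in rows:
        groups[(row["batch"], row["genotype"])].append(row)

    rng = random.Random(seed)
    sampled = []
    for key in sorted(groups):
        members = groups[key]
        if len(members) > cap:
            members = rng.sample(members, cap)
        sampled.extend(members)
    return sampled


# Toy batch: 5000 major homozygotes, 40 heterozygotes, 1 minor homozygote.
rows = (
    [{"batch": "b1", "genotype": "0/0"} for _ in range(5000)]
    + [{"batch": "b1", "genotype": "0/1"} for _ in range(40)]
    + [{"batch": "b1", "genotype": "1/1"}]
)
kept = sample_intensities(rows)
counts = Counter(row["genotype"] for row in kept)
print(dict(counts))  # {'0/0': 100, '0/1': 40, '1/1': 1}
```

In this toy batch the rare genotypes are kept in full, while only 100 of the 5000 major homozygotes remain. This is why a selection of the major genotype made in V3C is incomplete.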

NB: intensity files only exist for chip variants.

Step 1:

Open Genotype Browser and search for your variant. ClusterPlot viewer V3C will open at the end of the Genotype Browser's Interactive Filters view. If you use V3C through the Genotype Browser, skip to Step 5.

OR

Find the identifier of your variant in the form chr_pos_bp_bp (e.g. 14_81143328_G_T, for which the file is 14_81143328_G_T.tsv). The safest place to look this up is PheWeb. Note that the fields must be separated by underscores ("_").

Step 2:

The full path of your variant will then be: /finngen/library-red/finngen_R8/cluster_plot_1.0/data/chrN/variantN.tsv

e.g. for the above example:

/finngen/library-red/finngen_R8/cluster_plot_1.0/data/14/14_81143328_G_T.tsv
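As a minimal sketch, the path construction from Steps 1 and 2 could be scripted as follows. The helper intensity_file_path is hypothetical and not part of V3C or Sandbox. It assumes the directory is named with the bare chromosome number, as in the example path above, and it tolerates ":" separators and a "chr" prefix only as a convenience.

```python
from pathlib import PurePosixPath

# Base directory taken from the example path in Step 2.
BASE_DIR = PurePosixPath("/finngen/library-red/finngen_R8/cluster_plot_1.0/data")


def intensity_file_path(variant_id: str) -> PurePosixPath:
    """Build the intensity file path for an identifier like '14_81143328_G_T'.

    Hypothetical helper: accepts ':' or '_' as separators, an optional
    'chr' prefix and an optional '.tsv' suffix, and always emits the
    underscore form required by the file names.
    """
    cleaned = variant_id.strip().replace(":", "_")
    if cleaned.endswith(".tsv"):
        cleaned = cleaned[: -len(".tsv")]

    parts = cleaned.split("_")
    if len(parts) != 4:
        raise ValueError(f"expected chr_pos_bp_bp, got {variant_id!r}")

    chrom, pos, allele1, allele2 = parts
    if chrom.lower().startswith("chr"):
        chrom = chrom[3:]
    if not pos.isdigit():
        raise ValueError(f"position must be an integer, got {pos!r}")

    file_name = f"{chrom}_{pos}_{allele1}_{allele2}.tsv"
    # Assumption: one directory per chromosome, named as in the example.
    return BASE_DIR / chrom / file_name


print(intensity_file_path("14_81143328_G_T"))
```

The printed path can then be pasted into the file browser in Step 4.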

Step 3:

Open V3C by entering "firefox finngen_plot.html" from the directory where you installed it, usually /home/ivm/FinnGen-Cluster-Plot. (If you have not yet installed it, see Install ClusterPlot viewer V3C.)

Step 4:

Click "Upload dataset" in the upper left corner and browse to the variant file located in Step 2.

Step 5:

Next, select "New Selection", also in the upper left corner.

Step 6:

This lets you draw a polygon that encloses the individuals whose genotypes you want to re-call, for example the true cluster of heterozygotes in the upper left of the plot. If you have questions on how to interpret cluster plots, see section Cluster Plots.
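To make the idea of a polygon selection concrete, here is a minimal sketch of a point-in-polygon test using the standard ray-casting technique. How V3C itself implements the selection is not described here, so this is an illustration only; the coordinates and individual IDs are made up.

```python
def point_in_polygon(x, y, polygon):
    """Ray-casting test: True if (x, y) lies inside `polygon`.

    `polygon` is a list of (x, y) vertices in drawing order.
    """
    inside = False
    n = len(polygon)
    for i in range(n):
        x1, y1 = polygon[i]
        x2, y2 = polygon[(i + 1) % n]
        # Does this edge straddle the horizontal line through the point?
        if (y1 > y) != (y2 > y):
            # x coordinate where the edge crosses that horizontal line
            x_cross = x1 + (y - y1) * (x2 - x1) / (y2 - y1)
            if x < x_cross:
                inside = not inside
    return inside


# Hypothetical intensity coordinates for four individuals.
points = {
    "id1": (0.20, 0.80),
    "id2": (0.25, 0.75),
    "id3": (0.90, 0.10),
    "id4": (0.50, 0.50),
}
# Hypothetical selection drawn around the upper-left cluster.
selection = [(0.1, 0.6), (0.4, 0.6), (0.4, 0.9), (0.1, 0.9)]

selected = sorted(
    name for name, (px, py) in points.items()
    if point_in_polygon(px, py, selection)
)
print(selected)  # ['id1', 'id2']
```

Only the individuals whose points fall inside the drawn polygon end up in the selection; everything outside it is left untouched.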

Step 7:

After making your selection, click "Confirm Selection". Alternatively, use the "select filtered" button to select the filtered individuals instead. You can try different coloring options from the menu below the cluster plot.

Step 8:

To analyze only the individuals in the selected cluster, click "Export selection".

There are many other options available, such as viewing the data batch by batch. Most of them should be fairly intuitive. Write to us on Slack if you have questions!
