# Colocalization

### Colocalizations in FinnGen

Our [colocalization](https://finngen.gitbook.io/finngen-handbook/background-reading/colocalization) approach uses the probabilistic model for integrating GWAS and eQTL data presented in eCAVIAR ([Hormozdiari et al. 2016](https://pubmed.ncbi.nlm.nih.gov/27866706/)). Unlike eCAVIAR, we use SuSiE ([Wang et al. 2019](https://stephenslab.github.io/susie-paper/)) to fine-map our inputs, and we provide an additional colocalization metric (CLPA).

Our goal is to extract a list of genomic regions that show colocalization between two phenotypes p1 and p2. Further, we assume that the summary statistics of p1 and p2 have been fine-mapped. The fine-mapping output for each phenotype contains three columns: the variant identifier (VAR), posterior inclusion probability (PIP), and the credible set (CS) identifier.
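As a rough illustration of this input format, the sketch below groups fine-mapping rows into credible sets keyed by CS identifier. Only the three column names come from the description above; the tab-separated layout, the example rows, and the helper name `read_credible_sets` are assumptions made for illustration.

```python
import csv
import io
from collections import defaultdict

# Hypothetical fine-mapping output: only the column names VAR, PIP and CS
# come from the text; the rows themselves are made up.
EXAMPLE = """VAR\tPIP\tCS
chr1_1000_A_G\t0.70\t1
chr1_2000_C_T\t0.25\t1
chr1_9000_G_A\t0.95\t2
"""


def read_credible_sets(handle):
    """Return {CS identifier: {VAR: PIP}} from a tab-separated table."""
    credible_sets = defaultdict(dict)
    for row in csv.DictReader(handle, delimiter="\t"):
        credible_sets[row["CS"]][row["VAR"]] = float(row["PIP"])
    return dict(credible_sets)


credible_sets = read_credible_sets(io.StringIO(EXAMPLE))
print(credible_sets["1"])
```

Storing each credible set as a variant-to-PIP mapping makes it easy to find the variants that two credible sets share.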

### CLPP

The colocalization posterior probability (CLPP) is computed between two credible sets, cs1 from phenotype p1 and cs2 from phenotype p2. For vectors **x** and **y** containing the PIPs of the variants in cs1 and cs2, respectively, CLPP is calculated as follows:

![](https://finngen.gitbook.io/~gitbook/image?url=https%3A%2F%2F3072695768-files.gitbook.io%2F%7E%2Ffiles%2Fv0%2Fb%2Fgitbook-x-prod.appspot.com%2Fo%2Fspaces%252F-MhYL0UTLjqsuIdK0SSO%252Fuploads%252Fgit-blob-16db276970cd01b10d9ec582abf915703ad99c90%252Fimage%2520%28349%29.png%3Falt%3Dmedia\&width=768\&dpr=4\&quality=100\&sign=b0cc915c\&sv=2)

This CLPP calculation is similar to equation 8 in Hormozdiari et al. 2016.

CLPP depends on the credible set size. By definition, a credible set with more than one variant yields a CLPP < 1.
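A minimal Python sketch of this calculation, assuming CLPP is the sum over shared variants of the product of the two PIPs, in the spirit of equation 8 of Hormozdiari et al. 2016. The authoritative formula is the one in the figure above; the function name `clpp`, the dictionary representation, and the variant identifiers are illustrative only.

```python
def clpp(cs1, cs2):
    """Assumed CLPP: sum of PIP products over variants in both credible sets.

    cs1 and cs2 map variant identifiers to PIPs.
    """
    shared = cs1.keys() & cs2.keys()
    return sum(cs1[v] * cs2[v] for v in shared)


# Made-up credible sets with uniform PIPs.
single = {"chr1_1000_A_G": 1.0}
pair = {"chr1_1000_A_G": 0.5, "chr1_2000_C_T": 0.5}

print(clpp(single, single))  # 1.0
print(clpp(pair, pair))      # 0.5
```

Under this assumed form, two identical credible sets of size two already give 0.5, which illustrates the dependence on credible set size.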

### CLPA <a href="#clpa" id="clpa"></a>

We derived another colocalization metric called causal posterior agreement (CLPA) that is independent of credible set size.

![](https://finngen.gitbook.io/~gitbook/image?url=https%3A%2F%2F3072695768-files.gitbook.io%2F%7E%2Ffiles%2Fv0%2Fb%2Fgitbook-x-prod.appspot.com%2Fo%2Fspaces%252F-MhYL0UTLjqsuIdK0SSO%252Fuploads%252Fgit-blob-92f96cbfb91c2bcc819744735942807ac0bfa3d6%252Fimage%2520%28208%29.png%3Falt%3Dmedia\&width=768\&dpr=4\&quality=100\&sign=7e5f1c32\&sv=2)
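A minimal Python sketch of CLPA, assuming it sums the smaller of the two PIPs at each shared variant. This form is an assumption; the authoritative definition is the formula in the figure above. The function name `clpa` and the variant identifiers are illustrative only.

```python
def clpa(cs1, cs2):
    """Assumed CLPA: sum of min(PIP1, PIP2) over variants in both credible sets.

    cs1 and cs2 map variant identifiers to PIPs.
    """
    shared = cs1.keys() & cs2.keys()
    return sum(min(cs1[v], cs2[v]) for v in shared)


# Made-up credible sets that share a single variant.
cs1 = {"chr1_1000_A_G": 0.6, "chr1_2000_C_T": 0.4}
cs2 = {"chr1_1000_A_G": 0.5, "chr1_3000_T_C": 0.5}

print(clpa(cs1, cs2))  # 0.5
```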

The picture below shows how colocalizations are defined.

![](https://finngen.gitbook.io/~gitbook/image?url=https%3A%2F%2F3072695768-files.gitbook.io%2F%7E%2Ffiles%2Fv0%2Fb%2Fgitbook-x-prod.appspot.com%2Fo%2Fspaces%252F-MhYL0UTLjqsuIdK0SSO%252Fuploads%252Fgit-blob-f5a0e62c51068bfa33643e1f7b6c77204ddf7c13%252Fimage%2520%28110%29.png%3Falt%3Dmedia\&width=768\&dpr=4\&quality=100\&sign=d047e4fa\&sv=2)

### Example Comparison

This rough example shows why we mostly use CLPA: it is independent of credible set size.

![](https://finngen.gitbook.io/~gitbook/image?url=https%3A%2F%2F3072695768-files.gitbook.io%2F%7E%2Ffiles%2Fv0%2Fb%2Fgitbook-x-prod.appspot.com%2Fo%2Fspaces%252F-MhYL0UTLjqsuIdK0SSO%252Fuploads%252Fgit-blob-b87f7c1f68582aed20b756a735b798d11d85d6a0%252Fimage%2520%2869%29.png%3Falt%3Dmedia\&width=768\&dpr=4\&quality=100\&sign=104a711e\&sv=2)
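The same point can be sketched numerically. The code below is not a reproduction of the figure; it assumes CLPP is the sum of PIP products and CLPA is the sum of per-variant minima, and it compares two identical credible sets with uniform PIPs as the number of variants grows. The aligned-vector representation and the chosen sizes are illustrative.

```python
import math


def clpp(x, y):
    """Assumed CLPP for aligned PIP vectors: sum of element-wise products."""
    return math.fsum(a * b for a, b in zip(x, y))


def clpa(x, y):
    """Assumed CLPA for aligned PIP vectors: sum of element-wise minima."""
    return math.fsum(min(a, b) for a, b in zip(x, y))


# Identical credible sets of n variants, each variant with PIP 1/n.
for n in (1, 2, 5, 10):
    pips = [1.0 / n] * n
    print(n, round(clpp(pips, pips), 3), round(clpa(pips, pips), 3))
```

Under these assumptions CLPP shrinks as 1/n even though the two credible sets agree perfectly, while CLPA stays at 1.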

### Data

The colocalization is performed between FinnGen endpoints as well as between FinnGen endpoints and various QTL resources, as shown in the image below.

![](https://finngen.gitbook.io/~gitbook/image?url=https%3A%2F%2F3072695768-files.gitbook.io%2F%7E%2Ffiles%2Fv0%2Fb%2Fgitbook-x-prod.appspot.com%2Fo%2Fspaces%252F-MhYL0UTLjqsuIdK0SSO%252Fuploads%252Fgit-blob-94f4f1190888d0b48b02bedbab03c9dfcc28cd01%252Fimage%2520%28134%29.png%3Falt%3Dmedia\&width=768\&dpr=4\&quality=100\&sign=5d67d35b\&sv=2)

These resources are listed below:

### FinnGen resources

The SuSiE fine-mapping results for the release were used as the FinnGen data.

### Expression QTL datasets

* GTEx v8: SuSiE fine-mapping, 49 tissues (only tissues with a sample size of n ≥ 50 are included), donors of mixed ancestry, Aguet et al. (2019, bioRxiv). Fine-mapping was performed by Hilary Finucane, Jacob Ulirsch, and Masahiro Kanai from the [Finucane Lab](https://www.finucanelab.org/). Effect size interpretation: change in normalized gene expression (SD units) per alternate allele. Normalization: inverse normal transformation.
* EMBL-EBI (European Bioinformatics Institute) [eQTL catalogue datasets](http://www.ebi.ac.uk/eqtl/Datasets/): eQTL data from 24 tissues/cell types, 16 RNAseq and 6 microarray sources, SuSiE fine-mapping, donors of 88% European ancestry, Kerimov et al. (2020, bioRxiv). For RNAseq data, four quantification methods are available (gene expression, exon expression, transcript usage, txrevise event usage). Fine-mapping was performed by [Kaur Alasoo and Nurlan Kerimov](https://kauralasoo.github.io/). Effect size interpretation: change in normalized gene expression (SD units) per alternate allele. Normalization: inverse normal transformation.
* [FUSION study](https://www.ncbi.nlm.nih.gov/projects/gap/cgi-bin/study.cgi?study_id=phs001048.v2.p1) (RNAseq), muscle and adipose tissue.
* [Kolberg](https://elifesciences.org/articles/58705): mega-analysis of immune cells from the microarray datasets.

### Metabolon QTL datasets

* GeneRISK: 186 lipid species QTLs, SuSiE fine-mapping of Widen et al. (2020), 7632 Finnish samples. Effect size interpretation: change in standard deviation of the lipid species per alternate allele.

### Biomarkers

* UK Biobank: 36 continuous endpoints, 57 biomarkers from UKBB prepared by [Finucane lab, 361,194 White British samples](https://www.finucanelab.org/data), SuSiE fine-mapping. Effect size interpretation for quantitative traits: change in standard deviation of the normalized outcome per alternate allele. Effect size interpretation for binary traits: increase in log(odds ratio) per alternate allele.

### Post-colocalization QC

Only unique source1-source2-pheno1-pheno2-tissue2-quant2-locus\_id1-locus\_id2 combinations were included in the results. FinnGen endpoints with a \_COMORB definition were excluded from the results.
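The two QC rules can be sketched in Python as follows. The field names come from the combination listed above; the function name `post_coloc_qc`, the example rows, keeping the first occurrence of each duplicate, and checking both `pheno1` and `pheno2` for `_COMORB` are assumptions made for illustration, not a description of the actual pipeline.

```python
KEY_FIELDS = (
    "source1", "source2", "pheno1", "pheno2",
    "tissue2", "quant2", "locus_id1", "locus_id2",
)


def post_coloc_qc(rows):
    """Drop _COMORB endpoints and keep one row per unique key combination."""
    seen = set()
    kept = []
    for row in rows:
        # Assumption: an endpoint is excluded if either phenotype name
        # contains "_COMORB".
        if "_COMORB" in row["pheno1"] or "_COMORB" in row["pheno2"]:
            continue
        key = tuple(row[field] for field in KEY_FIELDS)
        if key in seen:
            continue  # assumption: the first occurrence is the one kept
        seen.add(key)
        kept.append(row)
    return kept


# Made-up rows; all values are placeholders.
base = {
    "source1": "SRC_A", "source2": "SRC_B",
    "pheno1": "ENDPOINT_X", "pheno2": "GENE_Y",
    "tissue2": "TISSUE_Z", "quant2": "ge",
    "locus_id1": "locus_1", "locus_id2": "locus_2",
}
rows = [
    dict(base),
    dict(base),                                # exact duplicate
    dict(base, pheno1="ENDPOINT_X_COMORB"),    # excluded endpoint
    dict(base, locus_id2="locus_3"),           # distinct combination
]

kept = post_coloc_qc(rows)
print(len(kept))  # 2
```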

### Acknowledgements

We thank the following people for helping us assemble the QTL resources:

* Kaur Alasoo and Nurlan Kerimov provided us with the fine-mapped EMBL-EBI eQTL catalogue datasets.
* Hilary Finucane, Jacob Ulirsch, and Masahiro Kanai gave us access to their fine-mapped GTEx data.
