How do I make a custom endpoint?

Before you get started

Whether you are a clinician or an analyst, you might have a specific disorder of interest for which you would like to create an endpoint using your own criteria, and then use that endpoint in downstream analyses. This is, in fact, very straightforward in FinnGen! However, we recommend that you first take a look at the Risteys endpoint browser to see whether an endpoint that suits your needs already exists.

Core endpoints

If you found the endpoint you are interested in in Risteys, keep in mind that FinnGen has core and non-core endpoints. GWAS are run only on core endpoints, so if your endpoint is a core endpoint, great! It already has PheWAS results you can view in the PheWeb browser, and you only need to follow that link to view and download its summary statistics. If the Risteys or PheWeb browsers are new to you, please take a look at the sections How to use Risteys as an endpoint browser and How to use PheWeb.

Non-core endpoints

However, if you're interested in a non-core endpoint or if you didn't find your favourite endpoint anywhere in Risteys, you can still create your own endpoint or run your own custom GWAS by following these instructions.

First, you should decide what your criteria are. How would you like to define your endpoint, and how specific would you like it to be?
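As a rough illustration of what "how specific" can mean in practice, the Python sketch below matches codes by prefix, so that a shorter prefix gives a broader endpoint. The person IDs, the codes, and the dot-free code format are all made up for illustration; check the actual code format in the Sandbox data before relying on prefix matching.

```python
# Toy records: (person_id, code). All values are invented for illustration
# and are not taken from any FinnGen register.
records = [
    ("ID1", "F200"),
    ("ID2", "F209"),
    ("ID3", "F29"),
    ("ID4", "F312"),
]


def persons_matching(records, prefixes):
    """Return the IDs of persons with at least one code starting with any prefix."""
    prefixes = tuple(prefixes)  # str.startswith accepts a tuple of prefixes
    return {pid for pid, code in records if code.startswith(prefixes)}


broad = persons_matching(records, ["F2"])     # every code beginning with F2
narrow = persons_matching(records, ["F200"])  # one specific code only

print(sorted(broad))
print(sorted(narrow))
```

In this toy data the broad definition picks up three persons and the narrow one a single person, which is the trade-off to weigh when choosing your criteria.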

If you are not familiar with Finnish National Health Registers and the register codes used in them, see sections Finnish Health registers and International and Finnish Health code sets.

Selecting Health Register and Register codes

We recommend selecting register codes with the help of medical professionals, who better understand how codes are used and how register data was collected for specific codes.

Finnish health code sets

It is especially important to understand the Finnish-specific health code sets. Some register codes are Finnish-specific, meaning that they differ from their corresponding international codes, so it's important to check that you are using the correct codes from the correct code sets. For example, the Finnish versions of ICD-8/9/10 are not the same as the US clinical versions, nor do they map exactly to the WHO ICD (although the Finnish versions do extend and modify the WHO versions). Take a look at the section Register code translation files available in Sandbox to find out where the code translation files are located if you are not comfortable reading Finnish. From those files you can find the condition(s) you are interested in, and the codes that map to them.

When you are certain about which registers and register codes you would like to use to define your endpoint, you can create your custom endpoint using Atlas, or by using R in the Sandbox if you prefer (and have more experience).

Create a custom endpoint using Atlas

  1. Launch Atlas by going to the Applications menu in Sandbox and selecting FinnGen > Atlas.

  2. In the section How to define a cohort in Atlas you can find detailed instructions on defining your case and control cohorts for analysis. These cohorts can then be used to define an endpoint, and you can run, for example, a GWAS using FinnGen's Custom GWAS tool or Cohort Operations tool to analyse them.

When using Atlas for creating an endpoint, you can use registers that are in the Detailed longitudinal data file found in Sandbox. Other registry data files that are in Sandbox are not yet implemented for use by Atlas, and thus if you wish to use them you will have to use R.

Detailed longitudinal data contains the following Finnish Health Registers:

  • Hilmo inpatient data and specialist outpatient data

  • Primary care data

  • KELA drug purchases and reimbursement data

  • Cancer register data

  • Cause of death register data

For more information, see the section How do I run a GWAS of a phenotype I create myself. See also General workflows for the most common analyses researchers are conducting in the Sandbox with the FinnGen data.

Creating your own endpoint using R

Creating an endpoint from the other registers in Sandbox requires some programming skills and experience with R. R and Python code snippets that can help you with this are available in /finngen/library-green/scripts/code_snippets. The section How to check case counts from the data shows several example commands that you can use. Although this requires more effort, once you have successfully created an endpoint you can run any analyses you like with it (see How do I run a GWAS of a phenotype I create myself). When selecting records in R, take care to filter not only on the codes of interest but also on the code set: for example, F29 means psychosis in ICD10 but eye discomfort in the ICPC2 code set.
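The code set pitfall can be sketched as follows. This is a minimal Python sketch, not one of the snippets from the code_snippets folder: the record layout, the field names (person_id, code_set, code) and the toy rows are assumptions made for illustration, and the real column names in the Sandbox files will differ. Treating everyone who is not a case as a control is likewise only a simplification for this sketch.

```python
from typing import NamedTuple


class Record(NamedTuple):
    # Hypothetical field names; not the column names used in the Sandbox files.
    person_id: str
    code_set: str
    code: str


# Toy rows, invented for illustration.
records = [
    Record("ID1", "ICD10", "F29"),  # F29 in ICD10: psychosis
    Record("ID2", "ICPC2", "F29"),  # F29 in ICPC2: eye discomfort
    Record("ID3", "ICD10", "I10"),
]


def select_cases(records, code_set, prefixes):
    """Return IDs of persons with a matching code in the given code set only."""
    prefixes = tuple(prefixes)
    return {
        r.person_id
        for r in records
        if r.code_set == code_set and r.code.startswith(prefixes)
    }


# Filtering on the code alone mixes the two meanings of F29.
by_code_only = {r.person_id for r in records if r.code == "F29"}

# Filtering on code set and code keeps only the intended meaning.
cases = select_cases(records, "ICD10", ["F29"])

# Simplification for this sketch: every other person is a control.
controls = {r.person_id for r in records} - cases

print(sorted(by_code_only), sorted(cases), sorted(controls))
```

Filtering on the code alone would wrongly include the person whose F29 comes from ICPC2, while the code set filter keeps only the ICD10 record.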
