Analysis covariates

This page was last updated for R11.

Sandbox directory

Analysis covariates are available in the following Sandbox directory:

/finngen/library-red/finngen_R[RELEASE]/analysis_covariates
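As a small illustration, the directory for a given release can be built by filling in the [RELEASE] placeholder. This is a minimal sketch: the helper name covariate_dir is hypothetical, and it assumes the placeholder is replaced by the release number alone (for example 11 for R11).

```python
from pathlib import PurePosixPath


def covariate_dir(release: int) -> PurePosixPath:
    """Return the analysis covariates directory for a release number.

    Assumption: [RELEASE] in the documented path is the bare release
    number, so release 11 gives .../finngen_R11/...
    """
    return PurePosixPath(
        f"/finngen/library-red/finngen_R{release}/analysis_covariates"
    )


print(covariate_dir(11))
```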

Data files

The analysis covariate file is a tab-separated, gzip-compressed text file that contains covariate and endpoint data for each sample. The file contains three sets of columns:

  • column 1: Sample ID

  • columns 2 to N: covariates including principal components, usually ~30 columns

  • columns N+1 to N+number of endpoints: individual's phenotype status for each FinnGen endpoint

The covariate file does not contain FinnGen genotypes for individuals with non-Finnish ancestry. For more complete phenotype data see the phenotype files.

Users often subset this file in R to run their own analyses or to add further analysis columns.
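The same kind of subsetting can be sketched outside R as well. The example below uses only the Python standard library to copy a chosen set of columns from a tab-separated, gzip-compressed file into a new one. The function name subset_covariates, the demo file names, the sample IDs, and the endpoint column ENDPOINT_A are all made up for illustration; the real file name is not given on this page.

```python
import csv
import gzip
import os
import tempfile


def subset_covariates(in_path, out_path, keep_columns):
    """Copy only keep_columns from a gzipped TSV into a new gzipped TSV."""
    with gzip.open(in_path, "rt", newline="") as src, \
            gzip.open(out_path, "wt", newline="") as dst:
        reader = csv.DictReader(src, delimiter="\t")
        available = reader.fieldnames or []
        missing = [c for c in keep_columns if c not in available]
        if missing:
            raise KeyError(f"columns not in file: {missing}")
        writer = csv.DictWriter(
            dst,
            fieldnames=keep_columns,
            delimiter="\t",
            extrasaction="ignore",  # silently drop columns we did not ask for
            lineterminator="\n",
        )
        writer.writeheader()
        for row in reader:
            writer.writerow(row)


# Synthetic demo data (not real FinnGen values), only to show the call.
demo_rows = [
    ["FINNGENID", "SEX", "PC1", "ENDPOINT_A"],
    ["DEMO_1", "x", "0.01", "0"],
    ["DEMO_2", "y", "-0.02", "NA"],
]

with tempfile.TemporaryDirectory() as tmp:
    in_path = os.path.join(tmp, "covariates_demo.tsv.gz")
    out_path = os.path.join(tmp, "covariates_subset.tsv.gz")
    with gzip.open(in_path, "wt", newline="") as fh:
        csv.writer(fh, delimiter="\t", lineterminator="\n").writerows(demo_rows)

    subset_covariates(in_path, out_path, ["FINNGENID", "PC1", "ENDPOINT_A"])

    with gzip.open(out_path, "rt", newline="") as fh:
        subset = list(csv.reader(fh, delimiter="\t"))

print(subset)
```

Because rows are streamed one at a time, this approach does not need to hold the whole file in memory, which may help when the file has many endpoint columns.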

Columns

Column name                       Description
FINNGENID                         Sample ID
AGE_AT_DEATH_OR_END_OF_FOLLOWUP   Age of sample at death or end of followup
batch                             Genotyping batch of the sample
n_var                             Number of genotyped variants
chip                              Chip used for genotyping
IS_AFFY                           Whether the sample was genotyped using Affymetrix v1 chip
IS_FINNGEN1_CHIP                  Whether the sample was genotyped using FinnGen v1 chip
IS_FINNGEN2_CHIP                  Whether the sample was genotyped using FinnGen v2 chip
IS_AFFY_V2                        Whether the sample was genotyped using Affymetrix v2 chip
IS_AFFY_V2P2                      Whether the sample was genotyped using Affymetrix v2, p2 chip
IS_AFFY_V3                        Whether the sample was genotyped using Affymetrix v3 chip
AGE_AT_DEATH_OR_END_OF_FOLLOWUP2  AGE_AT_DEATH_OR_END_OF_FOLLOWUP squared (AGE_AT_DEATH_OR_END_OF_FOLLOWUP * AGE_AT_DEATH_OR_END_OF_FOLLOWUP)
BATCH*                            Whether the sample was part of that genotyping batch. Can be used to control for batch-specific effects in analysis.
BL_*                              See endpoint data documentation
SEX                               See endpoint data documentation
*_IRN                             Inverse rank-normalized quantitative endpoints
PC*                               Individual's value for that principal component
*                                 Individual's phenotype status (0/1/NA) for the FinnGen endpoint
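The column-name patterns listed above can be used to sort a file header into groups. The sketch below is one way to do this with shell-style wildcards from the Python standard library. The function name classify_column, the group labels, and the demo header are hypothetical. It assumes that any column not matching a documented name or pattern is an endpoint status column, so an endpoint whose name happens to start with PC or BATCH would be misclassified.

```python
from fnmatch import fnmatchcase

# Covariate columns that the table documents by exact name.
EXACT_COVARIATES = {
    "AGE_AT_DEATH_OR_END_OF_FOLLOWUP",
    "AGE_AT_DEATH_OR_END_OF_FOLLOWUP2",
    "batch",
    "n_var",
    "chip",
    "IS_AFFY",
    "IS_FINNGEN1_CHIP",
    "IS_FINNGEN2_CHIP",
    "IS_AFFY_V2",
    "IS_AFFY_V2P2",
    "IS_AFFY_V3",
    "SEX",
}

# Wildcard rows of the table, checked in order (labels are illustrative).
PATTERN_GROUPS = [
    ("pc", "PC*"),
    ("batch_indicator", "BATCH*"),
    ("bl", "BL_*"),
    ("irn", "*_IRN"),
]


def classify_column(name: str) -> str:
    """Return a group label for one column name (case-sensitive)."""
    if name == "FINNGENID":
        return "sample_id"
    if name in EXACT_COVARIATES:
        return "covariate"
    for group, pattern in PATTERN_GROUPS:
        if fnmatchcase(name, pattern):
            return group
    # Assumption: everything else is an endpoint status column.
    return "endpoint"


# Hypothetical header, for illustration only.
demo_header = ["FINNGENID", "SEX", "batch", "PC1", "BATCH_DEMO",
               "BL_DEMO", "DEMO_IRN", "ENDPOINT_A"]
groups = {name: classify_column(name) for name in demo_header}
print(groups)
```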

Further information

The covariate file is used for GWAS and other analyses. The following covariates are used in FinnGen's core GWAS analyses:

  • Age

  • Sex

  • First 10 principal components

  • Genotyping batch (FinnGen 1 or 2 chip and legacy genotyping batch)
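A covariate list along these lines can be assembled from the file header. The following is only a sketch under stated assumptions, not the core analysis configuration: it assumes age is taken from AGE_AT_DEATH_OR_END_OF_FOLLOWUP, that the principal components are named PC1 to PC10, and that the batch terms correspond to the FinnGen chip indicators and the BATCH* columns. The function name and the demo header are hypothetical.

```python
def select_gwas_style_covariates(header, n_pcs=10):
    """Pick age, sex, the first n_pcs PCs, and chip/batch indicator columns.

    Assumptions (not confirmed by this page): the age column is
    AGE_AT_DEATH_OR_END_OF_FOLLOWUP and PCs are named PC1, PC2, ...
    """
    present = set(header)
    fixed = ["AGE_AT_DEATH_OR_END_OF_FOLLOWUP", "SEX"]
    fixed += [f"PC{i}" for i in range(1, n_pcs + 1)]
    missing = [c for c in fixed if c not in present]
    if missing:
        raise KeyError(f"expected columns not in header: {missing}")
    # Keep indicator columns in the order they appear in the file.
    chips = [c for c in header if c in ("IS_FINNGEN1_CHIP", "IS_FINNGEN2_CHIP")]
    batches = [c for c in header if c.startswith("BATCH")]
    return fixed + chips + batches


# Hypothetical header, for illustration only.
demo_header = (
    ["FINNGENID", "AGE_AT_DEATH_OR_END_OF_FOLLOWUP", "SEX"]
    + [f"PC{i}" for i in range(1, 13)]
    + ["IS_FINNGEN2_CHIP", "BATCH_DEMO_A", "ENDPOINT_A"]
)
covariates = select_gwas_style_covariates(demo_header)
print(",".join(covariates))
```

When a full set of indicator columns is used in a regression with an intercept, one indicator per group is normally left out as the reference level to avoid perfect collinearity; the selection above does not do that for you.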

Note: This file is usually released a little later than the phenotype files because it cannot be created until the PCA results are available.
