Analysis covariates
This page was last updated for R11.
Sandbox directory
Analysis covariates are available in the following Sandbox directory:
/finngen/library-red/finngen_R[RELEASE]/analysis_covariates
Data files
The analysis covariate file is a tab-separated, gzip-compressed text file that contains covariate and endpoint data for each sample. The file contains three sets of columns:
column 1: Sample ID
columns 2 to N: covariates, including principal components (usually about 30 columns)
columns N+1 onward: each individual's phenotype status, one column per FinnGen endpoint
The covariate file does not include genotyped FinnGen individuals with non-Finnish ancestry. For more complete phenotype data, see the phenotype files.
Users often subset this file in R to run their own analyses or to add further analysis columns.
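The same kind of subsetting can be sketched outside R as well. The example below is a minimal illustration in Python using only the standard library, not an official FinnGen tool. The function names `read_covariate_subset` and `parse_status` are hypothetical, and the file name in the usage note is a placeholder, since the actual file name inside the directory is not given on this page.

```python
import csv
import gzip


def read_covariate_subset(path, columns):
    """Read a tab-separated, gzip-compressed file and keep only `columns`.

    Returns one dict per sample. All values are kept as strings.
    """
    with gzip.open(path, mode="rt", encoding="utf-8", newline="") as handle:
        reader = csv.DictReader(handle, delimiter="\t")
        header = reader.fieldnames or []
        # Fail early if a requested column is not in the header line.
        missing = [c for c in columns if c not in header]
        if missing:
            raise KeyError(f"columns not found in header: {missing}")
        return [{c: row[c] for c in columns} for row in reader]


def parse_status(value):
    """Convert an endpoint status string ("0", "1" or "NA") to 0, 1 or None."""
    return None if value == "NA" else int(value)
```

A call such as `read_covariate_subset("covariates.tsv.gz", ["FINNGENID", "SEX", "PC1"])` would return the three requested columns for every sample. Because the full file holds one column per endpoint, reading only the columns needed keeps memory use down.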
Columns
Column name | Description |
FINNGENID | Sample ID |
AGE_AT_DEATH_OR_END_OF_FOLLOWUP | Age of the sample at death or end of follow-up |
batch | Genotyping batch of the sample |
n_var | Number of genotyped variants |
chip | Chip used for genotyping |
IS_AFFY | Whether the sample was genotyped using the Affymetrix v1 chip |
IS_FINNGEN1_CHIP | Whether the sample was genotyped using the FinnGen v1 chip |
IS_FINNGEN2_CHIP | Whether the sample was genotyped using the FinnGen v2 chip |
IS_AFFY_V2 | Whether the sample was genotyped using the Affymetrix v2 chip |
IS_AFFY_V2P2 | Whether the sample was genotyped using the Affymetrix v2, p2 chip |
IS_AFFY_V3 | Whether the sample was genotyped using the Affymetrix v3 chip |
AGE_AT_DEATH_OR_END_OF_FOLLOWUP2 | Square of AGE_AT_DEATH_OR_END_OF_FOLLOWUP (AGE_AT_DEATH_OR_END_OF_FOLLOWUP * AGE_AT_DEATH_OR_END_OF_FOLLOWUP) |
BATCH* | Whether the sample was part of that genotyping batch. Can be used to control for batch-specific effects in analysis. |
BL_* | See endpoint data documentation |
SEX | See endpoint data documentation |
*_IRN | Inverse rank-normalized quantitative endpoints |
PC* | Individual's score on that principal component |
* | Individual's phenotype status (0/1/NA) for the FinnGen endpoint |
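The naming patterns in the table above can be used to sort the header of the file into column groups. The following is a rough Python sketch under stated assumptions: that `PC*` columns are named `PC` followed by digits, that the patterns are case-sensitive (so `batch` and `BATCH*` are distinct), and that any column matching none of the listed names or patterns is an endpoint. The functions `classify_column` and `group_columns` are hypothetical helpers, not part of any FinnGen tooling.

```python
import re

# Fixed covariate column names, taken from the table above.
FIXED_COVARIATES = {
    "AGE_AT_DEATH_OR_END_OF_FOLLOWUP", "AGE_AT_DEATH_OR_END_OF_FOLLOWUP2",
    "batch", "n_var", "chip", "SEX",
    "IS_AFFY", "IS_FINNGEN1_CHIP", "IS_FINNGEN2_CHIP",
    "IS_AFFY_V2", "IS_AFFY_V2P2", "IS_AFFY_V3",
}


def classify_column(name):
    """Assign a column name to a group based on the table's naming patterns."""
    if name == "FINNGENID":
        return "sample_id"
    if name in FIXED_COVARIATES:
        return "covariate"
    if re.fullmatch(r"PC\d+", name):  # assumption: PC1, PC2, ...
        return "principal_component"
    if name.startswith("BATCH"):
        return "batch_indicator"
    if name.startswith("BL_"):
        return "baseline"
    if name.endswith("_IRN"):
        return "irn_endpoint"
    return "endpoint"  # assumption: everything else is an endpoint


def group_columns(header):
    """Group header names by class, preserving their order in the file."""
    groups = {}
    for name in header:
        groups.setdefault(classify_column(name), []).append(name)
    return groups
```

Calling `group_columns` on the header line gives, for example, the list of endpoint columns under the `"endpoint"` key. It is worth checking the result against the real header before relying on it, because the catch-all rule will mislabel any covariate not listed in the table.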
Further information
The covariate file is used for GWAS and other analyses. The following covariates are used in FinnGen's core GWAS analyses:
Age
Sex
First 10 principal components
Genotyping batch (FinnGen 1 or 2 chip and legacy genotyping batch)
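As an illustration of how the list above might map onto columns of the covariate file, the Python sketch below assembles a covariate list from a header. The mapping is an assumption, not a description of the core pipeline: it takes `AGE_AT_DEATH_OR_END_OF_FOLLOWUP` for age, `PC1` to `PC10` for the principal components, and the two FinnGen chip flags plus every `BATCH*` column for genotyping batch. Which columns the core analyses actually use is not stated on this page. `core_covariates` is a hypothetical helper name.

```python
def core_covariates(header, n_pcs=10):
    """Pick age, sex, the first `n_pcs` PCs and batch columns from `header`.

    The column choices are assumptions made for illustration only.
    """
    wanted = ["AGE_AT_DEATH_OR_END_OF_FOLLOWUP", "SEX"]
    wanted += [f"PC{i}" for i in range(1, n_pcs + 1)]  # assumed PC naming
    wanted += ["IS_FINNGEN1_CHIP", "IS_FINNGEN2_CHIP"]  # assumed chip flags

    present = set(header)
    missing = [c for c in wanted if c not in present]
    if missing:
        raise KeyError(f"columns not found in header: {missing}")

    # Batch indicator columns, in the order they appear in the file.
    batches = [c for c in header if c.startswith("BATCH")]
    return wanted + batches
```

The returned list could then be joined into a comma-separated string for a tool that accepts covariate names on the command line. A real model would likely need redundant indicator columns removed first, since chip and batch flags can overlap.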
Note: This file is usually released slightly later than the phenotype files because it cannot be created until the PCA results are available.