Principal components analysis (PCA) data
This page was last updated for R11.
Sandbox directory
Principal components analysis (PCA) data is available in the following Sandbox directory:
/finngen/library-red/finngen_R[RELEASE]/pca_1.0
Data files
The following types of files are available:
PCA files
Sample list files
Please refer to the readme file in the sandbox directory for full details of the available data.
PCA files
PCA files are in the /data subdirectory.
finngen_R[RELEASE].eigenval.txt: List of eigenvalues of the PCA. The nth line contains the eigenvalue for the nth principal component.
finngen_R[RELEASE].eigenvec.txt: Principal components of all samples used in the analysis.
finngen_R[RELEASE]_eigenvec.var: Loading of each variant.
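As a rough illustration of how the eigenvalue file described above could be used, the following sketch reads one eigenvalue per line and computes each component's share of the summed eigenvalues. It assumes the file has no header line, and the helper names `read_eigenvalues` and `relative_shares` are hypothetical. The shares are relative to the listed components only, not to the total variance in the genotype data.

```python
from typing import Iterable, List


def read_eigenvalues(lines: Iterable[str]) -> List[float]:
    """Parse one eigenvalue per line; the nth value belongs to the nth PC.

    Assumption: no header line. Blank lines are skipped.
    """
    return [float(line) for line in lines if line.strip()]


def relative_shares(eigenvalues: List[float]) -> List[float]:
    """Share of each eigenvalue in the sum of the listed eigenvalues."""
    total = sum(eigenvalues)
    return [value / total for value in eigenvalues]


# Usage sketch; replace [RELEASE] with the release number you are using.
# with open("finngen_R[RELEASE].eigenval.txt") as handle:
#     shares = relative_shares(read_eigenvalues(handle))
```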
Columns in the finngen_R[RELEASE].eigenvec.txt file:
Column | Description |
FID | Family ID (same as IID in FinnGen) |
IID | Sample ID |
PCn | Value of the nth principal component |
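The column layout above can be read with a few lines of standard-library Python. This is a minimal sketch, assuming the file is whitespace- or tab-delimited and starts with a header row naming the columns. Check the readme in the Sandbox directory for the actual delimiter. The function name `read_eigenvec` is hypothetical.

```python
from typing import Dict, Iterable, List


def read_eigenvec(lines: Iterable[str]) -> Dict[str, List[float]]:
    """Map each sample ID (IID) to its list of principal component values.

    Assumptions: delimited by whitespace or tabs, with a header row that
    contains an IID column and PC columns named PC1, PC2, ...
    """
    rows = (line.split() for line in lines if line.strip())
    header = next(rows)
    iid_index = header.index("IID")
    pc_indexes = [i for i, name in enumerate(header) if name.startswith("PC")]
    return {row[iid_index]: [float(row[i]) for i in pc_indexes] for row in rows}


# Usage sketch; replace [RELEASE] with the release number you are using.
# with open("finngen_R[RELEASE].eigenvec.txt") as handle:
#     pcs = read_eigenvec(handle)
```

Keying by IID alone is enough here because FID repeats the same value.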
Columns in the finngen_R[RELEASE]_eigenvec.var file:
Column | Description |
#CHROM | Chromosome number |
ID | Variant name |
MAJ | Major allele |
NONMAJ | Non-major allele |
PCn | Loading of the nth principal component |
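One possible use of the loadings table is to find the variants that contribute most strongly to a given component. The sketch below ranks variants by absolute loading. It assumes a whitespace- or tab-delimited file with the header row shown in the table above, and the function name `top_loadings` is hypothetical.

```python
import heapq
from typing import Iterable, List, Tuple


def top_loadings(lines: Iterable[str], pc: str = "PC1", n: int = 10) -> List[Tuple[str, float]]:
    """Return the n variants with the largest absolute loading on one PC.

    Assumptions: delimited by whitespace or tabs, with a header row that
    contains an ID column and a column named after the requested PC.
    """
    rows = (line.split() for line in lines if line.strip())
    header = next(rows)
    id_index = header.index("ID")
    pc_index = header.index(pc)
    pairs = ((row[id_index], float(row[pc_index])) for row in rows)
    return heapq.nlargest(n, pairs, key=lambda pair: abs(pair[1]))


# Usage sketch; replace [RELEASE] with the release number you are using.
# with open("finngen_R[RELEASE]_eigenvec.var") as handle:
#     strongest = top_loadings(handle, pc="PC1", n=20)
```

Using `heapq.nlargest` over a generator avoids holding every variant in memory, which may matter if the loadings file is large.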
Sample list files
Sample list files are in the /data subdirectory.
These files may have a single column of sample IDs or they may be in the Fam format. A file in the Fam format contains two tab-separated columns that both have the same sample IDs.
finngen_R[RELEASE]_unrelated.txt: Fam file of unrelated samples used for the final PCA.
finngen_R[RELEASE]_related.txt: Fam file of samples related to the previous group, projected onto their PC space.
finngen_R[RELEASE]_rejected.txt: Fam file of rejected samples.
finngen_R[RELEASE]_duplicates.txt: List of duplicate samples.
finngen_R[RELEASE]_total_ethnic_outliers.txt: List of samples not of Finnish ancestry.
finngen_R[RELEASE]_final_samples.txt: Fam file of all samples included in the analysis.
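Because a sample list may be either a single column or a two-column Fam file with identical IDs, a reader that takes the last field on each line handles both layouts. The sketch below assumes the files have no header line. The names `read_sample_ids` and `keep_samples` are hypothetical, and `pcs` stands for any mapping from sample ID to principal component values.

```python
from typing import Dict, Iterable, List, Set


def read_sample_ids(lines: Iterable[str]) -> Set[str]:
    """Collect sample IDs from a single-column list or a two-column Fam file.

    Assumption: no header line. In the Fam format both columns hold the
    same ID, so taking the last field works for either layout.
    """
    ids = set()
    for line in lines:
        fields = line.split()
        if fields:
            ids.add(fields[-1])
    return ids


def keep_samples(pcs: Dict[str, List[float]], sample_ids: Set[str]) -> Dict[str, List[float]]:
    """Restrict a sample-to-PC mapping to the samples in sample_ids."""
    return {iid: values for iid, values in pcs.items() if iid in sample_ids}


# Usage sketch; replace [RELEASE] with the release number you are using.
# with open("finngen_R[RELEASE]_unrelated.txt") as handle:
#     unrelated_pcs = keep_samples(pcs, read_sample_ids(handle))
```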
Further information
See also: FAQ Where can I find a list of unrelated individuals in FinnGen?