Genome build used in FinnGen

FinnGen genotype data use the human reference genome build GRCh38/hg38. All alleles are coded according to the GRCh38/hg38 reference alleles, and all alternative alleles are left-aligned with respect to the original variant position.
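
To illustrate what left-alignment means in practice, the following is a minimal sketch of the commonly used indel normalization procedure (the kind of operation performed by tools such as `bcftools norm`). It is not taken from the FinnGen pipeline: the function name `left_align`, its parameters, and the toy reference sequence are hypothetical and chosen only for illustration.

```python
def left_align(ref_seq, pos, ref, alt):
    """Left-align one biallelic variant against a reference sequence.

    Hypothetical helper for illustration only.
    ref_seq: reference sequence as a string
    pos:     1-based position of the first base of `ref` in `ref_seq`
    Returns the normalized (pos, ref, alt).
    """
    ref, alt = ref.upper(), alt.upper()
    changed = True
    while changed:
        changed = False
        # Drop a shared trailing base.
        if ref and alt and ref[-1] == alt[-1]:
            ref, alt = ref[:-1], alt[:-1]
            changed = True
        # If an allele became empty, extend both alleles one base to the left.
        if not ref or not alt:
            if pos == 1:
                break  # start of sequence reached; cannot extend further
            pos -= 1
            base = ref_seq[pos - 1]
            ref, alt = base + ref, base + alt
            changed = True
    # Drop shared leading bases while both alleles keep at least one base.
    while len(ref) > 1 and len(alt) > 1 and ref[0] == alt[0]:
        ref, alt = ref[1:], alt[1:]
        pos += 1
    return pos, ref, alt


# Toy example: a CA deletion inside a CA repeat, reported at position 8.
toy_reference = "GGGCACACACAGGG"
print(left_align(toy_reference, 8, "CAC", "C"))  # (3, 'GCA', 'G')
```

In this toy example the same deletion could be written at several positions within the repeat; left-alignment picks the leftmost representation so that identical variants are always reported at the same position.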

Chip array datasets generated on earlier reference genome builds are lifted over to the GRCh38/hg38 build according to this protocol.

Genotype reporting

  • In imputed data, alleles are coded 0 or 1 and separated by ‘|’ (phased): 0 denotes the reference/wild-type (WT) allele and 1 the alternative allele.

  • In raw chip data, alleles are coded 0 or 1 and separated by ‘/’ (unphased): 0 denotes the reference/wild-type (WT) allele and 1 the alternative allele.

Therefore, genotypes can be of the form:

Post-Imputation Genotype Files (phased):

  • 0|1 or 1|0 heterozygotes

  • 1|1 alternative allele homozygotes

  • 0|0 WT homozygotes

  • .|. missing data

Raw Chip Data (unphased):

  • 0/1 heterozygotes

  • 1/1 alternative allele homozygotes

  • 0/0 WT homozygotes

  • ./. missing data
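
The genotype codes listed above can be interpreted programmatically. The sketch below is a minimal example, assuming each genotype arrives as a plain string such as `0|1` or `./.`; the function name `parse_genotype` is hypothetical and not part of any FinnGen tooling.

```python
import re

# Two allele codes (0, 1 or '.') separated by '|' (phased) or '/' (unphased).
_GT_PATTERN = re.compile(r"([01.])([|/])([01.])")


def parse_genotype(gt):
    """Return (phased, alt_count) for a biallelic genotype string.

    Hypothetical helper for illustration only.
    phased:    True for '|' (imputed data), False for '/' (raw chip data)
    alt_count: number of alternative alleles (0, 1 or 2), or None if missing
    """
    match = _GT_PATTERN.fullmatch(gt.strip())
    if match is None:
        raise ValueError(f"Unrecognized genotype: {gt!r}")
    first, separator, second = match.groups()
    phased = separator == "|"
    if first == "." or second == ".":
        return phased, None  # missing data
    return phased, int(first) + int(second)


print(parse_genotype("0|1"))  # (True, 1)   phased heterozygote
print(parse_genotype("1/1"))  # (False, 2)  unphased alternative homozygote
print(parse_genotype("./."))  # (False, None) missing data
```

Counting alternative alleles this way treats 0|1 and 1|0 as the same heterozygous genotype; analyses that depend on phase would need to keep the two allele codes separate.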

Click here to read how to work with variant level data in the Sandbox
