EA3 study: Fatty liver disease study and data in Sandbox

If you wish to use EA3 data, please familiarize yourself with these instructions.

Sandbox directory

Latest version

FLD status and lab measurement data from Helsinki, Auria, Tampere and Kuopio biobanks:

/finngen/library-red/EA3_FLD_3.0/

Previous versions

FLD status and lab measurement data from Helsinki, Auria and Kuopio biobanks:

/finngen/library-red/EA3_FLD_2.0/

FLD status only from Helsinki, Auria, Tampere and Kuopio biobanks:

/finngen/library-red/EA3_FLD_1.0/
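
When scripting against these directories, it can help to select the newest release programmatically rather than hard-coding it. The following is a minimal sketch. The three paths are copied from this page, but the helper names (version_key, latest_release) and the assumption that every release directory follows the EA3_FLD_<major>.<minor> naming pattern are illustrative only.

```python
import re

# Release directories as listed on this page.
FLD_DIRS = [
    "/finngen/library-red/EA3_FLD_1.0/",
    "/finngen/library-red/EA3_FLD_2.0/",
    "/finngen/library-red/EA3_FLD_3.0/",
]

# Assumption: release directories end in EA3_FLD_<major>.<minor>
_VERSION_RE = re.compile(r"EA3_FLD_(\d+)\.(\d+)/?$")


def version_key(path):
    """Return (major, minor) parsed from a release directory path."""
    match = _VERSION_RE.search(path)
    if match is None:
        raise ValueError(f"not an EA3_FLD release directory: {path!r}")
    return int(match.group(1)), int(match.group(2))


def latest_release(paths):
    """Return the path with the highest (major, minor) version."""
    return max(paths, key=version_key)


if __name__ == "__main__":
    print(latest_release(FLD_DIRS))
```

Comparing the versions as integer tuples rather than as strings keeps the ordering correct if a release such as 10.0 is ever added.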

Study description

Expansion Area 3 (EA3) studies aim to acquire additional clinical data centered around specific diseases through biobanks.

Aim: The fatty liver disease (FLD) deep phenotyping study aimed to capture information on fatty liver and cirrhosis status, on biomarkers related to liver disease, metabolic state, and alcohol consumption, and on biopsy and elastography results that allow a more detailed assessment of steatosis/cirrhosis status. Fatty liver disease status was assessed from abdominal imaging reports using a text mining algorithm. Fatty liver disease was selected as an EA3 study trait because ICD codes for fatty liver diseases are rarely used in existing registry data collections. The study aimed to provide more accurate information on fatty liver and cirrhosis status, and on fatty liver subtype definitions, in real-world data settings.

Methods: Data were captured from four biobanks: Auria, Helsinki, Tampere, and Eastern Finland. The search algorithms were prepared by Samu Kurki and were first tested and validated in the HUS Hospital data lake. In the first phase, fatty liver and cirrhosis status were text-mined from imaging reports in electronic health records. In addition, laboratory databases were searched for relevant biomarkers (blood glucose, lipids, liver enzymes, markers of alcohol consumption, etc.) and for BMI.
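
To give a rough idea of what text mining a status from imaging reports can look like, here is a toy sketch of a keyword search with a simple negation check. It is not the algorithm used in the study, which is not described on this page. The English keywords, the negation words, and the function name mentions_fatty_liver are all hypothetical choices made for illustration.

```python
import re

# Hypothetical English keywords, for illustration only.
FLD_PATTERN = re.compile(r"\b(fatty liver|hepatic steatosis|steatosis)\b", re.IGNORECASE)
# Hypothetical negation cues that cancel a mention later in the same sentence.
NEGATION_PATTERN = re.compile(r"\b(no|not|without)\b", re.IGNORECASE)


def mentions_fatty_liver(report):
    """Return True if some sentence contains a keyword with no negation cue before it."""
    for sentence in re.split(r"[.!?]\s*", report):
        match = FLD_PATTERN.search(sentence)
        if match is None:
            continue
        # Only the text before the first keyword in the sentence is checked.
        preceding = sentence[:match.start()]
        if NEGATION_PATTERN.search(preceding):
            continue
        return True
    return False


if __name__ == "__main__":
    print(mentions_fatty_liver("Liver is enlarged with hepatic steatosis."))
    print(mentions_fatty_liver("No signs of fatty liver. Spleen normal."))
```

A rule this simple misses many phrasings and would need validation against manually reviewed reports before any real use.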

Biopsy and elastography measurements will be acquired in the next phase, during 2023.

Data: The FLD data have been placed in the Sandbox and are available for analysis. FLD samples are from DF10. Data were collected from four of the six hospital biobanks, and the collection period is 31.5.2012-31.5.2022.

Summary of the data (Oct 2023):

Last updated