Atlas vs BigQuery cohorts

There are different ways to construct a cohort within Sandbox. You can construct your cohort with Atlas in Sandbox by following the steps here. You can also construct your cohort using BigQuery data tables.

For a cohort constructed using essentially the same criteria in Atlas and in BigQuery, the sample counts will differ for the following reasons.

One reason is that not all medical codes present in the service sector detailed longitudinal data are present in the medical vocabularies.

Suppose you are looking for the asthma ICD-8 diagnosis code 49320, which is present in the service sector BigQuery table. The same ICD-8 code is absent from the medical vocabularies, so a cohort constructed in Atlas will not pull in any FINNGENIDs on the basis of this code.
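
The ICD-8 example above can be sketched in a few lines of Python. This is only an illustration on toy data: the FINNGENIDs, the row layout, the vocabulary contents and the helper name split_by_vocabulary are all hypothetical, and the real OMOP conversion is not shown here.

```python
# Toy data only: IDs, rows and vocabulary contents are hypothetical.
service_sector_rows = [
    {"FINNGENID": "FG0001", "vocabulary": "ICD8", "code": "49320"},
    {"FINNGENID": "FG0002", "vocabulary": "ICD10", "code": "J45.9"},
]

# Assumed stand-in for the medical vocabularies: ICD-8 49320 is not in it.
vocabulary_codes = {("ICD10", "J45.9")}


def split_by_vocabulary(rows, vocab):
    """Return (mapped, unmapped) FINNGENIDs depending on whether the code is in vocab."""
    mapped, unmapped = [], []
    for row in rows:
        key = (row["vocabulary"], row["code"])
        (mapped if key in vocab else unmapped).append(row["FINNGENID"])
    return mapped, unmapped


mapped_ids, unmapped_ids = split_by_vocabulary(service_sector_rows, vocabulary_codes)
print("reachable through the vocabularies:", mapped_ids)
print("present in the table but not in the vocabularies:", unmapped_ids)
```

In this sketch, a BigQuery query on the code column would find both rows, while a cohort that depends on the vocabulary lookup would find only the second.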

Another reason is that the current OMOP conversion considers all the medical codes in a visit together (a visit is one row in the service sector data).

Suppose a row in the service sector BigQuery table has the asthma ICD-10 diagnosis code J45.9 as CODE1 and I10 as CODE2. The current OMOP conversion process combines CODE1 and CODE2 into the single code J45.9+I10. This combined code is absent from the medical vocabularies, so the visit will also be absent from the cohort constructed in Atlas.
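
A minimal sketch of the combined-code case, again on toy data. The "+" join follows the J45.9+I10 example above. The helper name visit_code, the FINNGENID and the vocabulary contents are assumptions made for illustration, not the actual conversion logic.

```python
# Toy data only: the ID and vocabulary contents are hypothetical.
vocabulary_codes = {"J45.9", "I10"}  # both codes exist individually


def visit_code(code1, code2=None):
    """Join the codes recorded on one visit row, as in the J45.9+I10 example."""
    return code1 if not code2 else f"{code1}+{code2}"


row = {"FINNGENID": "FG0003", "CODE1": "J45.9", "CODE2": "I10"}
combined = visit_code(row["CODE1"], row["CODE2"])

# Matching on CODE1 alone, as a BigQuery filter on that column would.
found_by_code1 = row["CODE1"] in vocabulary_codes
# Matching on the combined code, as described for the conversion.
found_combined = combined in vocabulary_codes

print(combined, found_by_code1, found_combined)
```

The same row is found when CODE1 is matched on its own and missed when the combined code is looked up, which is the source of the count difference described above.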

Another reason is that when the information about which medical vocabulary a code belongs to is missing, the OMOP conversion misses such visits.
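
As a rough illustration of this last case, the toy rows below differ only in whether the vocabulary field is filled in. The field names and FINNGENIDs are hypothetical, and the filter is an assumed simplification of the conversion's behaviour.

```python
# Toy data only: field names and IDs are hypothetical.
rows = [
    {"FINNGENID": "FG0004", "vocabulary": "ICD10", "code": "J45.9"},
    {"FINNGENID": "FG0005", "vocabulary": None, "code": "J45.9"},
]

# Rows without vocabulary information cannot be assigned to a vocabulary.
convertible = [r["FINNGENID"] for r in rows if r["vocabulary"]]
missed = [r["FINNGENID"] for r in rows if not r["vocabulary"]]

print("convertible:", convertible)
print("missed:", missed)
```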
