There are two variations of this example. The first example gets the data for a list of FINNGENIDs and adds English descriptions of codes to data by using FinnGenUtilsR package, and the second example reads the list of FINNGENIDs from a text file having a column titled FINNGENID (as one of the columns).
Get all the data for a list of FINNGENID's
In this version, the full data is fetched for a given list of FINNGENID's (which may be a single id as well).
# helper function# returns a data frame with all the data for given FINNGENID'sget_all_data<-function(...){ finngenids <-list(...) sql<-paste0( "SELECT * ", "FROM finngen-production-library.",".sandbox_tools_r10.", "finngen_r10_service_sector_detailed_longitudinal_v2 ", "WHERE FINNGENID IN (", paste0("'", finngenids, "'", collapse =","), ")" ) tb <-bq_project_query(projectid, sql)} # give a list of FINNGENIDs here (these are fake ID's)tb <-get_all_data('FG00001', 'FG00002', 'FG00003')# extend with code translation by using FinnGenUtilsR packagedf <- FinnGenUtilsR::fg_bq_append_code_info_to_longitudinal_data( projectid, tb,"finngen-production-library.medical_codes.fg_codes_info_v1")df_with_codes <-bq_table_download(df)%>%arrange(EVENT_AGE)%>%mutate(AGE =round(EVENT_AGE)) %>%select(FINNGENID, AGE, SOURCE, name_en, CODE1, CODE2, CODE3)# # for a small data set we can use an interactive table# DT::datatable(df_with_codes)# this works even with large data setsdf_with_codes
Get all the data for a file of FINNGENID's
The second variation reads the FINNGENID's from a file which is more convenient if you have many of them. FINNGENIDs are given as a tab-separated text file (.tsv) where there must be a column named "FINNGENID". This can be a file generated by other FinnGen tools, e.g. Cohort Operations.
# create an example data file with fake FINNGENID'sfinngenids <-tribble(~FINNGENID, ~OTHER_COLUMN, ~YET_ANOTHER_COLUMN,'FG00002', 1, 2,'FG00003', 3, 4,'FG00004', 5, 6)write_tsv(finngenids, "FINNGENIDS.TSV")get_all_data<-function(...){ finngenids <-read_tsv(filename, show_col_types =FALSE)%>%pull(FINNGENID) sql<-paste0( "SELECT * ", "FROM finngen-production-library.",".sandbox_tools_r10.", "finngen_r10_service_sector_detailed_longitudinal_v2 ", "WHERE FINNGENID IN (", paste0("'", finngenids, "'", collapse =","), ")" ) tb <-bq_project_query(projectid, sql)} # give FINNGENIDs heretb <-get_all_data("FINNGENIDS.TSV")df <- FinnGenUtilsR::fg_bq_append_code_info_to_longitudinal_data( projectid, tb,"finngen-production-library.medical_codes.fg_codes_info_v1")df_with_codes <-bq_table_download(df)%>%arrange(EVENT_AGE)%>%mutate(AGE =round(EVENT_AGE)) %>%select(FINNGENID, AGE, SOURCE, name_en, CODE1, CODE2, CODE3)# # for a small data set we can use an interactive table# DT::datatable(df_with_codes)# this works with larger data setsdf_with_codes
Location of the script's in the Sandbox
Below is a path to the R Markdown file in Sandbox. This R Markdown file can be converted to an html page using "knit" command. The R Markdown is a plain text file, so you can copy code snippets that you need.