Case study – UpSet plot

Suppose you'd like to visualize some common comorbidities of asthma in the FinnGen population. Perhaps the most intuitive way to see how they overlap is an UpSet plot.

There are two important parameters that shape the UpSet plot:

  • nsets, the number of sets to look at (default nsets = 5)

  • nintersects, the number of intersections to plot (default nintersects = 40)

If nintersects is set to NA, all intersections will be plotted.
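As a minimal sketch of how these two parameters change the plot, the example below uses three small made-up sets (the IDs and set names are hypothetical, not FinnGen data) and assumes the UpSetR package is installed.

```r
library(UpSetR)

# Toy sets of made-up IDs (hypothetical, for illustration only)
toy_sets <- list(
  A = c("id1", "id2", "id3", "id4"),
  B = c("id2", "id3", "id5"),
  C = c("id3", "id4", "id6")
)

# fromList() builds a 0/1 membership table:
# one row per unique ID, one column per set
toy_matrix <- fromList(toy_sets)

# Restrict the plot to the 2 largest sets and at most 3 intersections
upset(toy_matrix, nsets = 2, nintersects = 3, order.by = "freq")

# Show all 3 sets and every non-empty intersection
upset(toy_matrix, nsets = 3, nintersects = NA, order.by = "freq")
```

When nsets is smaller than the number of sets in the data, only the largest sets are kept, so it is worth setting nsets explicitly whenever you have more than five sets.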

library(bigrquery)
library(dplyr)

# function to get a character vector of FINNGENIDs for a list of endpoints
# note: projectid (your BigQuery project ID) is assumed to be defined earlier in the session
get_endpoint_ids <- function(...){
  endpoints <- c(...)
  # build the query; the endpoints become a quoted, comma-separated IN list
  sql <- paste0(
    "SELECT DISTINCT FINNGENID ",
    "FROM finngen-production-library.sandbox_tools_r10.endpoint_cohorts_r10_v1 ",
    "WHERE ENDPOINT IN (", paste0("'", endpoints, "'", collapse = ","), ")"
  )
  tb <- bq_project_query(projectid, sql)
  df <- bq_table_download(tb)
  pull(df, FINNGENID)
}

asthma_cohort <- get_endpoint_ids('J10_ASTHMA', 'ASTHMA_MIXED')
chronic_rhinitis_cohort <- get_endpoint_ids('J10_CHRONRHINITIS')
sinusitis_cohort <- get_endpoint_ids('J10_SINUSITIS', 'J10_CHRONSINUSITIS', 'J10_UPPERINFEC')
reflux_cohort <- get_endpoint_ids('K11_REFLUX')
obesity_cohort <- get_endpoint_ids('ASTHMA_OBESITY')
sleep_apnea_cohort <- get_endpoint_ids('G6_SLEEPAPNO')

# all unique IDs across the cohorts (named all_ids to avoid masking base::all)
all_ids <- unique(c(asthma_cohort, chronic_rhinitis_cohort, sinusitis_cohort, reflux_cohort, obesity_cohort, sleep_apnea_cohort))

# recode each FINNGENID as its integer position in all_ids
asthma_cohort <- as.numeric(factor(asthma_cohort, levels = all_ids))
chronic_rhinitis_cohort <- as.numeric(factor(chronic_rhinitis_cohort, levels = all_ids))
sinusitis_cohort <- as.numeric(factor(sinusitis_cohort, levels = all_ids))
reflux_cohort <- as.numeric(factor(reflux_cohort, levels = all_ids))
obesity_cohort <- as.numeric(factor(obesity_cohort, levels = all_ids))
sleep_apnea_cohort <- as.numeric(factor(sleep_apnea_cohort, levels = all_ids))

inputList <- list(
  ASTHMA = asthma_cohort,
  RHINITIS = chronic_rhinitis_cohort,
  SINUSITIS = sinusitis_cohort,
  REFLUX = reflux_cohort,
  OBESITY = obesity_cohort,
  SLEEP_APNEA = sleep_apnea_cohort
)

UpSetR::upset(
  UpSetR::fromList(inputList),
  nsets = 6,
  nintersects = NA, 
  order.by = 'freq', 
  mb.ratio = c(0.7, .3), 
  text.scale = c(1.3, 1.3, 1.3, 0.8, 1.3, 1),
  set_size.show = TRUE, 
  set_size.angles = 0,
  number.angles = 35,
  set_size.scale_max = 240000
)
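The recoding step in the script above, as.numeric(factor(x, levels = ...)), can be easy to misread. The sketch below illustrates it on two small made-up ID vectors (the IDs and variable names are hypothetical): each ID is replaced by its position in the combined vector of unique IDs, so the same person gets the same integer in every cohort.

```r
# Made-up IDs for illustration only (not real FINNGENIDs)
ids_a <- c("FG0003", "FG0001")
ids_b <- c("FG0001", "FG0002")

# Unique IDs in order of first appearance
all_ids <- unique(c(ids_a, ids_b))

# Recode each ID as its integer position in all_ids
idx_a <- as.numeric(factor(ids_a, levels = all_ids))
idx_b <- as.numeric(factor(ids_b, levels = all_ids))

# "FG0001" is at position 2 in all_ids, so it maps to 2 in both cohorts
print(all_ids)
print(idx_a)
print(idx_b)

# match() gives the same positions in a single call
same_as_match <- all(match(ids_b, all_ids) == idx_b)
```

Because the integers are consistent across cohorts, the overlaps between the sets are unchanged, while the plotted data no longer carries the original identifiers.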

Location of the script in the Sandbox

Below is the path to the R Markdown file in the Sandbox. The file can be converted to an HTML page using the "knit" command. R Markdown is a plain text format, so you can copy the code snippets that you need.

/finngen/library-green/scripts/code_snippets/BigQuery_Templates_release.Rmd
