Case study – Tornado plot

There is a known gender difference in the age of schizophrenia onset, and it is clearly visible in FinnGen data. The following R code produces a tornado plot that visualizes this difference in the FinnGen population.

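# packages used below
library(bigrquery)  # bq_project_query(), bq_table_download()
library(dplyr)      # left_join(), mutate(), group_by(), summarise()
library(tidyr)      # drop_na(), complete()
library(forcats)    # fct_drop()
library(ggplot2)    # plotting

# get FINNGENID and the age at first schizophrenia diagnosis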
sql<-paste0(
  "SELECT FINNGENID, MIN(AGE) AS ENTRY_AGE ",
  "FROM finngen_r10_service_sector_detailed_longitudinal_v2.",
    "sandbox_tools_r10.",
    "finngen_r10_endpoint_cohorts_v1 ",
  "WHERE ENDPOINT = 'F5_SCHZPHR' ",
  "GROUP BY FINNGENID " 
)

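# projectid (the BigQuery billing project) is assumed to be defined earlier in the script
tb <- bq_project_query(projectid, sql)
df <- bq_table_download(tb)

# get the gender from baseline
# (df_baseline is assumed to be loaded earlier, with FINNGENID and SEX columns)
df <- left_join(df, df_baseline, by = "FINNGENID")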

df <- df %>%
  mutate(AGE = round(ENTRY_AGE)) %>%
  mutate(AGE = cut(AGE, breaks = seq(from = 0, to = 110, by = 5))) %>%
  select(FINNGENID, AGE, SEX) %>%
  group_by(AGE,SEX) %>%
  summarise(N = n(), .groups = "drop") %>% 
  mutate(AGE = fct_drop(AGE)) %>% 
  drop_na(AGE,SEX) %>% 
  complete(AGE, SEX, fill = list(N = 0)) %>% 
  group_by(SEX) %>%
  mutate(ALL = sum(N)) %>%
  ungroup() %>%
  mutate(PR = round(100 * N/ALL, 3)) %>% 
  mutate(PR = ifelse(SEX == 'female', PR, -PR))

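ggplot(df, aes(x = PR, y = AGE, fill = SEX)) +
  # geom_col() is the same as geom_bar(stat = "identity")
  geom_col(data = subset(df, SEX == 'female')) +
  geom_col(data = subset(df, SEX == 'male')) +
  # show absolute values on the axis; widen the limits if any bar exceeds 15 %,
  # because bars outside the limits are dropped from the plot
  scale_x_continuous(labels = abs, limits = c(-15,15)) +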
  theme_bw() +
  labs(
    title = "Schizophrenia age of onset",
    x = "Percentage",
    y = "Age group",
    fill = "Gender"
  )
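
The counting and sign-flipping steps above can be checked without Sandbox access. The following is a minimal sketch in base R on made-up toy data (the values in `toy` are invented for illustration and are not FinnGen data). It bins ages into the same 5-year intervals, counts every AGE and SEX combination including empty ones, and converts the counts into signed percentages so that one sex extends to the right of zero and the other to the left.

```r
# Toy data, invented for illustration only (not FinnGen data)
toy <- data.frame(
  SEX = c("female", "female", "female", "female",
          "male", "male", "male", "male"),
  ENTRY_AGE = c(22.2, 28.1, 31.4, 33.0, 18.8, 21.3, 23.9, 29.2)
)

# Bin rounded ages into right-closed 5-year intervals such as (20,25]
toy$AGE <- cut(round(toy$ENTRY_AGE), breaks = seq(from = 0, to = 110, by = 5))
toy$AGE <- droplevels(toy$AGE)  # keep only the age groups that occur

# Count every AGE x SEX combination; table() keeps empty cells as 0,
# which plays the role of complete(AGE, SEX, fill = list(N = 0))
counts <- as.data.frame(table(AGE = toy$AGE, SEX = toy$SEX),
                        responseName = "N")

# Percentage within each sex, then flip the sign for males
counts$ALL <- ave(counts$N, counts$SEX, FUN = sum)
counts$PR <- round(100 * counts$N / counts$ALL, 3)
counts$PR <- ifelse(counts$SEX == "female", counts$PR, -counts$PR)

print(counts)
```

Within each sex the absolute percentages add up to 100, so the two halves of the plot are comparable even when the group sizes differ. Note that `cut()` with its default settings uses right-closed intervals, so an age of exactly 0 would fall outside the first bin and become `NA`.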

Location of the script in the Sandbox

Below is the path to the R Markdown file in Sandbox. The file can be converted to an HTML page with the "knit" command. R Markdown is a plain-text format, so you can also open the file and copy the code snippets you need.

/finngen/library-green/scripts/code_snippets/BigQuery_Templates_release.Rmd
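
As an alternative to the knit button, the file can presumably also be rendered from the R console. The sketch below is an illustration under assumptions: the output directory `out_dir` is a hypothetical writable location, and the file is copied there first because the library folder may not be writable. Knitting executes the code chunks in the file, so this is meant to be run inside Sandbox.

```r
# Path taken from this page
rmd_path <- "/finngen/library-green/scripts/code_snippets/BigQuery_Templates_release.Rmd"

# Hypothetical writable output location (an assumption, adjust as needed)
out_dir <- file.path(path.expand("~"), "knitted")

if (file.exists(rmd_path) && requireNamespace("rmarkdown", quietly = TRUE)) {
  dir.create(out_dir, showWarnings = FALSE, recursive = TRUE)

  # Work on a local copy so that intermediate files are written to out_dir
  local_copy <- file.path(out_dir, basename(rmd_path))
  file.copy(rmd_path, local_copy, overwrite = TRUE)

  # render() returns the path of the generated HTML file
  html_file <- rmarkdown::render(local_copy, output_format = "html_document")
  message("HTML written to: ", html_file)
} else {
  message("R Markdown file or the rmarkdown package is not available here.")
}
```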
