Getting Started with R

What R is used for

R is a programming language, and RStudio is a graphical development environment for it; both are commonly used for statistics and data analysis. FinnGen installs the freely available versions of these tools in the Sandbox. FinnGen does not provide help debugging R scripts.

Ways to learn R

Learn and develop R scripts on your own machine, not in the Sandbox. You can download R and RStudio to your personal computer for this purpose. A common workflow is to develop basic scripts against mock (fake) data on your personal computer and then upload the scripts to the Sandbox.
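As a rough illustration of the mock-data workflow described above, the sketch below simulates a small dataset and fits a model to it using only base R. The column names (id, age, sex, status), the sample size, and the choice of a logistic regression are all invented for this example and do not reflect any real FinnGen data structure.

```r
# Build a small mock dataset for developing a script outside the Sandbox.
# All column names and values are invented for illustration only.
set.seed(42)  # make the simulated values reproducible

n <- 100
mock_data <- data.frame(
  id     = sprintf("MOCK%04d", seq_len(n)),                 # fake identifiers
  age    = round(runif(n, min = 18, max = 90)),             # uniform ages
  sex    = sample(c("female", "male"), size = n, replace = TRUE),
  status = rbinom(n, size = 1, prob = 0.2)                  # 1 = case, 0 = control
)

# Develop and debug the analysis step against the mock data.
fit <- glm(status ~ age + sex, data = mock_data, family = binomial())
print(summary(fit))
```

If the data-loading step is kept separate from the analysis step, as here, the uploaded script should only need its loading step changed to point at real data in the Sandbox.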

Free online courses

  • edX course taught by Harvard biostatistics professor Rafael Irizarry (the modules are free; the certificate is optional and paid)

  • DataCamp (the first sessions of “Introduction to R” are free, but more advanced courses require a monthly subscription)

  • Coursera (choose R programming)

Free online R books

R is best used with the addition of packages and libraries to suit your specific needs. The vast majority of these are free and open source. For a basic group of packages that cover file, string, and data manipulation, along with improved data visualization, many people use the Tidyverse.

Many basic libraries and packages are already installed in the Sandbox, and you can see the full list within RStudio. To request additional packages, send an email to humgen-servicedesk@helsinki.fi.
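Besides browsing the package list in RStudio, the installed packages can also be checked from the R console. The following is a minimal sketch using base R only; the package names in `wanted` are examples chosen for illustration, not a statement of what the Sandbox provides.

```r
# List installed packages and their versions (base R, no extra packages needed).
pkgs <- as.data.frame(
  installed.packages()[, c("Package", "Version"), drop = FALSE],
  stringsAsFactors = FALSE
)
print(head(pkgs))

# Check whether specific packages are installed before a script relies on them.
# The names below are examples only.
wanted <- c("stats", "tidyverse", "data.table")
available <- setNames(wanted %in% pkgs$Package, wanted)
print(available)

# Report anything that would need to be requested.
missing_pkgs <- wanted[!available]
if (length(missing_pkgs) > 0) {
  message("Not installed: ", paste(missing_pkgs, collapse = ", "))
}
```

Running a check like this at the top of a script makes a missing package visible immediately, rather than partway through an analysis.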

For those with a University of Helsinki affiliation, these classes are also available:

  • Basics of Statistics and R I and II, MAT21001 and MAT21002 (Online or in-person)

  • Genome-wide Association Studies, LSI34002 (Matti Pirinen). Notes for this class include many examples in R and are freely available here.

Click here to read more about R libraries in Sandbox
