Can I select only the columns needed for my analysis to import into RStudio?

FinnGen files are large and take up a lot of memory, so it is advisable to check what you are importing beforehand; doing so will save you time and hassle. You can select only the columns you need for your analysis (e.g. in the Sandbox Terminal Emulator) and save them in your working directory using a command like:

zcat /path/yourFileName.txt.gz | cut -f 1,3,10-50,80 | gzip > selectedColumns.txt.gz

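This command selects columns 1, 3, 10 through 50, and 80 from "yourFileName.txt.gz" and writes them to a new gzipped file, "selectedColumns.txt.gz", in the current directory. Note that cut splits fields on tabs by default, so the command assumes a tab-delimited file; for another delimiter, pass it with the -d option (e.g. -d ',').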
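
To know which column numbers to pass to cut, it can help to list the header names with their positions first. The following is a minimal sketch, not an official FinnGen recipe: the file demo.txt.gz, its column names (ID, SEX, AGE, BMI), and its values are hypothetical and are created only so the example runs on its own. It assumes a tab-delimited file with a single header line.

```shell
# Create a tiny tab-delimited, gzipped demo file (hypothetical data, for illustration only).
printf 'ID\tSEX\tAGE\tBMI\nA1\tF\t54\t23.1\nA2\tM\t61\t27.8\n' | gzip > demo.txt.gz

# Read only the header line, put each column name on its own line,
# and number the lines so each name appears next to its column index.
zcat demo.txt.gz | head -n 1 | tr '\t' '\n' | nl

# Using the indices printed above, keep only columns 1 (ID) and 3 (AGE).
zcat demo.txt.gz | cut -f 1,3 | gzip > demoSelected.txt.gz

# Inspect the first lines of the result before importing it into RStudio.
zcat demoSelected.txt.gz | head -n 5
```

With a real file, replace demo.txt.gz with the path to your own file and adjust the list given to cut -f according to the numbered header output.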
