How to download results from your IVM
After running an analysis you may want to download the results so that you can use them in presentations and publications.
Note. Do not take screenshots containing information from Red Data or its derivatives! All graphs or figures you want to download must also go through the official download process. Also, data pertaining to fewer than 5 individuals can never be downloaded from the Sandbox.
Share individual-level data
If you need to share individual-level data with another Sandbox user within the Sandbox environment, please view the section on sharing individual-level data.
What to consider when placing a download request
After you place a download request, a person on our data security team checks each file.
Please give them as much information as possible: this makes their job easier and your download faster. Anything that is unclear has to be clarified by email before your results can be downloaded.
If you are downloading code, please make sure any example individual IDs have been removed. Your code should be purely commands and should not contain any table views of the data or any other calculated estimates from the data, e.g. in a comment.
Download requests are handled by staff, and only between 8am and 4pm Helsinki time on business days (check the current time in Helsinki if you are in another time zone). Note that the results from a GWAS run through the Custom GWAS tool are automatically moved to green access, so this can be a faster way to get downloadable GWAS results.
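To check whether a request placed right now would fall inside the handling window, something like the following sketch may help. It assumes a date command with time zone data for Europe/Helsinki, treats Monday to Friday as business days, and does not know about Finnish public holidays. The function name in_service_hours is hypothetical.

```shell
# in_service_hours HOUR DOW: succeed if HOUR (0-23) on weekday DOW
# (1 = Monday ... 7 = Sunday) falls inside 8am-4pm, Monday to Friday.
# Assumption: public holidays are NOT taken into account.
in_service_hours() {
  # Normalise two-digit hours such as "08" to "8".
  case "$1" in
    0?) hour="${1#0}" ;;
    *)  hour="$1" ;;
  esac
  dow="$2"
  [ "$dow" -le 5 ] && [ "$hour" -ge 8 ] && [ "$hour" -lt 16 ]
}

# Current hour and weekday in Helsinki (needs the system time zone database).
now_hour=$(TZ=Europe/Helsinki date +%H)
now_dow=$(TZ=Europe/Helsinki date +%u)

if in_service_hours "$now_hour" "$now_dow"; then
  echo "Within service hours in Helsinki"
else
  echo "Outside service hours in Helsinki"
fi
```

Requests placed outside this window are not lost; they simply wait until staff are available again.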
Steps to make a request
Step 1:
First, you must copy the file(s) you wish to download to the "red" bucket of your organization. To keep your organization's data organized and easy to find, we recommend that users copy files into their own directory in the "red" bucket.
Please follow these steps to copy files to the "red" bucket using the gsutil command:
Open the "Applications" menu
Click the "Terminal Emulator"
Use the following command to copy <your file(s)> into the "red" bucket in <your folder> directory:
gsutil cp <your file(s)> gs://fg-production-sandbox-<sandbox number>-red/<your folder>/.
You can get the <sandbox number> from your sandbox URL. For example, if the sandbox URL is https://sandbox.finngen.fi/fg-production-sandbox-6/vm then the <sandbox number> is 6, and the "red" bucket is gs://fg-production-sandbox-6-red. Please remember to include the trailing '/' to create the <your folder> directory. If the folder(s) in the path do not exist, the gsutil cp command will create them.
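As a worked example of the naming rule above, the sketch below derives the "red" bucket destination from a sandbox URL. The function name red_destination, the folder my_folder and the file results.txt are hypothetical placeholders; only the bucket naming pattern comes from this page.

```shell
# red_destination URL FOLDER: print the "red" bucket destination for FOLDER.
# Hypothetical helper; the bucket pattern follows the rule described above.
red_destination() {
  url="$1"
  folder="$2"
  # Extract the digits that follow "fg-production-sandbox-" in the URL.
  number=$(printf '%s\n' "$url" |
    sed -n 's#.*/fg-production-sandbox-\([0-9][0-9]*\)/.*#\1#p')
  # Fail if the URL does not contain a sandbox number.
  [ -n "$number" ] || return 1
  # Keep the trailing "/" so the destination is treated as a directory.
  printf 'gs://fg-production-sandbox-%s-red/%s/\n' "$number" "$folder"
}

dest=$(red_destination "https://sandbox.finngen.fi/fg-production-sandbox-6/vm" "my_folder")
echo "$dest"

# Inside the Sandbox, the copy itself would then be (placeholder file name):
# gsutil cp results.txt "$dest"
```

The gsutil line is left commented out so that the sketch can be tried outside the Sandbox without touching any bucket.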
Please follow these steps to copy files to the "red" bucket using the File Manager:
Open the "Applications" menu
Click the "File Manager"
Right-click a file and select "Upload to red bucket" from the pop-up menu. Please note that when you use the File Manager, files can only be copied to the root directory of the "red" bucket.
Step 2:
Navigate to the /finngen/red directory in the File Manager by first selecting "File System" from the left, then the "finngen", and finally the "red" directory. You can also type "/finngen/red" in the File Manager navigation bar at the top. The /finngen/red directory is a read-only copy of the "red" bucket.
Step 3:
Right-click the file or directory you wish to download and select the "Request file download" option. Please follow the instructions in the form that is opened.
Note: The form will ask for an analysis proposal index. This is a mandatory field. Sandbox users are expected to be familiar with the research plan associated with analyses made in the Sandbox. These research plans should follow the FinnGen Scientific plan. You can check the exact title of your analysis proposal from the FinnGen SharePoint here. You can read more about the FinnGen analysis proposals and when you need to submit one here.
Step 4:
Once your download request has been handled, you will receive an email notification with the decision from the user admin. If your file was approved, it will be placed in your organization's "greendownloads" bucket, and the email will contain instructions on how to download it from there. If your file was rejected, the user admin will give you the reason and possibly a suggestion for how to improve your file. The email will also contain further instructions on the export rules.
Once your file is hosted in the cloud
There are two ways to download it to your local system:
1. Command line based access with gsutil
Please see instructions at Accessing green data with google cloud SDK. The direct path to the file, provided in the email, is all you need.
2. Web browser based access
Please see instructions at Accessing green data using google cloud console. You will be able to form a URL from the file path provided in the email.
Example: If you are working in FIMM Sandbox no. 6 and your file is called test.txt, the path of the released file that we provide to you will look something like this:
gs://fg-production-sandbox-6_greendownloads/<download-request-id>/test.txt
The path consists of your organization's Sandbox greendownloads bucket (fg-production-sandbox-6_greendownloads for FIMM analysts), the download request id (provided in the notification email), and the actual file name (test.txt).
You can form a web URL from this path as follows: https://console.cloud.google.com/storage/browser/fg-production-sandbox-6_greendownloads/<download-request-id>/
When you go to this URL you can see the file that was released to you. You can download the file by clicking on the three dots on the right-hand-side of the file. We recommend using Google Chrome for web browser access to released files.
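The conversion from the released file path to a browser URL can be sketched as a small shell function. The name gs_to_console_url is hypothetical, and the sketch assumes the path has the bucket/request-id/file-name layout described above.

```shell
# gs_to_console_url GS_PATH: print the Cloud Console URL of the folder
# that contains the released file. Hypothetical helper for illustration.
gs_to_console_url() {
  path="${1#gs://}"   # strip the gs:// scheme
  dir="${path%/*}"    # drop the file name, keep bucket and request id
  printf 'https://console.cloud.google.com/storage/browser/%s/\n' "$dir"
}

# The request id is left as the placeholder from the notification email.
gs_to_console_url "gs://fg-production-sandbox-6_greendownloads/<download-request-id>/test.txt"
```

With the real request id from your email substituted, the printed URL can be pasted straight into the browser.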
Note that downloading data from the Sandbox green bucket (for instance gs://fg-production-sandbox-6_green) has been deprecated since Sandbox version v7.1.2. If you need to access old download request files in the Sandbox green bucket, please contact the service desk (finngen-servicedesk@helsinki.fi).
How to do multi-file downloads in Sandbox
1. Multi-file downloads from files located in the IVM
If you wish to download multiple files at once from your /home/ivm/, please pack them into a zip or tar file and request a download of the entire package. Please note that the tar file should not contain additional packages inside.
To make a .tar.gz file via the command line:
tar czf all_my_files.tar.gz /path/to/all_my_files/*
Please make sure you add a description of each file and that you have clearly stated the smallest N used for each file in the package. You can either write this into the Description field in the download request window or add a readme explaining these details into the package.
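A package with a readme can be put together along the following lines. This is only a sketch: the demo directory, the file summary.tsv and the readme wording are placeholders, and "additional packages" is read here as nested archive files, which is an assumption.

```shell
# Demo directory so the sketch runs anywhere; use your own files instead.
workdir=$(mktemp -d)
mkdir "$workdir/all_my_files"
echo "placeholder summary table" > "$workdir/all_my_files/summary.tsv"

# Readme describing each file and its smallest N (placeholder text).
cat > "$workdir/all_my_files/README.txt" <<'EOF'
summary.tsv: placeholder description; smallest N = <fill in>
EOF

# Pack relative to the parent directory so the archive holds relative paths.
tar czf "$workdir/all_my_files.tar.gz" -C "$workdir" all_my_files

# List the contents and flag nested archives before requesting the download.
# Assumption: "additional packages" means .tar, .tar.gz, .tgz or .zip files.
nested=$(tar tzf "$workdir/all_my_files.tar.gz" |
  grep -E '\.(tar|tar\.gz|tgz|zip)$' || true)
if [ -n "$nested" ]; then
  echo "Nested archives found:"
  echo "$nested"
else
  echo "No nested archives found"
fi
```

Listing the archive with tar tzf before placing the request is also a quick way to confirm that the readme made it into the package.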
2. Multi-file downloads from a folder in the Red bucket
If you wish to download multiple files from your red bucket folder, please pack them into a zip or tar file and request a download of the entire package.
To tar a folder that you already have in the Red bucket, run the commands in the steps below from the terminal in your IVM. You can then place a download request for the tar file from the red bucket.
Step 1:
To tar a folder that is in the /finngen/red/ please run the following command on your terminal:
Substitute <your_folder.tar> and <your_folder> with your folder name.
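A minimal sketch of such a tar command, assuming standard tar options, is shown below. The function name tar_red_folder and the demo directory are hypothetical. Because /finngen/red is a read-only copy of the "red" bucket, the archive is written to the current working directory rather than next to the folder.

```shell
# tar_red_folder FOLDER [ROOT]: pack ROOT/FOLDER into ./FOLDER.tar.
# ROOT defaults to /finngen/red; hypothetical helper for illustration.
tar_red_folder() {
  folder="$1"
  root="${2:-/finngen/red}"
  # -C makes the paths inside the archive relative to ROOT.
  tar -cf "${folder}.tar" -C "$root" "$folder"
}

# Demo with a temporary stand-in for /finngen/red so the sketch runs anywhere.
demo=$(mktemp -d)
mkdir -p "$demo/red/my_folder"
echo "placeholder" > "$demo/red/my_folder/result.txt"
(cd "$demo" && tar_red_folder my_folder "$demo/red")
tar -tf "$demo/my_folder.tar"

# Inside the Sandbox, from a writable directory such as your home directory:
# tar_red_folder <your_folder>
```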
Step 2:
Then copy back the tar file to the red bucket:
Substitute <your_folder.tar> and <your_folder> with your folder name. If the folder(s) in the path do not exist, the gsutil cp command will create them.
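The copy back to the red bucket can be sketched as follows, assuming the same gsutil cp form and bucket naming used earlier on this page. The sandbox number 6 and the folder name my_folder are placeholders. The command is only printed here, so that it can be reviewed first.

```shell
# Placeholder values; replace them with your own sandbox number and folder.
sandbox_number=6
folder=my_folder

# Destination inside the "red" bucket; the trailing "/" marks a directory.
dest="gs://fg-production-sandbox-${sandbox_number}-red/${folder}/"

# Print the command for review. Remove "echo" to run it inside the Sandbox.
echo gsutil cp "${folder}.tar" "$dest"
```

Once the copy has finished, the tar file should also appear under /finngen/red, which mirrors the red bucket.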
You will then see <your_folder.tar> in your folder. Right-click it and select "Request file download".
Please make sure you add a description of each file and that you have clearly stated the smallest N used for each file in the package. You can either write this into the Description field in the download request window or add a readme explaining these details into the package.
3. Multi-file download using modifiable workflow in Pipelines tool
You can use the modifiable workflow called "Tar input files" in the Pipelines tool to make a tar of all the files you wish to download. When the tar is ready, you can place a download request directly from /finngen/pipeline/<path-to-your-workflow-output>/<your.tar.gz>
For more information, please see: How to use the Pipelines tool