How to run survival analysis using GATE unmodifiable pipeline

Unmodifiable pipelines are predefined workflows that the user cannot modify. The advantage of running an unmodifiable pipeline rather than a modifiable one is that the results go directly to the green library and the User results PheWeb browser. No download requests are needed, because the results of unmodifiable pipelines have been verified not to contain any individual-level data.

Running the GATE unmodifiable pipeline is very similar to running GATE in regular pipelines, with just a few small changes. The unmodifiable GATE pipeline can be accessed in the sandbox from

The workflow for running the pipeline is as follows. First, prepare a phenotype file for your endpoint. The phenotype file should contain the columns FID and IID for FINNGENIDs, as well as the endpoint column and its survival time column. If you use any covariates that are not included in the analysis covariates file, add them to this file as columns too. These new covariates cannot have the same names as the analysis covariates. See How to run GWAS using GATE (survival models) for more information.
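The first step can be sketched in Python as below. This is only an illustration under stated assumptions: the tab delimiter, the survival time column name endpoint1_time, the covariate my_cov and the reserved names analysis_cov1 and analysis_cov2 are all hypothetical placeholders, so check the GATE survival model instructions for the names and format the pipeline actually expects.

```python
import csv
import io


def write_phenotype_file(rows, endpoint, time_col, extra_covariates, reserved, out):
    """Write a phenotype table with FID, IID, endpoint, time and extra covariates.

    `reserved` stands in for the names found in the analysis covariates file;
    new covariates must not reuse any of them.
    """
    clashes = sorted(set(extra_covariates) & set(reserved))
    if clashes:
        raise ValueError(f"covariate names clash with analysis covariates: {clashes}")

    fields = ["FID", "IID", endpoint, time_col, *extra_covariates]
    # Tab-separated output is an assumption, not a documented requirement.
    writer = csv.DictWriter(out, fieldnames=fields, delimiter="\t", lineterminator="\n")
    writer.writeheader()
    for row in rows:
        writer.writerow({name: row[name] for name in fields})
    return fields


# Hypothetical example data: one individual, one endpoint, one extra covariate.
example_rows = [
    {"FID": "FG1", "IID": "FG1", "endpoint1": 1, "endpoint1_time": 3.5, "my_cov": 0},
]
buffer = io.StringIO()
write_phenotype_file(
    example_rows,
    endpoint="endpoint1",
    time_col="endpoint1_time",
    extra_covariates=["my_cov"],
    reserved={"analysis_cov1", "analysis_cov2"},
    out=buffer,
)
print(buffer.getvalue())
```

Checking for name clashes before writing the file catches the problem before the job is submitted.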

Second, create a file with the phenotype description(s), one per row. These are used when the endpoint is uploaded to the User results PheWeb browser. List the descriptions in the same order as the endpoints in your phenotype list.
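Because the descriptions are matched to endpoints by row order, it can help to generate the description rows from the same endpoint list that goes into the pipeline inputs. The following is a minimal sketch of that idea; the function name, endpoint names and description texts are hypothetical.

```python
def description_lines(phenolist, descriptions):
    """Return one description per endpoint, in the order given by `phenolist`.

    `phenolist` is the comma-separated endpoint string and `descriptions`
    maps each endpoint name to its free-text description.
    """
    endpoints = [name.strip() for name in phenolist.split(",") if name.strip()]
    missing = [name for name in endpoints if name not in descriptions]
    if missing:
        raise KeyError(f"no description for endpoints: {missing}")
    return [descriptions[name] for name in endpoints]


# Hypothetical endpoints and descriptions.
rows = description_lines(
    "endpoint1,endpoint2",
    {"endpoint2": "Second example endpoint", "endpoint1": "First example endpoint"},
)
print("\n".join(rows))
```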

Third, fill in the inputs in the unmodifiable pipeline json:

gate.phenolist: A comma-separated list of your endpoints. Example value: "endpoint1,endpoint2"

gate.phenodescriptionlist: The phenotype description file you created in the second step.

gate.pheno_file: The phenotype file you created in the first step.

gate.covariates: The covariates used in the analysis, given as a comma-separated list. By default this field contains the analysis covariates.
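As an illustration, the four inputs described above could be assembled and serialised like this. The file paths are hypothetical placeholders, and the real pipeline json may contain other fields that should be left as they are; the sketch only sets gate.covariates when a list is passed, so the default analysis covariates are otherwise kept.

```python
import json


def build_gate_inputs(endpoints, description_file, pheno_file, covariates=None):
    """Collect the user-supplied inputs for the unmodifiable GATE pipeline."""
    inputs = {
        "gate.phenolist": ",".join(endpoints),
        "gate.phenodescriptionlist": description_file,
        "gate.pheno_file": pheno_file,
    }
    if covariates is not None:
        # Overrides the default; include the analysis covariates you still want.
        inputs["gate.covariates"] = ",".join(covariates)
    return inputs


# Hypothetical paths, for illustration only.
gate_inputs = build_gate_inputs(
    ["endpoint1", "endpoint2"],
    description_file="/path/to/pheno_descriptions.txt",
    pheno_file="/path/to/phenotypes.tsv",
)
print(json.dumps(gate_inputs, indent=2))
```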

The results are automatically copied to the User results PheWeb browser and to the green library bucket specific to each data release: /finngen/library-green/finngen_R[RELEASE]/unmodifiable_pipelines/UnmodifiableGATEDF[RELEASE]/workflow_id. For example, for R12 the results go to /finngen/library-green/finngen_R12/unmodifiable_pipelines/UnmodifiableGATEDF12/workflow_id. The workflow_id is the pipelines app ID of your job.

Alternatively, you can access the results outside the sandbox by navigating in a web browser to https://console.cloud.google.com/storage/browser/finngen-production-library-green/finngen_RX/unmodifiable_pipelines/UnmodifiableGATEDFX, where X is the FinnGen data freeze you ran the GWAS for (e.g. 12). You need to log in with the same Google account that you use to access the FinnGen sandbox. On this page, look for the folder corresponding to your GWAS job ID and open it to access the GWAS output files.
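Both result locations follow a fixed pattern that depends only on the data release and the job ID, so they can be derived with a small helper. The function names below are hypothetical; the path and URL templates are copied from the text above.

```python
def green_library_path(release, workflow_id):
    """Sandbox path of the results for a given release and pipelines app ID."""
    return (
        f"/finngen/library-green/finngen_R{release}/unmodifiable_pipelines/"
        f"UnmodifiableGATEDF{release}/{workflow_id}"
    )


def console_url(release):
    """Cloud console URL listing the result folders for a given release."""
    return (
        "https://console.cloud.google.com/storage/browser/"
        f"finngen-production-library-green/finngen_R{release}/"
        f"unmodifiable_pipelines/UnmodifiableGATEDF{release}"
    )


# "workflow_id" is a placeholder for the pipelines app ID of your job.
print(green_library_path(12, "workflow_id"))
print(console_url(12))
```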
