Pipelines

A pipeline is a file containing

  • code that conducts multiple analyses,

  • filtering procedures,

  • and/or file conversions,

all designed to run automatically in a logical order (i.e. piped together).

Pipelines are intended for large-scale analyses that need parallelization and custom-sized virtual machines (VMs). The pipeline facility of the FinnGen Sandbox uses Google’s Genomics pipelines mechanism and is therefore based on the Cromwell/WDL system.

Cromwell

Cromwell is a workflow management system (hidden behind the web UI). It manages VMs, keeps a log of tasks, allows the resubmission of partial jobs, etc. The Cromwell configuration limits the number of jobs that can run in parallel:

  • Max concurrent workflows = 20

  • Max workflow launch-count = 10

Sandbox users write their batch jobs in WDL (Workflow Description Language), the job control language that Cromwell executes.
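To give a feel for what a WDL batch job looks like, here is a minimal sketch of a workflow that runs one task per input file in parallel and requests a specific VM size for each task. It is an illustration only, not the contents of the Sandbox demo pipeline: the workflow and task names, the input name, the Docker image and the resource values are all hypothetical, and the Sandbox may require images and runtime settings that differ from the ones assumed here.

```wdl
version 1.0

# Hypothetical example workflow: count the lines of several files in parallel.
workflow count_lines_workflow {
  input {
    Array[File] input_files   # assumed input name, for illustration only
  }

  # scatter starts one independent call of the task per input file;
  # Cromwell may run these calls in parallel, subject to its configured limits.
  scatter (f in input_files) {
    call count_lines { input: infile = f }
  }

  output {
    # Calls made inside a scatter are gathered into an array.
    Array[Int] line_counts = count_lines.n_lines
  }
}

task count_lines {
  input {
    File infile
  }

  # The shell command executed on the VM; ~{infile} is replaced by the
  # localized path of the input file.
  command <<<
    wc -l < ~{infile}
  >>>

  output {
    # Parse the number printed by the command above.
    Int n_lines = read_int(stdout())
  }

  # The runtime block is where a custom-sized VM is requested.
  # The image and the sizes below are placeholders, not Sandbox defaults.
  runtime {
    docker: "ubuntu:20.04"
    cpu: 1
    memory: "2 GB"
    disks: "local-disk 10 HDD"
  }
}
```

In this sketch the scatter block provides the parallelization and the runtime block provides the VM sizing. Cromwell generally takes workflow inputs as a JSON file whose keys are prefixed with the workflow name, for example count_lines_workflow.input_files; check the Sandbox documentation for how inputs are supplied there.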

See the following documentation about pipelines in FinnGen Sandbox:

The status of existing pipelines, and further information about them, is available in Sandbox.

In Sandbox, select Applications -> FinnGen -> Pipelines to see the list of existing pipelines and their status.

Pipelines can also be run from the command line.

To use pipelines from the command line, open Terminal from the Applications menu or from the sidebar at the bottom (red arrows).

You can enter your pipeline code in the terminal. On the Sandbox desktop you will find Demo_pipeline.wdl, which you can use as a base for your own code (green arrow).
