Version 1
Received: 30 January 2020 / Approved: 31 January 2020 / Online: 31 January 2020 (05:15:01 CET)
How to cite:
Spjuth, O.; Capuccini, M.; Carone, M.; Larsson, A.; Schaal, W.; Novella, J.A.; Stein, O.; Ekmefjord, M.; Di Tommaso, P.; Floden, E.; Notredame, C.; Moreno, P.; Emami Khoonsari, P.; Herman, S.; Kultima, K.; Lampa, S. Approaches for Containerized Scientific Workflows in Cloud Environments with Applications in Life Science. Preprints 2020, 2020010378 (doi: 10.20944/preprints202001.0378.v1).
Abstract
Containers are gaining popularity in life science research because they bundle the dependencies of provisioned tools, simplify software installation for end users, and offer a form of isolation between processes. Scientific workflows are well suited to chaining containers into data analysis pipelines, which aids in creating reproducible analyses. In this manuscript we review a number of approaches to using containers as implemented in the workflow tools Nextflow, Galaxy, Pachyderm, Argo, Kubeflow, Luigi and SciPipe when deployed in cloud environments. A particular focus is placed on the workflow tools’ interaction with the Kubernetes container orchestration framework.
Keywords:
workflows; containers; cloud computing; Kubernetes; big data; reproducibility
Copyright:
This is an open access article distributed under the Creative Commons Attribution License which permits unrestricted use, distribution, and reproduction in any medium, provided the original work is properly cited.