Submitted:
14 August 2024
Posted:
15 August 2024
Abstract
Keywords:
1. Statement of Need
2. The Scripts and How to Use Them
2.1. Managing Instances for Workshops
- instancesNamesFile.txt — contains the names of the instances to be created and managed. Only the name of this file can be changed if preferred. This file must contain only one instance name per line, and each instance name must start with an alphabetic character followed by alpha-numeric characters, hyphens (-) or underscores (_) only.
- resourcesIDs.txt — contains a set of space-separated "key value" pairs that specify the AWS resources to use in creating each instance and related resources. These are the contents of the resourcesIDs.txt file we use for the Cloud-SPAN Genomics course [18]:
| KEYWORD | VALUE examples (Cloud-SPAN’s values for the Genomics course, using instance domain names) | COMMENT |
| ## NB: "key value" pairs can be specified in any order | | |
| imageId | ami-07172f26233528178 | ## NOT optional: instance template (AMI) id |
| instanceType | t3.small | ## NOT optional: processor count, memory size, bandwidth |
| securityGroupId | sg-0771b67fde13b3899 | ## NOT optional: should allow ssh (port 22) communication |
| subnetId | subnet-00ff8cd3b7407dc83 | ## optional: search vpc in AWS console then click subnets |
| hostZone | cloud-span.aws.york.ac.uk | ## optional: specify to use instance domain names |
| hostZoneId | Z012538133YPRCJ0WP3UZ | ## optional: specify to use instance domain names |
- As this example shows, a resourcesIDs.txt file can contain comments in addition to the "key value" pairs. The pairs can be specified in any order, but each key word must be the first item in a line and its value the second item in the same line. The key words in the example must be used, but they are not case-sensitive. The three non-optional "key value" pairs must be specified.
- All values are validated. The value of imageId must correspond to an AMI in your AWS account or to a public AMI available in the AWS region in which you are running the scripts. The value of instanceType must be a valid AWS instance type. The values of securityGroupId, subnetId, hostZone and hostZoneId must exist in your AWS account.
- The key word subnetId and its value are optional. If subnetId is not specified, the scripts will try to obtain a subnet ID from your AWS account. We have successfully tested this with a personal AWS account and with an institutional AWS account (see details in Section 3.5, Validating the Target Workshop Environment).
- The key words hostZone and hostZoneId and their values are optional. If specified and valid, each instance will be accessed using a domain name which will look like this: instance01.cloud-span.aws.york.ac.uk, where instance01 is just an example of a specified instance name and cloud-span.aws.york.ac.uk is the base domain name (hostZone) in the example. If hostZone and hostZoneId and their values are not specified, each instance will be accessed using the public IP address or the generic domain name allocated by AWS which will look like this: 34.245.22.106 or ec2-34-245-22-106.eu-west-1.compute.amazonaws.com.
- tags.txt — contains a set of space-separated "key value" pairs to tag instances and related resources upon creation. This file is optional. If specified, it must contain only one "key value" pair per line. Up to 10 "key value" pairs are processed. Examples:
| group | BIOL |
| project | cloud-span |
| status | prod |
| pushed_by | manual |
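The file rules described in this list can be checked with a few lines of shell. The following is a minimal sketch and not the scripts' own validation code: the function names `valid_instance_name` and `get_resource_value` are hypothetical, and the sketch assumes that comments either start a line with `#` or follow the value on the same line, as in the example file.

```shell
# Hypothetical helpers; they illustrate the documented file rules only.

# Succeeds if the name starts with a letter and continues with
# letters, digits, hyphens or underscores only.
valid_instance_name() {
  printf '%s\n' "$1" | grep -Eq '^[A-Za-z][A-Za-z0-9_-]*$'
}

# Prints the value (second item) of the first line whose first item
# matches the key, ignoring case; lines starting with '#' are skipped.
# Fails if the key is not present.
# Usage: get_resource_value KEY FILE
get_resource_value() {
  awk -v key="$1" '
    $1 ~ /^#/ { next }
    tolower($1) == tolower(key) { print $2; found = 1; exit }
    END { exit !found }
  ' "$2"
}
```

For example, `get_resource_value imageid resourcesIDs.txt` would print the AMI id from the example file, and a missing optional key such as subnetId makes the function return a non-zero status that the caller can test.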
2.2. Running the Scripts
- (1) validating the contents of the scripts configuration files (instancesNames.txt passed as parameter, tags.txt if found, and resourcesIDs.txt) as described above and shown in Figure 1. If no problem is found in these files, the option to continue with the configuration detected, regarding managing or not managing domain names to access instances, is displayed for the user to confirm or cancel the run. If there is a problem with the files, messages (not shown) are displayed that point out the specific problem/s in each file and the run is aborted.
  In Figure 1, the configuration detected corresponds to managing domain names to access instances, that is, hostZone and hostZoneId and valid values were specified and found in the resourcesIDs.txt file and were validated.
  If hostZone and hostZoneId are not specified, the option to continue looks like this:
  | --> NO base domain name was FOUND. |
  | Each instance to be created will be accessed with the IP address or the generic domain name provided |
  | by AWS, which look like this: 34.245.22.106 or ec2-34-245-22-106.eu-west-1.compute.amazonaws.com. |
  | Would you like to continue (y/n)?: |
- (2) creating login keys.
- (3) creating instances, each configured to use one of the login keys created in the previous step.
- (4) creating instance domain names, each mapped to the respective instance IP address (AWS randomly allocates IP addresses to instances when instances are launched for the first time or started after having been stopped) — THIS STEP is only run if hostZone and hostZoneId and valid values were specified in the resourcesIDs.txt file.
- (5) configuring each instance both to enable the csuser account (used by workshop participants) to be logged in and to change the instance host name to the instance name (the default host name is the instance IP address) regardless of whether domain names are to be managed or not. This step is not shown in the figure.
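Step (4) maps each instance name to an IP address under the base domain name. A minimal sketch of what such a mapping can look like follows, assuming the base domain is hosted in Amazon Route 53 (the hostZoneId value suggests this, but the text here does not state it). The function name `dns_change_batch`, the UPSERT action and the TTL of 300 seconds are illustrative assumptions, not taken from the scripts.

```shell
# Hypothetical helper: prints a Route 53 change batch that maps
# <instance>.<hostZone> to an IPv4 address with an A record.
# Usage: dns_change_batch INSTANCE_NAME HOST_ZONE IP_ADDRESS
dns_change_batch() {
  cat <<EOF
{
  "Changes": [
    {
      "Action": "UPSERT",
      "ResourceRecordSet": {
        "Name": "$1.$2",
        "Type": "A",
        "TTL": 300,
        "ResourceRecords": [ { "Value": "$3" } ]
      }
    }
  ]
}
EOF
}

# Example (prints JSON only; nothing is sent to AWS):
dns_change_batch instance01 cloud-span.aws.york.ac.uk 34.245.22.106

# The JSON could then be saved to a file and applied with (not run here):
#   aws route53 change-resource-record-sets \
#     --hosted-zone-id "$hostZoneId" --change-batch file://batch.json
```

Because AWS allocates a new IP address when a stopped instance is started, a record of this kind would need to be rewritten after each start, which is why the sketch uses UPSERT rather than CREATE.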
2.3. Using Instances and Customising AMIs

2.3.1. Login to Instances When Domain Names Are NOT Managed
| ssh -i login-key-instance01.pem csuser@34.245.22.106 | #### where 34.245.22.106 is just an example IP address |
| ssh -i login-key-instance01.pem ubuntu@34.245.22.106 | #### the IP address of each instance will vary |
| ... |
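The two ways of addressing an instance, by IP address or by instance domain name, differ only in the host part of the ssh command. The sketch below builds the command for either case. The function name `ssh_command` is hypothetical, and the key file naming follows the login-key-instance01.pem example above.

```shell
# Hypothetical helper: prints the ssh command for one instance.
# Usage: ssh_command INSTANCE USER IP_ADDRESS [HOST_ZONE]
# With HOST_ZONE the instance domain name is used; without it, the IP address.
ssh_command() {
  local instance="$1" user="$2" ip="$3" zone="${4:-}" host
  if [ -n "$zone" ]; then
    host="$instance.$zone"
  else
    host="$ip"
  fi
  printf 'ssh -i login-key-%s.pem %s@%s\n' "$instance" "$user" "$host"
}

ssh_command instance01 csuser 34.245.22.106
# prints: ssh -i login-key-instance01.pem csuser@34.245.22.106
ssh_command instance01 csuser 34.245.22.106 cloud-span.aws.york.ac.uk
# prints: ssh -i login-key-instance01.pem csuser@instance01.cloud-span.aws.york.ac.uk
```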
2.4. Customising the Login Account of Workshop Participants
- create an instance using the Cloud-SPAN (CS) Genomics AMI and log in as the ubuntu user.
- create and initialise your new user account (as described in the tutorial)
- edit the script /home/ubuntu/bin/usersAuthorizedKeys-activate.sh to replace the string csuser with the name of your new account — this script copies the file /home/ubuntu/.ssh/authorized_keys to the csuser account.
- delete or edit the file /etc/motd (message of the day) — it contains a screen welcome message that includes the Cloud-SPAN name.
- create a new AMI from the updated instance.
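The script edit in the third step is a plain text substitution and can be done from the command line. The following is a sketch under stated assumptions: the function name `rename_login_account` is hypothetical, and the new account name is assumed to contain only letters, digits, hyphens or underscores, since other characters could be misread by sed.

```shell
# Hypothetical helper: replaces the account name csuser in a script file.
# Usage: rename_login_account NEW_USER SCRIPT_FILE
rename_login_account() {
  local new_user="$1" script="$2" tmp status
  tmp=$(mktemp) || return 1
  # Write to a temporary file first so a failed sed leaves the script intact;
  # cat back into place to keep the original file's permissions and owner.
  sed "s/csuser/$new_user/g" "$script" > "$tmp" && cat "$tmp" > "$script"
  status=$?
  rm -f "$tmp"
  return "$status"
}
```

On the instance, the call would look like `rename_login_account alice /home/ubuntu/bin/usersAuthorizedKeys-activate.sh`, where alice stands for the new account name. It is worth reviewing the edited script afterwards, because every occurrence of the string csuser is replaced.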
3. The Scripts Design and Implementation
3.1. The Scripts Execution Flow — Overview
| check_theScripts_csconfiguration | "$1" || { message "$error_msg"; exit 1; } |
| aws_loginKeyPair_create.sh | "$1" || { message "$error_msg"; exit 1; } |
| aws_instances_launch.sh | "$1" || { message "$error_msg"; exit 1; } |
| if [ -f "${1%/*}/.csconfig_DOMAIN_NAMES.txt" ]; then ### %/* gets the inputs directory path | |
| aws_domainNames_create.sh | "$1" || { message "$error_msg"; exit 1; } |
| fi | |
| aws_instances_configure.sh | "$1" || { message "$error_msg"; exit 1; } |
| exit 0 | |
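Two shell idioms carry the flow above: the `${1%/*}` expansion that derives the inputs directory from the path of the instances names file, and the `step || { ...; exit 1; }` pattern that stops the run at the first failing step. The sketch below isolates both. The names `inputs_dir`, `step_ok`, `step_fail` and `run_steps` are hypothetical stand-ins, and the example path assumes an inputs directory next to the outputs directory.

```shell
# ${1%/*} removes the shortest trailing match of "/*", that is, the file
# name, leaving the directory that holds the instances names file.
# A path without any "/" is returned unchanged.
inputs_dir() {
  printf '%s\n' "${1%/*}"
}

# Stub steps standing in for the real scripts.
step_ok()   { echo "step ran"; }
step_fail() { echo "step failed" >&2; return 1; }

# Runs each step in order and stops at the first failure, as the
# `step || { message; exit 1; }` lines do; `return` replaces `exit`
# here so the sketch can be called from an interactive shell.
run_steps() {
  local step
  for step in "$@"; do
    "$step" || { echo "aborting at $step" >&2; return 1; }
  done
}

inputs_dir courses/instances-management/inputs/instancesNames.txt
# prints: courses/instances-management/inputs
```

Calling `run_steps step_ok step_fail step_ok` runs the first step, reports the failure of the second and never reaches the third.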
3.2. Creating and Deleting Instances and Related Resources
| aws ec2 create-key-pair --key-name $loginkey --key-type rsa ... | ### invoked by aws_loginKeyPair_create.sh |
| aws ec2 run-instances --image-id $resource_image_id ... | ### invoked by aws_instances_launch.sh |
| aws ec2 delete-key-pair --key-name $loginkey ... | ### invoked by aws_loginKeyPair_delete.sh |
| aws ec2 terminate-instances --instance-ids $instanceID ... | ### invoked by aws_instances_terminate.sh |
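The options elided in the commands above are not reproduced here. As one illustration of how a configuration file can feed such a command, the sketch below turns "key value" lines like those in tags.txt into the shorthand that the AWS CLI option `--tag-specifications` accepts. Whether the scripts build their tags this way is an assumption, and the function name `tag_specification` is hypothetical.

```shell
# Hypothetical helper: converts "key value" lines into the shorthand
# accepted by --tag-specifications, using at most the first 10 pairs.
# Usage: tag_specification RESOURCE_TYPE TAGS_FILE
tag_specification() {
  awk -v type="$1" '
    NF >= 2 && n < 10 {
      tags = tags (n++ ? "," : "") "{Key=" $1 ",Value=" $2 "}"
    }
    END { printf "ResourceType=%s,Tags=[%s]\n", type, tags }
  ' "$2"
}
```

With the four example pairs from tags.txt, `tag_specification instance tags.txt` prints `ResourceType=instance,Tags=[{Key=group,Value=BIOL},{Key=project,Value=cloud-span},{Key=status,Value=prod},{Key=pushed_by,Value=manual}]`, which could be passed to `aws ec2 run-instances` as the value of `--tag-specifications`.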
3.3. Scripts Communication
3.4. Configuring, Stopping and Starting Instances
| csuser@csadmin-instance:~ | ||
| $ ls courses/instances-management/outputs/instances-creation-output/ | ||
| instance01-ip-address.txt | instance02-ip-address.txt | instance03-ip-address.txt |
| instance01.txt | instance02.txt | instance03.txt |
| csuser@csadmin-instance:~ | |
| $ lginstance.sh courses/instances-management/outputs/login-keys/login-key-instance01.pem csuser | |
| logging you thus: | ### this and the next line are displayed by lginstance.sh |
| ssh -i courses/instances-management/outputs/login-keys/login-key-instance01.pem csuser@3.253.59.74 | |
| ... | ### instance welcome message |
| csuser@instance01:~ | ### instance prompt |
| $ | |
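The session above suggests how a login helper can find an instance address from the key file path alone. The following sketch reproduces that behaviour under explicit assumptions: the function name `login_command` is hypothetical, the instance name is taken from the key file name, and each instanceNN-ip-address.txt file is assumed to hold just the IP address on its first line. It prints the command rather than running it, and it is not the code of lginstance.sh.

```shell
# Hypothetical sketch: prints the ssh command for a login key file.
# Usage: login_command KEY_FILE USER
login_command() {
  local key_file="$1" user="$2" name outputs_dir ip_file ip
  name=${key_file##*/}            # login-key-instance01.pem
  name=${name#login-key-}         # instance01.pem
  name=${name%.pem}               # instance01
  outputs_dir=${key_file%/*/*}    # drop /login-keys/<file>
  ip_file="$outputs_dir/instances-creation-output/$name-ip-address.txt"
  [ -r "$ip_file" ] || { echo "no IP address file: $ip_file" >&2; return 1; }
  # Assumption: the file holds just the address on its first line.
  read -r ip < "$ip_file"
  echo "logging you thus:"
  echo "ssh -i $key_file $user@$ip"
}
```

Deriving everything from the key file path keeps the call short, at the cost of depending on the outputs directory layout shown in the listing.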
3.5. Validating the Target Workshop Environment
3.6. Overview of the Online Tutorial
- how to open an AWS account and how to configure it both for programmatic access with the AWS CLI and with a base domain name from which to create instance (sub)domain names.
- how to configure a terminal environment with the scripts and the AWS CLI on Linux, MacOS, and Windows (Git Bash), or in the AWS CloudShell, a browser-based Linux Bash terminal.
- how to configure and run the scripts to manage instances for a workshop, manage late registrations and cancellations, and some troubleshooting.
- how to create, manage and configure Amazon Machine Images (AMIs) which serve as templates to create AWS instances.
- the organisation and workings of the scripts.
4. Conclusions
5. Availability of Source Code and Requirements
- Project name: Cloud-SPAN (https://cloud-span.york.ac.uk/ — https://github.com/Cloud-SPAN)
- Project home page (the Bash scripts): https://github.com/Cloud-SPAN/aws-instances.
- Operating system(s): Linux, Windows, MacOS.
- Programming language: Bash Shell.
- Other requirements: Bash version 5.0 or higher. Windows users must install Git Bash. MacOS users must install or update Bash. The online tutorial provides instructions for Windows and MacOS users to do so [14]. The AWS CLI must be installed and configured. The online tutorial provides instructions for Linux, Windows and MacOS users to install and configure the AWS CLI [27].
- License: MIT Licence.
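The Bash version requirement can be checked before running the scripts. This is a minimal sketch with a hypothetical function name, `bash_at_least`. It reads the version from the standard `BASH_VERSION` variable, which is unset in shells other than Bash, so the check also fails there.

```shell
# Hypothetical helper: succeeds if a Bash version string such as
# "5.1.16(1)-release" is at least MAJOR.MINOR.
# Usage: bash_at_least VERSION MAJOR MINOR
bash_at_least() {
  local version="$1" want_major="$2" want_minor="$3" major minor rest
  major=${version%%.*}
  case "$version" in
    *.*) rest=${version#*.}; minor=${rest%%[!0-9]*} ;;
    *)   minor=0 ;;
  esac
  [ "$major" -gt "$want_major" ] ||
    { [ "$major" -eq "$want_major" ] && [ "${minor:-0}" -ge "$want_minor" ]; }
}

if bash_at_least "${BASH_VERSION:-0.0}" 5 0; then
  echo "Bash $BASH_VERSION meets the requirement"
else
  echo "Bash 5.0 or higher is required" >&2
fi
```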
Availability of Test Data
Ethical Approval
Competing Interests
Funding
Author’s Contributions
Acknowledgements
Abbreviations
References
- Cummings, M.P.; Temple, G.G. Broader incorporation of bioinformatics in education: opportunities and challenges. Briefings in Bioinformatics 2010.
- Mulder, N.; others. The development and application of bioinformatics core competencies to improve bioinformatics training and education. PLOS Computational Biology 2018.
- Data Carpentry (2014). https://datacarpentry.org/. Accessed: 2024-07-24.
- Data Carpentry. Genomics Workshop Setup: Using the lessons with Amazon Web Services (AWS) (2023). https://datacarpentry.org/genomics-workshop/index.html#option-a-recommended-using-the-lessons-with-amazon-web-services-aws. Accessed: 2024-07-24.
- Google. PaaS vs. IaaS vs. SaaS vs. CaaS: How are they different? https://cloud.google.com/learn/paas-vs-iaas-vs-saas. Accessed: 2024-07-24.
- Afgan, E.; Sloggett, C.; Goonasekera, N.; Makunin, I.; Benson, D.; Crowe, M.; Gladman, S.; Kowsar, Y.; Pheasant, M.; Horst, R.; Lonie, A. Genomics Virtual Laboratory: A Practical Bioinformatics Workbench for the Cloud. PLOS ONE 2015, 10, 1–20.
- Connor, T.R.; Loman, N.J.; Thompson, S.; Smith, A.; Southgate, J.; Poplawski, R.; Bull, M.J.; Richardson, E.; Ismail, M.; Thompson, S.E.; Kitchen, C.; Guest, M.; Bakke, M.; Sheppard, S.K.; Pallen, M.J. CLIMB (the Cloud Infrastructure for Microbial Bioinformatics): an online resource for the medical microbiology community. Microbial Genomics 2016, 2.
- OpenInfra Foundation. Open Stack: The Most Widely Deployed Open Source Cloud Software in the World. https://www.openstack.org/. Accessed: 2024-07-24.
- Engelberger, F.; Galaz-Davison, P.; Bravo, G.; Rivera, M.; Ramírez-Sarmiento, C.A. Developing and Implementing Cloud-Based Tutorials That Combine Bioinformatics Software, Interactive Coding, and Visualization Exercises for Distance Learning on Structural Bioinformatics. Journal of Chemical Education 2021, 98, 1801–1807.
- Poolman, T.M.; Townsend-Nicholson, A.; Cain, A. Teaching genomics to life science undergraduates using cloud computing platforms with open datasets. Biochemistry and Molecular Biology Education 2022, 50, 446–449. https://iubmb.onlinelibrary.wiley.com/doi/pdf/10.1002/bmb.21646.
- Google. Colaboratory. https://colab.research.google.com/. Accessed: 2024-07-24.
- Posit Software, PBC (formerly RStudio, PBC). posit cloud. https://posit.cloud/. Accessed: 2024-07-24.
- Cloud-SPAN Project. Automated Management of AWS Instances (2023). https://cloud-span.github.io/cloud-admin-guide-v2q. Accessed: 2023-10-25.
- Cloud-SPAN Project. Automated Management of AWS Instances: Precourse Instructions (2024). https://cloud-span.github.io/cloud-admin-guide-v2q/docs/miscellanea/precourse-instructions.html. Accessed: 2024-06-26.
- Cloud-SPAN Project. Automated Management of AWS Instances: Configure an Instance to Become AMI (2023). https://cloud-span.github.io/cloud-admin-guide-v2q/docs/lesson02-managing-aws-instances/03-ami-management.html#configure-an-instance-to-become-ami. Accessed: 2024-07-24.
- Cloud-SPAN Project. Automated Management of AWS Instances: Troubleshooting (2023). https://cloud-span.github.io/cloud-admin-guide-v2q/docs/lesson02-managing-aws-instances/02-instances-management.html#troubleshooting. Accessed: 2024-07-24.
- Cloud-SPAN Project. Automated Management of AWS Instances: Setting Up Your Cloud and Terminal Environments (2023). https://cloud-span.github.io/cloud-admin-guide-v2q/docs/lesson01-setting-work-envs. Accessed: 2024-07-24.
- Cloud-SPAN Project. Cloud-SPAN Genomics Course. https://cloud-span.github.io/00genomics/. Accessed: 2024-07-24.
- Cloud-SPAN (2021). https://cloud-span.york.ac.uk/. Accessed: 2024-07-24.
- Cloud-SPAN on GitHub (2021). https://github.com/Cloud-SPAN. Accessed: 2024-07-24.
- AWS. Amazon EC2 key pairs and Amazon EC2 (Linux) instances (2023). https://docs.aws.amazon.com/AWSEC2/latest/UserGuide/ec2-key-pairs.html. Accessed: 2024-07-24.
- Wittig, A.; Wittig, M. Amazon Web Services in Action: Third Edition; Manning Publications Co., 2023.
- Winkler, S. Terraform in Action; Manning Publications Co., 2021.
- Cloud-SPAN Project. Automated Management of AWS Instances: Unforseen Instance Management (2024). https://cloud-span.github.io/cloud-admin-guide-v2q/docs/lesson02-managing-aws-instances/02-instances-management.html#unforseen-instance-management. Accessed: 2024-07-24.
- Amazon Web Services. Launch an instance from a launch template. https://docs.aws.amazon.com/AWSEC2/latest/UserGuide/ec2-launch-templates.html. Accessed: 2024-07-24.
- Amazon Web Services. Launch an instance from a launch template: Example with AWS CLI. https://docs.aws.amazon.com/AWSEC2/latest/UserGuide/launch-instances-from-launch-template.html#launch-instance-from-launch-template. Accessed: 2024-07-24.
- Cloud-SPAN Project. Automated Management of AWS Instances: Configure Your Terminal Environment (2024). https://cloud-span.github.io/cloud-admin-guide-v2q/docs/lesson01-setting-work-envs/03-configure-terminal.html. Accessed: 2024-06-26.
| 1 | University of York - Information Technology (IT) Services - Research IT. |
| 2 | University of York - IT Services. |
| 3 | University of York - Research Software Engineering Team. |
| 4 | University of York - School of Physics, Engineering and Technology. |

Disclaimer/Publisher’s Note: The statements, opinions and data contained in all publications are solely those of the individual author(s) and contributor(s) and not of MDPI and/or the editor(s). MDPI and/or the editor(s) disclaim responsibility for any injury to people or property resulting from any ideas, methods, instructions or products referred to in the content.
© 2024 by the authors. Licensee MDPI, Basel, Switzerland. This article is an open access article distributed under the terms and conditions of the Creative Commons Attribution (CC BY) license (http://creativecommons.org/licenses/by/4.0/).