4. Using Dockerised KnetMiner with Amazon AWS and Elastic Beanstalk
KnetMiner can be deployed on AWS Elastic Beanstalk either from the Beanstalk CLI or from the Beanstalk section of the AWS Console.
To kick-start a KnetMiner deployment from the command line, you'll need to install the AWS Beanstalk CLI:
- Use the setup scripts (provided by AWS) to install the Beanstalk CLI.
- Set up and configure the CLI on your machine. During the configuration step, you can either select an existing AWS Beanstalk application or create a new one.
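Before going further, it can help to confirm that the CLI is actually on your PATH. The sketch below assumes the CLI is invoked as `eb` (the command used later in this guide); `check_eb_cli` is a hypothetical helper name and is not part of the KnetMiner repository.

```shell
# Hypothetical helper: print the EB CLI version, or fail if `eb` is missing.
check_eb_cli() {
  if command -v eb >/dev/null 2>&1; then
    eb --version
  else
    echo "EB CLI not found on PATH; install it before continuing" >&2
    return 1
  fi
}

# Typical use (eb init is interactive and asks for a region and an application):
#   check_eb_cli && eb init
```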
To use a dataset of your choice for your AWS KnetMiner deployment, do the following:
- Upload your dataset directory to an S3 bucket. The dataset directory should follow the convention explained in this section.
- Make the bucket public, or attach an appropriate IAM policy (one that allows listing and reading the S3 bucket) to the IAM role used by Beanstalk.
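If you take the IAM route rather than making the bucket public, the policy only needs list access on the bucket and read access on its objects. The following is a minimal sketch, not the project's own policy: `s3_read_policy` is a hypothetical helper, the bucket, dataset, and policy names are placeholders, and the role name `aws-elasticbeanstalk-ec2-role` assumes your environment uses the default Beanstalk instance role.

```shell
# Hypothetical helper: print a read-only IAM policy for one S3 bucket.
# Argument: <bucket-name> (placeholder, use your own bucket)
s3_read_policy() {
  cat <<EOF
{
  "Version": "2012-10-17",
  "Statement": [
    {
      "Effect": "Allow",
      "Action": ["s3:ListBucket"],
      "Resource": "arn:aws:s3:::$1"
    },
    {
      "Effect": "Allow",
      "Action": ["s3:GetObject"],
      "Resource": "arn:aws:s3:::$1/*"
    }
  ]
}
EOF
}

# Example (needs the AWS CLI and credentials; all names are placeholders):
#   aws s3 cp ./my-dataset "s3://my-knetminer-bucket/my-dataset/" --recursive
#   s3_read_policy my-knetminer-bucket > knetminer-s3-read.json
#   aws iam put-role-policy \
#     --role-name aws-elasticbeanstalk-ec2-role \
#     --policy-name knetminer-s3-read \
#     --policy-document file://knetminer-s3-read.json
```

Listing is granted on the bucket ARN and reading on the object ARN (`/*`), because S3 evaluates the two actions against different resource types.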
Assuming that you've downloaded the KnetMiner GitHub repository, follow the instructions below to customise and deploy your KnetMiner.
First, edit the Beanstalk S3 configuration file so that it points to the S3 bucket holding your dataset. Open the configuration file in an editor of your choice (we use vi here):
cd knetminer/common/quickstart/aws
vi .ebextensions/01_s3.config
Then, change the S3 bucket path from
aws s3 cp s3://knetminer-testing-bucket/arabadopsis/ /home/ec2-user/knetminer-dataset --recursive
to
aws s3 cp s3://<MY-KNETMINER-BUCKET>/<DATASET-FOLDER>/ /home/ec2-user/knetminer-dataset --recursive
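The same edit can be scripted instead of made by hand. This is a sketch under the assumption that the file still contains the default path quoted above (spelled exactly as shown, including `arabadopsis`); `set_dataset_source` is a hypothetical helper name.

```shell
# Hypothetical helper: point the S3 copy command at another bucket and folder.
# Arguments: <config-file> <bucket> <dataset-folder>
# The original file is kept alongside as <config-file>.bak.
set_dataset_source() {
  sed -i.bak \
    "s#s3://knetminer-testing-bucket/arabadopsis/#s3://$2/$3/#" \
    "$1"
}

# Example (bucket and folder names are placeholders):
#   set_dataset_source .ebextensions/01_s3.config my-knetminer-bucket my-dataset
```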
Depending on the dataset size, you'll need to pick an appropriate AWS instance type for Beanstalk to deploy KnetMiner on. Instance types, their specifications, and their pricing in the different AWS regions are listed at https://aws.amazon.com/ec2/pricing/on-demand/ . Below are some sample instance types with different CPU and memory configurations.
INSTANCE-TYPE | vCPUs | MEMORY |
---|---|---|
t2.medium | 2 | 4 GiB |
m4.large | 2 | 8 GiB |
m4.xlarge | 4 | 16 GiB |
m4.2xlarge | 8 | 32 GiB |
Edit the Beanstalk instance configuration file (.ebextensions/00_instance.config) to use the chosen instance type, as shown below:
cd knetminer/common/quickstart/aws
vi .ebextensions/00_instance.config
Add the appropriate instance type value:
InstanceType: <INSTANCE-TYPE>
Example:
InstanceType: m4.xlarge
Add or delete the EC2KeyName entry. It is optional, and is needed only to log in (via SSH) to the Beanstalk EC2 instance for troubleshooting. Delete the line if SSH access to the instance isn't required.
EC2KeyName: <SSH-KEY-NAME>
Example:
EC2KeyName: mysshkeyname
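Both edits to the instance configuration can also be applied from the command line. The sketch below assumes the file already contains `InstanceType:` and `EC2KeyName:` lines as described above; `set_instance_type` and `remove_ssh_key` are hypothetical helper names.

```shell
# Hypothetical helper: replace the value of the InstanceType entry,
# keeping the line's existing indentation.
# Arguments: <config-file> <instance-type>
set_instance_type() {
  sed -i.bak "s#\(InstanceType:\).*#\1 $2#" "$1"
}

# Hypothetical helper: drop the optional EC2KeyName entry (no SSH access).
# Argument: <config-file>
remove_ssh_key() {
  sed -i.bak '/EC2KeyName:/d' "$1"
}

# Example:
#   set_instance_type .ebextensions/00_instance.config m4.xlarge
#   remove_ssh_key .ebextensions/00_instance.config
```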
When using one of the predefined datasets in the KnetMiner GitHub repository, add a Command entry to the Dockerrun JSON file.
vi Dockerrun.aws.json
Change
"Entrypoint": "./runtime-helper.sh"
}
to
"Entrypoint": "./runtime-helper.sh",
"Command": "arabidopsis /root/knetminer-dataset"
}
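A missing comma after the Entrypoint line is an easy mistake to make in this edit, and it leaves the file invalid. One way to catch it before deploying is to run the file through a JSON parser. The sketch below assumes `python3` is available on your machine; `check_json` is a hypothetical helper name.

```shell
# Hypothetical helper: report whether a file is well-formed JSON.
# Argument: <json-file>; returns non-zero if parsing fails.
check_json() {
  if python3 -m json.tool "$1" >/dev/null 2>&1; then
    echo "OK: $1 is valid JSON"
  else
    echo "ERROR: $1 is not valid JSON" >&2
    return 1
  fi
}

# Example:
#   check_json Dockerrun.aws.json
```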
To deploy via the AWS Console, prepare a Zip file containing the Docker files together with the customised configuration files described above.
cd knetminer/common/quickstart/aws
zip -r code.zip .
Then, log in to the Beanstalk section of the AWS Console. You can either create a new application or select an existing one. In the selected application, start a new environment by clicking the 'Actions' button on the right-hand side of the page and selecting 'Create environment'. Use the following values in the New Environment wizard.
- Environment: 'Web server environment'
- Environment name: a user-friendly name (e.g. knetminer-test)
- Domain: a user-friendly DNS prefix (e.g. knetminer-test)
- Platform: Preconfigured platform -> Docker
- Application code: Select 'Upload your code' and select the code.zip file created above.
To deploy via the Beanstalk CLI instead, create a new environment in the selected AWS Beanstalk application:
eb create
This will prompt for:
- unique environment name - provide a user-friendly name (e.g. knetminer-test)
- DNS CNAME prefix - can be left with default value
- load balancer type - can be left with default value
The deployment process provisions the required AWS resources (instance, load balancer) and adds the KnetMiner Docker container to the launched AWS instance. This normally takes about 15 minutes, depending on the dataset size.
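The prompts can also be answered on the command line. `eb create` accepts the environment name as an argument and a `--cname` option for the DNS prefix; when a name is given, the CLI should not need to prompt for the remaining values and uses its defaults. The sketch below wraps this in a hypothetical helper, `deploy_knetminer`, and uses the environment name as the CNAME prefix, which is an assumption made here for brevity.

```shell
# Hypothetical helper: create an environment, then print its status.
# Argument: <environment-name>, also used as the CNAME prefix.
deploy_knetminer() {
  eb create "$1" --cname "$1" && eb status "$1"
}

# Example (run from knetminer/common/quickstart/aws):
#   deploy_knetminer knetminer-test
```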
Log in to the Beanstalk section of the AWS Web Console, browse to the application and the newly launched environment, and find its URL (e.g. knetminer-test.eu-west-2.elasticbeanstalk.com). Append /client to that URL (e.g. knetminer-test.eu-west-2.elasticbeanstalk.com/client) to browse the KnetMiner UI.
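To check from a terminal that the UI is responding, the address can be built and probed as sketched below. Both helpers are hypothetical names, and plain HTTP is an assumption here; use https if your environment's load balancer is configured for it.

```shell
# Hypothetical helper: build the KnetMiner UI address from an environment hostname.
# Argument: <hostname>; a trailing slash, if any, is dropped.
knetminer_url() {
  printf 'http://%s/client\n' "${1%/}"
}

# Hypothetical helper: print the HTTP status code returned by the UI.
# Needs curl and network access to the environment.
knetminer_http_status() {
  curl -s -o /dev/null -w '%{http_code}\n' "$(knetminer_url "$1")"
}

# Example:
#   knetminer_http_status knetminer-test.eu-west-2.elasticbeanstalk.com
```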
The deployed KnetMiner Beanstalk environment can be terminated from either the AWS Console or the Beanstalk CLI, as follows:
In the Beanstalk section of the AWS Console, browse to the environment page, click the 'Actions' button (on the right-hand side of the page) and select 'Terminate environment' from the drop-down list.
Alternatively, from the Beanstalk CLI:
eb terminate <environment-name> # e.g: eb terminate knetminer-test
For further help, please refer to the following link