4. Using Dockerised Knetminer with Amazon AWS and ElasticBeanStalk

Joe edited this page Sep 27, 2019 · 5 revisions

KnetMiner can be deployed on AWS Beanstalk either from the command line (CLI) or through the Beanstalk section of the AWS Console.

Prerequisites

Installing AWS Beanstalk CLI - ONLY FOR CLI APPROACH

To start a KnetMiner deployment on AWS Beanstalk from the command line, you'll need to install the AWS Beanstalk CLI.

  • Use the setup scripts (provided by AWS) to install Beanstalk CLI.
  • Set up and configure the CLI on your machine. During the configuration step, you can either select an existing AWS Beanstalk application or create a new one.
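
The two steps above can be sketched as follows. This is a minimal sketch, assuming the EB CLI is installed from PyPI as the awsebcli package rather than with the setup scripts, and that eb init is run from the quickstart directory used later on this page. The require_cmd helper is hypothetical and only checks that a command is on the PATH.

```shell
# Hypothetical helper: report whether a command is available on PATH.
require_cmd() {
    if command -v "$1" >/dev/null 2>&1; then
        echo "found: $1"
    else
        echo "missing: $1" >&2
        return 1
    fi
}

# Typical sequence (commented out because it touches your AWS account):
#   pip install --user awsebcli     # assumption: pip-based install
#   require_cmd eb && eb --version  # confirm the CLI is reachable
#   cd knetminer/common/quickstart/aws
#   eb init                         # prompts for region, credentials and application
```

If require_cmd eb reports the command as missing after installation, the install location is probably not on your PATH.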

Dataset and Permissions

To use a dataset of your choice for your AWS KnetMiner deployment, perform the following:

  • Upload your dataset directory to an S3 bucket. The dataset directory should follow the convention explained in this section.
  • Make the bucket public, or attach an IAM policy that allows listing and reading the S3 bucket to the Beanstalk IAM role in use.
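
For the IAM route, the policy needs list access on the bucket and read access on its objects. The sketch below prints such a policy; dataset_read_policy is a hypothetical helper, the bucket, folder and policy names are placeholders, and aws-elasticbeanstalk-ec2-role is assumed to be the instance role (it is the Beanstalk default, but your environment may use another).

```shell
# Hypothetical helper: print a read-only IAM policy for one bucket.
# s3:ListBucket applies to the bucket itself, s3:GetObject to the objects in it.
dataset_read_policy() {
    bucket="$1"
    cat <<EOF
{
  "Version": "2012-10-17",
  "Statement": [
    {
      "Effect": "Allow",
      "Action": "s3:ListBucket",
      "Resource": "arn:aws:s3:::${bucket}"
    },
    {
      "Effect": "Allow",
      "Action": "s3:GetObject",
      "Resource": "arn:aws:s3:::${bucket}/*"
    }
  ]
}
EOF
}

# Example usage (commented out because it changes your AWS account):
#   aws s3 sync ./my-dataset "s3://my-knetminer-bucket/my-dataset/"
#   dataset_read_policy my-knetminer-bucket > policy.json
#   aws iam put-role-policy \
#       --role-name aws-elasticbeanstalk-ec2-role \
#       --policy-name knetminer-dataset-read \
#       --policy-document file://policy.json
```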

Deployment

Assuming that you've downloaded the KnetMiner GitHub repository, perform the following steps to customise and deploy your KnetMiner instance.

Beanstalk S3 configuration

You'll need to edit the Beanstalk S3 configuration file so that it uses the right S3 bucket for your dataset. Open the configuration file in an editor of your choice (we use vi here):

cd knetminer/common/quickstart/aws
vi .ebextensions/01_s3.config

Then, change the S3 bucket path from

aws s3 cp s3://knetminer-testing-bucket/arabadopsis/ /home/ec2-user/knetminer-dataset --recursive

to

aws s3 cp s3://<MY-KNETMINER-BUCKET>/<DATASET-FOLDER>/ /home/ec2-user/knetminer-dataset --recursive
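
If you prefer not to edit the file by hand, the same change can be scripted. This is a sketch only: set_dataset_source is a hypothetical helper, and it assumes the config file contains a single s3:// source per line, followed by a space.

```shell
# Hypothetical helper: point the "aws s3 cp" line of a config file at a new
# bucket and dataset folder, leaving the rest of the line untouched.
set_dataset_source() {
    config="$1"; bucket="$2"; folder="$3"
    # Match the s3:// source up to the next space. '|' is used as the sed
    # delimiter because the replacement contains slashes.
    sed "s|s3://[^ ]*|s3://${bucket}/${folder}/|" "$config" > "$config.tmp" &&
        mv "$config.tmp" "$config"
}

# Example:
#   set_dataset_source .ebextensions/01_s3.config my-knetminer-bucket my-dataset
```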

Beanstalk EC2 instance configuration

Depending on the dataset size, you'll need to pick an appropriate AWS instance type for Beanstalk to use when deploying KnetMiner. AWS instance types, their specifications, and pricing for the different AWS regions can be found at https://aws.amazon.com/ec2/pricing/on-demand/ . Below are some sample instance types with different CPU and memory configurations.

INSTANCE-TYPE  vCPUs  MEMORY
t2.medium      2      4 GiB
m4.large       2      8 GiB
m4.xlarge      4      16 GiB
m4.2xlarge     8      32 GiB
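
As a rough aid for choosing from the table above, the sketch below returns the smallest listed type with at least a given amount of memory. pick_instance_type is a hypothetical helper; how much memory a given dataset needs is not stated here, so that figure is something you have to estimate yourself.

```shell
# Hypothetical helper: smallest instance type from the table above that has
# at least the requested memory (in whole GiB).
pick_instance_type() {
    need="$1"
    if   [ "$need" -le 4 ];  then echo t2.medium
    elif [ "$need" -le 8 ];  then echo m4.large
    elif [ "$need" -le 16 ]; then echo m4.xlarge
    elif [ "$need" -le 32 ]; then echo m4.2xlarge
    else
        echo "no listed type has ${need} GiB" >&2
        return 1
    fi
}

# Example:
#   pick_instance_type 12
```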

Edit the Beanstalk instance configuration file, .ebextensions/00_instance.config, to use the chosen instance type:

cd knetminer/common/quickstart/aws
vi .ebextensions/00_instance.config

Add the appropriate instance type value:

InstanceType: <INSTANCE-TYPE>

Example:

InstanceType: m4.xlarge

Add or delete the EC2KeyName entry. It is OPTIONAL and only needed to log in (via SSH) to the Beanstalk EC2 instance for troubleshooting. Delete this line if SSH login to the instance isn't required.

EC2KeyName: <SSH-KEY-NAME>

Example:

EC2KeyName: mysshkeyname
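
Both edits can also be scripted. The helpers below are hypothetical and assume that each setting sits on its own "Key: value" line in the config file, as in the examples above; the rest of the file's layout is not assumed.

```shell
# Hypothetical helper: replace the value on the InstanceType line.
set_instance_type() {
    config="$1"; itype="$2"
    # Keep the original indentation and key (captured in \1), swap the value.
    sed "s|^\([[:space:]]*InstanceType:\).*|\1 ${itype}|" "$config" > "$config.tmp" &&
        mv "$config.tmp" "$config"
}

# Hypothetical helper: delete the EC2KeyName line when SSH access isn't needed.
remove_ssh_key() {
    config="$1"
    sed '/^[[:space:]]*EC2KeyName:/d' "$config" > "$config.tmp" &&
        mv "$config.tmp" "$config"
}

# Example:
#   set_instance_type .ebextensions/00_instance.config m4.xlarge
#   remove_ssh_key .ebextensions/00_instance.config
```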

Edit Docker run file - ONLY FOR PREDEFINED DATASET

When using a predefined dataset from the KnetMiner GitHub repository, add a Command entry to the Dockerrun JSON file.

vi Dockerrun.aws.json

Change

  "Entrypoint": "./runtime-helper.sh"
}

to

  "Entrypoint": "./runtime-helper.sh",
  "Command": "arabidopsis /root/knetminer-dataset"
}

Create a new AWS Beanstalk environment - via AWS Console

Prepare the code zip file

To deploy via the AWS Console, you'll need to prepare a zip file containing the Docker files along with the customisation/configuration files described above.

cd knetminer/common/quickstart/aws
zip -r code.zip .

Then, log in to the Beanstalk section of the AWS Console. You can either create a new application or select an existing one. In the selected application, start a new environment by clicking the 'Actions' button on the right-hand side of the page and selecting 'Create environment'. Use the following values in the New Environment wizard.

  • Environment: 'Web server environment'
  • Environment name: User friendly name (E.g: knetminer-test)
  • Domain: User friendly DNS prefix (E.g: knetminer-test)
  • Platform: Preconfigured platform -> Docker
  • Application code: Select 'Upload your code' and select the code.zip file created above.

Create a new AWS Beanstalk environment - via CLI

You can now create a new environment in the selected AWS Beanstalk application. This step provisions the AWS resources (instance, load balancer) and adds the KnetMiner Docker container to the launched AWS instance.

eb create

This will prompt for:

  • unique environment name - provide a user-friendly name (e.g: knetminer-test)
  • DNS CNAME prefix - can be left with default value
  • load balancer type - can be left with default value
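
To avoid the prompts, the environment name and CNAME prefix can be passed on the command line. The sketch below only prints the command so that it can be reviewed before running. Both helpers are hypothetical, and the name check reflects our understanding of the Beanstalk naming rule (4 to 40 characters, letters, digits and hyphens only, no leading or trailing hyphen), so treat it as an assumption rather than a guarantee.

```shell
# Hypothetical helper: check an environment name against the assumed rule.
valid_env_name() {
    name="$1"
    len=${#name}
    [ "$len" -ge 4 ] && [ "$len" -le 40 ] || return 1
    case "$name" in
        # Reject a leading or trailing hyphen, or any other character.
        -*|*-|*[!A-Za-z0-9-]*) return 1 ;;
    esac
    return 0
}

# Hypothetical helper: print (not run) an eb create command that reuses the
# environment name as the DNS CNAME prefix.
eb_create_cmd() {
    if ! valid_env_name "$1"; then
        echo "invalid environment name: $1" >&2
        return 1
    fi
    echo "eb create $1 --cname $1"
}

# Example:
#   eb_create_cmd knetminer-test
```

The load balancer type is left at its default here, as in the prompts above.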

Browsing the KnetMiner UI

The deployment process provisions the required AWS resources (instance, load balancer) and adds the KnetMiner Docker container to the launched AWS instance. This normally takes about 15 minutes, depending on the dataset size.

Log in to the Beanstalk section of the AWS Web Console, browse to the application and its newly launched environment, and find the environment URL (e.g: knetminer-test.eu-west-2.elasticbeanstalk.com). Append /client to that URL (e.g: knetminer-test.eu-west-2.elasticbeanstalk.com/client) to browse the KnetMiner UI.
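
From the command line, the same address can be built and polled until the UI responds. This is a sketch under assumptions: both helpers are hypothetical, the environment is assumed to answer over plain HTTP, and the default of 30 attempts at 30-second intervals is chosen only to cover the roughly 15 minutes mentioned above.

```shell
# Hypothetical helper: build the KnetMiner UI address from the environment's
# host name (shown in the console, or as CNAME in the output of "eb status").
knetminer_url() {
    echo "http://$1/client"
}

# Hypothetical helper: poll the UI until it answers or the attempts run out.
wait_for_ui() {
    url="$1"; tries="${2:-30}"
    while [ "$tries" -gt 0 ]; do
        # -f makes curl fail on HTTP errors, -s silences the progress output.
        if curl -fs -o /dev/null "$url"; then
            echo "up: $url"
            return 0
        fi
        tries=$((tries - 1))
        sleep 30
    done
    echo "timed out: $url" >&2
    return 1
}

# Example:
#   wait_for_ui "$(knetminer_url knetminer-test.eu-west-2.elasticbeanstalk.com)"
```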

Delete KnetMiner environment

The deployed KnetMiner AWS Beanstalk environment can be terminated from either the AWS Console or the Beanstalk CLI, as follows:

AWS Console

Within the Beanstalk section of AWS Console, browse to the environment page, click on the 'Actions' button (on the right hand side of the page) and select 'Terminate environment' from the drop-down list.

AWS Beanstalk CLI

eb terminate <environment-name> # e.g: eb terminate knetminer-test

For further help, please refer to the following link