Topic Covered:

In this R-tutorial, we explain how to use the awsConnect package to launch, terminate EC2 instances on Amazon's Cloud service.

Download and install the package

  1. Make sure the Amazon's AWS CLI is installed on the machine you want to use this package on. For detailed information on how to do so, consult AWS CLI setup guide.
  2. Then using the devtools R-package, run the following command to install awsConnect

    devtools::install("https://github.com/lalas/awsConnect/")
    
  3. Finally load the library, with

    library(awsConnect)
    

Configuring and using the package

As mentioned elsewhere in the package documentation, the first thing to do once the package is installed is to setup the environment variables. If you are using RStudio, then environment variables defined in the .bash or .profile files are not available within RStudio. For instance

Sys.getenv("PATH")
## [1] "/usr/bin:/bin:/usr/sbin:/sbin:/usr/local/bin:/opt/X11/bin:/usr/local/git/bin:/usr/texbin"

does not contains the path aws bin; and other environment variables such as aws_access_key_id; the aws_secret_access_key; and the default region are empty. To fix this problem; we issue the following commands:

add.aws.path("~/.local/lib/aws/bin")
Sys.setenv(AWS_ACCESS_KEY_ID = aws_access_key)
Sys.setenv(AWS_SECRET_ACCESS_KEY = aws_secret_key)

The above 3-lines of code assumes that the path to aws binaries is ~/.local/lib/aws/bin, that your aws access key and secret keys are stored in R-variable called aws_access_key and aws_secret_key respectively.

Creating and Managing Key-Pairs for EC2

Amazon EC2 Key Pairs (commonly called an "SSH key pair"), are needed to be able to Launch and connect to an instance on AWS cloud. If no Key pairs are used when launching the instances, the user will not be able to ssh login into his/her instance.

Amazon offers 2 options: 1. The possibility of using it's service to create a key-pair for you; where you download the private key to your local machine 2. Uploading your own public key to be used.

Creating and Deleting key-pairs for a specific region

Assuming the name of the key we want to create is MyEC2 and we want to be able to use it with instances launched in US West (N. California) region, we will issue this command:

create.ec2.key(key.name = "MyEC2", region = "us-west-1")

This will create a file called MyEC2.pem in the ~/.ssh directory. Simiarly, in order to delete an existing key, named MyEC2 from a specific region, we use

delete.ec2.key(key.name = "MyEC2", region = "us-west-1")

and the file ~/.ssh/MyEC2.pem will be deleted

Uploading and using your own public key

The second method that Amazon web services allows, is to use your own public/private key pairs. Assuming that the user has used ssh-keygen command (or something similar) to generate the public/private key pairs; he/she can then upload their public key to a specific region using:

upload.key(path_public_key = "~/.ssh/id_rsa.pub", key_name = "MyEC2", region = "us-east-1")

where id_rsa.pub is the file that contains the protocol version 2 RSA public key for authentication.

Launch, Stop, and Terminate On-Demand EC2 cluster in a specific region with default security group

Now that we have created (or uploaded) the required key to AWS Cloud, we are ready to launch EC2 instances. We start by launching a cluster (of 2 nodes); in the US East (N. Virginia) region.

cl <- startCluster(ami = "ami-7376691a",instance.count = 2,instance.type = "t1.micro", key.name = "MyEC2", region = "us-east-1")

using the default security group, since we didn't specify the security.grp parameter. As of the time of writing this tutorial, ami-7376691a is an Amazon Machine Instance that runs off Ubuntu-13.10 (64bit), and has R version 3.1.0 and RStudio-0.98.501 installed on it. To stop this cluster, we use

stopCluster(cluster = cl, region = "us-east-1")

and to terminate this cluster, we use:

terminateCluster(cluster = cl, region = "us-east-1")

Defining default region using environment variables

Instead of always specifying the region parameters when calling various functions in the awsConnect package; the user can use the environment variable AWS_DEFAULT_REGION to set his/her default region. For example, to set up the default region to be US East (N. Virginia); we use

Sys.setenv(AWS_DEFAULT_REGION = "us-east-1")

where, us-east-1 is the ID for a specific region. To find out the IDs' of the different regions; use the following command:

regions <- describe.regions()

And to find out the availability zone and their status for a specific regions, find out its ID and use the following command:

describe.avail.zone(regions$ID[3])

Launch Spot Instances

Spot Instances allow you to name your own price for Amazon EC2 computing capacity. You simply bid on spare Amazon EC2 instances and run them whenever your bid exceeds the current Spot Price, which varies in real-time based on supply and demand.

Notes: * Spot Instances perform exactly like other Amazon EC2 instances while running. They only differ in their pricing model and the possibility of being interrupted when the Spot price exceeds your max bid.

  • You will never pay more than your maximum bid price per hour.

  • If your Spot instance is interrupted by Amazon EC2, you will not be charged for any partial hour of usage. For example, if your Spot instance is interrupted 59 minutes after it starts, we will not charge you for that 59 minutes. However, if you terminate your instance, you will pay for any partial hour of usage as you would for On-Demand Instances

  • You should always be prepared for the possibility that your Spot Instance may be interrupted. A high max bid price may reduce the probability that your Spot Instance will be interrupted, but cannot prevent interruption.

With this in mind; in order to launch a spot instance, we would:

  1. To find out the most recent spot price of a machine of a specific type (e.g. cr1.8xlarge) in a particular region we use the following command:

    get.spot.price(instance.type = "cr1.8xlarge", region = "us-east-1")
    

    Note: if we had defined the environment variable AWS_DEFAULT_REGION; we could have omitted the region parameters in the previous command to find out the most recent spot price history for the default region.

  2. Once we obtain the most recent spot price, we can bid for a spot instance, by providing a price higher than or equal to the spot price returned in the previous step. For instance, assuming that the spot price for the t1.micro is \$0.0031/hr; we would use the following command:

    spot.instances <- req.spot.instances(spot.price = 0.0032, ami = "ami-7376691a", instance.count = 1, instance.type = "t1.micro",key.name = "MyEC2")
    

    to bid for a t1.micro instance in our default region.

  3. Obtain the request IDs for the instances using:

    Spot.Instance.Req.IDs <- spot.instances[,"SpotInstanceReqID"]
    
  4. Find out the status of the bidding request, using the IDs obtained in the previous step

    describe.spot.instanceReq(Spot.Instance.Req.IDs)
    
  5. Assuming the statusCode returned from the previous command is fulfilled; then we can obtain the instanceIDs that were launched as a result of our bidding request using the following command:

    Instance.IDs <- describe.spot.instanceReq(Spot.Instance.Req.IDs)[,"InstanceID"]
    
  6. (OPTIONAL STEP) By default, Amazon EBS-backed instance root volumes have the DeleteOnTermination flag set to true, which causes the volume to be deleted upon instance termination. This might be problematic, since a spot instance could be terminated at anytime, as stated above. In order to fix this default behaviour, assuming we have the Instance IDs of the spot instances which we launched (as we did in the previous step), we can use the following command

    rootVol.DeleteOnTermination(Instance.IDs[1])
    

    in order not to delete the root volume of the first instance, once it's terminated.

Finally, to get information about running instances, running in the default region, we would use the following command:

describe.instances()

Published

02 August 2014 by Lalas

Tags


blog comments powered by Disqus