Kyo Lee

Open-Source Cloud Blog

Category: AWS

Docker Strange Dev: How I learned to stop worrying and love the VM

A couple of months ago, my laptop died. I remember the day I brought it home for the first time. Still in college, I made a big decision—despite my marginal finances—to go with the top model. “It should last at least five years,” I convinced myself. Seven years later, I got the news from the certified Apple repair person: “Well, you should just buy a new laptop rather than try to fix this one.” So I did.

Now I have a new laptop, which is currently placed on a pedestal. Watching how fast applications open and close on this laptop makes me cry. The future has arrived. I tell myself, “No! I will not f*** this one up this time! I will never install any software on this laptop ever!” Of course, I am in denial. This pure awesomeness will eventually decay. I predict the rate of decay accelerates exponentially with the number of applications installed on the laptop (disclaimer: no supporting data exists for this assertion—other than my paranoia).

I begin searching for an answer to the quest: install no software. So I try Docker.


The noticeable difference between using Docker, which runs Linux containers, and using a virtual machine (VM) is speed.

Both options provide the ability to isolate software’s running environment from the base operating system. Thus both deliver the desirable paradigm: build once, run everywhere. And both address my concern—that the base operating system must remain minimal and untouched. However, Docker stands out from the classic VM approach by allowing the cloud application—which is a virtual image with the desired software installed—to start within milliseconds. Compare that to the minutes it takes to boot a VM instance.


What difference does this speed make?

Docker removes a big chunk of the mental roadblock for developers. Thanks to Docker’s superior response time, developers can barely perceive the distinction between running applications in a virtual image (a virtual container, to be precise) and running applications on the base OS. In addition to its responsiveness, Docker appeals to developers by doing away with the tedious procedures of booting a VM instance and managing the instance’s life cycle. Unlike the previous VM-centric approach, Docker embraces an application-centric design principle. Docker’s command-line interface (CLI) below asks us two simple questions:

1. Which image to use?

2. What command to run?

docker run [options] <image> <command>

With Docker, I can type single-line commands to run cloud applications, just as with other Linux commands. Then I will have my database servers running. I will have my web servers running. I will have my API servers running. And I will have my file servers running. All of this is done with single-line commands. The entire web stack can now be running on my laptop within seconds. The best part of all this? When I’m done, there will be no trace left on my laptop—as if those applications never existed. The pure awesomeness prevails.
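For example, each of those servers is one command away. The sketch below uses well-known public images from Docker Hub; the image names, ports, and container names are illustrative, not part of any particular setup:

docker run -d --name db -p 5432:5432 postgres   # a database server
docker run -d --name web -p 80:80 nginx         # a web server
docker run -d --name cache -p 6379:6379 redis   # a cache server

And when I am done, docker rm -f db web cache wipes them all away.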


Running Eucalyptus Console on Docker


For those who want to check out Eucalyptus Console to access Amazon Web Services, here are the steps to launch Eucalyptus Console using Docker on OS X.

Step 1. Install Docker on your laptop

Here is a great link that walks you through how to install Docker on OS X:

http://docs.docker.com/installation/mac/
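The install sets up boot2docker, a small Linux VM that hosts the Docker daemon, since OS X cannot run Linux containers natively. Bringing it up looks roughly like this (a sketch, assuming boot2docker lands on your PATH after installation):

boot2docker init   # create the Docker VM (first time only)
boot2docker up     # boot the VM and start the Docker daemon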

Step 2. Pull Eucalyptus Console Docker image repository

Run the command below to pull the Eucalyptus Console Docker images (it will take some time to download roughly 1.5 GB of image files):

docker pull kyolee310/eucaconsole

Run the command below to verify that the eucaconsole images have been pulled:

docker images

For those who want to build the images from scratch, here is the link to the Dockerfile used:

https://github.com/eucalyptus/dockereuca

Step 3. Update Docker VM’s clock

When running Docker on OS X, make sure the Docker VM’s clock is synchronized properly; a skewed clock can cause problems for some applications on Docker. To fix this issue, you will need to log into the Docker VM and synchronize the clock manually.

You can SSH into the Docker VM using the command below:

boot2docker ssh

Once logged in, run the command below to sync the clock:

sudo ntpclient -s -h pool.ntp.org

Run the command below to verify that the clock has been synced:

date

One more piece of patch work is to create an empty “/etc/localtime” file so that you can link OS X’s localtime file to the Docker VM’s localtime file at runtime:

sudo touch /etc/localtime

Exit the SSH session:

exit
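If you would rather not keep an interactive session open, the same fix can likely be applied in one shot, since boot2docker ssh passes a command through to the VM. An untested sketch:

boot2docker ssh "sudo ntpclient -s -h pool.ntp.org && sudo touch /etc/localtime"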

This issue is being tracked here:

https://github.com/boot2docker/boot2docker/issues/476


Step 4. Launch Eucalyptus Console via Docker

Run the command below to launch Eucalyptus Console on Docker (-i and -t give you an interactive terminal, -v mounts the Docker VM’s localtime file read-only into the container, and -p publishes the console’s port 8888):

docker run -i -t -v /etc/localtime:/etc/localtime:ro -p 8888:8888 kyolee310/eucaconsole:package-4.0 bash

Running a live “bash” session is admittedly not Docker’s way of doing things, but bear with me for the moment until I figure out how to run Eucalyptus Console properly without using the “service” command.

The command above opens a bash session in the eucaconsole container. From that shell, run the command below to launch Eucalyptus Console:

service eucaconsole start
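Until then, one possible workaround for skipping the interactive shell is to chain the start command into the container’s command and hold the container open with a long-running foreground process. This is an untested sketch, not the image’s documented usage:

docker run -d -v /etc/localtime:/etc/localtime:ro -p 8888:8888 kyolee310/eucaconsole:package-4.0 bash -c "service eucaconsole start && tail -f /dev/null"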

Step 5. Open Eucalyptus Console in a browser

Run the command below to find out the IP of Docker VM:

boot2docker ip

The output will look like:

    The VM’s Host only interface IP address is: 192.168.59.103

Using the IP above, access Eucalyptus Console at port 8888:

e.g., http://192.168.59.103:8888/
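If you want to script this step, the two commands can likely be combined as below. This is a sketch: it assumes boot2docker ip prints its human-readable message on stderr (leaving only the bare IP on stdout), and the open command is OS X specific:

open "http://$(boot2docker ip 2>/dev/null):8888/"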


In order to access AWS, you will need to obtain your AWS access key and secret key. Here is AWS’s guide on how to do so:

http://docs.aws.amazon.com/AWSSimpleQueueService/latest/SQSGettingStartedGuide/AWSCredentials.html

Step 6. Log into AWS using Eucalyptus Console


DevOps Culture — Fail Fast on Eucalyptus

At a meetup event down in San Diego, California, Eucalyptus had a chance to meet Sander van Zoest (@svanzoest), the VP of technology at OneHealth (http://www.onehealth.com/), who is also the organizer of the San Diego DevOps group (http://www.meetup.com/sddevops/). Sander and his team at OneHealth have been using the Eucalyptus cloud for some time. Asked why OneHealth runs Eucalyptus in-house, Sander had some interesting stories about dealing with health-related data and the company’s DevOps engineering culture.


Due to the strict regulations on Protected Health Information (PHI), OneHealth needs to take extra-strong measures if it is to provide its services on AWS; Sander spent a good amount of time explaining to us how demanding it is to satisfy the regulations. Such barriers make it complicated to push any personally identifiable health information to the cloud.

In AWS’s case, the specific barrier was that AWS provided no legal protection for sensitive data stored in its cloud. For instance, HIPAA and HITECH regulations require OneHealth to promise a 72-hour response time to inform its customers about a data breach, should one ever happen, and to provide an ETA for identifying and patching the security hole that caused the breach.

Sander points out that, at the moment, AWS does not guarantee such protections/services. For this reason, OneHealth’s production environment is deployed at a Rackspace co-location facility, since Rackspace provides HIPAA Business Associate Addendums. However, given the evolving nature of the public cloud, it is very “cloudy” to predict how things will change in the near future. The recent announcement by AWS of CloudHSM (http://aws.typepad.com/aws/2013/03/aws-cloud-hsm-secure-key-storage-and-cryptographic-operations.html) — although it doesn’t cover the legal protection — is a good indicator of AWS’s interest in providing secure storage services moving forward.


What this uncertain, “cloudy” future means for engineering at OneHealth is employing a variety of infrastructure environments to take advantage of each platform while staying flexible. It becomes essential to design OneHealth’s services and applications to be deployable on bare-metal systems at Rackspace (production environment), AWS (sandbox/staging environment), Eucalyptus (in-house continuous integration and testing environment), and engineers’ laptops using Vagrant (http://www.vagrantup.com/) (development and testing environment).

Across such heterogeneous systems, from production down to the engineer’s laptop, the development environment — the OS, dependencies, configurations, etc. — needs to be kept uniform via virtualization and automation, allowing new code to be pushed seamlessly from the laptop up to production. For handling the life cycle of machines and VM instances, the engineers at OneHealth are big fans of Chef (http://www.opscode.com/chef/), which makes configuration management portable across infrastructure platforms. For virtual machines, the instance images are prepared via Debian preseed files while leveraging the open-source tool VeeWee (https://github.com/jedi4ever/veewee), as sketched below.
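As a rough sketch of that VeeWee image-building flow (the box name and Debian template below are illustrative, not OneHealth’s actual configuration):

veewee vbox define 'debian-dev-box' 'Debian-7.1.0-amd64-netboot'   # define a box from a Debian template
veewee vbox build 'debian-dev-box'                                 # build the VirtualBox image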


At OneHealth, the philosophy of DevOps is deeply embedded in every aspect of its development and operation. The concept of DevOps was not new to many engineers who brought in the ideas of “Infrastructure as Code” and “Commit Often and Fail Fast” from previous companies such as MP3.com and Joost.

Speaking of DevOps culture, one fun fact Sander mentioned — which goes against intuition at many traditional IT shops — was that the operations team at OneHealth likes to take down instances and rebuild them regularly. Recycling the instances ensures the “freshness” of the deployed services and applications. The operations engineers would be more concerned if an instance’s uptime exceeded, say, 30 days, because that would mean the content of the instance was outdated, possibly containing unfixed bugs or security holes. If the deployment setup were doing its job, it would have killed the outdated instance and brought up a new one with the latest updates.

The same goes for the development environment. It is much better to refresh the dev-environment instances through frequent relaunching and rebuilding than to have developers working on a stale dev environment, which turns out to be more harmful to development. Plus, this destroy-and-rebuild discipline encourages developers to check their code in consistently to a version-controlled repository, allowing early detection of conflicts in the code.
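A trivial guard for that 30-day policy might look like the sketch below; the threshold and the plain echo alert are assumptions for illustration only:

#!/bin/sh
# flag an instance whose uptime exceeds 30 days (2592000 seconds)
up=$(cut -d. -f1 /proc/uptime)
if [ "$up" -gt 2592000 ]; then
  echo "WARNING: instance up for $((up / 86400)) days; time to recycle"
fi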

All of these procedures, bringing together datacenter automation and configuration management, are part of a fairly new movement in software development now labeled “DevOps”. The DevOps folks often joke that even a few years ago the terminology didn’t exist, yet now DevOps has become the most sought-after practice in IT, all thanks to the wide spread of cloud computing, which gave birth to the programmable infrastructure.


Introducing Metaleuca


What is Metaleuca?

Metaleuca is a bare-metal provisioning management system that interacts with the open-source software Cobbler via an EC2-like CLI.

Using Metaleuca, users can communicate with Cobbler to self-provision a group of bare-metal machines to boot up with fresh OS images. The main appeal of Metaleuca is that it lets users manage bare-metal machines like EC2’s virtual instances, via command lines that feel much like ec2-tools or euca2ools.



Metaleuca Command-Line Tools

Metaleuca consists of a set of command-line tools that mirror some of the commands in ec2-tools and euca2ools. The list below shows a number of the core commands in Metaleuca:

  • metaleuca-describe-profiles – Describe all the profiles provided in Cobbler
  • metaleuca-describe-systems – Describe all the bare-metal systems registered in Cobbler
  • metaleuca-reboot-system – Reboot the selected bare-metal system
  • metaleuca-run-instances – Initiate the provision sequence on the selected bare-metal systems
  • metaleuca-describe-instances – Describe the statuses of the provisioned bare-metal systems
  • metaleuca-terminate-instances – Terminate the bare-metal systems, returning them back to the resource pool

Metaleuca Configuration

Prior to installing Metaleuca, you must already have configured Cobbler to provision a group of bare-metal machines in your datacenter. If you are new to Cobbler, please visit Cobbler’s homepage at http://cobbler.github.com/ for more information on how to set it up.

Once you have Cobbler running in the datacenter, you will need to install Metaleuca on an Ubuntu machine or virtual machine. The installation guide for Metaleuca is provided at:

https://github.com/eucalyptus/metaleuca/wiki/Metaleuca-Installation-Guide

Metaleuca Walkthrough

Now that you have Metaleuca configured in the datacenter, let’s go over a scenario in which you want to launch two bare-metal instances with fresh CentOS 6.3 images.

First, you might want to use the command “metaleuca-describe-systems” to survey all the available systems registered in Cobbler.


Metaleuca allows users to select directly which bare-metal machines to provision — by using the machines’ IPs. However, those familiar with AWS will point out that this is not how virtual machines are provisioned on EC2; rather than specifying IPs, AWS users simply provide the number of instances to launch. For this reason, we will cover the EC2-style approach to provisioning instances here.

In Metaleuca, you first need to find out which ‘profile’ is set to install the CentOS 6.3 image. In Cobbler, a profile maps to a preconfigured ‘kickstart’ file that contains the netboot instructions for which OS to install when a machine boots up and initiates PXE boot. In other words, you may compare Cobbler’s profiles to instance images on EC2. In Metaleuca, you can display the available profiles using the command ‘metaleuca-describe-profiles’.
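Following the ./ convention of the later commands in this walkthrough, the invocation would simply be:

./metaleuca-describe-profiles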


Let’s say that the profile “qa-centos6u3-x86_64-striped-drives” is what we want to use.

Next, you will want to determine which “system-group” the machines should be selected from. In Metaleuca, the bare-metal machines can be grouped into different resource pools. For instance, in our QA system at Eucalyptus, which utilizes Metaleuca, we partitioned the machines in the datacenter into 6 groups: qa00, qa01, dev00, dev01, test00, and test01. Such grouping gives the machine pools semantics based on their usage, from which we created a resource-allocation policy for the users, who are mainly developers and QA engineers. The command “metaleuca-describe-system-groups” displays all the machines and their groups accordingly.


However, keep in mind that not all machines in the list will be available to be provisioned; some of them might be in use by other users. Thus, you will want to run the command “metaleuca-describe-system-groups -f” to discover which machines are free to use. Fortunately, at this moment, among the 6 system-groups mentioned above, the group ‘test01’ has 2 machines available, which are labeled “FREED” in the command’s output.
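That availability check, using the same convention, would be run as:

./metaleuca-describe-system-groups -f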


Once you have the profile and the system-group availability information, you are ready to provision the bare-metal machines. The command “metaleuca-run-instances” takes as input how many instances to launch, which profile to use, which group to draw from, and finally a user string to mark the machines:

./metaleuca-run-instances -n 2 -g test01 -p qa-centos6u3-x86_64-striped-drives -u kyo_machines_for_demo


And, similar to ec2-tools and euca2ools, you can monitor the progress of the provisioned machines using the command “metaleuca-describe-instances -u”.


Notice that the instances are in the “pending” state at the moment. Soon, about 8 minutes after launching, the instances will show as “running”.


At that point, you may SSH into the bare-metal machines to verify that they are up and running with a fresh CentOS 6.3 installed.

Later, the command “metaleuca-describe-system-user -u” comes in handy when you want to find out which machines are provisioned under your name.


When you are done with the machines, you may “free” them so that they return to the resource pool; the command for that is “metaleuca-terminate-instances -u”.

When running the command “metaleuca-describe-instances -u”, you will notice that the machines have been successfully freed.


Metaleuca as an Open Source Project


Metaleuca evolved out of internal usage — as the development and test environment for engineers — at Eucalyptus, and is now available as an open-source project on GitHub under the Apache license. The goal of the project is to complete the integration of Metaleuca into the Eucalyptus system so that it can serve as a “bare-metal only” zone in Eucalyptus. Your contribution is much appreciated!

Check out the project at:

https://github.com/eucalyptus/metaleuca

\m/

Allow Everyone to Launch Your Web-Service: Open Source Web-Service using Git and AWS

Open Source is the future of software.

The statement above is self-evident at this point of history — I mean, have you checked out www.microsoft.com/opensource recently?

Then, what does open-source mean when it comes to web-services?

Open Source has been a well-understood concept in the web-application community thanks to the neat “View Source” feature that came with almost every web browser. Copy-and-pasting a chunk of HTML code from one website to another was the de facto standard way of developing websites. Whether admitted or not, web-application development is deeply rooted in open-source culture.

But where should this open-source culture be heading? Should we keep up the culture of copy-and-pasting others’ code without telling anyone? Or should we embrace this open-source culture and explore the possibilities that arise from sharing code?

Then it occurred to me: what if we could share the entire code for a web service?

Imagine if anyone could launch web services such as eBay, Craigslist, Amazon, Angie’s List, Yelp, Instagram, Twitter, etc. on their own AWS instances. Anyone could instantly replicate all the functionality and services provided by such well-known web services at micro scale. If the big-name web services are the downtown department stores, these open-source web services on AWS instances are the mom-and-pop stores for the local community, or perhaps the black market, since these services could form and disappear in a rather unpredictable manner, similar to the life cycle of AWS instances. 😉

Think about a scenario where a person launches his or her own eBay-style online auction service on an AWS instance for a month-long school fundraising event. The person might also want to launch an instance of Craigslist for the same event, or Instagram and Twitter to keep people engaged during that period. Then, once the event is over, all the services can go away as if nothing ever happened.

Welcome to the era of Cloud Applications.

To test out the concept, I used the application “Open QA”, a web service designed to display test results from the ongoing development at Eucalyptus. The goal is to allow any Eucalyptus community member to launch the same service on AWS instances.

First, the web service has to be broken down into two bodies: the service layer and the data. Then each body of the web service needs to be available in the open, allowing anyone to download the code and convert an AWS instance into an Open QA web-service instance.

This is where Git comes in very nicely. For public repositories, Git allows you to create as many branches as you want, and these can serve as read-only code repositories for anyone on the Internet without hassle.

The service layer part of Open QA is available at:

https://github.com/eucalyptus/open-qa

The data for Open QA is available at:

https://github.com/eucalyptus/open-qa-frontend-data

Upon launching an instance on AWS, a user can convert the raw Amazon Linux instance into the Open QA web service by executing the 4 commands below:

sudo yum -y install git
git clone git://github.com/eucalyptus/open-qa.git
cd ./open-qa/script/
./open_qa-installer.py -t amazon -e <your_email>

At this point, the AWS instance is running an HTTPD service with the Open QA PHP code in its /var/www/html directory.

However, with only the service layer installed, the instance is missing the data and thus has no content to display.

Notice that the data for Open QA lives in a separate GitHub repository. The user can supply the service with the actual Open QA data via the 5 commands below:

cd ~
git clone git://github.com/eucalyptus/open-qa-frontend-data.git
cd ./open-qa-frontend-data
sudo tar -zxvf ./cache_storage_for_open_qa.tar.gz
sudo cp -r ./cache_storage_for_open_qa/* /var/www/html/webcache/.

Now the instance is fully converted to run the Open QA web service with up-to-date test results.

The data for Open QA is, for now, updated every day at 12:01am — pushed to the public Git repository from the QA server at Eucalyptus HQ. To keep the data in sync, the AWS instances are scheduled to periodically pull the latest data via “git pull” in the data directory and copy the new data over to the /var/www/html/webcache directory; this can be set up as a cron job via “sudo crontab -e” on the instances.
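As a sketch, the root crontab entry might look like the line below; the clone path under /home/ec2-user, the 12:30am schedule, and the re-extraction of the tarball are assumptions based on the setup commands above:

# hypothetical entry for sudo crontab -e: refresh Open QA data nightly
30 0 * * * cd /home/ec2-user/open-qa-frontend-data && git pull && tar -zxf cache_storage_for_open_qa.tar.gz && cp -r cache_storage_for_open_qa/* /var/www/html/webcache/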

This proof of concept with the Open QA application demonstrates how Git and AWS can be used to build an open-source web service that anyone with an AWS account can launch. The separation of the service layer and the data allows each to be updated independently; for some other web applications, it might be desirable for users to supply their own customized data, thus utilizing only the service layer.

If anyone ever dreams of launching 10,000 instances of a web application in the cloud, it will have to be done through an open-source scenario like the one above, where each instance pulls the service and data it needs at its own pace and for its own purpose: to serve locally.
