Kyo Lee

Open-Source Cloud Blog

Pigeons on a Euca: Eucalyptus Cloud Monitoring Mobile App via Twitter

Being a system administrator is the easiest job in the building when the system is working; no one questions your presence or your existence. Your tasks are highly under-appreciated in times of peace, yet you do not mind such vanity, since you'd rather be reading blogs and watching YouTube in serenity. Every once in a while, an idiot cracks an ancient joke: "hey, aren't you supposed to be working?" But I am working, you imbecile employee-of-the-month.

However, the curse begins once you step out of the building, entering the realm of unknowns, far disconnected from the comfort of your MacBook and Wi-Fi. While staring at endless tail-lights on the freeway, taking a long walk on the beach, or standing in line behind 7 shopping carts at the grocery market, your mind begins to wonder, "how's my system doing?"

Since day 1, you have set up numerous layers of email-notification alarms, but it's never enough; "getting the e-mails" only means "it's too late." There is always the urge to log in. But you can't. You are cut off. You are trapped; the lady in front of you just pulled out a checkbook while the sign clearly says "credit or cash only." You begin to panic. You compulsively refresh the email on your smartphone, but no answers. The silence is deafening. No news is never good news. The only way out is for the system to whisper in your ear: "Have no worries. Everything is working… For now."

Now, your days of worry are over. In the midst of the 3G wilderness, the application Pigeons on a Euca will deliver you peace and tranquility comparable to that of a laptop on VPN. Of course, that is if you are equipped with a smartphone at all times and your system is running a Eucalyptus cloud.

The trick is to run a periodic Cloud monitoring app via Twitter.

Instead of being passively notified by email when problems arise in the system, you can set up the application Pigeons on a Euca, which runs a small script that actively "tweets" the status of the cloud for you and your fellow sysadmins to follow.

Screenshot of Pigeons on a Euca on iPhone

Here are the requirements:

  • Have a twitter account opened for this application.
  • Have a machine, or a virtual machine, running Linux with network capability.
  • Have the cloud admin’s credentials.
  • Have a smartphone with Twitter Client App installed.

Current (Beta) Features**:

  • Every 1 minute, it tweets status changes on running instances*.
  • Every 10 minutes, it tweets the number of currently running instances in the cloud*.
  • Every 10 minutes, it tweets the number of newly launched instances in the cloud*.
  • Every 10 minutes, it tweets each availability zone's information.

* These features rely on the new version of euca2ools (v 2.0)

** The application is highly configurable so that more reporting can easily be added when needed.

Instructions on How to Set up the Application Pigeons on a Euca

First, you need to open a new twitter account.

1. Go to twitter.com and sign up for a new account; if you already have an account with Twitter, you are going to need a new email address to create a new twitter account for this application.

2. As soon as the new twitter account is open, check the “Protect my Tweets” box in the Tweet Privacy section so that your tweets do not accidentally get broadcast to the public channel.

3. Apply for a developer account at dev.twitter.com.

4. After opening the twitter developer account, you need to create an application at dev.twitter.com.

5. Fill out the application details form. There is no requirement on what you put in the name and description boxes. For the website box, you may specify any working web URL of your choice; it won’t matter for this application.

6. After creating the application, on the “settings” page, change the access level to “Read and Write”.

7. Click the button “Change this Twitter application’s settings” at the bottom of the page to apply the change. It might take a few minutes for the change to be applied.

8. On the “Details” page, verify that the access level has changed to “Read and Write”. After seeing that the change has taken place, click the button “Create my access token” at the bottom of the page.

9. Go to the page “OAuth tool” and verify the consumer key and access token are generated. You will need these keys to configure the application Pigeons on a Euca later.

10. At this point, your twitter account is configured to receive script-generated tweets from the application Pigeons on a Euca.

Second, after the twitter account is ready, you need to set up the machine where the application Pigeons on a Euca will run.

1. Install a perl module “Net::Twitter::Lite” on your Linux box.

You may install the module from source by visiting the link:

http://search.cpan.org/~mmims/Net-Twitter-Lite-0.10004/lib/Net/Twitter/Lite.pm

Or, for Ubuntu distributions, such as Lucid, you may simply add the line:

deb http://ubuntu.mirror.cambrium.nl/ubuntu/ lucid main universe

to “/etc/apt/sources.list”.

Then, install the perl module via the package libnet-twitter-lite-perl by using the commands:

apt-get update
apt-get install libnet-twitter-lite-perl
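
To double-check that the module is visible to perl, you can run a quick one-liner; it prints the confirmation message only if the module loads cleanly:

perl -MNet::Twitter::Lite -e 1 && echo "Net::Twitter::Lite is installed"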

2. Install the latest version of “euca2ools” (v 2.0)

Visit the website (http://open.eucalyptus.com/downloads) for detailed instructions on how to install the latest euca2ools.

For Ubuntu distributions, such as Lucid, add the line:

deb http://downloads.eucalyptus.com/software/euca2ools/2.0/ubuntu lucid universe

to “/etc/apt/sources.list”.

Then, install euca2ools by using the commands:

apt-get update
apt-get install euca2ools
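
A quick sanity check that euca2ools landed on your PATH (the cloud credentials come later, so the tools cannot talk to the cloud just yet):

which euca-describe-availability-zones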

Third, after all the necessary modules and tools are installed, you can finish setting up the application on the machine.

1. Download the tarball pigeons_on_a_euca.tar.gz from the project repository:

https://projects.eucalyptus.com/redmine/projects/pigeons-on-a-euca/files

or

https://github.com/eucalyptus/pigeons_on_a_euca

2. Untar the tarball at a directory of your choice:

tar zxvf pigeons_on_a_euca.tar.gz

3. On the “my applications” page on dev.twitter.com, copy the lines in the “OAuth tool” section, as shown in Step 9 of the first instruction set.

4. Change the lines in the file “./pigeon_on_a_euca/pigeon_cage/key/o_auth_settings.key” with your account’s actual values.
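
For reference, the file simply holds the four OAuth values you copied from the “OAuth tool” page. The key names below are illustrative placeholders; match them to whatever names the shipped file actually uses:

consumer_key=YOUR_CONSUMER_KEY
consumer_secret=YOUR_CONSUMER_SECRET
access_token=YOUR_ACCESS_TOKEN
access_token_secret=YOUR_ACCESS_TOKEN_SECRET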

5. Perform a quick check to validate the setup so far by running the commands:

cd ./pigeon_on_a_euca/pigeon_cage

perl ./tweet_it_away.pl ./tweets/mytest.tweet

6. Check the twitter account to verify that the line “this is a test” was tweeted. Also notice the lock icon on the tweet, indicating that the tweets are private.

7. After verifying the test tweet, go to the directory “./pigeons_on_a_euca/credentials” and store your Eucalyptus cloud’s admin credentials.

8. Verify that you can talk to your Eucalyptus cloud via the admin credentials by running the commands:

cd ./pigeons_on_a_euca/credentials

source eucarc

euca-describe-availability-zones verbose

9. At this point, the application is all set to run. Do a quick check by running the main script:

perl ./activate_the_pigeons.pl

10. Check the twitter account to verify that the status of the instances running on the cloud is being tweeted.

11. To run the main script in the background, do:

nohup perl ./activate_the_pigeons.pl > ./stdout.log 2>> ./stderr.log &

12. To monitor the run:

tail -f stdout.log

Last, install any Twitter Client App on your smartphone and follow the account that you created above.

Now you have a mobile application that keeps you updated on the status of the cloud.

Warning: The number of tweets generated by the application might be overwhelming; at its maximum rate, it posts 350 tweets per hour. It is recommended that you and your co-workers open a separate twitter account exclusively for receiving tweets from this application.

And, if you decide to modify the script, be aware of the hourly limit on tweet updates, which is 350 tweets per hour. Carefully limit your tweets so that the application maintains consistent tweet-ability. A sketch of an extra report is shown below.
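
As a rough sketch of what an extra report could look like, the loop below counts pending instances every 10 minutes and hands the text to tweet_it_away.pl, the same way the test tweet was sent earlier. The report name, paths, and polling interval are made up for illustration:

#!/bin/sh
# Hypothetical extra report: tweet the number of pending instances every 10 minutes.
cd /path/to/pigeon_cage        # wherever you untarred the application
while true; do
    COUNT=$(euca-describe-instances | grep -c "pending")
    echo "pigeon report: ${COUNT} pending instance(s) in the cloud" > ./tweets/pending_count.tweet
    perl ./tweet_it_away.pl ./tweets/pending_count.tweet
    sleep 600   # six tweets per hour, well under the 350-per-hour ceiling
done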

Thank you for your interest in the application, and feel free to contribute and share.

Kyo

A Developer Walks through Cloud

Skip Directly to [Instruction on How to Run the Video Processing Prototype]



1. Little Phone, Big Cloud

A few months ago, a phrase caught my attention: “Instagram for Video”. It was an interesting idea for a mobile application. As a software designer, I dug into the idea and soon realized one major implementation challenge.


It turns out that video is a collection of pictures: many, many pictures. Given the standard 24 frames-per-second rate, even a one-minute-long video would comprise 1,440 pictures, which would mean image-processing 1,440 pictures on a mobile phone. That is a lot of pictures for the small battery in your mobile phone to handle.


There is an alternative to this scenario: let's consider moving the image-processing task over to a remote machine that is bigger, stronger, and meaner. In this scenario, the mobile phone would upload the video to a server over the internet, have it processed remotely, and retrieve the processed video back in a seamless fashion.


However, there is one absolutely crucial requirement in this scenario; we are going to need a big, big, big machine: big enough to handle millions of requests once this killer application goes viral (go big or go home). There is only one answer to this type of demand: “the Cloud.”

Luckily, there is an open-source cloud available; Eucalyptus is an open-source Infrastructure-as-a-Service cloud platform whose APIs are compatible with those of Amazon's EC2. This makes Eucalyptus an ideal in-house cloud application development platform. It guarantees that once my killer application runs on Eucalyptus, it will also run on EC2 with no modifications required, thus creating a truly portable cloud application with world-wide deployability.

2. IaaS Cloud

For those who are not familiar with IaaS clouds, let me take you on a quick walkthrough of the cloud.

Eucalyptus and Amazon's EC2 offer “Infrastructure-as-a-Service” cloud platforms. This means that a cloud-user can request, “Hey cloud, I need 5 machines with full network connectivity and access to storage,” and within minutes, the user will have the complete system ready for use.
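
In euca2ools terms, that spoken request is a single command; the image ID, keypair, and instance type below are placeholders:

euca-run-instances emi-XXXXXXXX -n 5 -k mykey0 -g group0 -t m1.small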


Take this concept a little further; instead of requesting machines for generic purposes, the cloud-user could specify, at creation time, which machines serve which purposes. For instance, using the example in this article, the cloud-user could have asked, “Hey cloud, I want one machine to work as a collector and the rest as image-processors, and have them process my cat video immediately!” Then, the cloud would have brought up a network of machines with a specific task assigned to each machine, and they would have started processing the cat video right away. Once the processing was complete, the machines would have terminated themselves, leaving only the processed cat video behind.


3. App on the Cloud

Let’s go back to the video-processing application on the cloud. Here I will cover some major design considerations when developing applications on the cloud.

3.1. Parallelism and Elasticity

Designing an application on a distributed system requires breaking a process down into small tasks. Then, one must identify the tasks that can bring parallelism into the process. In this video-processing application, the process breaks down into 3 major steps: decoding the video into images, processing the images, and encoding the processed images back into a video. Given this breakdown, the natural approach is to distribute the image-processing task over multiple machines and assign a single machine to perform the decoding and encoding.


One important characteristic of the cloud that you must recognize at the core of the design is its elasticity. Elasticity is what differentiates cloud applications from traditional distributed applications. Traditionally, in a distributed computing environment, the number of nodes N in the system is a static value that cannot change during a job. In the cloud environment, however, there is no bound on the number N; theoretically, N is limitless. This means that at any given point during the job, the system should expect N to grow, or even shrink in some cases. For instance, in our video-processing application, we could initially start with 5 machines assigned as image-processing nodes, and then, in the middle of the processing, add 5 more nodes to boost throughput. Taking advantage of this elasticity must be considered at the design level of the application.
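
With the prototype described below, “adding 5 more nodes” mid-job is literally just repeating the processor request shown later with a bigger count; nothing else in the running system has to change:

euca-run-instances emi-9BD01749 -k mykey0 -n 5 -g group0 -t c1.medium -d "processor.pl 192.168.7.77 [10.219.1.2 neon.scm]"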


3.2. Prototype

The following is an overview of the prototype of the video-processing application in the cloud.

For more detailed instructions, please go to the page [Instruction on How to Run the Video Processing Prototype]

The goal of the prototype is to demonstrate a cloud application that performs image-processing tasks in a distributed fashion. The application takes a video file as input, performs the image-processing in parallel, and, when it terminates, stores the processed video file in a known, pre-arranged storage location.

For the simplicity of the prototype, let's assume that there is a machine working as a file server, with an apache web service running in the open, that is accessible from the cloud. In other words, any virtual instances (nodes) spawned on the cloud will have access to the files on the file server via download (wget). Given this setup, for instance, when we trigger the collector node, it can download the input video file from the file server to start the process.
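
For example, once an instance is up, fetching a script or the input video from the file server is a plain HTTP download; the exact paths on the web server are up to you:

wget http://192.168.7.77/collector.pl
wget http://192.168.7.77/lovemycat.avi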


For the prototype, we need to construct two types of nodes: the collector node and the image-processing node. However, before I go into further details, I must explain what takes place when the cloud-user requests an instance from the IaaS cloud.

When the cloud-user asks the cloud, “Hey cloud, I need one machine,” the user is required to specify the image for the machine. In other words, the cloud-user must request, “Hey cloud, I need one machine with the RHEL 6.1 image that I have prepared for this video-processing prototype.” The cloud then brings up a virtual instance that is flashed with the specified RHEL 6.1 image. Since users can prepare and upload images of their choice to the cloud, the possibilities for what you want the instances to do, or to become, are limitless.


For this particular prototype, I prepared a single image to be used by all nodes. I took a generic Ubuntu Karmic image as the base image and modified its ‘rc.local’ script, which is the default script that gets executed automatically when the image boots up. The modified ‘rc.local’ script is set to read a line from the ‘user-data’ field, which gets passed to the instance by the cloud-user at creation. This small modification allows me to control the roles of the instances while having only one image. For example, I can request, “Hey cloud, I want one instance with my special Ubuntu image, and have it run the script ‘collector.pl'”, and later I can ask, “Hey cloud, I want another instance with the same image, but this one will run the script ‘processor.pl’.”
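
A minimal sketch of that ‘rc.local’ modification, assuming the standard EC2-compatible metadata address that Eucalyptus also serves; the real script in the image parses the bracketed arguments shown further down, which this sketch glosses over:

#!/bin/sh
# /etc/rc.local (sketch): read the user-data line, fetch the named script, run it.
USER_DATA=$(wget -q -O - http://169.254.169.254/latest/user-data)
SCRIPT=$(echo "$USER_DATA" | awk '{print $1}')        # e.g. collector.pl
FILE_SERVER=$(echo "$USER_DATA" | awk '{print $2}')   # e.g. 192.168.7.77
wget -q "http://${FILE_SERVER}/${SCRIPT}" -O "/root/${SCRIPT}"
perl "/root/${SCRIPT}" $USER_DATA
exit 0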

The requests in the example above would look like the ones below. Notice that they use the same image ID ‘emi-9BD01749′ but different ‘user-data’ values (-d).

First request to bring up a collector:

euca-run-instances emi-9BD01749 -k mykey0 -n 1 -g group0 -t c1.medium -d "collector.pl"

Second request to bring up a processor:

euca-run-instances emi-9BD01749 -k mykey0 -n 1 -g group0 -t c1.medium -d "processor.pl"


In the prototype, the actual requests contain more information than just a script name. The first request looks like this:

euca-run-instances emi-9BD01749 -k mykey0 -n 1 -g group0 -t c1.medium -d "collector.pl 192.168.7.77 [lovemycat.avi]"

This command translates to: after the instance boots up, it downloads the specified script ‘collector.pl’ from the file server at ‘192.168.7.77’ via wget and executes the script. The purpose of the script ‘collector.pl’ is to turn the instance into the collector node for the video-processing application. First, the script installs all the necessary software via apt-get commands on Ubuntu; it uses various open-source tools for the encoding and decoding tasks. It also installs the NFS server to create a shared directory that the processing nodes can access. Second, it downloads the target video file ‘lovemycat.avi’ from the file server at ‘192.168.7.77’ (for the convenience of the prototype, the file server is designed to provide all the external file resources to the instances). Then, the collector node decodes the avi file into a collection of JPEG images. These image files are stored in the shared directory exported by the NFS server. Now, the collector node waits for the image files to be processed by the processing nodes; its job is to periodically scan the shared directory for progress.
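
Condensed into a sketch, the collector side does something like the following. Tool and package names (ffmpeg, nfs-kernel-server) and the directory layout are stand-ins for whatever the real ‘collector.pl’ uses, and the wait condition assumes processors move frames out of the incoming directory as they claim them:

#!/bin/sh
# collector.pl, sketched in shell.
apt-get -y install ffmpeg nfs-kernel-server
mkdir -p /var/shared/frames /var/shared/done
echo "/var/shared *(rw,sync,no_root_squash)" >> /etc/exports && exportfs -a
wget -q http://192.168.7.77/lovemycat.avi
ffmpeg -i lovemycat.avi /var/shared/frames/img%05d.jpg          # decode into stills
while [ -n "$(ls -A /var/shared/frames)" ]; do sleep 30; done   # idle and scan
ffmpeg -r 24 -i /var/shared/done/img%05d.jpg processed_lovemycat.avi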


After the collector node enters the stage where it idles and scans, the next step is to start a group of processing nodes by requesting:

euca-run-instances emi-9BD01749 -k mykey0 -n 3 -g group0 -t c1.medium -d "processor.pl 192.168.7.77 [10.219.1.2 neon.scm]"

As a result, 3 instances will boot up, download the specified script ‘processor.pl’ from the file server at ‘192.168.7.77’, and convert themselves into image-processing nodes. Each installs the open-source image-processing software GIMP and the NFS client, then NFS-mounts the shared directory of the collector node, whose IP is ‘10.219.1.2’. These processing nodes will then start picking up image files from the shared directory and perform the image-processing using GIMP according to the script ‘neon.scm’.

The syntax of the user-data for this image is:

-d "<script> <file_server_IP> [ <arguments_for_script> ]"


Now, here is one crucial design decision that complements the elasticity of the cloud. The work-unit for the image-processing is set to be 20 images at a time. This means that each node is only allowed to grab a chunk of 20 images at a time to process. Under this policy, the processing nodes must frequently ask the collector node for a small amount of work, instead of pre-determining the complete workload for each processing node before the processing begins. This approach allows more processing nodes to be added to the system at any moment, thus taking full advantage of the elasticity.
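
Here is a sketch of that 20-at-a-time policy from the processor side. The claiming mechanism (moving files into a per-node directory) and the GIMP batch call are my own simplifications of whatever ‘processor.pl’ and ‘neon.scm’ really do:

#!/bin/sh
# processor.pl, sketched in shell.
SHARED=/mnt/shared                      # NFS mount of the collector's shared directory
WORK=$SHARED/work.$(hostname)
mkdir -p "$WORK"
while true; do
    CHUNK=$(ls "$SHARED/frames" 2>/dev/null | head -20)    # claim at most 20 frames
    [ -z "$CHUNK" ] && break                                # nothing left to do
    for IMG in $CHUNK; do mv "$SHARED/frames/$IMG" "$WORK/" 2>/dev/null; done
    for IMG in "$WORK"/*.jpg; do
        gimp -i -b "(neon-filter \"$IMG\")" -b '(gimp-quit 0)'   # stand-in for neon.scm
        mv "$IMG" "$SHARED/done/"
    done
done
halt    # self-terminate and give the resources back to the cloud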


When the processing nodes discover that there are no more images to be processed, they terminate themselves, freeing up computing resources for the cloud. When the collector node learns that all the images have been processed, it wakes up and encodes the images into a new video file. The final AVI file is then uploaded to a storage location that belongs to the cloud-user. Eucalyptus and EC2 offer S3 storage that allows such an operation, but I will save the details for later.


This prototype demonstrates how a complex operation, such as distributed video-processing, can be automated using the cloud. However, the automation is just the tip of the iceberg. The raw power of the cloud comes from the ability to instantly replicate the application at massive scale across the world. This capability of the cloud is behind the recent boom in Software-as-a-Service (SaaS) solutions.


Extra. Links to Processed Videos

Using Invert Filter –

Using Edge Filter –

Using Motion Blur Filter –

Related. Links to Project Home Page –

