Kyo Lee

Open-Source Cloud Blog

Tag: testing

A Developer’s Story on How Eucalyptus Saved the Day

This is a short story about how a UI developer at Eucalyptus used Eucalyptus to save time while developing Eucalyptus.

The agenda of the day is to set up Travis CI for the latest Eucalyptus user console, Koala. Quoted from Wikipedia, “Travis CI is a hosted, distributed continuous integration service used to build and test projects hosted at GitHub.” In other words, we want to set up an automated service hook on Koala’s GitHub repository so that whenever developers commit new code, “auto-magic” takes place somewhere on the Internet and ensures that the developers did not screw things up by mistake. That, in turn, allows us developers to sit back and enjoy a warm cup of “post-commit-victory” tea while our thoughts drift away on the sea of Reddit.

But, before arriving at such Utopia, first things first. Must read the instructions on Travis CI.

Luckily, Travis CI has put together a nice and comforting set of documentation on how to hook a project up to Travis CI (http://about.travis-ci.org/docs/user/getting-started/), along with an impressive, pain-free registration interface. So far, the first few steps are as easy as clicking buttons.

And, of course, nothing ever comes that easy. Now I am looking at the part where I need to create the YAML configuration file (.travis.yml) for Koala’s build procedures. Done with the button-clicking. Time to put down the cup of tea because now I have some reading and thinking to do.

A few minutes later, I am stunned by the line below:

“Travis CI virtual machines are based on Ubuntu 12.04 LTS Server Edition 64 bit.”

Oh, bummer.

The main development platform for Koala has been CentOS 6, meaning Koala’s build dependencies and scripts have been targeted at a CentOS 6 environment. It means that I need to go over the build procedures and dependency settings so that Koala can be built and tested on Ubuntu 12.04. But, first, where do I find those Ubuntu machines?

Then, the realization, ‘Wait a second here. I have Eucalyptus.’

I open up a browser and log into the Eucalyptus system that I have been using as the backend for developing Koala. I launch a couple of Ubuntu 12.04 instances. Within a minute, I have 2 fresh Ubuntu 12.04 virtual machines up and running.

Immediately I log in to the first instance and start installing Koala to validate the build procedures in the Ubuntu 12.04 environment. Along the way, I discover various little issues in this new environment and tweak things to fine-tune Koala’s build procedures. Once I feel ready, I log into the other Ubuntu instance to verify the newly adjusted build procedures in a fresh setting. More mistakes and issues are caught, and more adjustments are made. Meanwhile, the first instance has been shut down and a new Ubuntu instance has been brought up. With this new instance, I am able to rinse and repeat the validation of the build procedures. Of course, there are some mistakes again. They get fixed and adjusted. Meanwhile, another instance goes down and comes up fresh.

The juggling of the instances goes on for a couple more rounds until the build procedures are perfected. Now I am confident that Koala will build successfully in the Ubuntu 12.04 environment. Commit the new build configuration file to GitHub. It’s time for the warm cup of “post-commit-victory” tea.

[Cloud Application] Run Eucalyptus UI Tester on your Mac using Vagrant

Initially, this post was meant to be a purely technical one, describing how to run Eucalyptus UI Tester (se34euca) on your Mac using Vagrant and VirtualBox. However, writing it reminded me of the benefits of running or developing applications on a virtual machine.

Background: Automated Tester As An Application (ATAAA)

When developing software, there is a need for having an automated test suite readily available; with the click of a button, a developer should be able to run a sequence of automated tests to perform a speedy sanity check on the code that is being worked on.

Traditionally, a couple of in-house machines would be dedicated to serving as automated testers, shared by all developers. In such a setup, the developers would have to interact with the tester machines over VPN, which can get quite hectic sometimes, especially for those developers who like to hang out at a local coffee shop.

Now, with Vagrant and VirtualBox, you can have your own personal automated tester as a “cloud application” running on a laptop. In this scenario, when the code is ready for testing, you can quickly run a set of automated tests on your laptop by launching a virtual image that has been pre-configured to be the automated tester for the project/software. When finished, the virtual instance can be killed immediately to free up the resources on the laptop.

Benefits of Running Applications on a Virtual Instance

As mentioned in the introduction, while preparing this Eucalyptus UI Tester to run as a cloud application, I rediscovered my appreciation for using virtual machines as part of the software development environment. The fact that the application runs on a virtual image brings the following benefits: contain-ability, snapshot-ability, and portability of the application.

1. Contain-ability

Running the application on a virtual instance means that no matter how messy the application’s dependencies are, they all get installed in a contained virtual environment. This means that you get to keep your precious laptop clean and tidy, protecting it from all those unwanted, unstable, experimental packages.

2. Snapshot-ability

When working with a virtual instance, at some point you should be able to stabilize the application, polish it up to a known state, and take a snapshot of the virtual image in order to freeze the moment. Once the snapshot is taken and preserved, you can bring the application back to that known state at any time. It’s just like having a time machine.

3. Portability

When working with a team or a community, the portability of the application on a virtual image might be the most appealing benefit of all. Once you polish up the application to run nicely on a virtual image, the promise is that it will also run smoothly on any other virtual machine out there, including your fellow developers’ laptops as well as the massive server farms in a data center, or in the cloud somewhere. Truly your application becomes “write once, run everywhere.”

Running Eucalyptus UI Tester on Your Mac Laptop via Vagrant

If you would like to run Eucalyptus UI Tester from scratch, follow the steps below:

1. Installing Vagrant and Virtual Box on Mac OS X in 5 Steps

and

2. Installing Eucalyptus UI Tester on CentOS 6 image via Vagrant

If you would like to run Eucalyptus UI Tester from the pre-baked Vagrant image, follow the steps below:

1. Installing Vagrant and Virtual Box on Mac OS X in 5 Steps

then

3. Running PreBaked Eucalyptus UI Tester Image using Vagrant

Also, see 4. Creating a New Vagrant Package Image if you are interested in creating a new image via Vagrant.

Instructions

1. Installing Vagrant and Virtual Box on Mac OS X in 5 Steps

https://github.com/eucalyptus/se34euca/wiki/Installing-Virtual-Box-and-Vagrant-on-Mac-OS-X

2. Installing Eucalyptus UI Tester on CentOS 6 image via Vagrant

https://github.com/eucalyptus/se34euca/wiki/Installing-se34euca-on-Centos-6

3. Running PreBaked Eucalyptus UI Tester Image using Vagrant

https://github.com/eucalyptus/se34euca/wiki/Running-PreBaked-se34euca-Image-using-Vagrant

4. Creating a New Vagrant Package Image

https://github.com/eucalyptus/se34euca/wiki/Creating-a-New-Vagrant-Package-Image

Beyond Continuous Integration: Locking Steps with Dev, QA, and Release

Continuous integration: the practice of frequently integrating one’s new or changed code with the existing code repository [wikipedia]

In this blog we will talk about how the continuous integration process was put in place for the new component, Eucalyptus User Console, in order to coordinate the efforts of the dev, QA, and release teams throughout the development cycle of Eucalyptus 3.2.

Background

[Figure: Eucalyptus User Console component view]

Eucalyptus User Console is a newly introduced component in Eucalyptus whose main goal is to provide cloud users with an easy-to-use, intuitive browser-based interface, thus assisting dev/test cloud deployments among IT organizations and enterprises. Eucalyptus User Console consists of two components: a JavaScript-based client-side application and a Tornado-based user console proxy server.

Early Involvement

The first phase of the development was to come up with a quick prototype to demonstrate how the user console would work under the given initial design of the architecture (see the Eucalyptus Console components layout diagram above). As soon as the prototype was evaluated and its feasibility was verified, the release team started creating the packages for two major Linux OS platforms: Ubuntu and CentOS/RHEL.

The early involvement of the release team turned out to be the best help any developer or QA engineer could ask for; from the very beginning of the development, the release team was able to provide invaluable information that served as a guardrail for the fast-moving development. Such information included advice on how the files should be named and organized, and on which dependencies should or should not be used in order to meet the requirements of various Linux distributions. Dealing with such issues at a later stage of the development would undoubtedly have been a major pain in the back-end.

Furthermore, the release team was able to ensure that the development of the new user console would never drift away from the Linux distro requirements by setting up an automated daily package-building process using Jenkins, which utilizes the VM resources from our Release cloud that runs on Eucalyptus.

Keeping Up With Eucalyptus

Setting up the automated process to build the packages allowed the release team to keep an eye on the progress of the user console’s development in terms of the ability to build the packages under the constraints set by the Linux distributions. However, it could not guarantee that the newly built packages contained a version of the user console that worked with the current, up-to-date Eucalyptus cloud, which was also in development.

Thus, the challenge was to ensure that the latest built user console packages work with the latest built Eucalyptus throughout the development.

In order to solve this issue, the QA team created a testunit that automatically installs the latest user console packages on a newly built Eucalyptus. Then, the testunit was added to the main test sequences used by the Eucalyptus 3.2 development in our automated QA system, making the installation of the latest user console packages accessible by all developers at Eucalyptus.

With this setup, a failure in the user console package installation was visible to every developer throughout the development, so failures were detected fast and reported quickly.

Screen shot 2012-12-10 at 5.50.02 AM

The testunit ui_setup can be seen in action above in the table, which displays the results of the test sequence run by the automated QA system. Check out the link below for more details on this testunit:

https://github.com/eucalyptus-qa/ui_setup

Circle of Trust

As the user console evolved out of its prototype state and took on a more product-like shape, the QA team was working in parallel, figuring out how to set up the automated testing process for the user console. The blog here talks in detail about how Selenium was used to create the automated web-browser testing tool, se34euca.

In the mid-stage of the development, as the features of the user console started functioning in a reasonably stable manner, 3 automated tests were added, incrementally, to ensure the working state of the user console throughout the development.

Those 3 tests are:

  1. user_console_view_page_test: https://github.com/eucalyptus-qa/user_console_view_page_test
  2. user_console_generate_keypair_test: https://github.com/eucalyptus-qa/user_console_generate_keypair_test
  3. user_console_launch_instance_test: https://github.com/eucalyptus-qa/user_console_launch_instance_test

These automated tests were to ask the 3 simple questions below on a daily basis:

  1. Can the user log in and see all the landing pages on the latest user console?
  2. Can the user generate a new keypair using the latest user console?
  3. Can the user launch a VM instance using the latest user console?

Of course, it would be possible, and desirable, to ask more questions in a more complicated fashion. However, during the rapid development phase, asking those 3 simple questions on a daily basis turned out to be sufficient, and effective, for understanding whether something terrible had happened to the user console.
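To make the idea concrete, here is a minimal sketch of a daily check runner in Python. It is not the code used by the Eucalyptus QA system; the function names `run_daily_checks` and `traffic_light` are hypothetical, and the three checks are stand-in lambdas where the real tests would drive a browser against the user console.

```python
def run_daily_checks(checks):
    """Run each (question, check) pair in order and record pass/fail.

    A check that raises an exception is recorded as a failure, so one broken
    test cannot hide the results of the others.
    """
    results = []
    for question, check in checks:
        try:
            passed = bool(check())
        except Exception:
            passed = False
        results.append((question, passed))
    return results


def traffic_light(results):
    """Return "green" only when every question was answered with a yes."""
    return "green" if all(passed for _, passed in results) else "red"


if __name__ == "__main__":
    # Hypothetical stand-ins; the real tests drive a browser against the console.
    checks = [
        ("Can the user log in and see all the landing pages?", lambda: True),
        ("Can the user generate a new keypair?", lambda: True),
        ("Can the user launch a VM instance?", lambda: True),
    ]
    results = run_daily_checks(checks)
    for question, passed in results:
        print("PASS" if passed else "FAIL", question)
    print(traffic_light(results))
```

The point of keeping the runner this small is that a red light only says "something is badly wrong"; finding out what is left to the people looking at it.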

The goal of these automated tests at this stage of the development was not to detect every little defect in the product. It was too soon for that.

The main purpose was rather to serve as an indicator for the developers, QA engineers, and release engineers, assuring us that the change that went into the code earlier today did not ruin the delicate trust among the three groups, meaning that the build, installation, and configuration procedures were still intact. Having such assurance checked by mechanical means made the three groups extremely effective in discovering issues during the development, since it allowed each member to narrow down exactly what was responsible for a defect within a much shorter time frame: hours, rather than days or weeks.

Guardrail For Development

Having the automated package build process and the automated installation/configuration process in place at an early stage of the development proved to be extremely useful; rather than merely agreeing on written procedures, the dev, QA, and release teams materialized such agreements into actual implementations and put them to work using various automated mechanisms that run on a daily basis. Therefore, throughout the development, we were able to witness and assure ourselves that we were making progress in accordance with the plan and our self-imposed restrictions.

Check out the Eucalyptus Open QA webpage to see the continuous integration at Eucalyptus in action:

Eucalyptus Open QA (beta) – http://ec2-50-112-61-121.us-west-2.compute.amazonaws.com/open_qa.php

TCP Dumpster: Monitoring TCP for Eucalyptus User Console

This is part III of the Eucalyptus Open QA blog series, which covers various topics on the quality assurance process for Eucalyptus’s new user console.

In this blog, we would like to share how we monitor the traffic on the user console proxy, using the Linux command ‘tcpdump’ and its rendering application ‘tcpdumpster’, to derive and understand the behaviors of users when interacting with the user console.

Background

Eucalyptus user console consists of two components: a JavaScript-based client application and a Tornado-based user console proxy. When a user is logged in, the client-side application, which runs in the user’s web browser, polls the user’s cloud resource data at a certain interval, and the user console proxy, located between the cloud and the users, relays the requests originating from the client applications.

[Figure: Eucalyptus User Console component view]

Recalling from the first blog of the series, our challenging question was, when 100+ users are logged into the Eucalyptus user consoles at the same time, would the user console proxy be able to withstand the traffic that was generated by those 100+ users? Plus, how do we ensure the user experience under such heavy load?

The answer to the questions above was provided in detail here.

The short answer is to generate 100+ user traffic using the automated open-source web-browser testing tool, Selenium, while manually evaluating the user experience on the user console.

However, prior to answering the questions above, we first needed to establish a way to quickly, yet effectively, monitor the traffic between the clients and the proxy in order to make observations on the patterns and behaviors of the traffic.

TCP Dump

‘tcpdump’ is a standard tool for monitoring TCP traffic on Linux. For instance, if the user console proxy were running on port 8888 on the machine 192.168.51.6, monitoring the traffic on port 8888 could be as simple as running the command below in a Linux terminal on 192.168.51.6:

tcpdump port 8888

This command will “dump” out the information on every packet that crosses port 8888 on the machine 192.168.51.6. However, the information generated by this command is just too overwhelming; it flies by on the terminal screen as soon as the user consoles start interacting with the proxy. There had to be a better way to render the output of the command ‘tcpdump’.

TCP Dumpster

At Eucalyptus, using the automated QA system, a new, up-to-date Eucalyptus system is constantly installed and torn down, with a life span of a day or two (check out here to see the Eucalyptus QA system in action). For this reason, we needed to come up with a quick way to set up the monitoring application on the machine where the proxy was installed. Plus, we wanted to have all necessary monitoring information displayed on a single HTML page for a quick glance, making it easier for the observer to apply intuition in understanding the big picture. As a result, ‘tcpdumpster’ was born.

Picture 96

The application ‘tcpdumpster’ runs on the same machine where the proxy is installed. It runs the Linux command “tcpdump port 8888” and parses its output into a list file. This list tracks 8 attributes of the TCP traffic:

  • Unique connections, based on IP
  • Unique connections, based on Port
  • Connection count, per second
  • Connection count, averaged over a minute
  • Connection count, in total
  • Packet length, per second
  • Packet length, averaged over a minute
  • Packet length, in total

With those 8 attributes displayed on a single HTML page, which can be accessed via http://192.168.51.6/tcpdumpster.php, we were able to make some interesting observations on the behaviors of the traffic as the user console starts interacting with the proxy.
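To illustrate the kind of parsing involved, here is a rough Python sketch that derives similar attributes from tcpdump text output. This is not the actual tcpdumpster code. It assumes tcpdump was run with the `-n` option (so addresses and ports stay numeric), it counts every matching output line as one event, and the names `summarize` and `per_minute_average` are made up for this example.

```python
import re
from collections import Counter

# Matches lines such as:
# 10:00:00.000001 IP 192.168.51.20.54321 > 192.168.51.6.8888: Flags [P.], ... length 100
LINE_RE = re.compile(
    r"^(\d{2}:\d{2}:\d{2})\.\d+ IP (\S+)\.(\d+) > (\S+)\.(\d+): .*\blength (\d+)"
)


def summarize(lines, proxy_port=8888):
    """Collect per-second and total statistics from tcpdump output lines."""
    stats = {
        "client_ips": set(),
        "client_ports": set(),
        "events_per_second": Counter(),
        "bytes_per_second": Counter(),
        "total_events": 0,
        "total_bytes": 0,
    }
    for line in lines:
        match = LINE_RE.match(line)
        if match is None:
            continue  # skip anything that is not an IPv4 packet line
        second, src_ip, src_port, dst_ip, dst_port, length = match.groups()
        # The client is whichever side of the packet is not the proxy port.
        if int(dst_port) == proxy_port:
            client_ip, client_port = src_ip, src_port
        else:
            client_ip, client_port = dst_ip, dst_port
        stats["client_ips"].add(client_ip)
        stats["client_ports"].add(client_port)
        stats["events_per_second"][second] += 1
        stats["bytes_per_second"][second] += int(length)
        stats["total_events"] += 1
        stats["total_bytes"] += int(length)
    return stats


def per_minute_average(per_second):
    """Average a per-second counter over each minute (HH:MM), dividing by 60."""
    minutes = Counter()
    for second, value in per_second.items():
        minutes[second[:5]] += value
    return {minute: total / 60.0 for minute, total in minutes.items()}
```

Feeding the function the lines read from a `tcpdump -n port 8888` process gives the unique clients, per-second counts, per-minute averages, and totals that a page like the one above could then plot.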

TCP Dumpster Examples

The graph below shows the traffic pattern over 7 minutes, generated by a user logged in to the user console.

Picture 18

Notice the first peak, which represents the log-in of the user, followed by the periodic peaks that show the polling of the cloud resource data. User actions can be seen in the blobs among the peaks.

The graph below shows the traffic pattern as more Selenium-based automated scripts are activated to simulate a large number of users.

Picture 46

The first block shows when 1 and 2 Selenium scripts are active, and the second block shows when 6 and 12 Selenium scripts are active (check out here to learn how Selenium was used). When the data is graphed as an average over a minute, the differences between the stages become more visible:

Picture 47

When graphed all together, along with the connection data, they look like this:

Picture 45

‘tcpdumpster’ turns out to be very useful when validating whether a newly written Selenium script is behaving correctly. The graph below shows a Selenium script that launches a new instance, waits until the instance is running, then terminates the instance, waits a few minutes, and repeats:

Picture 81

And, of course, ‘tcpdumpster’ is very handy when you are running a long-term test; it allows me to set up the test, go to sleep, and wake up the next day to check out the results. The graph below shows how the proxy was able to withstand constant ‘refresh’ operations from multiple connections for longer than 5 hours:

Picture 94

Now, can you guess what is going on in the graph below?

Picture 105

Check out the GitHub link below and try out ‘tcpdumpster’ on your own Eucalyptus user console proxy to find out for yourself:

https://github.com/eucalyptus/tcpdumpster

Cloud App. Design: Create a Flexible Automated Web-UI Testing Tool using Selenium

In this article, I will go over the technical details of how we at Eucalyptus used Selenium to simulate the workload of a large number of cloud users in order to ensure the quality of the user experience in the new Eucalyptus user console.

As covered in my previous blog, Eucalyptus is coming out with a new user console that is browser-based and intuitive to use, thus playing a key role in promoting cloud adoption among IT organizations and enterprises. But the challenge was to ensure that this brand-new user console would be ready to handle the real-world workload when released into the wild. The answer to this challenge was to simulate the activities of 150 cloud users using Selenium, an open source tool for automating web application testing.

How to Automate an Online User

The first step was to download Selenium IDE for Firefox. Selenium IDE is a must-have GUI tool for automating clicks and input-submits on a web application. After installing Selenium IDE on your computer, you can start Selenium IDE from Firefox’s Tools menu.

When started, Selenium IDE opens up its own separate window.

Notice the red dot in the top-right corner of Selenium IDE. When you click it, Selenium IDE will start recording all the activities you perform in the web browser: every link you click and every input you type will be recorded as a command line in Selenium IDE.

What Selenium IDE allows you to do is to replay the recorded activities, such as clicking and typing, on the browser in the exact same order that they were performed.

TIP.

But soon you will notice that when replaying on Selenium IDE, it tends to fly through all the clicks at lightning speed, so the replayed activities often result in failures: the browser and web application cannot keep up with the speed of the clicks performed by Selenium IDE.

In order to prevent such cases, you will need to manually step through the recorded activities and insert various “pause-and-check” points using the ‘waitForElementPresent’ command. For instance, when there is a command ‘click link=Delete’, I would put a ‘waitForElementPresent link=Delete’ command before the click command to ensure that the page is fully loaded and the link ‘Delete’ is indeed present on the page before allowing Selenium IDE to execute the command ‘click link=Delete’. Later I learned that for every ‘click’ command, it is always a good habit to throw in the ‘waitForElementPresent’ command.
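The same “pause-and-check” habit carries over to the exported Python scripts. Below is a minimal, library-free sketch of the pattern; the helper names `wait_for_element` and `wait_then_click` are hypothetical, and `find` stands for any function that returns the element or raises while the element is not there yet. In the Selenium Python bindings, `WebDriverWait` together with `expected_conditions` in `selenium.webdriver.support` offers this behavior ready-made.

```python
import time


def wait_for_element(find, timeout=30.0, poll=0.5):
    """Call find() repeatedly until it returns an element or the timeout expires."""
    deadline = time.monotonic() + timeout
    while True:
        try:
            element = find()
            if element is not None:
                return element
        except Exception:
            pass  # the element is not present yet; try again after a pause
        if time.monotonic() >= deadline:
            raise TimeoutError("element did not appear within %.1f seconds" % timeout)
        time.sleep(poll)


def wait_then_click(find, timeout=30.0, poll=0.5):
    """The 'waitForElementPresent' followed by 'click' pair as one call."""
    wait_for_element(find, timeout, poll).click()
```

With a WebDriver instance named `driver` (an assumption here), a call would look something like `wait_then_click(lambda: driver.find_element("link text", "Delete"))`.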

After verifying that the recorded action is repeatable via Selenium IDE at its full speed, the next step is to export the action in the Selenium Python WebDriver format.

The result of the export is a script file that describes the recorded Selenium action in Python’s unittest format.

Once you have the script exported, you can run the recorded action on a remote Selenium server without having to open up a web browser yourself. In other words, now you can simulate an online user doing the exact same recorded action on a web browser by simply running the Python script generated by Selenium IDE.

Remote Selenium Server Configuration

Before running the script, you will need to configure a machine to run a remote Selenium server, which will behave like a web client. On an Ubuntu machine, execute the following commands:

# Install Java, a virtual display, Firefox, and the Selenium Python bindings
sudo apt-get -y update
sudo apt-get -y install default-jre
sudo apt-get -y install xvfb
sudo apt-get -y install firefox
sudo apt-get -y install python-pip
sudo pip install selenium
# Start a virtual display and point this shell (and the server below) at it
Xvfb :0 -ac 2> /dev/null &
export DISPLAY=:0
# Start the Selenium server; the jar file must already be in the current directory
nohup java -jar selenium-server-standalone-2.25.0.jar &


After running the commands above, you will have an Ubuntu machine capable of running the exported Python Selenium script, which then simulates an online user opening up a Firefox browser and performing the recorded clicks.

Creating a Flexible, Reusable Testing Tool

Now, your task is to produce many exported Python Selenium scripts for all activities on the web application that will be used as building blocks for creating different user behaviors and workflows.

The first collection of Python Selenium scripts I produced was to visit every single landing page on the Eucalyptus user console. The second collection of Python Selenium scripts was to create cloud resources under the default setting. Having those two sets of Python Selenium scripts allowed me to construct complicated user interactions on the web application. For instance, with a bit of shuffling of the scripts, I could build up a user scenario where the online user would visit the keypair page, create a new keypair, visit the dashboard page, visit the security group page, create a new security group, revisit the dashboard page, and so on.

The next task was to consolidate all the scripts into one library file, getting rid of static values in the variables and breaking down the actions in the scripts into functions. Having such a unified library enables test-writers to stitch and arrange these functions together to construct whole new user scenarios as needed.

When examining the Eucalyptus user console test framework se34euca, you will see that the main library file ‘lib_euca_ui_test.py’ contains the functions that are exported from Selenium IDE, where each function describes a very specific action to perform on the web console. The files ‘testcase_*.py’ list the arrangements of those functions to form simple, or complex user behaviors. Finally, the files ‘runtest_*.py’ are the executables of those test cases that take input of the target web console environment.

Cloud Application

Now that you have a way to convert an Ubuntu machine into a Selenium server and have the Selenium test framework checked into a GitHub repository, you have a way to launch the Selenium test as a cloud instance. Using se34euca as an example, the steps are:

Step 1. Launch a cloud instance from an Ubuntu image.
Step 2. Convert the Ubuntu instance into a Selenium server by running the configuration commands above, or by running the installer in se34euca.
Step 3. Git clone se34euca.
Step 4. Run the test case of your choice.
Step 5. Terminate the instance when the test is finished.

Of course, you can easily automate steps 2, 3, and 4 to wrap the entire process into a single scripted operation. Then, with the help of a cloud infrastructure such as AWS or Eucalyptus, simulating 150 users can be as simple as launching 150 instances and running the script on each instance, feeding in the parameters as user-data.
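As a sketch of what that single scripted operation might look like, the Python below generates one user-data shell script per simulated user. The function names, the `{user}` placeholder, and the example commands are assumptions made for illustration; the actual setup and test commands would come from se34euca, and the launch call itself is left to whichever cloud tool you use.

```python
REPO_URL = "https://github.com/eucalyptus/se34euca"


def build_user_data(setup_command, test_command, repo_url=REPO_URL):
    """Wrap steps 2, 3, and 4 into one shell script suitable for user-data."""
    repo_dir = repo_url.rstrip("/").split("/")[-1]
    lines = [
        "#!/bin/bash",
        "set -e",
        setup_command,             # step 2: turn the instance into a Selenium server
        "git clone %s" % repo_url, # step 3: fetch the test framework
        "cd %s" % repo_dir,
        test_command,              # step 4: run the chosen test case
    ]
    return "\n".join(lines) + "\n"


def user_data_for_users(count, setup_command, test_template):
    """Build one script per simulated user.

    test_template is formatted with str.format, so it may contain "{user}"
    but must not contain other curly braces.
    """
    return [
        build_user_data(setup_command, test_template.format(user=index))
        for index in range(1, count + 1)
    ]


if __name__ == "__main__":
    # Placeholder commands; substitute the real installer and runtest script.
    scripts = user_data_for_users(150, "echo setup", "echo run as user{user:03d}")
    print(scripts[0])
```

Each of the 150 strings would then be passed as the user-data of one instance launch request, and step 5 (terminating the instance) can be handled by the launcher once the test finishes.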

Code Reference

For those who are interested in creating a framework for testing your own web application, please feel free to check out the Eucalyptus user console test framework se34euca for reference:

https://github.com/eucalyptus/se34euca

Leave a comment if you have any questions or suggestions.

Implication of Fragmentation in Linux

At Eucalyptus we have been proud to say,

“Eucalyptus Cloud runs on almost all major Linux distros: Ubuntu, Debian, CentOS, Red Hat, Fedora, etc. You name it, we will support it!”

The crowd erupts in cheers, with occasional tears of joy. There will be a parade later.

Yes, it sounds wonderfully majestic as it should be.

When developing open-source software, you need to support all major, stable Linux platforms so that the software can reach out to every single open-source enthusiast who is often loyal to a certain flavor of Linux.

But, what does all this mean to software developers?

A nightmare.

A nightmare accompanied with a horrible migraine.

In the ideal software utopia, all Linux platforms behave in an identical manner; you should be able to run things on Fedora in the same exact way you run those things on Ubuntu. After all, they are all “Linux”, aren’t they?

Welcome to the harsh reality called “fragmentation” in the software world.

It is true that all those distros feel like Linux; they share the same core — Linux kernel — and provide the same level of abstraction which can be described as “Linux experience.”

But, in reality, no two Linux distros are ever the same.

The biggest problem with this inconvenient truth is that no one knows for sure what the exact differences are when going from one distro to another distro.

Let’s say we use CentOS as a default Linux distro for developing and testing software. If every function and feature works well on CentOS, can we safely assume that the software will also behave nicely with other Linux distros, such as Red Hat, Ubuntu, and Debian?

The answers to this simple question always fall somewhere in between “Maybe”, “It depends”, “Possibly yes”, “Theoretically it should”, “What is Linux?”, and “NO!”

The only assured way to discover the correct answer is to run tests on the software under all distros.

At present Eucalyptus officially supports three main Linux distros (Ubuntu, Red Hat, and CentOS) and is working on adding two more: Debian and Fedora. Every time a new distro is added, we have to repeat the entire test suite for the new distro. If running the whole test suite takes X amount of resources, supporting 5 distros means 5X resources. Furthermore, we support the two latest versions of each distro; for CentOS, for instance, Eucalyptus supports versions 5 and 6. This additional requirement brings the total amount of testing resources up to 10X.

When translated into operational cost, if running a complete test suite under one Linux environment takes one day, then because of the fragmentation of Linux distros we need to add nine more days of testing in order to completely cover all corner cases in the various Linux distros.

The real-world implication of this nightmare is that whenever a little tweak goes into Eucalyptus, it might take up to 10 days, in the worst-case scenario, to assure ourselves that this seemingly innocent tweak will not bring down the house under some other Linux distro in strange, unpredictable ways.
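The arithmetic behind those numbers is simple enough to write down. The short Python sketch below (function names are made up for this example) assumes the test runs happen one after another and that each run costs the same.

```python
def total_test_runs(distro_count, versions_per_distro):
    """Each supported version of each distro needs its own full test run."""
    return distro_count * versions_per_distro


def worst_case_days(distro_count, versions_per_distro, days_per_run=1.0):
    """Total days if the runs are executed one after another."""
    return total_test_runs(distro_count, versions_per_distro) * days_per_run


if __name__ == "__main__":
    runs = total_test_runs(5, 2)     # 5 distros, 2 versions each
    days = worst_case_days(5, 2)     # one day per run
    print(runs, "runs,", days, "days,", days - 1.0, "more than a single distro")
```

Running the tests in parallel would shorten the calendar time, but the total amount of resources consumed stays at 10X.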

In school we are taught to celebrate diversity, but they often forget to emphasize the beauty in simplicity. Handling the issue of Linux fragmentation remains one of the many challenges that Eucalyptus has to overcome.

Quality Flow in Eucalyptus

When being interviewed at Eucalyptus, one is often asked, “when do you stop testing software?” This is not a trick question. As a matter of fact, this is not even a question; there is only one answer, and everyone knows it.

You never stop testing.

At Eucalyptus, we take the answer above quite literally.

Introduction

Eucalyptus has always shown a strong interest in creating an innovative, state-of-the-art software development workflow that supports four value-driven development practices: continuous integration, continuous testing, extreme automation, and open development. Based on these core principles, Eucalyptus engineering has designed an advanced quality control process, backed by a versatile automated QA system, whose function is to maintain a quality standard in Eucalyptus that is persistent and unchallengeable.

The graph below shows the overview of Eucalyptus software development workflow:

Graph 1. Quality Flow of Eucalyptus Software Development

Overview

The Eucalyptus quality control process ensures a monotonic increase in the quality of the code as it travels through quality assurance check-points, known as Quality Gates, represented as green bars in Graph 1. Each quality gate is a set of automated operations performed by Eucalyptus’s automated QA system, which manages the testing of Eucalyptus in a completely automated fashion, including bare-metal provisioning, installation and configuration of Eucalyptus, sequencing of automated tests, and providing feedback via various media.

The Eucalyptus quality control process is not just a quality assurance process; it is fully integrated into Eucalyptus software development. The process combines continuous integration and continuous testing into a unified workflow, which is powered by extreme automation via the QA system. The goal is to get rid of mundane tasks for developers so that they can focus on the task they are good at: developing a cloud infrastructure system of the highest quality.

Advantages

Guarantees in Quality

The branch #master (https://github.com/eucalyptus/eucalyptus) is the face of the Eucalyptus open development model; every Eucalyptus engineer and community member checks out from the branch #master for development, experimentation, or contribution. Thus, it is absolutely crucial that, at all times, the quality of the branch #master meets the standard of a functional cloud infrastructure system. In other words, at any given moment, a person should be able to check out from the branch #master and immediately start developing on top of the existing system without wasting time wondering whether the system has any major problems that may affect further development. It is Eucalyptus’s responsibility to provide such guarantees in quality in the open branch #master. It is such guarantees in quality that encourage the community members to actively engage with Eucalyptus, since they “defrictionalize” the merging of contributions into the main body of the code.

Commit Often and Fail Fast

In the early days, without formal guidance in place, Eucalyptus experienced its fair share of growing pains, where development branches drifted far, far away. The only known, effective solution to the “drifting development branch” problem is to have developers commit and check out the code as often as possible (continuous integration). However, the challenge in this practice is how to minimize the effort that goes into this checking-in and checking-out process. The more tedious and complicated this process is, the more a developer is inclined to delay it, causing his or her branch to drift far, far away. On the other hand, an overly simplified commit-and-checkout process will severely jeopardize the quality of the code, since such a process would likely permit unchecked mistakes to sneak into the main body of the code.

The Eucalyptus quality control process attempts to address this issue by automating the entire “code-commit to quality-validation” process with a click of a button (extreme automation). As seen on TV, a developer can “just click it and forget it” when committing the code, that is, until he or she gets a notification email indicating whether the commit has passed or failed to merge into the branch #master. The goal is to encourage developers to commit frequently by providing an easy mechanism that automatically performs quality verification on the code and returns fast feedback.

Procedures

In the Eucalyptus software development workflow, a commit from a development branch to the branch #master is a two-step process. As seen in Graph 1, there is no direct path that leads to #master. Every commit must go through the aggregation branch called #testing. When the code on a development branch (the far-left stage on the graph) is committed to #testing, leaving the hands of a developer, the automated quality assurance process (represented as “Quality Gate 2” in the graph) will determine whether the code is allowed to move to #master. If the code fails to meet the quality standard enforced by Quality Gate 2, the branch #testing will be reverted to remove the history of the failed code. The purpose of this action is to bring the branch #testing back to its last known stable state. This action allows other developers to keep committing to #testing even after previous commits in #testing fail to enter the branch #master.
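To make the two-step path concrete, here is a minimal toy sketch in Python. It is not the actual Eucalyptus QA system: the `Repo` class, the `commit_via_testing` function, and the idea of passing the gate in as a plain function are all assumptions made for illustration. Branches are modeled as lists of commit names.

```python
from dataclasses import dataclass, field
from typing import Callable, List


@dataclass
class Repo:
    """Toy model of the two branches; commits are plain strings (hypothetical)."""
    testing: List[str] = field(default_factory=list)
    master: List[str] = field(default_factory=list)


def commit_via_testing(repo: Repo, commit: str,
                       quality_gate_2: Callable[[List[str]], bool]) -> bool:
    """Push a commit to #testing, then promote it to #master or revert it.

    Returns True if the commit reached #master.
    """
    last_stable = list(repo.testing)   # remember the last known stable state
    repo.testing.append(commit)        # step 1: development branch -> #testing

    if quality_gate_2(repo.testing):   # step 2: the automated gate decides
        repo.master = list(repo.testing)
        return True

    repo.testing = last_stable         # failure: remove the failed code from #testing
    return False


# Hypothetical stand-in for the real 5-hour test run.
def toy_gate(commits: List[str]) -> bool:
    return "broken" not in commits[-1]


repo = Repo()
commit_via_testing(repo, "fix-a", toy_gate)
commit_via_testing(repo, "broken-b", toy_gate)
commit_via_testing(repo, "fix-c", toy_gate)
print(repo.master)  # ['fix-a', 'fix-c']
```

Because #testing is rolled back after "broken-b" fails, the later "fix-c" commit is judged against a clean branch and is not blocked by someone else's mistake.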

The automated tests triggered by Quality Gate 2 are set to consume about 20 QA machines running tests for 5 hours, so the cost of operating Quality Gate 2 is very high. Therefore, there is a need for a pre-commit check-up process (represented as “Quality Gate 1” in the graph above) that performs quick spot-checks on development branches. Such a preliminary check-up process aims to minimize failures caused by preventable mistakes when committing to #testing.

Challenges

1. Scale and Resources

Ideally, every commit to the branch #testing must be examined and validated independently. However, in reality, with the limited resources and the frequency of commits, it becomes impossible to process all the commits on time if they are examined one at a time. Let’s say that, in order to perform all the necessary automated tests kicked off by Quality Gate 2, it takes 20 QA machines running tests for 5 hours. Given this resource requirement, processing 5 commits individually would take 100 machines running tests for 5 hours in parallel, or the entire day if those commits end up competing for the resources. It is not difficult to see how fast the wait-line in the queue will grow for pending commits.
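The arithmetic above can be checked with a few lines of Python. The figures of 20 machines and 5 hours come from the text; the function names and the notion of a fixed machine pool are assumptions made for this sketch.

```python
import math

MACHINES_PER_RUN = 20  # QA machines consumed by one Quality Gate 2 run (from the text)
HOURS_PER_RUN = 5      # duration of one run (from the text)


def machines_for_parallel(commits: int) -> int:
    """Machines needed to validate every commit at the same time."""
    return commits * MACHINES_PER_RUN


def hours_to_drain(commits: int, pool_size: int) -> int:
    """Wall-clock hours to validate all commits individually with a fixed pool."""
    concurrent_runs = pool_size // MACHINES_PER_RUN
    if concurrent_runs == 0:
        raise ValueError("pool is too small for even a single run")
    waves = math.ceil(commits / concurrent_runs)  # runs that cannot overlap must queue
    return waves * HOURS_PER_RUN


print(machines_for_parallel(5))  # 100
print(hours_to_drain(5, 100))    # 5
print(hours_to_drain(5, 20))     # 25
```

With only a single set of 20 machines, five commits queue up for 25 hours, which is the "entire day" scenario.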

The current solution to this problem is to batch-process commits at scheduled times. Two or three times a day, Quality Gate 2 opens up to give pending commits the opportunity to “try out” for the branch #master. All the commits are bundled together and go through the automated quality assurance testing as a group. If the bundle passes, all the commits enter the branch #master. But if it fails, all the commits are rejected together, and each developer who submitted a commit is notified accordingly. By examining the test-results report attached to the notification email, the developers should communicate with each other to determine exactly what went wrong in the system and learn whose commit was responsible for the failure, which will allow the other, non-responsible developers to retry their commits.
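The all-or-nothing behavior of a batch can be sketched as follows. This is a toy model under stated assumptions: `Commit`, `process_batch`, and the returned verdict map (standing in for the notification emails) are hypothetical names, not part of the real QA system.

```python
from typing import Callable, Dict, List, NamedTuple


class Commit(NamedTuple):
    author: str
    sha: str


def process_batch(pending: List[Commit], master: List[Commit],
                  quality_gate_2: Callable[[List[Commit]], bool]) -> Dict[str, str]:
    """Validate all pending commits as one bundle.

    Returns an {author: "passed" | "failed"} map that stands in for the
    notification emails. `master` is extended in place on success.
    """
    if not pending:
        return {}
    passed = quality_gate_2(master + pending)    # one test run for the whole bundle
    if passed:
        master.extend(pending)                   # all commits enter #master together
    verdict = "passed" if passed else "failed"
    return {c.author: verdict for c in pending}  # everyone in the bundle is notified
```

Note that one bad commit makes the whole bundle fail, so the function alone cannot say who was responsible; that is exactly why the developers have to read the test report and sort it out among themselves.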

2. Quality of Automated Tests

The Eucalyptus quality control process counts on an extensive range of automated tests that cover all the essential functionality of a cloud infrastructure system. Those automated tests are designed to capture an accurate view of the target system. At the end of testing, there must be a clear “passed” or “failed” signal for each automated test with a high level of reliability. Any automated test that lacks such reliability provides no value in the Eucalyptus quality control process, since it would introduce noise when trying to examine the overall quality of the system.
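One simple way to picture an unreliable test is to run it several times against an unchanged system and see whether the results agree. The sketch below illustrates that idea only; it is an assumption of this example, not a description of how Eucalyptus measures test reliability, and `classify_test` is a hypothetical name.

```python
from typing import Callable


def classify_test(run_test: Callable[[], bool], repeats: int = 3) -> str:
    """Run the same test several times against an unchanged system.

    Returns "passed" or "failed" when every run agrees, and "unreliable"
    when the runs disagree (the kind of noise described above).
    """
    if repeats < 1:
        raise ValueError("repeats must be at least 1")
    results = {run_test() for _ in range(repeats)}
    if len(results) > 1:
        return "unreliable"
    return "passed" if results.pop() else "failed"
```

A test that flips between outcomes on the same system would land in the "unreliable" bucket and should be fixed before its signal is trusted at a quality gate.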

To make matters more complex, in agile-like development, today’s feature tests are tomorrow’s regression tests. In other words, in such a fast-paced development cycle, the automated tests written for testing new features in this release cycle will be used for detecting regressions in the next release cycle. In such a continuous testing environment, combined with agile development, there exists a close relationship between the quality of the features and the quality of the automated tests; poorly written tests for a feature would result in a poorly maintained feature, since such tests would fail to detect regressions as the development progresses. Thus, in order to develop good quality features, the Eucalyptus quality control process must provide rapid ways to:

  • Ease the maintenance of existing automated tests (for detecting regressions)
  • Shorten the time it takes newly created automated tests to mature (for validating new features)

Eucalyptus’s approach to these challenges is to develop a sophisticated testing framework, called “eutester” (https://github.com/eucalyptus/eutester), in order to enhance the automated test writing and maintenance experience.

3. Extremeness in Extreme Automation

The Eucalyptus quality control process is glued together by complete automation. The entire process is put on auto-pilot, guided by various automated pieces and modules working in concert. Each independent component should have the ability to make decisions based on its own intelligence, meaning that it knows how to configure itself, run tasks, detect failures, examine the condition of the failures, recover from the failures, gather information, communicate with other components, and so on. A collection of such intelligent components is essential to keep the quality process in motion with minimal interruptions. Now, the challenge is, how do we make it think like a QA engineer?

FYI: Eucalyptus is currently hiring talented Quality Assurance engineers who share the same vision and willpower to attack these challenges (http://www.eucalyptus.com/careers/jobs).
