Quality Flow in Eucalyptus
When being interviewed at Eucalyptus, one is often asked, “when do you stop testing software?” This is not a trick question. As a matter of fact, this is not even a question; there is only one answer, and everyone knows it.
You never stop testing.
At Eucalyptus, we take that answer quite literally.
Eucalyptus has always had a strong interest in creating an innovative, state-of-the-art software development workflow that supports four value-driven development practices: continuous integration, continuous testing, extreme automation, and open development. Based on these core principles, Eucalyptus engineering has designed an advanced quality control process, backed by a versatile automated QA system, whose function is to maintain a consistent and uncompromising quality standard in Eucalyptus.
The graph below shows an overview of the Eucalyptus software development workflow:
Graph 1. Quality Flow of Eucalyptus Software Development
The Eucalyptus quality control process ensures that the quality of the code increases monotonically as it travels through quality assurance checkpoints, known as Quality Gates, represented as green bars in Graph 1. Each quality gate is a set of automated operations performed by Eucalyptus’s automated QA system, which manages testing of Eucalyptus in a completely automated fashion, including bare-metal provisioning, installation and configuration of Eucalyptus, sequencing of automated tests, and providing feedback via various media.
The Eucalyptus quality control process is not just a quality assurance process; it is fully integrated into Eucalyptus software development. The process combines continuous integration and continuous testing into a unified workflow, which is powered by extreme automation via the QA system. The goal is to take mundane tasks off developers' hands so that they can focus on what they are good at: developing a cloud infrastructure system of the highest quality.
Guarantees in Quality
The branch #master (https://github.com/eucalyptus/eucalyptus) is the face of Eucalyptus’s open development model; every Eucalyptus engineer and community member checks out from the branch #master for development, experimentation, or contribution. Thus, it is absolutely crucial that, at all times, the branch #master meets the standard of a functional cloud infrastructure system. In other words, at any given moment, a person should be able to check out from the branch #master and immediately start developing on top of the existing system without wasting time wondering whether the system has any major problems that may affect further development. It is Eucalyptus’s responsibility to provide such guarantees in quality in the open branch #master. These guarantees encourage community members to actively engage with Eucalyptus, since they “defrictionalize” the merging of contributions into the main body of the code.
Commit Often and Fail Fast
In the early days, without formal guidance in place, Eucalyptus experienced its fair share of growing pains as development branches drifted far, far away. The only known, effective solution to the “drifting development branch” problem is to have developers commit and check out the code as often as possible (continuous integration). The challenge in this practice, however, is how to minimize the effort that goes into checking in and checking out. The more tedious and complicated this process is, the more a developer is inclined to delay it, causing his or her branch to drift far, far away. On the other hand, an overly simplified commit-and-checkout process will severely jeopardize the quality of the code, since such a process would likely permit unchecked mistakes to sneak into the main body of the code.
The Eucalyptus quality control process attempts to address this issue by automating the entire “code-commit to quality-validation” process behind the click of a button (extreme automation). As seen on TV, a developer can “just click it and forget it” when committing the code, at least until he or she gets a notification email indicating whether the commit has passed or failed to merge into the branch #master. The goal is to encourage developers to commit frequently by providing an easy mechanism that automatically performs quality verification on the code and returns fast feedback.
In the Eucalyptus software development workflow, a commit from a development branch to the branch #master is a two-step process. As seen in Graph 1, there is no direct path that leads to #master. Every commit must go through the aggregation branch called #testing. When the code on a development branch (the far-left stage on the graph) is committed to #testing and leaves the developer's hands, the automated quality assurance process, represented as “Quality Gate 2” in the graph, determines whether the code is allowed to move to #master. If the code fails to meet the quality standard enforced by Quality Gate 2, the branch #testing is reverted to remove the history of the failed code. The purpose of this action is to bring the branch #testing back to its last known stable state. This allows other developers to keep committing to #testing even after previous commits in #testing fail to enter the branch #master.
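The promote-or-revert behavior described above can be sketched in a few lines of Python. This is only an illustration, not the actual Eucalyptus QA system: the `Branches` class, the `run_quality_gate_2` function, and the `gate_passes` callback are hypothetical names, and the sketch assumes that the "last known stable state" of #testing is simply the current content of #master.

```python
from dataclasses import dataclass, field
from typing import Callable, List


@dataclass
class Branches:
    """Toy model: each branch is an ordered list of commit identifiers."""
    master: List[str] = field(default_factory=list)
    testing: List[str] = field(default_factory=list)


def run_quality_gate_2(
    branches: Branches,
    gate_passes: Callable[[List[str]], bool],
) -> bool:
    """Promote #testing to #master on success, otherwise revert #testing.

    `gate_passes` stands in for the full automated test run (hypothetical).
    """
    if gate_passes(branches.testing):
        # Everything currently on #testing becomes the new #master.
        branches.master = list(branches.testing)
        return True
    # Failure: drop the failed history so #testing is stable again
    # (assumed here to mean "identical to #master").
    branches.testing = list(branches.master)
    return False
```

Because a failed run resets #testing, the next developer who commits starts from a stable base rather than from someone else's broken code.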
The automated tests triggered by Quality Gate 2 are set to consume about 20 QA machines running tests for 5 hours, so the cost of operating Quality Gate 2 is very high. Therefore, there is a need for a pre-commit check-up process, represented as “Quality Gate 1” in the graph above, that performs quick spot-checks on development branches. Such a preliminary check-up aims to minimize failures caused by preventable mistakes when committing to #testing.
1. Scale and Resources
Ideally, every commit to the branch #testing would be examined and validated independently. In reality, with limited resources and frequent commits, it becomes impossible to process all the commits on time if they are examined one at a time. Suppose the automated tests kicked off by Quality Gate 2 take 20 QA machines running for 5 hours. Given this resource requirement, processing 5 commits individually would take 100 machines running tests for 5 hours in parallel, or about 25 hours (more than an entire day) if those commits end up competing for the same 20 machines. It is not difficult to see how fast the queue of pending commits will grow.
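The arithmetic above can be checked with a small Python sketch. The two constants come from the figures quoted in the text; the function names and the idea of a fixed machine pool (`pool_machines`) are assumptions made for illustration only.

```python
import math

MACHINES_PER_RUN = 20  # machines used by one Quality Gate 2 run (from the text)
HOURS_PER_RUN = 5      # duration of one Quality Gate 2 run (from the text)


def machines_for_parallel_runs(commits: int) -> int:
    """Machines needed to validate every commit at the same time."""
    return commits * MACHINES_PER_RUN


def hours_to_drain_queue(commits: int, pool_machines: int) -> int:
    """Hours needed to validate every commit one by one on a fixed pool."""
    concurrent_runs = pool_machines // MACHINES_PER_RUN
    if concurrent_runs == 0:
        raise ValueError("pool is too small for even a single run")
    # Commits are processed in rounds of `concurrent_runs` at a time.
    rounds = math.ceil(commits / concurrent_runs)
    return rounds * HOURS_PER_RUN
```

For 5 commits, `machines_for_parallel_runs(5)` gives 100 machines, while `hours_to_drain_queue(5, pool_machines=20)` gives 25 hours, which matches the two extremes described above.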
The current solution to this problem is to batch-process commits at scheduled times. Two or three times a day, Quality Gate 2 opens up to give pending commits the opportunity to “try out” for the branch #master. All the pending commits are bundled together and go through the automated quality assurance testing as a group. If the group passes, all the commits enter the branch #master. If it fails, all the commits are rejected together, and each developer who submitted a commit is notified accordingly. By examining the test-results report attached to the notification email, the developers should communicate with each other to determine exactly what went wrong in the system and whose commit was responsible for the failure, so that the other developers, whose commits were not at fault, can retry.
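A minimal sketch of this all-or-nothing batch rule is shown below. It is not the real QA system: `Commit`, `process_batch`, and the `gate` callback are hypothetical names, the gate is reduced to a single pass/fail function, and email notification is modeled as a list of (author, message) pairs.

```python
from dataclasses import dataclass
from typing import Callable, List, Sequence, Tuple


@dataclass(frozen=True)
class Commit:
    sha: str
    author: str


def process_batch(
    master: Sequence[Commit],
    pending: Sequence[Commit],
    gate: Callable[[List[Commit]], bool],
) -> Tuple[List[Commit], List[Tuple[str, str]]]:
    """Test all pending commits as one group; accept or reject them together."""
    candidate = list(master) + list(pending)
    passed = gate(candidate)  # stands in for the multi-hour automated test run
    new_master = candidate if passed else list(master)
    verdict = "passed" if passed else "failed"
    # Every submitter hears about the outcome of the whole batch.
    notifications = [
        (c.author, f"batch {verdict}; your commit {c.sha} was part of it")
        for c in pending
    ]
    return new_master, notifications
```

The trade-off is visible in the sketch: one test run covers any number of commits, but a single bad commit rejects the whole batch, which is why the developers then have to work out together whose change caused the failure.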
2. Quality of Automated Tests
The Eucalyptus quality control process counts on an extensive range of automated tests that cover all the essential functionality of a cloud infrastructure system. Those automated tests are designed to capture an accurate view of the target system. At the end of testing, each automated test must give a clear “passed” or “failed” signal with a high level of reliability. Any automated test that lacks such reliability provides no value to the Eucalyptus quality control process, since it would introduce noise when trying to examine the overall quality of the system.
To make matters more complex, in agile-like development, today’s feature tests are tomorrow’s regression tests. In other words, in such a fast-paced development cycle, the automated tests written for testing new features in this release cycle will be used for detecting regressions in the next release cycle. In such a continuous testing environment, combined with agile development, there is a close relationship between the quality of the features and the quality of the automated tests; poorly written tests for a feature result in a poorly maintained feature, since such tests fail to detect regressions as development progresses. Thus, in order to develop good quality features, the Eucalyptus quality control process must provide rapid ways to:
- Ease the maintenance of existing automated tests (for detecting regressions)
- Shorten the time newly created automated tests take to mature (for validating new features)
Eucalyptus’s approach to these challenges is to develop a sophisticated testing framework, called “eutester” (https://github.com/eucalyptus/eutester), in order to enhance the automated test writing and maintenance experience.
3. Extremeness in Extreme Automation
The Eucalyptus quality control process is glued together by complete automation. The entire process is put on auto-pilot, guided by various automated pieces and modules working in concert. Each independent component should have the ability to make decisions based on its own intelligence, meaning that it knows how to configure itself, run tasks, detect failures, examine the conditions of the failures, recover from them, gather information, communicate with other components, and so on. A collection of such intelligent components is essential to keep the quality process in motion with minimal interruptions. Now, the challenge is, how do we make it think like a QA engineer?
FYI: Eucalyptus is currently hiring talented Quality Assurance engineers who share the same vision and have the willpower to attack these challenges (http://www.eucalyptus.com/careers/jobs).