Software testing tasks can be laborious and time-consuming to do manually. Nowadays, anyone taking their business seriously has automatic testing integrated into the development process. There are tons of good reasons to do so. Developers cottoned on to the advantages of test automation pretty early, and now a growing number of marketers and customers are following this approach as well, improving their software products and increasing their sales.
This post is the first of a few that we plan to publish here in the next couple of weeks. We'll be dealing with the multiple stages of testing and reviewing that changes (i.e. patches) to TYPO3's core have to pass before being merged into the system.
To begin with, these posts will not cover the question of why test automation is a good thing in the first place. There are lots of really good reads on the net to answer that. Here, we'll focus on the workflow patterns we use for testing TYPO3's core and also give an overview of how the core team handles different testing demands in their daily work routine.
This first post lays the foundation for the ones to follow, and as there is so much to tell, it will be slightly longer than those to come. Bear with us: future posts will be shorter reads!
Software testing has various levels, typically somewhere between three and five. The TYPO3 core is both a ready-made product and a framework providing APIs. Because of these APIs, some parts of the testing have been adapted. So although our testing is fairly similar to regular procedures, it is not done fully by the book. Our process of testing TYPO3's core consists of four different steps. We will focus on each of them in a separate blog post in the near future, so stay tuned! The specific steps are:
- Unit tests
- Functional tests
Testing bigger parts of a code section including preparing and checking the state of the underlying database before and after operations. The TYPO3 v8 core currently comes with roughly 1100 functional tests.
- Acceptance tests
Remote-controlling a browser to click through parts of the application. The TYPO3 v8 core currently comes with about 70 acceptance tests.
- Integrity tests
Various checks to verify the integrity of the system on different levels, for instance uniqueness of exceptions, a CGL checker and a PHP linter. Six different types of such tests are currently in place.
The total number of single tests in the TYPO3 core adds up to more than 10,000. On top of that, the core supports multiple platforms. Core version 8 supports PHP 7.0 and PHP 7.1 as well as MySQL and PostgreSQL, and there's more in the pipeline. To verify that the different combinations work, it makes sense to run all unit tests on all currently supported PHP versions and to run the functional tests on all supported database platforms. At the time of writing, these tasks add up to the incredible number of more than 56,000 single tests in total.
Developers expect and need quick feedback from the test runs as to whether a patch is clean or not. Running all acceptance tests takes roughly 20 minutes, and running the functional tests on a poorly optimized platform can easily take more than an hour. This is way too much.
Testing must be convenient and quick, otherwise developers are easily tempted to simply skip this step. Being blocked by testing is such a bore.
The core team requires every single patch to be “green” (meaning that all tests are ok) before allowing it to be merged into the core. Otherwise, whoever merges a patch would constantly have to fix test failures after the fact, which again is inconvenient and carries the risk of testing not being done thoroughly or the test suite constantly being “red”.
The fact is that the entire test suite must be run very often and needs to be executed as quickly as possible. Each and every patch version that is pushed to the review system goes through every step of this process. During highly active phases of core development - for instance at TYPO3 code sprints - dozens of patch sets are pushed to the review system every single hour. The core team requires that every single one of them runs through the entire test suite. Only then, and only if the results of these tests are positive, does a patch have a chance of being merged into TYPO3's core. This last step is done manually: a core team member takes a close look at the results of the review process and, if all is well, merges the patch.
The only way of meeting the demands “handle lots of single runs” and “give us feedback quickly” at the same time is to split the test suite into small pieces and run the resulting processes simultaneously.
The solution is to “throw more hardware at the problem”, which is a fairly common approach in the IT industry.
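To make the idea of splitting a suite into small pieces more concrete, here is a minimal Python sketch. It is not the script the core team uses; the function name `split_into_chunks` and the test file names are made up for illustration, and the round-robin distribution is just one simple way to get chunks of nearly equal size.

```python
def split_into_chunks(tests, chunk_count):
    """Distribute tests round-robin over chunk_count chunks.

    Sorting first makes the split deterministic, so every parallel job
    can compute its own chunk independently and get the same result.
    """
    if chunk_count < 1:
        raise ValueError("chunk_count must be at least 1")
    chunks = [[] for _ in range(chunk_count)]
    for index, test in enumerate(sorted(tests)):
        chunks[index % chunk_count].append(test)
    return chunks


# Hypothetical test file names, for illustration only.
example_tests = [f"Functional/Test{number:02d}.php" for number in range(25)]
example_chunks = split_into_chunks(example_tests, 10)

# Each job would then run only "its" chunk.
for job_number, chunk in enumerate(example_chunks, start=1):
    print(f"job {job_number}: {len(chunk)} test files")
```

Because every test ends up in exactly one chunk, running all chunks covers the whole suite, while the wall-clock time is roughly that of the slowest chunk.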
When the TYPO3 GmbH was created in mid to late 2016, we quickly figured out that Atlassian's continuous integration solution Bamboo could fulfill TYPO3's core development requirements. So we started building and maintaining that system for the core team.
In Bamboo, each actively maintained TYPO3 core branch (currently v7 LTS, v8 LTS and master) has a dedicated test plan. Each plan consists of a series of jobs. A typical job performs the following four tasks: fetch the correct core Git branch, apply the patch under test, run composer install, and run the tests. Bamboo tracks the output of each job for a single plan build and marks the build as failed if a single task fails, or as green if everything runs smoothly.
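The relationship between tasks, jobs and the build result can be sketched in a few lines of Python. This is only a model of the rule described above, not Bamboo's API: the `Task` class and the functions `run_job` and `build_status` are hypothetical names, and stopping a job at its first failing task is an assumption of this sketch.

```python
from dataclasses import dataclass
from typing import Callable, Iterable, List


@dataclass
class Task:
    name: str
    run: Callable[[], bool]  # returns True on success


def run_job(tasks: List[Task]) -> bool:
    """Run the tasks in order; the job fails as soon as one task fails."""
    for task in tasks:
        if not task.run():
            return False
    return True


def build_status(job_results: Iterable[bool]) -> str:
    """A build is green only if every job succeeded."""
    return "green" if all(job_results) else "failed"


def make_job(tests_pass: bool) -> List[Task]:
    # The four tasks of a typical job; the lambdas stand in for real work.
    return [
        Task("fetch core git branch", lambda: True),
        Task("apply patch", lambda: True),
        Task("composer install", lambda: True),
        Task("run tests", lambda: tests_pass),
    ]


clean_build = build_status([run_job(make_job(True)), run_job(make_job(True))])
broken_build = build_status([run_job(make_job(True)), run_job(make_job(False))])
print(clean_build, broken_build)
```

The point of the model is that a single failing task anywhere is enough to turn the whole build red.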
For the parallelization, Bamboo comes with so-called “remote agents”, which live on servers and run a Java application that communicates with a master instance. A queue manager hands single jobs over to an agent and tracks the result. The agents neither know about each other nor communicate with each other.
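The queue-and-agent pattern can be illustrated with Python's standard `queue` and `threading` modules. This is a toy model under stated assumptions, not how Bamboo is implemented: threads stand in for remote agents on separate servers, and the names `agent`, `dispatch` and the job labels are invented for the example.

```python
import queue
import threading


def agent(name, job_queue, results, lock):
    """Take jobs from the shared queue until it is empty.

    An agent only talks to the queue and the result store,
    never to another agent.
    """
    while True:
        try:
            job_name, job = job_queue.get_nowait()
        except queue.Empty:
            return
        outcome = job()
        with lock:
            results[job_name] = (name, outcome)


def dispatch(jobs, agent_count):
    """Queue all jobs, start the agents and collect the results."""
    job_queue = queue.Queue()
    for item in jobs.items():
        job_queue.put(item)
    results = {}
    lock = threading.Lock()
    threads = [
        threading.Thread(target=agent, args=(f"agent-{i}", job_queue, results, lock))
        for i in range(agent_count)
    ]
    for thread in threads:
        thread.start()
    for thread in threads:
        thread.join()
    return results


# Hypothetical jobs; each callable returns True if the job passed.
example_jobs = {
    "unit tests": lambda: True,
    "functional tests 1/10": lambda: True,
    "php lint": lambda: False,
}
example_results = dispatch(example_jobs, agent_count=2)
for job_name, (agent_name, passed) in sorted(example_results.items()):
    print(job_name, "ran on", agent_name, "->", "ok" if passed else "failed")
```

Which agent picks up which job is not fixed in advance; it depends on which agent is free at that moment.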
Bamboo supports testing with local agent caches. For instance, an agent holds a copy of the main core Git repository, which is hundreds of megabytes in size, so it does not need to be cloned over the network for each job. Bamboo also takes care of removing test scenarios after test execution, which means that there are no leftovers for the next job and everything is tidily cleaned up. A test plan is currently split into the following jobs:
- 1 * validate integrity of composer.json file
- 1 * execute all unit tests with PHP 7.0
- 1 * execute all unit tests with PHP 7.1
- 2 * execute all unit tests in random order with PHP 7.0 - to find nasty side effects between tests
- 2 * execute all unit tests in random order with PHP 7.1
- 10 * execute functional tests on PHP 7.1 with MySQL - each job takes care of ~1/10th of the tests
- 10 * execute functional tests on PHP 7.0 with PostgreSQL - again split into 10 sections
- 8 * execute acceptance tests on MySQL - each job executes 1/8th of the tests
- 1 * execute linter on all PHP files with PHP 7.0
- 1 * execute linter on all PHP files with PHP 7.1
- 1 * execute a script to find violations to coding guidelines in given patch
- 1 * execute various integrity test scripts
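Adding up the multipliers in this list gives the number of jobs that make up a single build. The short Python snippet below just does that arithmetic; the dictionary keys are shorthand labels chosen for this sketch, not the actual Bamboo job names.

```python
# Job multipliers copied from the list above; the keys are shorthand
# labels for this sketch only.
JOBS_PER_BUILD = {
    "validate composer.json": 1,
    "unit tests, PHP 7.0": 1,
    "unit tests, PHP 7.1": 1,
    "unit tests in random order, PHP 7.0": 2,
    "unit tests in random order, PHP 7.1": 2,
    "functional tests, PHP 7.1 with MySQL": 10,
    "functional tests, PHP 7.0 with PostgreSQL": 10,
    "acceptance tests on MySQL": 8,
    "lint PHP files, PHP 7.0": 1,
    "lint PHP files, PHP 7.1": 1,
    "coding guidelines check": 1,
    "integrity test scripts": 1,
}

total_jobs = sum(JOBS_PER_BUILD.values())
print(total_jobs)  # 39
```

So every patch set pushed for review results in 39 jobs that can be distributed across the agents.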
As TYPO3 comes with a platform promise, the number of jobs for a supported branch will increase over time. For instance, we'll add jobs for Microsoft SQL Server or PHP 7.2 if they are added to the official list of supported platforms, but we would never drop PHP 7.0 from TYPO3 v8, as we promised that it will be supported until the end-of-life of v8 in three years' time.
Splitting the test plan into so many jobs and executing them simultaneously currently gives us a build duration of merely ~6 minutes. This means that the final test result of any patch is available just a few minutes after a patch has been pushed for review.
The big advantage of this review process is that both the developer and the core team know whether a patch is clean before it is merged into the system.
In the course of the review process, a developer builds a patch, runs it through the automatic testing, checks the results and - if necessary - repeats this procedure until the patch's test build is green. After this step is completed and before the patch is merged, a second and a third verification are always required: two further developers check the patch and vote for it if they are satisfied with the results. Naturally, a contributor raises his or her chances of getting these additional verifications if the patch's test build is green.
Quite a bit of hardware is needed for handling these tests. In next week's post we'll be digging into the hardware we use to keep things running smoothly and swiftly. So stay tuned and be well!