In the past few weeks we've taken a closer look at the various automatic testing procedures integrated into the process of developing TYPO3's core and at the workflow patterns we use. Establishing our review system has been a long and winding road and some parts of this process have had a significant (and very positive!) effect on the main codebase, too.
Today, we'll take a look at how testing TYPO3's core began and how it evolved over the years, letting the widely used open-source software TYPO3 continually grow and develop into the powerhouse content management system it is today.
Over the years, many people have contributed towards improving the testing infrastructure of TYPO3's core. Some contributed more, some less; some for a short period of time, and some from the early days back in 2008 right up until now. The number of contributors is so large that, although I am strongly tempted to name each and every one of them, their names would literally fill an entire book! So I found myself faced with a pretty tough decision and in the end decided not to mention any names.
On behalf of our whole team I'd like to give a warm and special thank you to every single one of you. What we have accomplished over the past years, where we stand today and what we are doing to remain successful is built on what each and every one of you contributed towards TYPO3 as part of the community.
No rule without exception! There is one TYPO3 pioneer I’d like to mention.
The first “real” unit test ever added to TYPO3's core was merged into the codebase on January 20th, 2009. It was Oliver Klee who submitted it and this was his commit message:
Fixed bug #10220: Feature: Get some unit tests for the core
Back then, the core team still used SVN as its versioning system and communicated patches via mailing lists. This specific topic had been discussed in a total of 45 (!) separate emails before being fixed. It seems that these first tests didn't make it into TYPO3 version 4.2, but were released to the public with version 4.3.0.
Extbase and Fluid were introduced to the system when TYPO3 core version 4.3 was released. Running the core tests was done with a TYPO3 extension called “phpunit” that could be loaded from TER. The extension basically delivered the native PHPUnit and also added a test runner to execute the unit tests of the core (and extensions) within a module of the TYPO3 backend. Tests were not executed automatically, so contributors who did not actively run them often broke tests by mistake. Sometimes the breakage was noticed and sometimes it was not.
The first test automation was added in 2012 with TYPO3 CMS 4.5 LTS. This was the rather simple PHP lint testing (unit tests were added further along the line). Linting was executed on the platform Travis-CI. Travis-CI is still actively used by the core team to execute post-merge tests. Whenever a patch is merged into the TYPO3 core, Travis-CI executes the linter, the unit and also the functional tests.
Automatic unit testing followed two versions later. With TYPO3 4.7, our Travis-CI usage was tuned to set up a basic TYPO3 system, to load the extension PHPUnit from the TER and to execute the tests via the command line. That was a great leap forward. Executing the tests automatically allowed the core team to establish an “always green” rule: we soon required all tests to be green at any point in time for the active branches.
Meanwhile, the unit tests were already going through various refactorings. The test files were renamed and moved multiple times, and the core established proper class loading so tons of “require_once()” calls could be removed from test files. The automatic test execution encouraged developers to add further tests, and the community gathered more experience with proper testing.
With TYPO3 version 4.7, further issues with the given setup became apparent. The test execution based on the TYPO3-specific variant of PHPUnit in particular showed drawbacks: the test runner basically had to bootstrap an entire TYPO3 backend, so tests could only be executed within a fully set-up TYPO3 instance. Test execution was thus bound to the specific environment it ran in, and it required a fully working database. A local developer instance could break tests simply through its list of loaded extensions. “Works on my machine” is not good enough with unit tests: they should either fail always and everywhere, or succeed always and everywhere, regardless of any given environment details.
It took our TYPO3 core team quite a while to improve the situation. Starting with version 6.0, we began the process of restructuring the main codebase. The switch to namespaced classes in PHP led to a massive cleanup of the unit tests, too. Additionally, the former spaghetti code of the main TYPO3 bootstrap process was significantly untangled, streamlined and split into more understandable pieces.
This improved the testing infrastructure, as the bootstrap could be called in a way that was tailored to unit testing needs. More and more parts of the bootstrap were skipped and the tests were refactored accordingly.
Version 6.2 brought some big changes with it. Composer was integrated into the system and the core team was finally able to ditch the hand-rolled PHPUnit extension as a dependency and start using the vanilla version of PHPUnit. Furthermore, most former unit test dependencies, for instance relying on a working database, were dropped. Nowadays, the bootstrap of unit tests consists of just a couple of lines and no longer depends on an underlying TYPO3 instance at all.
All of these steps helped to stabilize the unit tests and put an end to the nerve-wracking “works on some machines but not on others” issue.
By changing the TYPO3 bootstrap - which had a long list of benefits, not only for testing purposes - the codebase also became able to execute functional tests in a sane way. It allowed us to create a well-defined environment, a specific playground in which each test is executed.
This was integrated with TYPO3 version 6.2, too. I recall a sprint we did at this time where a couple of people were working on the functional test environment. Suddenly one of them shouted out: “We have a dot! A very successful dot!” A single character, meaning that our first single functional test had successfully been executed. The functional test suite was quickly integrated to be executed via Travis-CI to run regularly. Nowadays, we're at 1,100 successful dots for each patch set.
Integrating unit and functional tests in a good way cost the team quite some time and tears, all while the number of tests kept growing. Acceptance tests were the last missing piece and these were started in early 2016. Some first tests clicked through the backend and verified that major components like the main menu worked as expected.
However, we had to figure out how to execute the acceptance tests in a stable way. Accustomed as we were by then to unit and functional tests, that wasn't too easy. The first iterations failed randomly, the “headless” solution “phantomjs” turned out to be less stable than it should have been, and the system showed additional issues in high-load scenarios. It took us until this spring to fix this. Since the last refactoring - just a couple of weeks ago - all of the main issues in this area seem to have finally been solved.
The first steps with Bamboo were taken in mid 2016. The reason for this step was that we had two issues that we hadn't been able to sort out in a satisfactory way. First of all, Travis-CI did not allow us to parallelize different test aspects in a good way. We had to execute each part of the testing one after another. With the long-running functional tests and the acceptance tests not being much quicker, we were faced with the situation that a single test build sometimes needed half an hour or so to finish. Merging more than a single patch within an hour then led to significant waiting times. Additionally, we were unable to do proper pre-merge testing: Travis-CI is only triggered when a patch is actually merged into the system, so we had no knowledge as to whether a merge would keep the tests green before someone actually pushed the magic merge button. Triggering Travis-CI more often (for each patch set) was unrealistic due to the long waiting times we already had.
The solution to this was Bamboo. It allowed us to split the test suite into smaller parts and execute single parts simultaneously. Pre-merge tests within Bamboo have been established along the way. Now, if Bamboo gives the green light to a patch set, we know our codebase will stay green if merged.
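To put rough numbers on that difference, here is a minimal sketch in Python. The stage names and durations are hypothetical, chosen only so that the sequential total lands near the half hour mentioned above; they are not measurements from our Travis-CI or Bamboo setup, and the sketch assumes every stage can start at the same time on its own agent.

```python
# Hypothetical stage durations in minutes (illustrative values, not measured data).
stages = {"lint": 2, "unit": 5, "functional": 14, "acceptance": 10}


def sequential_minutes(durations):
    """Wall-clock time when each stage waits for the previous one to finish."""
    return sum(durations.values())


def parallel_minutes(durations):
    """Wall-clock time when all stages start at once on separate agents."""
    return max(durations.values())


print(sequential_minutes(stages))  # 31
print(parallel_minutes(stages))    # 14
```

Run in parallel, the build is only as slow as its slowest part, which is what makes testing every single patch set before the merge realistic.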
I hope this blog post was able to give you a useful and interesting insight into the processes the TYPO3 core team, the contributors and the codebase have gone through over the years to establish the system as it is today with its well-working test environment. This success is the result of a collaborative effort and I'd like to reach out again to everyone who contributed their time and their skills with a big “Thank you!” It has been a long road, and what we've achieved exceeds anything any of us would have dreamed possible eight years ago.
We have one more blog post in this series, in which we'll be taking a look at which areas of TYPO3's core testing still need improvement. At the end of the day it all boils down to software and software is never perfect. And there is always room for improvement.