Automated Testing for CI/CD

The aim of continuous integration and deployment (CI/CD) is to enable development teams to frequently deliver working software to users, thereby both delivering value and getting useful feedback on how their product is used in the real world. Many organizations have embraced the DevOps practice as a way to keep up with the competition.

However, the business pressure to deliver faster shouldn’t diminish the quality of what is being produced. After all, your users expect stable and working software, even when they’re clamouring for the next shiny thing. That’s why a reliable and thorough testing process that gives you confidence in your latest build is essential to your continuous integration and delivery practice.

The case for automated testing

Testing is essential to assuring the quality of software and has long formed part of the software development process. In a waterfall context, the manual testing or QA stage took place after code had been developed and integrated, and the purpose was to check whether the application behaved as per the specification.

This linear approach slows down the release process and means your developers cannot check that what they have built works as intended until much further down the line, when a lot more time has been spent building on top of it. By contrast, a CI/CD process enables an agile approach of short iterative cycles that provide rapid feedback and allow updates to be released little and often. A key part of these short, iterative cycles is testing to automatically validate that the new code works and has not broken anything else.

Continuous integration involves committing code changes to master or trunk regularly, triggering a build if applicable and testing the software each time. To really reap the benefits of CI, your team members should aim to commit changes at least daily. However, even on a small team, doing this level of testing manually would place considerable demand on testers and involve very repetitive work. This is where automated testing comes in.

Automation is ideal for repetitive tasks and produces more consistent results than manual testing, as people inevitably risk missing details or performing tests inconsistently when asked to do the same steps over and over again.

As well as being faster than running the equivalent tests manually, automated tests can be run in parallel, so you can scale up your testing (as far as your infrastructure allows) when time is of the essence. Although writing automated tests involves an upfront investment of time, it soon pays for itself when your team members are committing changes regularly and releasing to production much more frequently.
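
As a rough illustration of running independent checks in parallel, the following Python sketch uses the standard library's thread pool. The check functions and their names are hypothetical stand-ins for real tests, and in practice a test runner or CI tool would usually handle the parallelism for you.

```python
from concurrent.futures import ThreadPoolExecutor


def check_addition():
    # Hypothetical stand-in for a real automated test.
    return 1 + 1 == 2


def check_upper():
    return "ci".upper() == "CI"


def check_sorted():
    return sorted([3, 1, 2]) == [1, 2, 3]


def run_in_parallel(checks, max_workers=4):
    """Run independent checks concurrently and map each name to pass/fail."""
    with ThreadPoolExecutor(max_workers=max_workers) as pool:
        futures = {check.__name__: pool.submit(check) for check in checks}
        return {name: future.result() for name, future in futures.items()}


results = run_in_parallel([check_addition, check_upper, check_sorted])
```

This only works safely when the checks do not share state, which is one reason to keep automated tests independent of each other.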

While automated testing cuts out a lot of boring, repetitive tasks, it doesn’t make your testers redundant. As well as defining and prioritizing test cases, testers are involved in writing automated tests, often in collaboration with developers. Testers are also needed for tests that cannot be automated, as we’ll discuss later.

Where does testing fit into the CI/CD process?

The answer is that testing takes place at multiple stages throughout the pipeline. If you’re new to continuous integration and deployment this may sound like overkill, but CI/CD is all about tight feedback loops that allow your team to find out about problems as early as possible.

It’s much easier to fix an issue soon after it’s been introduced, as this avoids more code being written on top of a bad foundation. It’s also more efficient for your team to make changes before they move on to the next thing and lose their context.

Many automated testing tools support integration with CI/CD tools, so you can feed the test data into the pipeline and run the tests in stages, with results provided after each step. Depending on your CI tool, you can also choose whether to move a build to the next stage based on the outcome of the tests in the previous step.

To get the most out of your pipeline, it generally makes sense to order your tests so that the fastest tests run first. This gives you feedback sooner and makes for more efficient use of test environments, as you can ensure initial tests have passed before running longer, more involved tests.
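
The idea of running the fastest tests first and only promoting a build when the previous stage passes can be sketched in a few lines of Python. The `Stage` class, the stage names and the durations below are assumptions made for illustration, not the configuration model of any particular CI tool.

```python
from dataclasses import dataclass
from typing import Callable, List


@dataclass
class Stage:
    name: str
    typical_seconds: float  # assumed average duration from earlier runs
    run: Callable[[], bool]  # returns True when every test in the stage passes


def run_pipeline(stages: List[Stage]) -> List[str]:
    """Run stages fastest first and stop at the first failure."""
    log = []
    for stage in sorted(stages, key=lambda s: s.typical_seconds):
        passed = stage.run()
        log.append(f"{stage.name}: {'passed' if passed else 'failed'}")
        if not passed:
            break  # later, slower stages are skipped
    return log


stages = [
    Stage("end-to-end", 600.0, lambda: True),
    Stage("unit", 20.0, lambda: True),
    Stage("integration", 120.0, lambda: False),
]
log = run_pipeline(stages)
```

In this example the end-to-end stage never starts, because the faster integration stage has already reported a failure.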

When considering how to prioritize both creation and running of automated tests, it’s helpful to think in terms of the testing pyramid.

Building a testing pyramid

The testing pyramid is a handy way of conceptualizing how to prioritize tests in a CI/CD pipeline, both in terms of relative number and the order in which they are performed. Originally defined by Mike Cohn, the testing pyramid shows unit tests at the bottom, service tests in the middle and UI tests at the top.

Although the layer names may not match the test types you use (the sections below refer to integration and end-to-end tests rather than service and UI tests), the premise is sound: start with a strong foundation of automated unit tests that are quick and simple to run, before progressing to tests that are both more complex to write and take longer to run, and finish with a small number of the most complex tests. What types of tests should you consider? Let’s explore the options.

Unit tests

Unit tests rightly form the basis of the testing pyramid. These tests are designed to ensure your code works as you expect by addressing the smallest possible unit of behavior. For teams that have decided to invest in writing unit tests, developers typically take responsibility for writing them as they write the code. That follows automatically when practicing test-driven development (TDD), but TDD is not a requirement for writing unit tests.
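
As a minimal sketch of what such tests can look like, the example below uses Python's built-in `unittest` module. The `apply_discount` function is a hypothetical piece of application code invented for this illustration.

```python
import unittest


def apply_discount(price, percent):
    """Hypothetical function under test: reduce price by a percentage."""
    if not 0 <= percent <= 100:
        raise ValueError("percent must be between 0 and 100")
    return round(price * (1 - percent / 100), 2)


class ApplyDiscountTests(unittest.TestCase):
    def test_reduces_price_by_given_percentage(self):
        self.assertEqual(apply_discount(200.0, 25), 150.0)

    def test_zero_percent_leaves_price_unchanged(self):
        self.assertEqual(apply_discount(80.0, 0), 80.0)

    def test_rejects_percent_above_100(self):
        with self.assertRaises(ValueError):
            apply_discount(80.0, 101)
```

Saved in a file, these tests can be run with `python -m unittest`. Each test checks a single behavior and has a descriptive name, so a failure points directly at what went wrong.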

If you’re working on an existing system and haven’t previously invested in unit tests, writing them for your entire codebase from scratch can seem like an insurmountable barrier. While wide coverage with unit tests is recommended, you can start with whatever you have and add to it over time. A realistic strategy can be to add unit tests to any piece of code you touch, thereby ensuring all new code is covered and prioritizing existing code based on what you interact with while developing.

Integration tests

With integration tests, you ensure that multiple parts of your software interact with each other as expected, such as the interaction between some application code and a database. It can be helpful to subdivide integration tests into broad and narrow. With narrow integration tests, the interaction with another module is tested using a test double rather than the actual module, whereas broad integration tests use the actual component or service.

Depending on the complexity of your project and the number of internal and external services involved, you may want to write a layer of narrow integration tests which will run more quickly than broad integration tests (as they don’t require other parts of the system to be available), and follow these with a set of broad integration tests, potentially targeting higher priority areas of your system.
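
A narrow integration test might look something like the following sketch, which replaces the database layer with a test double from Python's `unittest.mock`. The `SignupService` class and the `find_by_email` and `save` methods are hypothetical names chosen for the example.

```python
import unittest
from unittest.mock import Mock


class SignupService:
    """Hypothetical application code that depends on a user store."""

    def __init__(self, user_store):
        self.user_store = user_store

    def register(self, email):
        if self.user_store.find_by_email(email) is not None:
            return False  # address already registered
        self.user_store.save({"email": email})
        return True


class SignupServiceNarrowTests(unittest.TestCase):
    def test_saves_new_user_when_email_is_unknown(self):
        store = Mock()  # test double standing in for the real database layer
        store.find_by_email.return_value = None

        self.assertTrue(SignupService(store).register("a@example.com"))
        store.save.assert_called_once_with({"email": "a@example.com"})

    def test_does_not_save_when_email_already_exists(self):
        store = Mock()
        store.find_by_email.return_value = {"email": "a@example.com"}

        self.assertFalse(SignupService(store).register("a@example.com"))
        store.save.assert_not_called()
```

A broad integration test of the same behavior would pass a real store, backed by a real database, to `SignupService` instead of the `Mock`.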

End-to-end tests

Also known as full-stack tests, end-to-end tests look at the entire application. While these tests can be run through the GUI, they don’t have to be; an API call can also exercise multiple parts of the system (although APIs can also be tested with integration tests). The testing pyramid recommends having a smaller number of these tests, not only because they take longer to run but also because they tend to be brittle.

Any change to the user interface can break tests that run through the GUI, resulting in unhelpful noise in your test results and time spent updating the tests. It pays to design end-to-end tests carefully, and with an understanding of what has already been covered by lower-level tests, so that they provide the greatest value.

Performance tests

Although the testing pyramid makes no reference to performance tests, it is worth considering including them in your automated test suite, particularly for products where stability and speed are a key requirement.

Under the general heading of performance tests comes a range of testing strategies designed to check how your software will behave in a live environment. Load testing checks how the system behaves when demand increases, while stress testing deliberately exceeds expected usage and soak (or endurance) testing measures performance under a continued high load.

With these types of testing the aim is not just to confirm that the software will cope with defined parameters, but also to test how it behaves when those parameters are exceeded, ideally failing gracefully rather than crashing out in flames.
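
To give a feel for the mechanics, here is a very small load test sketch in Python. The `handle_request` function is a hypothetical placeholder for a call to the system under test, and the 500 ms budget is an arbitrary figure chosen for illustration. Dedicated load testing tools offer far more than this.

```python
import time
from concurrent.futures import ThreadPoolExecutor


def handle_request():
    """Hypothetical operation under test; replace with a real call."""
    time.sleep(0.001)


def timed_call(_):
    start = time.perf_counter()
    handle_request()
    return time.perf_counter() - start


def load_test(total_requests, concurrency):
    """Issue requests concurrently and return the sorted latencies in seconds."""
    with ThreadPoolExecutor(max_workers=concurrency) as pool:
        return sorted(pool.map(timed_call, range(total_requests)))


def percentile(sorted_values, fraction):
    """Simple index-based percentile on an already sorted list."""
    index = min(len(sorted_values) - 1, int(len(sorted_values) * fraction))
    return sorted_values[index]


latencies = load_test(total_requests=50, concurrency=10)
p95 = percentile(latencies, 0.95)
within_budget = p95 < 0.5  # assumed budget of 500 ms, chosen for illustration
```

Raising `total_requests` and `concurrency` beyond expected usage turns the same sketch into a crude stress test, while running it repeatedly over a long period approximates a soak test.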

Test environments

Both performance and end-to-end tests require test environments that are very similar to production and may require test data. For an automated testing regime to provide confidence in the software under test, it’s important for tests to be run in the same way each time, and that includes ensuring test environments remain consistent between runs (although they should be updated to match production when changes are applied there).

Managing environments manually can be a time-consuming exercise, so it’s worth considering automating the steps to create and tear down pre-production environments with each new build.
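
The create-and-tear-down pattern can be sketched with a Python context manager. Here a temporary directory and a seed data file stand in for a real pre-production environment. The `fresh_environment` name and the `seed.json` file are assumptions for the example, and real environments are normally provisioned with infrastructure tooling rather than a script like this.

```python
import json
import tempfile
from contextlib import contextmanager
from pathlib import Path


@contextmanager
def fresh_environment(seed_data):
    """Create a throwaway environment, yield its path, then tear it down."""
    with tempfile.TemporaryDirectory(prefix="ci-env-") as root:
        data_file = Path(root) / "seed.json"
        data_file.write_text(json.dumps(seed_data))  # same seed data every run
        yield Path(root)
    # TemporaryDirectory removes everything here, even if a test failed


with fresh_environment({"users": ["alice", "bob"]}) as env:
    loaded = json.loads((env / "seed.json").read_text())
    existed_during_run = env.exists()

exists_after_run = env.exists()
```

Because the environment is rebuilt from the same seed data on every run and removed afterwards, each build starts from a known, consistent state.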

Working with feedback

The purpose of running automated tests as part of your CI/CD practice is to get rapid feedback on the changes that you have just made, so listening and responding to that feedback is essential. CI servers typically integrate with automated testing tools so you can surface all the results in one place. Development teams often combine a dashboard or radiator display of the latest results with automated notifications to communication platforms, like Slack, to keep them informed of how the latest build is performing.

When a test fails, understanding which area of the codebase the test relates to and being able to view any information produced by the test, such as a stack trace, output value or screenshot, can speed up the process of getting to the root cause. It’s worth taking the time to design tests carefully, so that each one tests a single thing, and to name them specifically so you can tell at a glance what has failed. Testing and CI tools that provide additional information around test failures can also help you get your builds back in the green sooner.

As ever, tools and processes are only part of the equation. A really good CI/CD practice requires a team culture that recognizes not just the value of automated tests, but also the importance of responding to failed tests quickly in order to keep the software in a deployable state.

Is CI/CD the end of manual testing?

A common misconception among those new to CI/CD is that automated testing obviates the need for manual testing, and for professional testers. While automation frees up some time for QA team members, it does not make them redundant. Rather than spending time on repetitive tasks, testers can focus on defining test cases, writing automated tests and applying their creativity and ingenuity to exploratory testing.

Unlike automated tests, which are carefully scripted for execution by a computer, exploratory testing requires only a loose remit. The value of exploratory testing is in finding things that planned, structured testing misses. Essentially, you’re looking for issues that you have not already considered and written a test case for. When deciding which areas to explore, consider both new features and areas of your system that would result in the most harm if something were to go wrong in production.

Exploratory testing should not slide into manual, repetitive testing; the intention is not to conduct the same set of tests each time. When an issue is discovered during exploratory testing, as well as fixing it, take the time to write an automated test (at the appropriate level in the test pyramid) so that if it occurs again it will be caught much earlier in the process. To make efficient use of testers’ time, manual testing should only take place after all automated tests have passed.

Continuous improvement for test automation

Automated testing plays a central role in any CI/CD pipeline. While writing automated tests requires an investment of time and effort, the benefits of rapid feedback and visibility of how deployable the code is mean automated tests soon pay for themselves. But building a test suite is not something you do once and forget.

Your automated tests should form as much a part of your application as the rest of your code, and therefore need to be maintained to ensure they remain relevant and useful. As such, the continuous improvement you apply to your code also applies to your tests.

Continuing to build in automated test coverage for new features and feeding in findings from exploratory testing will keep your test suite effective and efficient. It’s also worth taking the time to see how your tests are performing, and whether it’s worth re-ordering or breaking down the steps in your process to get some feedback sooner.

CI tools can provide various metrics to help you optimize your pipeline, while flaky test indicators can flag up unreliable tests which may be giving you false confidence or concern. But while metrics can help you improve your automated test process, avoid the trap of thinking test coverage is a goal in itself. The real aim is to regularly deliver working software to your users. Test automation serves that goal by providing rapid, reliable feedback so that you can have confidence in deploying your software to production.
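
One simple way to think about a flaky test indicator is that it flags any test that has both passed and failed against the same code. The Python sketch below applies that rule to a list of results. The tuple format and the test names are assumptions made for illustration and do not reflect the export format of any specific CI tool.

```python
from collections import defaultdict


def find_flaky_tests(runs):
    """Return tests that both passed and failed on the same commit.

    `runs` is a list of (commit, test_name, passed) tuples, an assumed
    format for illustration rather than any CI tool's export.
    """
    outcomes = defaultdict(set)
    for commit, test_name, passed in runs:
        outcomes[(commit, test_name)].add(passed)
    return sorted({name for (_, name), seen in outcomes.items() if len(seen) == 2})


history = [
    ("abc123", "test_login", True),
    ("abc123", "test_login", False),
    ("abc123", "test_checkout", True),
    ("def456", "test_checkout", False),
]
flaky = find_flaky_tests(history)
```

Here only `test_login` is flagged, because `test_checkout` changed outcome between two different commits, which may simply reflect a genuine regression.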