Building a pipeline in the cloud

The advent of cloud computing has brought huge changes to the way software is hosted and delivered. Online services exist for most things in life, and that’s as true for retail as it is for computing infrastructure. In this article, we’ll look at how continuous integration, delivery and deployment can benefit from cloud technologies, particularly infrastructure-as-code and containers.

Whether you’re new to setting up a CI/CD pipeline or have an existing locally-hosted setup, it’s worth understanding how these techniques and tools can be used so you can adapt them to your needs.

Infrastructure as the limiting factor

Continuous integration and deployment are designed to make the process of releasing software to your users faster and more robust.

Combining a little-and-often approach with automation – of builds, environment creation, and testing – reduces the time from development to release while providing confidence in the quality of the product.

The later stages of a CI/CD workflow typically include end-to-end and performance tests, as well as manual testing, all of which require test environments that closely mirror production. For maximum efficiency and consistency of testing, these environments should be refreshed automatically rather than maintained manually. Putting all of this into practice requires not just DevOps skills and tools, but also a raft of infrastructure to provide the compute capacity for the CI server, build agents, test environments and data stores.

The number of machines required to support your build process depends on the size and complexity of your project and the number of developers contributing to it. However, that number may also vary over time.

If you’re hosting and managing all your own infrastructure for a CI/CD pipeline, you have to decide where to strike the balance between having enough capacity to run multiple jobs concurrently in times of high demand versus the cost of buying and maintaining machines that sit idle for significant periods of time. This is where cloud-hosted infrastructure can offer significant benefits.

What can the cloud do for your continuous integration pipeline?

Cloud-hosted infrastructure has brought with it huge changes to the way computing resources are managed.

With infrastructure-as-a-service (IaaS), computing resource is provided via virtual machines (VMs) or containers. If you need more capacity, you need only ask for it (and pay, of course); procurement, installation, and management of physical hardware are no longer your concern.

As a result, an organization has no visibility into the physical machines that host these VMs or containers (beyond awareness of the geographic region for regulatory and disaster recovery purposes), and this has enabled an important shift in mentality.

Infrastructure-as-code

In a cloud-hosted context, the vast physical resources that back up the service make it possible to treat servers as interchangeable and disposable commodities – so-called “cattle”. This contrasts with traditional bare-metal hosting, where servers have long been treated like pets; they are given a proper name, looked after when they get sick, and expected to live for a relatively long time.

With the “cattle” approach, if a server needs to be updated or repaired it is simply removed and replaced with a new one that meets the requirements. No time or effort is expended on modifying or fixing an existing instance. Instead, the image is re-configured as required and a new instance deployed.

With IaaS you only pay for the compute resource you use, so it makes sense to adopt the cattle mentality. This is where infrastructure-as-code (IaC) comes in. IaC is a DevOps practice whereby the provisioning of infrastructure is made repeatable using scripts.

Putting all the configuration details into code and keeping it in source control, just like application code, means you avoid the manual tweaks to individual environments that cause inconsistencies to creep in.
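
To make that concrete, here is a minimal sketch of scripted provisioning using Pulumi's Python SDK. The AMI ID, instance size, and resource names are assumptions for illustration, not recommendations:

```python
"""Minimal IaC sketch using Pulumi's Python SDK (pip install pulumi pulumi-aws).

Assumes AWS credentials are configured; the AMI ID, instance size, and
resource names are placeholders.
"""
import pulumi
import pulumi_aws as aws

# Lock the build agent down to outbound traffic only.
agent_sg = aws.ec2.SecurityGroup(
    "build-agent-sg",
    egress=[aws.ec2.SecurityGroupEgressArgs(
        protocol="-1", from_port=0, to_port=0, cidr_blocks=["0.0.0.0/0"],
    )],
)

# The build agent itself: a disposable "cattle" instance defined entirely
# in code, so replacing it means redeploying, not repairing.
agent = aws.ec2.Instance(
    "build-agent",
    ami="ami-0123456789abcdef0",  # placeholder image ID
    instance_type="t3.medium",
    vpc_security_group_ids=[agent_sg.id],
    tags={"role": "ci-build-agent"},
)

pulumi.export("agent_private_ip", agent.private_ip)
```

With a definition like this in source control, resizing the agent or swapping its image becomes a reviewed code change rather than a manual operation.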

Just like the software under development, the infrastructure itself can be put through a CI/CD pipeline to ensure it works as expected, with the benefit that changes can be rolled back easily.

Codifying your infrastructure opens the door to more automation; environments can be created automatically when they are needed and updated by rolling out a new configuration.

The effectively limitless supply of compute resource from the IaaS provider allows you to scale up and down according to demand, while ensuring that if an instance fails it can be replaced immediately. This means your CI/CD setup can respond to increases in demand and ensure a reliable service.

Outsourcing hardware

With both IaaS and containers-as-a-service (CaaS), maintenance of the physical hardware, management of networking and storage capacity, and data center logistics are all handled by the cloud provider as part of their service offering.

This frees up your team to focus on optimizing your pipeline processes and keeping them secure. As cost is a function of both processing power and time, it’s worth taking the time to parallelize tasks where possible, thereby getting results to your developers more quickly than if fewer machines were used over a longer period.
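
As a rough sketch of that trade-off, the following Python snippet fans test batches out across parallel workers. The batch paths and the pytest invocation are assumptions for illustration; in a real pipeline each batch would typically run on its own build agent, but the arithmetic is the same:

```python
"""Illustrative sketch: running test batches in parallel.

Three batches in parallel finish in roughly the time of the slowest one,
for about the same total compute cost as running them back to back.
"""
import subprocess
from concurrent.futures import ThreadPoolExecutor

TEST_BATCHES = ["tests/unit", "tests/api", "tests/ui"]  # hypothetical split

def run_batch(path: str) -> int:
    """Run one batch of tests and return its exit code."""
    return subprocess.run(["pytest", path]).returncode

with ThreadPoolExecutor(max_workers=len(TEST_BATCHES)) as pool:
    results = list(pool.map(run_batch, TEST_BATCHES))

# Fail the pipeline step if any batch failed.
raise SystemExit(max(results, default=0))
```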

Containers

Containers apply IaC principles and allow you to make even more effective use of cloud-hosted infrastructure. A container packages up software with all the dependencies it needs to run, putting an end to the days of “well, it works on my machine” and a spot-the-difference hunt for configuration details. Docker is one of the best-known container technologies, but other options are also available.
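
As a small example, the Docker SDK for Python can build and run such a package in a few lines. The image name and tag below are placeholders, and a Dockerfile is assumed to exist in the current directory:

```python
"""Sketch using the Docker SDK for Python (pip install docker).

Assumes a running Docker daemon; the image name and tag are placeholders.
"""
import docker

client = docker.from_env()

# Build an image that bundles the application with its dependencies.
image, build_logs = client.images.build(path=".", tag="myapp:1.0")

# Run it. The same image behaves identically on a laptop, a build
# agent, or a cloud host, because its dependencies travel with it.
container = client.containers.run("myapp:1.0", detach=True)
print(container.id)
```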

Like virtual machines, containers allow multiple applications to run on the same physical server while keeping them isolated from each other. However, unlike VMs, containers do not include a full operating system; they share the host’s kernel, which gives them a smaller footprint, and they do not demand a fixed portion of the host’s resources. As a result, it’s possible to fit far more containers than VMs on a single machine, making them ideal for deploying software to cloud-hosted infrastructure efficiently.

Adopting containers means packaging up the environmental dependencies and configuration details with the software into a single artifact that can be deployed on any machine that provides the container runtime.

An application may be split into multiple containers – as is the case with a microservice architecture – in which case the containers need to be deployed to the same machine or a networked cluster of machines. Container orchestration tools, such as Kubernetes, have been developed to make it easier to work with large numbers of containers by automating tasks like deployments, management and scaling.
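
As an illustration of one such task, the official Kubernetes Python client can scale a deployment in a few lines. The deployment name, namespace, and replica count below are assumptions:

```python
"""Sketch using the official Kubernetes Python client (pip install kubernetes).

Assumes a valid kubeconfig; names and counts are placeholders.
"""
from kubernetes import client, config

config.load_kube_config()  # authenticate with the local kubeconfig
apps = client.AppsV1Api()

# Ask the orchestrator for more replicas; scheduling the extra
# containers onto nodes in the cluster is Kubernetes' job, not ours.
apps.patch_namespaced_deployment_scale(
    name="myapp",      # placeholder deployment name
    namespace="ci",    # placeholder namespace
    body={"spec": {"replicas": 5}},
)
```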

Using containers in a CI/CD workflow makes the process of deploying the latest build to different stages of the pipeline much simpler. The build artifact is a container image, which can be deployed consistently to each test environment before being released to production.
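
One common way to realize this, sketched below with the Docker SDK for Python, is to promote a single immutable image through the stages by re-tagging it rather than rebuilding. The registry URL and stage tags are hypothetical:

```python
"""Sketch: promoting one immutable build artifact through the pipeline.

The registry URL and tags are hypothetical; the image contents never
change, only the labels attached to them.
"""
import docker

client = docker.from_env()
image = client.images.get("myapp:1.0")  # the artifact produced by the build

# Deploy the same bits to every stage; only the tag moves forward.
for stage in ("staging", "production"):
    image.tag("registry.example.com/myapp", tag=stage)
    client.images.push("registry.example.com/myapp", tag=stage)
```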

In a cloud-hosted CI/CD pipeline, containers make efficient use of compute resources and allow you to leverage automation tools. You can increase capacity when demand is high, but save on costs by killing off containers and releasing the underlying infrastructure when demand is lower.

In addition to IaaS, several cloud providers now also offer CaaS, allowing organizations to deploy containers directly without having to manage the orchestration platform and configure the cluster.

Considerations for cloud CI hosting

While a CI/CD cloud pipeline can offer significant benefits in terms of infrastructure cost, scalability and reliability, there are some drawbacks to consider.

Knowledge and skills

First, using cloud services like IaaS and CaaS inevitably involves a learning curve. If you don’t already have expertise in these areas, your team members will need time to upskill or you’ll need to look at bringing that knowledge in. Having said that, experience of working with cloud technologies is a desirable skill, and giving your teams the opportunity to develop those skills and use the latest technology can be a benefit both in terms of staff retention and hiring.

System architecture

If the software under development has been designed for the cloud, using microservices, containers and other cloud native practices, then automating progress through the CI/CD cloud pipeline using containers is relatively straightforward. On the other hand, if you’re working with a monolithic architecture, then packaging your software into containers can be a challenge.

Of course, containers are not essential for a cloud-hosted pipeline, and you can still use virtual machines on a cloud provider’s infrastructure to run builds and provide consistent pre-production environments for testing. However, VMs consume more resources than containers and environments will need to be configured separately.

Cost

In the context of the cloud, time is money, and you don’t want to be paying for compute resource to sit idle. For cloud hosting to be cost effective, it’s essential to use it efficiently. That means leveraging tools that will monitor usage and release idle instances after a timeout period, or implementing that logic yourself. The latter option may require skills your organization doesn’t already have, so it’s worth investigating and weighing the options.
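
If you do implement that logic yourself, a basic version can be small. The sketch below uses boto3 to terminate tagged build agents that have exceeded a timeout; the tag, the timeout, and the age-based heuristic are assumptions, and a production version would consult actual usage metrics rather than instance age alone:

```python
"""Sketch: reclaiming idle build agents with boto3 (pip install boto3).

The tag key/value and timeout are assumptions; a production version
would check job-queue or CPU metrics instead of age alone.
"""
from datetime import datetime, timedelta, timezone
import boto3

IDLE_TIMEOUT = timedelta(hours=1)
ec2 = boto3.client("ec2")

# Find running instances that were provisioned as disposable CI agents.
reservations = ec2.describe_instances(Filters=[
    {"Name": "tag:role", "Values": ["ci-build-agent"]},
    {"Name": "instance-state-name", "Values": ["running"]},
])["Reservations"]

now = datetime.now(timezone.utc)
stale = [
    inst["InstanceId"]
    for res in reservations
    for inst in res["Instances"]
    if now - inst["LaunchTime"] > IDLE_TIMEOUT
]

# "Cattle, not pets": stale agents are terminated, not nursed along.
if stale:
    ec2.terminate_instances(InstanceIds=stale)
```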

Security

Security has always been a concern when it comes to hosting data and services in the cloud. For some companies, just the concept of critical software being located on a third party’s kit is a no-go. That said, many organizations are choosing to use public clouds to host both their live services and deployment pipelines, from source control repository, to CI server and test environments.

Understanding the potential attack vectors, building protections into your pipeline to prevent malicious actors from using it to access your live system, and implementing best practices around credential management, test data, and access control are all essential to mitigating the risks.

Hybrid approaches

While infrastructure-as-code, containers and container orchestration all have their roots in cloud technology, they can be used in hybrid infrastructure as well.

The same tools can be used in private clouds and in on-premises infrastructure, with the caveat that there is a limit to how far the pipeline can scale. If your organization is planning to transition to cloud-hosted infrastructure in the future, then adopting cloud native tools early will allow you to build expertise in advance and ease the transition.

The cloud native practice of infrastructure-as-code brings multiple benefits for continuous integration and deployment on local infrastructure. For starters, configuring a new environment is much quicker, as you just need to run a script.

Codifying environment creation also provides consistency, so you can ensure parity between your production setup and pre-production environments – whether that’s for security, performance or UI testing, or sandboxes for support and sales teams. By keeping infrastructure configuration files in source control, the team has an audit trail of what changes were introduced and when, which can make debugging environmental issues much simpler.

In some cases, organizations choose to use cloud resources for particular stages of their deployment pipeline. For example, load and performance tests can require substantial resources, and it may be more cost effective to create a temporary environment in the cloud in order to conduct them. When considering this approach, it’s important to weigh the complexity and time involved in moving your build between different infrastructures.

Why choose the cloud for your CI pipeline?

While an automated CI/CD process offers significant benefits in terms of speed of delivery and quality assurance, the available infrastructure is a limiting factor when it comes to speed and throughput. By moving your pipeline into the cloud, you no longer have to choose between provisioning enough servers to cope with peak demand and paying to buy and manage machines that are not in constant use.

Cloud native technologies and practices not only enable efficient use of cloud-hosted infrastructure but have also enhanced continuous integration and deployment techniques. The same approaches can also be applied to local infrastructure, making delivery of software faster and more reliable.