Architecture
CodeCanvas is designed to be deployed into Kubernetes clusters, with an emphasis on scalability and reliability. It targets managed Kubernetes services from major cloud providers, such as Amazon EKS, Azure AKS, and Google GKE. Support for on-premises infrastructure is also planned.
Core components
CodeCanvas application
The CodeCanvas application is the core component of the system. It is a web application that serves as a backend and provides the user interface for interactions with CodeCanvas.
Additional components:
Database – stores the state of the CodeCanvas application and other data.
Object storage (S3-compatible) – stores user data and configuration.
JetBrains Gateway – a client application that runs on end-user machines and lets users create remote dev environments and connect to them. A user interacts with the IDE inside the dev environment through JetBrains Client.
When the JetBrains Gateway connects to a dev environment, it first identifies the version of the IDE running in that environment. It then downloads the corresponding JetBrains Client build to ensure compatibility, launches the Client, and establishes a connection to the IDE.
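The connection sequence above can be sketched as follows. This is an illustrative model only, not the Gateway's actual implementation: the function name, the `client_cache` dictionary, and the `download_client` callback are hypothetical, and reusing a previously downloaded Client build is an assumption that the text above does not state.

```python
from typing import Callable, Dict, List


def connect_to_dev_environment(
    ide_build: str,
    client_cache: Dict[str, str],
    download_client: Callable[[str], str],
) -> List[str]:
    """Return the ordered steps taken to connect to an IDE with build `ide_build`."""
    steps = [f"identified IDE build {ide_build}"]
    # Assumption: the Client is downloaded only if a matching build is not cached yet.
    if ide_build not in client_cache:
        client_cache[ide_build] = download_client(ide_build)
        steps.append(f"downloaded Client {ide_build}")
    steps.append(f"launched Client from {client_cache[ide_build]}")
    steps.append("connected to IDE")
    return steps


def fake_download(build: str) -> str:
    """Stand-in for a real download; returns a hypothetical local path."""
    return f"/tmp/clients/{build}"


cache: Dict[str, str] = {}
first = connect_to_dev_environment("241.1.1", cache, fake_download)
second = connect_to_dev_environment("241.1.1", cache, fake_download)
```

The key point is that the Client build is chosen from the IDE build reported by the dev environment, so the two always match.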
Relay server
For security reasons, the dev environment cluster typically doesn't allow inbound connections from the outside. To enable communication between the IDE client on the user's machine and the dev environment, CodeCanvas uses a relay server.
The relay server acts as an intermediary between JetBrains Gateway on user machines and JetBrains IDEs within dev environments: it relays the WebSocket connections between the two.
Jump server
For the same security reasons, direct SSH connections to dev environments may also be restricted in the network. To enable SSH connections from user machines, CodeCanvas uses a jump server.
The jump server acts as an intermediary that relays SSH connections from a user machine to a dev environment, for example, when connecting with an SSH client or VS Code Remote SSH.
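As a rough illustration of what connecting through a jump host looks like from the user's side, the sketch below builds an OpenSSH command that uses the standard `-J` (ProxyJump) option. The host names and user names are hypothetical, and the actual connection details that CodeCanvas provides for its jump server may differ.

```python
from typing import List


def ssh_command_via_jump(jump_host: str, target: str) -> List[str]:
    """Build an OpenSSH command that reaches `target` through `jump_host`."""
    # -J (ProxyJump) makes the OpenSSH client connect to the jump host first
    # and then forward the connection to the target from there.
    return ["ssh", "-J", jump_host, target]


# Hypothetical addresses, for illustration only.
command = ssh_command_via_jump("user@jump.example.com", "dev@dev-env-1")
print(" ".join(command))
```

The user machine never opens a direct connection to the dev environment; only the jump host needs to be reachable from the outside.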
External components
Docker registry
The Docker registry is a service that stores Docker images necessary to run CodeCanvas and its services. By default, CodeCanvas uses a public Docker registry hosted by JetBrains. Alternatively, customers can publish the required images to their private registry.
JetBrains CDN
The JetBrains CDN is a service that hosts distribution packages for JetBrains IDEs and feeds that are used by CodeCanvas to get updates on the availability of new IDE versions. CodeCanvas periodically checks which IDE builds are available; worker instances download IDE distribution packages from this CDN.
Instead of using the JetBrains CDN, customers can configure CodeCanvas to use their own HTTP share to get IDE builds.
License server
The license server is a JetBrains service that verifies the validity of IDE licenses.
This is an optional component. CodeCanvas doesn't control how the customer provides licenses for JetBrains IDEs to user machines.
(Optional) SMTP server
CodeCanvas uses the SMTP server to send email notifications to users (e.g., warm-up failures, limit alerts) and user management emails, such as invitations and email confirmations. Managed by the customer.
Git hosting
CodeCanvas doesn't host Git repositories. Instead, it supports connections to external Git hosting services like GitHub, GitLab, Bitbucket, and others. Managed by the customer.
(Optional) User directory
CodeCanvas integrates with user directories to provide user authentication and authorization. Supported protocols include OIDC, LDAP, AD, and SAML 2.0. Managed by the customer.
Workers
A worker in CodeCanvas is an agent application that is an essential part of every dev environment and warm-up run. The worker connects to the CodeCanvas backend, gets the definition of a dev environment scheduled for start, and bootstraps its startup. After that, the worker monitors the dev environment and reports its state to CodeCanvas. The bootstrap process of a dev environment includes:
starting required Docker containers, such as a dev container,
setting up a persistent disk for user data (see Worker storage),
downloading redistributable parts, such as the IDE.
In a Kubernetes installation, each dev environment is a pod with a single "worker" container. The worker application runs inside this container and uses the Docker daemon to spin up nested containers – the dev container and the auxiliary sidecar container. This model is known as Docker-in-Docker. In the future, this architecture will allow running dev environments on virtual machines in the same way.
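To make the nested-container model more concrete, the sketch below builds the kind of `docker run` commands a worker-like process could issue against its Docker daemon. How the worker actually talks to the daemon is not described here, so treat this as an assumption; the container names, image names, and paths are hypothetical.

```python
from typing import List


def docker_run_args(name: str, image: str, volume: str, mount_path: str) -> List[str]:
    """Build a `docker run` command that starts one nested container."""
    return [
        "docker", "run",
        "--detach",                            # run the container in the background
        "--name", name,                        # container name inside the worker's daemon
        "--volume", f"{volume}:{mount_path}",  # expose the persistent user data
        image,
    ]


# Hypothetical names and paths, for illustration only.
dev = docker_run_args("dev-container", "example/dev-image:latest", "/data/user", "/workspace")
sidecar = docker_run_args("sidecar", "example/sidecar:latest", "/data/user", "/workspace")
```

Both containers are started by the same daemon inside the worker container, which is what lets them share the user data volume.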
Worker lifecycle
A user creates or activates a dev environment or a warm-up run.
CodeCanvas adds the respective task to a queue.
CodeCanvas schedules a Kubernetes pod with the worker for running this task.
If there are available resources in the target Kubernetes cluster, the worker starts, connects to CodeCanvas, takes the task from the queue, and runs it.
If there are no available resources, the dev environment stays in the "provisioning-resources" (pending) state until resources free up or additional nodes are added to the cluster. Note that CodeCanvas doesn't manage the lifecycle of Kubernetes nodes.
After the dev environment is stopped or deleted, the worker terminates the nested containers and exits. The respective Kubernetes pod is terminated after that, ensuring that pods aren't reused for further or parallel tasks.
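The scheduling behavior in this lifecycle can be modeled with a small simulation. This is a simplified sketch, not CodeCanvas code: the function name, the pod naming scheme, and the idea of counting cluster capacity in "slots" are assumptions made for illustration.

```python
from collections import deque
from typing import Deque, Dict, List


def run_scheduling_pass(tasks: List[str], free_slots: int) -> Dict[str, str]:
    """Assign each queued task a fresh pod while the cluster has free capacity."""
    queue: Deque[str] = deque(tasks)
    states: Dict[str, str] = {}
    pod_counter = 0
    while queue:
        task = queue.popleft()
        if free_slots > 0:
            free_slots -= 1
            pod_counter += 1
            # One pod per task: pods are never reused for further or parallel tasks.
            states[task] = f"running in worker-pod-{pod_counter}"
        else:
            # No capacity: the task waits until resources free up or nodes are added.
            states[task] = "provisioning-resources"
    return states


result = run_scheduling_pass(["env-a", "env-b", "warmup-c"], free_slots=2)
print(result)
```

With two free slots and three tasks, the first two tasks each get their own pod and the third stays in the "provisioning-resources" state.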
Worker storage
This is how CodeCanvas manages user data storage in dev environments:
Before creating a dev environment, CodeCanvas creates a persistent volume claim (PVC) for the user data.
Kubernetes gets the PVC and creates a persistent volume (PV) for it using the related cloud block storage (e.g., Amazon EBS, Azure Disk, Google Persistent Disk).
The volume is mounted to the worker pod and used by the dev environment.
When the dev environment is stopped, CodeCanvas unmounts the volume from the worker pod.
When the dev environment is restarted, CodeCanvas mounts the volume to the new worker pod.
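For reference, a persistent volume claim of the kind described in the first step might look like the manifest built below. The field names follow the Kubernetes `PersistentVolumeClaim` API; the claim name, size, and storage class name are hypothetical, and the exact manifest CodeCanvas generates may differ.

```python
import json
from typing import Any, Dict


def pvc_manifest(name: str, size_gi: int, storage_class: str) -> Dict[str, Any]:
    """Build a PersistentVolumeClaim manifest for a dev environment's user data."""
    return {
        "apiVersion": "v1",
        "kind": "PersistentVolumeClaim",
        "metadata": {"name": name},
        "spec": {
            # Block storage volumes are attached to one node at a time.
            "accessModes": ["ReadWriteOnce"],
            # The storage class selects the CSI driver that provisions the PV.
            "storageClassName": storage_class,
            "resources": {"requests": {"storage": f"{size_gi}Gi"}},
        },
    }


# Hypothetical values, for illustration only.
pvc = pvc_manifest("dev-env-1-data", 50, "example-block-storage")
print(json.dumps(pvc, indent=2))
```

Because the claim outlives the worker pod, the same volume can be mounted to a new pod when the dev environment is restarted.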
Architectural decisions and requirements
The CodeCanvas architecture imposes specific requirements and constraints on the infrastructure where it is deployed. These architectural decisions are made to ensure CodeCanvas's performance, reliability, and security. Below, we explain the reasoning behind these requirements.
Dynamic volumes (via CSI)
The Container Storage Interface (CSI) is a standardized interface used by Kubernetes to manage and interact with external storage systems. CodeCanvas uses dynamic volume provisioning via CSI to provide persistent storage for dev environments.
Dynamic volumes let user data persist across dev environment restarts. When a dev environment is stopped, the volume is detached from the worker pod, but the data on the volume remains intact in the cloud block storage. When the dev environment is restarted, the volume is reattached to a new worker pod.
The benefits of such an approach are:
Fast dev environment restarts – Users can stop and restart dev environments without lengthy data copying operations: CodeCanvas mounts the existing volume almost instantly. In contrast, if the data were stored in a cold storage solution like S3, it would take much longer to copy the data to the dev environment.
Data safety – As no data is actually copied during dev environment restarts, there is no risk of data loss due to copying errors or interruptions.
CSI snapshots
CSI snapshots are essential for the CodeCanvas warm-up feature, which speeds up the start of dev environments. During the warm-up, CodeCanvas runs user scripts and builds project indexes in a fresh dev environment. The result of the warm-up is a snapshot of the dev environment's volume. The snapshot is then stored in cheaper cloud storage (e.g., S3). When a user creates a new dev environment, CodeCanvas takes the snapshot and restores it to the new dev environment's volume.
The benefits of such an approach are:
Fast start of dev environments with a snapshot – Cloud providers have efficient mechanisms for restoring snapshots to volumes, which is much faster than directly copying data from object storage or downloading a Docker image of a comparable size. For instance, restoring a 10 GB snapshot would take a few seconds in AWS or even less than a second in Google Cloud. In contrast, downloading a Docker image of similar size may take 5–10 minutes.
(Not yet available) Fast creation of warm-up snapshots – As cloud providers support incremental snapshots, CodeCanvas will be able to create subsequent warm-up snapshots much faster by storing only the changes made since the previous snapshot.
(Not yet available) Cost savings – If a stopped dev environment is not used for some time (e.g., 2-3 days), CodeCanvas can create a snapshot of the disconnected volume and delete the volume. The snapshot is stored in the cloud object storage at a lower cost than the volume in the cloud block storage. Depending on the cloud provider and other factors, this can save up to 80% of the cost of keeping the volume.
(Not yet available) Data backups – snapshots provide a backup mechanism for user data. In case of accidental data loss, users can restore the volume from a snapshot.
(Not yet available) Disk resize – snapshots allow resizing the volume without data loss. CodeCanvas can create a snapshot of the volume, create a new volume with a different size, and restore the snapshot to the new volume.
(Not yet available) High availability – snapshots aren't bound to a single Availability Zone (AZ). If one AZ fails, snapshots allow restarting dev environments in another AZ, unlike volumes that are bound to a single AZ.
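In Kubernetes terms, the snapshot-and-restore flow used by warm-up maps onto two objects: a `VolumeSnapshot` taken from an existing claim, and a new claim whose `dataSource` points at that snapshot. The sketch below builds both manifests. The field names follow the Kubernetes snapshot API (`snapshot.storage.k8s.io/v1`); the object names, sizes, and class names are hypothetical, and the manifests CodeCanvas actually creates may differ.

```python
from typing import Any, Dict


def volume_snapshot(name: str, source_pvc: str, snapshot_class: str) -> Dict[str, Any]:
    """Build a VolumeSnapshot manifest that captures an existing PVC."""
    return {
        "apiVersion": "snapshot.storage.k8s.io/v1",
        "kind": "VolumeSnapshot",
        "metadata": {"name": name},
        "spec": {
            "volumeSnapshotClassName": snapshot_class,
            "source": {"persistentVolumeClaimName": source_pvc},
        },
    }


def pvc_from_snapshot(name: str, snapshot: str, size_gi: int, storage_class: str) -> Dict[str, Any]:
    """Build a PVC manifest whose volume is pre-populated from a snapshot."""
    return {
        "apiVersion": "v1",
        "kind": "PersistentVolumeClaim",
        "metadata": {"name": name},
        "spec": {
            "accessModes": ["ReadWriteOnce"],
            "storageClassName": storage_class,
            "resources": {"requests": {"storage": f"{size_gi}Gi"}},
            # The CSI driver restores the snapshot into the newly provisioned volume.
            "dataSource": {
                "name": snapshot,
                "kind": "VolumeSnapshot",
                "apiGroup": "snapshot.storage.k8s.io",
            },
        },
    }


# Hypothetical values, for illustration only.
snap = volume_snapshot("warmup-snap-1", "warmup-env-data", "example-snapshot-class")
restored = pvc_from_snapshot("dev-env-2-data", "warmup-snap-1", 50, "example-block-storage")
```

A new dev environment that starts from `restored` sees the warm-up results immediately, without running the warm-up scripts again.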
Docker-in-Docker
The worker application, which controls the lifecycle of a dev environment, runs inside a container in a Kubernetes pod. The worker uses the Docker daemon to start dev environments in nested containers, a model known as Docker-in-Docker.
This approach has several benefits over running dev environments directly on Kubernetes pods:
Full control – Using Docker-in-Docker, CodeCanvas has direct control over the run environment via the Docker daemon. This allows:
mounting/unmounting inner container volumes at runtime,
efficient log management,
using sidecar containers for additional services, and so on.
Persistent state – Docker volumes store state between runs, preserving user changes and enabling efficient caching.
Fewer requirements for custom images – Custom Docker images for dev environments provided by users don't need to include any CodeCanvas-specific configurations.
(Not yet available) VM support – The Docker-in-Docker architecture allows for a consistent setup across Kubernetes and virtual machines. In the future, CodeCanvas will support running dev environments on virtual machines (VMs) in the same way as on Kubernetes.
Of course, using Docker-in-Docker has not only benefits but also challenges. For example, it requires additional configuration to propagate environment variables to inner containers; it doesn't expose accurate resource usage metrics for inner containers; and it requires the worker to run in the --privileged mode (see below).
Docker-in-Docker and privileged mode
Docker-in-Docker requires the worker application to have additional permissions on the host system, such as access to the host's devices and filesystem. To grant these permissions, the host runs the worker in the --privileged mode.
As an alternative to the privileged mode, you can configure Sysbox in the dev environment cluster. Sysbox is a container runtime that provides a secure way to run Docker-in-Docker without the need for the privileged mode.
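The difference between the two options can be seen in the pod specification. The sketch below builds a worker pod manifest either with a privileged security context or with a runtime class instead. `securityContext.privileged` and `runtimeClassName` are standard Kubernetes pod fields; the pod name, the image name, and the runtime class name `sysbox-runc` are assumptions here, and a real Sysbox setup may need additional configuration that is not shown.

```python
from typing import Any, Dict, Optional


def worker_pod(name: str, image: str, privileged: bool,
               runtime_class: Optional[str] = None) -> Dict[str, Any]:
    """Build a pod manifest with a single "worker" container."""
    container: Dict[str, Any] = {"name": "worker", "image": image}
    if privileged:
        # Kubernetes equivalent of running a container with --privileged.
        container["securityContext"] = {"privileged": True}
    spec: Dict[str, Any] = {"containers": [container]}
    if runtime_class is not None:
        # Selects an alternative container runtime for this pod.
        spec["runtimeClassName"] = runtime_class
    return {"apiVersion": "v1", "kind": "Pod", "metadata": {"name": name}, "spec": spec}


# Hypothetical names, for illustration only.
privileged_pod = worker_pod("worker-1", "example/worker:latest", privileged=True)
sysbox_pod = worker_pod("worker-2", "example/worker:latest", privileged=False,
                        runtime_class="sysbox-runc")
```

In the second variant the container itself carries no privileged flag; isolation of the nested Docker daemon is delegated to the runtime.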
PostgreSQL
CodeCanvas uses PostgreSQL as the database to store:
the state of the application,
the dev environment state,
data on users, groups, service accounts,
data on namespaces, personal secrets, and other metadata.
S3-compatible object storage
CodeCanvas uses S3-compatible object storage to store dev environment logs, audit logs, and some other data. The storage needs to be hosted in the same cloud provider as the dev environment cluster.