Datalore 2024.4 Help

Install on a Kubernetes cluster using Helm charts

This article and the chapters in this section describe how to install, configure, and update Datalore On-Premises on a Kubernetes cluster using Helm charts.

The following figure shows the Kubernetes-based setup for Datalore:

[Figure: Kubernetes-based Datalore setup]

You will learn how to do the following:

  • Basic installation: You complete the basic procedure to get Datalore On-Premises up and running on the infrastructure of your choice.

  • Required and optional configuration procedures: You customize and configure Datalore On-Premises. Some of these configurations are essential for you to start working on your projects.

  • Upgrade procedure: You upgrade your version of Datalore On-Premises. We duly notify you of our new releases.

We strongly recommend that you have experience with Kubernetes, and with Helm in particular. For a proof of concept (PoC), we suggest trying the Docker-based installation instead.

Prerequisites

Before installation, make sure that you have the following:

  • A Kubernetes cluster

  • kubectl installed on your machine and configured to point to this cluster

  • Helm

This installation was tested with Kubernetes v1.24 and Helm v3.12.3, but other versions may work too.
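The prerequisites above can be checked with a short preflight script. This is a minimal sketch, not part of the Datalore tooling: `require_tool` and `preflight` are hypothetical helper names, and the script only uses standard `kubectl` and `helm` subcommands.

```shell
#!/bin/sh
# Hypothetical helper: reports whether a command is available on PATH.
require_tool() {
    if command -v "$1" >/dev/null 2>&1; then
        echo "found: $1"
    else
        echo "missing: $1" >&2
        return 1
    fi
}

# Hypothetical wrapper: checks both tools, prints their versions,
# and confirms that kubectl can reach the cluster it points to.
preflight() {
    require_tool kubectl || return 1
    require_tool helm || return 1
    kubectl version --client
    helm version --short
    kubectl cluster-info
}
```

Call `preflight` after sourcing the script and compare the reported versions with the tested ones listed above.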

Hardware requirements
  • Datalore server machine: 4 GB of RAM (the number of CPUs is irrelevant if the load is not high)

  • For every concurrently running notebook: at least 4 GB of RAM
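As a rough worked example of these figures, the sketch below estimates a lower bound on total RAM for a given number of concurrently running notebooks. It assumes that the server and notebook requirements simply add up. `min_ram_gb` is a hypothetical helper, and real sizing depends on your workloads.

```shell
#!/bin/sh
# Hypothetical helper: lower bound on total RAM in GB.
# 4 GB for the Datalore server plus at least 4 GB per concurrent notebook.
min_ram_gb() {
    notebooks="$1"
    echo $(( 4 + 4 * notebooks ))
}
```

For example, `min_ram_gb 5` prints `24`, that is, 4 GB for the server plus 20 GB for five notebooks.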

AWS EKS deployment limitations

Datalore's Reactive mode may not operate properly on an Amazon EKS cluster whose compute nodes run Amazon Linux (the default option). We recommend that you use Ubuntu 20.04 with the corresponding AMIs designed specifically for EKS.

Here are our tips for AWS EKS deployments:

  • To find an AMI for manual setup, follow this link and select your option based on the cluster version and region.

  • To configure the cluster deployment using Terraform, you can refer to this sample file.

Basic Datalore installation

Follow the instructions below to install Datalore using Helm.

Install Datalore

  1. Add the Datalore Helm repository:

    helm repo add datalore https://jetbrains.github.io/datalore-configs/charts
  2. Create a datalore.values.yaml file.

  3. In datalore.values.yaml, add a databaseSecret parameter to set up your database password. A random string is advised.

    databaseSecret:
      password: xxxx
  4. Configure your volumes. In datalore.values.yaml, add the following parameters:

    volumes:
      - name: storage
        ...
      - name: postgresql-data
        ...

    where:

    • storage: contains workbook data, such as attached files (UID:GID 5000:5000).

    • postgresql-data: contains PostgreSQL database data (UID:GID 999:999).

    Below are example procedures for configuring your volumes:

    Configure hostPath volumes

    1. Create directories:

      mkdir -p /data/postgresql
      mkdir -p /data/datalore
      chown 999:999 /data/postgresql
      chown 5000:5000 /data/datalore
    2. Add to datalore.values.yaml:

      volumes:
        - name: postgresql-data
          hostPath:
            path: /data/postgresql
            type: Directory
        - name: storage
          hostPath:
            path: /data/datalore
            type: Directory

    Use volumeClaimTemplates

    If you set up volume auto-provisioning in Kubernetes, you can replace volumes with volumeClaimTemplates.

    volumeClaimTemplates:
      - metadata:
          name: storage
        spec:
          accessModes:
            - ReadWriteOnce
          resources:
            requests:
              storage: 10Gi
      - metadata:
          name: postgresql-data
        spec:
          accessModes:
            - ReadWriteOnce
          resources:
            requests:
              storage: 2Gi
  5. Run the following command and wait for Datalore to start up:

    helm install -f datalore.values.yaml datalore datalore/datalore --version 0.2.22
  6. Go to http://127.0.0.1:8080/ and sign up the first user. The first signed-up user will automatically receive admin rights.

  7. To access Datalore by a domain other than 127.0.0.1, add a URL with this host as the DATALORE_PUBLIC_URL parameter in the datalore.values.yaml file.

    For example, if you want to use the https://datalore.yourcompany.com domain, add the following:

    dataloreEnv:
      ...
      DATALORE_PUBLIC_URL: "https://datalore.yourcompany.com"
  8. Click your avatar in the upper right corner, select Admin panel | License and provide your license key.

    [Figure: Opening Admin panel]
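Steps 2 to 4 of the procedure above can be scripted. The following is a minimal sketch that writes a values file with a random database password and the hostPath volumes from step 4. `random_hex` and `write_values` are hypothetical helper names, and the sketch assumes that the `/data` directories already exist on the node with the ownership shown earlier.

```shell
#!/bin/sh
# Hypothetical helper: prints 32 random hexadecimal characters.
random_hex() {
    head -c 16 /dev/urandom | od -An -tx1 | tr -d ' \n'
}

# Hypothetical helper: writes a minimal values file to the path given as $1.
# The password is quoted so that YAML always reads it as a string.
write_values() {
    cat > "$1" <<EOF
databaseSecret:
  password: "$(random_hex)"
volumes:
  - name: postgresql-data
    hostPath:
      path: /data/postgresql
      type: Directory
  - name: storage
    hostPath:
      path: /data/datalore
      type: Directory
EOF
}
```

Run `write_values datalore.values.yaml`, review the generated file, and then continue with the `helm install` command from step 5.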

Optional procedures

Run Datalore in a non-default namespace

  1. To deploy the Datalore server into a non-default namespace, run the following command:

    helm install -n <non_default_namespace> -f datalore.values.yaml datalore datalore/datalore --version 0.2.22
  2. To specify the non-default namespace for your agent configuration, define the namespace variable in datalore.values.yaml as shown in the code block below:

    agentsConfig:
      k8s:
        namespace: <non_default_namespace>
        instances:
          ...

    Find more details about configuring agents in this topic.

  3. Under dataloreEnv in datalore.values.yaml, you can define the following variables:

    | Name | Type | Default value | Description |
    | --- | --- | --- | --- |
    | DATABASES_K8S_NAMESPACE | String | default | K8s namespace where all database connector pods will be spawned. |
    | GIT_TASK_K8S_NAMESPACE | String | default | K8s namespace where all Git-related task pods will be spawned. |

    Find the full list of customized server configuration options in this topic.
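Helm does not create the target namespace on its own, so the install command from step 1 fails if the namespace does not exist yet. The sketch below wraps that command and adds the standard Helm 3 `--create-namespace` flag. `install_datalore_in` is a hypothetical helper name, and the sketch assumes that datalore.values.yaml is in the current directory.

```shell
#!/bin/sh
# Hypothetical wrapper around the install command from step 1.
# --create-namespace tells Helm to create the namespace if it is missing.
install_datalore_in() {
    ns="$1"
    if [ -z "$ns" ]; then
        echo "usage: install_datalore_in <namespace>" >&2
        return 2
    fi
    helm install -n "$ns" --create-namespace \
        -f datalore.values.yaml datalore datalore/datalore --version 0.2.22
}
```

Remember that the agent and task namespaces from steps 2 and 3 are configured separately in datalore.values.yaml.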

Use an external PostgreSQL database

  1. Add two variables under dataloreEnv: database user and database URL.

    dataloreEnv:
      ...
      DB_USER: "<database_user>"
      DB_URL: "jdbc:postgresql://[database_host]:[database_port]/[database_name]"
  2. Set internalDatabase to false.
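The two steps above can be combined into one values fragment. The sketch below prints such a fragment from a user, host, port, and database name. `external_db_values` is a hypothetical helper, and placing `internalDatabase` at the top level of the file is an assumption that you should check against the chart's default values.

```shell
#!/bin/sh
# Hypothetical helper: prints a values fragment for an external PostgreSQL
# database. Arguments: user, host, port, database name.
external_db_values() {
    cat <<EOF
# Assumption: internalDatabase is a top-level key in the chart values.
internalDatabase: false
dataloreEnv:
  DB_USER: "$1"
  DB_URL: "jdbc:postgresql://$2:$3/$4"
EOF
}
```

For example, `external_db_values datalore db.example.internal 5432 datalore` prints a fragment that you can merge into datalore.values.yaml. The host name here is a placeholder.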

Enable an email whitelist

Enable a whitelist for new user registration. Only users whose email addresses are on the whitelist can register.

  1. Open the datalore.values.yaml file.

  2. Add the following parameter:

    dataloreEnv:
      ...
      EMAIL_ALLOWLIST_ENABLED: "true"

The respective tab will be available in the Admin panel.

Enable user filtration based on Hub group membership

By default, all Hub users can get registered unless you disable registration in the Admin panel. If you want to grant Datalore access only to members of specific Hub groups, perform the steps below:

  1. Open the datalore.values.yaml file.

  2. Add the following parameter:

    dataloreEnv:
      ...
      HUB_ALLOWLIST_GROUP: 'group_name', 'group_name1'

Configure notebook code import limit

Set your own value, in bytes, to configure the limit on notebook code import.

  1. Open the datalore.values.yaml file.

  2. Add the following parameter:

    dataloreEnv:
      VFS_MAX_IMPORT_SOURCE_LENGTH: 'integer, prefixes (K-, M-, etc.) not supported'
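Because the value must be a plain integer in bytes, it can help to compute it rather than type it by hand. The sketch below converts a size in MiB to bytes and prints the matching values fragment. `mib_to_bytes` and `import_limit_values` are hypothetical helpers, and the 10 MiB figure in the usage note is only an illustration, not a recommended limit.

```shell
#!/bin/sh
# Hypothetical helper: converts a whole number of MiB to bytes.
mib_to_bytes() {
    echo $(( $1 * 1024 * 1024 ))
}

# Hypothetical helper: prints a dataloreEnv fragment with the limit in bytes.
import_limit_values() {
    cat <<EOF
dataloreEnv:
  VFS_MAX_IMPORT_SOURCE_LENGTH: "$(mib_to_bytes "$1")"
EOF
}
```

For example, `mib_to_bytes 10` prints `10485760`, and `import_limit_values 10` prints the fragment with that value.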

Fargate restrictions

While Datalore can operate in Fargate, be aware of the following restrictions:

  • Attached files and Reactive mode will not work due to Fargate security policies.

  • Spawning agents in privileged mode, as set up by default, is not supported by Fargate.

  • Fargate does not support EBS volumes, our default volume option. As a workaround, we currently suggest that you set up an AWS EFS file system, create PersistentVolume and PersistentVolumeClaim objects, and edit the datalore.values.yaml config file as shown in the example below:

    volumeClaimTemplates:
      - metadata:
          name: postgresql-data
        spec:
          accessModes:
            - ReadWriteMany
          storageClassName: efs-sc
          resources:
            requests:
              storage: 2Gi
      - metadata:
          name: storage
        spec:
          accessModes:
            - ReadWriteMany
          storageClassName: efs-sc
          resources:
            requests:
              storage: 10Gi

Further steps

After the basic installation, continue with the configuration procedures. Some of them are required because you need to customize Datalore On-Premises for your project.

Required:

| Procedure | Description |
| --- | --- |
| Configure agents | Used to change the default agents configuration |
| Set up GPU machines | Used to enable GPU machines |
| Configure plans | Used to customize plans for your Datalore users |

Optional:

| Procedure | Description |
| --- | --- |
| Customize or update environment | Used to create multiple base environments out of custom Docker images |
| Set up JetBrains Hub | Used to integrate an authentication service |
| Enable gift codes | Used to enable a service generating and distributing gift codes |
| Enable email service | Used to activate email notifications |
| Enable user activity logging | Used to set up auditing of your Datalore users |

We also recommend referring to this page for the full list of Datalore server configuration options.

Last modified: 18 August 2024