Helm-specific instructions

Configure Helm installation of Datalore Enterprise

Modify the following Helm values in the datalore.values.yaml file.

dataloreEnv

You must set the fields under this key for Datalore to work. The format is as follows:

dataloreEnv:
  KEY_NAME: "key_value"
  ...

Mandatory parameters

DATALORE_PUBLIC_URL

URL by which Datalore is accessed (DATALORE_ROOT_URL). It is used to generate links.

HUB_PUBLIC_BASE_URL

Base public (accessible via browser) URL of your Hub installation (${HUB_ROOT_URL}/hub from the Install Hub section, for example, https://hub.your.domain/hub).

HUB_DATALORE_SERVICE_ID

ID of the Datalore service in Hub (see Configure the Datalore service).

HUB_DATALORE_SERVICE_SECRET

Token of the Datalore service in Hub (see Configure the Datalore service).

HUB_PERM_TOKEN

Token for accessing Datalore and Hub scopes (see Create a Hub token).

HUB_FORCE_EMAIL_VERIFICATION

Specifies whether Datalore users are required to verify their email address.

DATABASES_BASE_URL

Must always be equal to "http://${SQL_CELLS_API_HOST}:${SQL_CELLS_API_PORT}".

SQL_CELLS_API_HOST

Internal hostname for the datalore service. Must be equal to DATALORE_INTERNAL_HOST.

DATALORE_INTERNAL_HOST

Internal hostname for the datalore service.

DEFAULT_INSTANCE_TYPE_ID

ID of the instance type that will be used by default (for more information, see the agentsConfig description).

DEFAULT_PACKAGE_MANAGER

Default package manager.

DEFAULT_BASE_ENV_NAME

Default environment. It must match one of the environments of the default package manager.

MAIL_ENABLED

If set to true, enables Datalore to send emails (welcome emails, sharing invitations, etc.) and requires the following parameters:

  • MAIL_SENDER_EMAIL: sender's email

  • MAIL_SENDER_NAME: sender's name

  • MAIL_SENDER_USERNAME: username of SMTP user

  • MAIL_SENDER_PASSWORD: password of SMTP user

  • MAIL_SMTP_SERVER: SMTP server host

  • MAIL_SMTP_PORT: SMTP server port
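As an illustration, the mail parameters listed above might be set under dataloreEnv as in the sketch below. The addresses, host, port, and credentials are hypothetical placeholders, and quoting every value as a string follows the KEY_NAME: "key_value" format described earlier.

```yaml
# Sketch only: all values below are hypothetical placeholders.
dataloreEnv:
  MAIL_ENABLED: "true"
  MAIL_SENDER_EMAIL: "datalore@your.domain"    # sender's email
  MAIL_SENDER_NAME: "Datalore"                 # sender's name
  MAIL_SENDER_USERNAME: "[smtp_username]"      # username of SMTP user
  MAIL_SENDER_PASSWORD: "[smtp_password]"      # password of SMTP user
  MAIL_SMTP_SERVER: "smtp.your.domain"         # SMTP server host
  MAIL_SMTP_PORT: "587"                        # SMTP server port (assumed value)
```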

ADMIN_API_AUTH_TOKEN

Environment variable for an API token to set up an admin user.
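Taken together, the remaining mandatory parameters could be filled in roughly as follows. This is a sketch, not a verified configuration: every bracketed value is a placeholder to replace with your own, the Datalore URL is hypothetical, and the "false" value for HUB_FORCE_EMAIL_VERIFICATION assumes the parameter takes true or false.

```yaml
# Sketch only: bracketed values are placeholders, not real defaults.
dataloreEnv:
  DATALORE_PUBLIC_URL: "https://datalore.your.domain"   # hypothetical URL
  HUB_PUBLIC_BASE_URL: "https://hub.your.domain/hub"    # example URL from the text above
  HUB_DATALORE_SERVICE_ID: "[service_id]"
  HUB_DATALORE_SERVICE_SECRET: "[service_secret]"
  HUB_PERM_TOKEN: "[hub_token]"
  HUB_FORCE_EMAIL_VERIFICATION: "false"                 # assumed to accept true/false
  DATALORE_INTERNAL_HOST: "[internal_host]"
  SQL_CELLS_API_HOST: "[internal_host]"                 # must equal DATALORE_INTERNAL_HOST
  # Resolved form of http://${SQL_CELLS_API_HOST}:${SQL_CELLS_API_PORT};
  # the port is not specified in this section, so it stays a placeholder.
  DATABASES_BASE_URL: "http://[internal_host]:[sql_cells_api_port]"
  DEFAULT_INSTANCE_TYPE_ID: "[instance_id]"             # an id defined under agentsConfig
  DEFAULT_PACKAGE_MANAGER: "[package_manager]"
  DEFAULT_BASE_ENV_NAME: "[environment_name]"
  ADMIN_API_AUTH_TOKEN: "[admin_api_token]"
```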

Optional parameters

HUB_INTERNAL_BASE_URL (default: http://hub:8082/hub)

URL to access Hub from inside the cluster. Used if HUB_PUBLIC_BASE_URL is only available from outside and not inside the cluster.

DATABASES_K8S_NAMESPACE (default: default)

Name of the Kubernetes namespace where Datalore is installed. Used if you plan to install Datalore in a namespace other than default.
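For reference, the two optional parameters written out explicitly with the default values stated above would look like this. Setting them to the defaults is redundant; the sketch only shows where an override would go.

```yaml
dataloreEnv:
  HUB_INTERNAL_BASE_URL: "http://hub:8082/hub"   # default; change if Hub is reachable at another in-cluster URL
  DATABASES_K8S_NAMESPACE: "default"             # default; change when installing into another namespace
```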

dbRootPassword

Used to set up the PostgreSQL password. There is one field to override:

  • ROOT_PASSWORD: root user's password. The database can be accessed on port 5432 with the username postgres and this password.
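A minimal sketch of this override, assuming ROOT_PASSWORD is nested directly under the dbRootPassword key; the password value is a placeholder.

```yaml
dbRootPassword:
  ROOT_PASSWORD: "[postgres_root_password]"   # placeholder; used with the username postgres on port 5432
```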

internalDatabase

Specifies whether Datalore uses its internal database. To use an external database (for example, AWS RDS), set it to false and specify DB_USER and DB_URL under the dataloreEnv key.

volume, volumeClaimTemplates

Used to configure persistent storage.

The configuration describes two Kubernetes volumes:

  • storage: contains workbook data, such as attached files (UID:GID 5000:5000).

  • postgresql-data: contains PostgreSQL database data (UID:GID 999:999).
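The sketch below shows how the two volumes might be requested through volumeClaimTemplates. It assumes the chart passes this key through using the standard Kubernetes StatefulSet syntax; the access mode and the storage sizes are hypothetical and should be adapted to your cluster.

```yaml
# Sketch only: assumes standard Kubernetes volumeClaimTemplates syntax.
volumeClaimTemplates:
  - metadata:
      name: storage                # workbook data, UID:GID 5000:5000
    spec:
      accessModes:
        - ReadWriteOnce
      resources:
        requests:
          storage: 10Gi            # hypothetical size
  - metadata:
      name: postgresql-data        # PostgreSQL database data, UID:GID 999:999
    spec:
      accessModes:
        - ReadWriteOnce
      resources:
        requests:
          storage: 2Gi             # hypothetical size
```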

agentsConfig

Used to define agent types (such as Basic and Large machines in the cloud version of Datalore). It has the following schema:

k8s:
  instances:
    - id: <Unique instance ID>
      label: <Instance name>
      description: <Short description of what the instance is>
      features: <Information to be displayed in the tooltip text when hovering over the instance>
      minAllowed: <Minimum number of instances to be preserved in the pool>
      maxAllowed: <Maximum number of instances to be preserved in the pool>
      numCPUs: <Number of CPUs>
      cpuMemoryText: <CPU memory>
      numGPUs: <Number of GPUs>
      gpuMemoryText: <GPU memory>
      yaml: <Kubernetes config of Pod to be used for the instance>
    - id: <Another type with the same schema as above>
    ...

The minAllowed and maxAllowed fields are used to configure the number of pre-created instances, which will speed up the process of starting up notebooks.
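To make the schema more concrete, here is one instance type filled in with hypothetical values. The ID, labels, pool sizes, and resource limits are all invented for illustration, the agent image is left as a placeholder, and writing the Pod config as a nested mapping under yaml (and features as a plain string) is an assumption about what the chart accepts.

```yaml
# Sketch only: every value below is hypothetical.
agentsConfig:
  k8s:
    instances:
      - id: "basic-agent"            # could be referenced by DEFAULT_INSTANCE_TYPE_ID
        label: "Basic"
        description: "Small machine for everyday notebooks"
        features: "2 CPUs, 4 GB RAM"
        minAllowed: 1                # keep at least one pre-created instance
        maxAllowed: 3                # keep at most three pre-created instances
        numCPUs: 2
        cpuMemoryText: "4 GB"
        numGPUs: 0
        gpuMemoryText: ""
        yaml:                        # assumed to accept a standard Pod manifest
          apiVersion: v1
          kind: Pod
          spec:
            containers:
              - name: agent
                image: "[agent_image]"   # placeholder; not specified in this section
                resources:
                  limits:
                    cpu: "2"
                    memory: "4Gi"
```

With minAllowed set to 1, one instance would already be waiting in the pool, so the first notebook that requests this type would not have to wait for a Pod to be created.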

logbackConfig

Used to collect logs from Datalore and agents. We provide the default one, which prints requested information to stdout, but you can configure it any way you like. Find more information on how to configure Logback in the official documentation.
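As a rough idea of what a custom configuration could look like, the sketch below embeds a minimal Logback configuration that prints to stdout. It assumes logbackConfig accepts the Logback XML as a multi-line string; the appender name, log level, and pattern are illustrative choices, not the defaults shipped with Datalore.

```yaml
# Sketch only: assumes logbackConfig takes the Logback XML as a block string.
logbackConfig: |
  <configuration>
    <appender name="STDOUT" class="ch.qos.logback.core.ConsoleAppender">
      <encoder>
        <pattern>%d{HH:mm:ss.SSS} [%thread] %-5level %logger{36} - %msg%n</pattern>
      </encoder>
    </appender>
    <root level="INFO">
      <appender-ref ref="STDOUT"/>
    </root>
  </configuration>
```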

Optional procedures

Run Datalore in a non-default namespace

  1. Specify the namespace when running Datalore:

    helm install -n [non_default_namespace] -f datalore.values.yaml <release_name> datalore/datalore
  2. Add the namespace under the agentsConfig key as shown in the code below:

    k8s:
      namespace: [non_default_namespace]
      instances: ...
  3. Add DATABASES_K8S_NAMESPACE: "[non_default_namespace]" under the dataloreEnv key.
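Steps 2 and 3 can be sketched together as one fragment of datalore.values.yaml. The namespace name datalore is only an example, and it should be the same namespace that is passed to helm install with -n in step 1.

```yaml
# Sketch only: "datalore" is an example namespace name.
agentsConfig:
  k8s:
    namespace: datalore              # step 2
    # instances: keep your existing instance definitions here

dataloreEnv:
  DATABASES_K8S_NAMESPACE: "datalore"   # step 3, must name the same namespace
```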

Use an external PostgreSQL database

  1. Add two variables under dataloreEnv:

    • DB_USER: "[database_user]" to specify the database user

    • DB_URL: "jdbc:postgresql://[database_host]:[database_port]/[database_name]" to specify the database URL

  2. Set internalDatabase to false.
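Put together, the two steps might look like the fragment below. The user name, host, and database name are hypothetical; 5432 is the standard PostgreSQL port and may differ in your setup.

```yaml
# Sketch only: user, host, and database name are hypothetical.
internalDatabase: false

dataloreEnv:
  DB_USER: "datalore"
  DB_URL: "jdbc:postgresql://db.your.domain:5432/datalore"
```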

Last modified: 11 August 2022