Datalore 2024.1 Help

Git repositories

This section explains how to work with Git in Datalore.

Install a Git repository in your Datalore notebook environment

If you or your team store a collection of Python scripts or a pip-compatible package in Git, you can access that repository from Jupyter notebooks in Datalore. The table below describes the ways to do this. When choosing a method, consider factors such as the repository access level and the repository type.

| | Using Environment tool | Using Terminal or IPython magic commands | Using team’s base environment (Enterprise-only feature) |
|---|---|---|---|
| Repository access level | From a chosen notebook only | From a chosen notebook or from each notebook of a chosen workspace | From any team member's notebook in any workspace |
| Repository type | Public Git repositories and private Git repositories accessed via SSH | Any private or public Git or non-Git repositories (Artifactory, Space Packages, privately hosted PyPI repositories) | Any private or public Git or non-Git repositories (Artifactory, Space Packages, privately hosted PyPI repositories) |
| Installation specifics | Installed on demand, refreshable at any time from the UI | Installed on demand using Git CLI. Certain options can be automated with init.sh and installed on notebook computation start | Installed as a part of the custom Docker image |
| Refresh type | Refresh button and restart kernel | Using Git CLI via Terminal | Rebuild Docker image |
| Available actions | Clone, pull | Clone, pull, push | Clone on image creation |

The main benefit of cloning Git repositories to Datalore is that you gain access to custom Python modules, scripts, or functions, which you can then edit collaboratively in Datalore.

Clone a Git repository using the Environment tool

This is the easiest way to install a publicly available Git repository from the user interface into a single Datalore notebook. This method allows you to choose the repository's branch and refresh the connection.

  1. Open the notebook and click the Environment icon on the left-hand sidebar of the editor.

  2. Switch to the Repositories tab.

  3. Click the Add new button. The Add repository dialog opens.

  4. Enter the repository URL and click the Check button.

  5. Select the branch you need and click the Add button.

  6. After the package is installed, click restart kernel in the notification popup to complete the environment update.

If you want to access a private Git repository with a personal token or via credentials, use the init.sh script.

Clone a Git repository using Terminal

Use the Terminal tool to clone a repository with Git CLI commands into your notebook or workspace files.

  1. In the editor, go to Main menu | Tools | Terminal. This opens a terminal session.

  2. Use Git CLI commands to clone a repository or get a fresh version of it in your notebook or workspace files.

  3. (Optional) To use the repository in all the notebooks of the workspace, clone it to Workspace files.

  4. (Optional) To access the repository contents from the notebook, import the necessary functions. Datalore provides you with code completions and documentation popups for imported Python modules.
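Steps 2 and 4 can also be scripted from a notebook code cell. The following is a minimal sketch rather than Datalore's own mechanism: it assumes that git is available in the notebook environment, and the helper names (`git_sync_command`, `sync_repo`, `add_to_import_path`), the repository URL, and the directory name are hypothetical placeholders.

```python
import subprocess
import sys
from pathlib import Path


def git_sync_command(repo_url: str, target_dir: Path) -> list[str]:
    """Return `git clone` for a new directory, or `git pull` for an existing clone."""
    if (target_dir / ".git").is_dir():
        # The repository was cloned earlier: get a fresh version instead.
        return ["git", "-C", str(target_dir), "pull"]
    return ["git", "clone", repo_url, str(target_dir)]


def sync_repo(repo_url: str, target_dir: Path) -> None:
    """Run the clone or pull command; raises CalledProcessError on failure."""
    subprocess.run(git_sync_command(repo_url, target_dir), check=True)


def add_to_import_path(target_dir: Path) -> None:
    """Make modules in the cloned directory importable from the notebook."""
    path = str(target_dir.resolve())
    if path not in sys.path:
        sys.path.insert(0, path)


# Hypothetical usage in a notebook cell (placeholder URL and directory):
# repo_dir = Path("my_repo")
# sync_repo("https://example.com/team/my_repo.git", repo_dir)
# add_to_import_path(repo_dir)
# import my_module  # a module that lives in the cloned repository
```

Keeping the command construction separate from the call to `subprocess.run` makes it easy to check which Git command will run before anything touches the file system.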

To make an edited init.sh script available across the whole workspace, do the following using the Attached data tool:

  • Make sure your Workspace files are attached to the notebook.

  • Move the init.sh file from Notebook files to Workspace files.

Find more details about this in Attached files.

Clone a Git repository using a team's base environment

If you want to provide centralized access to a certain repository for your whole team, make this repository part of a custom base environment. Base environments are custom Docker images that users can select as pre-built configurations when creating a new notebook in Datalore.

    Edit Git repository contents in Datalore

    If you want to edit Python scripts or files available in your Git repository, you can clone the repository to your Attached data in one of the following ways:

    • Select Main menu | Tools | Terminal. This will open a terminal session for you to execute the required Git CLI commands.

    • Use IPython magic commands directly inside the notebook’s code cells.

    After you’ve cloned the repository to your Attached data, you can edit the file contents collaboratively. For Python files, Datalore provides code completion and syntax highlighting. To use the updated functions in your notebook, restart the kernel or use the autoreload extension:

    ...
    %load_ext autoreload
    %autoreload 2
    ...
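The kernel keeps an imported module in memory, so an edit to the file on disk is not visible until the module is reloaded. The sketch below illustrates this with the standard-library `importlib.reload`, used here as a manual stand-in for what the autoreload extension does automatically. The module name `demo_helpers` and its `greet` function are hypothetical, and the temporary file stands in for a file from a cloned repository.

```python
import importlib
import sys
import tempfile
from pathlib import Path

# Avoid cached bytecode so the reload below always reads the edited source.
sys.dont_write_bytecode = True

# Create a throwaway module that stands in for a file from a cloned repository.
module_dir = Path(tempfile.mkdtemp())
module_file = module_dir / "demo_helpers.py"
module_file.write_text("def greet():\n    return 'first version'\n")

sys.path.insert(0, str(module_dir))
importlib.invalidate_caches()
import demo_helpers

before = demo_helpers.greet()

# Edit the file on disk, as you would in the editor.
module_file.write_text("def greet():\n    return 'second version, edited'\n")

# Without a reload, the kernel still holds the old function.
stale = demo_helpers.greet()

# Reloading re-executes the module source and picks up the edit.
importlib.invalidate_caches()
importlib.reload(demo_helpers)
after = demo_helpers.greet()

print(before, stale, after, sep=" | ")
# prints: first version | first version | second version, edited
```

With `%autoreload 2` enabled, the explicit `importlib.reload` call is not needed, because the extension reloads modules before each cell runs.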

    Version your data science work with Git and Datalore

    Track changes using the History tool

    Use the History tool to keep track of changes in the notebook. This tool will allow you to:

    • Create checkpoints to save notebook states

    • View your previously saved states and revert to them

    • Find the differences between the current notebook version and its checkpoints

    • View changes made by your collaborators

    Datalore also creates checkpoints automatically so that potentially dangerous actions, such as deleting a cell, can be reverted.

    Find more details in History.

    Version files using Terminal

    You can also version the files you work on by committing and pushing files or folders to your Git repositories using Terminal.
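A typical stage, commit, and push sequence can be sketched as follows. The commands are standard Git CLI calls that you can equally type in Terminal. The helper names, file path, commit message, and directory are hypothetical, and the sketch assumes that the directory is a cloned repository with push access already configured.

```python
import subprocess


def version_commands(paths: list[str], message: str) -> list[list[str]]:
    """Build the Git CLI calls that stage, commit, and push the given paths."""
    return [
        ["git", "add", "--", *paths],
        ["git", "commit", "-m", message],
        ["git", "push"],
    ]


def version_files(paths: list[str], message: str, repo_dir: str = ".") -> None:
    """Run each command in order inside repo_dir; stop at the first failure."""
    for command in version_commands(paths, message):
        subprocess.run(command, cwd=repo_dir, check=True)


# Hypothetical usage (placeholder path, message, and directory):
# version_files(["analysis/utils.py"], "Update helper functions", repo_dir="my_repo")
```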

    Last modified: 20 February 2024