DataSpell Quick Start Guide

Thank you for your interest in DataSpell, an Integrated Development Environment (IDE) dedicated to exploratory data analysis and prototyping machine learning models. This quick tour walks you through a typical data processing workflow.

After you have installed DataSpell and launched it for the first time, you’ll be asked to tune the IDE settings.

Setting up the environment

Configuring an environment for the default workspace is one of the most important steps:

If you already have any Conda environment, it will be automatically suggested. If you don’t have Conda on your machine, DataSpell will provide a link to download and install it.
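Once an environment is selected, one way to confirm which interpreter a notebook or script actually runs on is to print a few interpreter details. The following is a minimal sketch that uses only the standard library. The helper name describe_environment is hypothetical. The CONDA_DEFAULT_ENV variable is normally set when a Conda environment is activated, so it may be missing depending on how the interpreter was started.

```python
import os
import platform
import sys


def describe_environment():
    """Return basic facts about the interpreter running this code."""
    return {
        "executable": sys.executable,            # path to the Python binary
        "prefix": sys.prefix,                    # root folder of the environment
        "python_version": platform.python_version(),
        # Assumption: set by `conda activate`; None if not set.
        "conda_env": os.environ.get("CONDA_DEFAULT_ENV"),
    }


info = describe_environment()
for key, value in info.items():
    print(f"{key}: {value}")
```

If the printed prefix does not point to the environment you expected, revisit the environment configured for the workspace.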

Attaching local folders to the workspace

After the first launch, the next step is to add your data files to the workspace.

Click the Attach new or existing directory link in the Workspace tool window and open the target directory. You can add new notebooks to the workspace. See Create a Notebook File for more details.

Connecting to a remote Jupyter server

If you have to work with a remote Jupyter server, click the icon on the toolbar, select Connect to Jupyter server using URL, and specify a server URL.

If you prefer to work with Jupyter notebooks locally, you don’t have to connect to a remote Jupyter server. Just attach a local folder with your notebooks, open a notebook, and run its cells. The IDE will take care of starting the Jupyter server locally using your selected environment.
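A remote Jupyter server URL often carries an access token as a query parameter, for example http://host:8888/?token=... The sketch below shows one way to sanity-check such a URL before pasting it into the IDE. The function name split_jupyter_url and the example URL are hypothetical, and the token-in-query layout is an assumption about how your server was started.

```python
from urllib.parse import parse_qs, urlsplit


def split_jupyter_url(url):
    """Split a server URL into its base address and optional token."""
    parts = urlsplit(url)
    if parts.scheme not in ("http", "https") or not parts.netloc:
        raise ValueError(f"Not an HTTP(S) server URL: {url!r}")
    # parse_qs returns a list per key; take the first token if present.
    token = parse_qs(parts.query).get("token", [None])[0]
    base = f"{parts.scheme}://{parts.netloc}{parts.path or '/'}"
    return base, token


# Hypothetical URL used only for illustration.
base, token = split_jupyter_url("http://example.com:8888/?token=abc123")
print(base)
print(token)
```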

Editing Jupyter notebook files

Once you have attached your files to the workspace or connected to a remote Jupyter server, you can work with your Jupyter notebooks, Python scripts, and other files.

The most popular commands for editing cells are available in the Cell group of the main menu. You can also right-click a cell to get cell-specific commands in the context menu.

If you are not sure which command or action you’re looking for, press Shift twice and start typing “Cell”. The IDE will show you an extended list of actions that you might find helpful.

When editing Jupyter notebooks and Python scripts, you can use code insight features such as syntax highlighting and code completion. See more details in Editing Jupyter Notebook Files.

Running notebooks

You can execute the code of the notebook cells in several ways: with the icons on the notebook toolbar and cell toolbars, with the commands of the code cell context menu (right-click the code cell to open it), or with the Run commands of the main menu.

Note that when you work with local notebooks, you don’t need to start any Jupyter server in advance: just execute a cell and the server will be launched. The Variables tab of the Jupyter tool window provides detailed information about the variables during the current execution session.
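To try this out, a first cell can be as small as the sketch below. The data is made up for illustration. After the cell runs, names such as temperatures_c, mean_c, and summary are the kind of variables you would expect to see listed for the current session.

```python
# Hypothetical sample data: temperatures in degrees Celsius.
temperatures_c = [18.5, 21.0, 19.2, 23.4]

# A scalar, a derived list, and a dictionary, so that several
# variable types are present after the cell has executed.
mean_c = sum(temperatures_c) / len(temperatures_c)
temperatures_f = [round(c * 9 / 5 + 32, 1) for c in temperatures_c]
summary = {"count": len(temperatures_c), "mean_c": mean_c}

print(summary)
```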

Processing the execution output

When a cell produces a data frame, you can preview it in tabular form. To open a data frame in an editor tab, right-click the cell output and select Open in New Tab.

You can copy the selected fragment or all cells of the table, save the output in the *.csv format, sort data in a column by clicking its header, and copy a column header or all headers of the table to the clipboard. Just right-click any table header to get the context menu and select the target command.
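The same sorting and CSV export can also be done in code, which is useful when the steps need to be repeatable. The following is a minimal sketch that assumes pandas is installed in the selected environment. The column names and values are invented for illustration.

```python
import io

import pandas as pd

# Hypothetical data used only to have something to display.
df = pd.DataFrame(
    {
        "product": ["apples", "pears", "plums"],
        "units_sold": [10, 30, 20],
    }
)

# Sort by one column, as clicking a column header would.
sorted_df = df.sort_values("units_sold", ascending=False)

# Write CSV to an in-memory buffer; pass a file path instead
# to save the result to disk.
buffer = io.StringIO()
sorted_df.to_csv(buffer, index=False)
csv_text = buffer.getvalue()

print(csv_text)
```

In a notebook, ending the cell with the bare expression sorted_df instead of print displays the data frame as cell output.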

If your notebook cell involves any code that plots charts, you can save the chart as an image: right-click the output and select Save As from the context menu.

If an error occurs during code execution, use the visual debugger to find and fix the problematic code. See more details in Debug code in Jupyter notebooks.

Running Python scripts

With DataSpell, you can write and execute Python code. The simplest way to do it is to run the file in the Run tool window. However, you can also execute your script in the interactive read-evaluate-print loop (REPL) Python Console. You can even execute a selected fragment of your code, or a Python code cell.
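The script below sketches a layout that suits all of these modes. It assumes that lines starting with # %% are recognized as code cell separators, a convention shared by several Python tools. The function name summarize and the data are hypothetical.

```python
# %% Imports and (hypothetical) input data
import statistics

measurements = [2.5, 3.1, 2.8, 3.6]


# %% A small function that can also be called from the Python Console
def summarize(values):
    """Return the mean and sample standard deviation of values."""
    return {
        "mean": statistics.mean(values),
        "stdev": statistics.stdev(values),
    }


# %% Entry point used when the whole file is run
if __name__ == "__main__":
    print(summarize(measurements))
```

Keeping the logic in a function and the call under the __main__ guard means the file behaves the same whether it is run as a whole, executed cell by cell, or imported into a console session.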

In addition to Jupyter notebooks and Python scripts, DataSpell supports Database Tools and SQL, as well as Big Data Tools.