DataGrip 2020.1 Help

Big Data Tools

The Big Data Tools plugin is available for DataGrip 2020.1 and later. It adds tools for monitoring and processing data with S3, Spark, Google Cloud Storage, and Hadoop Distributed File System (HDFS).

User interface of the IDE with the Big Data Tools plugin enabled

Getting started with Big Data Tools in DataGrip

The basic workflow for big data processing in DataGrip includes the following steps:

Configure your environment

  1. Install the Big Data Tools plugin.

  2. Create a new project in DataGrip.

  3. Configure a connection to the target server.

  4. Work with your data files.

Get familiar with the user interface

When you install the Big Data Tools plugin for DataGrip, the following user interface elements appear:

Big Data Tools window

The Big Data Tools window appears in the rightmost group of tool windows. It lists the configured servers and their files, organized by folder.

You can navigate through the directories and preview the column structure of .csv and .parquet files.
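Outside the IDE, a comparable column preview of a .csv file can be sketched in Python. The example below is only an illustration and does not show how the plugin works internally. It assumes the file has a header row; the function name preview_columns and the sample data are hypothetical. A .parquet file would need a third-party library, which is not covered here.

```python
import csv
import io


def preview_columns(text, sample_rows=5):
    """Return (column_names, sample) for CSV text that starts with a header row.

    Hypothetical helper for illustration; not part of the Big Data Tools plugin.
    """
    reader = csv.reader(io.StringIO(text))
    # The first row is assumed to hold the column names.
    header = next(reader, [])
    # Read at most sample_rows data rows after the header.
    sample = [row for _, row in zip(range(sample_rows), reader)]
    return header, sample


# Made-up sample data, used only to demonstrate the helper.
SAMPLE = "id,name,score\n1,alice,9.5\n2,bob,7.25\n"

columns, rows = preview_columns(SAMPLE)
print(columns)
for row in rows:
    print(row)
```

All values are returned as strings, because the csv module does not infer column types.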

Basic operations on data files are available from the context menu. You can also move files by dragging them to the target directory on the target server.

Data files in the BDT window

For basic operations on servers, use the window toolbar:

Item                   Description

Add connection         Adds a new connection to a server.

Refresh Connection     Refreshes connections to all configured servers.

Connection settings    Opens the connection settings for the selected server.

Spark tool windows

Spark monitoring tool window

This window appears after you create a connection to a Spark server.

Last modified: 26 May 2020