DataSpell 2024.1 Help

Data extractors

Enable the Database Tools and SQL plugin

This functionality relies on the Database Tools and SQL plugin, which is bundled and enabled in DataSpell by default. If the relevant features aren't available, make sure that you didn't disable the plugin.

  1. Press Ctrl+Alt+S to open the IDE settings and then select Plugins.

  2. Open the Installed tab, find the Database Tools and SQL plugin, and select the checkbox next to the plugin name.

Data extractors set rules of how to copy or view your data in the editor (in the Text editor view mode). You can select a default extractor from the Data Extractors list near the Export Data icon (The Export Data icon), configure the existing extractor, or create a custom extractor by using Groovy or JavaScript.

Data extractor list on the data editor toolbar

The last selected extractor becomes the default one for newly opened editor tabs. For MongoDB, the default extractor is always JSON.

In the following video, the data is copied using the CSV data extractor, and viewed in the Text editor view mode using the SQL Updates and Pipe-separated data extractors.

Using data extractors, you can export database data as SQL statements and in various formats. When you export to a file, a separate file is created for each individual table or view. For more information about exporting, refer to the Export section.

The list of available data extractors is as follows:

  • Built-in data extractors. Use them to export data as a set of INSERT or UPDATE statements, TSV and CSV files, Excel XLSX files, Markdown, HTML tables and JSON format.

    Built-in data extractor

    Export file format

    SQL Inserts

    .sql

    SQL Updates

    Where Clause

    One-row

    SQL-Insert-Multirow

    SQL-Insert-Statements

    Excel (xlsx)

    .xlsx

    CSV

    .csv

    TSV

    .tsv

    Pipe-separated

    .txt

    Pretty

    HTML (groovy)

    .html

    HTML (js)

    JSON

    .json

    Markdown

    .md

    XML

    .xml

  • Custom data extractors. Create them using Groovy or Javascript and the provided API.

  • Extractors for delimiter-separated values. Use them to create your own format that is based on CSV or any DSV format.

Configure an extractor for delimiter-separated values

You can extend the default functionality and create your own format that is based on CSV or any DSV format. In its settings, you can set separators for rows and headers, define text for NULL values, specify quotation, create new extractors for formats with delimiter-separated values.

  1. From the list of data extractors, select Configure CSV Formats.

  2. In the CSV Formats dialog, click the Add Format icon The Add Format icon.

  3. Specify a name of a new format (for example, Confluence Wiki Markup).

  4. Define settings of the format: set separators for rows and headers, define text for NULL values, specify quotation. Click OK.

    When the format is created, you can select it in the drop-down list near the Export Data icon (The Export Data icon).

For more information about the CSV Formats dialog, refer to the reference.

Add a custom extractor

You can create your own extractor that you can write on Groovy or JavaScript.

  1. In the Workspace tool window ( Window | Tool Windows | Workspace) , navigate to Scratches and Consoles | Extensions | Database Tools and SQL | data | extractors.

  2. Right-click the extractors node and select New | File.

  3. Type the filename (for example, PHP.array.groovy), where:

    • PHP: the name that is used in the extractor menu.

    • array: output of the extractor.

    • groovy: the extension of the script.

  4. Type or paste the code of the extractor in a new file. You can use code of other extractors in the extractors folder to write your extractor. Also, you can try the following extractors:

  5. Select the extractor from the list of extractors.

    Consider the following example of adding the PHP-array extractor in DataSpell.

API for custom data extractors

Use the following API to create your custom data extractor.

Binding

Methods (Groovy and JavaScript)

Description

DIALECT

getDbms() followed by either one of the following methods:

  • isOracle

  • isMysql

  • isPostgres

  • isBigQuery

  • isRedshift

  • isGreenplum

  • isVertica

  • isMicrosoft

  • isSybase

  • isDb2

  • isHsqldb

  • isH2

  • isDerby

  • isSqlite

  • isExasol

  • isClickHouse

  • isCassandra

  • isHive

  • isSpark

  • isSnowflake

  • isMongo

  • isCouchbase

  • isCockroach

  • isTransactSql

  • isDocumentOriented

Database dialect

TABLE

  • getName()

  • getParent().getName()

Database table metadata

ALL_COLUMNS

  • int columnNumber()

  • String name()

A list of all columns in the query result or currently open table editor.

COLUMNS

  • int columnNumber()

  • String name()

A list of selected columns in the query result or currently open table editor.

FORMATTER

  • format(Row, Column)

  • formatValue(Object, Column)

  • getTypeName(Object, Column)

  • isStringLiteral(Object, Column)

Default formatting methods.

OUT

append(String)

Extractor output.

ROWS

  • int rowNumber()

  • boolean first()

  • boolean last()

  • java.util.List<java.lang.Object> data()

  • Object value(Column)

Row data stream.

TRANSPOSED

Boolean value that indicates whether the table editor is in transposed mode.

Example

You can use the following simple CSV data extractor as a base for your custom data extractors. The full version of this data extractor is available in DataSpell. To locate it, in the Workspace tool window , navigate to Scratches and Consoles | Extensions | Database Tools and SQL | data | extractors

ROWS.each { row -> COLUMNS.eachWithIndex { column, i -> if (i > 0) OUT.append(",") OUT.append(FORMATTER.format(row, column)) } OUT.append("\n") }

Reference

CSV Formats dialog

To access this dialog, click the Data Extractors list on the toolbar and select Configure CSV Formats.

This dialog contains the settings for converting table data into delimiter-separated values formats (for example, CSV, TSV) and vice versa.

When you change settings, the preview changes correspondingly.

Item

Description

Formats

Select a template that successfully converts the file data into a table. You can change settings of predefined templates or add a new template. To add a template, click Add Format button (the Add Format icon).

Use the Add Format (), Remove Format (), Up () and Down () buttons to create, delete and reorder the formats; Copy Format (the Copy Format icon) to create a copy of the selected format.

Value separator

Select or type the character that you want to use as a separator for values.

Row separator

Select or type the character that you want to use as a separator for rows.

Null value text

Select or type the text that you want to use if a cell contains the NULL value.

Add row prefix/suffix

Click the link and type a row prefix and suffix. Prefix and suffix are character sequences which in addition to the row separator indicate the beginning and end of a row.

Quotation

Each line in the area under Quotation is a quotation pattern. A quotation pattern includes:

  • Left: a quotation character that is inserted before a value.

  • Right: a quotation character that is inserted after a value.

  • Escape: an escape method or character for the cases when the quotation character is part of a value. The <duplicate> value means that if a quotation character occurs within a value, it is doubled. You can specify your own escape character.

If there is more than one pattern, the first pattern is used.

Use the Add (), Remove (), Up () and Down () buttons to create, delete and reorder the patterns.

Quote values

Select when you want to enclose values within quotation characters.

  • Never: do not quote values.

  • When needed: quote a value if it contains the value or the row separator.

  • Always: quote all the values.

Trim whitespaces

Ignore or remove whitespace characters. If this checkbox is cleared, the whitespace characters are treated as parts of the corresponding values.

First row is header

Treat the first row as a row that contains column names.

First column is header

Treat the first column as a column that contains row names.

Last modified: 29 March 2024