Create and configure dbt project

Before you start

Make sure that the following prerequisites are met:

You are working with DataSpell version 2023.3 or later. If you do not have DataSpell yet, download it from this page. To install DataSpell, follow the instructions for your platform.
You have access to a data platform.

Create a dbt project

To create a project, do one of the following:
- Click the Project widget, and then select New Project.
- On the Welcome screen, select Projects and then click New Project.
In the New Project dialog, select the dbt project type.
Specify the project name in the Name field and location in the Location field. DataSpell will create the project directory in the provided location.
To work with dbt, you need a profiles.yml file that contains the connection settings for your data platform.
If you already have a profiles.yml file, specify Profiles location and select the Profile to load.
Click Create.

Explore project structure

The newly created project contains dbt-specific files and directories.

The structure of the project is visible in the Project tool window (Alt+1):

The analyses directory stores ad hoc SQL queries and analyses that are not part of the main data transformation logic. These queries are often used for exploratory analysis or one-time investigations.
The macros directory stores SQL files that define reusable snippets of SQL code called macros. Macros encapsulate commonly used SQL patterns, which makes your code more modular and easier to maintain.
The models directory is one of the most important directories in a dbt project. It is where you define your dbt models: SQL files that contain the logic for transforming and shaping your data. Models are the core building blocks of a dbt project.
The seeds directory stores the seed data of a dbt project. Seeds are static datasets, typically CSV files, that you create and manage manually. Unlike source tables, which dbt reads directly from the data warehouse, seeds are files that dbt loads into the warehouse as tables, which your models can then reference.
The snapshots directory stores snapshot definitions. Snapshots are useful when you want to capture changes in the data over time.
The tests directory is where you define tests for your dbt models. Tests help ensure the quality of data transformations by checking for expected outcomes, for example, that a column contains no null values or that its values are unique.
The dbt_project.yml file is the main configuration file of your dbt project. It contains settings such as the project name, the profile to use, the paths to the project directories, and model configurations.
The README.md file provides a short welcome and a list of useful resources.

These directories and files collectively provide a structured environment for developing, testing, and documenting your data transformations using dbt.
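
As a sketch of the column checks mentioned for tests: besides SQL tests kept in the tests directory, dbt also lets you declare built-in generic tests such as unique and not_null in a YAML properties file stored next to the models. The file name models/schema.yml, the model name customers, and the column customer_id below are hypothetical and serve only as an illustration.

```yaml
# models/schema.yml (hypothetical file, model, and column names)
version: 2

models:
  - name: customers            # assumes a model defined in models/customers.sql
    columns:
      - name: customer_id
        tests:                 # dbt 1.8 and later also accept the key data_tests
          - unique             # fails if any value occurs more than once
          - not_null           # fails if any value is NULL
```

With a file like this in place, the dbt test command would run both checks against the built customers model.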

Configure profiles.yml file

When you run a dbt command, dbt reads the dbt_project.yml file to find the profile name (the value of the profile key, which by default matches the project name), and then looks for a profile with that name in the profiles.yml file.
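
To make the lookup concrete, here is a minimal sketch of a dbt_project.yml file, modeled on the defaults that dbt generates for a new project. The name my_dbt_project is a placeholder, and the file that DataSpell creates for you may differ in details.

```yaml
# dbt_project.yml (sketch); my_dbt_project is a placeholder name
name: 'my_dbt_project'
version: '1.0.0'
config-version: 2

# dbt looks for a top-level key with this name in profiles.yml
profile: 'my_dbt_project'

# Paths to the directories of the project structure
model-paths: ["models"]
analysis-paths: ["analyses"]
test-paths: ["tests"]
seed-paths: ["seeds"]
macro-paths: ["macros"]
snapshot-paths: ["snapshots"]
```

In this sketch, profiles.yml would need a top-level entry named my_dbt_project for dbt to find the connection settings.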

Create a profiles.yml file in the .dbt directory under your home directory (~/.dbt/profiles.yml), and configure it with the information required to connect to your data warehouse:

            # example profiles.yml file
            your_project's_name:
              target: dev
              outputs:
                dev:
                  type: postgres
                  host: localhost
                  user: jetbrains
                  password: <password>
                  port: 5432
                  database: sakila
                  schema: dbt_jetbrains
                  threads: 4
        
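If you prefer not to store the password in plain text, dbt can read values from environment variables through its env_var function. The following is a minimal sketch of the same dev target, assuming that a hypothetical environment variable named DBT_PASSWORD is set and visible to the process that runs dbt.

```yaml
# profiles.yml (sketch); DBT_PASSWORD is a hypothetical variable name
your_project's_name:
  target: dev
  outputs:
    dev:
      type: postgres
      host: localhost
      user: jetbrains
      # Resolved from the environment when dbt parses the profile
      password: "{{ env_var('DBT_PASSWORD') }}"
      port: 5432
      database: sakila
      schema: dbt_jetbrains
      threads: 4
```

If the variable is not set, dbt reports an error when it loads the profile, so a missing password is detected before any query is sent.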

Configure data source

Depending on your database vendor, configure the corresponding data source. DataSpell uses it to connect to your data platform.

Navigate to Settings | Languages & Frameworks | dbt.
Click Add data source.
Select Data Source and choose the database vendor.
Configure the connection settings in the Data Sources and Drivers dialog.
Click OK.

Check warehouse connection

To check the connection to your warehouse, run the `dbt debug` command.

Possible error	Solution
`Could not find profile named 'your_project's_name'`	Create and configure the profiles.yml file. If you already have a profiles.yml file, add a profile for the project you are working with to that file.
`Could not find adapter type adapter_name`	Install or upgrade the adapter for your data platform. For example, to install the postgres adapter, run `pip install --upgrade dbt-postgres`.

Last modified: 29 April 2024