Whether you work with CSV files, S3 buckets, or SQL databases, Datalore offers you easy ways to access and query your data from multiple data sources in one notebook.
Datalore comes with persistent internal storage for fast access to your notebooks and other work artifacts.
Whether you upload local files and folders, import data by link, or download files from code, all the data is stored in notebook files. When you share a notebook with collaborators, its notebook files are shared automatically.
Share datasets among multiple notebooks via Workspace files. When working in a shared workspace, you can upload a dataset once and it becomes available to every workspace editor.
Connect your notebooks to databases right from the editor with a few clicks, and query your data with native SQL cells without passing your credentials to the environment.
Datalore supports username and password authentication for Amazon Redshift, Azure SQL Database, MariaDB, MySQL, Oracle, PostgreSQL, Snowflake, and more. Please contact us via datalore-support@jetbrains.com if you have specific questions about database connectivity.
Choose specific database schemas and tables for introspection when creating a database connection in Datalore. This speeds up the initial introspection and makes schema navigation easier.
Connect to your remote databases using SSH tunneling in Datalore. This creates an encrypted SSH connection between Datalore and your gateway server, which makes it possible to reach databases that are not exposed to a public network.
Mount AWS S3 and GCS buckets as folders directly to the notebook without passing your credentials to the environment.
Apart from the supported data source connections via the user interface, you can connect any bucket, database, or data storage from code as you would normally with a Jupyter notebook.
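As a minimal sketch of connecting from code, the example below uses Python's built-in sqlite3 module with an in-memory database as a stand-in for a real data source. The table name, columns, and the helper total_by_region are hypothetical and chosen only for illustration; with a real database you would swap in the appropriate driver and your own connection details.

```python
import sqlite3

# Assumption: an in-memory SQLite database stands in for a real data source.
conn = sqlite3.connect(":memory:")

# Hypothetical table and sample rows, for illustration only.
conn.execute(
    "CREATE TABLE orders (id INTEGER PRIMARY KEY, region TEXT, amount REAL)"
)
conn.executemany(
    "INSERT INTO orders (region, amount) VALUES (?, ?)",
    [("EU", 120.0), ("US", 80.5), ("EU", 42.25)],
)
conn.commit()


def total_by_region(connection):
    """Return a dict mapping each region to the sum of its order amounts."""
    rows = connection.execute(
        "SELECT region, SUM(amount) FROM orders GROUP BY region ORDER BY region"
    ).fetchall()
    return dict(rows)


totals = total_by_region(conn)
print(totals)
```

The same pattern (open a connection, run a query, work with the rows in Python) applies to other drivers that follow the Python DB-API.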
Add native SQL cells to query your database connections. In addition to SQL syntax highlighting, you get code completion based on the introspected database tables. The query result is automatically transferred to a pandas DataFrame, so you can continue working on the dataset in Python.
In Datalore, you can now use variables (strings, numbers, booleans, lists) defined in Python code inside SQL cells. This lets you build interactive reports with parameterized queries, reduces the amount of SQL you need to write, and gives report users a better UI.
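To illustrate the general idea of a parameterized query, here is a sketch in plain Python using the built-in sqlite3 module. It does not show Datalore's SQL cell syntax; it only demonstrates how a list, a number, and a boolean defined in Python can drive a single query. The sales table and all variable names are hypothetical.

```python
import sqlite3

# Assumption: an in-memory SQLite database with a hypothetical sales table.
conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE sales (region TEXT, amount REAL, refunded INTEGER)")
conn.executemany(
    "INSERT INTO sales VALUES (?, ?, ?)",
    [("EU", 120.0, 0), ("US", 80.5, 0), ("EU", 42.25, 1), ("APAC", 300.0, 0)],
)

# Python variables that a report user might change.
regions = ["EU", "US"]    # list
min_amount = 50           # number
include_refunded = False  # boolean

# One placeholder per list element; values are bound, not pasted into the SQL.
placeholders = ", ".join("?" for _ in regions)
sql = (
    "SELECT region, amount FROM sales "
    f"WHERE region IN ({placeholders}) "
    "AND amount >= ? "
    "AND (refunded = 0 OR ?) "
    "ORDER BY amount"
)
params = [*regions, min_amount, include_refunded]

rows = conn.execute(sql, params).fetchall()
print(rows)
```

Binding values through placeholders rather than formatting them into the SQL string keeps the query text fixed while the inputs change, which is what makes a report reusable across different filter settings.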