🔌 Add a Databricks connector

How to connect a Databricks cluster in Toucan

Connector features

You can use the Toucan Databricks connector to connect to your Databricks account with a personal access token (PAT) and access tables or views with a SQL query.

With this connection, you can fetch data from Databricks to fill your charts and dashboards.

Configuring a Databricks connection in Toucan

Retrieve ODBC connection information from Databricks as described here

Follow the steps described in Add a connector, choose Databricks and fill out the form with the following info:

| Field | Format / Type | Description | Example |
|---|---|---|---|
| Name (mandatory) | String | Use it to identify your connection | MyDatabricksConnection |
| Host (mandatory) | String | Hostname of the Databricks cluster; it can be found in the cluster configuration | my-databricks-cluster.cloudprodiverdatabricks.net |
| Port (mandatory) | Integer | The listening port of your Databricks cluster | 443 (default) |
| Http Path (mandatory) | String | Databricks compute resources URL; it can be retrieved from the 'ODBC' section of the cluster's configuration in the Databricks UI | sql/protocol/v1/o/xxx/yyy |
| User (mandatory) | String | "token" if you use a personal access token (PAT), or your username if you connect with username/password (deprecated since July 2024) | databricks_user |
| Password (mandatory) | String | Access token (generated from the Databricks UI in user settings); it will be stored as a secret | dapixxxxxx |
| ANSI | Boolean | Enforce compliance with the ANSI SQL standard for SQL operations and behaviors | |
| On Demand | Boolean | If your cluster is self-stopping, make sure to tick this option. The connector will then try to start the cluster if it is stopped before running any query | |
| Retry Policy (optional) | Boolean | Lets you configure a retry policy if the connection is flaky. max_attempts: maximum number of retries before giving up; max_delay: delay in seconds above which the connection is dropped; wait_time: time in seconds between each retry | |
| Slow Queries' Cache Expiration Time | Integer | Slow queries' cache expiration time | |
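To make the three retry policy settings more concrete, here is a minimal Python sketch of how such a policy could behave. It is not Toucan's implementation: the function name `run_with_retry`, the default values, and the choice to count the first call as an attempt are assumptions made for illustration only.

```python
import time
from typing import Callable, TypeVar

T = TypeVar("T")


def run_with_retry(
    action: Callable[[], T],
    max_attempts: int = 3,       # assumed default, counts the first call too
    max_delay: float = 30.0,     # assumed default, total seconds allowed
    wait_time: float = 2.0,      # assumed default, seconds between tries
    sleep: Callable[[float], None] = time.sleep,
    clock: Callable[[], float] = time.monotonic,
) -> T:
    """Call `action` until it succeeds, attempts run out, or max_delay is exceeded."""
    started = clock()
    attempt = 0
    while True:
        attempt += 1
        try:
            return action()
        except ConnectionError:
            out_of_attempts = attempt >= max_attempts
            # Give up if waiting once more would exceed the overall budget.
            out_of_time = (clock() - started) + wait_time > max_delay
            if out_of_attempts or out_of_time:
                raise
            sleep(wait_time)
```

In this reading, `wait_time` controls the pause between tries, while `max_attempts` and `max_delay` are two independent stop conditions; whichever is reached first ends the retries.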

Click on the TEST CONNECTION button, then SAVE the connection.
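If you want to sanity-check the same values outside Toucan, the form fields map fairly directly onto a Databricks ODBC connection string. The sketch below only builds that string; the helper name `build_odbc_connection_string`, the driver name, and the host used in the usage note are assumptions, and Toucan may assemble its own connection differently.

```python
def build_odbc_connection_string(
    host: str,
    http_path: str,
    user: str,
    password: str,
    port: int = 443,
    driver: str = "Simba Spark ODBC Driver",  # assumption: depends on your install
) -> str:
    """Assemble an ODBC connection string from the connector form values."""
    parts = {
        "Driver": driver,
        "Host": host,
        "Port": port,
        "HTTPPath": http_path,
        "SSL": 1,              # connect over TLS
        "ThriftTransport": 2,  # HTTP transport
        "AuthMech": 3,         # user name and password authentication
        "UID": user,           # "token" when authenticating with a PAT
        "PWD": password,       # the personal access token itself
    }
    return ";".join(f"{key}={value}" for key, value in parts.items())
```

For example, `build_odbc_connection_string("my-workspace.example.net", "sql/protocol/v1/o/xxx/yyy", "token", "dapixxxxxx")` yields a string that an ODBC client such as `pyodbc.connect` could accept, provided the Databricks ODBC driver is installed on the machine.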

Create a dataset from a Databricks connection

This data connector is only supported in code/SQL mode

To create a dataset from Databricks, click on the "create from" icon. You will then be able to fill in:

  • QUERY: the SQL query you want to run

  • PARAMETERS (optional): a dict that lets you parameterize the query.
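As an illustration of how a QUERY and a PARAMETERS dict can work together, the Python sketch below turns named placeholders into the positional `?` markers that ODBC drivers expect. The `%(name)s` placeholder syntax, the `to_qmark` helper, and the `sales` table are assumptions for the example; this page does not specify the exact syntax Toucan accepts.

```python
import re
from typing import Any

# Matches named placeholders written as %(name)s (assumed syntax).
_PLACEHOLDER = re.compile(r"%\((\w+)\)s")


def to_qmark(query: str, parameters: dict[str, Any]) -> tuple[str, list[Any]]:
    """Replace each %(name)s with ? and collect the values in order."""
    values: list[Any] = []

    def _swap(match: re.Match) -> str:
        # A missing parameter raises KeyError instead of being silently skipped.
        values.append(parameters[match.group(1)])
        return "?"

    return _PLACEHOLDER.sub(_swap, query), values


query = "SELECT * FROM sales WHERE region = %(region)s AND year >= %(year)s"
sql, args = to_qmark(query, {"region": "EMEA", "year": 2023})
```

Keeping the values separate from the SQL text, rather than formatting them into the string, lets the driver handle quoting and avoids SQL injection through parameter values.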

We specifically designed this connector to handle DATA REFRESH from on-demand clusters. During this process, the connector will try to start the cluster and wait for it to be ready before running queries.
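The start-and-wait behavior described above can be pictured as a simple polling loop. This is a hedged sketch, not the connector's actual code: `ensure_cluster_running`, its timeout values, and the two callables are hypothetical. In practice `get_state` and `start_cluster` would wrap calls to the Databricks Clusters API, whose cluster states include `PENDING`, `RUNNING` and `TERMINATED`.

```python
import time
from typing import Callable


def ensure_cluster_running(
    get_state: Callable[[], str],         # returns the current cluster state
    start_cluster: Callable[[], None],    # asks Databricks to start the cluster
    timeout: float = 600.0,               # assumed: give up after 10 minutes
    poll_interval: float = 10.0,          # assumed: seconds between state checks
    sleep: Callable[[float], None] = time.sleep,
    clock: Callable[[], float] = time.monotonic,
) -> None:
    """Start a stopped cluster if needed, then block until it reports RUNNING."""
    state = get_state()
    if state == "TERMINATED":
        start_cluster()
    deadline = clock() + timeout
    while state != "RUNNING":
        if clock() > deadline:
            raise TimeoutError(f"cluster still in state {state!r} after {timeout}s")
        sleep(poll_interval)
        state = get_state()
```

Queries would only be sent once this function returns, which is why the first refresh against a stopped cluster can take noticeably longer than later ones.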
