Databricks Connector Setup

Guide to connecting Databricks

The Databricks Connector in Savant connects to Databricks SQL Warehouses or clusters using Databricks connection details and a token. After connecting, Savant can read tables/views from the specified database and create datasets for downstream workflows.

Features

  • Connect to Databricks SQL Warehouses (recommended) or clusters

  • Authenticate with token-based auth

  • Supports enterprise governance using Groups + Service Principals

  • Create datasets from Databricks tables/views and refresh on schedule

Requirements

  • Access to a Databricks workspace

  • A SQL Warehouse or cluster you are allowed to use

  • Permissions to read data objects you want to ingest (catalog/schema/table permissions if using Unity Catalog)

  • The Databricks connection details: Server Hostname and HTTP Path (from the compute resource connection page), plus a token

Connection Methods

  • User Token (Personal Access Token): use this for dev/test or small teams.

  • Groups + Service Principal (Enterprise/Production): use this in production to avoid tying access to an individual employee.

Connection Details

  1. In Databricks, open SQL Warehouses (or the cluster you will use).

  2. Select the warehouse/cluster.

  3. Open Connection Details.

  4. Copy:

    • Server Hostname

    • HTTP Path

    • (Port may appear on the page, but Savant typically uses Hostname + HTTP Path + Token)

Databricks documents this exact “Connection Details” approach for integrating external tools.
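Before pasting the copied values anywhere, a quick local sanity check can catch copy/paste mistakes. A minimal Python sketch; the regex patterns are assumptions about common AWS/Azure workspace URLs and warehouse paths, not an exhaustive rule:

```python
import re

# Hypothetical pre-flight check for the values copied from Connection
# Details. The patterns are assumptions about typical deployments
# (AWS / Azure domains); adjust for your workspace domain.
HOSTNAME_RE = re.compile(r"^[\w.-]+\.(cloud\.databricks\.com|azuredatabricks\.net)$")
# SQL Warehouses usually expose /sql/1.0/warehouses/<id>; classic
# clusters use /sql/protocolv1/o/<org-id>/<cluster-id>.
HTTP_PATH_RE = re.compile(r"^/sql/(1\.0/warehouses/\w+|protocolv1/o/\d+/[\w.-]+)$")

def looks_valid(server_hostname: str, http_path: str) -> bool:
    """Return True if both values match the expected shapes."""
    return bool(HOSTNAME_RE.match(server_hostname) and HTTP_PATH_RE.match(http_path))
```

A passing check only means the values are plausibly shaped; the Authenticate step in Savant is still the real test.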

Step 1 — Create a Group for Savant access

(Performed by Databricks admin / workspace admin in Admin Console)

  1. Go to Admin Console

  2. Navigate to Groups → Create Group

  3. Name it something like: savant-databricks-readonly

  4. In the group Entitlements, enable Databricks SQL access (so the identities in the group can use Databricks SQL/SQL Warehouses)

  5. Open the group and capture the Group ID from the URL (pattern similar to accounts/groups/<groupId>)

Note: Entitlements and where they are managed can vary by deployment (Databricks vs Azure Databricks), but “Databricks SQL access” is a standard entitlement concept.

Step 2 — Create the Service Principal and attach it to the Group

You can do this either via UI (Account Console) or via SCIM API depending on how your org operates. Databricks supports managing service principals in both places.

  • Path A (UI-first, preferred if available)

    • In Account Console → User management → Service principals → Add service principal

    • Assign the service principal to the target workspace (Workspace permissions: User or Admin)

    • Add the service principal to the Savant group created in Step 1

  • Path B (API-based)

    • Generate an admin user PAT (used only to call the admin APIs).

    • Call the SCIM service principals endpoint to create the service principal and attach group + entitlement.

    curl --location --request POST '<Databricks_Workspace_Endpoint>/api/2.0/preview/scim/v2/ServicePrincipals' \
    --header 'Authorization: Bearer <Access_Token>' \
    --header 'Content-Type: application/json' \
    --data-raw '{
      "displayName": "<Display_Name>",
      "entitlements": [
        { "value": "databricks-sql-access" }
      ],
      "groups": [
        { "value": "<Group_Id>" }
      ],
      "schemas": [
        "urn:ietf:params:scim:schemas:core:2.0:ServicePrincipal"
      ]
    }'
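If you drive this step from a script rather than pasting curl, the request body can be assembled with the standard library. A sketch: the display name and group ID are values you supply, the entitlement and schema strings are the literals from the request above, and the actual HTTP POST is left to your client of choice:

```python
import json

def scim_sp_payload(display_name: str, group_id: str) -> str:
    """Build the SCIM request body shown in the curl example.

    display_name and group_id are caller-supplied placeholders; the
    entitlement and schema values are the literal strings the SCIM
    ServicePrincipals endpoint expects.
    """
    payload = {
        "displayName": display_name,
        "entitlements": [{"value": "databricks-sql-access"}],
        "groups": [{"value": group_id}],
        "schemas": ["urn:ietf:params:scim:schemas:core:2.0:ServicePrincipal"],
    }
    return json.dumps(payload, indent=2)
```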

  • Response Sample

    {
      "displayName": "<Display_Name>",
      "groups": [
        {
          "display": "service_principal",
          "type": "direct",
          "value": "<Group_Id>",
          "$ref": "Groups/<Group_Id>"
        }
      ],
      "id": "<Service_Principal_Id>",
      "entitlements": [
        { "value": "databricks-sql-access" }
      ],
      "applicationId": "<Application_Id>",
      "schemas": [
        "urn:ietf:params:scim:schemas:core:2.0:ServicePrincipal"
      ],
      "active": true
    }

Step 3 — Assign the Service Principal to the Workspace

Even if the service principal exists, it must be assigned to the workspace to access it.

  • Account Console → Workspaces → select workspace → Permissions → Add permissions → select service principal → assign Workspace User (or Admin only if needed)

Step 4 — Generate a Token for the Service Principal

  • Path A (UI-first, preferred if available)

    • If your Databricks edition exposes it, generate a token or secret for the service principal from its page in the admin console; the exact location varies by deployment.

  • Path B (API-based)

    • Use the service principal application/client ID returned when the SP is created.

    • Create an on-behalf-of token (often called OBO token) with a defined lifetime (example: 180 days).

Save the returned token value — this is what you paste into Savant as the connector token.

curl --location --request POST '<Databricks_Workspace_Endpoint>/api/2.0/token-management/on-behalf-of/tokens' \
--header 'Authorization: Bearer <Access_Token>' \
--header 'Content-Type: application/json' \
--data-raw '{
  "application_id": "<Application_Id>",
  "comment": "<Any Comments>",
  "lifetime_seconds": 15552000
}'

Sample Response:

{
  "token_value": "<Token_Value>",
  "token_info": {
    "token_id": "<Token_Id>",
    "creation_time": 1666291396808,
    "expiry_time": 1668883396808,
    "comment": "<Comments In Create API>",
    "created_by_id": <Creator_User_Id>,
    "created_by_username": "[email protected]",
    "owner_id": <Owner_Id>
  }
}
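The lifetime_seconds value in the request is just a day count times 86,400 (180 days → 15,552,000), and the timestamps in the response are epoch milliseconds. A small stdlib sketch for working with both, using the field names from the sample response above:

```python
from datetime import datetime, timezone

DAY_SECONDS = 86_400

def lifetime_seconds(days: int) -> int:
    """Convert a desired token lifetime in days to the
    lifetime_seconds value the token-management API expects."""
    return days * DAY_SECONDS

def token_lifetime_days(token_info: dict) -> float:
    """Derive the actual lifetime, in days, from the millisecond
    epoch timestamps in the API response."""
    ms = token_info["expiry_time"] - token_info["creation_time"]
    return ms / (DAY_SECONDS * 1000)

def expiry_utc(token_info: dict) -> datetime:
    """Expiry as an aware UTC datetime, handy for scheduling
    token rotation before the connector starts failing with 401s."""
    return datetime.fromtimestamp(token_info["expiry_time"] / 1000, tz=timezone.utc)
```

Recording the expiry date somewhere visible is worth the minute it takes; an expired service-principal token is the most common cause of a previously working connection breaking.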

Step 5 — Grant Data Permissions to the Service Principal (or Group)

To read data, the service principal must have permissions on the data objects.

Recommended approach: grant permissions to the group (clean governance) and keep the service principal in that group. Without these grants, the connection will fail even if the token is valid.

  • In Databricks, go to SQL Warehouses

  • Select the warehouse Savant will use

  • Open Permissions

  • Grant the group (or service principal):

    • CAN USE

This allows Savant to execute queries.

Depending on your metastore:

Unity Catalog (common in modern workspaces) or Hive Metastore (in legacy workspaces)

  • Grant these to the group (recommended):

    • USE CATALOG

    • USE SCHEMA

    • SELECT on tables or views

    • READ_METADATA (if required by governance rules)

  • Using Data Explorer

    • Select Catalog → Schema

    • Click Permissions

    • Grant permissions to the group or service principal

If your org uses “ANY FILE” access

  • You may also grant file-level permissions via SQL (for example, GRANT SELECT ON ANY FILE ...) — only if your data access pattern requires it and your security team approves.

(Exact grants vary by catalog structure, but the principle is: compute access + object permissions.)
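The Unity Catalog grants above can also be issued as SQL. A hedged sketch that just renders the statements for a read-only principal; the catalog, schema, and principal names are placeholders, and a privileged user would run the output in a SQL editor:

```python
def uc_read_grants(catalog: str, schema: str, principal: str) -> list[str]:
    """Render the Unity Catalog GRANT statements a read-only
    principal typically needs: USE CATALOG, USE SCHEMA, and SELECT.
    `principal` is the group (recommended) or service principal name."""
    return [
        f"GRANT USE CATALOG ON CATALOG {catalog} TO `{principal}`;",
        f"GRANT USE SCHEMA ON SCHEMA {catalog}.{schema} TO `{principal}`;",
        f"GRANT SELECT ON SCHEMA {catalog}.{schema} TO `{principal}`;",
    ]
```

Granting SELECT at the schema level covers every table and view in it; grant per-table instead if your governance rules require tighter scoping.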

Step 6 — Configure Savant Connection

In Savant → Connections → Add System → Databricks, enter:

  • Server Hostname (from SQL Warehouse connection details)

  • HTTP Path (from SQL Warehouse connection details)

  • Database Name

  • Token value (service principal token created in Step 4)

  • Click on Authenticate

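Before clicking Authenticate, a quick local check of the four fields can save a round trip. A sketch; the assumptions are that Databricks tokens conventionally begin with `dapi` and that the field names mirror the Savant form above:

```python
def check_connector_fields(server_hostname: str, http_path: str,
                           database: str, token: str) -> list[str]:
    """Return a list of problems found; an empty list means the
    fields look plausible (it does not prove they are valid)."""
    problems = []
    if not server_hostname or "://" in server_hostname:
        problems.append("Server Hostname should be a bare hostname, without https://")
    if not http_path.startswith("/sql/"):
        problems.append("HTTP Path should start with /sql/ (copy it from Connection Details)")
    if not database:
        problems.append("Database Name is required")
    if not token.startswith("dapi"):
        problems.append("Token does not look like a Databricks token (expected 'dapi' prefix)")
    return problems
```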

Troubleshooting

  • Connection test fails → Verify Server Hostname and HTTP Path from the warehouse Connection Details page

  • 401 Unauthorized → Token is wrong/expired, or token usage is restricted; regenerate/replace token

  • 403 Forbidden → Service principal/group lacks permissions on warehouse or data objects; grant CAN USE + data privileges

  • Can connect but can’t see tables → Catalog/schema permissions missing (USE CATALOG/USE SCHEMA/SELECT in Unity Catalog; USAGE/READ_METADATA/SELECT in the legacy Hive metastore)

  • Warehouse not usable → Service principal/group missing Databricks SQL access entitlement

  • Token creation on-behalf-of fails → Admin token lacks token-management privileges or wrong application/client ID

Don’t see what you’re looking for? Contact us in the Community or reach out in Chat Support
