

Connect Databricks and Snowflake to Unify Your Data Stack
Automate data pipelines between your lakehouse and cloud data warehouse to power faster analytics, ML, and business intelligence.
Databricks + Snowflake integration
Databricks and Snowflake cover most of what a modern data organization needs — large-scale data engineering on one side, governed SQL analytics on the other. Data science teams lean on Databricks for machine learning, feature engineering, and Spark-based transformations. BI teams depend on Snowflake's high-performance SQL warehouse for reporting and sharing. Connecting the two eliminates silos, cuts manual data movement, and lets processed insights flow automatically to where decisions actually get made.
When Databricks and Snowflake run separately, you end up with duplicated data, stale reports, and expensive manual handoffs between engineering and analytics. Connecting them through tray.ai lets you automate the full lifecycle — raw ingestion and transformation in Databricks, governed query-ready tables in Snowflake — without writing and maintaining custom ETL scripts. You get real-time or scheduled syncs of ML model outputs, aggregated metrics, and cleansed datasets straight into Snowflake schemas, where analysts, dashboards, and downstream tools can use them immediately. Engineering overhead drops, data stays fresh, and every team works from the same source of truth.
Automate & integrate Databricks + Snowflake
Tray.ai makes it easy to automate business processes and integrate data across Databricks and Snowflake.
Use case
Sync ML Model Outputs from Databricks to Snowflake
After training and scoring models in Databricks, teams need their predictions, scores, and feature outputs available to business users in Snowflake. tray.ai automates the transfer of model inference results from Databricks Delta tables directly into Snowflake target schemas on a scheduled or event-driven basis, so sales, marketing, and operations teams can act on ML-generated insights without waiting on manual data exports.
- Eliminate manual CSV exports of model outputs between platforms
- Make ML predictions available to BI tools connected to Snowflake within minutes of scoring
- Maintain a full audit trail of when model results were synced and to which tables
Use case
Automate ETL Pipelines from Snowflake to Databricks for Feature Engineering
Data science teams frequently need raw or semi-processed data from Snowflake loaded into Databricks to build features for machine learning models. tray.ai can trigger Databricks notebook runs or Delta Live Tables pipelines whenever new data lands in Snowflake, creating a clean upstream-to-downstream workflow without engineers manually kicking off jobs or writing bespoke scheduling scripts.
- Automatically trigger Databricks jobs when Snowflake tables are updated
- Reduce feature engineering pipeline latency with event-driven automation
- Decrease dependency on manual engineering intervention for routine data prep
Use case
Write Aggregated Databricks Metrics Back to Snowflake for Reporting
Databricks jobs that produce aggregated KPIs, summary statistics, or transformed datasets can have their results written back to Snowflake so that existing BI tools and dashboards always reflect the latest numbers. tray.ai handles this write-back process — schema mapping, table upserts, and error notifications — without custom code, so analytics teams can trust that Snowflake always has the freshest processed data.
- Keep Snowflake reporting tables current without manual refresh cycles
- Support upserts and incremental loads to avoid full table rewrites
- Alert data teams via Slack or email when write-back jobs fail or produce anomalies
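For readers curious what an upsert into a Snowflake reporting table involves, here is a minimal Python sketch that builds a Snowflake MERGE statement from a list of key and value columns. It is an illustration only, not how tray.ai implements write-back. The function name, the table names, and the staging-table approach are all assumptions, and identifiers are interpolated without quoting or validation, so it should only be fed trusted names.

```python
def build_merge_sql(target, staging, key_cols, value_cols):
    """Return a MERGE statement that upserts rows from a staging table.

    Hypothetical helper: table and column names are assumed to be trusted
    identifiers and are not quoted or escaped.
    """
    if not key_cols or not value_cols:
        raise ValueError("key_cols and value_cols must both be non-empty")

    all_cols = list(key_cols) + list(value_cols)
    on_clause = " AND ".join(f"t.{c} = s.{c}" for c in key_cols)
    set_clause = ", ".join(f"{c} = s.{c}" for c in value_cols)
    insert_cols = ", ".join(all_cols)
    insert_vals = ", ".join(f"s.{c}" for c in all_cols)

    parts = [
        f"MERGE INTO {target} t",
        f"USING {staging} s",
        f"ON {on_clause}",
        # Rows that already exist in the target are updated in place.
        f"WHEN MATCHED THEN UPDATE SET {set_clause}",
        # Rows that are new to the target are inserted.
        f"WHEN NOT MATCHED THEN INSERT ({insert_cols}) VALUES ({insert_vals})",
    ]
    return " ".join(parts)


if __name__ == "__main__":
    print(build_merge_sql(
        "analytics.kpi_daily",        # hypothetical target table
        "analytics.kpi_daily_stage",  # hypothetical staging table
        key_cols=["kpi_date"],
        value_cols=["revenue"],
    ))
```

Because only matched rows are updated and only unmatched rows are inserted, an incremental batch can be applied without rewriting the full table.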
Use case
Replicate Reference and Lookup Tables from Snowflake into Databricks
Databricks workloads often depend on reference data — product catalogs, customer segments, currency tables — that lives in Snowflake. tray.ai automates scheduled replication of these lookup tables into Databricks so that notebooks and pipelines always join against current reference data, preventing stale enrichment from silently corrupting model training or transformation logic.
- Keep Databricks enrichment data in sync with Snowflake master records
- Schedule replication to run before nightly batch jobs automatically
- Receive alerts when reference tables fail to sync on time
Use case
Orchestrate Cross-Platform Data Quality Checks
Keeping data consistent between Databricks and Snowflake is a persistent headache for data engineering teams running parallel pipelines. tray.ai orchestrates automated reconciliation checks — comparing row counts, checksums, or aggregate values across both platforms — and routes discrepancy alerts to the right team channels, so you get visibility into pipeline health without building custom monitoring infrastructure.
- Catch data drift between platforms before it impacts downstream reports
- Route quality alerts directly to Slack, PagerDuty, or email
- Log reconciliation results to a central audit table in Snowflake or Databricks
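As a rough illustration of a row-count and checksum comparison, the Python sketch below fingerprints two sets of rows and reports any discrepancies. It is a simplified example under stated assumptions, not tray.ai's implementation: the function names are hypothetical, rows are assumed to have already been fetched from each platform as tuples, and values are assumed to be normalized to the same Python types on both sides (for example, a decimal on one side and a float on the other would produce different checksums).

```python
import hashlib


def table_fingerprint(rows):
    """Return (row_count, checksum) for an iterable of row tuples.

    The checksum sums per-row hashes modulo 2**64, so it does not depend
    on row order and duplicate rows do not cancel each other out.
    """
    count = 0
    checksum = 0
    for row in rows:
        digest = hashlib.sha256(repr(tuple(row)).encode("utf-8")).digest()
        checksum = (checksum + int.from_bytes(digest[:8], "big")) % (2 ** 64)
        count += 1
    return count, checksum


def reconcile(databricks_rows, snowflake_rows):
    """Compare two row sets and return a list of human-readable issues."""
    left_count, left_sum = table_fingerprint(databricks_rows)
    right_count, right_sum = table_fingerprint(snowflake_rows)
    issues = []
    if left_count != right_count:
        issues.append(f"row count mismatch: {left_count} vs {right_count}")
    if left_sum != right_sum:
        issues.append("checksum mismatch")
    return issues


if __name__ == "__main__":
    source = [(1, "x"), (2, "y")]
    print(reconcile(source, [(2, "y"), (1, "x")]))  # same rows, different order
    print(reconcile(source, [(1, "x"), (2, "z")]))  # one value differs
```

An empty result means the two sides agree; a non-empty list is what an alerting step would forward to the responsible channel.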
Use case
Trigger Databricks Job Runs Based on Snowflake Data Events
Many data workflows require Databricks processing to kick off only when specific conditions are met in Snowflake — a new batch of records arriving, a table exceeding a row threshold, or a status flag being updated. tray.ai monitors Snowflake for these conditions and automatically triggers the corresponding Databricks job or workflow. Your pipelines react to actual data availability instead of running on a fixed clock, which cuts unnecessary compute spend.
- Eliminate over-scheduled Databricks jobs that run on empty datasets
- Reduce Databricks compute costs by running jobs only when data is ready
- Enable event-driven architecture between your lakehouse and data warehouse
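The condition check behind an event-driven trigger can be sketched in a few lines of Python. In this hypothetical example, a job is triggered only when a Snowflake table has at least a minimum number of new rows and a status flag marks the batch as ready; requiring both conditions is a choice made for this sketch. The `TableSnapshot` type, the `READY` status value, and the `start_date` and `end_date` parameter names are assumptions for illustration. The request body follows the general shape of the Databricks Jobs API `run-now` call (`job_id` plus `notebook_params`), but no request is sent here.

```python
from dataclasses import dataclass


@dataclass
class TableSnapshot:
    """Hypothetical summary of what a monitoring query found in Snowflake."""
    new_rows: int
    status: str


def should_trigger(snapshot, min_new_rows=1, ready_status="READY"):
    """Trigger only when enough new rows exist and the batch is marked ready."""
    return snapshot.new_rows >= min_new_rows and snapshot.status == ready_status


def build_run_request(job_id, start_date, end_date):
    """Build a request body for a job run; parameter names are illustrative."""
    return {
        "job_id": job_id,
        "notebook_params": {"start_date": start_date, "end_date": end_date},
    }


if __name__ == "__main__":
    snapshot = TableSnapshot(new_rows=500, status="READY")
    if should_trigger(snapshot, min_new_rows=100):
        print(build_run_request(42, "2024-01-01", "2024-01-02"))
```

Skipping the run when `should_trigger` returns `False` is what keeps jobs from running on empty datasets.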
Challenges Tray.ai solves
Common obstacles when integrating Databricks and Snowflake — and how Tray.ai handles them.
Challenge
Managing Authentication Across Two Enterprise Platforms
Databricks and Snowflake use distinct authentication mechanisms — Databricks relies on personal access tokens and service principals, while Snowflake uses key-pair authentication, OAuth, or username-password with MFA. Rotating credentials for both platforms in custom scripts is error-prone, and when tokens expire mid-pipeline, you typically find out after data has already stopped moving.
How Tray.ai helps
tray.ai provides a secure, centralized credential store for both Databricks and Snowflake connections. Authentication is configured once per connector and reused across all workflows, with no credentials embedded in pipeline code. When tokens need rotation, only the tray.ai connector configuration needs updating — every workflow using it picks up the change automatically.
Challenge
Handling Schema Evolution and Mismatches Between Platforms
Databricks Delta tables and Snowflake schemas evolve independently as teams add columns, change data types, or rename fields. Custom ETL scripts moving data between the two frequently break when schemas drift, causing silent data loss or failed loads that are difficult to diagnose.
How Tray.ai helps
tray.ai's visual data mapper lets teams explicitly define and maintain column mappings between Databricks and Snowflake schemas. When a source schema changes, the workflow surfaces a clear mapping error rather than silently dropping or misrouting data. Teams can update mappings in the tray.ai UI without rewriting pipeline code.
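To make the idea of an explicit mapping that fails loudly on schema drift concrete, here is a small Python sketch. It is not the tray.ai data mapper; `MappingError`, `apply_mapping`, and the column names are hypothetical. The sketch treats both a missing source column and an unexpected new source column as errors, rather than dropping data silently.

```python
class MappingError(Exception):
    """Raised when a source row no longer matches the declared mapping."""


def apply_mapping(row, mapping):
    """Rename source columns to target columns using an explicit mapping.

    `row` is a dict of source column -> value; `mapping` is a dict of
    source column -> target column. Any drift raises MappingError.
    """
    missing = sorted(set(mapping) - set(row))
    unmapped = sorted(set(row) - set(mapping))
    if missing or unmapped:
        raise MappingError(
            f"missing source columns: {missing}; "
            f"unmapped source columns: {unmapped}"
        )
    return {target: row[source] for source, target in mapping.items()}


if __name__ == "__main__":
    mapping = {"customer_id": "CUSTOMER_ID", "score": "CHURN_SCORE"}
    print(apply_mapping({"customer_id": 7, "score": 0.9}, mapping))
    try:
        # A new column appeared upstream and has no mapping yet.
        apply_mapping({"customer_id": 7, "score": 0.9, "segment": "a"}, mapping)
    except MappingError as exc:
        print(f"mapping error: {exc}")
```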
Challenge
Orchestrating Job Dependencies Across Platform Boundaries
Many data pipelines require Databricks jobs to finish before Snowflake tables are loaded, or Snowflake queries to complete before Databricks notebooks are triggered. Building these cross-platform dependencies using each platform's native scheduler in isolation leads to hard-coded wait times, race conditions, and fragile cron-based coupling.
How Tray.ai helps
tray.ai acts as a cross-platform orchestration layer, letting teams build conditional, event-driven workflows that wait for job completion signals from one platform before triggering actions on the other. Native branching, retry logic, and status polling replace brittle time-based scheduling with reliable dependency management.
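Status polling as a replacement for hard-coded wait times can be sketched as follows. This is a generic Python illustration, not tray.ai's internals: `wait_for_completion` is a hypothetical helper, `get_status` stands in for whatever call reports a job's state, and the `RUNNING`, `SUCCESS`, and `FAILED` values are simplified placeholders rather than either platform's actual state names.

```python
import time


def wait_for_completion(get_status, poll_seconds=30.0, max_polls=120,
                        sleep=time.sleep):
    """Poll get_status() until it reports a terminal state.

    Returns True on SUCCESS and False on FAILED. Raises TimeoutError if
    no terminal state is seen within max_polls attempts.
    """
    for attempt in range(max_polls):
        status = get_status()
        if status == "SUCCESS":
            return True
        if status == "FAILED":
            return False
        # Still running: wait before the next poll, except after the last one.
        if attempt < max_polls - 1:
            sleep(poll_seconds)
    raise TimeoutError(f"no terminal status after {max_polls} polls")


if __name__ == "__main__":
    # Simulated upstream job that finishes on the third poll.
    states = iter(["RUNNING", "RUNNING", "SUCCESS"])
    finished_ok = wait_for_completion(lambda: next(states), poll_seconds=0.01)
    if finished_ok:
        print("upstream job succeeded; safe to start the downstream load")
```

The downstream step starts only after an explicit success signal, so a slow upstream job delays the load instead of racing it.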
Templates
Pre-built workflows for Databricks and Snowflake you can deploy in minutes.
This template monitors for Databricks job run completions and automatically writes the resulting Delta table data into a specified Snowflake target table, handling schema mapping and upsert logic.
This template polls a Snowflake table for new or updated rows on a schedule and automatically triggers a Databricks notebook or workflow job, passing relevant parameters such as record IDs or date ranges.
This template runs on a nightly schedule to copy reference and lookup tables from Snowflake into Databricks Delta tables, so that all batch jobs and ML pipelines use up-to-date enrichment data.
After a Databricks model scoring job finishes, this template extracts prediction results and loads them into a Snowflake table so that downstream BI tools and operational applications can immediately consume ML outputs.
This template runs automated reconciliation checks between Databricks and Snowflake tables on a scheduled basis, comparing row counts and aggregate metrics, and routes discrepancy alerts to the responsible data team.
How Tray.ai makes this work
Databricks + Snowflake runs on the full Tray.ai platform
Intelligent iPaaS
Integrate and automate across 700+ connectors with visual workflows, error handling, and observability.
Learn more →
Agent Builder
Build AI agents that read, write, and take action in Databricks and Snowflake — with guardrails, audit, and human-in-the-loop.
Learn more →
Agent Gateway for MCP
Expose Databricks + Snowflake actions as governed MCP tools — observable, rate-limited, authenticated.
Learn more →
Ship your Databricks + Snowflake integration.
We'll walk through the exact integration you're imagining in a tailored demo.