
Databricks December 2025 Update: Agentic AI and Unified Governance

Data & AI Insights Collective · Dec 30, 2025 · 7 min read

Overview of the December 2025 Updates

As 2025 draws to a close, the Databricks ecosystem has reached an inflection point. The updates released in December 2025 signal a clear shift in strategy: moving away from legacy architectures and doubling down on "Agentic AI." For you as an engineer or data scientist, these changes mean your workflows are becoming more automated, while the infrastructure supporting them is becoming more strictly governed.

What matters here is the transition from passive tools to active agents. The platform is no longer just providing a space to write code; it is now providing the intelligence to execute multi-step tasks autonomously. Simultaneously, Databricks is tightening the reins on security and governance, effectively mandating the use of Unity Catalog for all new accounts.

In this guide, you will learn how to navigate these new features, from the Public Preview of the Data Science Agent to the implementation of context-based ingress controls.

The Rise of the Databricks Assistant Agent Mode

One of the most significant shifts this month is the move of Databricks Assistant Agent Mode into Public Preview. If you have used the standard Assistant before, you know it as a helpful sidecar for code suggestions. Agent Mode is fundamentally different.

How Agent Mode Differs from Standard Chat

Traditional LLM assistants are reactive; you ask a question, and they provide text. Agent Mode is proactive. From a single prompt, the agent can perform an autonomous loop of actions:

  1. Retrieve: It identifies and pulls relevant assets from your workspace.
  2. Generate: It writes the necessary Python or SQL code.
  3. Execute: It runs the code in your notebook environment.
  4. Self-Correct: If the code fails, the agent analyzes the error and attempts to fix it without your intervention.
  5. Visualize: It can automatically generate charts based on the resulting data.

The real value is in the agent's ability to "sample" data. By looking at a subset of your actual data and cell outputs, the agent gains context that a generic LLM lacks, leading to much higher accuracy in complex data science tasks.
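
Conceptually, the loop looks like the sketch below. The helpers are hypothetical stand-ins for the Assistant's internal tooling (there is no public API for this), but they make the retrieve-generate-execute-self-correct cycle concrete:

```python
from dataclasses import dataclass

# Hypothetical stand-ins for the Assistant's internal tooling -- not a public API.
@dataclass
class CellResult:
    ok: bool
    traceback: str = ""

def retrieve_workspace_assets(prompt: str) -> dict:
    """Sample relevant tables, schemas, and cell outputs for context."""
    return {"tables": ["sales"], "sample_rows": 100}

def generate_code(prompt: str, context: dict, error: str | None = None) -> str:
    """Draft (or, given a traceback, repair) Python/SQL for the task."""
    return "display(spark.table('sales'))"

def run_cell(code: str) -> CellResult:
    """Execute the draft in the notebook environment."""
    return CellResult(ok=True)

def agent_loop(prompt: str, max_attempts: int = 3) -> str:
    context = retrieve_workspace_assets(prompt)   # 1. Retrieve
    code = generate_code(prompt, context)         # 2. Generate
    for _ in range(max_attempts):
        result = run_cell(code)                   # 3. Execute
        if result.ok:
            return code                           # 5. Visualize would follow here
        # 4. Self-Correct: feed the error back into generation and retry
        code = generate_code(prompt, context, error=result.traceback)
    raise RuntimeError("agent could not produce working code")
```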

The Infrastructure Behind the Agent

Under the hood, the Assistant in Agent Mode intelligently chooses between Azure OpenAI and Anthropic models hosted within the Databricks security perimeter. This ensures that your prompts and data samples do not leave the controlled environment. To use this, you must ensure that "Partner-powered AI features" are enabled in your admin settings.

Strengthening the Security Perimeter

Security in a modern data lakehouse requires more than just a password. The December updates introduce two major features that change how you manage access: single-use refresh tokens and context-based ingress control.

Single-Use Refresh Tokens for OAuth

For those managing user-to-machine authentication, single-use refresh tokens are a major security upgrade. Previously, a leaked refresh token could potentially be used multiple times to generate new access tokens.

With this update, you can configure your OAuth applications to require token rotation. Every time a refresh token is used to get a new access token, the old refresh token is invalidated, and a new one is issued. This significantly limits the window of opportunity for an attacker who might intercept a token during a session.
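
On the client side, rotation means you must persist the new refresh token after every exchange. A minimal sketch, assuming a standard OAuth 2.0 token endpoint (the URL below is a placeholder, not your actual workspace endpoint):

```python
import requests

TOKEN_URL = "https://example.cloud.databricks.com/oidc/v1/token"  # placeholder URL

def refresh_access_token(client_id: str, refresh_token: str) -> tuple[str, str]:
    """Exchange a refresh token for a new access token.

    With single-use refresh tokens enabled, the token we send is invalidated
    server-side, so the caller MUST store the replacement immediately.
    """
    resp = requests.post(
        TOKEN_URL,
        data={
            "grant_type": "refresh_token",
            "refresh_token": refresh_token,
            "client_id": client_id,
        },
        timeout=30,
    )
    resp.raise_for_status()
    payload = resp.json()
    return payload["access_token"], payload["refresh_token"]

# access, new_refresh = refresh_access_token("my-app", stored_refresh_token)
# persist(new_refresh)  # the old refresh token is now dead
```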

Context-Based Ingress Control

Currently in Public Preview, context-based ingress control allows account admins to move beyond simple IP whitelisting. You can now create sophisticated allow and deny rules based on a combination of factors:

  • Identity: Who is making the call?
  • Network Source: From where is the request originating (e.g., a specific VPC or IP range)?
  • Target Asset: What specific workspace or resource are they trying to reach?

This "zero trust" approach ensures that even if a user has the right credentials, they cannot access sensitive data if they are calling from an untrusted network or trying to reach a workspace they aren't authorized for under that specific context.

The Unity Catalog Mandate: Sunsetting Legacy Features

Perhaps the most controversial yet necessary change is the decision to disable legacy features for all new Databricks accounts created after December 18, 2025.

What Is Being Removed?

New accounts will no longer have access to:

  • DBFS Root and Mounts: The old way of interacting with cloud storage is being replaced by Unity Catalog external locations.
  • Hive Metastore: All table metadata must now live in Unity Catalog.
  • No-Isolation Shared Compute: This enforces better workload isolation, which is critical for multi-tenant environments.

For long-time users, this might feel like a restriction. However, the real value is in the consistency. By mandating Unity Catalog, Databricks ensures that every new user starts with a unified governance layer, making audit logs, lineage, and fine-grained access control the default rather than an afterthought.
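
In day-to-day code, the change mostly shows up in how you address storage. A before/after sketch for a Databricks notebook (where `spark` is predefined; the catalog, schema, and volume names are illustrative):

```python
# Before: DBFS mount, unavailable in accounts created after December 18, 2025
df_legacy = spark.read.parquet("dbfs:/mnt/raw/events/")

# After: Unity Catalog volume path (catalog/schema/volume names are illustrative)
df = spark.read.parquet("/Volumes/main/raw/landing/events/")

# Tables live in the three-level namespace instead of the Hive metastore
df.write.saveAsTable("main.analytics.events")
```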

Expanding Ingestion with Lakeflow Connect

Data ingestion is often the most brittle part of a data pipeline. The December updates to Lakeflow Connect aim to make this a "set it and forget it" process for two major sources: MySQL and Meta Ads.

Managed MySQL Ingestion

The MySQL connector is now in Public Preview. This is a fully managed service that handles incremental data ingestion. Whether your MySQL instance runs on Amazon RDS, Amazon Aurora, Azure Database for MySQL, or a self-hosted EC2 box, Lakeflow Connect can track changes and sync them into your lakehouse. This removes the need for complex CDC (Change Data Capture) setups involving Debezium or custom Glue jobs.
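
Pipelines are typically set up through the UI, but each one is driven by an ingestion spec. A hypothetical example of its shape (field names are illustrative; the exact schema lives in the Lakeflow Connect docs):

```python
# Hypothetical Lakeflow Connect ingestion spec for MySQL; field names are
# illustrative -- consult the Lakeflow Connect docs for the exact schema.
mysql_ingestion = {
    "connection_name": "mysql-prod",          # Unity Catalog connection to the MySQL host
    "objects": [
        {
            "source_schema": "shop",
            "source_table": "orders",
            "destination_catalog": "main",
            "destination_schema": "bronze",   # incremental changes land here
        }
    ],
    "schedule": {"cron": "0 */2 * * *"},      # e.g. sync every two hours
}
```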

Meta Ads Integration

For marketing analytics teams, the new Meta Ads connector allows for direct ingestion of advertising data. By bringing this data directly into Databricks, you can join your social media spend with your internal sales data in Unity Catalog, enabling more accurate ROI modeling.

The Expanding Model Garden: Claude 4.5 and Gemini 3

Databricks continues to expand its Foundation Model APIs, providing hosted versions of the latest industry models. This month brings two heavy hitters to the platform:

| Model | Primary Strength | Use Case |
| --- | --- | --- |
| Anthropic Claude Haiku 4.5 | Speed and Efficiency | High-volume text classification, simple extraction, and low-latency chatbots. |
| Gemini 3 Flash | Multimodal Analysis | Video analysis, complex visual Q&A, and extracting data from documents with heavy imagery. |

You can access these via pay-per-token endpoints, meaning you don't have to manage the underlying GPUs. This is particularly useful for developers who want to experiment with different models for specific tasks without committing to a dedicated provisioned throughput instance.
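
Because the endpoints are OpenAI-compatible, querying them takes a few lines with the standard `openai` client. The endpoint name below is an assumption; check your workspace's Serving page for the real one:

```python
import os
from openai import OpenAI

client = OpenAI(
    api_key=os.environ["DATABRICKS_TOKEN"],
    base_url="https://<your-workspace>.cloud.databricks.com/serving-endpoints",
)

resp = client.chat.completions.create(
    model="databricks-claude-haiku-4-5",  # assumption: actual endpoint name may differ
    messages=[{"role": "user", "content": "Classify the sentiment: 'The pipeline failed again.'"}],
    max_tokens=50,
)
print(resp.choices[0].message.content)
```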

Lakebase: Bringing OLTP Capabilities to the Lakehouse

Lakebase (Databricks' project for handling transactional, OLTP-style workloads) received several updates this month focused on operational maturity.

  • Autoscaling Metrics: A new dashboard allows you to monitor how the system scales in response to transaction volume.
  • ACL Support: You can now use Access Control Lists to manage who can create or manage resources within Lakebase, bringing it in line with the rest of the Databricks permission model.
  • SQL Editor Access: You can now connect to Lakebase directly from the Databricks SQL editor with full read-write access, making it easier to run ad-hoc queries or administrative tasks.
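
Because Lakebase speaks the Postgres wire protocol, the same read-write access works from any Postgres client, not just the SQL editor. A minimal sketch (host, database, and credentials are placeholders):

```python
import psycopg2  # Lakebase is Postgres-compatible; connection details below are placeholders

conn = psycopg2.connect(
    host="my-instance.database.cloud.databricks.com",  # placeholder host
    dbname="databricks_postgres",                      # placeholder database name
    user="you@example.com",
    password="<short-lived-oauth-token>",
    sslmode="require",
)
with conn, conn.cursor() as cur:  # commits on success, rolls back on error
    cur.execute("UPDATE orders SET status = %s WHERE id = %s", ("shipped", 42))
conn.close()
```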

Developer Productivity and Compute Reliability

Finally, several quality-of-life improvements were made for the day-to-day engineer.

Flexible Node Types

Flexible node types are now Generally Available (GA). This is a massive win for reliability. If you request a specific instance type (like an m5.xlarge) and AWS is out of capacity for that type, Databricks will automatically fall back to a compatible alternative. This prevents your jobs from failing at the start due to cloud provider capacity constraints.
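
In a cluster spec, this amounts to naming your preferred instance type and opting into substitution. The fragment below is a hypothetical sketch; the actual field names come from the Clusters API documentation:

```python
# Hypothetical cluster spec fragment; actual field names come from the Clusters API docs.
cluster_spec = {
    "cluster_name": "etl-nightly",
    "node_type_id": "m5.xlarge",       # preferred instance type
    "node_type_flexibility": {         # assumption: opt-in flag for fallback behavior
        "enabled": True,               # fall back to a compatible type on capacity errors
    },
    "num_workers": 8,
}
```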

Notebook Job Integration

You can now view the results of the latest scheduled notebook run directly within the notebook interface. If the scheduled job produced updated results, you can sync your current notebook view with those results. This bridges the gap between "development" notebooks and "production" jobs, allowing you to debug or verify output without jumping between different UI screens.

Tecyfy Takeaway

December 2025 is a clear indicator that the "modern data stack" is consolidating. For you, the takeaway is threefold:

  1. Embrace the Agent: Start experimenting with Assistant Agent Mode for repetitive data cleaning and visualization tasks. It is significantly more capable than a standard chatbot.
  2. Audit Your Access: If you haven't moved to Unity Catalog or implemented OAuth rotation, now is the time. The platform is moving toward a mandatory governance model, and getting ahead of it will save you significant technical debt.
  3. Simplify Ingestion: Before building a custom ETL pipeline for MySQL or Meta Ads, check if Lakeflow Connect can handle it. Managed ingestion is almost always cheaper and more reliable than custom-coded solutions.

Databricks is no longer just a place to run Spark; it is becoming an autonomous engine for data intelligence. Your role is shifting from writing the code to orchestrating the agents and governing the data they use.
