
Databricks Sneak Peek: Exciting Updates for January 2025
A collaborative team of Data Engineers, Data Analysts, Data Scientists, AI researchers, and industry experts delivering concise insights and the latest trends in data and AI.
Databricks is evolving once again, bringing exciting updates to simplify workflows, optimize performance, and boost productivity. Whether you're a data engineer, analyst, or scientist, these changes are set to enhance your experience on the platform. Let's explore what's new and how it impacts you.
Predictive Optimization: Smarter, Faster, Better!
Starting January 21, 2025, Databricks will begin enabling statistics management for all accounts with predictive optimization enabled. Statistics management expands existing predictive optimization functionality by adding stats collection on write and automatically running ANALYZE commands for Unity Catalog managed tables.
For all accounts created after November 11, 2024, Databricks enables predictive optimization by default. This feature automatically identifies and optimizes tables for better performance and efficiency.
What Does Predictive Optimization Do?
Feature | Details |
---|---|
Tasks Automated | OPTIMIZE, VACUUM, ANALYZE (Unity Catalog managed tables only). |
Technology | Uses serverless compute with Databricks Managed Services SKU. |
Benefits | Improves file sizes, reduces storage costs, and boosts query performance. |
Limitations | Does not support streaming tables, materialized views, or ZORDER operations. |
Note: Predictive optimization only works on Unity Catalog managed tables.
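For accounts where the default does not apply, the setting can also be controlled per catalog or schema with an `ALTER ... PREDICTIVE OPTIMIZATION` statement. The helper below is a minimal sketch that only builds those statements as strings. The function name and the object names `main` and `main.sales` are hypothetical, and running the result (for example with `spark.sql` in a notebook) is left to you.

```python
import re

VALID_LEVELS = {"CATALOG", "SCHEMA"}
VALID_ACTIONS = {"ENABLE", "DISABLE", "INHERIT"}

# Accept plain identifiers only (optionally catalog.schema) so the helper
# cannot be used to smuggle arbitrary SQL into the statement.
_NAME = re.compile(r"^[A-Za-z_][A-Za-z0-9_]*(\.[A-Za-z_][A-Za-z0-9_]*)?$")


def predictive_optimization_sql(level: str, name: str, action: str = "ENABLE") -> str:
    """Build an ALTER statement that sets predictive optimization on a catalog or schema."""
    level = level.upper()
    action = action.upper()
    if level not in VALID_LEVELS:
        raise ValueError(f"level must be one of {sorted(VALID_LEVELS)}")
    if action not in VALID_ACTIONS:
        raise ValueError(f"action must be one of {sorted(VALID_ACTIONS)}")
    if not _NAME.match(name):
        raise ValueError(f"unsupported object name: {name!r}")
    return f"ALTER {level} {name} {action} PREDICTIVE OPTIMIZATION"


# Hypothetical object names, for illustration only.
print(predictive_optimization_sql("catalog", "main"))
print(predictive_optimization_sql("schema", "main.sales", "inherit"))
```

`INHERIT` makes the object follow whatever is set on its parent, which is usually the least surprising choice for schemas.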
Git Integration: A New Era Begins
Say goodbye to the legacy notebook Git integration! As of January 31, 2024, it will be officially retired. Transition to Databricks Git folders (formerly Repos) for syncing work with remote Git repositories seamlessly.
Notebook Upgrade: Hello, IPYNB Format!
Starting January 2025, the default notebook format will switch to IPYNB (.ipynb). This change captures notebook environments, visualization definitions, and widgets, offering a richer development experience.
Prefer the old format? No worries! You can change the default in your workspace user settings.
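Because .ipynb files are plain JSON, they are easy to inspect or generate with standard tooling. The sketch below follows the open Jupyter notebook format (nbformat 4) and uses only the Python standard library. The helper names are illustrative, and any Databricks-specific metadata stored in the file is not shown here.

```python
import json


def make_notebook(sources):
    """Return a minimal nbformat-4 notebook dict with one code cell per source string."""
    cells = [
        {
            "cell_type": "code",
            "execution_count": None,
            "metadata": {},
            "outputs": [],
            "source": src,
        }
        for src in sources
    ]
    # nbformat_minor 4 is used because cell ids only become mandatory in 4.5.
    return {"cells": cells, "metadata": {}, "nbformat": 4, "nbformat_minor": 4}


def count_code_cells(text):
    """Parse notebook JSON text and count its code cells."""
    nb = json.loads(text)
    return sum(1 for cell in nb["cells"] if cell["cell_type"] == "code")


notebook_json = json.dumps(make_notebook(["print('hello')", "x = 1 + 1"]), indent=1)
print(count_code_cells(notebook_json))  # prints 2
```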
Delta Live Tables: Enhanced Flexibility
Big updates are coming to Delta Live Tables, making it more versatile and user-friendly:
- Multi-Catalog Publishing: Publish to multiple catalogs and schemas within a single pipeline.
- Syntax Simplification: The LIVE keyword is no longer required for internal dataset references.
- Object Recovery: Recover dropped objects with the UNDROP command.
Pro Tip: Deleted materialized views and streaming tables are not dropped automatically. Use explicit DROP commands to remove them.
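If you want to tidy up existing pipeline SQL now that the LIVE qualifier is optional, a small text rewrite can do most of the work. The helper below is a naive sketch and not an official migration tool. It uses a regular expression instead of a SQL parser, so it will also rewrite `LIVE.` inside string literals or comments, and the function name and sample query are hypothetical. Review the result before committing it.

```python
import re

# Match LIVE only when it is used as a leading qualifier (LIVE.orders),
# not when it is the middle part of a longer name (prod.live.orders)
# or the tail of another word (alive.orders).
_LIVE_PREFIX = re.compile(r"(?<![\w.`])LIVE\.(?=[A-Za-z_`])", re.IGNORECASE)


def strip_live_prefix(sql: str) -> str:
    """Remove the LIVE. qualifier from dataset references in a SQL string."""
    return _LIVE_PREFIX.sub("", sql)


# Hypothetical pipeline query, for illustration only.
before = "SELECT * FROM LIVE.orders JOIN live.customers USING (id)"
print(strip_live_prefix(before))
```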
Workspace Files: Universal Availability
As of February 1, 2025, workspace files will be enabled for all workspaces. This unlocks enhanced file management features, making collaboration and organization easier.
Marketplace: Unified and Simplified
The Partner Connect and Marketplace links are merging into a single Marketplace link for streamlined access. Look for it in a higher position on the sidebar.
Legacy Dashboards: Time to Say Goodbye
Support for legacy dashboards ends on April 7, 2025. After November 3, 2025, Databricks will archive unused legacy dashboards. Transition to AI/BI dashboards (formerly Lakeview dashboards) with available upgrade tools.
Other Noteworthy Updates
Here's a quick rundown of additional improvements:
Update | Impact/Details |
---|---|
Delta Sharing History | Includes history by default for improved read performance. |
Serverless Compute | Enhanced cost vs. performance optimization for workflows. |
AWS IAM Role Policy Enforcement | Self-assuming roles required starting January 20, 2025. |
Audit Logs | The sourceIpAddress field will no longer include port numbers. |
Support Tickets | Moved to the Databricks workspace help menu. |
Runtime Changes | JDK 8 support ends; JDK 11 support ends with LTS version 14.x. |
Unity Catalog | Automatically enabled for new workspaces. |
SQLite JDBC Upgrade | Upgraded to version 3.42.0.0 in runtime maintenance releases. |
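The audit log change can affect queries or dashboards that compare sourceIpAddress values across the cutover, since older records may carry a port and newer ones will not. One way to keep both comparable is to normalize the field when reading it. The function below is a sketch under the assumption that older values look like `address:port` (or `[address]:port` for IPv6); the exact historical format is not stated above, so check it against your own logs.

```python
import ipaddress


def strip_port(source_ip: str) -> str:
    """Return only the address from 'addr', 'addr:port', or '[v6addr]:port'."""
    value = source_ip.strip()
    # Bracketed IPv6, with or without a port: [2001:db8::1]:443
    if value.startswith("["):
        host, _, _ = value[1:].partition("]")
        return host
    # A bare IPv4 or IPv6 address is already in the new format.
    try:
        ipaddress.ip_address(value)
        return value
    except ValueError:
        pass
    # Otherwise treat digits after the last colon as a port (assumed format).
    host, sep, port = value.rpartition(":")
    if sep and port.isdigit():
        return host
    return value


for raw in ["10.0.0.1:5432", "10.0.0.1", "[2001:db8::1]:443", "2001:db8::1"]:
    print(raw, "->", strip_port(raw))
```

Checking for a valid bare address first matters for IPv6, where the last colon-separated group could otherwise be mistaken for a port.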
Why This Matters
Databricks is doubling down on making the platform smarter, faster, and more intuitive. These updates are designed to enhance productivity while keeping pace with the demands of modern data engineering.
Get Ready to Embrace the Future
From predictive optimization to enhanced Delta Live Tables and the all-new AI/BI dashboards, Databricks continues to redefine the boundaries of data engineering. Stay ahead by exploring the latest documentation and upgrading your workflows today!
What are your thoughts on these updates? Drop your comments and let's discuss how these changes will impact your data journey.