DAILY BRIEFING · FRIDAY, MAY 29, 2026

Data & AI Platforms Briefing

Agentic AI is rewiring the data lifecycle this week: dbt collapses the semantic layer into model YAML, DuckDB goes client-server with Quack, Snowflake buys Natoma to govern MCP agent access, and Unravel and Acceldata ship autonomous engines that retune Snowflake, Databricks, and BigQuery without human intervention.

Story	Signal
↗ dbt collapses semantic layer into model YAML in Core v1.12	The metrics-layer rewrite lands; measures become metrics, files consolidate.
↗ dbt ships dbt-autofix migrator for the new semantic spec	Existing semantic users get a scripted path off the legacy metrics layer.
↗ DuckDB turns client-server with Quack remote protocol	An HTTP-based protocol unlocks multiple concurrent writers, though it is still beta.
↗ DuckDB 1.5.3 patches Quack into core extensions	A “patch” release that adds a remote protocol; production Quack waits for 2.0.
↗ Apache Flink 2.0.2 patches the streaming-for-AI baseline	Bug-fix release stabilizes the 2.x branch behind real-time agentic pipelines.
↗ Snowflake to acquire Natoma for MCP-governed agent access	Identity and access controls move into the warehouse for AI-agent traffic.
↗ OneLake security goes GA, on by default for all items	Engine-agnostic row/column controls land at the storage layer of Fabric.
↗ Snowflake takes Apache Iceberg v3 to general availability	Deletion vectors, defaults, and row lineage now write-supported on Snowflake.
↗ Snowflake details the engineering behind Iceberg v3 GA	External engines can read v3 via Horizon REST; write-back from outside still gated.
↗ Pinecone opens AWS Frankfurt region for sovereign AI retrieval	Vector workloads gain a central-Europe data residency anchor.
↗ Vespa pushes hybrid retrieval and ranking tooling in May newsletter	Console metrics, query pinning, and a learn.vespa.ai course target RAG teams.
↗ Context architecture is replacing RAG as agents hit scale walls	Retrieval is moving from a layer to an architecture spanning the platform.
↗ Cortex Code in Snowsight goes GA with Agent Teams	Snowflake-native coding agent now coordinates parallel subagents and gains a native Windows CLI.
↗ Unravel Data launches Arvix AI for autonomous platform tuning	Agentic engine claims 40% spend cuts and 4× speed across Databricks, Snowflake, and BigQuery.
↗ Acceldata calls the lakehouse era over with Autonomous Data & AI Platform	Hybrid-by-default routing for agentic workloads goes GA.
↗ OpenMetadata 1.12.9 ships incremental Unity Catalog ingestion	Catalogs detect changed tables and re-ingest only the deltas.
↗ BigID named Leader in Forrester Wave for sensitive data discovery	Classification and discovery converge with AI agent access governance.
↗ Waehner: lineage belongs in a platform-independent catalog	OpenLineage and ODCS get pitched as the cross-vendor lineage spine.

› Stream Processing

Apache Flink · May 2026

Apache Flink 2.0.2 ships stability fixes for the AI-era streaming baseline

The second patch release on the 2.0 line lands targeted fixes around state recovery, checkpoint behavior, and connector resilience: the kind of housekeeping that matters when Flink jobs sit between Kafka and an agentic inference layer. Combined with the 2.2.1 release on May 15 and Kubernetes Operator 1.15.0 on May 26, the community is hardening Flink for production streaming-for-AI rather than chasing new APIs.

✍️ Apache Flink PMC · Read article →

› Transformation Frameworks

dbt Labs · May 2026

What's shipped in dbt — May 2026: semantic layer rewrite goes live in Core v1.12

The May monthly summary leads with the new semantic layer YAML spec: semantic models are now embedded directly inside model YAML entries, measures collapse into simple metrics, and frequently used keys move to the top level. It is a deliberate simplification aimed at lowering the barrier to defining governed metrics — the asset every text-to-SQL and agentic BI tool now pulls from.

✍️ dbt Labs · Read article →

dbt Developer Blog · May 2026

Modernizing the semantic layer spec — and shipping dbt-autofix to migrate

dbt's deeper engineering writeup explains the three changes: measures are gone from the authoring spec, deep dictionary nesting is flattened, and semantic annotations now live alongside model YAML. Crucially, the command dbt-autofix deprecations --semantic-layer ports existing projects off the legacy metrics layer, removing a major reason teams stalled on adoption.

✍️ dbt Labs Engineering · Read article →

› In-Process Compute

DuckDB · May 2026

Quack: the DuckDB client-server protocol that adds multiple writers

DuckDB's defining constraint — single-writer, embedded-only — is now optional. Quack is an HTTP-based remote protocol that lets DuckDB instances talk to one another in a classic client-server arrangement, enabling concurrent writers and remote attachments without giving up the in-process simplicity for the local case. The protocol is beta; production parity is targeted for DuckDB 2.0 in the fall.

✍️ DuckDB Labs · Read article →

DuckDB · May 2026

DuckDB 1.5.3 is “not an ordinary patch release” — Quack lands as a core extension

The 1.5.3 point release packages the Quack remote protocol as a core extension that auto-installs and auto-loads on first use, so any DuckDB client can speak the new protocol without a manual install step. For data engineers, it is the cleanest path yet to using DuckDB as a shared analytical workspace across small teams or services.

✍️ DuckDB Labs · Read article →

› Cloud Data Warehouses

Snowflake · May 2026

Snowflake announces intent to acquire Natoma to govern MCP agent access

Alongside its May 27 earnings beat, Snowflake said it will buy Natoma, an enterprise Model Context Protocol platform that brokers and authorizes AI clients into applications, databases, and APIs. The pitch to platform engineers: a native identity and privileged-access layer for agent traffic, with a curated library of MCP servers reachable from Cortex Agents, Snowflake Intelligence, and Cortex Code. Terms were not disclosed.

✍️ Snowflake · Read article →

› Lakehouses

Microsoft Fabric · May 2026

OneLake security goes GA — fine-grained access control at the storage layer

Microsoft completed the rollout of OneLake security to all supported item types, with the model now enabled by default on creation and applied retroactively to existing items. The role-based model enforces item-, folder-, table-, row-, and column-level controls that travel with the data, so they apply whether it is queried from a Spark notebook, a Power BI report, or a Fabric data agent. New granular APIs let admins manage roles at scale.
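
To make the "controls travel with the data" idea concrete, here is a minimal sketch of row- and column-level rules enforced at a shared read path rather than inside each engine. It is a conceptual illustration only: the Role class, read_table function, sample rows, and the two stand-in "engines" are all hypothetical and are not the OneLake security API.

```python
from dataclasses import dataclass
from typing import Callable

Row = dict  # one record, keyed by column name


@dataclass(frozen=True)
class Role:
    """Hypothetical role: a row predicate plus the set of readable columns."""
    name: str
    row_filter: Callable[[Row], bool]
    visible_columns: frozenset


def read_table(rows, role):
    """Apply row-level, then column-level rules at the storage boundary."""
    return [
        {col: val for col, val in row.items() if col in role.visible_columns}
        for row in rows
        if role.row_filter(row)
    ]


# Illustrative data, not taken from any real system.
ORDERS = [
    {"order_id": 1, "region": "EU", "amount": 120, "email": "a@example.com"},
    {"order_id": 2, "region": "US", "amount": 75, "email": "b@example.com"},
    {"order_id": 3, "region": "EU", "amount": 40, "email": "c@example.com"},
]

EU_ANALYST = Role(
    name="eu_analyst",
    row_filter=lambda row: row["region"] == "EU",
    visible_columns=frozenset({"order_id", "region", "amount"}),
)


# Two stand-in "engines" share the same guarded read path, so neither can
# see more than the role allows.
def notebook_query(role):
    return read_table(ORDERS, role)


def report_query(role):
    return sum(row["amount"] for row in read_table(ORDERS, role))


print(notebook_query(EU_ANALYST))
print(report_query(EU_ANALYST))
```

Because both callers go through read_table, adding a third consumer (an agent, say) would inherit the same restrictions without any engine-specific policy code.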

✍️ Microsoft Fabric Team · Read article →

› Table Formats

Snowflake Docs · May 2026

Snowflake takes Apache Iceberg v3 support to general availability

Following the March 4 preview, Snowflake flipped Iceberg v3 to GA on May 7. Practitioners get default column values, deletion vectors for fast updates and deletes without rewrites, and row-lineage tracking suitable for CDC out of Snowflake-managed tables. External engines can read v3 via the Horizon Iceberg REST Catalog API; cross-engine writes back through Horizon remain unsupported for now.

✍️ Snowflake · Read article →

Snowflake Blog · May 2026

Announcing Apache Iceberg v3 support on Snowflake — the engineering view

Snowflake's companion blog walks through how the v3 features change pipeline economics — most consequentially, deletion vectors replacing copy-on-write for UPDATE and DELETE, and row lineage as the substrate for bidirectional CDC between Snowflake and external engines. For architects choosing between Snowflake-managed v3 and Polaris-cataloged external tables, it sharpens the trade-off around cross-platform write access.
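
The economics of deletion vectors versus copy-on-write can be sketched in a few lines. This is a toy model under stated assumptions: DataFile, the two delete functions, and scan are hypothetical names, and a plain Python frozenset stands in for the compressed position bitmap a real table format would use. It is not Snowflake's or Iceberg's implementation.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class DataFile:
    """Toy immutable data file plus an optional set of deleted row positions."""
    rows: tuple
    deleted: frozenset = frozenset()


def delete_copy_on_write(data_file, predicate):
    """Rewrite the whole file without the matching rows.

    Returns the new file and the number of rows that had to be rewritten.
    """
    kept = tuple(
        row
        for pos, row in enumerate(data_file.rows)
        if pos not in data_file.deleted and not predicate(row)
    )
    return DataFile(rows=kept), len(kept)


def delete_with_vector(data_file, predicate):
    """Leave the data untouched and record the matching row positions.

    Returns the new file state and the number of rows rewritten (always 0).
    """
    hits = frozenset(
        pos for pos, row in enumerate(data_file.rows) if predicate(row)
    )
    return DataFile(rows=data_file.rows, deleted=data_file.deleted | hits), 0


def scan(data_file):
    """Readers skip any position listed in the deletion vector."""
    return [
        row
        for pos, row in enumerate(data_file.rows)
        if pos not in data_file.deleted
    ]


original = DataFile(rows=tuple({"id": i} for i in range(1000)))


def is_target(row):
    return row["id"] == 42


cow_file, cow_rewritten = delete_copy_on_write(original, is_target)
dv_file, dv_rewritten = delete_with_vector(original, is_target)

print(cow_rewritten, dv_rewritten)
print(scan(cow_file) == scan(dv_file))
```

Both paths return the same rows on read; the difference is that the copy-on-write path rewrites every surviving row to delete one, while the vector path writes only a small side structure and pushes a filtering cost onto readers.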

✍️ Snowflake Engineering · Read article →

› Vector & Specialty Stores

BigDATAwire · May 2026

Pinecone opens AWS Frankfurt region for sovereign AI retrieval

Pinecone's serverless vector database and the broader knowledge infrastructure stack are now available in AWS eu-central-1. The Frankfurt landing is paired with prior launch-week introductions — Pinecone Nexus and the KnowQL declarative retrieval language — and gives EU-bound platform teams a path to keep agentic retrieval inside data-residency boundaries instead of routing it through US regions.

✍️ BigDATAwire / Pinecone · Read article →

› AI-Driven Consumption

Snowflake Blog · May 2026

Cortex Code in Snowsight goes GA, adds Agent Teams and Windows CLI

Snowflake's developer-facing coding agent moves out of preview inside Snowsight, the CLI gains native Windows support, and a new Agent Teams construct lets a primary agent decompose work across coordinated subagents. Combined with the May 26 release of additional data-clean-room skills, Cortex Code is positioning itself as the in-platform automation surface for everything from pipeline scaffolding to governed cross-org collaboration flows.

✍️ Snowflake · Read article →

› Enterprise RAG & Retrieval

Vespa Blog · May 2026

Vespa's May newsletter: retrieval quality, ranking flexibility, learn.vespa.ai

Vespa's monthly roundup leans into the parts of the stack RAG teams complained about: deeper per-application metrics in the Cloud Console, group pinning so paginated queries stay consistent across requests, and learn.vespa.ai — a self-paced course that walks engineers from BM25 baselines to ML-ranked hybrid retrieval on a working e-commerce search app.

✍️ Vespa.ai · Read article →

VentureBeat · May 2026

Context architecture is replacing RAG as agentic AI pushes retrieval to its limits

The piece argues that monolithic RAG-as-a-layer is breaking under agentic workloads, where retrieval needs to span permissioned tools, memory, structured tables, and multi-hop reasoning. Vendors interviewed describe a shift to “context architecture”: retrieval as a first-class platform concern with policy, freshness, and ranking treated like data engineering rather than a prompt-time afterthought.

✍️ VentureBeat · Read article →

› Data Observability

Business Wire · May 2026

Acceldata launches an Autonomous Data & AI Platform — and calls the lakehouse era over

Announced May 19, Acceldata's new platform is positioned as “governed compute wherever the data lives,” a hybrid-by-default control plane that routes workloads, augments quality, enforces policies, and tunes cost across cloud, on-prem, and sovereign environments at the speed of agents. CEO Rohit Choudhary's framing — “the lakehouse architecture broke in the agentic era” — is the most provocative pitch in the category this quarter.

✍️ Acceldata · Read article →

› Catalogs & Metadata

OpenMetadata Docs · May 2026

OpenMetadata 1.12.9 lands incremental Unity Catalog ingestion

The May 28 maintenance release adds change detection to the Unity Catalog connector, so OpenMetadata re-ingests only the modified entities instead of crawling the full estate — material for large Databricks tenants. Tag handling for Databricks and Unity Catalog tables that are tagged without explicit values is now correct, and there are targeted UI, search, and Python-client fixes.
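
Change detection of this kind is often built on a watermark: remember when the last crawl ran and re-ingest only entities modified since. The sketch below shows that general pattern and is an assumption about the approach, not OpenMetadata's connector code; select_changed, next_watermark, the table names, and the timestamps are all hypothetical.

```python
from datetime import datetime, timezone
from typing import Dict, List, Optional


def select_changed(
    last_modified: Dict[str, datetime], watermark: Optional[datetime]
) -> List[str]:
    """Return the tables to ingest: everything on the first run, deltas after."""
    if watermark is None:
        return sorted(last_modified)
    return sorted(
        name for name, modified in last_modified.items() if modified > watermark
    )


def next_watermark(
    last_modified: Dict[str, datetime], watermark: Optional[datetime]
) -> Optional[datetime]:
    """Advance the watermark to the newest modification time seen so far."""
    newest = max(last_modified.values(), default=None)
    if newest is None:
        return watermark
    if watermark is None:
        return newest
    return max(newest, watermark)


def utc(year, month, day):
    return datetime(year, month, day, tzinfo=timezone.utc)


# Hypothetical catalog snapshot: table name -> last modification time.
catalog = {
    "sales.orders": utc(2026, 5, 27),
    "sales.customers": utc(2026, 5, 20),
    "ops.events": utc(2026, 5, 28),
}

full_crawl = select_changed(catalog, None)
incremental = select_changed(catalog, utc(2026, 5, 25))

print(full_crawl)
print(incremental)
print(next_watermark(catalog, utc(2026, 5, 25)))
```

On a large estate the incremental list is typically a small fraction of the full crawl, which is where the saving comes from.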

✍️ OpenMetadata Community · Read article →

› Data Contracts & Lineage

Kai Waehner · May 2026

Beyond enterprise data lineage: the case for a platform-independent catalog

Waehner argues that lineage hard-wired into a single vendor's catalog (Unity, Polaris, OneLake, Horizon) collapses the moment a workload spans two of them — which is now the default. The recommended pattern: a vendor-neutral catalog layer fed by OpenLineage events and Open Data Contract Standard definitions, treating lineage as cross-platform infrastructure rather than a feature in any one platform.
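
For readers who have not seen one, an OpenLineage run event is a small JSON document naming a job, a run, and its input and output datasets. The sketch below builds one for a job that reads from one platform and writes to another. The field names follow the OpenLineage RunEvent model as commonly documented; the producer URI, dataset namespaces, job names, and the exact schemaURL version are assumptions to check against the spec in use.

```python
import json
import uuid
from datetime import datetime, timezone


def lineage_event(job_namespace, job_name, inputs, outputs, event_type="COMPLETE"):
    """Build a minimal OpenLineage-style run event as a plain dict.

    inputs and outputs are lists of (namespace, name) pairs.
    """
    return {
        "eventType": event_type,
        "eventTime": datetime.now(timezone.utc).isoformat(),
        "run": {"runId": str(uuid.uuid4())},
        "job": {"namespace": job_namespace, "name": job_name},
        "inputs": [{"namespace": ns, "name": name} for ns, name in inputs],
        "outputs": [{"namespace": ns, "name": name} for ns, name in outputs],
        # Hypothetical producer; a real emitter would identify itself here.
        "producer": "https://example.com/hypothetical-lineage-emitter",
        # Assumed schema version; verify against the OpenLineage release in use.
        "schemaURL": "https://openlineage.io/spec/2-0-2/OpenLineage.json#/$defs/RunEvent",
    }


# A job that spans two platforms: the case a single-vendor catalog cannot see.
event = lineage_event(
    job_namespace="nightly-pipelines",
    job_name="orders_to_lakehouse",
    inputs=[("snowflake://example-account", "analytics.public.orders")],
    outputs=[("databricks://example-workspace", "main.sales.orders_enriched")],
)

print(json.dumps(event, indent=2))
```

Because the input and output datasets carry different namespaces, a neutral catalog that collects these events can stitch the edge between the two platforms without either vendor's catalog knowing about the other.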

✍️ Kai Waehner · Read article →

› Governance, Security & Compliance

PR Newswire · May 2026

BigID named a Leader in Forrester Wave for sensitive data discovery and classification

BigID earned Leader status in Forrester's Q2 2026 Wave and used the moment to extend Data Access Governance to AI agents — visibility and control over what non-human identities can read and act on across the data estate. For platform teams, the underlying signal is that classification, DSPM, and agent access governance are converging into one control surface.

✍️ BigID · Read article →

› FinOps for Data

SiliconANGLE · May 2026

Unravel Data launches Arvix AI to autonomously tune Databricks, Snowflake, and BigQuery

Unravel's new agentic engine, announced May 27, claims average savings of 40% on platform spend and 4× performance gains by continuously rewriting queries, right-sizing infrastructure, and pruning storage, with every change validated against real workload behaviour before commit and automatically reverted on regression. A reference airline cut $340K in spend in three days via 1,500 auto-applied insights. Unravel pitches its production-readiness model (“test, apply, watch, roll back”) as the differentiator from FinOps dashboards.
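
The “test, apply, watch, roll back” loop is a familiar guarded-change pattern, and a minimal sketch helps show what separates it from a dashboard recommendation. Everything here is hypothetical: guarded_apply, the cost functions, and the warehouse_size setting are illustrative stand-ins, not Unravel's engine or any vendor API.

```python
def guarded_apply(config, change, replay_cost, live_cost):
    """Test a change offline, apply it, watch live cost, roll back on regression.

    Returns (active_config, status), where status is one of
    "rejected", "rolled_back", or "kept".
    """
    candidate = change(dict(config))

    # Test: replay the recorded workload against the candidate configuration.
    if replay_cost(candidate) >= replay_cost(config):
        return config, "rejected"

    # Apply: the candidate becomes the active configuration.
    baseline = live_cost(config)
    active = candidate

    # Watch: compare live behaviour with the pre-change baseline.
    if live_cost(active) > baseline:
        # Roll back: restore the previous configuration.
        return config, "rolled_back"
    return active, "kept"


def replay_cost(config):
    """Toy offline model: cost scales with warehouse size."""
    return config["warehouse_size"] * 10


def live_cost(config):
    """Toy live model: undersized warehouses pay a queueing penalty."""
    penalty = 100 if config["warehouse_size"] < 2 else 0
    return config["warehouse_size"] * 10 + penalty


def resize_to(size):
    def change(config):
        config["warehouse_size"] = size
        return config
    return change


current = {"warehouse_size": 4}

print(guarded_apply(current, resize_to(2), replay_cost, live_cost))
print(guarded_apply(current, resize_to(1), replay_cost, live_cost))
print(guarded_apply(current, resize_to(8), replay_cost, live_cost))
```

The second case is the interesting one: the change looks cheaper in replay but regresses under live conditions, so the loop restores the original configuration instead of leaving the regression in place.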

✍️ SiliconANGLE / Unravel Data · Read article →

Compiled by Rainvil Labs · Friday, May 29, 2026
Sources verified via live web research on May 29, 2026 across vendor engineering blogs (dbt Labs, DuckDB, Snowflake, Microsoft Fabric, Vespa, Apache Flink, OpenMetadata), industry analyst outlets (SiliconANGLE, BigDATAwire, VentureBeat), and primary press releases (Business Wire, PR Newswire). This briefing is for informational purposes only and does not constitute legal, regulatory, or investment advice.
