DAILY BRIEFING · SATURDAY, MAY 23, 2026

Data & AI Platforms Briefing

Agentic AI moves from the edges to the core of the platform — multi-agent protocols land in streaming, runtime guardrails reach GA in the warehouse, and observability vendors begin executing remediation without a human in the loop.


⇣ Jump To

🔄 ⚡ Move & Transform

Streaming & Messaging ·  CDC ·  Stream Processing ·  Transformation Frameworks ·  In-Process Compute

🏛️ 🗄️ Store & Architect

Cloud Data Warehouses ·  Lakehouses ·  Table Formats ·  Architectural Patterns ·  Specialty Platforms

⚡ 📤 Consume & Activate

AI-Driven Consumption ·  Semantic Layers & Retrieval ·  Enterprise RAG & Retrieval

🛡️ ⚙️ Govern & Operate

Orchestration & Workflow ·  Data Observability ·  Data Quality & Testing ·  Catalogs & Metadata ·  Governance, Security & Compliance

⚡ QUICK TAKES

Story · Signal
  Confluent ships A2A protocol for multi-agent networks · The streaming bus becomes the agent bus.
  Redpanda posts 70% ARR growth on agentic data plane · Kafka-compat rebrands as AI substrate, with results.
  Estuary: production Debezium is "far from set-and-forget" · CDC operational tax becomes the buying criterion.
  RisingWave claims 22 of 27 Nexmark wins over Flink · Streaming database model challenges DAG orthodoxy.
  dbt Labs ships new Semantic Layer YAML, Fusion in Core 1.12 · Semantic models collapse into model YAML entries.
  DuckDB 1.5.2 lands DuckLake; Polars + DuckDB stack matures · In-process is now production for medium-scale analytics.
  Snowflake Batch Cortex Search GA on May 18 · Millions of fuzzy lookups in one SQL statement.
  Snowflake adds column-level lineage to dbt DAG view · Horizon Catalog wires lineage into the dev loop.
  Databricks Native Lakehouse Sync goes preview on Autoscaling · Postgres WAL writes directly to Delta, zero pipeline.
  Apache Iceberg v3 GA on Snowflake (May 7) · VARIANT, deletion vectors, row lineage in production.
  Open lakehouse "no longer experimental" in 2026 · Federated catalogs like Polaris reshape the pattern.
  Teradata launches Autonomous Knowledge Platform · Sovereign-AI play with Dell/NVIDIA on-prem option.
  Cortex AI Guardrails GA for Snowflake Intelligence · Prompt-injection defense at the warehouse boundary.
  AtScale Semantic Layer Summit puts agentic analytics center-stage · 8,500+ practitioners; OSI and AI take the keynote.
  Context architecture replaces RAG for agent workloads · Agents pull data at runtime, not as pre-loaded payload.
  Astronomer + IBM OEM Airflow for regulated industries · 70% downtime cut, claims joint launch material.
  Acceldata's Autonomous Data & AI Platform reaches GA · Agents detect, diagnose, and remediate without alerts.
  CTERA InsightAI brings agentic management to unstructured data · Audit logs + permissions analyzed in natural language.
  Soda 4.0 fuses observability with a contracts engine · Contracts + anomaly checks ship in one open-source core.
  OpenMetadata 1.12.8 hardens Unity Catalog & Iceberg ingest · CVE patches and Iceberg property surfacing land May 13.
  Informatica ties Iceberg governance into Snowflake · "Build once, deploy anywhere" row-level policy expands.
🔄

Move & Transform

› Streaming & Messaging

TechTarget · May 2026

Confluent Adds A2A Support to Fuel Multi-Agent AI Networks

Confluent Intelligence now supports Google's Agent2Agent (A2A) protocol in open preview, letting Streaming Agents orchestrate task handoffs across heterogeneous agent frameworks over Kafka topics. Paired with Multivariate Anomaly Detection, the move positions the streaming bus as the orchestration spine for multi-agent enterprise networks, and gives platform teams a single place to govern, secure, and observe agent traffic alongside data traffic.

✍️ TechTarget · Read article →

Yahoo Finance / Redpanda · May 2026

Redpanda Reports Record Q1, Delivering 70% Year-over-Year Growth on Agentic Data Plane

Redpanda's FY27 Q1 numbers show 70% ARR growth, with the company crediting demand for "data and governance infrastructure needed to deploy AI agents safely at scale." Kafka compatibility has been deliberately backgrounded in favor of the Agentic Data Plane (MCP, A2A, AI Gateway, OIDC-based identity, OpenTelemetry traces) — a signal that the streaming vendor positioning fight is now over agent governance, not throughput.

✍️ Redpanda via Yahoo Finance · Read article →

› CDC

Estuary · April 2026

Debezium for CDC in Production: Pain Points and Limitations

Estuary's field-report takedown catalogs the operational tax of running Debezium at scale: Kafka Connect lifecycle, replication-slot management, snapshot semantics, schema-registry coordination, and the lack of native idempotent sinks. The piece is pointed marketing, but the failure modes it lists are the same ones platform teams hit in week 12 of a Postgres-to-Snowflake build — and they're shaping the buying conversation around CDC managed services.

✍️ Estuary · Read article →
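
One item on Estuary's list above, the lack of native idempotent sinks, is easy to show in miniature. The sketch below is an illustration only, not Debezium or Estuary code: `ChangeEvent`, `apply_event`, and the in-memory `table` dict are hypothetical names, and it assumes every event carries a primary key and a monotonically increasing log position (`lsn`). Under those assumptions, redelivering a batch after a retry leaves the target unchanged, because events at or before the last applied position are skipped.

```python
from dataclasses import dataclass
from typing import Optional


@dataclass(frozen=True)
class ChangeEvent:
    """Hypothetical CDC event: key, log position, and new row (None = delete)."""
    key: int
    lsn: int
    row: Optional[dict]


def apply_event(table: dict, event: ChangeEvent) -> bool:
    """Apply one event idempotently; return True if the table changed."""
    current = table.get(event.key)
    # Skip anything at or before the position already applied for this key.
    if current is not None and event.lsn <= current["lsn"]:
        return False
    # Deletes are kept as tombstones so a replayed older insert
    # cannot resurrect the row.
    table[event.key] = {"lsn": event.lsn, "row": event.row}
    return True


def live_rows(table: dict) -> dict:
    """Rows visible to readers: everything that is not a tombstone."""
    return {k: v["row"] for k, v in table.items() if v["row"] is not None}


events = [
    ChangeEvent(1, 100, {"name": "ada"}),
    ChangeEvent(2, 101, {"name": "bob"}),
    ChangeEvent(1, 102, {"name": "ada l."}),
    ChangeEvent(2, 103, None),  # delete key 2
]

table: dict = {}
for e in events:
    apply_event(table, e)
first_pass = live_rows(table)

# Simulate an at-least-once redelivery of the whole batch.
changed_on_replay = [apply_event(table, e) for e in events]
second_pass = live_rows(table)
```

A sink without this kind of position check would re-insert key 2 on replay. That duplicate-and-resurrect behavior is the failure mode the article attributes to at-least-once delivery.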

› Stream Processing

RisingWave · May 2026

Apache Flink vs RisingWave: A Practical Comparison for 2026

The streaming-database vendor claims RisingWave outperforms Flink on 22 of 27 Nexmark queries, with the gap widest on multi-stream joins (10+ inputs) where Flink's RocksDB-backed state management struggles. Beyond the benchmark argument, the piece frames a real architectural choice for platform teams: PostgreSQL-compatible streaming database with built-in storage versus distributed DAG framework — and what each implies for ops, replay, and integration with downstream lakehouses.

✍️ RisingWave · Read article →

› Transformation Frameworks

dbt Labs · May 2026

What's Shipped in dbt — May 2026

The May release introduces a new Semantic Layer YAML spec that embeds semantic models inside model YAML entries, promotes measures to simple metrics, and lifts frequently-used options to top-level keys; the spec ships in dbt Core v1.12 and on the platform "Latest" track. With the Fivetran merger still pending close, the rapid spec evolution and Fusion engine availability through dbt Projects on Snowflake suggest dbt is locking in semantic-layer relevance before vendor consolidation lands.

✍️ dbt Labs · Read article →

› In-Process Compute

PyInns · 2026

DuckDB vs Polars in 2026 — Benchmarks & Guide

DuckDB 1.5.2 (April 2026) brings the DuckLake extension to production and continues to harden SQL-first workflows, while Polars wins on transformation-heavy, programmatic logic. The two engines now share Arrow-native memory with zero-copy handoff, making "DuckDB for relational, Polars for dataframe" a defensible production split for medium-scale jobs that used to need Spark.

✍️ PyInns · Read article →



🏛️ 🗄️

Store & Architect

› Cloud Data Warehouses

Snowflake Docs · May 2026

Batch Cortex Search Reaches General Availability (May 18, 2026)

The new CORTEX_SEARCH_BATCH table function executes millions of fuzzy-match queries against a Cortex Search Service in a single SQL statement, with separate batch compute that doesn't degrade interactive serving. Snowflake is targeting entity resolution, deduplication, and clustering workloads that previously required external pipelines — and crucially, the service can be queried in batch and interactive mode concurrently, with batch jobs able to hit suspended serving instances.

✍️ Snowflake · Read article →

Snowflake Docs · May 2026

dbt Projects on Snowflake: Column-Level Lineage and Fusion Engine

The May 19 release renders each dbt DAG node with its columns sourced from Horizon Catalog; selecting a column highlights every upstream and downstream model touching it. dbt Projects on Snowflake also now supports the dbt Fusion engine at no extra license cost. For platform teams, lineage moves from an after-the-fact catalog problem to a development-time view inside the editor.

✍️ Snowflake · Read article →

› Lakehouses

Databricks · May 2026

Announcing Native Lakehouse Sync (Public Preview on Lakebase Autoscaling)

Lakehouse Sync decodes Lakebase's Postgres write-ahead log and writes directly to Unity Catalog managed Delta tables as SCD Type-2 history, with no Kafka Connect, no Debezium, and no external compute. It is enabled with a schema-level toggle, with claimed zero impact on Postgres and no extra cost. If it holds up, this is the simplest OLTP-to-lakehouse CDC path on the market, and a direct shot at the standalone CDC vendor stack.

✍️ Databricks · Read article →
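
For readers less familiar with the target shape, SCD Type-2 history means every change closes the previous version of a row and opens a new one. The sketch below shows that bookkeeping on an already-decoded change stream. It is not Databricks' implementation: the `Change` tuple, the `to_scd2` function, and the `valid_from` / `valid_to` column names are assumptions made for illustration.

```python
from typing import NamedTuple, Optional


class Change(NamedTuple):
    """Hypothetical decoded WAL record."""
    op: str                 # "insert", "update", or "delete"
    key: int
    values: Optional[dict]  # new column values; None for deletes
    ts: int                 # commit timestamp


def to_scd2(changes: list) -> list:
    """Fold an ordered change stream into SCD Type-2 history rows."""
    history: list = []
    open_row: dict = {}  # key -> index of the currently open version
    for c in changes:
        # Close the open version, if any, at this change's timestamp.
        if c.key in open_row:
            history[open_row.pop(c.key)]["valid_to"] = c.ts
        # Inserts and updates open a new version; deletes only close.
        if c.op in ("insert", "update"):
            history.append(
                {"key": c.key, **c.values, "valid_from": c.ts, "valid_to": None}
            )
            open_row[c.key] = len(history) - 1
    return history


changes = [
    Change("insert", 1, {"plan": "free"}, 10),
    Change("update", 1, {"plan": "pro"}, 20),
    Change("insert", 2, {"plan": "free"}, 25),
    Change("delete", 1, None, 30),
]
history = to_scd2(changes)

# Rows still open (valid_to is None) are the current state of the table.
current = [r for r in history if r["valid_to"] is None]
```

Filtering on `valid_to is None` recovers the current state, while the closed rows preserve what each key looked like at any earlier timestamp.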

› Table Formats

Snowflake Docs · May 2026

Apache Iceberg v3 General Availability on Snowflake (May 7, 2026)

With v3 GA, Snowflake gets VARIANT for semi-structured payloads in a relational table, deletion vectors for faster CDC, row-level lineage, default values, and richer types, most of which already align with Delta v4. The story for architects is convergence: the same physical Iceberg table now exposes the AI-shaped surface (VARIANT, lineage) that previously required separate JSON columns or external doc stores.

✍️ Snowflake · Read article →
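
Deletion vectors are the least self-explanatory feature in the v3 item above. Conceptually, a delete records the positions of removed rows in a side structure instead of rewriting the data file, and readers mask those positions at scan time. The sketch below illustrates the idea with a plain Python set. The Iceberg spec stores deletion vectors as bitmap blobs rather than anything like this, and the `DataFile` class and its methods are hypothetical.

```python
from dataclasses import dataclass, field


@dataclass
class DataFile:
    """Hypothetical immutable data file plus a mutable deletion vector."""
    rows: tuple
    deleted: set = field(default_factory=set)  # row positions marked deleted

    def delete_where(self, predicate) -> int:
        """Mark matching positions as deleted without rewriting rows."""
        hits = {i for i, r in enumerate(self.rows) if predicate(r)} - self.deleted
        self.deleted |= hits
        return len(hits)

    def scan(self) -> list:
        """Merge-on-read: skip positions present in the deletion vector."""
        return [r for i, r in enumerate(self.rows) if i not in self.deleted]


f = DataFile(rows=(("a", 1), ("b", 2), ("c", 3), ("d", 4)))
removed = f.delete_where(lambda r: r[1] % 2 == 0)
visible = f.scan()
```

The data tuple is never rewritten, which is why this style of delete suits high-churn CDC targets: the write cost scales with the number of deleted positions, not with file size.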

› Architectural Patterns

Architecture & Governance Magazine · May 2026

Breaking the Silos: The Rise of the Open Lakehouse Architecture in 2026

The piece argues the open lakehouse has moved from "emerging" to "default enterprise pattern" in 2026, driven by federated catalogs (Apache Polaris graduated in February), open table-format convergence, and the practical demand for engine-swap agility under agentic AI workloads. The argument that resonates for architects: vendor lock-in at the storage layer becomes intolerable when AI workloads demand multiple compute engines hitting the same physical files.

✍️ Architecture & Governance Magazine · Read article →

› Specialty Platforms

Teradata · May 2026

Introducing the Teradata Autonomous Knowledge Platform

Announced May 7, the platform bundles Teradata AI Studio, an enhanced Teradata Cloud with Elastic Compute, and an on-premises "Factory" variant on Dell PowerEdge + NVIDIA AI Enterprise for regulated workloads. The pitch is sovereign agentic AI without re-platforming — interesting positioning for an incumbent fighting to stay relevant against Snowflake and Databricks now that data residency is a first-order AI requirement.

✍️ Teradata · Read article →



📤

Consume & Activate

› AI-Driven Consumption

Snowflake Docs · May 2026

Cortex AI Guardrails GA for Snowflake Intelligence & Cortex Agents (May 14, 2026)

Guardrails, part of Horizon Catalog, now provide runtime detection of prompt injection and jailbreak attempts across Snowflake Intelligence, Cortex Agents, and Cortex Code, including indirect injection embedded in tool calls. Admins flip a single account-level AI_SETTINGS parameter to enable it across all three surfaces. For platform teams, this pushes the AI-safety control plane down into the warehouse rather than the application layer.

✍️ Snowflake · Read article →

› Semantic Layers & Retrieval

AtScale · May 2026

Semantic Layer Summit 2026 Centers on Agentic Analytics and Open Semantics

AtScale's May 20 summit drew 8,500+ practitioners, with speakers from Vodafone, TELUS, Carrefour, Papa Johns, Blue Yonder, and SlickDeals covering the architectural foundations needed to scale AI in production. The recurring framing, the semantic layer as the governed interface between agents and data, is now the consensus among governance-aware platform teams, even as the Open Semantic Interchange standard tries to align dbt, Cube, AtScale, and warehouse-native semantics.

✍️ AtScale · Read article →

› Enterprise RAG & Retrieval

VentureBeat · May 2026

Context Architecture Is Replacing RAG as Agentic AI Pushes Enterprise Retrieval to Its Limits

VB Pulse's Q1 2026 tracker shows retrieval optimization spending overtook evaluation spending for the first time, with enterprises pivoting to runtime tool-call retrieval as agents make orders of magnitude more requests than human users. Redis Iris is the headline reference example: its semantic interface auto-generates MCP tools from data models, with 99% of memory on Flex flash at a tenth of in-memory cost. The architectural takeaway for data-platform teams: the retrieval layer is becoming a real-time service tier, not a pre-loaded index.

✍️ VentureBeat · Read article →



🛡️ ⚙️

Govern & Operate

› Orchestration & Workflow

Astronomer / IBM · May 2026

Astronomer and IBM Collaborate to Transform Enterprise Data Orchestration

"Astronomer with IBM" packages Astronomer's managed Airflow as a client-hosted OEM product for regulated industries, with a claimed 70% reduction in data downtime and 20% faster pipeline build/test. For enterprises stuck on self-managed Airflow because neither Cloud Composer nor MWAA meets their residency requirements, this is a credible third path, and a sign IBM is buying its way back into the modern data stack via partnerships rather than acquisition.

✍️ Astronomer · Read article →

› Data Observability

BigDATAwire / Acceldata · May 2026

Acceldata Launches Autonomous Data & AI Platform for the Agentic AI Era

GA worldwide May 19. The platform shifts the observability narrative from "alert humans" to "agents detect, diagnose, and remediate" across five domains — quality, pipelines, infrastructure, usage, and cost. Acceldata's "xLake" framing positions compute as governed-and-portable rather than tied to a single warehouse; whether the autonomous-remediation claim survives contact with regulated change-management practices is the question for platform leads.

✍️ Acceldata via BigDATAwire · Read article →

Help Net Security / CTERA · May 2026

CTERA Launches InsightAI: Agentic Intelligence for Unstructured Data Environments

Announced May 20, InsightAI correlates audit trails, metadata, permissions, capacity trends, and security events across CTERA's unstructured data platform, with an "Ask InsightAI" natural-language assistant for investigation. Deployable as SaaS, in a customer-managed VPC, or fully air-gapped (including AWS GovCloud / Azure Government). Targets compliance reporting, storage cost optimization, and chargeback for the file-data side of the estate.

✍️ Help Net Security · Read article →

› Data Quality & Testing

Soda · 2026

Introducing Soda 4.0: AI, Engineers, and Business SMEs in One Platform

Soda 4.0 unifies observability with a new open-source data contracts engine (Soda Core 4.0) and adds always-on anomaly detection that runs even without manually authored checks. The bet — also visible at Gable and Datafold — is that the contracts-versus-observability split was always a tooling artifact and that platform teams want a single layer that enforces upstream agreements and detects unanticipated drift downstream.

✍️ Soda · Read article →
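
The split the Soda item describes can be made concrete: a contract check enforces an explicit, agreed rule, while an anomaly check flags drift nobody wrote a rule for. The sketch below is not Soda's API. `check_contract`, `is_anomalous`, the contract dict, and the 3-sigma threshold are all assumptions chosen for illustration.

```python
from statistics import mean, stdev


def check_contract(rows: list, contract: dict) -> list:
    """Return human-readable violations of an explicit column contract."""
    violations = []
    for i, row in enumerate(rows):
        for col, expected_type in contract.items():
            if col not in row or row[col] is None:
                violations.append(f"row {i}: {col} is missing")
            elif not isinstance(row[col], expected_type):
                violations.append(
                    f"row {i}: {col} is not {expected_type.__name__}"
                )
    return violations


def is_anomalous(history: list, today: float, sigmas: float = 3.0) -> bool:
    """Flag today's metric if it sits more than `sigmas` deviations from history."""
    mu, sd = mean(history), stdev(history)
    if sd == 0:
        return today != mu
    return abs(today - mu) / sd > sigmas


# Upstream agreement: explicit, authored by people.
contract = {"order_id": int, "amount": float}
rows = [
    {"order_id": 1, "amount": 9.5},
    {"order_id": 2, "amount": None},
    {"order_id": "3", "amount": 4.0},
]
violations = check_contract(rows, contract)

# Downstream drift: no authored rule, only a learned baseline.
daily_row_counts = [1000, 1020, 980, 1010, 990]
drift = is_anomalous(daily_row_counts, 400)
normal = is_anomalous(daily_row_counts, 1005)
```

Both checks read the same table but answer different questions, which is the case for running them in one layer rather than two tools.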

› Catalogs & Metadata

OpenMetadata · May 2026

OpenMetadata 1.12.8: CVE Hardening, Unity Catalog & Iceberg-on-Athena Fixes

The May 13 maintenance release patches recently disclosed CVEs, eliminates hotspots in the search and tag pipelines, and improves connector behavior across Databricks, Unity Catalog, Athena, Datalake, and OpenLineage. Notable: Unity Catalog no longer hard-fails on missing httpPath, Iceberg-on-Athena tables now ingest properties from $properties metatables, and PostgreSQL/MSSQL connectors gain mTLS support — a quiet but important release for production users.

✍️ OpenMetadata · Read article →

› Governance, Security & Compliance

Informatica · May 2026

Informatica Brings Headless Data Management and Iceberg Governance to Snowflake

Announced May 20 at Informatica World, the release extends Cloud Data Access Management (CDAM) row-level policy enforcement to Snowflake Iceberg tables under a "build once, deploy anywhere" model. Following its parallel Databricks Lakebase integration earlier this week, Informatica is positioning CDAM as the cross-platform governance plane that floats above whichever lakehouse the workload happens to land on — a credible answer to multi-engine reality.

✍️ Informatica · Read article →


Compiled by Rainvil Labs · Saturday, May 23, 2026
Sources verified via live web research on May 23, 2026. Outlets cited: Snowflake Documentation, Databricks Blog, Confluent, Redpanda / Yahoo Finance, Estuary, RisingWave, dbt Labs, Teradata, AtScale, VentureBeat, TechTarget, Astronomer, BigDATAwire, Help Net Security, Soda, OpenMetadata, Informatica, PyInns, Architecture & Governance Magazine. This briefing is for informational purposes only and does not constitute legal, regulatory, or investment advice.