One question we got during our last webinar: "Is the underlying data warehouse Iceberg? What are the options?"

Short answer: yes. Apache Iceberg is the open format handshake between Bauplan and the rest of your stack. Your data never moves: it stays in your storage layer, your existing source of truth. No migration, no lock-in.

Turns out that matters quite a bit when AI agents are running dozens of experiments in parallel. In this webinar, we show exactly how that works end to end, with our friends at Recce:
→ An agent builds a user segmentation pipeline from scratch, in full branch isolation
→ A second agent adds bot detection to that same pipeline
→ Recce's review agent compares the branches, surfaces schema diffs + lineage impact, and generates an auditable merge report

Zero production risk. Full human oversight. Structured workflow.

Trusting AI agents with your data is one of the hardest unsolved problems in data engineering. This is how we're solving it.

Full recording 👇 https://lnkd.in/g4FpT8Nc
Apache Iceberg: Open Format for Data Warehouse and AI Agents
More Relevant Posts
-
Is your AI missing the "Ground Truth"? 🧩

I caught up with Wes Madrigal, CEO of Kurve, at ODSC West to talk about why the world of metadata and foreign keys is actually the future of Generative AI. I learned that despite the hype, 80% of AI work is still stuck in the data discovery phase.

Wes breaks down how Kurve acts as a developer tool to automate the extraction of primary and foreign keys on data lakes. By building a relationship metadata graph, they allow users to navigate complex data as a graph traversal problem rather than a manual data munging nightmare.

As Wes puts it: "Text to SQL falls short without facts, without robust foreign keys... the reality is you need robust facts, just like humans do."

Why this matters for us GraphGeeks:
- Graph for Data Prep: By treating data preparation as a graph traversal problem, we can automate the manual merging and aggregation that usually slows down analytics.
- The Fact Gap: Relational metadata is making a comeback as the essential ground truth that AI agents need to function reliably.
- True ML automation: It covers the entire end-to-end pipeline, from discovery to model and including data relationships, not just parameter tuning.

Between the technical deep dives, this is another incredible hallway conversation at the Open Data Science Conference (ODSC)! Special thanks to Bryce Merkl Sasaki for being the pro behind the camera and capturing these conversations live from the floor. 🎥

🎧 Listen on the go: https://lnkd.in/gpvE7Qv8
🎬 Watch the full Graph Chat interview here: https://lnkd.in/gXH9Ni-g
Graph Chat: Automating Data Discovery with Wes Madrigal
-
It’s time to stop viewing data governance as a "checkbox" and start seeing it as a competitive advantage. In his latest article featured in Express Computer, Jitender Punia, our Principal Technical Architect for Data Analytics, explores how the absence of a robust data governance framework can quickly turn AI models from assets into liabilities. Read the full feature in our Newsroom: https://lnkd.in/gQ2vQQB5 Do you see data governance as a business enabler or a speed bump in AI adoption? Share your thoughts in the comments below. Narinder Kumar | Vinayak . | Anmol Kalra #GenerativeAI #AIStrategy #DataQuality #DataGovernance
-
Why do so many AI projects stall after the pilot phase? I love the insights Glenn Dekhayser and Ravit Jain shared about how siloed systems, unstructured data, and infrastructure gaps hold enterprises back—and what can be done to fix it. https://lnkd.in/gZhHcHWu
-
A lot of AI conversations focus on models, but the real issue often shows up earlier in the pipeline. If your data is delayed, incomplete, or stuck in legacy systems, the outputs will be too. No amount of tuning fixes stale inputs.

Jess Ramos lays this out well here, especially around the impact of batch pipelines and missed refreshes.

If this is something your team is working through, register for the live event: https://bit.ly/4sAJHtO
📅 April 21
⏰ 10 am ET / 7 am PT

The conversation will cover how to move from batch to incremental approaches, when CDC makes sense, and how to prioritize modernization without breaking what already works.