OpenAI's internal data agent failed when it relied on table schemas alone. The data and structure were there, but the agent couldn't reliably answer questions because it didn't understand what the data actually meant. The real challenge is building the right context. As OpenAI shared in a detailed write-up, they ended up building six layers of context on top of the schemas:
1. Table usage patterns from historical queries
2. Human annotations with business definitions
3. AI-powered code enrichment to understand how pipelines produce the data
4. Institutional knowledge from Slack and Docs
5. A memory system that learns from corrections
6. Live runtime queries against the warehouse
Only after all six layers were in place did the agent start delivering reliable results across 3,500 users and 70,000 datasets.
https://lnkd.in/gfZ4GUnd
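One way to picture the six layers is as named slots that get filled in alongside the raw schema before the agent sees a question. The sketch below is illustrative only: the layer names, the `ContextBundle` class, and its methods are hypothetical and paraphrase the list in the post; nothing here reflects OpenAI's actual implementation.

```python
from dataclasses import dataclass, field

# Hypothetical layer names paraphrasing the six layers listed in the post.
LAYER_ORDER = [
    "table_usage",
    "human_annotations",
    "code_enrichment",
    "institutional_knowledge",
    "memory",
    "runtime_queries",
]


@dataclass
class ContextBundle:
    """Schema plus whatever notes have been collected for each context layer."""

    schema: str
    layers: dict = field(default_factory=dict)  # layer name -> list of notes

    def add(self, layer: str, note: str) -> None:
        # Reject layers outside the six named above.
        if layer not in LAYER_ORDER:
            raise ValueError(f"unknown layer: {layer}")
        self.layers.setdefault(layer, []).append(note)

    def missing_layers(self) -> list:
        # Layers that still have no notes, in the fixed order.
        return [name for name in LAYER_ORDER if not self.layers.get(name)]

    def render(self) -> str:
        # Schema first, then each populated layer in the fixed order.
        parts = [f"[schema]\n{self.schema}"]
        for name in LAYER_ORDER:
            for note in self.layers.get(name, []):
                parts.append(f"[{name}]\n{note}")
        return "\n\n".join(parts)


bundle = ContextBundle(schema="orders(id, amount)")
bundle.add("human_annotations", "amount is stored in USD cents")
print(bundle.render())
print("still missing:", bundle.missing_layers())
```

A check like `missing_layers()` mirrors the post's point that the schema alone is not enough: an agent built this way could refuse to answer, or flag low confidence, while layers are still empty.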
Ryft
Software Development
New York, NY 1,860 followers
Intelligent Iceberg Management without the lock-in.
About us
Ryft is the Automated Iceberg Management Solution. We help data teams create a truly open, automated, and cost-effective Iceberg lakehouse by maintaining and optimizing Iceberg tables in real time, based on actual usage patterns. Ryft also automates governance, GDPR compliance, and data lifecycle so data stays secure and compliant.
- Website
- https://ryft.io
- Industry
- Software Development
- Company size
- 11-50 employees
- Headquarters
- New York, NY
- Type
- Privately Held
- Founded
- 2024
Locations
- Primary: New York, NY, US
- 60 E 8th St, New York, NY 10003, US
Updates
-
In our webinar with Firebolt, we showed queries going from 8.2 seconds to 0.4 seconds, roughly 20x faster. But with streaming workloads, your data keeps changing: new files arrive every minute and partitions grow unevenly. The goal is to operationalize your lakehouse so that query performance continuously returns to that optimized level. Yuval Yogev and John Kennedy discuss this here 👇
-
Apache Iceberg is delivering in production. From our 2026 State of Apache Iceberg research of 252 US data leaders running Iceberg in production:
↳ 99% report improved query performance
↳ 98% are satisfied with cost outcomes
↳ 93% say Iceberg unlocked new use cases
↳ 69% say it helped solve data consistency issues
The data shows that Iceberg is delivering on its core promises, with more attention now going to the operational work that comes with broader production use. Benchmark your Iceberg operations against peers → https://lnkd.in/dV9i6Qfs
-
We’re excited to sponsor Iceberg Summit 2026 🧊 Looking forward to two days of practical conversations with the Apache Iceberg community - from maintainers and contributors to teams running Iceberg in production. If you’ll be there, come say hello to the Ryft team. Use code PENGUIN for 20% off registration👇 https://lnkd.in/dBy-3sta
-
Ryft continues to grow! We’re excited to welcome Adi Menashe to the team as GTM Engineer. Welcome aboard, Adi – we're thrilled to have you with us 👋
-
Ryft now integrates with Microsoft Fabric OneLake for teams running Iceberg in Fabric. If your team is running Apache Iceberg in Microsoft Fabric, you can now connect OneLake to Ryft using the Iceberg REST Catalog API. This connection brings OneLake metadata into Ryft, helping teams gain visibility into their Iceberg environment and identify optimization opportunities. Read more about the integration here: https://lnkd.in/dMhJsi4a
-
How do you handle updates in streaming Apache Iceberg workloads? 🧊 Two options:
1. Copy-on-Write: rewrite the entire file on every update. Queries stay fast, but write latency is high.
2. Merge-on-Read: write a delete file and merge at read time. Write latency stays lower, but queries slow down over time.
Most streaming workloads choose MoR for the latency benefits, but that means committing to more aggressive compaction and retention to manage delete files. See what each strategy means for write latency, reads, and compaction → https://lnkd.in/dG8BUQvw
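The trade-off between the two strategies can be seen in a toy in-memory model. This is a sketch for intuition only, not how Iceberg stores data: `ToyTable`, its fields, and the `rows_written` counter are hypothetical, each "file" is just a Python dict, and a "delete file" is a set of (file index, row id) pairs. In Iceberg itself the choice is made through table properties such as `write.update.mode`.

```python
from dataclasses import dataclass, field


@dataclass
class ToyTable:
    """Toy table: each data file is a dict of row id -> value."""

    mode: str  # "copy-on-write" or "merge-on-read"
    data_files: list = field(default_factory=list)
    delete_files: list = field(default_factory=list)  # each: set of (file_index, row_id)
    rows_written: int = 0  # rough proxy for write cost

    def _deleted(self) -> set:
        # Union of every (file_index, row_id) pair marked as deleted.
        return set().union(*self.delete_files)

    def append(self, rows: dict) -> None:
        self.data_files.append(dict(rows))
        self.rows_written += len(rows)

    def update(self, row_id: int, value: str) -> None:
        deleted = self._deleted()
        for i, f in enumerate(self.data_files):
            if row_id in f and (i, row_id) not in deleted:
                break
        else:
            raise KeyError(row_id)
        if self.mode == "copy-on-write":
            # Rewrite the whole file that holds the row.
            new_file = dict(self.data_files[i])
            new_file[row_id] = value
            self.data_files[i] = new_file
            self.rows_written += len(new_file)
        else:
            # Mark the old row deleted and write only the new row.
            self.delete_files.append({(i, row_id)})
            self.data_files.append({row_id: value})
            self.rows_written += 1

    def read(self) -> dict:
        # Readers must filter every data file against all delete files.
        deleted = self._deleted()
        return {
            rid: v
            for i, f in enumerate(self.data_files)
            for rid, v in f.items()
            if (i, rid) not in deleted
        }

    def compact(self) -> None:
        # Rewrite live rows into one file and drop the delete files.
        live = self.read()
        self.data_files = [live]
        self.delete_files = []
        self.rows_written += len(live)


cow, mor = ToyTable("copy-on-write"), ToyTable("merge-on-read")
for table in (cow, mor):
    table.append({1: "a", 2: "b", 3: "c"})
    table.update(2, "B")
print(cow.rows_written, len(cow.delete_files))
print(mor.rows_written, len(mor.delete_files))
```

Both tables return the same rows from `read()`, but the copy-on-write table pays for rewriting the full file on the update, while the merge-on-read table writes a single row and accumulates delete files until `compact()` runs.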
-
Ryft reposted this
Proud to see Ryft among other greats here! 🤩 Ryft is now a part of the Microsoft Fabric community and allows customers to fully manage their lake on OneLake 🤝
-
71% of enterprise data leaders say they are satisfied with the migration to Apache Iceberg in terms of effort and results 🧊 At the same time, some parts of operating it are still manual:
↳ 37% struggle with enforcing retention or deletes
↳ 18% report frequent manual intervention
↳ 12% describe fire drills during peak load
↳ 8% rely heavily on senior engineers
So the picture is pretty clear: Iceberg is working well for teams, but some operational workflows still need more support as they scale. See how your Iceberg ops compare → https://lnkd.in/dV9i6Qfs
-
Ryft reposted this
To my data engineers: I'm also super excited about the new integrations with Ryft and Onehouse. Now you can use Ryft, the best independent Iceberg optimization and lake management solution, to monitor and observe your Apache Iceberg tables in #OneLake, with lots more to come. Onehouse is making it easy to sync access permissions from Databricks and AWS Lake Formation to the OneLake catalog. You don't need to manually recreate permissions and keep them in sync. Now you can use all of Fabric on data in OneLake with the same permissions you already created. Links to get started are in the comments.