OpenAI's internal data agent failed when it relied on table schemas alone. The data and structure were there, but the agent couldn't reliably answer questions because it didn't understand what the data actually meant. The real challenge is building the right context, and as OpenAI shared in a detailed write-up, they ended up building six layers of context on top:

1. Table usage patterns from historical queries
2. Human annotations with business definitions
3. AI-powered code enrichment to understand how pipelines produce the data
4. Institutional knowledge from Slack and Docs
5. A memory system that learns from corrections
6. Live runtime queries against the warehouse

Only after all six layers were in place did the agent start delivering reliable results across 3,500 users and 70,000 datasets. https://lnkd.in/gfZ4GUnd
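The layered approach above can be sketched as a simple context assembler that queries each layer and concatenates whatever is relevant into the agent's prompt. This is a minimal illustration, not OpenAI's implementation; the layer names and example snippets are invented:

```python
# Hypothetical sketch: assemble prompt context from multiple layers.
# Each layer is a (name, lookup) pair; lookup returns a snippet or None.

def build_context(question, layers):
    """Collect relevant context from each layer into one prompt section."""
    sections = []
    for name, lookup in layers:
        snippet = lookup(question)
        if snippet:  # skip layers with nothing relevant for this question
            sections.append(f"## {name}\n{snippet}")
    return "\n\n".join(sections)

# Illustrative stand-ins for the real layers (usage stats, annotations, memory).
layers = [
    ("Usage patterns", lambda q: "fct_orders is the most-queried table for revenue questions"),
    ("Annotations", lambda q: "revenue = gross bookings minus refunds"),
    ("Memory", lambda q: None),  # no prior corrections apply here
]

context = build_context("What was revenue last week?", layers)
```

The point of the structure is that layers can be added or removed independently, which matches the incremental way the write-up describes the system being built.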
OpenAI's Data Agent Fails Without Context
RAG vs. Databricks Document Intelligence

RAG:
1. Document arrives
2. LLM call #1: break into chunks (happens once, offline/batch)
3. Each chunk → embedding model → stored in a vector DB
4. User query → retrieve similar chunks from the vector DB
5. LLM call #2: send retrieved chunks + query to the LLM for the answer

Databricks Document Intelligence:
1. Document arrives
2. LLM call #1: ai_parse_document() converts a messy PDF into structured VARIANT format (clean bronze layer)
3. Structured data stored as a Delta Lake table (not embeddings!)
4. LLM call #2: ai_extract() pulls only the vital details
5. LLM call #3: ai_classify() routes the document by type
6. Clean structured data stored in the gold layer (queryable table)
7. Agent/LLM uses this clean data directly (or optionally with embeddings)
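The RAG retrieval step (steps 3-4 above) can be illustrated with a toy version that swaps the embedding model for bag-of-words vectors and the vector DB for an in-memory list; the chunks and query are invented for the example:

```python
# Toy RAG retrieval: bag-of-words "embeddings" + cosine similarity,
# standing in for a real embedding model and vector database.
from collections import Counter
import math

def embed(text):
    return Counter(text.lower().split())

def cosine(a, b):
    dot = sum(a[w] * b[w] for w in a)  # Counter returns 0 for missing words
    na = math.sqrt(sum(v * v for v in a.values()))
    nb = math.sqrt(sum(v * v for v in b.values()))
    return dot / (na * nb) if na and nb else 0.0

chunks = [
    "invoices are due within 30 days",
    "the warranty covers parts for two years",
]
index = [(c, embed(c)) for c in chunks]  # stand-in for the vector DB

def retrieve(query, k=1):
    qv = embed(query)
    ranked = sorted(index, key=lambda pair: cosine(qv, pair[1]), reverse=True)
    return [c for c, _ in ranked[:k]]

top = retrieve("when are invoices due")
# The top chunk would then be sent with the query to the LLM (step 5).
```

The Databricks path skips this retrieval step entirely for structured questions, because the extracted fields already live in a queryable table.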
GenAI on top of your warehouse without a semantic layer is how data chaos becomes executive regret.

Vendors report a rapid shift: teams that add a semantic layer see more deterministic text-to-SQL, lower hallucination rates, and faster trust in AI-assisted analytics.

Business implication: the question is not "LLM or not." It is whether you let models guess over raw schemas or constrain them with business logic and governed metrics.

Practical next step: pilot a 30-day semantic layer plus text-to-SQL assistant for one domain, with explicit SLOs for accuracy and latency.

See ConQuest Data recipes and pilot offers:
Pattern repo: https://lnkd.in/eTqBRprP
Book a pilot: https://lnkd.in/e2BYxBU5

Book a 30-day semantic layer pilot with ConQuest Data to turn GenAI questions into governed, trustworthy answers.

#SemanticLayer #GenAIForAnalytics #DataQuality #SmallBizOps #ConQuestData
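"Constrain with governed metrics" can be made concrete with a minimal sketch: instead of letting a model free-generate SQL over raw schemas, the assistant matches the question to an approved metric registry and emits only vetted SQL templates. The metric names and SQL below are invented for illustration:

```python
# Hedged sketch of a governed metric registry for text-to-SQL.
# Metric names and SQL templates are illustrative assumptions.

METRICS = {
    "active users": "SELECT COUNT(DISTINCT user_id) FROM events WHERE ts >= {start}",
    "net revenue": "SELECT SUM(amount) - SUM(refund) FROM orders WHERE ts >= {start}",
}

def resolve(question):
    """Match the question to a governed metric; never guess a query."""
    q = question.lower()
    for name, sql in METRICS.items():
        if name in q:
            return sql
    return None  # fall back to human review instead of a hallucinated query

sql = resolve("What was net revenue this quarter?")
```

Returning `None` for unrecognized questions is the deterministic behavior the post is arguing for: a bounded answer space rather than plausible-looking SQL over raw schemas.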
Discover why your Fabric data warehouse isn’t working and how to transform it from a CSV graveyard into a living, intelligent system.
Building a Real-Time Air Quality Pipeline with Bruin

Just shipped an end-to-end data engineering project combining batch + streaming ingestion, data quality validation, and interactive analytics for Mexico's air quality crisis.

Why this project matters
Air pollution is a leading cause of premature death in Mexico. With monitoring data scattered across APIs and S3 archives in inconsistent formats, this pipeline transforms 300 monitoring stations across 121 cities into actionable intelligence.

Bruin features
1. Built-in orchestration: Bruin's depends: metadata automatically resolves execution order without boilerplate DAG code.
2. Native data quality checks: 17 built-in tests (not_null, unique, non_negative, accepted_values) declared alongside SQL assets. The pipeline fails fast if data quality drops.
3. SQL + Python assets: one orchestration layer for both. Batch ingestion via Python, transformations via SQL, all managed by Bruin's DAG.
4. AI analysis via Bruin MCP: connect your IDE to BigQuery, ask natural-language questions, and get live answers without writing SQL.

Design choices
• BigQuery + Terraform: IaC for reproducibility; 3-layer warehouse (raw → staging → marts)
• Batch-first architecture: Python pulls historical data from the OpenAQ S3 archive, deduplicates, and normalizes units (ppm → µg/m³)
• Optional streaming layer: Redpanda (Kafka) demonstrates real-time ingestion for those who need it
• WHO guideline flags: mart tables embed domain logic for clean, business-ready data

Bruin vs. other tools
• vs. dbt: single files combine models + tests; metadata co-located with logic
• vs. Airflow: no Pythonic DAG code, just declare depends: and it works
• vs. custom Python: built-in orchestration saves weeks of scaffolding

Results
• 5 assets, 17 quality checks passing
• 192K+ historical readings ingested
• Interactive Looker Studio dashboard
• 61% of Mexico's city-days exceed WHO PM2.5 guidance

Code: https://lnkd.in/es2fWPuH
Dashboard: https://lnkd.in/ebkJzHrV
Data: OpenAQ, the world's largest open air quality dataset: https://openaq.org/

Fork it and adapt it for your region!

#DataEngineering #Bruin #BigQuery #DataQuality #OpenData #Mexico #AirQuality
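The built-in checks named above (not_null, unique, non_negative) boil down to simple column assertions. This toy version mirrors their semantics in plain Python, not Bruin's actual implementation, with invented sample rows:

```python
# Toy column-level quality checks mimicking not_null / unique / non_negative.
# This illustrates the semantics only; Bruin declares these alongside assets.

def run_checks(rows, column, checks):
    values = [r[column] for r in rows]
    results = {}
    if "not_null" in checks:
        results["not_null"] = all(v is not None for v in values)
    if "unique" in checks:
        non_null = [v for v in values if v is not None]
        results["unique"] = len(non_null) == len(set(non_null))
    if "non_negative" in checks:
        results["non_negative"] = all(v is not None and v >= 0 for v in values)
    return results

readings = [{"pm25": 12.5}, {"pm25": 48.0}, {"pm25": 12.5}]
report = run_checks(readings, "pm25", ["not_null", "unique", "non_negative"])
# A fail-fast pipeline would abort here if any value in `report` is False.
```

Declaring checks next to the asset, as the post describes, keeps this kind of validation from drifting out of sync with the SQL it guards.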
"Modern data stack" is one of the most overused phrases in tech. Most companies that say they have one don't. They have tools. Bought at different times. By different people. For different reasons. No architecture connecting them.

The actual modern data stack has 4 layers, built in a specific order:
→ Ingestion: raw data lands clean, untouched (Fivetran / Airbyte)
→ Transformation: dbt tests and documents every model before it's used
→ Semantic layer: one definition per metric, enforced everywhere (Cube Dev)
→ Visualisation: dashboards nobody argues about (Sigma / Superset)

Skip a layer and everything above it is unstable. We've seen teams add a 5th BI tool hoping it'll fix trust in their numbers. It never does. The problem is always a layer below.

How many of these 4 layers does your company have fully built?

Day 8 of 30 · #FromBrokenToBuilt · warehows.ai
31% accuracy on Spider 2.0. 50% on internal enterprise benchmarks. SQL was designed for predefined queries on flat schemas — not for reasoning across relationships, constraints, and multi-domain business logic. Knowledge graphs don't replace SQL. They give AI the semantic model SQL was never built to provide. 40 years SQL-first. The intelligence layer needs a different abstraction. https://lnkd.in/gSD5bqCw #KnowledgeGraph #EnterpriseAI #DataFabric
Text-to-SQL accuracy: 31% on Spider 2.0. 50% on a major tech company's internal enterprise benchmark.

These aren't failure cases from small pilots. These are production-grade systems tested on real enterprise schemas. The most widely deployed "talk to your data" approach gets the right answer less than half the time when the question is complex. And enterprise questions are almost always complex.

The root cause isn't the model. It's the mismatch between what SQL was designed for, predefined queries over flat schemas, and what enterprise intelligence actually requires: reasoning across relationships, constraints, and business rules that span multiple systems.

Three things SQL cannot do, regardless of how capable the LLM on top:
→ Model semantic relationships between entities ("similar to", "caused by", "at risk of")
→ Chain reasoning across domain boundaries without losing context
→ Produce decision paths that trace back to verifiable source logic

The enterprise data stack has been SQL-first for 40 years. That's not changing. But the intelligence layer on top of that stack needs a different abstraction, one that models how the business works, not just how the data is stored.

Knowledge graphs don't replace SQL. They give AI the semantic model SQL was never designed to provide.

#KnowledgeGraph #EnterpriseAI #GraphRAG #DataFabric #OntologyAI
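The "chain reasoning with a traceable decision path" point can be sketched with a tiny typed-edge graph: entities, a "caused_by" relation, and a transitive traversal that keeps the path it followed. The entities and edges are invented for illustration:

```python
# Minimal knowledge-graph sketch: typed edges + multi-hop traversal
# with a recorded reasoning path. Entities and edges are invented.

EDGES = [
    ("OutageA", "caused_by", "ConfigChange1"),
    ("ConfigChange1", "caused_by", "Deploy42"),
    ("CustomerChurnSpike", "caused_by", "OutageA"),
]

def root_cause_path(entity, relation="caused_by"):
    """Follow a relation transitively, returning the full chain of hops."""
    path = [entity]
    current = entity
    while True:
        nxt = next(
            (t for s, r, t in EDGES if s == current and r == relation), None
        )
        if nxt is None:
            return path  # no further hops; path is the decision trace
        path.append(nxt)
        current = nxt

trace = root_cause_path("CustomerChurnSpike")
```

In SQL the same chain needs a recursive query whose shape must be known in advance; in a graph the relation is a first-class edge, so the traversal, and the trace back to source facts, falls out of the data model.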
Most teams exploring knowledge graphs don't quit because the use case wasn't real. They quit because getting to something usable took too long. ⏳

Months of manual ETL. Schema decisions before a single query runs. And at the end of it, data that's technically integrated but still missing the context that makes it useful for AI or analytics.

The use cases were never the question. Fraud detection, 360° customer views, grounding LLMs, real-time decisions in finance and healthcare: all of it is proven. The bottleneck is always the same thing: getting there.

KGNN handles ingestion, structuring, and graph construction automatically, so what normally takes weeks happens in near real time, without the manual overhead. 🚀

We're opening trial access through April 2026 for enterprise teams working on large-scale knowledge graph projects. If that's where you are, reach out to see if you qualify. Trials include 3 hours of hands-on support, so you're not starting from scratch alone.

📩 DM here or drop a comment and we'll send you the details.
Just deployed my End-to-End Demand Intelligence System!

Over the past few days, I worked on building a production-style machine learning system that goes beyond just training a model. The goal: predict weekly retail demand using historical, promotional, and external data.

But instead of stopping at model training, I focused on designing a real-world ML pipeline:
🔹 Automated lag feature generation using a custom feature store
🔹 Async integration of real-time weather data (non-blocking API calls)
🔹 Caching layer to optimize performance and reduce API dependency
🔹 Feature engineering pipeline with temporal + rolling statistics
🔹 XGBoost model optimized for tabular data
🔹 FastAPI-based REST API for real-time inference
🔹 Fully deployed on the cloud 🌐

💡 One key learning: handling training-serving consistency (especially with lag features) is much harder than it looks, and critical for real-world ML systems.

🔗 Live API: https://lnkd.in/gkhmv3cn
💻 GitHub repo: https://lnkd.in/gbHrc6ww

This project helped me understand how ML systems actually work in production, combining data engineering, modeling, and backend deployment. Would love feedback or suggestions for improvement!

#MachineLearning #MLOps #DataScience #FastAPI #XGBoost #AI #LearningInPublic
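The training-serving consistency problem with lag features comes down to one rule: the exact same feature code must run in both pipelines. A minimal sketch, with invented demand numbers:

```python
# Sketch of shared lag-feature code to avoid training-serving skew.
# The same function is imported by the training job and the serving API.

def lag_features(history, lags=(1, 2, 4)):
    """history: weekly demand values, oldest first. Returns lagged values
    for the most recent week, or None where history is too short."""
    return {
        f"lag_{k}": history[-k] if len(history) >= k else None
        for k in lags
    }

weekly_demand = [120, 135, 128, 150, 160]
features = lag_features(weekly_demand)
```

If training computes lags with SQL window functions while serving slices a Python list, off-by-one differences silently corrupt predictions, which is why sharing one implementation (or one feature store) matters.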
See how HTP and others are now able to extract value from their data at a scale and speed they could not achieve without TextQL.
If you haven't looked at TextQL yet, it's worth your time. We tested their new Dashboards feature on Citi Bike NYC data. Starting from zero familiarity with the dataset, the AI agent explored the schema, found and pulled in weather data from an external API on its own, and built an interactive multi-tab dashboard. Under four minutes. We wrote up the full experience, including what worked and what still needs polish. https://lnkd.in/gDqsGVqp
Started exploring Snowflake Cortex Code (CoCo) recently, and one question kept coming up again and again: 👉 "What data can this actually access?"

I've tried to lay all of this out in a simple way here. This is the 2nd in a 6-part series on CoCo fundamentals, breaking things down step by step as we explore more. If you feel something is wrong or missing, please feel free to DM me directly. Always open to learning and correcting.

Part 1: Cortex Code in Snowflake: How to Use It Without Burning Credits

#Snowflake #SnowflakeHyderabadUserGroup #CortexCode Towards AI, Inc. #DataAnalytics Snowflake Gregory Goomishian #SnowflakeSquad Shubhangi Singh #KipiWns Elsa Mayer Snowflake Developers #SnowflakeUserGroup #SnowflakeHydUserGroup