Academic Friday, so new paper alert (https://lnkd.in/eUZHts2k) ;). LLMs have reignited the previously niche area of multi-agent systems by making it easier to sense, plan and (inter)act. At the same time, agents are turning passive LLMs/FMs into more autonomous, active, situated and grounded systems. Want an overview of the field? Then check out our recent survey paper "Agentic Large Language Models, a survey", by Aske Plaat, Max van Duijn, Niki van Stein, Mike Preuss, Peter van der Putten, Kees Joost Batenburg. #agents #agentic #LLM #AI #GenerativeAI LIACS
Peter van der Putten’s Post
❓ Why do LLMs “hallucinate,” and can we fix it? 🌀 #Hallucination = when #AI confidently makes things up. It happens because #LLMs predict patterns, not facts. Fixes: #RAG (Retrieval-Augmented Generation) → an open-book exam. Fine-tuning with domain-specific data → expert specialization. Trust layers (Lakera AI, Arthur AI) → monitor outputs for errors. Analogy: imagine a student who’s eloquent but guesses instead of checking their notes 📚. 🔑 #Hallucination #RAG #LLMs #AITrust #AIMonitoring #AI #ML
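To make the "trust layer" idea concrete, here is a minimal sketch of one way a monitor could flag unsupported answers: check how much of an answer's content is actually backed by the retrieved source text. This is a naive token-overlap heuristic for illustration only, not how Lakera AI or Arthur AI actually work.

```python
# Toy "trust layer": flag an answer as a possible hallucination when its
# content words are not supported by the retrieved source text.
def content_words(text):
    stop = {"the", "a", "an", "is", "are", "was", "in", "of", "to", "and"}
    return {w.strip(".,!?").lower() for w in text.split()} - stop

def support_score(answer, source):
    """Fraction of the answer's content words that appear in the source."""
    a, s = content_words(answer), content_words(source)
    return len(a & s) / len(a) if a else 1.0

source = "The Eiffel Tower was completed in 1889 and stands in Paris."
grounded = "The Eiffel Tower was completed in 1889."
made_up = "The Eiffel Tower was moved to London in 1975."

print(support_score(grounded, source))  # fully supported -> 1.0
print(support_score(made_up, source))   # low support -> flag for review
```

A production monitor would use entailment models rather than word overlap, but the principle is the same: compare the claim against the evidence.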
Been deep in the weeds on agentic AI workflows, and found a couple of recent papers from teams at Google and Stanford/SambaNova that offer really compelling approaches to self-improving agents. Both are focused on a similar goal: creating a robust "memory" so agents can learn from experience. ReasoningBank https://lnkd.in/gHSm-itM Agentic Context Engineering: Evolving Contexts https://lnkd.in/gHSm-itM #AI #LLM #AgenticAI #ContextEngineering #MachineLearning #GoogleAI #StanfordAI
OpenAI has launched the GDPval Benchmark 📊, the first benchmark designed to evaluate AI on tasks with real economic value. Unlike academic or puzzle-based tests, GDPval measures performance on 1,320 tasks across 44 occupations in the 9 sectors that contribute most to U.S. GDP, validated by industry experts with an average of 14 years of experience. The main conclusions are explained below in a 5-minute video instead of a 30-page research paper. #GDPVal #AI
The launch of #Claude Sonnet 4.5 introduces two standout features that redefine how we work with AI: Memory Tool and Context Editing. Memory Tool lets Claude remember important details across sessions — like project goals, preferences, or past decisions — so you don’t have to repeat yourself. Context Editing gives you direct control over the active conversation. You can correct mistakes, add missing details, or remove irrelevant context without starting over. ## The Digital Pensieve ## The memory tool in Claude Sonnet 4.5 is conceptually similar to how Albus Dumbledore stores his memories in the Pensieve in the Harry Potter series. Just as Dumbledore can remove, store, and later revisit detailed memories, Claude can store information outside its immediate working context and retrieve it when needed. This allows Claude to maintain a persistent knowledge base across long sessions or different tasks — much like recalling memories from the Pensieve. #AI at Anthropic
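The "Pensieve" pattern is easy to sketch: facts live on disk, outside the conversation context, and are reloaded in later sessions. The code below is a hypothetical illustration of that pattern, not Anthropic's actual memory tool API; the class and file name are invented for the example.

```python
# Minimal cross-session memory sketch: facts persist outside the
# working context and survive into the next session (process).
import json
from pathlib import Path

class MemoryStore:
    def __init__(self, path="agent_memory.json"):
        self.path = Path(path)
        self.memories = (
            json.loads(self.path.read_text()) if self.path.exists() else {}
        )

    def remember(self, key, value):
        self.memories[key] = value
        self.path.write_text(json.dumps(self.memories, indent=2))

    def recall(self, key, default=None):
        return self.memories.get(key, default)

# Session 1: store a project detail once.
store = MemoryStore()
store.remember("project_goal", "migrate the billing service to Postgres")

# Session 2 (fresh object, simulating a new conversation): the memory
# survives and can be injected into the model's context on demand.
store2 = MemoryStore()
print(store2.recall("project_goal"))
```

The real tool lets the model itself decide what to remember and when to retrieve it; the storage-outside-context idea is the same.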
We often assume 𝐛𝐢𝐠𝐠𝐞𝐫 𝐦𝐨𝐝𝐞𝐥𝐬 mean 𝐬𝐦𝐚𝐫𝐭𝐞𝐫 𝐀𝐈. Even in humans, intelligence isn’t about memory -- it’s about how well we connect the right information at the right time. That’s exactly what RAG (Retrieval-Augmented Generation) does. It separates what the model knows from what it can retrieve and reason with. A smaller model with strong retrieval and clean context can outperform a larger one that relies only on stored knowledge. Just like us, AI becomes more capable when it can find, filter, and apply what matters -- not just remember everything. The key takeaway, from my understanding (may be helpful 😌): Great RAG = Good embedding model + Smart chunking + Accurate retrieval + Clear prompt fusion. RAG turns recall into reasoning and context into capability. #AI #GenAI #RAG #ContextMatters #AIEngineering #LLM #HumanWithAI #MachineLearning #GrowWithAI #AIInnovation
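The "Great RAG" recipe above can be sketched end to end in a few lines. This toy version stands in a bag-of-words count for the embedding model and fixed-size word windows for smart chunking; a real pipeline would use a trained sentence-embedding model and a vector store, but the chunk → embed → retrieve → fuse flow is the same.

```python
# Toy RAG pipeline: chunking + embedding + retrieval + prompt fusion.
import math
from collections import Counter

def chunk(text, size=12):
    """Chunking stand-in: fixed-size word windows."""
    words = text.split()
    return [" ".join(words[i:i + size]) for i in range(0, len(words), size)]

def embed(text):
    """Embedding stand-in: lowercase bag-of-words counts."""
    return Counter(text.lower().split())

def cosine(a, b):
    dot = sum(a[t] * b[t] for t in a)
    norm = (math.sqrt(sum(v * v for v in a.values()))
            * math.sqrt(sum(v * v for v in b.values())))
    return dot / norm if norm else 0.0

def retrieve(query, chunks, k=1):
    """Return the k chunks most similar to the query."""
    q = embed(query)
    return sorted(chunks, key=lambda c: cosine(q, embed(c)), reverse=True)[:k]

doc = ("The refund policy allows returns within 30 days of purchase. "
       "Shipping is free for orders over 50 euros. "
       "Support is available by email on weekdays.")
query = "How many days do I have to return a purchase?"

context = retrieve(query, chunk(doc))[0]
# Prompt fusion: ground the model in the retrieved context.
prompt = f"Answer using ONLY this context:\n{context}\n\nQuestion: {query}"
print(prompt)
```

Note how the small retriever finds the refund chunk without the "model" knowing anything in advance: the knowledge lives in the corpus, not the weights.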
Amid rapid AI developments, a new edge is emerging. Only 28% of Principal Investigators (PIs) use predictive analytics, pointing to a big opportunity to adopt this technology. By integrating it, PIs can lead future investigative practices. Discover how the next wave of AI can transform your field: https://lnkd.in/gvwEPKem #Privateinvestigator #FutureProof #Innovation #PredictiveAnalytics #AIinPI #InvestigativeTech #AI #Stats #Data #TechnologyEvolution #TransformYourApproach
Interesting paper just published on using LLMs as 'synthetic consumers' for product research. The authors found that directly asking an LLM for a numerical rating (e.g., "how likely are you to buy this on a scale of 1-5?") produces unrealistic and skewed distributions. No surprise there. However, their proposed method, 𝗦𝗲𝗺𝗮𝗻𝘁𝗶𝗰 𝗦𝗶𝗺𝗶𝗹𝗮𝗿𝗶𝘁𝘆 𝗥𝗮𝘁𝗶𝗻𝗴 (𝗦𝗦𝗥), is a potential game-changer. Instead of asking for a number, they elicit a free-text response and then map this to a Likert scale by measuring its semantic similarity to pre-defined anchor statements. They achieved 𝟵𝟬% 𝗼𝗳 𝗵𝘂𝗺𝗮𝗻 𝘁𝗲𝘀𝘁-𝗿𝗲𝘁𝗲𝘀𝘁 𝗿𝗲𝗹𝗶𝗮𝗯𝗶𝗹𝗶𝘁𝘆 while maintaining realistic response distributions. While there are certainly methodological questions to dig into, this is a powerful demonstration of the potential for AI in user research. More importantly, it reinforces a crucial point: the challenge often isn't the technology itself, but the 𝘮𝘦𝘵𝘩𝘰𝘥 we use to interact with it. It's a great example of moving beyond simplistic 'metrics' to develop more sophisticated, evidence-led ways of informing our decisions. Access here: https://lnkd.in/eFqt3NCZ Thanks for sharing Michiel Voortman #AI #UserResearch #LLMs #experimentation #cro #productmanagement #digitalexperience #growthexperimentation
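The core SSR mechanic described above is simple to sketch: embed the free-text response, embed a set of anchor statements (one per Likert point), and pick the anchor with the highest similarity. The anchor wordings and the bag-of-words "embedding" below are illustrative stand-ins; the paper uses real embedding models and its own validated anchors.

```python
# Sketch of Semantic Similarity Rating (SSR): map free text to a
# Likert point via similarity to per-point anchor statements.
import math
from collections import Counter

ANCHORS = {
    1: "I would definitely not buy this product",
    2: "I would probably not buy this product",
    3: "I might or might not buy this product",
    4: "I would probably buy this product",
    5: "I would definitely buy this product",
}

def embed(text):
    """Embedding stand-in: lowercase bag-of-words counts."""
    return Counter(text.lower().split())

def cosine(a, b):
    dot = sum(a[t] * b[t] for t in a)
    norm = (math.sqrt(sum(v * v for v in a.values()))
            * math.sqrt(sum(v * v for v in b.values())))
    return dot / norm if norm else 0.0

def ssr(free_text):
    """Return the Likert point whose anchor is most similar to the text."""
    v = embed(free_text)
    return max(ANCHORS, key=lambda k: cosine(v, embed(ANCHORS[k])))

print(ssr("I would definitely buy this product, it looks great"))
print(ssr("I would definitely not buy this product"))
```

One caveat worth noting: bag-of-words similarity handles negation poorly ("definitely not" shares most words with "definitely"), which is exactly why the real method needs a proper semantic embedding model.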
Fine-tuning LLMs again and again isn’t scalable. Stanford’s new paper on Agentic Context Engineering makes it clear: the next step is models that adapt. They learn from their own use, update their context, and evolve automatically. "Not just trained once, but self-trained." Gravix Layer #AI #LLMs #AgenticAI #ContextEngineering #GravixLayer #ArtificialIntelligence #FutureOfAI #StanfordAI
What is Domain Adaptive Intelligence? It’s the next evolution of AI systems, where models continuously adapt to the specific domain they serve. From pretraining and fine-tuning to retrieval, context engineering, and evaluation, every stage contributes to more relevant, human-aligned intelligence. Curious how it all fits together? Here’s a breakdown of the process, from encoders and LLMs to context engineering and metrics. #AI #MachineLearning #LLM #DomainAdaptation #RAG #KnowledgeGraphs #AIResearch
Why use RAGs (Retrieval-Augmented Generation)? In my previous posts, I explained how RAG (Retrieval-Augmented Generation) combines the reasoning power of large language models with the accuracy of external data sources. Let’s now look at why you’d actually want to use RAGs in real-world AI applications. Information richness: RAGs ensure responses are current and domain-specific by retrieving the most relevant information from a knowledge base. Reduced fabrication: Since RAGs use verifiable data as context, they significantly reduce hallucinations and improve factual accuracy. Cost effectiveness: Updating or expanding a knowledge base is far more efficient than fine-tuning an LLM, making RAGs a scalable and cost-friendly choice. RAGs provide a smart balance between flexibility, factual grounding, and cost — making them one of the most practical approaches in modern AI solutions. Follow me for more posts on Generative AI, RAGs, and building intelligent systems step by step. #GenerativeAI #RAG #LLM #AI #ArtificialIntelligence #KnowledgeBase #AIEducation