How big AI models resist 'poison' attacks, a lesson for product managers

This title was summarized by AI from the post below.

View organization page for The PM Chronicles

171 followers

6mo

If you think training models with poison sounds like a bad habit, wait until your project management meetings turn toxic. This study from Anthropic shows that as language models grow bigger, "poison" training attacks don't really scale—they're less effective on larger models, indicating that bigger models are more resilient against certain types of malpractices. For product managers, this highlights that evolving AI robustness should be part of the strategic planning, especially as we face increasing complexities on a global scale. Thanks to Benj Edwards for the insightful article and for sparking fresh ideas on AI security and scalability. #AI #ProductManagement #Technology #Innovation First published: October 2025

AI models can acquire backdoors from surprisingly few malicious documents arstechnica.com

To view or add a comment, sign in

More Relevant Posts

Dexter Tan
6mo
Report this post
What is AI Poisoning?! AI poisoning, also known as data poisoning, is a type of cyberattack where malicious actors intentionally compromise the training data used to develop an Artificial Intelligence (AI) or Machine Learning (ML) model. https://lnkd.in/gJw2kjaW

What is AI poisoning? A computer scientist explains theconversation.com
Like Comment
To view or add a comment, sign in
Lilian Low
6mo
Report this post
“Generally speaking, AI poisoning refers to the process of teaching an AI model wrong lessons on purpose. The goal is to corrupt the model’s knowledge or behaviour, causing it to perform poorly, produce specific errors, or exhibit hidden, malicious functions.”

What is AI poisoning? A computer scientist explains theconversation.com
Like Comment
To view or add a comment, sign in
Avishek D.
6mo
Report this post
AI poisoning is becoming an increasing concern within the realm of artificial intelligence (#AI), especially for prominent language models like #ChatGPT and #Claude. A recent study conducted collaboratively by the UK AI Security Institute, The Alan Turing Institute, and Anthropic, sheds light on this issue. The study, released this month, reveals a startling discovery - the insertion of just 250 harmful files among the vast dataset used for training AI models can clandestinely "poison" the entire system #artificialIntelligence #aipoisoning Learn more : https://lnkd.in/giUAJP5w

What is AI poisoning? A computer scientist explains theconversation.com
Like Comment
To view or add a comment, sign in
Mathew Schwartz
6mo
Report this post
Poisoner's handbook (AI): Only a couple hundred malicious training documents are needed before a large language model puts out meaningless text when prompted with a specific trigger phrase, say researchers. ☠️☠️☠️ https://lnkd.in/e5XDtjCg Rashmi Ramesh

A Small Number of Training Docs Can Create a LLM Backdoor databreachtoday.com
Like Comment
To view or add a comment, sign in
David Vermaak
6mo
Report this post
"What is AI poisoning? Generally speaking, AI poisoning refers to the process of teaching an AI model wrong lessons on purpose. The goal is to corrupt the model’s knowledge or behaviour, causing it to perform poorly, produce specific errors, or exhibit hidden, malicious functions." https://lnkd.in/eY8amRAn

What is AI poisoning? A computer scientist explains theconversation.com
Like Comment
To view or add a comment, sign in
Bob Hayes, PhD
6mo
Report this post
"AI poisoning refers to the process of teaching an AI model wrong lessons on purpose. The goal is to corrupt the model’s knowledge or behaviour, causing it to perform poorly, produce specific errors, or exhibit hidden, malicious functions." https://lnkd.in/gsDAqQar

What is AI poisoning? A computer scientist explains theconversation.com
Like Comment
To view or add a comment, sign in
Tim Seamans
6mo
Report this post
AI is changing how we work, think, and decide. But the long-term impact depends on which skills we protect and which we offload. Generative AI can speed up research, analysis, and routine tasks. But if over-relied upon, it risks dulling the very thing we can’t afford to lose: critical thinking. For national security professionals (or anyone making high-stakes decisions), this raises concerns about judgment. The real future of AI won’t be man or machine. It will be how well humans and AI complement each other.

AI’s Hidden National Security Cost https://www.justsecurity.org
Like Comment
To view or add a comment, sign in
Juan de Hoyos
6mo
Report this post
Anthropic open-sourced Petri, their AI safety testing tool Anthropic just released the internal tool they use for testing AI model behavior in risky scenarios. You describe test scenarios in plain English, Petri runs automated conversations with the model, scores the results and flags concerning behaviors. What took days of manual work now takes minutes. Key findings They tested 14 major models (GPT-5, Claude, Gemini, etc.) across 111 scenarios - checking for lying, sycophancy, self-preservation attempts, and more. Claude Sonnet 4.5 scored as lowest-risk overall, slightly ahead of GPT-5. Interesting finding: models with high autonomy sometimes tried to "whistleblow" on their fictional organizations - even for harmless things like a candy company using sugar. Shows they're pattern-matching, not actually reasoning about ethics. This is important because no single company can catch every failure mode. By open-sourcing this, the research community can help find problems before deployment. sources: https://lnkd.in/d_Gs_FwJ #AISafety #MachineLearning #AIResearch #OpenSource #ResponsibleAI

Petri: An open-source auditing tool to accelerate AI safety research anthropic.com
Like Comment
To view or add a comment, sign in
Keven Ellison
5mo
Report this post
🤖 AI writing has a tell-tale sign that most people recognize instantly... The em-dash — like this one. It's become so synonymous with AI-generated content that humans who love em-dashes have stopped using them entirely. They don't want to be mistaken for bots. But here's what's fascinating: **We don't actually know WHY AI models are obsessed with em-dashes.** After diving deep into this mystery, here are the leading theories: ❌ What's NOT the cause: • Training data reflection (if it were normal, we wouldn't notice it) • Token efficiency (commas work just as well) • Versatility advantages (other punctuation is equally flexible) 🔍 The most compelling theory: AI labs shifted from pirated contemporary books to digitizing older print materials between 2022-2024. Books from the late 1800s and early 1900s used ~30% more em-dashes than modern writing. Think about it — Moby-Dick alone has 1,728 em-dashes! **Why this matters for cybersecurity professionals:** Understanding AI writing patterns helps us: • Detect AI-generated phishing content • Train teams to spot synthetic communications • Develop better AI detection tools The crazy part? GPT-3.5 barely used em-dashes, but GPT-4 increased usage by 10x. This timing aligns perfectly with when companies started digitizing historical texts for training data. What other AI writing patterns have you noticed in your security work? Are you training your teams to spot these tells? #AIDetection #emdashes #TechTrends #AITips #AISecrets Source: https://lnkd.in/gbF3UrbZ

Why do AI models use so many em-dashes? seangoedecke.com
Like Comment
To view or add a comment, sign in
Lava Kafle
6mo
Report this post
what is AI: poisoning: read detail: backdoor: exploit: chatgpt: claude: .. victims: millions files: only 250 sufficient: computer: scientist: artificial: intelligence: #ai #poisoning https://lnkd.in/dWj8Tft4

Wow Development Quality Assurance Pvt. Ltd.

20,934 followers
6mo

what is AI: poisoning: read detail: backdoor: exploit: chatgpt: claude: .. victims: millions files: only 250 sufficient: computer: scientist: artificial: intelligence: #ai #poisoning https://lnkd.in/d2_vWwZR

What is AI poisoning? A computer scientist explains theconversation.com
Like Comment
To view or add a comment, sign in

171 followers

View Profile Follow

LinkedIn respects your privacy

How big AI models resist 'poison' attacks, a lesson for product managers

Explore content categories

How big AI models resist 'poison' attacks, a lesson for product managers

More Relevant Posts

Explore related topics

Explore content categories