Exciting updates on Project GR00T! We discovered a systematic way to scale up robot data, tackling the most painful bottleneck in robotics. The idea is simple: a human collects demonstrations on a real robot, and we multiply that data 1000x or more in simulation. Let's break it down:

1. We use Apple Vision Pro (yes!!) to give the human operator first-person control of the humanoid. Vision Pro parses human hand pose and retargets the motion to the robot hand, all in real time. From the human's point of view, they are immersed in another body, like in Avatar. Teleoperation is slow and time-consuming, but we can afford to collect a small amount of data.

2. We use RoboCasa, a generative simulation framework, to multiply the demonstration data by varying the visual appearance and layout of the environment. In Jensen's keynote video below, the humanoid is now placing the cup in hundreds of kitchens with a huge diversity of textures, furniture, and object placements. We only have 1 physical kitchen at the GEAR Lab in NVIDIA HQ, but we can conjure up infinite ones in simulation.

3. Finally, we apply MimicGen, a technique that multiplies the above data even further by varying the *motion* of the robot. MimicGen generates a vast number of new action trajectories based on the original human data and filters out failed ones (e.g. those that drop the cup) to form a much larger dataset.

To sum up: given 1 human trajectory with Vision Pro -> RoboCasa produces N (varying visuals) -> MimicGen further augments to NxM (varying motions). This is the way to trade compute for expensive human data via GPU-accelerated simulation (a toy sketch of this composition follows at the end of this post).

A while ago, I mentioned that teleoperation is fundamentally not scalable, because we are always limited by 24 hrs/robot/day in the world of atoms. Our new GR00T synthetic data pipeline breaks this barrier in the world of bits. Scaling has been so much fun for LLMs, and it's finally our turn to have fun in robotics!

We are creating tools to enable everyone in the ecosystem to scale up with us:
- RoboCasa: our generative simulation framework (Yuke Zhu). It's fully open-source! Here you go: http://robocasa.ai
- MimicGen: our generative action framework (Ajay Mandlekar). The code is open-source for robot arms, but we will have another version for humanoids and 5-finger hands: https://lnkd.in/gsRArQXy
- We are building a state-of-the-art Apple Vision Pro -> humanoid robot "Avatar" stack. The open-source libraries from Xiaolong Wang's group laid the foundation: https://lnkd.in/gUYye7yt
- Watch Jensen's keynote yesterday. He cannot hide his excitement about Project GR00T and robot foundation models! https://lnkd.in/g3hZteCG

Finally, the GEAR lab is hiring! We want the best roboticists in the world to join us on this moon-landing mission to solve physical AGI: https://lnkd.in/gTancpNK
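To make the multiplication concrete, here is a toy sketch of how the 1 -> N -> NxM composition works. Every function here is a stand-in written for illustration, not the actual RoboCasa or MimicGen API:

```python
import random

# Stand-ins for RoboCasa scene randomization, MimicGen trajectory
# synthesis, and the success filter. All hypothetical placeholders.

def randomize_scene(scene, count):
    # RoboCasa-style: vary textures, furniture, and object placement.
    return [f"{scene}-variant-{i}" for i in range(count)]

def perturb_trajectory(actions, count):
    # MimicGen-style: generate new action trajectories from one demo.
    return [[a + random.gauss(0, 0.01) for a in actions] for _ in range(count)]

def rollout_succeeds(scene, traj):
    # In practice: replay in simulation and check task success.
    return random.random() > 0.3  # assume ~70% of generated rollouts pass

def multiply_demo(scene, actions, n_scenes=100, m_motions=10):
    dataset = []
    for s in randomize_scene(scene, n_scenes):            # 1 -> N (visuals)
        for t in perturb_trajectory(actions, m_motions):  # N -> N x M (motions)
            if rollout_succeeds(s, t):                    # filter failures (e.g. dropped cup)
                dataset.append((s, t))
    return dataset

demos = multiply_demo("kitchen", [0.1, 0.2, 0.3])
print(f"{len(demos)} episodes from 1 human demo")
```

The key design point is that the two augmentations compose multiplicatively, which is why a single teleoperated demo can seed thousands of simulated episodes.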
-
Surgical robots cost $2 million. Beijing just built one for $200,000.

Watch it peel a quail egg: shell removed, inner membrane intact. Submillimeter accuracy that matches da Vinci at 90% less cost.

Think about that. Most hospitals can't afford surgical robots. Rural clinics? Forget it. Patients travel hundreds of miles for robotic surgery or settle for traditional operations with higher risks.

Beijing's Surgerii Robotics just broke that equation.

Traditional Surgical Robotics:
↳ $2 million purchase price
↳ $200,000 annual maintenance
↳ Only major hospitals qualify
↳ Patients travel or wait

Chinese Innovation Reality:
↳ $200,000 total cost
↳ Same precision standards
↳ Reaches district hospitals
↳ Surgery comes to patients

But here's what stopped me cold: Professor Samuel Au left the da Vinci team to build a network of surgical robots. Engineers from Medtronic and GE walked away from Silicon Valley salaries to build this. They're not chasing profit margins. They're chasing one vision: "Every hospital should have one."

The egg demonstration proves what matters: precision doesn't require premium pricing. The robot's multi-backbone continuum mechanisms deliver the same submillimeter accuracy whether peeling eggs or operating on hearts.

What This Enables:
↳ Thoracic surgery in rural hospitals
↳ Urological procedures performed locally
↳ Reduced surgical trauma everywhere
↳ Solutions to the surgeon shortage

The Multiplication Effect:
1 affordable robot = 10 hospitals equipped
100 deployed = provincial healthcare transformed
1,000 units = surgical access democratized
At scale = geography stops determining survival

Traditional robotics kept precision exclusive. Surgerii makes it accessible. We're not watching price competition. We're watching healthcare democratisation. Because that farmer needing heart surgery shouldn't die waiting for a $2 million robot his hospital will never afford.

Follow me, Dr. Martha Boeckenfeld, for innovations that put patients before profit margins.

♻️ Share if surgical precision should be accessible, not exclusive.

#healthcare #innovation #precisionmedicine
-
If you are building AI agents or learning about them, then you should keep these best practices in mind 👇

Building agentic systems isn't just about chaining prompts anymore; it's about designing robust, interpretable, and production-grade systems that interact with tools, humans, and other agents in complex environments.

Here are 10 essential design principles you need to know:

➡️ Modular Architectures
Separate planning, reasoning, perception, and actuation. This makes your agents more interpretable and easier to debug. Think planner-executor separation in LangGraph or CogAgent-style designs (see the sketch at the end of this post).

➡️ Tool-Use APIs via MCP or Open Function Calling
Adopt the Model Context Protocol (MCP) or OpenAI's Function Calling to interface safely with external tools. These standard interfaces provide strong typing, parameter validation, and consistent execution behavior.

➡️ Long-Term & Working Memory
Memory is non-optional for non-trivial agents. Use hybrid memory stacks: vector search tools like MemGPT or Marqo for retrieval, combined with structured memory systems like LlamaIndex agents for factual consistency.

➡️ Reflection & Self-Critique Loops
Implement agent self-evaluation using ReAct, Reflexion, or emerging techniques like Voyager-style curriculum refinement. Reflection improves reasoning and helps correct hallucinated chains of thought.

➡️ Planning with Hierarchies
Use hierarchical planning: a high-level planner for task decomposition and a low-level executor to interact with tools. This improves reusability and modularity, especially in multi-step or multi-modal workflows.

➡️ Multi-Agent Collaboration
Use protocols like AutoGen, A2A, or ChatDev to support agent-to-agent negotiation, subtask allocation, and cooperative planning. This is foundational for open-ended workflows and enterprise-scale orchestration.

➡️ Simulation + Eval Harnesses
Always test in simulation. Use benchmarks like ToolBench, SWE-agent, or AgentBoard to validate agent performance before production. This minimizes surprises and surfaces regressions early.

➡️ Safety & Alignment Layers
Don't ship agents without guardrails. Use tools like Llama Guard v4, Prompt Shield, and role-based access controls. Add structured rate-limiting to prevent overuse or sensitive tool invocation.

➡️ Cost-Aware Agent Execution
Implement token budgeting, step-count tracking, and execution metrics. Especially in multi-agent settings, costs can grow exponentially if unbounded.

➡️ Human-in-the-Loop Orchestration
Always have an escalation path. Add override triggers, fallback LLMs, or routing to a human for edge cases and critical decision points. This protects quality and trust.

PS: If you are interested in learning more about AI Agents and MCP, join the hands-on workshop I am hosting on 31st May: https://lnkd.in/dWyiN89z

If you found this insightful, share it with your network ♻️

Follow me (Aishwarya Srinivasan) for more AI insights and educational content.
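To illustrate the first principle, here is a minimal, framework-agnostic sketch of planner-executor separation with a step budget and a guardrail. `call_llm` and the tool registry are stubs I wrote for illustration, not a real SDK:

```python
import json

# Stub tool registry; real systems would use MCP or function-calling schemas.
TOOLS = {
    "search": lambda q: f"results for {q!r}",
    "calculator": lambda expr: str(eval(expr)),  # demo only; never eval untrusted input
}

def call_llm(prompt: str) -> str:
    # Placeholder for an actual model call; returns a canned JSON plan here.
    return json.dumps([{"tool": "calculator", "args": "2 + 2"}])

def run_agent(task: str, max_steps: int = 5) -> list:
    # High-level planner: decompose the task into tool calls.
    plan = json.loads(call_llm(f"Plan tool calls for: {task}"))
    trace = []
    # Low-level executor: run each step under a step budget (cost control).
    for step in plan[:max_steps]:
        tool = TOOLS.get(step["tool"])
        if tool is None:
            # Guardrail: unknown tool -> escalate to a human instead of guessing.
            trace.append(("escalate_to_human", step))
            continue
        trace.append((step["tool"], tool(step["args"])))
    return trace

print(run_agent("What is 2 + 2?"))  # -> [('calculator', '4')]
```

Keeping the planner, executor, guardrail, and budget as separate pieces is exactly what makes such agents debuggable: you can inspect the plan before anything runs.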
-
Most robotics startups don't die from bad ideas. They die in the wrong city. 📸

That's why you should move to Zurich if you are building such a company. Indeed, Silicon Valley is great for software, but robotics plays by different rules:

1️⃣ You need the right talent
People who understand mechanics, electronics, manufacturing, software, and AI, and who can integrate them end to end.

2️⃣ You need new ideas
New research publications. Cutting-edge approaches. Experimental testing spaces.

3️⃣ You need real-world validation
Factories, customers, and partners who stress-test the tech early. No lab-only assumptions.

4️⃣ You need patient capital
Because robotics doesn't move in straight lines.

5️⃣ You need startup founders helping each other
Like in SF. Shared context. Shared pain. Shared shortcuts.

Zurich covers almost all of this:
1️⃣ ETH Zürich supplies a constant stream of talent
2️⃣ ETH labs, Disney Research, and the RAI Institute keep ideas circulating
3️⃣ Industrial players provide early customers and validation
4️⃣ Many deep-tech investors and grant programs here are designed for long B2B cycles and hardware risk

That's why Zurich works so well, with many robotics startups (look at the map or the list below).

What's missing?
5️⃣ A stronger startup culture where founders actively support each other.

At Forgis, we want to help build that. That's why we're opening our Schlieren office on weekends for founders and future founders to work together. Comment "ecosystem" to get access.

ANYbotics, Gravis Robotics, Verity, mimic, Bota Systems AG, Duatic, RIVR, Flexion Robotics, Voliro, Sevensense, Tethys Robotics, Embotech, Ascento, Wingtra, Auterion, Loki Robotics, Nautica Technologies, Bubble Robotics, Cerrion, and student initiatives such as the ETH Robotics Club
-
A $400 gripper is quietly changing how we train robots.

It's called UMI (Universal Manipulation Interface). You hold it like a tool. Demonstrate a task by hand. And robots learn to copy you. No teleoperation. No expensive hardware. No robot-specific data. The team at Stanford open-sourced everything: hardware designs, code, datasets.

Here's why this matters: the bottleneck in robot learning isn't algorithms. It's data. Teleoperation is slow (35 demos/hour). UMI is 3x faster (111 demos/hour). And the data works across different robots: UR5, Franka, whatever you have.

The clever bits (see the latency-matching sketch after this post):
→ GoPro fisheye lens (155° FOV) + side mirrors for depth
→ SLAM + IMU for precise 6DoF pose tracking
→ Latency matching so robots handle dynamic tasks
→ Diffusion policy for multimodal action distributions

Cheng Chi just took this further. He co-founded Sunday Robotics with Tony Zhao (of ALOHA fame). Their Skill Capture Glove is UMI's next evolution: a $200 wearable they've distributed to 500+ homes. The result: ~10 million episodes of real household data. Their robot Memo learned to do dishes, laundry, and make espresso, trained on zero robot data.

Video credits: Cheng and his team
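The latency-matching idea is easy to picture in code. Here is a rough sketch of the concept as I understand it, not the UMI codebase: timestamp everything, then pair each observation with the action that takes effect when a command issued now would actually land. The latency values are made-up placeholders for numbers you would measure offline:

```python
import bisect

# Assumed latencies, measured offline in a real setup (values are made up).
CAMERA_LATENCY = 0.060     # seconds from physical event to image arrival
ACTUATION_LATENCY = 0.090  # seconds from command to robot motion

def match_pairs(obs_stamps, action_stamps, actions):
    """Pair each observation with the action that takes effect closest to
    when the observed state was actually true, compensating both delays."""
    pairs = []
    for t_obs in obs_stamps:
        # When the world looked like this -> when a command issued now lands.
        t_effect = (t_obs - CAMERA_LATENCY) + ACTUATION_LATENCY
        i = min(bisect.bisect_left(action_stamps, t_effect), len(actions) - 1)
        pairs.append((t_obs, actions[i]))
    return pairs

print(match_pairs([0.0, 0.1, 0.2],
                  [0.0, 0.05, 0.1, 0.15, 0.2],
                  ["a0", "a1", "a2", "a3", "a4"]))
```

Without this compensation, a policy trained on handheld data would systematically lag behind moving objects once deployed on a slower real robot.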
-
Micro drones are no longer niche tools; they are becoming a core pillar of surveillance, security, and tactical intelligence across defense, public safety, and critical infrastructure. Have you seen this one? What's remarkable is not just the capability, it's the speed of evolution.

📈 The Numbers Behind the Momentum
• The global micro-drone market is growing at 16–19% CAGR, with forecasts projecting growth from ~$10B in 2024 to over $24B by 2029
• The small-UAV market is expected to exceed $11B by 2030
• Defense and surveillance account for one of the largest and fastest-growing segments, driven by:
 • Border security expansion
 • Urban surveillance demand
 • ISR (Intelligence, Surveillance, Reconnaissance) modernization

🧠 What Changed the Game?
Modern micro drones now combine:
• AI-powered navigation & object recognition
• Real-time video transmission
• Autonomous flight and obstacle avoidance
• Swarm coordination capabilities
• Ultra-miniaturized thermal + optical sensors

Some nano-drones weigh under 20 grams, fly for 20–25 minutes, and transmit encrypted HD video over 1.5–2 km, all while operating with extremely low acoustic signatures. This level of capability was military-exclusive just a few years ago. Today, it's rapidly becoming standard.

Micro surveillance drones are now actively used for:
• Tactical reconnaissance in conflict zones
• Law enforcement situational awareness
• Crowd monitoring & perimeter security
• Disaster response in collapsed or dangerous environments
• Critical infrastructure inspection (energy, transport, telecom)

At the tactical level, they allow frontline units to "see first" before entering hostile or uncertain environments, reducing risk and improving decision speed.

🤖 The Rise of Swarm Intelligence
One of the most disruptive developments is coordinated micro-drone swarms:
• Multiple drones operating as a single intelligent system
• Real-time terrain mapping
• Autonomous target identification
• Dynamic mission adaptation

This shifts surveillance from isolated viewpoints to distributed intelligence networks in the air.

⚠️ The Strategic Challenge
With power comes responsibility. Micro-drone surveillance forces critical conversations around:
• Privacy and civil liberties
• Airspace governance
• Ethical deployment
• Counter-drone defense systems
• Digital sovereignty

At the same time, governments and enterprises are investing heavily in anti-drone and RF-neutralization technologies, signaling that the drone vs. counter-drone race has already begun.

#Drones #SurveillanceTechnology #DefenseTech #AI #AutonomousSystems #SecurityInnovation #FutureOfSurveillance
-
"The field of embodied AI (EAI) is rapidly advancing. Unlike virtual AI, EAI systems can exist in, learn from, reason about, and act in the physical world. With recent advances in AI models and hardware, EAI systems are becoming increasingly capable across wider operational domains. While EAI systems can offer many benefits, they also pose significant risks, including physical harm from malicious use, mass surveillance, as well as economic and societal disruption. These risks require urgent attention from policymakers, as existing policies governing industrial robots and autonomous vehicles are insufficient to address the full range of concerns EAI systems present. To help address this issue, this paper makes three contributions. First, we provide a taxonomy of the physical, informational, economic, and social risks EAI systems pose. Second, we analyze policies in the US, EU, and UK to assess how existing frameworks address these risks and to identify critical gaps. We conclude by offering policy recommendations for the safe and beneficial deployment of EAI systems, such as mandatory testing and certification schemes, clarified liability frameworks, and strategies to manage EAI’s potentially transformative economic and societal impacts" Jared Perlo Centre for the Governance of AI (GovAI) Centre pour la Sécurité de l'IA - CeSIA) Alex Robey Fazl Barez Luciano Floridi Jakob Mökander Tony Blair Institute for Global Change Digital Ethics Center (DEC), Yale University
-
Yesterday, I shared ten ideas on the crossing paths of augmented humans and humanized robots. If you missed it, here's the post: https://lnkd.in/gSEx4MNw

Over the next few days, I'll go deeper into each concept, starting with a big one:

Synthetic Theory of Mind: Teaching Robots to Get You

What will it take for robots to go beyond following commands and actually understand us? The next leap in robotics isn't more compute. It's empathy. We need a new kind of intelligence: a Synthetic Theory of Mind Engine, a system that lets machines infer our beliefs, emotions, intentions, and mental states.

This isn't sci-fi anymore. China recently introduced Guanghua No. 1, the world's first robot explicitly designed with emotional intelligence. It can express joy, anger, and sadness and adapt its behavior based on human cues. The vision: emotionally aware care, especially for aging populations. And as Scientific American reports, researchers are now building AI models that simulate how people think and feel, essentially teaching machines to reason about our inner world. We're witnessing the first generation of emotionally intelligent machines.

So, what can a Synthetic Theory of Mind Engine do? Imagine a robot that can:
⭐ Detect confusion in your voice and rephrase
⭐ Notice emotional fatigue and pause
⭐ Adapt its language based on what you already know
⭐ Predict what you're about to need before you say it

To do this, it builds a persistent mental model of you, one that evolves with every interaction, making collaboration more intuitive and aligned (a toy sketch of such a model follows at the end of this post).

In healthcare, education, customer support, and even companionship, the future of robotics isn't just about capability. It's about alignment with our goals, our states, and our humanity. We're not just building smarter agents. We're building partners who can make us feel seen, understood, and supported.

2–3 years: expect early pilots in eldercare, education, and social robotics
5–7 years: emotionally aware, intent-sensitive agents in homes, hospitals, and teams

If you're working on cognitive robotics, LLM + ToM integration, or human-aligned AI, I'd love to connect and collaborate.
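As a thought experiment, here is a toy sketch of what such a persistent user model could look like. Every field, threshold, and update rule below is my own assumption for illustration, not a published design; a real engine would use learned classifiers rather than keyword cues:

```python
from dataclasses import dataclass, field

@dataclass
class UserModel:
    """Toy persistent 'mental model' of a user (illustrative assumptions only)."""
    confusion: float = 0.0   # 0..1, estimated from voice/text cues
    fatigue: float = 0.0     # 0..1, rises over long sessions
    known_topics: set = field(default_factory=set)

    def update(self, utterance: str, session_minutes: float) -> None:
        # Toy cue detector; a real system would use learned affect classifiers.
        if "?" in utterance or "confused" in utterance.lower():
            self.confusion = min(1.0, self.confusion + 0.3)
        else:
            self.confusion = max(0.0, self.confusion - 0.1)
        self.fatigue = min(1.0, session_minutes / 60.0)

    def next_behavior(self) -> str:
        if self.fatigue > 0.8:
            return "pause"      # notice emotional fatigue and pause
        if self.confusion > 0.5:
            return "rephrase"   # detect confusion and rephrase
        return "continue"

user = UserModel()
user.update("Wait, I'm confused about step two?", session_minutes=10)
print(user.next_behavior())  # -> "rephrase"
```

The point of the sketch is the state itself: because the model persists and evolves across interactions, the robot's behavior can adapt to you over time instead of resetting every turn.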
-
Impressive work by the new Amazon Frontier AI & Robotics team (formed from the Covariant acquisition) and collaborators!

This research enables mapping long sequences of human motion (>30 sec) onto robots with various shapes, as well as onto robots interacting with objects (box, table, etc.) of different sizes, in particular sizes different from those in the training data. This enables easier in-simulation data augmentation and zero-shot transfer. It is impressive, and a potentially huge step toward reducing the need for human teleoperation data (which is hard to gather for humanoids).

The dataset trajectories are available on Hugging Face at: https://lnkd.in/eygXVVHx

The full code framework is coming soon. Check out the project page, which has some pretty nice three.js interactive demos: https://lnkd.in/e2S-6K2T

And kudos to the authors for open-sourcing the data, releasing the paper, and (hopefully soon) the code. Open-science projects like this are game changers in robotics.
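If you want to poke at the released trajectories, a minimal sketch with the Hugging Face `datasets` library might look like this. The repo id below is a placeholder, since the post only gives a shortened link; substitute the actual id from the link:

```python
# Minimal sketch for inspecting the released trajectories on Hugging Face.
# "org/humanoid-retargeting-trajectories" is a PLACEHOLDER repo id; use
# the real one behind the lnkd.in link above.
from datasets import load_dataset

ds = load_dataset("org/humanoid-retargeting-trajectories", split="train")
print(ds)            # dataset size and feature schema
print(ds[0].keys())  # peek at one episode's fields (e.g. joint trajectories)
```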
-
🤖 What happens when cutting-edge tech meets the need for human connection?

🇯🇵 I recently visited Tokyo's Robot Café, where robots serve as waitstaff, but here's the twist: these robots are remotely operated by individuals with physical disabilities, offering them a means of employment from the comfort of their homes. It was an extraordinary experience that felt like a glimpse into the kind of future that, if designed thoughtfully, can be inclusive and innovative.

This café is of course a marvel of robotics, but it's also a social enterprise that reminds us what's possible when we integrate technology with empathy and purpose. The robots may be doing the serving, but the human touch is unmistakable in every interaction.

As someone deeply rooted in sustainability, this got me thinking about the kind of innovation we need to shape our world. On one hand, technology like this can be transformative, addressing accessibility gaps and creating opportunities for those society often leaves behind. On the other, I've always believed that technology alone cannot solve our biggest challenges.

Take the climate crisis, for instance. There's no doubt that AI and automation will play critical roles in monitoring carbon emissions, streamlining renewable energy systems, and optimizing resource use. But can we truly innovate our way out of a global crisis without addressing the human and systemic failures that caused it in the first place? It's not just about building smarter tools; it's about building more inclusive systems.

This robot café is proof that progress works best when technology and humanity work together. It shows us a model of innovation that doesn't just aim for efficiency but also uplifts communities, one that acknowledges the wisdom, resilience, and creativity of people who've historically been overlooked.

And this matters in sustainability too. Nature-based solutions, ancestral knowledge, and community-driven practices must go hand in hand with technological advancements. While AI can analyze forests, it is indigenous communities that have protected them for centuries. While carbon-capture tech is expensive and energy-intensive, reducing emissions at the source is far more effective.

The takeaway? Technology is a tool, not a savior. We can't rely solely on algorithms to solve deeply human challenges like inequality, exploitation, and environmental degradation. What we need is a marriage of innovation and intention: a future where robots don't replace human connection but amplify it, and where progress doesn't leave people or the planet behind.

🌎 I am rooting for a world where technology can drive meaningful change without losing sight of our shared humanity and what's really important: people, communities, connection, contentment, nature, and healing.

#Sustainability #Technology #Innovation #Inclusion #socialentrepreneurship #climatecrisis #climateaction #AI #Robotics #Futureofwork #Community #climatesolutions #indigenous