Physical AI is Hot, Some New Thoughts from Me

marsbit | Published on 2026-05-18 | Last updated on 2026-05-18

Abstract

The term "Physical AI" is gaining significant traction, marking a shift from AI that processes information to AI that understands and interacts with the physical world. Unlike traditional AI confined to screens, Physical AI involves integrating intelligence into robotic bodies to perform tasks in environments governed by gravity, friction, and inertia. The concept, formally defined in a 2020 paper, focuses on creating embodied systems that can complete perception-to-action cycles. 2026 is identified as a pivotal "deployment year," where the focus moves from demonstrations to practical utility. Companies like China's Zhiyuan Robotics have transitioned to live, unscripted factory deployments and announced mass production targets. Internationally, Figure AI, after a major funding round, shifted to its own neural system, while NVIDIA partnered with major industrial robot firms to upgrade millions of existing units with AI capabilities. A key trend is the crossover from the automotive supply chain. Companies like Aptiv and Valeo are entering the Physical AI space, leveraging their expertise in sensors, control systems, and mass production from the autonomous vehicle sector. This "technology spillover" is accelerating development, as seen with Tesla's plans to repurpose automotive production lines for its Optimus robot. The technical breakthrough enabling this progress is the engineering maturity of "world models." Previously theoretical, these AI models can now simulate physica...

Article | New Mou, Author | Lu Yao

Recently, a term has been buzzing in certain circles: "Physical AI".

This term was actually mentioned over ten times by Jensen Huang in his speech at the Las Vegas CES early last year, but it wasn't until this year that "Physical AI" truly exploded in significance.

So, what exactly is "Physical AI"?

A couple of days ago, I saw a video of a robot watering flowers. The robot first walked to the faucet, turned on the valve, filled the watering can, then turned around, walked to the flower pot, adjusted its angle, and poured the water in evenly. The spout didn't hit the edge of the pot, and no water spilled out.

The task looks trivial, but consider what a machine needs in order to understand even "carrying a cup of water": it must know the cup is cylindrical, calculate the precise force needed to grip it without slipping or crushing it, understand that water is a liquid and will spill if shaken, and constantly adjust its arm angle while walking to compensate for body movement.
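To make the "precise force" point concrete, here is a minimal sketch of the grip-force arithmetic, assuming a simple two-finger gripper and a Coulomb friction model. The function names, the safety factor, and the crush limit are hypothetical illustrations, not taken from any robot mentioned in this article.

```python
G = 9.81  # gravitational acceleration in m/s^2


def min_grip_force(mass_kg, friction_coeff, extra_accel=0.0, safety_factor=1.0):
    """Smallest normal force (N) per finger that keeps the object from slipping.

    Assumes two opposing contacts, each contributing friction mu * N, so
    2 * mu * N must be at least m * (g + a).
    """
    if mass_kg <= 0 or friction_coeff <= 0:
        raise ValueError("mass and friction coefficient must be positive")
    return safety_factor * mass_kg * (G + extra_accel) / (2.0 * friction_coeff)


def choose_grip_force(mass_kg, friction_coeff, crush_limit_n,
                      extra_accel=0.0, safety_factor=1.5):
    """Return a grip force inside the no-slip / no-crush window, or None."""
    needed = min_grip_force(mass_kg, friction_coeff, extra_accel, safety_factor)
    if needed > crush_limit_n:
        return None  # no force both holds the cup and avoids crushing it
    return needed


if __name__ == "__main__":
    # A 0.3 kg cup, friction coefficient 0.5, assumed crush limit of 10 N.
    print(choose_grip_force(0.3, 0.5, crush_limit_n=10.0))
```

The `extra_accel` term stands in for the arm's own motion while walking: the faster the hand accelerates, the narrower the window between slipping and crushing becomes.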

A three-year-old can do all of this intuitively. But for AI, it is a huge leap. Over the past decade, AI learned to see, hear, speak, and draw, but it remained trapped within screens. What Physical AI aims to do is put this smart brain into a body that can run, jump, grasp, and manipulate objects in the real world.

Simply put, Physical AI is about making AI understand and act upon the physical world. It's no longer just processing text and images; it's about performing correct actions in an environment governed by gravity, friction, and inertia.

A fact seldom discussed domestically is that the term "Physical AI" didn't originate from some chip giant's PR department. This concept first appeared in a 2020 paper published in *Nature Machine Intelligence*. The paper systematically defined Physical AI for the first time:

A class of embodied systems capable of performing tasks typically associated with intelligent organisms. The core lies in deeply integrating physical laws into the AI system, so machines are no longer "physically blind" and can complete the perception-to-action loop.

From the academic world's opening shot in 2020 to the industry's full embrace in 2026, there was a gap of six whole years. In these six years, sensor costs dropped by several orders of magnitude, edge AI computing power moved from theory to engineering, and the reliability and mass production capability of robot bodies quietly reached an inflection point — these were the hidden forces pushing Physical AI from papers to production lines.

From Demonstration to Working

If the large language models of 2023 taught AI to chat, then the keyword for Physical AI in 2026 is just one thing: work.

The change is visible to the naked eye.

This time last year, robot companies still flexed their muscles by filming demo videos: setting up scenes, rehearsing repeatedly, and shooting in one take. It was impressive to watch, but you never knew how many takes it took.

This year, the playbook is completely different. On a 3C (computers, communications, and consumer electronics) production line in Nanchang, Zhi Yuan Robotics put a robot into a real factory and had it work continuously for several hours, live-streaming the entire process. There was no preset script and no restricted scene, just the same production line that workers face every day. Hundreds of thousands of people watched online.

A month later, Zhi Yuan announced in Hong Kong the mass production of 10,000 humanoid robots. The leap from one prototype in the lab to 10,000 on a production line is a milestone that changes the game.

Zhi Yuan's approach is interesting. Most robotics startups focus on a specific segment — some only on the body, some only on the large model, some only on dexterous hands. Zhi Yuan chose another path: doing the full stack, simultaneously developing the body manufacturing, AI model, dexterous manipulation, and data collection, while also investing in over 60 upstream and downstream companies in the industry chain.

The cost of this approach is clear: the parent company has over a thousand employees, expected to grow further by the end of this year, with an annual salary expenditure alone reaching billions. This path burns cash, but once proven, its moat is also the deepest.

Zhi Yuan's founder Deng Taihua proposed an analytical framework called the "XYZ Curve." He said embodied intelligence development has three stages: X is the development and experimentation phase, where people are still playing with demos; Y is the deployment and growth phase, where robots actually start working on production lines; Z is the ultimate intelligent emergence phase.

He characterized 2026 as: "the first year of deployment phase, officially moving from 'can move' to 'can work'." The difference between "can move" and "can work" is just one word, but it marks the entire industry's coming of age.

Across the Pacific, the pace is just as intense.

American humanoid robot company Figure AI is an unavoidable name on this track. In September last year, they completed a funding round of over $1 billion, raising their valuation to $39 billion, making them the world's highest-valued humanoid robot company at the time.

A month later, they released a new generation product, Figure 03, standing 1.68 meters tall and weighing about 60 kilograms, demonstrating household chores like watering plants, serving dishes, and folding clothes. Founder Brett Adcock specifically added on social media: all actions were autonomously completed by the robot, with no human remote control.

Technologically, it's noteworthy that Figure made a major strategic pivot, terminating its cooperation with OpenAI and fully transitioning to its self-developed neural network system, Helix.

This system mimics human cognition with a three-layer structure: the bottom layer handles balance and instinctive reactions; the middle layer translates the brain's commands into motor control commands 200 times per second; and the top layer is the logical brain, responsible for understanding scenes and making decisions. This "instinct-reflex-thought" three-tier architecture is quite clever, essentially giving the robot a nervous system that does not crash.
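The layered idea can be sketched as a multi-rate control loop. The following is an illustrative toy, not Helix's actual implementation: only the 200 Hz figure for the middle layer comes from the description above, while the base tick rate, the top-layer rate, the class name, and all the logic inside each layer are assumptions made for the example.

```python
from dataclasses import dataclass, field

BASE_HZ = 1000   # assumed rate of the bottom (balance) layer
MIDDLE_HZ = 200  # middle-layer rate quoted in the article
TOP_HZ = 10      # assumed rate of the slow "logical brain"


@dataclass
class LayeredController:
    goal: str = "idle"
    motor_target: float = 0.0
    counts: dict = field(
        default_factory=lambda: {"top": 0, "middle": 0, "bottom": 0}
    )

    def top_layer(self, scene):
        # Slow layer: interpret the scene and decide what to do.
        self.goal = "reach" if scene.get("object_visible") else "idle"
        self.counts["top"] += 1

    def middle_layer(self):
        # Mid-rate layer: turn the current goal into a motor target.
        self.motor_target = 1.0 if self.goal == "reach" else 0.0
        self.counts["middle"] += 1

    def bottom_layer(self, tilt):
        # Fast layer: correct the motor command for body tilt on every tick.
        self.counts["bottom"] += 1
        return self.motor_target - 0.5 * tilt

    def run(self, seconds, scene, tilt=0.0):
        # Simulated ticks; a real system would use a real-time scheduler.
        command = 0.0
        for tick in range(int(seconds * BASE_HZ)):
            if tick % (BASE_HZ // TOP_HZ) == 0:
                self.top_layer(scene)
            if tick % (BASE_HZ // MIDDLE_HZ) == 0:
                self.middle_layer()
            command = self.bottom_layer(tilt)
        return command


if __name__ == "__main__":
    ctrl = LayeredController()
    print(ctrl.run(1.0, {"object_visible": True}, tilt=0.2), ctrl.counts)
```

The point of the split is that a slow decision layer never blocks the fast balance layer: the bottom loop keeps running on the most recent target even while the top layer is still deciding.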

Another development is worth mentioning. At this year's GTC conference, NVIDIA announced deep cooperation with the world's four industrial robotics giants: ABB, KUKA, Yaskawa, and Fanuc. Over 2 million industrial robots already installed on production lines worldwide can now use NVIDIA's simulation platform for virtual commissioning and AI training.

These four companies combined account for over half of the global industrial robot market share. In the next decade, these robots will undergo an upgrade from "traditional programming" to "AI-driven." Whichever software platform can embed itself into this process will essentially secure the "operating system" layer for the next generation of industrial automation. NVIDIA clearly doesn't want to miss the boat.

Cross-Border Sprint from the Supply Chain

Another interesting phenomenon: automotive supply chain companies are entering the Physical AI track en masse.

At this year's Beijing Auto Show, traditional automotive suppliers like Aptiv, Valeo, Horizon Robotics, and Qianxun SI showcased robotics-related solutions side by side. Many industry insiders realized then that perception for embodied intelligence is essentially the same as perception for automotive intelligent driving, so automotive solutions can be applied to humanoid robots almost directly.

Thinking about it carefully, it makes sense. The automotive intelligent driving system is essentially a perception-decision-execution loop for a "mobile robot." Its three core modules — visual perception, path planning, and real-time control — are highly homologous in technical architecture with traditional industrial robots and humanoid robots.

Automotive suppliers' cameras, radars, steer-by-wire chassis, and real-time operating systems can be migrated to the robotics field with slight adaptation. In this sense, the hundreds of billions in R&D spending the automotive industry burned over the past decade on intelligence are now flowing into the Physical AI track as "technology spillover."

This might explain why Chinese robotics companies can so quickly enter the mass production stage. Manufacturing capabilities and supply chain management aren't built from scratch; many are readily available. Those component suppliers already honed on automotive production lines for over a decade are now applying their skills on a new battlefield.

There are ready-made examples abroad as well. Take Tesla, whose first-generation humanoid robot Optimus is also accelerating toward production. In its Q1 2026 earnings call, Tesla stated that the company would transition to "a future centered on AI, autonomous taxis, and humanoid robots," with the first-generation robot production line having a capacity of 1 million units and replacing the current Model S and Model X production lines.

The number 1 million might seem exaggerated in today's context, but Tesla's logic is clear: it wants to directly replicate the large-scale production capabilities and supply chain management experience accumulated in automobile manufacturing into the humanoid robotics field.

What Musk wants is not a "robot that can move," but a "mass-produced tool" that can work alongside humans in factories. Once this path is proven, its impact on the manufacturing automation landscape will be no less than that of the Model 3 on the fuel vehicle market.

World Model: Why It Became Usable This Year

Having covered the major players' moves at the industry level, let's zoom in one layer deeper: what's the technological foundation of this Physical AI race?

To sum it up in one sentence: the engineering breakthrough of world models. I think this is also the most critical point for understanding this wave.

The concept of "world model" isn't new; it was proposed back in 2018. The core idea is simple: let AI develop an internal understanding of how the physical world operates, so it can predict "what will happen if I push this cup." But previously, this mostly existed only in papers — too computationally expensive, unstable generation quality, unsuitable for real-time interaction.

The turning point happened in the last year. NVIDIA launched a series of models called Cosmos, whose core capability is generating action data conforming to physical laws from text or images.

For example: if you want to train a robot to move boxes in various weather conditions, you don't need to actually film videos in factories during rain, snow, or at night. Set the parameters in a simulation environment, and Cosmos can directly generate massive amounts of highly realistic training data covering various extreme scenarios.

Early this year, the Ant Lingbo team open-sourced a framework called LingBot-World, specifically for interactive world models. It can achieve nearly 10 minutes of continuous, stable video generation, with end-to-end interaction latency controlled within seconds. Users can control virtual characters in real-time with a keyboard and mouse like playing a game, with the model providing instant feedback on scene changes. The significance is that world models moved from "offline rendering" to "online interaction," boosting training efficiency by an order of magnitude.

Another startup, Jijia Vision, released the GigaWorld-1 platform, positioned as a "digital sandbox" for the physical world. A month later, Alibaba's ABot-PhysWorld surpassed it on a benchmark called WorldArena, topping the comprehensive rankings. Competition is advancing month by month.

The importance of these open-source projects lies not in how large their parameter counts are, but in turning a game "only giants could play" into a tool "small teams can also use." When enough people are building the wheels, more cars will truly start running.

The reason world models have become a core component in the Physical AI era is that they answer that long-unresolved question: how to enable robots to learn the complex laws of the physical world in a low-cost, high-efficiency way?

Training data from the real world is extremely costly to obtain and inherently carries distribution bias. It's hard to gather all edge scenarios in reality, like factory night shifts during a blizzard, emergency situations during a logistics warehouse blackout, or sudden human intervention on a production line. But synthetic data can. By manipulating scene parameters with prompts in a simulation environment, researchers can generate large-scale training videos covering extreme conditions within hours, which would take months or even years under the traditional real-data collection route.
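The idea of "manipulating scene parameters" can be sketched as a scenario sweep. This is a hedged illustration only: it does not call Cosmos or any other real world-model API, and the axis names, the `camera_noise` field, and the prompt wording are all hypothetical. It only shows how a few parameter axes multiply into coverage of edge cases that are hard to film.

```python
import itertools
import random

# Hypothetical scene axes, loosely based on the edge cases named above.
SCENE_AXES = {
    "weather": ["clear", "rain", "snow", "blizzard"],
    "lighting": ["day", "night", "blackout"],
    "event": ["none", "human_intervention"],
}


def enumerate_scenarios(axes):
    """Every combination of the discrete scene parameters."""
    keys = list(axes)
    combos = itertools.product(*(axes[k] for k in keys))
    return [dict(zip(keys, combo)) for combo in combos]


def sample_scenarios(axes, n, seed=0):
    """Randomly sampled scenarios with an extra continuous jitter parameter."""
    rng = random.Random(seed)  # seeded so the dataset is reproducible
    scenarios = []
    for _ in range(n):
        scenario = {key: rng.choice(values) for key, values in axes.items()}
        scenario["camera_noise"] = rng.uniform(0.0, 0.1)
        scenarios.append(scenario)
    return scenarios


def to_prompt(scenario):
    """Render a scenario as a text prompt for a generative simulator."""
    return (
        "warehouse robot moving boxes, "
        f"weather={scenario['weather']}, "
        f"lighting={scenario['lighting']}, "
        f"event={scenario['event']}"
    )


if __name__ == "__main__":
    for scenario in enumerate_scenarios(SCENE_AXES)[:3]:
        print(to_prompt(scenario))
```

Three short lists already yield 24 distinct scenes; adding one more axis with five values would yield 120, which is the combinatorial leverage that real-world filming cannot match.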

The leverage effect of this breakthrough might exceed any single algorithm improvement.

The Paradigm Has Changed

The breakthrough in world models is actually just one part of the evolution of the Physical AI tech stack. Changes in underlying technology are driving a fundamental architectural rebuild of the entire robotics industry.

Traditional robots use a "sense, plan, act" three-stage approach. First, sensors perceive the environment, then engineers write rules telling the machine how to plan its path, and finally, it executes the action. This works fine in structured environments like factory assembly lines, but once the scenario gets complex, its shortcomings are exposed. The machine only follows the preset script and gets stuck when encountering unseen situations.

Physical AI takes a different path: "perception, reasoning, execution." After perception, it doesn't go through human-written rules but uses a trained neural network to reason what to do and then execute. The essential difference is that the former is "the engineer thinks for the machine," while the latter is "the machine understands the physical world itself."
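The contrast between the two approaches can be shown with a toy example. Note the swap: instead of a trained neural network, the "learned" side below uses a one-nearest-neighbor lookup over a handful of made-up examples, purely as a stand-in that can run without any dependencies. The labels, feature vectors, and action names are all hypothetical.

```python
def rule_based_plan(observation):
    """Engineer-written rules: only situations listed here are handled."""
    rules = {"box_on_belt": "pick_box", "belt_empty": "wait"}
    return rules.get(observation["label"])  # None for an unseen situation


def squared_distance(a, b):
    return sum((x - y) ** 2 for x, y in zip(a, b))


def learned_policy(observation, examples):
    """Stand-in for a trained network: act like the most similar example."""
    _, action = min(
        examples,
        key=lambda ex: squared_distance(ex[0], observation["features"]),
    )
    return action


# Made-up training examples: (feature vector, action).
EXAMPLES = [
    ((1.0, 0.0), "pick_box"),  # box present, upright
    ((0.0, 0.0), "wait"),      # nothing on the belt
]

if __name__ == "__main__":
    # A tilted box: no rule covers this label, but its features are
    # still close to the "box present" example.
    tilted = {"label": "box_tilted", "features": (0.8, 0.3)}
    print(rule_based_plan(tilted))           # None: the script has no entry
    print(learned_policy(tilted, EXAMPLES))  # pick_box
```

The rule table fails silently on anything outside its script, while the learned side generalizes from similarity. The trade-off, which the toy hides, is that a learned policy can also generalize wrongly, which is why the quality and coverage of training data matter so much.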

The International Federation of Robotics released a technology roadmap this year, predicting that within the next three years, 80% of new robot models will adopt this new architecture, with the traditional three-stage approach gradually exiting the mainstream. This isn't a minor tweak; it's a full paradigm shift.

As an industry expert aptly summarized: Physical AI is the ultimate mode of AI development because it needs to understand not only human instructions but also all the laws of the physical world.

Jensen Huang said the "ChatGPT moment" for robotics development has arrived. In my view, the nature of Physical AI's "moment" is completely different from that of language models. For language models, the moment was when ordinary people worldwide first got their hands on AI. For Physical AI, the moment is when AI truly starts working for the first time.

Currently, this track is at a very special stage: the direction is locked in, the concept is validated, but the landscape isn't settled.

On one hand, making demos and achieving mass production are two completely different capability systems. Getting one prototype to work is one thing; having ten thousand products perform consistently in real-world scenarios tests manufacturing consistency, supply chain resilience, scenario generalization ability, and operational systems. These have little to do with AI algorithms, but each is enough to halt a batch of players. On the other hand, real-world data collection is expensive, time-consuming, and has limited coverage, which almost predestines that large-scale training for Physical AI will heavily rely on synthetic data.

At the same time, from automotive supply chains and traditional industrial automation to consumer electronics manufacturing, industries that seem unrelated to "AI" are accelerating their entry into Physical AI through technology spillover. Their manufacturing capabilities, supply chain management experience, and scenario resources might be the key variables determining the speed of Physical AI's practical application.

An intuitive judgment is this: look back at the AI wave ignited by ChatGPT in early 2023. The ones who captured the most value weren't the model makers, but the infrastructure providers. Will this wave of Physical AI replay the same script?

NVIDIA's moves suggest it's betting on this direction, but the story isn't finished. 2026 is the first year of the deployment phase; industrial competition has just begun. Looking back three years from now, which names are still at the table and which have been eliminated might surprise most people.
