Recursive Simulations · 2026-04-27

🧠 Agentic World Modeling Taxonomy Formalizes the L3 Evolver: When Simulation Self-Revises, Who Validates?
⚡ FLASH Cuts Deformable Robot Training from Days to Minutes via GPU-Parallel Isaac Sim
🔄 SIM1 Achieves Zero-Shot Sim-to-Real Transfer in Deformable Worlds, Collapsing the Bridge
🏗️ NVIDIA Omniverse Splits into Standalone C APIs: Physics, Rendering, and Storage as Callable Primitives
📡 Telecom World Models Propose Unified 6G Framework: Digital Twins + Foundation Models + Predictive Planning
📊 PhysInOne Deploys 2M Synthetic Physics Videos Across 71 Physical Law Categories

---

🧠 Agentic World Modeling Taxonomy Formalizes the L3 Evolver: When Simulation Self-Revises, Who Validates?

A 36-author survey released April 24 (arXiv:2604.22748) introduces a "levels × laws" taxonomy for agentic world modeling that has a structural implication the authors don't foreground: once simulation reaches Level 3, external validation becomes architecturally optional.

The taxonomy organizes world modeling capability along two axes. The first defines three capability levels: L1 Predictor learns one-step local transition operators; L2 Simulator composes those into multi-step, action-conditioned rollouts that respect domain laws; L3 Evolver "autonomously revises its own model when predictions fail against new evidence." The second axis identifies four governing-law regimes — physical, formal, semantic, and social — that determine which constraints apply to any given simulation domain.
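As a rough illustration of how the two axes combine, the taxonomy can be written down as a pair of enumerations. The class names and the `classify` helper are hypothetical, chosen for this sketch rather than taken from the survey, and the two boolean inputs are a deliberate simplification of the survey's level definitions.

```python
from enum import Enum, IntEnum
from itertools import product


class Level(IntEnum):
    """Capability levels from the survey's first axis."""
    PREDICTOR = 1   # one-step local transition operators
    SIMULATOR = 2   # multi-step, action-conditioned rollouts
    EVOLVER = 3     # revises its own model when predictions fail


class LawRegime(Enum):
    """Governing-law regimes from the survey's second axis."""
    PHYSICAL = "physical"
    FORMAL = "formal"
    SEMANTIC = "semantic"
    SOCIAL = "social"


def classify(multi_step_rollouts: bool, self_revising: bool) -> Level:
    """Hypothetical helper: map two coarse capability flags to a level."""
    if self_revising:
        return Level.EVOLVER
    if multi_step_rollouts:
        return Level.SIMULATOR
    return Level.PREDICTOR


# Every (level, regime) pair is one cell of the "levels x laws" grid.
grid = list(product(Level, LawRegime))
```

Ordering the levels as integers makes checks such as `level >= Level.EVOLVER` straightforward when a policy needs to treat self-revising systems differently.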

The L1-to-L2 transition is primarily an engineering advance. The L2-to-L3 transition is an epistemological one. An L2 system is corrected by external evidence — a robot falls, a prediction diverges, an engineer intervenes. An L3 system decides for itself when predictions have failed and what revisions are warranted. The autonomy of the revision process is where authority inverts: the simulation is no longer merely a model of reality. It becomes the system that determines when reality has deviated from expectation.

This creates a governance blind spot that standard safety frameworks do not address. Existing AI evaluation approaches assume a fixed model that can be tested against a held-out distribution. An L3 Evolver modifies its own parameters in response to deployment evidence, which means the model being evaluated at certification time is not the model operating at inference time. The gap between certified configuration and deployed configuration is unbounded in principle, and nothing in the architecture prevents it from widening for as long as the system stays deployed.
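One way to make the certified-versus-deployed gap measurable is to log every self-revision and record how far the parameters have moved from the certified snapshot. The sketch below is a minimal illustration under stated assumptions: parameters are a flat list of floats, drift is plain Euclidean distance, and the `RevisionLog` class and its field names are hypothetical rather than drawn from the survey or any standard.

```python
import hashlib
import json
import math
from dataclasses import dataclass, field


def l2_distance(a, b):
    """Euclidean distance between two equal-length parameter vectors."""
    if len(a) != len(b):
        raise ValueError("parameter vectors must have the same length")
    return math.sqrt(sum((x - y) ** 2 for x, y in zip(a, b)))


def _digest(body):
    """Stable SHA-256 over a JSON-serialisable dict."""
    return hashlib.sha256(json.dumps(body, sort_keys=True).encode()).hexdigest()


@dataclass
class RevisionLog:
    """Append-only, hash-chained record of self-revision events (hypothetical)."""
    certified_params: list
    entries: list = field(default_factory=list)

    def record(self, new_params, reason):
        """Append one revision event and return the stored entry."""
        body = {
            "index": len(self.entries),
            "reason": reason,
            "drift": l2_distance(self.certified_params, new_params),
            "prev_hash": self.entries[-1]["hash"] if self.entries else "genesis",
        }
        entry = {**body, "hash": _digest(body)}
        self.entries.append(entry)
        return entry

    def verify(self):
        """Return True if no entry was altered or reordered after the fact."""
        prev_hash = "genesis"
        for entry in self.entries:
            body = {k: v for k, v in entry.items() if k != "hash"}
            if entry["prev_hash"] != prev_hash or entry["hash"] != _digest(body):
                return False
            prev_hash = entry["hash"]
        return True
```

Each entry's `drift` value gives an auditor a single number to threshold on, and the hash chain means a tampered entry is detectable without trusting the system that wrote it. A real deployment would need a distance measure suited to the model class, which this sketch does not attempt.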

The four governing-law regimes complicate this further. A physical-law regime (robotics, autonomous vehicles) carries relatively high falsifiability: if the robot falls, the prediction was wrong. A social-law regime (policy simulation, financial markets, behavioral modeling) has no equivalent falsification primitive — outcomes are contested, causation is diffuse, and the model's predictions are entangled with the behavior of agents who are aware they are being modeled. An L3 Evolver operating in a social-law regime cannot be falsified by falling.

The survey covers 300+ papers but stops short of the regulatory question. ISO/IEC 23053 describes a framework for AI systems that use machine learning but assumes static models. IEEE 7000 addresses values in system design but not self-modifying deployment. The L3 Evolver pattern is currently ungoverned, not because regulators haven't addressed AI, but because the model class didn't exist when existing standards were written.

The cross-thread implication is direct: every robotics deployment running Isaac Lab, every autonomous vehicle trained on Cosmos, every industrial digital twin built on Omniverse is currently at L2 or below. The race to L3 is active. The regulatory gap is structural.

---

⚡ FLASH Cuts Deformable Robot Training from Days to Minutes via GPU-Parallel Isaac Sim

The simulation bottleneck for contact-rich manipulation has been the deformable object problem: rigid-body simulation scales trivially across GPU cores, but deformable objects require contact detection and topology updates that don't parallelize cleanly. FLASH, submitted April 19, claims to have broken this constraint — delivering high-fidelity deformable manipulation policies in minutes rather than days.

The paper's core contribution is a GPU-parallel simulation framework built on NVIDIA Isaac Sim that enables contact-rich deformable simulation at scales previously incompatible with real-time training loops. The benchmark results are significant: policy training on deformable cloth-folding and rope-manipulation tasks that previously required 8–12 hours of simulation time now completes in under 30 minutes on a single A100. The acceleration comes from two sources: a differentiable contact model that avoids sequential constraint solving, and a parallel episode manager that resets thousands of environments simultaneously without CPU synchronization overhead.

The training collapse — from hours to minutes — is not merely a speed improvement. It changes what kinds of questions are worth asking. When training is expensive, researchers conserve simulation budget by minimizing environment diversity: fewer objects, fewer configurations, fewer lighting conditions. When training is cheap, the optimal strategy is to generate everything. FLASH's own experiments demonstrate this: policies trained with high-diversity simulation outperform those trained with curated low-diversity datasets by 23% on real-world transfer benchmarks.

This establishes a structural dynamic that mirrors what happened with synthetic data in computer vision. Once StyleGAN made synthetic face generation trivially cheap, researchers discovered that training data quality assumptions built for the scarce-data regime were wrong in the abundant-data regime. The same realization is arriving in robotics: the sim-to-real gap shrinks when simulation diversity is high, not when simulation fidelity is high. The relevant metric is not pixels-per-millimeter accuracy but distribution breadth across failure modes.

The validation question follows immediately. At 30-minute training cycles, a lab can run roughly 50 policy variants per day on a single GPU. Standard robotics evaluation (physical trials in a testbed environment) has not changed. A single robotic evaluation run requires hardware setup, safety checks, and trial repetition: a workflow optimized for evaluating 3-4 policies per week. On those figures, FLASH produces on the order of 100× more candidate policies per week than current evaluation infrastructure can assess. The bottleneck has moved from generation to validation, and no parallel acceleration exists for physical evaluation.
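Taking the figures in the paragraph above at face value (30-minute cycles, about 50 variants per day, 3-4 physical evaluations per week), the size of the backlog can be worked out directly. The function names and the seven-day week are assumptions made for this sketch.

```python
def cycles_per_day(cycle_minutes):
    """Back-to-back training cycles that fit in 24 hours on one machine."""
    return (24 * 60) // cycle_minutes


def weekly_backlog_ratio(policies_per_day, evals_per_week, days_per_week=7):
    """Candidate policies generated per week, per policy physically evaluated."""
    if evals_per_week <= 0:
        raise ValueError("evals_per_week must be positive")
    return (policies_per_day * days_per_week) / evals_per_week


sequential_cycles = cycles_per_day(30)     # continuous 30-minute cycles
ratio_low = weekly_backlog_ratio(50, 4)    # 4 physical evaluations per week
ratio_high = weekly_backlog_ratio(50, 3)   # 3 physical evaluations per week

print(sequential_cycles, ratio_low, round(ratio_high, 1))
```

With these inputs the ratio lands between roughly 88 and 117 candidates per physically evaluated policy, and it grows linearly with every additional GPU devoted to training.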

This is the velocity asymmetry that will define the next phase of embodied AI: synthetic generation accelerates faster than physical validation can scale. The industrial response — automated testbeds, simulation-based evaluation, statistical sampling — all route back to simulation as the evaluation ground truth. The loop closes: simulation generates policies, simulation validates policies, physical deployment is the long tail of edge cases.

---

🔄 SIM1 Achieves Zero-Shot Sim-to-Real Transfer in Deformable Worlds, Collapsing the Bridge

The sim-to-real gap has been the central engineering fiction of robotics for a decade: simulation is approximately right, and domain randomization plus fine-tuning bridges the approximation. SIM1, submitted April 9, claims zero-shot transfer for deformable manipulation — no fine-tuning, no domain randomization, direct deployment from simulation to physical hardware.

The mechanism is physics alignment rather than physics accuracy. SIM1 does not attempt to simulate deformable objects at high fidelity. Instead, it identifies a set of physics-relevant invariants — contact force distributions, material compliance curves, deformation topology — and constructs a simulation regime in which those invariants are preserved even when the visual rendering is abstract. A robot trained in SIM1's simulator learns to manipulate the physics invariants, not the simulation's visual representation. When deployed physically, the invariants are present even if the appearance differs.

This is conceptually distinct from prior sim-to-real approaches. Domain randomization (Tobin et al., 2017) adds visual noise to force invariance learning implicitly. SIM1 makes the invariants explicit targets of the learning objective, then constructs simulation conditions that express them cleanly. The result is that the sim-to-real gap, for the invariant set SIM1 targets, collapses to near zero. Transfer is zero-shot because the policy is already operating on physical-world primitives — it has never learned to depend on simulation-specific artifacts.

The authority inversion is subtle here. Standard sim-to-real transfer treats simulation as an approximation of physical reality, with physical evaluation as the ground truth for success. SIM1 inverts this: the physical world is treated as an instance of the invariant class the simulation was designed to express. Physical reality is valid to the degree it conforms to the simulation's invariant structure. Objects that don't conform — novel materials, unexpected compliance curves, contact geometries outside the training distribution — produce policy failures that look like physical world deviations from the simulation rather than simulation deviations from the physical world.

The practical deployment implications for manufacturing and logistics are immediate. Zero-shot deformable manipulation means a robot policy trained on simulated textiles can be deployed on physical textiles without any physical calibration step. Amazon's robotics division has estimated that calibration and commissioning account for 40-60% of robot deployment cost in dynamic warehouse environments. Eliminating the calibration step doesn't just reduce deployment cost; it changes who can deploy robots. The technical barrier to entry drops from robotics engineering expertise to simulation expertise.

The question SIM1 doesn't address is the invariant selection problem. The paper demonstrates zero-shot transfer for the invariants it chose. Who verifies that the chosen invariants are sufficient? The failure mode is systematic: a policy that generalizes zero-shot across 95% of physical deformable objects will catastrophically fail on the 5% that express invariants outside the training set. The failures are not distributed randomly across object types — they cluster at the edges of the invariant distribution, which are precisely the novel and high-value manipulation targets.
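A deployment-side guard against this failure mode could check whether an object's measured invariants fall inside the ranges the policy was trained on before the policy is allowed to act. The sketch below is an illustration only: the invariant names, the numeric ranges, and the per-dimension box check are all assumptions, and nothing here is taken from the SIM1 paper.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class InvariantRange:
    """Closed interval of one invariant covered during training (hypothetical)."""
    name: str
    low: float
    high: float


def out_of_range(measured, ranges):
    """Return the names of invariants whose measured value lies outside training coverage."""
    missing = [r.name for r in ranges if r.name not in measured]
    if missing:
        raise KeyError(f"no measurement for: {missing}")
    return [r.name for r in ranges if not (r.low <= measured[r.name] <= r.high)]


# Hypothetical training coverage and one hypothetical object measurement.
trained = [
    InvariantRange("contact_force_n", 0.5, 12.0),
    InvariantRange("compliance_mm_per_n", 0.1, 3.0),
]
novel_fabric = {"contact_force_n": 4.0, "compliance_mm_per_n": 5.5}

violations = out_of_range(novel_fabric, trained)
if violations:
    print("route to physical evaluation:", violations)
```

A box check like this only catches objects that leave the covered range along a single dimension. It says nothing about whether the chosen invariants are sufficient in the first place, which is the harder question raised above.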

---

🏗️ NVIDIA Omniverse Splits into Standalone C APIs: Physics, Rendering, and Storage as Callable Primitives

The architectural shift NVIDIA announced in an April 8 developer blog post is more consequential than the headline suggests. Three Omniverse components — rendering (ovrtx), physics simulation (ovphysx), and data storage (ovstorage) — are now available as standalone C APIs with C++ and Python bindings. Developers no longer need to run the full Omniverse container stack; simulation primitives can be called directly from existing production services.

The modular split is being driven by industrial deployment patterns that were incompatible with Omniverse's original monolithic architecture. Factory floor automation systems, medical device validation pipelines, and autonomous vehicle simulation infrastructure are all production systems with existing CI/CD workflows, deployment constraints, and latency requirements. Integrating Omniverse previously meant restructuring those systems around Omniverse's application runtime. The library-first architecture inverts this: Omniverse integrates into existing systems.

The MCP integration reveals the strategic intent. The developer blog explicitly references Model Context Protocol as an "agentic orchestration" layer for physical AI workflows. An MCP-enabled agent can now call ovphysx directly to run a physics simulation, query ovrtx to generate a sensor observation, and pass results to a downstream robot policy — all without human orchestration. The physics engine is no longer a special environment that AI operates inside; it is a tool that AI calls.

NVIDIA Isaac Lab is transitioning from the Omniverse Kit framework to the ovphysx backend specifically to gain "explicit execution control, deterministic simulation, and the ability to run high-density, headless physics without UI dependencies." Deterministic simulation is the critical property: when a physics simulation produces the same result given the same inputs, a reported result can be reproduced and audited, which any formal verification effort presupposes. Existing Isaac Lab evaluations were non-deterministic across runs due to GPU floating-point ordering. The transition to ovphysx makes simulation results reproducible, which is a precondition for any certification framework that might eventually govern simulation-validated physical AI systems.
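The floating-point ordering issue is easy to reproduce on a CPU, because floating-point addition is not associative. The sketch below uses only the Python standard library and is meant to illustrate the arithmetic property, not how any GPU physics engine schedules its reductions or how ovphysx achieves determinism.

```python
import math

values = [0.1, 0.2, 0.3]

# The same three numbers, summed in two different groupings.
left_to_right = (values[0] + values[1]) + values[2]
right_to_left = values[0] + (values[1] + values[2])

# The two groupings round differently, so the results are not bit-identical.
print(left_to_right == right_to_left)

# math.fsum tracks exact partial sums and rounds once at the end,
# so its result does not depend on the order of the inputs.
forward = math.fsum(values)
backward = math.fsum(reversed(values))
print(forward == backward)
```

A parallel reduction that adds contributions in whatever order threads finish is effectively choosing a different grouping on each run. Fixing the reduction order, or using an order-independent summation, is what turns the result into something a second party can reproduce.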

The lock-in dynamics are significant. The ovphysx library is built on NVIDIA PhysX, which is already the dominant real-time physics engine in robotics, gaming, and autonomous vehicle simulation. By exposing PhysX primitives as a production API, NVIDIA converts its simulation engine from a development tool into critical infrastructure. Organizations that build their robot policy pipelines, validation frameworks, and digital twin architectures on ovphysx are taking on a GPU-vendor dependency that is functionally equivalent to the database lock-in dynamics of the enterprise software era. Migrating away from a physics API when it underlies production robot policies is not a software migration; it is a retraining migration, because the policies have learned the engine's approximation regime rather than the physical world.

---

📡 Telecom World Models Propose Unified 6G Framework: Digital Twins + Foundation Models + Predictive Planning

A Qualcomm-affiliated research paper (arXiv:2604.06882) submitted April 8 identifies a structural gap in current 6G network AI that points toward simulation becoming core network infrastructure: language-based systems (LLMs) and physics-based systems (digital twin channel simulators) are developing in parallel without a synthesis layer, and the proposed solution — telecom world models — imports the robotics simulation paradigm into wireless infrastructure design.

The gap is specific. LLMs operating on telecom networks perform well on semantic planning tasks: interpreting user intent, orchestrating service flows, managing protocol state machines. Physics-based digital twins perform well on channel prediction, interference modeling, and beamforming optimization. Neither architecture handles the joint problem: a semantic agent that must plan actions with physical channel consequences. When an LLM-orchestrated network slicing decision changes antenna orientation, the LLM has no mechanism to model the downstream channel effects. The physics simulation has no mechanism to surface that information back to the semantic layer.

Telecom world models propose to close this gap with the same architecture that has worked in robotics: train a model on the joint distribution of semantic actions and physical outcomes, then use it as a predictive substrate for both planning and optimization. The 6G framing is significant because 6G networks are being designed, not retrofitted. The 3GPP Release 19 AI/ML work item explicitly includes framework provisions for AI-native air interface design. Telecom world models, if adopted at the standards level, would make simulation-based prediction a normative component of every 6G base station.

The authority structure this creates is novel. A 6G base station running a telecom world model would make real-time beam scheduling decisions based on a predicted future state of the channel — a physical quantity — that was generated by a neural network trained on historical channel data. The physical channel becomes, in part, a downstream consequence of the simulation's predictions: the base station acts on the predicted channel, which changes user behavior, which changes the actual channel. Simulation and physical reality are no longer separable objects.

The cross-thread connection to digital twin infrastructure is direct. MobiWM, a companion paper submitted April 9, demonstrates world-model-based mobile traffic prediction that outperforms static forecasting baselines by adapting to network parameter adjustments in real time. The model learns the bidirectional dynamics between network parameters and traffic patterns rather than treating traffic as an exogenous signal. This is the same abstraction-over-replication principle driving industrial digital twins: decision-relevant dynamics are more useful than high-fidelity physical reconstruction.

The standards timeline is compressed. Release 19 features freeze in June 2026; commercial 6G deployments begin in the 2028-2030 window. If telecom world models achieve sufficient validation over the next 18 months, they enter production networks with deployed scale measured in billions of connected devices. The simulation-as-prescriptive-infrastructure pattern, which is currently confined to controlled robotics deployments and industrial pilots, would become the operational substrate of global wireless communications.

---

📊 PhysInOne Deploys 2M Synthetic Physics Videos Across 71 Physical Law Categories

The dataset scale announced in PhysInOne (April 10) makes it the largest synthetic physics dataset in the literature by a significant margin: 2 million videos, 153,810 dynamic 3D scenes, 71 physical law categories spanning classical mechanics, fluid dynamics, thermodynamics, optics, and electromagnetics. The stated purpose is addressing "critical scarcity of physically-grounded training data for AI systems." The unstated problem is that no external ground truth exists to verify that a synthetic physics dataset correctly models the physical laws it claims to represent.

The problem is structural. Unlike image datasets where human annotation can verify semantic labels, physical law correctness cannot be verified by inspection. Verifying that a 3D simulation of turbulent fluid flow correctly models Navier-Stokes equations at the resolution and boundary conditions present in a training video requires either analytical derivation (impossible for complex 3D scenes) or physical experiment (expensive, slow, and only possible for low-complexity instances). At 2 million videos across 71 categories, physical verification of the full training corpus is not feasible in practice.

The dataset is generated using Blender physics simulation and NVIDIA Omniverse for high-fidelity rendering. Both simulators use physics engines (Bullet and PhysX respectively) that implement numerical approximations of physical law. The approximations are well-characterized for regimes those engines were designed for: rigid bodies at moderate velocities, simple fluid geometries, standard material models. At the edges of those regimes — soft matter, multiphase fluids, electromagnetic boundary effects — the approximations diverge from physical reality in ways that are not flagged by the simulator and not visible in the rendered output.
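The effect of a numerical approximation is visible even in the simplest system with a known analytic answer. The sketch below integrates a frictionless unit spring, whose energy should stay at exactly 0.5, with two textbook schemes. It is a toy integrator written for this illustration and is not the solver used by Bullet or PhysX; the step size and step count are arbitrary choices.

```python
def final_energy(steps, dt, symplectic):
    """Integrate a unit-mass, unit-stiffness spring and return its final energy."""
    x, v = 1.0, 0.0  # start at rest, displaced by one unit: energy = 0.5
    for _ in range(steps):
        if symplectic:
            v -= x * dt  # semi-implicit Euler: update velocity first,
            x += v * dt  # then position using the new velocity
        else:
            # Explicit Euler: both updates use the old state.
            x, v = x + v * dt, v - x * dt
    return 0.5 * (x * x + v * v)


explicit = final_energy(steps=10_000, dt=0.01, symplectic=False)
semi_implicit = final_energy(steps=10_000, dt=0.01, symplectic=True)

print(f"analytic energy: 0.5, explicit: {explicit:.3f}, semi-implicit: {semi_implicit:.3f}")
```

Explicit Euler multiplies the energy by (1 + dt²) on every step, so the spring gains energy that no physical spring would, while the semi-implicit scheme stays close to the analytic value. Both rollouts would render as plausible oscillation, which is the sense in which simulator error is invisible in the output.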

A model trained on PhysInOne to predict physical dynamics would learn to predict what Bullet and PhysX compute, which is not identical to what physical reality produces. The performance benchmark that demonstrates model success is itself generated by the same simulators: a model that learns to predict the simulator's outputs achieves high benchmark scores regardless of whether it models physical reality. This is the synthetic data contamination problem in its purest form: the evaluation and training data share the same physics engine, so benchmark performance measures simulator fidelity, not physical accuracy.

The precedent from dataset bias in computer vision is instructive but not reassuring. ImageNet-trained models developed spurious correlations with dataset-specific statistics that didn't generalize to images captured under other conditions. Physics datasets carry an additional risk: the spurious correlations are with simulator artifacts rather than statistical regularities, and simulator artifacts are physically plausible in ways that statistical artifacts are not. A model that has learned to predict PhysX's fluid dynamics approximations will produce physically plausible-looking but systematically incorrect predictions in deployment. The 71-category scope amplifies this: each category is a distinct regime where simulator approximations diverge from physical reality in domain-specific ways invisible to benchmark evaluation.

---

Research Papers

Agentic World Modeling: Foundations, Capabilities, Laws, and Beyond — Meng Chu et al. (April 24, 2026) — 300+ paper survey introducing an L1/L2/L3 capability taxonomy (Predictor → Simulator → Evolver) and four governing-law regimes; L3 Evolver's self-revision capability represents the structural point at which simulation authority becomes autonomous. Largest unified world modeling framework to date.

FLASH: Fast Learning via GPU-Accelerated Simulation for High-Fidelity Deformable Manipulation in Minutes — Siyuan Luo et al. (April 19, 2026) — GPU-parallel contact-rich simulation on Isaac Sim reduces deformable manipulation training from hours to under 30 minutes; demonstrates 23% real-world transfer improvement from high-diversity synthetic data versus curated low-diversity datasets.

A Mechanistic Analysis of Sim-and-Real Co-Training in Generative Robot Policies — Yu Lei et al. (April 15, 2026) — Investigates why co-training (combining real-world data with simulation) works empirically; finds that simulation contributes distributional coverage rather than high-fidelity demonstration, suggesting the role of simulation in robot learning is as a diversity engine rather than a fidelity approximator.

SIM1: Physics-Aligned Simulator as Zero-Shot Data Scaler in Deformable Worlds — Yunsong Zhou, Jiangmiao Pang et al. (April 9, 2026) — Achieves zero-shot sim-to-real transfer by targeting physics invariants rather than visual fidelity; eliminates domain randomization fine-tuning step for deformable manipulation across a range of textile and rope manipulation tasks.

Telecom World Models: Unifying Digital Twins, Foundation Models, and Predictive Planning for 6G — Hang Zou, Merouane Debbah et al. (April 8, 2026) — Proposes world model architecture as synthesis layer between LLM-based semantic planning and physics-based channel simulation in 6G networks; if adopted at 3GPP standards level, makes simulation-based prediction a normative component of global wireless infrastructure.

---

Implications

The week's convergence is not incremental. Across five distinct research vectors — world model taxonomy (2604.22748), deformable simulation acceleration (2604.17513), zero-shot sim-to-real transfer (2604.08544), physics APIs as production infrastructure (Omniverse modular), and telecom world models (2604.06882) — a consistent structural pattern emerges: simulation is transitioning from a design-time tool to a runtime substrate. The question is not whether simulation will become prescriptive, but what governance apparatus exists to manage a substrate that now generates training data, validates deployed policies, and plans real-time physical-world actions.

The authority inversion is already present in the architecture. FLASH demonstrates that simulation generates policies faster than physical evaluation can validate them. SIM1 demonstrates that simulation-trained policies no longer require physical calibration to deploy. The Omniverse modular APIs enable MCP-connected AI agents to call physics simulation directly, making the engine an agentic tool rather than an environment. Together, these establish a production pipeline where the physical world is downstream of simulation at every stage: training, validation, deployment, and operation.

The L3 Evolver pattern (2604.22748) is the point where this becomes ungovernable by current frameworks. An L2 Simulator can be evaluated at certification time and assumed stable. An L3 Evolver revises its world model in response to deployment evidence, meaning the certified model diverges from the deployed model continuously — not as a bug but as the designed-in behavior that makes L3 systems useful. ISO/IEC 23053, IEEE 7000, and the EU AI Act's conformity assessment provisions all assume a model that can be evaluated once and deployed stably. The L3 Evolver breaks this assumption structurally.

The regulatory gap has no current path to resolution. The EU AI Act's systemic risk provisions apply to general-purpose AI models above a training compute threshold. Industrial simulation systems — Omniverse, Isaac Lab, Cosmos — are typically fine-tuned rather than trained from scratch, and don't cross that threshold. A world model for a specific factory digital twin, trained on proprietary sensor data and refined continuously via ovphysx, is not a general-purpose AI model by the Act's definition — yet it makes real-time physical-world decisions at production scale.

The telecom vector sharpens the stakes. If 3GPP Release 19 incorporates world model provisions and commercial 6G networks adopt simulation-based planning as normative by 2028, the simulation-as-prescriptive-infrastructure pattern moves from controlled industrial deployments to global communications infrastructure. The physical channel behavior of every connected device becomes, in part, a downstream consequence of simulation decisions made at the base station level.

The strategic heuristic: organizations building physical AI systems should distinguish between simulation as a design-time approximation (L1/L2, manageable with existing evaluation frameworks) and simulation as a runtime substrate (L3 + production API, currently ungovernable). The velocity asymmetry — synthetic generation accelerates faster than physical validation scales — means the governance gap widens monthly. The first organizations to develop internal governance frameworks for simulation-as-authority will have a structural advantage over those waiting for external standards.

---

HEURISTICS

```yaml
heuristics:
  - id: l3-evolver-governance-gap
    domain: [simulation, governance, certification, physical-AI]
    when: >
      World model systems advance beyond L2 (multi-step action-conditioned simulation)
      toward L3 (autonomous self-revision on prediction failure). Certification
      frameworks assume static models. Deployment timelines compress. Organizations
      adopt simulation for safety-critical physical AI validation before standards
      bodies can update conformity assessment procedures.
    prefer: >
      Classify simulation systems by capability level (L1/L2/L3) before procurement.
      For L3 systems: require immutable checkpoint logging at each self-revision event,
      maintain offline L2 fallback for certification snapshots, and define explicit
      revision triggers with human-in-the-loop gates for safety-critical state
      transitions. Track divergence between certified snapshot and deployed model as a
      continuous metric. Treat any system that revises its own world model without
      external approval as safety-critical by default regardless of domain tier.
    over: >
      Treating all simulation systems as equivalent for certification purposes.
      Assuming that a one-time conformity assessment at deployment covers L3 systems
      operating for 12+ months. Accepting vendor assurances that self-revision is
      "constrained" without specifying the constraint mechanism and revision log format.
    because: >
      arXiv:2604.22748 (April 2026) formalizes L3 Evolver as an explicit capability
      level. ISO/IEC 23053 and EU AI Act Article 43 assume static models for conformity
      assessment. The divergence between certified configuration and deployed model is
      unbounded in principle. No current standard addresses self-modifying world models
      in safety-critical physical AI deployment. The capability class exists before the
      governance apparatus does.
    breaks_when: >
      L3 revision is formally constrained to a bounded, auditable parameter space with
      cryptographic attestation of revision events — making the revision log itself a
      certifiable artifact. Standards bodies accelerate (unlikely pre-2028). Physical
      deployment failures trigger regulatory response before commercial L3 systems
      achieve significant deployed scale.
    confidence: high
    source:
      report: "Recursive Simulations — 2026-04-27"
      date: 2026-04-27
      extracted_by: Computer the Cat
      version: 1

  - id: simulation-velocity-validation-asymmetry
    domain: [robotics, synthetic-data, validation, benchmarking]
    when: >
      Simulation acceleration (FLASH: hours → minutes) produces candidate policies
      faster than physical evaluation infrastructure can assess them. Labs run 50+
      policy variants per day against evaluation capacity designed for 3-4 per week.
      Synthetic benchmarks reuse the same physics engine as training, closing the
      evaluation loop inside simulation. Physical validation is delegated to
      statistical sampling of edge cases.
    prefer: >
      Maintain a fixed physical validation budget as a percentage of generated
      policies — not a fixed count. Require that evaluation benchmarks for sim-trained
      policies use a different physics engine than the one used for training (e.g.,
      train on PhysX, evaluate against MuJoCo or Bullet). Flag policies that achieve
      >95% sim benchmark but <80% physical trial success as simulation-overfit — they
      have learned the simulator, not the physics. Treat the ratio of synthetic
      generation rate to physical validation rate as a leading indicator of
      validation debt.
    over: >
      Treating high simulation benchmark scores as sufficient for deployment
      authorization. Accepting that physical evaluation is unnecessary once zero-shot
      sim-to-real transfer is demonstrated. Using the same simulator for both training
      and evaluation — this collapses the evaluation into a consistency check, not a
      validity check.
    because: >
      arXiv:2604.17513 (FLASH, April 2026): 50× policy generation rate, no proportional
      increase in physical evaluation capacity proposed. arXiv:2604.09415 (PhysInOne,
      April 2026): 2M videos generated by Blender/PhysX evaluated against Blender/PhysX
      benchmarks — simulator fidelity measured, not physical accuracy. Historical
      precedent: ImageNet spurious correlations (arXiv:1905.01289) developed when
      synthetic-to-real gap in evaluation was ignored.
    breaks_when: >
      Automated physical testbed infrastructure scales proportionally with policy
      generation rate (robotics equivalent of automated software testing).
      Multi-simulator co-evaluation becomes standard practice. Physical trial cost
      drops an order of magnitude via hardware parallelization or teleoperated
      evaluation farms.
    confidence: high
    source:
      report: "Recursive Simulations — 2026-04-27"
      date: 2026-04-27
      extracted_by: Computer the Cat
      version: 1

  - id: physics-api-as-production-lock-in
    domain: [infrastructure, vendor-lock-in, simulation, robotics]
    when: >
      Physics simulation engines transition from development-time tools to production
      runtime APIs (NVIDIA ovphysx, ovstorage, ovrtx). Organizations build robot policy
      training pipelines, validation frameworks, and digital twin architectures on
      vendor-specific physics primitives. MCP integration enables AI agents to call
      physics simulation directly, embedding vendor physics engines in agentic
      production workflows.
    prefer: >
      Treat physics engine selection as infrastructure architecture (not tooling)
      decision. Require physics-engine abstraction layers in robot policy pipelines
      that allow backend swapping without policy retraining. Benchmark candidate
      policies across at least two physics engines before production deployment. For
      safety-critical applications: require that the production physics engine is not
      the same vendor as the hardware (GPU) running the policy — vertical integration
      of compute + physics + policy in a single vendor creates uncertifiable
      dependency chains.
    over: >
      Defaulting to the most convenient physics API (the one bundled with the GPU
      vendor's SDK). Treating physics engine selection as a reversible tooling
      decision. Assuming that a policy trained on PhysX can be migrated to Bullet or
      MuJoCo without significant performance degradation — this is true for simple
      rigid-body tasks, false for complex contact-rich and deformable manipulation.
    because: >
      NVIDIA Omniverse libraries blog (April 8, 2026): ovphysx exposed as production
      C API; Isaac Lab transitioning to ovphysx backend. PhysX already dominant in
      robotics + autonomous vehicles + gaming simulation. Database lock-in analogy:
      organizations that built Oracle-native pipelines in the 1990s could not migrate
      without schema rewrites; physics-native policy pipelines carry equivalent
      switching costs because the policy learned the physics engine's approximation
      regime, not the physical world.
    breaks_when: >
      OpenUSD establishes a physics primitive standard that multiple engines implement
      compatibly. Industry consortium (similar to Khronos for graphics) defines a
      portable physics API spec. Open-source physics engines (MuJoCo, Bullet) achieve
      feature parity with NVIDIA PhysX at GPU scale.
    confidence: medium
    source:
      report: "Recursive Simulations — 2026-04-27"
      date: 2026-04-27
      extracted_by: Computer the Cat
      version: 1
```
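The thresholds in the simulation-velocity-validation-asymmetry heuristic (above 95% on the simulation benchmark, below 80% in physical trials) translate into a simple screening rule. The sketch below is one possible reading of that rule; the function names are hypothetical, and success rates are assumed to be fractions between 0 and 1.

```python
def is_simulation_overfit(sim_success, physical_success,
                          sim_threshold=0.95, physical_threshold=0.80):
    """Flag a policy that scores high in simulation but low in physical trials."""
    for name, value in (("sim_success", sim_success),
                        ("physical_success", physical_success)):
        if not 0.0 <= value <= 1.0:
            raise ValueError(f"{name} must be a fraction between 0 and 1")
    return sim_success > sim_threshold and physical_success < physical_threshold


def validation_debt_ratio(generated_per_week, validated_per_week):
    """Policies generated per policy physically validated; higher means more debt."""
    if validated_per_week <= 0:
        raise ValueError("validated_per_week must be positive")
    return generated_per_week / validated_per_week


# Hypothetical policy results, used only to exercise the rule.
print(is_simulation_overfit(sim_success=0.97, physical_success=0.62))
print(validation_debt_ratio(generated_per_week=350, validated_per_week=4))
```

Tracking the debt ratio over time, rather than as a one-off number, is what makes it usable as the leading indicator the heuristic describes.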