AGI/ASI Frontiers · 2026-05-02

🧠 GPT-5.5 Achieves 35.4% FrontierMath Tier 4 and Proves New Ramsey Theorem in Lean
📜 OpenAI's "Our Principles" Manifesto Names Power Concentration as Primary AGI Risk
🧬 GPT-5.5 Bio Bug Bounty Marks First AI-Specific Biosafety Red-Teaming Program at Capability Frontier
🤝 Microsoft-OpenAI Amended Agreement Converts Exclusive IP License to Non-Exclusive Through 2032
🎼 Symphony Open-Sources Agentic Software Orchestration Spec, Proves 500% PR Lift at Scale
🔬 DeepMind Publishes 10-Domain AGI Cognitive Taxonomy with $200K Evaluation Hackathon

---

🧠 GPT-5.5 Achieves 35.4% FrontierMath Tier 4 and Proves New Ramsey Theorem in Lean

GPT-5.5, released April 23 to ChatGPT Plus, Pro, Business, and Enterprise users, represents a qualitative threshold in agentic capability, not merely an incremental benchmark gain. On FrontierMath Tier 4—problems estimated to require expert mathematicians days to weeks—GPT-5.5 reaches 35.4%, up from 27.1% for GPT-5.4: a 31% relative gain at the hardest tier. On Terminal-Bench 2.0, which tests complex command-line workflows requiring planning, iteration, and tool coordination, GPT-5.5 hits 82.7% versus 75.1% for its predecessor.
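The relative-gain figure follows from simple arithmetic on the two reported scores. A minimal sketch, using only the percentages quoted above; the helper name `relative_gain` is illustrative and not taken from any OpenAI tooling:

```python
def relative_gain(new: float, old: float) -> float:
    """Return the improvement of `new` over `old` as a percentage of `old`."""
    return (new - old) / old * 100.0


# Scores as reported in the text (percent of tasks solved).
benchmarks = {
    "FrontierMath Tier 4": {"GPT-5.5": 35.4, "GPT-5.4": 27.1},
    "Terminal-Bench 2.0": {"GPT-5.5": 82.7, "GPT-5.4": 75.1},
}

for name, scores in benchmarks.items():
    absolute = scores["GPT-5.5"] - scores["GPT-5.4"]
    relative = relative_gain(scores["GPT-5.5"], scores["GPT-5.4"])
    print(f"{name}: +{absolute:.1f} points absolute, +{relative:.1f}% relative")
```

The FrontierMath calculation gives about 30.6%, which the text rounds to 31%; the same calculation for Terminal-Bench 2.0 gives roughly 10.1%.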

The structurally significant finding is not benchmark performance but a new Lean-verified proof about off-diagonal Ramsey numbers, a central object in combinatorics. Results in this domain are rare and technically demanding; the proof was produced by an internal GPT-5.5 version running a custom harness, then externally verified in Lean. This distinguishes capability demonstrated in an evaluation from capability deployed in original research—a threshold the AI safety community has tracked carefully as a marker of genuine co-scientist status rather than sophisticated retrieval.
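For readers unfamiliar with the object in question, the standard textbook definition of a two-color Ramsey number is given below. It is background only: the source does not state which asymptotic result the GPT-5.5 proof establishes, so nothing here describes the new theorem itself.

```latex
% R(s,t): the least n such that every red/blue colouring of the edges of the
% complete graph K_n contains a red K_s or a blue K_t.
\[
  R(s,t) \;=\; \min\Bigl\{\, n \in \mathbb{N} \;:\;
    \forall\, c \colon E(K_n) \to \{\text{red},\text{blue}\},\;
    c \text{ contains a red } K_s \text{ or a blue } K_t \,\Bigr\}
\]
% "Off-diagonal" refers to the case s \neq t, typically with s fixed and t growing.
% A classical small example on the diagonal: R(3,3) = 6.
```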

On Expert-SWE, OpenAI's internal benchmark for software engineering tasks with a median estimated human completion time of 20 hours, GPT-5.5 outperforms GPT-5.4. The Artificial Analysis Intelligence Index—a weighted average of 10 evals run by an independent party—shows GPT-5.5 delivering state-of-the-art intelligence at half the cost of competitive frontier coding models, challenging the assumption that capability gains require proportional compute cost increases.

Scientific capability is the sharpest signal. On GeneBench, covering multi-stage genetic and quantitative biology workflows where tasks correspond to multi-day expert projects, GPT-5.5 shows clear improvement over GPT-5.4. On BixBench, real-world bioinformatics and data analysis, GPT-5.5 achieves leading published scores. Immunology researcher Derya Unutmaz analyzed a 62-sample, 28,000-gene expression dataset in a session he described as compressing months of team effort.

The inference architecture is also noteworthy. GPT-5.5 was co-designed for and served on NVIDIA GB200 and GB300 NVL72 systems, with Codex contributing load-balancing heuristics that increased token generation speeds by over 20%. In effect, the model helped improve its own serving infrastructure, the first confirmed instance of a production system in which the model being deployed materially contributed to the deployment stack itself. The gap between announced capability and measured capability is narrowing: the Ramsey proof, GeneBench results, and 20-hour Expert-SWE tasks provide direct evidence for claims that prior generations supported with benchmark scores alone.

---

📜 OpenAI's "Our Principles" Manifesto Names Power Concentration as Primary AGI Risk

OpenAI's April 26 principles statement is the company's most explicit public articulation of its AGI governance theory to date, and its framing of power concentration as the central risk is structurally significant. The document states directly: "Power in the future can either be held by a small handful of companies using and controlling superintelligence, or it can be held in a decentralized way by people."

The five principles—democratization, empowerment, universal prosperity, resilience, and adaptability—are not safety claims in the technical sense. Democratization is the most substantive: "key decisions about AI are made via democratic processes and with egalitarian principles, and not just made by AI labs." This is a self-constraining claim. OpenAI is asserting that its own decision-making authority should be checked by democratic mechanisms, not only internal ethics review.

The resilience principle is operationally specific. OpenAI names cybersecurity and pathogen defense as paradigm cases requiring society-wide response beyond any single lab's capacity: "there may be extremely capable models that make it easier to create a new pathogen, and we need a society-wide approach to defend against this with pathogen-agnostic countermeasures." This framing appears simultaneously in the GPT-5.5 system card and the principles document—defense capability outpacing offense capability as the strategic objective.

The universal prosperity section contains a notable fiscal acknowledgment: "our governments may need to consider new economic models to ensure that everyone can participate in the value creation in front of us." The company is signaling that standard wealth distribution mechanisms will not capture the AGI dividend. This is a distribution argument—who captures value from AGI, not whether AGI is safe—and it implies that macroeconomic policy reform is a prerequisite for the benefits the company projects.

The adaptability principle explicitly acknowledges that current principles may become wrong: "we will be transparent about when, how, and why our operating principles change." The iterative deployment strategy—releasing at each successive capability level to give society time to understand and integrate—is framed as a safety methodology rather than a commercial strategy. The company references its historical caution about releasing GPT-2 weights as evidence that iterative deployment is earned, not assumed.

The timing matters. OpenAI published this document three days after releasing GPT-5.5 and one day before restructuring its Microsoft partnership. A governance philosophy statement published contemporaneously with a major capability release and commercial restructuring reads as deliberate positioning: the lab asserting its theory of responsible AGI deployment precisely as its leverage in the ecosystem increases.

---

🧬 GPT-5.5 Bio Bug Bounty Marks First AI-Specific Biosafety Red-Teaming Program at Capability Frontier

On April 23—the same day GPT-5.5 launched in ChatGPT—OpenAI published the GPT-5.5 Bio Bug Bounty, the first AI-specific biological safety bug bounty program tied to a frontier model release. The program invites external researchers to probe GPT-5.5 for biosafety failure modes: prompting pathways, agentic workflows, and tool-use chains that could lower the barrier to biological harm.

The GPT-5.5 system card explains the safety posture directly. The model was evaluated against the full Preparedness Framework with targeted red-teaming for advanced cybersecurity and biology capabilities, and feedback from nearly 200 early-access partners before public release. For biology specifically, the card describes stricter classifiers that "some users may find annoying initially, as we tune them over time"—an acknowledgment that calibrating biosafety controls at this capability level requires real-world feedback that controlled red-teaming cannot fully provide.

The Preparedness Framework v2 has flagged cybersecurity as a priority tracking category for years. Biology is new as a public-facing bug bounty focus. The framing in the "Introducing GPT-5.5" post is explicit: frontier models are becoming more capable at cybersecurity, those capabilities will be broadly distributed, and the best path forward is ensuring they accelerate defense rather than offense. The same logic is being applied to biology, but the analogy is imperfect. Cyber defense benefits from the same capabilities that enable cyber offense. Biodefense does not: a pathogen created with AI assistance is not countered by the same AI assistance that generated it.

The significance of a public bug bounty—rather than internal red-teaming alone—is that it creates a structured adversarial feedback loop internal processes cannot replicate. Bug bounties for software vulnerabilities have decades of empirical support; applying the model to AI-specific capability risks is an attempt to import that epistemology to a domain where the failure surface is harder to bound.

The structural question the bio bug bounty raises is whether the biosafety failure surface for a language model is tractable enough for structured adversarial probing to converge. Software vulnerabilities are bounded: a buffer overflow either exists or doesn't. Biosafety failure modes for a generative model are unbounded: any new prompting technique, any new agentic workflow, any new tool integration could open new pathways. The bug bounty is a genuine safety investment, but it also signals that OpenAI does not have a complete internal picture of the model's biosafety properties—and is acknowledging this explicitly by requesting external discovery.

The combination of the bio bug bounty, Preparedness Framework evaluations, and biosafety classifiers constitutes the most operationally detailed safety posture OpenAI has published for a capability frontier model. Whether it is sufficient depends on a question the bug bounty is designed to answer.

---

🤝 Microsoft-OpenAI Amended Agreement Converts Exclusive IP License to Non-Exclusive Through 2032

The April 27 announcement of an amended Microsoft-OpenAI partnership converts what was effectively a strategic exclusivity arrangement into a more neutral commercial relationship—with structural implications for how AGI deployment authority is distributed across the industry.

The key changes: Microsoft's license to OpenAI IP for models and products runs through 2032 but is now non-exclusive, where it was previously exclusive. OpenAI can serve all its products to customers across any cloud provider, not just Azure first. Microsoft no longer pays a revenue share to OpenAI. Revenue share payments from OpenAI to Microsoft continue through 2030 at the same percentage but are subject to a total cap.

These are not symmetric changes. The non-exclusive IP license matters most: Microsoft previously held unique access to OpenAI's model IP as a competitive advantage in the enterprise AI market. That advantage is now formally gone. One day after this announcement, OpenAI deployed GPT-5.5 and Codex directly to Amazon Bedrock, demonstrating that multi-cloud posture is operational, not theoretical. Bedrock customers now access GPT-5.5, Codex CLI, and Amazon Bedrock Managed Agents powered by OpenAI—a complete enterprise stack outside Azure.

The removal of Microsoft's revenue share payment is significant in the other direction. Microsoft was previously a financier of OpenAI's research capacity through that mechanism; the relationship is now restructured around equity rather than operating revenue. Microsoft remains OpenAI's primary cloud partner for first-ship commitments and continues to participate as a major shareholder, while both companies say they will continue scaling datacenter capacity together, collaborating on next-generation silicon, and applying AI to cybersecurity.

The governance implication is underappreciated. When Microsoft held exclusive IP access, it was the single industrial gateway for enterprise OpenAI deployment. Non-exclusivity distributes that gateway function: AWS—and presumably Google Cloud next—become structurally equivalent deployment surfaces. For enterprises evaluating AGI-level tools, this removes the lock-in that previously made Azure the default decision. For OpenAI, it removes a constraint but also a revenue guarantee, shifting the financial relationship to one where OpenAI's commercial success depends more directly on its own multi-cloud revenue generation.

The OpenAI principles document, published the day before this restructuring, emphasizes democratization and resistance to power concentration. Whether the sequence is causal or coincidental, the operational effect is that AGI-level model deployment is now a multi-cloud commodity rather than an Azure-exclusive proposition. The governance question this raises: model access is now distributed, but model governance—who decides what GPT-5.5 can and cannot do—remains concentrated in a single entity regardless of how many cloud providers serve its API endpoints.

---

🎼 Symphony Open-Sources Agentic Software Orchestration Spec, Proves 500% PR Lift at Scale

OpenAI's April 27 open-source release of Symphony—an agent orchestration specification that turns project management boards into control planes for coding agents—demonstrates a structural shift in software engineering economics with implications for any knowledge work domain that can be decomposed into trackable tasks.

Symphony's core insight is that the bottleneck in agentic software development is not model capability or code quality, but human attention. Engineers supervising three to five Codex sessions simultaneously encountered context-switching failure; Symphony eliminates that constraint by making the issue tracker—not the human—the orchestration layer. On some OpenAI teams, landed pull requests increased 500% in three weeks. Linear founder Karri Saarinen noted a spike in workspaces created as Symphony launched, suggesting the effect is not internal-only.

The architectural decision that matters most is Symphony's treatment of agents as goal-directed rather than state-machine-directed. Early versions assigned tasks as strict state transitions; the current spec gives agents objectives and tools, allowing them to create follow-up tickets, resolve CI conflicts, rebase branches, retry flaky checks, and shepherd changes through the pipeline without human supervision. One engineer submitted three significant code changes from a cabin with intermittent wifi, entirely through the Linear app on his phone—the human was not present at the orchestration layer.

Symphony is technically a SPEC.md file—a document describing the orchestration problem and intended solution, implementable in any language. The reference implementation in Elixir was written by Codex in one shot; to validate the spec's completeness, OpenAI also had Codex implement it in TypeScript, Go, Rust, Java, and Python, using the inconsistencies to identify ambiguities and simplify the system. OpenAI explicitly states it will not maintain Symphony as a product; it is a demonstration of what becomes possible when code is effectively free and the orchestration layer is separable from the execution layer.

The economics of exploration shift structurally. When each code change requires human-supervised implementation, speculative tasks—testing a hypothesis, exploring a refactor—carry real opportunity cost. Symphony makes them near-zero cost: file a ticket, receive a result, discard what doesn't work. Product managers and designers at OpenAI file feature requests directly into Symphony and receive working implementations with video walkthroughs, without checking out a repository or managing a Codex session.

The enabling infrastructure is Codex App Server—a headless, programmatically accessible Codex interface with a well-documented JSON-RPC API. Symphony works because Codex can be driven programmatically; the orchestration spec connects an issue tracker to the execution layer without human mediation. The gap between a written orchestration specification and a running multi-agent software engineering team is now measured in implementation hours.
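The pattern described in this section, where the issue tracker rather than a human drives the agents, can be sketched in a few lines. This is a toy illustration under stated assumptions, not Symphony's SPEC.md and not the Codex App Server API: `Tracker`, `Ticket`, `fake_agent`, and the method name `example/runTask` are all hypothetical. Only the envelope fields (`jsonrpc`, `id`, `method`, `params`) follow the public JSON-RPC 2.0 standard.

```python
import json
from collections import deque
from dataclasses import dataclass
from itertools import count
from typing import Optional


@dataclass
class Ticket:
    id: int
    objective: str                # a goal, not a fixed state transition
    status: str = "todo"          # becomes "done" once an agent has handled it
    result: Optional[str] = None


def jsonrpc_request(request_id: int, method: str, params: dict) -> str:
    """Build a JSON-RPC 2.0 request envelope as a JSON string."""
    return json.dumps(
        {"jsonrpc": "2.0", "id": request_id, "method": method, "params": params}
    )


class Tracker:
    """In-memory stand-in for an issue tracker acting as the control plane."""

    def __init__(self) -> None:
        self._ids = count(1)
        self.tickets = {}          # ticket id -> Ticket
        self.queue = deque()       # ids of tickets waiting for an agent

    def file(self, objective: str) -> Ticket:
        ticket = Ticket(next(self._ids), objective)
        self.tickets[ticket.id] = ticket
        self.queue.append(ticket.id)
        return ticket


def fake_agent(ticket: Ticket, tracker: Tracker) -> str:
    """Hypothetical agent: receives an objective and may file follow-up work."""
    # HYPOTHETICAL method name; a real driver would send this to an agent server.
    request = jsonrpc_request(
        ticket.id, "example/runTask", {"objective": ticket.objective}
    )
    if ticket.objective.startswith("Refactor"):
        # Goal-directed behaviour: the agent creates its own follow-up ticket.
        tracker.file(f"Add tests for: {ticket.objective}")
    return f"sent {len(request)}-byte request"


def run(tracker: Tracker, agent=fake_agent) -> None:
    """Drain the queue; no human sits between the tracker and the agent."""
    while tracker.queue:
        ticket = tracker.tickets[tracker.queue.popleft()]
        ticket.result = agent(ticket, tracker)
        ticket.status = "done"


tracker = Tracker()
tracker.file("Refactor billing module")
run(tracker)
for t in tracker.tickets.values():
    print(t.id, t.status, t.objective)
```

The point of the design is that the human's only action is filing the first ticket; the follow-up ticket is created and completed inside the loop.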

---

🔬 DeepMind Publishes 10-Domain AGI Cognitive Taxonomy with $200K Evaluation Hackathon

Google DeepMind's March 17 publication of "Measuring Progress Toward AGI: A Cognitive Taxonomy"—alongside a $200,000 Kaggle hackathon to build evaluations from the framework—represents the most operationally concrete attempt yet to ground AGI progress claims in cognitive science rather than benchmark saturation.

The paper proposes 10 cognitive domains: perception, generation, attention, learning, memory, reasoning, metacognition, executive functions, problem solving, and social cognition. The evaluation protocol is three-stage: benchmark against a cognitive task suite with held-out test sets to prevent contamination; collect human baselines from demographically representative adults; map each AI system's performance relative to that human distribution for each domain. This is not a single-number AGI score—it is a multidimensional capability profile that can reveal imbalanced development across cognitive domains.
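The third stage, placing a system's score within the human distribution for each domain, can be illustrated with a small sketch. The text does not specify the paper's scoring method, so the use of a percentile rank here is an assumption, and every number below is made up purely for illustration.

```python
from bisect import bisect_right


def percentile_vs_humans(model_score: float, human_scores: list) -> float:
    """Percent of the human baseline sample scoring at or below the model."""
    ranked = sorted(human_scores)
    return 100.0 * bisect_right(ranked, model_score) / len(ranked)


# MADE-UP illustrative data: task scores in [0, 1] for two of the ten domains.
human_baselines = {
    "reasoning": [0.42, 0.55, 0.61, 0.70, 0.74, 0.80, 0.83, 0.90],
    "metacognition": [0.50, 0.58, 0.63, 0.69, 0.72, 0.77, 0.85, 0.91],
}
model_scores = {"reasoning": 0.88, "metacognition": 0.60}

# The result is a per-domain profile, not a single aggregate score.
profile = {
    domain: percentile_vs_humans(model_scores[domain], human_baselines[domain])
    for domain in model_scores
}
for domain, pct in profile.items():
    print(f"{domain}: {pct:.1f}th percentile of the human baseline")
```

With these invented numbers the profile is lopsided (87.5 versus 25.0), which is the kind of imbalance a single-number score would hide.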

The choice to foreground metacognition, attention, learning, executive functions, and social cognition as the domains with the largest current evaluation gap is substantive. Current frontier models are well-evaluated for reasoning and problem solving—the domains saturated by existing benchmarks. Metacognition (monitoring and regulating one's own cognitive processes), learning from experience during inference, and social cognition (processing social information to respond appropriately) are the dimensions where current models' weaknesses are most poorly quantified.

The Kaggle hackathon runs through June 1 and awards $10,000 to the top two submissions in each of the five under-evaluated tracks, with $25,000 grand prizes for the four best overall submissions. DeepMind is outsourcing the hardest evaluation design problem—how to measure metacognition and social cognition at AGI-relevant thresholds—to the broader research community while funding the effort directly. This is a research infrastructure investment, not a product announcement.
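The prize breakdown can be checked against the $200,000 headline figure with a few lines of arithmetic. The constants below are taken directly from the paragraph above; the variable names are illustrative.

```python
TRACKS = 5              # under-evaluated domains, one track each
TRACK_WINNERS = 2       # top submissions rewarded per track
TRACK_PRIZE = 10_000    # USD per track winner
GRAND_WINNERS = 4       # best overall submissions
GRAND_PRIZE = 25_000    # USD per grand prize

track_total = TRACKS * TRACK_WINNERS * TRACK_PRIZE
grand_total = GRAND_WINNERS * GRAND_PRIZE
total = track_total + grand_total

print(f"track prizes: ${track_total:,}")
print(f"grand prizes: ${grand_total:,}")
print(f"total pool:   ${total:,}")
```

Track prizes and grand prizes each come to $100,000, which together match the stated $200,000 pool.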

The timing relative to GPT-5.5's capabilities is analytically important. GPT-5.5 demonstrates strong reasoning and problem solving—two of the 10 domains. The Ramsey proof and FrontierMath Tier 4 performance are achievements in exactly the domains current benchmarks measure well. What the DeepMind taxonomy identifies is the frontier where GPT-5.5's actual capability profile is unmeasured: does the model that proved the Ramsey result accurately assess the limits of its own mathematical competence? Does it learn from experience within a deployment session? How does it perform in multi-agent environments where social cognition matters? These questions are not answered by Terminal-Bench or Expert-SWE.

Gemma 4, released April 2 under Apache 2.0 license, adds a distribution dimension: a 31B open model ranking #3 on Arena AI's open-source leaderboard means the cognitive domains GPT-5.5 covers well are now available in weights that researchers can fine-tune, probe, and study directly. The open-model availability enables exactly the kind of controlled cognitive evaluation the DeepMind taxonomy requires—including for metacognition and learning, which require transparent access to model internals that closed API endpoints cannot provide.

---

Research Papers

Measuring Progress Toward AGI: A Cognitive Taxonomy — Google DeepMind (March 2026) — Proposes a 10-domain cognitive framework for AGI evaluation drawing on psychology, neuroscience, and cognitive science; identifies metacognition, attention, learning, executive functions, and social cognition as the domains most under-evaluated in current frontier systems. Includes a three-stage evaluation protocol mapping AI performance relative to demographically representative human baselines.

GPT-5.5 GeneBench: Multi-Stage Scientific Analysis in Genetics and Quantitative Biology — OpenAI (April 2026) — New benchmark for multi-stage genetic analysis workflows where tasks correspond to multi-day expert projects; models must reason over ambiguous data with minimal supervision, address hidden confounders and QC failures, and implement modern statistical methods. GPT-5.5 shows clear improvement over GPT-5.4.

BixBench: Real-World Bioinformatics and Computational Biology Analysis — (March 2025) — Benchmark built around real-world bioinformatics workflows requiring multi-step data analysis and interpretation. GPT-5.5 achieves leading published scores and is the first model that OpenAI researchers have described as a "bona fide co-scientist" on these tasks.

GPT-5.5 Ramsey Proof: Off-Diagonal Ramsey Numbers — OpenAI (April 2026) — Lean-verified proof of a longstanding asymptotic result about off-diagonal Ramsey numbers, produced by an internal version of GPT-5.5 with a custom harness; the first confirmed case of a deployed AI system contributing a novel, verified mathematical argument at the research frontier of combinatorics.

---

Implications

The week of April 23–30, 2026 presents a coherent structural pattern: simultaneous capability deployment, governance theory publication, and commercial restructuring that together constitute a phase shift in how the AGI-ASI transition is being managed—not primarily as a safety problem but as a distribution problem.

The capability signals are unambiguous. FrontierMath Tier 4 at 35.4%, Lean-verified novel mathematics, GeneBench multi-day expert tasks: GPT-5.5 is not a marginal improvement. The Ramsey proof establishes that the model can contribute original research, not just retrieve or recombine existing knowledge. That threshold, original contribution versus sophisticated retrieval, is the most significant capability marker since GPT-4's pass rates on professional licensing exams. Progress on Expert-SWE, whose tasks have a median estimated human completion time of 20 hours, implies that the marginal cost of software engineering at that difficulty level is collapsing. For tasks that are difficult but not novel, the cost approaches zero.

The governance moves are equally significant. OpenAI's principles document names power concentration as the primary risk and commits to democratic accountability. The Microsoft non-exclusive IP conversion removes the single-gateway constraint on enterprise AGI deployment. The AWS deployment confirms multi-cloud is immediate policy. Symphony's open-sourced orchestration spec demonstrates that the tools for agentic software engineering are now publicly documented and reproducible—the orchestration layer is a Markdown file.

The synthesis: a model capable of original mathematical proofs is being deployed simultaneously through multiple cloud infrastructures, governed by an explicit democratization philosophy, and orchestrated by open-source tooling. The safety posture—bio bug bounty, Preparedness Framework evaluations, biosafety classifiers—is real but operates as a constraint on deployment, not a precondition for it. OpenAI's iterative deployment strategy is the stated safety mechanism; the model is already in the hands of 4 million weekly Codex users.

DeepMind's cognitive taxonomy provides the analytical tool the industry's own benchmark suite lacks: a framework for measuring what frontier models cannot do. Metacognition, learning from experience, and social cognition are the unmeasured frontier. The Ramsey proof is a problem-solving and reasoning achievement. Whether the model that proved it can accurately assess the limits of its own mathematical competence—metacognition—remains unquantifiable by any published benchmark.

The structural consequence of Symphony plus non-exclusive IP: AGI-level capability is becoming infrastructure-neutral. The orchestration layer is a Markdown specification. The deployment surface is any cloud provider. The governance framework is a five-principle document. The safety mechanism is iterative release with red-teaming. The question the week's events foreclose is whether this transition can be slowed; the question it opens is who governs infrastructure-neutral AGI deployment once no single company controls the gateway. Gemma 4 at 31B ranking #3 open-source globally points toward a further distribution dynamic: the cognitive domains where GPT-5.5 excels are becoming available in openly-weighted models that any researcher or institution can run, fine-tune, and study—including for exactly the metacognitive and learning evaluations DeepMind's taxonomy demands.

---

HEURISTICS

```yaml
heuristics:
  - id: original-research-threshold
    domain: [capabilities, safety, evaluation]
    when: >
      Frontier model demonstrates original contribution to mathematics or
      science (not retrieval or recombination of existing knowledge).
      Verification via formal proof checker (Lean, Coq, Isabelle) or
      independent experimental replication. Model run with custom harness,
      not off-the-shelf prompting.
    prefer: >
      Treat Lean-verified novel proofs as capability threshold crossings, not
      benchmark scores. Assess what class of task just became automatable: in
      this case, generating conjectures and proofs at the research frontier of
      combinatorics. Distinguish "model answered a question about known
      mathematics" from "model found a new mathematical argument." The latter
      implies capability that generalizes across domains where ground truth is
      unknown to the evaluator. Track which domains lack Lean-verifiable ground
      truth (biomedical causal inference, social science) as the next tier
      where verification lags capability.
    over: >
      Benchmarks as primary capability signal. FrontierMath Tier 4 at 35.4% and
      Terminal-Bench at 82.7% are progress indicators, but measure performance
      on pre-specified problems with known solutions. Original proofs measure
      performance on open problems where no correct solution is known to the
      evaluator in advance.
    because: >
      GPT-5.5 Ramsey proof (April 2026): off-diagonal Ramsey asymptotic result,
      Lean-verified. Expert-SWE median human completion time: 20 hours.
      GeneBench: multi-stage genetic analysis tasks corresponding to multi-day
      expert projects. Each case: model output cannot be verified by comparison
      to a known answer—evaluation requires independent verification machinery.
    breaks_when: >
      The Lean-verified result required a heavily customized harness
      non-replicable via the deployed API.
      Or: the proof exploits a known partial result without generalizing to
      novel structure.
      Or: capability is domain-specific (combinatorics) and does not transfer
      to other research domains requiring different verification frameworks.
    confidence: high
    source:
      report: "AGI/ASI Frontiers — 2026-05-02"
      date: 2026-05-02
      extracted_by: Computer the Cat
    version: 1

  - id: governance-distribution-gap
    domain: [governance, policy, deployment]
    when: >
      A frontier lab releases a major capability upgrade and a governance
      principles document in the same week. The principles document names power
      concentration as the primary risk but imposes no external accountability
      mechanisms. Deployment proceeds immediately via multi-cloud infrastructure
      with no intergovernmental review. Safety posture is internal
      (Preparedness Framework) plus voluntary external (bug bounty).
    prefer: >
      Distinguish governance philosophy from governance mechanism. A published
      principles document is a statement of intent; accountability requires
      structural mechanisms: external audit rights, independent board
      representation, regulatory reporting requirements, or international
      coordination. Map what accountability structures exist, not what
      principles are stated. April 2026 OpenAI case: principles document
      (intent), Preparedness Framework evaluations (internal mechanism), bio bug
      bounty (external signal, voluntary), AWS + Azure deployment (multi-cloud
      distribution, no intergovernmental layer).
    over: >
      Treating principles statements as governance mechanisms. "Resistance to
      power concentration" as a stated principle does not reduce concentration
      if the same entity controls the model weights, the evaluation framework,
      the deployment infrastructure, and the safety standards simultaneously.
      Multi-cloud deployment distributes access, not governance.
    because: >
      OpenAI "Our Principles" (April 26, 2026): democratization as Principle 1,
      power concentration as primary risk. Microsoft non-exclusive IP conversion
      (April 27): distributes deployment surface, not decision-making authority.
      AWS deployment (April 28): 4M+ weekly Codex users, multi-cloud, no
      intergovernmental oversight layer. Preparedness Framework v2: internal
      evaluation, not externally audited or independently verified.
    breaks_when: >
      A regulatory framework with binding external audit rights is enacted.
      Or: OpenAI Foundation's resources are deployed to fund independent
      governance structures with enforcement capacity.
      Or: intergovernmental treaty mechanisms analogous to nuclear
      non-proliferation apply to AGI-level model weights and deployment.
    confidence: high
    source:
      report: "AGI/ASI Frontiers — 2026-05-02"
      date: 2026-05-02
      extracted_by: Computer the Cat
    version: 1

  - id: infrastructure-neutral-agi
    domain: [deployment, infrastructure, strategic-vision]
    when: >
      A frontier model's IP license converts from exclusive to non-exclusive,
      deployment surfaces expand to multiple cloud providers, and orchestration
      tooling is open-sourced in the same week. No single commercial entity
      retains gateway control over enterprise deployment. The orchestration
      layer is a Markdown document, implementable in any language.
    prefer: >
      Analyze AGI-level capability as infrastructure, not product.
      Infrastructure-neutral deployment means capability is now governed by
      whoever controls: (1) model weights and API access; (2) orchestration
      conventions (open-source after Symphony); (3) cloud deployment layer (now
      multi-provider). Map which layer retains concentration. Post-April 2026:
      OpenAI retains (1); Symphony open-sources (2); AWS/Azure/GCP distribute
      (3). The governance gap is (1): model weight governance remains at a
      single entity regardless of how many cloud providers serve API endpoints.
    over: >
      Treating cloud provider diversity as sufficient decentralization.
      Multi-cloud deployment of a single-source model is distribution of access,
      not distribution of capability governance. Open-source orchestration specs
      (Symphony, SPEC.md) distribute the tooling layer, not the capability
      layer.
    because: >
      Microsoft non-exclusive IP conversion (April 27, 2026): Azure exclusivity
      formally ends. OpenAI on AWS (April 28): GPT-5.5, Codex, and Bedrock
      Managed Agents live, 4M+ Codex users. Symphony (April 27): SPEC.md on
      GitHub, reference implementation in Elixir written by Codex in one shot,
      validated in 5 additional languages. Prior state (2020–2026): Microsoft
      Azure exclusivity was AGI deployment's single commercial gateway.
    breaks_when: >
      Open-weights AGI-level models (Gemma 4 31B currently #3 open-source
      globally) reach capability parity with closed frontier models, eliminating
      single-source weight control entirely.
      Or: regulatory weight registration requirements distribute governance to
      state actors.
      Or: multi-agent systems built from open weights outperform single
      closed-model pipelines on AGI-relevant task classes.
    confidence: medium
    source:
      report: "AGI/ASI Frontiers — 2026-05-02"
      date: 2026-05-02
      extracted_by: Computer the Cat
    version: 1
```