Superintelligence Won't Cure Cancer. Discipline Might.

The Center for Humane Technology named four structural reasons. The 2026 record sharpens one of them. The dimension they didn't name changes what to build.

May 13, 2026

The Davos contradiction

Sam Altman, Demis Hassabis, and Dario Amodei have all said AI will cure cancer.1 Then in January 2026, at Davos, Hassabis told the room today’s AI cannot invent its own hypotheses.

Hassabis named “one or two more breakthroughs” as still required before AGI, five to ten years out by his estimate. The creative, inventive capability needed for AI to generate its own scientific conjectures, he said, is still far away.2

The executive most invested in the cure-cancer promise just named what the promise denied.

This isn’t an outsider’s critique of AI. It is a self-correction from inside the field. When the CEO most aligned with the AGI-cures-cancer thesis names the methodology gap, the gap is on the record.

The Center for Humane Technology made the same diagnosis from outside. In May 2026, CHT published with science writer Julia Scott a response to the persistent claim that scaling AI compute will eliminate cancer.3 They argued the promise is structurally broken. They named four reasons.

The 2026 record sharpens one of them. There is also a dimension they did not name. That dimension is the only one that changes what to build.

CHT’s four reasons, sharpened by the 2026 record

CHT’s piece is organized around four claims:

Cancer progress comes from data and resources, not accelerations in knowledge. Their first reason argues we need more clean data and better-funded biology, not faster compute or larger models.
Cancer is complex, and treating it is highly individualized. Tumor heterogeneity, microenvironment, immune-system context, and patient-specific factors exceed what current models can compress.
Curing cancer has other inherent bottlenecks. Mouse-to-human translation fails 90 percent of the time, FDA approval cycles run on their own clock, and clinical trial scale operates on a different timeline than compute scale.
Our resources could be so much better spent. Estimates of 2026 AI capex approach $725 billion across the Big Four hyperscalers.4 The NCI’s 2026 budget is $7.35 billion.5 That is a ratio of roughly 99 to 1, and cancer survival rates remain almost exactly where they were a decade ago.

Three of these are durable. The first is more interesting than CHT presents it.

CHT writes, “we don’t even have a national sort of data commons of cancer genetics and imaging data.”

The National Cancer Institute’s Cancer Research Data Commons already holds 42 million files indexed across six commons: Genomic, Proteomic, Imaging, Canine, General, and Clinical/Translational.6 It catalogs 139,000 subjects and 821,000 specimens.

The data commons exists. CHT’s argument lands better when sharpened: the commons exists, but it has not yet been converted into the disciplined, falsifiable, public-tournament structure that the protein-folding field built. CHT is identifying a real bottleneck. The bottleneck is not the absence of data infrastructure. The bottleneck is the absence of the discipline that converts data infrastructure into discovery.

That distinction is the dimension CHT’s piece does not name.

The dimension CHT doesn’t name: discipline

Depending on how the 2025-2026 boundary is counted, the Big Four hyperscalers are spending somewhere between $540 billion and $725 billion a year buying intelligence: model scale, retrieval speed, generation fluency. They are not buying discipline.

Discipline is what scientific discovery has always required.

CRISPR, Nobel 2020, awarded to Jennifer Doudna and Emmanuelle Charpentier, came from a decade of bacterial-immunity research at UC Berkeley and Umeå University.7 The discovery was structured around hypothesis tests the researchers could lose. Each iteration on the Cas9 mechanism was a falsifiable prediction the lab could be wrong about. The discipline was not “having sufficient intelligence.” The discipline was “willing to be wrong in front of the field, on schedule, with falsifiable claims.”

mRNA vaccines, Nobel 2023, awarded to Katalin Karikó and Drew Weissman, took thirty years of methodological persistence against near-zero institutional support.8 Karikó was famously demoted from the tenure track at the University of Pennsylvania for her unfashionable focus on a technique the field had largely abandoned. She lost grants repeatedly. She continued. The discipline was decades of refusal to abandon a hypothesis when the system around her said abandon. Intelligence was not the bottleneck. Persistence inside a structured falsification arc was.

Checkpoint inhibitors, Nobel 2018, awarded to James Allison and Tasuku Honjo, required decades of collaborative T-cell biology before the clinical win arrived.9 Allison’s CTLA-4 work, Honjo’s PD-1 work, and the convergence into clinical immunotherapy spanned multiple institutions and dozens of independent laboratories. The discipline was a multi-decade research community willing to test, falsify, and replicate across competing groups. The intelligence was the last layer.

AlphaFold itself, Nobel 2024, awarded to Demis Hassabis and John Jumper, succeeded because two prior disciplines had already done their work. The Protein Data Bank had been compiling clean, deposit-verified protein structures since 1971.10 The Critical Assessment of Structure Prediction had been running blind-prediction tournaments since 1994.11 Both predated DeepMind by decades. The discipline was external to the AI. The AI was the last layer.

Four Nobels. Four disciplines.

Each one needed an infrastructure of discipline before intelligence could finish the job.

Cancer has the data infrastructure. It does not yet have the discipline.

The CASP asymmetry

CRDC has the data. AlphaFold had data too, and it also had CASP.

CASP is the load-bearing institution that made AlphaFold possible. Every two years since 1994, the CASP organizers have sent the same set of protein sequences to dozens of independent laboratories around the world. Each lab predicts the three-dimensional structure of those proteins. The organizers then compare every submission against experimentally determined ground truth: structures withheld from all submitting teams until after the submission deadline.

Teams that lose, lose publicly. Teams that win, win publicly. The benchmark is the ground-truth test that turns model output into either evidence or refutation. Nothing gets to call itself a protein-structure-prediction win without surviving a CASP cycle.
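The tournament mechanics described above can be sketched in a few lines. This is a minimal illustration, not CASP's actual pipeline: the `Submission` record, the `grade_round` function, and the use of plain RMSD over a flat tuple of numbers are all assumptions made for the example, and CASP's real scoring is more elaborate. The point is the shape: ground truth stays sealed, every team is scored by the same function, and the ranking is published.

```python
from dataclasses import dataclass
from math import sqrt


@dataclass(frozen=True)
class Submission:
    team: str
    target_id: str
    prediction: tuple  # stand-in for predicted coordinates (hypothetical format)


def rmsd(predicted, truth):
    """Root-mean-square deviation between two equal-length coordinate tuples."""
    if len(predicted) != len(truth):
        raise ValueError("prediction and ground truth differ in length")
    return sqrt(sum((p - t) ** 2 for p, t in zip(predicted, truth)) / len(truth))


def grade_round(submissions, sealed_truth):
    """Score every submission against withheld truth; return a public leaderboard."""
    per_team = {}
    for s in submissions:
        error = rmsd(s.prediction, sealed_truth[s.target_id])
        per_team.setdefault(s.team, []).append(error)
    leaderboard = [(team, sum(errs) / len(errs)) for team, errs in per_team.items()]
    return sorted(leaderboard, key=lambda row: row[1])  # lowest mean error first


# Toy round: one target, two hypothetical labs.
sealed_truth = {"T1": (0.0, 1.0, 2.0)}
submissions = [
    Submission("lab_b", "T1", (1.0, 2.0, 3.0)),
    Submission("lab_a", "T1", (0.0, 1.0, 2.0)),
]
leaderboard = grade_round(submissions, sealed_truth)
```

Nothing in the sketch depends on how a team produced its prediction. That independence is what lets a benchmark of this kind grade a deep-learning system and a physics-based method on the same footing.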

AlphaFold 2 won CASP14 in 2020 by a margin so large it ended the field as it had been practiced. By 2022, it was the standard. By 2024, Hassabis and Jumper had a Nobel.

CASP was the discipline. The Nobel followed.

Cancer has no CASP.

There is no recurring public benchmark for cancer hypotheses to lose against each other. There is the Cancer Research Data Commons, and there are the models. There is no infrastructure for the models to be wrong in front of each other.

This is what the discipline gap looks like at full resolution. Cancer has more data, by raw volume, than protein structure prediction ever had at the moment of its breakthrough. What cancer does not have is the public-tournament structure that disciplines hundreds of independent labs into producing falsifiable predictions which are then publicly graded against ground truth.

The cancer-AlphaFold will not arrive because cancer gets more compute. The cancer-AlphaFold will arrive after cancer gets its CASP equivalent, or after enough capital and patience builds a discipline of comparable rigor by other means.

The Watson lesson

In 2018, STAT News obtained internal documents revealing an AI failure the medical-AI field has never fully metabolized.12

IBM Watson for Oncology was supposed to recommend cancer treatments by ingesting medical literature and patient records. Instead, it was trained on synthetic cases. One or two doctors at Memorial Sloan Kettering invented hypothetical patients for each cancer type and walked Watson through their preferred treatment paths. The system learned the preferences of a handful of doctors. It did not learn the evidence base it claimed to represent.

One case sealed the diagnosis. Watson recommended chemotherapy with bevacizumab to a sixty-five-year-old lung cancer patient with documented severe bleeding. Bevacizumab carries a black-box warning forbidding administration to patients with severe bleeding. The recommendation directly violated the FDA-approved indication and the National Comprehensive Cancer Network guidelines.

No patient died from Watson’s advice. The project collapsed. The hospitals that had purchased it absorbed the loss. The technology was sold off.

What Watson lacked was not intelligence. IBM had built one of the most credentialed AI systems in medicine at the time. The model itself was sophisticated.

What Watson lacked was the discipline IBM did not buy:

Provenance checks on what trained the system. Synthetic training cases generated by a handful of operators produced a model that learned operator preference, not evidence. A real evidence-base training would have required adversarial review of every training input. There was no such review.

Testing against the FDA’s published indications and the NCCN’s published guidelines. A discipline of falsification would have caught the bevacizumab recommendation before the system ever reached a hospital. Watson’s outputs were never benchmarked against the documents whose authority it claimed to inherit.

Input from more than one or two doctors per cancer type. A discipline of multi-source convergence would have surfaced the operator-preference contamination immediately. Cross-institutional review would have shown that Watson’s recommendations were systematically Memorial Sloan Kettering’s recommendations dressed in IBM branding.
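The second check, benchmarking outputs against published indications, is mechanically simple. The sketch below is illustrative only: the `CONTRAINDICATIONS` table, the `Recommendation` record, and the `audit` function are hypothetical names, and the table holds just the one rule described in this essay. A real audit would load its rules from the FDA label and NCCN guideline texts rather than a hand-written dictionary.

```python
from dataclasses import dataclass

# Illustrative rule table (assumption). Its single entry is the case described
# in this essay; it is not a clinical reference.
CONTRAINDICATIONS = {
    "bevacizumab": {"severe bleeding"},
}


@dataclass(frozen=True)
class Recommendation:
    drug: str
    patient_conditions: frozenset


def violations(rec, rules=CONTRAINDICATIONS):
    """Return the patient's conditions that the rule table flags for this drug."""
    flagged = rules.get(rec.drug.lower(), set())
    return sorted(flagged & rec.patient_conditions)


def audit(recommendations, rules=CONTRAINDICATIONS):
    """Collect every recommendation that conflicts with a published rule."""
    failures = []
    for rec in recommendations:
        hits = violations(rec, rules)
        if hits:
            failures.append((rec, hits))
    return failures


# The case from the STAT reporting, reduced to the two facts that matter here.
watson_case = Recommendation("Bevacizumab", frozenset({"lung cancer", "severe bleeding"}))
safe_case = Recommendation("bevacizumab", frozenset({"lung cancer"}))
failures = audit([watson_case, safe_case])
```

Run before deployment over every output the system produces, a check of this shape turns a guideline document into a test the model can fail.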

Discipline was what Watson didn’t have. Methodology was the failure surface.

The hospitals that adopted Watson are still treating cancer patients today, with worse intelligence and better discipline. The trade was the right one.

The counter-arguments

“AI scale will get us there. Give it three more years.”

This is the most common reply, and it has weakened sharply in 2026.

Hassabis at Davos: “one or two more breakthroughs” needed before AGI; current systems still missing the ability to generate their own scientific conjectures. This is not a generic caveat. It is a specific structural claim about what scaling alone produces.

Harvard and MIT researchers trained a transformer on Kepler’s planetary-motion data and tested whether the model learned Newton’s underlying laws.13 The model learned the predictions. It did not learn the laws. Scaling produces accurate prediction. It does not produce understanding of the mechanism that generates the prediction. Discovery requires mechanism. The Kepler-Newton paper is the cleanest published evidence that intelligence-without-discipline does not climb the ladder from pattern to law.

The breakthroughs Hassabis names are not breakthroughs in compute. They are breakthroughs in capability we do not yet know how to build. Three more years of scale will produce a more fluent system. They will not produce a system that invents its own hypotheses.
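The gap between prediction and mechanism can be shown with a toy that is much simpler than the paper's experiment and should not be read as its method. The sketch assumes rounded textbook values for semi-major axis (AU) and orbital period (years), and uses a piecewise-linear fit as a stand-in for a pattern-learner. Both models reproduce the inner planets they were shown. Only Kepler's third law carries over to Jupiter.

```python
from bisect import bisect_left

# Semi-major axis (AU) and orbital period (years), rounded textbook values.
INNER = [(0.387, 0.241), (0.723, 0.615), (1.000, 1.000), (1.524, 1.881)]
JUPITER = (5.203, 11.862)


def pattern_model(a, data=INNER):
    """Piecewise-linear fit: interpolates inside the data and extends the
    nearest end segment outside it."""
    xs = [x for x, _ in data]
    i = min(max(bisect_left(xs, a), 1), len(data) - 1)
    (x0, y0), (x1, y1) = data[i - 1], data[i]
    return y0 + (y1 - y0) * (a - x0) / (x1 - x0)


def kepler_third_law(a):
    """Law-level statement: T^2 = a^3 in units of years and AU."""
    return a ** 1.5


def relative_error(predicted, actual):
    return abs(predicted - actual) / actual


a_jup, t_jup = JUPITER
pattern_miss = relative_error(pattern_model(a_jup), t_jup)  # large: pattern does not transfer
law_miss = relative_error(kepler_third_law(a_jup), t_jup)   # small: law does
```

The toy tests extrapolation rather than the recovered force law the paper probes, but the distinction is the same: a fit can be accurate where it has data without encoding the rule that generates the data.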

“Narrow purpose-built AI is the right answer.”

This is Emilia Javorsky’s prescription in her March 2026 essay, on which the CHT response is built.14 The argument is that general-purpose AGI is the wrong frame; narrow, task-bounded AI for specific cancer problems is the right one.

Narrow AI is winning at task-bounded pattern recognition. Stanford’s Esteva and colleagues demonstrated dermatologist-level skin lesion classification in 2017.15 Google Health’s mammography model matched or outperformed radiologists on cancer detection in 2020.16 Paige.AI received FDA De Novo authorization (DEN200080) for prostate cancer pathology decision support in September 2021.17 These are real wins at the pattern-recognition layer and they should continue.

Narrow AI is not what produces discovery. Narrow AI operationalizes discovery once the discipline has named the right pattern to recognize. The dermatology classifier worked because dermatologists had already established the relevant visual features over a century of clinical observation. The mammography model worked because radiologists had already established what suspicious lesions look like. The pathology AI worked because pathologists had already established the cellular criteria. In each case, the human discipline came first; the AI compressed it.

The NCI Cancer Research Data Commons has been waiting for the discipline that converts its 42 million files into mechanism-bearing hypotheses. Narrow AI without that discipline cannot deliver the cancer-AlphaFold that CHT and Javorsky both want. Narrow AI delivers the better mammogram reader. It does not deliver the discovery.

“Show me one named operator producing discipline-as-product at scale.”

This is the strongest counter, and it is the one this argument cannot fully discharge today.

The methodology-as-product investment thesis rests on a small number of field-validated cases, not on a saturated field. AlphaFold itself is the proof case: PDB plus CASP plus DeepMind produced a Nobel because all three layers were present. CRISPR, mRNA, and checkpoint inhibitors are pre-AI proof cases of discipline-driven discovery.

The thesis is structural: build the discipline as deliberate infrastructure, and absorb AI roles inside it, rather than buying the AI and hoping the discipline emerges by accident.

The question for investors and advisors is whether to bet on discipline-as-product before the field has publicly named what it is buying. The convergence among named voices (Hassabis at Davos, Melanie Mitchell at the Santa Fe Institute,18 Jennifer Listgarten at UC Berkeley,19 the Nature Medicine April 2026 editorial board20) suggests the field is groping toward the name, but no one has yet said it clearly. The investors and advisors who recognize the shape early hold the asymmetric position.

What to buy when buying AI for science

When the next AI-for-science pitch arrives, the question is not “which model.” The question is “which discipline.”

Three questions for any AI-for-science claim:

Where does the system lose to ground truth in public?

If there is no recurring, public benchmark against which the system can be wrong in front of independent reviewers, the discipline is missing. AlphaFold had CASP. The system you are evaluating needs its equivalent, by name, by cadence, by structure. If the answer is vague, the discipline is vague.

What gates its training inputs?

If the system was trained on synthetic cases, marketing claims, or unverified data sources, the discipline is missing. Watson’s failure mode was upstream of the model. A real discipline checks what trained the system before it checks what the system says. The vendor should be able to name the training-input provenance discipline in one sentence. If they cannot, the system inherits Watson’s failure architecture.

Who keeps it honest when no one is watching?

If the only audit happens when a customer asks for a demonstration, the discipline is missing. Real disciplines have built-in adversarial review, source-integrity checks, and ongoing falsification that operates without the operator’s prompting. The discipline is the system that catches the system. Ask who runs it, on what cadence, with what consequences for the model when failures are surfaced.

These are not philosophical questions. They are diligence-vocabulary questions any buyer can ask of any AI-for-science vendor today and get a real answer or a tell.
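For a buyer who wants to keep score across several vendors, the three questions above fit in a small record. The sketch is one possible shape, not a standard instrument: `VendorReview`, its methods, and the sample answer are hypothetical. It treats a blank answer the same as no answer, since a vague reply is the tell.

```python
from dataclasses import dataclass, field

QUESTIONS = (
    "Where does the system lose to ground truth in public?",
    "What gates its training inputs?",
    "Who keeps it honest when no one is watching?",
)


@dataclass
class VendorReview:
    vendor: str
    answers: dict = field(default_factory=dict)  # question -> vendor's concrete answer

    def record(self, question, answer):
        """Store the vendor's answer to one of the three questions."""
        if question not in QUESTIONS:
            raise KeyError(f"unknown question: {question!r}")
        self.answers[question] = answer.strip()

    def missing(self):
        """Questions with no concrete answer on file."""
        return [q for q in QUESTIONS if not self.answers.get(q)]

    def discipline_present(self):
        """True only when all three questions have a concrete answer."""
        return not self.missing()


review = VendorReview("hypothetical_vendor")
review.record(QUESTIONS[0], "Biennial blind benchmark graded by an outside body")
review.record(QUESTIONS[1], "   ")  # a non-answer counts as missing
```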

The close

The next decade of scientific discovery is a discipline question, not an intelligence one.

The Center for Humane Technology is right: superintelligence won’t cure cancer. The infrastructure is not the bottleneck. The cancer-data commons exists. The bottleneck is the discipline that converts infrastructure into discovery, and the field has not yet named what that discipline is.

When the field names it, the methodology will be what gets funded, what gets bought, and what gets credited. The four Nobels show what discipline looks like in retrospect. The fifth one, the one that comes from converting 42 million files into a mechanism that ends a cancer subtype, is waiting on the discipline no one has yet built at scale.

Buy the discipline that runs models. Don’t buy the models that promised to be the discipline.

1. Altman's cure-cancer framings appear across OpenAI public communications (2023-2025). Amodei's October 2024 essay Machines of Loving Grace (https://darioamodei.com/machines-of-loving-grace) projects AI compressing decades of biomedical progress including cancer into 5-10 years. Hassabis has stated similar framings across multiple interviews, including his January 2026 Davos remarks (see note 2).

2. Hassabis, D. Comments at the World Economic Forum, Davos, January 2026. "One or two more breakthroughs" verbatim per Fortune (Jan 23, 2026, https://fortune.com/2026/01/23/deepmind-demis-hassabis-anthropic-dario-amodei-yann-lecun-ai-davos/). The hypothesis-generation framing is also reported in Semafor (Jan 21, 2026, https://www.semafor.com/article/01/21/2026/google-deepminds-demis-hassabis-and-the-paradox-of-ai-progress) and Big Technology newsletter (Alex Kantrowitz interview, January 2026).

3. Center for Humane Technology + Julia Scott. No, Superintelligence Won't Cure Cancer. Substack, May 7, 2026. https://centerforhumanetechnology.substack.com/p/no-superintelligence-wont-cure-cancer

4. Big Four (Microsoft + Amazon + Alphabet + Meta) 2026 AI capex estimates approach $725 billion. Sources: Statista capital-expenditure analysis 2026 (https://www.statista.com/chart/35046/capital-expenditure-of-meta-alphabet-amazon-and-microsoft/); CNBC Apr 30 2026 (https://www.cnbc.com/2026/04/30/ai-boom-big-tech-capital-expenditures-now-seen-topping-1-trillion-in-2027-.html); Fortune Apr 29, 2026. Range frame ($540B-$725B) reflects different inclusion criteria across the 2025-2026 boundary; Javorsky 2026 cites $540B for 2025 baseline.

5. National Cancer Institute 2026 budget: $7.35B per Consolidated Appropriations Act 2026 (H.R. 1748); +$128M over FY25. Sources: https://www.cancer.gov/about-nci/budget; FY2026 NCI Congressional Justification.

6. NCI Cancer Research Data Commons. https://datacommons.cancer.gov. Specific numbers verified at the AACR Journal: "NCI Cancer Research Data Commons: Core Standards and Services" (PMC11067691): "more than 42 million files, 139,000 subjects, and 821,000 specimens."

7. Doudna, J. and Charpentier, E. Nobel Prize in Chemistry 2020 (https://www.nobelprize.org/prizes/chemistry/2020/press-release/). Charpentier was at Umeå University 2009-2014 (Laboratory for Molecular Infection Medicine Sweden), where the foundational tracrRNA paper (2011) was produced. The Doudna-Charpentier collaboration started during her Umeå tenure (2012).

8. Karikó, K. and Weissman, D. Nobel Prize in Physiology or Medicine 2023. Karikó's UPenn demotion from the tenure track in 1995 after repeated grant rejections is documented across multiple biographical sources, including her 2023 memoir Breaking Through (Crown). She was never granted tenure.

9. Allison, J. and Honjo, T. Nobel Prize in Physiology or Medicine 2018: "for their discovery of cancer therapy by inhibition of negative immune regulation." https://www.nobelprize.org/prizes/medicine/2018/press-release/

10. Protein Data Bank founded 1971 at Brookhaven National Laboratory under Walter Hamilton. Announced in Nature New Biology, October 1971.

11. Critical Assessment of Structure Prediction (CASP) founded 1994 by John Moult and Krzysztof Fidelis.

12. Casey Ross and Ike Swetlitz. IBM pitched its Watson supercomputer as a revolution in cancer care. It's nowhere close. STAT News, July 25, 2018. https://www.statnews.com/2018/07/25/ibm-watson-recommended-unsafe-incorrect-treatments/

13. Vafa, K., Chang, P.G., Rambachan, A., and Mullainathan, S. What Has a Foundation Model Found? Using Inductive Bias to Probe for World Models. arXiv:2507.06952, ICML 2025. https://arxiv.org/abs/2507.06952

14. Javorsky, E. How AI Can, and Can't, Cure Cancer. curecancer.ai, March 13, 2026. https://curecancer.ai/AI_vs_Cancer_Essay_130326.pdf

15. Esteva, A., Kuprel, B., Novoa, R.A., et al. Dermatologist-level classification of skin cancer with deep neural networks. Nature 542, 115-118 (2017). DOI: 10.1038/nature21056.

16. McKinney, S.M., Sieniek, M., Godbole, V., et al. International evaluation of an AI system for breast cancer screening. Nature 577, 89-94 (2020). DOI: 10.1038/s41586-019-1799-6.

17. FDA De Novo Authorization DEN200080, Paige Prostate AI-based pathology decision support, September 21, 2021. https://www.accessdata.fda.gov/cdrh_docs/pdf20/DEN200080.pdf

18. Mitchell, M. On Evaluating Cognitive Capabilities in Machines (and Other 'Alien' Intelligences). AIGuide Substack, January 14, 2026. https://aiguide.substack.com/p/on-evaluating-cognitive-capabilities Exact wording: "Something that baffles me about many AI researchers is the seeming lack of curiosity about the mechanisms underlying the benchmark performance they report."

19. Listgarten, J. The perpetual motion machine of AI-generated data and the distraction of ChatGPT as a 'scientist'. Nature Biotechnology, January 2024. DOI: 10.1038/s41587-023-02103-0.

20. Nature Medicine editorial board. Show us the evidence for the value of medical AI. Nature Medicine 32, 1163 (April 2026). DOI: 10.1038/s41591-026-04389-4.

Ryan Gruzen is founder and CEO of Applied Symbiotic Intelligence. He writes about what good outcomes have in common, across scientific discovery, clinical care, and the human-animal bond. Patent-pending; seven provisional applications filed. asi-technology.com

If this resonates, subscribe. The next essay is on the corruption of the scientific record through AI-generated hallucinated citations, titled Six Times the Rate. Three Reviewers Per Paper. No One Caught It. Future essays move into the other domains where the same pattern shows up: clinical care, the human-animal bond, and the systems that last.
