When AI Enters the Laboratory

Abstract

Artificial intelligence is becoming a material part of scientific discovery not because it replaces theory, experiment, or human judgment, but because it changes the architecture of the discovery process. Classical research usually advances through a slow sequence of conjecture, experimental trial, measurement, interpretation, and revision. AI compresses that loop. It can search vast candidate spaces, approximate expensive simulations, extract signals from high-dimensional experimental data, plan the next experiment under uncertainty, and, when connected to robotic systems, execute laboratory workflows with a degree of continuity and reproducibility that is difficult to obtain through manual experimentation alone.

This article examines that transformation through the case of machine-learning-guided discovery of the kagome superconductors YRu₃B₂ and LuRu₃B₂. In that work, machine learning and first-principles calculations were used to identify promising superconducting candidates in a large materials space; however, the scientific result became real only after synthesis and experimental validation through magnetization, specific heat, and electrical transport measurements. The case is therefore useful because it separates prediction from discovery. AI did not simply discover a superconductor in isolation; it helped construct a disciplined prediction-to-realization pipeline in which computational screening, physical theory, laboratory synthesis, and measurement each performed a distinct epistemic function.

The same pattern is now visible in other laboratory domains. Self-driving laboratories combine literature-derived knowledge, robotic synthesis, active learning, and instrument feedback to accelerate inorganic materials discovery. AI systems in organic chemistry increasingly coordinate protocol design, reaction optimization, and experimental execution. In protein science, structure-prediction and generative-design models propose molecules that must still be expressed, purified, characterized, and tested. In antimicrobial discovery, deep-learning systems mine enormous biological search spaces for candidate molecules, while assays and animal models determine whether the predictions survive contact with biology. In fusion research, reinforcement-learning controllers expand the accessible operating regimes of plasma experiments by stabilizing configurations that would otherwise be difficult to sustain.

The central claim is deliberately bounded. AI is not a substitute for empirical validation, mechanistic explanation, uncertainty quantification, or replication. A generated molecule is not a drug; a predicted material is not a synthesized phase; a calculated superconducting transition temperature is not a measured thermodynamic transition. The real contribution of AI is more structural: it turns scientific work into a tighter cybernetic loop of prediction, intervention, measurement, and correction. The scientific advantage will therefore accrue not merely to better models, but to research environments able to integrate curated data, reliable automation, calibrated instruments, uncertainty-aware algorithms, and human scientists capable of converting model output into durable knowledge.

An analysis of how artificial intelligence is changing scientific discovery by closing the loop between prediction, laboratory experimentation, measurement, and model correction, with examples from superconductors, self-driving laboratories, protein design, antibiotics, and fusion-plasma control.

Introduction

For more than a century, the discovery of superconductors has been only partially theory-driven. The physical phenomenon is well defined: below a critical temperature, electrical resistance vanishes and magnetic flux is expelled. Yet the route from that definition to a new material is not deductive. Even after BCS theory clarified the role of electron-phonon coupling in conventional superconductivity, the actual discovery of new superconducting families remained largely empirical, contingent, and often serendipitous. Cuprates, pnictides, heavy-fermion superconductors, organic superconductors, nickelates, and even MgB₂ were not found by a complete predictive theory that pointed directly to a compound before experimental work. The space of possible materials is too large, the relevant interactions are too complex, and accurate first-principles calculations are too expensive to apply exhaustively.

The paper at the center of this article is interesting because it reports a different kind of discovery pipeline. In "Machine-learning-guided discovery of kagome superconductors YRu₃B₂ and LuRu₃B₂", Rose Albu Mustaf and co-authors report the experimental discovery of bulk superconductivity in two kagome-lattice compounds, YRu₃B₂ and LuRu₃B₂. The compounds were not invented by the model in isolation: they were selected through machine-learning-accelerated high-throughput screening, then refined through first-principles calculations, synthesized, and experimentally characterized. The measured superconducting critical temperatures are low, 0.81 K for YRu₃B₂ and 0.95 K for LuRu₃B₂, but the epistemic result is stronger than the temperature scale might suggest: superconductivity was confirmed by magnetization, specific heat, and electrical transport measurements, with nearly complete superconducting volume fractions.¹

The material class matters. Kagome lattices are geometrically distinctive because their electronic structures can host quasiflat bands, and quasiflat bands may enhance the density of states near the Fermi level. In a superconductor, density of states, phonon spectra, electron-phonon coupling, orbital character, and band geometry all affect whether a superconducting phase appears and what critical temperature it can reach. The paper therefore does not merely say that a machine-learning model ranked candidates. It connects a specific structural motif, the Ru kagome network, to a specific physical hypothesis about superconductivity; then it tests that hypothesis through the full chain of synthesis, diffraction, thermodynamics, transport, and density-functional-theory-based interpretation.

The most important feature of the result is its division of labor. Machine learning reduced the candidate space. First-principles calculations imposed physical constraints. Laboratory synthesis converted a candidate into matter. Magnetization, heat-capacity, and resistivity measurements established the superconducting transition. Subsequent electronic-structure and phonon calculations explained why the observed critical temperatures were lower than the original predictions: compared with LaRu₃Si₂, the Y and Lu borides have more dispersive Ru-derived quasiflat bands, a reduced density of states at the Fermi level, and harder phonon spectra, all of which weaken electron-phonon coupling. The error is therefore not an embarrassment to the method; it is part of the scientific loop. Prediction, deviation, measurement, and mechanistic correction belong to the same process.

This is the broader point. Artificial intelligence contributes to scientific discovery when it changes the economics and architecture of the discovery loop: it reduces the cost of searching large hypothesis spaces, compresses experimental data into usable signals, selects the next experiment, and, increasingly, controls laboratory equipment. But a scientific claim still becomes knowledge only when it survives contact with measurement, replication, and mechanistic criticism. The kagome-superconductor case is therefore a useful anchor for a wider argument: AI is not replacing experiment with prediction; it is reorganizing scientific discovery into a tighter cybernetic system in which models, simulations, instruments, robots, and human scientists form a feedback loop. Comparable closed-loop patterns are now visible in inorganic materials synthesis, autonomous organic chemistry, protein design, antimicrobial discovery, and fusion-plasma control.

The bedrock model: what counts as scientific discovery?

At the most elementary level, a scientific discovery is not a text, a prediction, or a ranked list. It is a stable increase in reliable knowledge about the world. That requires at least four elements: a hypothesis, a predicted observable, an intervention or measurement, and a correction mechanism when the observation disagrees with the hypothesis.

AI can improve each element, but it cannot abolish the loop. If a model predicts a molecule, a material, a protein, or a plasma-control policy, that output remains a conjecture until it is connected to physical measurement. In this sense, AI is not a replacement for experiment; it is a means of changing the search strategy before experiment and the learning strategy after experiment.

The 2024 Nobel Prize in Chemistry is useful not as an appeal to authority, but as a signal that AI has crossed from computational assistance into core scientific infrastructure. The prize recognized computational protein design and protein-structure prediction: David Baker for designing new proteins, and Demis Hassabis and John Jumper for AlphaFold-based structure prediction, which the Nobel committee described as solving a decades-old protein-structure problem.²

From search to closed-loop discovery

The classical laboratory workflow is linear: choose a candidate, synthesize it, measure it, publish the result, and let other researchers iterate. The AI-enabled workflow is recursive. A model proposes candidates; a robot or experimental team tests them; instruments generate data; machine-learning models interpret the data; active learning selects the next experiment; and the loop continues until a stopping criterion is reached.

flowchart TD
    A[Scientific objective] --> B[Candidate generation]
    B --> C[Physics / chemistry / biology constraints]
    C --> D[Experiment planning]
    D --> E[Laboratory execution]
    E --> F[Measurement and characterization]
    F --> G[Model update]
    G --> B
    F --> H[Human mechanistic interpretation]
    H --> A

Figure 1: AI contributes most when it closes the loop between hypothesis generation, laboratory execution, measurement, and model correction.
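The recursive workflow in Figure 1 can be made concrete with a short sketch. It is a toy under stated assumptions, not the software of any laboratory discussed here: the integer candidate grid, the `measure` callback standing in for synthesis plus characterization, and the greedy rule "propose the untested neighbour of the best result so far" standing in for a trained model are all hypothetical.

```python
from typing import Callable


def closed_loop(
    candidates: list[int],
    measure: Callable[[int], float],
    budget: int,
    target: float,
) -> list[tuple[int, float]]:
    """Propose -> measure -> update, until the target is met or the budget runs out."""
    history: list[tuple[int, float]] = []
    untested = list(candidates)
    while untested and len(history) < budget:
        if history:
            # "Model update" (assumed, deliberately crude): move toward the
            # best measured candidate and try its nearest untested neighbour.
            best_x, _ = max(history, key=lambda h: h[1])
            x = min(untested, key=lambda c: abs(c - best_x))
        else:
            # No data yet: start in the middle of the candidate list.
            x = untested[len(untested) // 2]
        untested.remove(x)
        y = measure(x)  # laboratory execution and characterization
        history.append((x, y))
        if y >= target:  # stopping criterion
            break
    return history


def hidden_response(x: int) -> float:
    # Hypothetical ground truth that the loop cannot see; optimum at x = 3.
    return -(x - 3) ** 2


history = closed_loop(list(range(11)), hidden_response, budget=6, target=0)
print(history)
```

In this toy run the loop reaches the target after three of its six allowed experiments. A real system would replace the greedy rule with a surrogate model and an acquisition function, and would have to cope with measurement noise, which this sketch leaves out.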

The contribution of AI is therefore best understood as a set of operators applied to the discovery process:

Search compression: reduce a huge candidate space to a smaller set worth testing.
Surrogate modelling: approximate expensive simulations or experiments.
Active learning: choose the next experiment by expected information gain.
Instrument interpretation: convert spectra, diffraction patterns, microscopy, or sensor streams into structured evidence.
Robotic execution: perform experiments reproducibly and continuously.
Control: steer unstable physical systems in real time.
Scientific agency: coordinate literature search, code, protocols, instruments, and documentation.

The strongest cases combine several of these operators. A ranking model alone is useful; a ranking model connected to robotic experimentation and measurement is a different type of scientific machine.
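As an illustration of the active-learning operator, the sketch below ranks candidates by expected information gain. It assumes a Gaussian model, in which measuring a quantity with predictive variance σ² through an instrument with noise variance σ_n² yields ½ ln(1 + σ²/σ_n²) nats of information. The spread of an ensemble of surrogate predictions is used as a stand-in for σ², which is a common heuristic rather than a method taken from any of the papers cited here; the candidate names and numbers are invented.

```python
import math
from statistics import pvariance


def expected_information_gain(predictions: list[float], noise_var: float) -> float:
    """Information (in nats) gained by measuring one candidate, Gaussian assumption.

    `predictions` holds the outputs of several surrogate models for the same
    candidate; their population variance approximates predictive variance.
    """
    return 0.5 * math.log(1.0 + pvariance(predictions) / noise_var)


def pick_next(ensemble: dict[str, list[float]], noise_var: float = 0.01) -> str:
    """Return the candidate whose measurement is expected to be most informative."""
    return max(
        ensemble,
        key=lambda name: expected_information_gain(ensemble[name], noise_var),
    )


# Hypothetical ensemble predictions for three candidates.
ensemble = {
    "candidate_A": [1.0, 1.0, 1.0],  # models agree: little left to learn
    "candidate_B": [0.5, 1.5, 1.0],  # models disagree strongly
    "candidate_C": [0.9, 1.1, 1.0],  # mild disagreement
}
print(pick_next(ensemble))
```

Pure information gain ignores how promising a candidate is. Practical planners usually trade it off against predicted performance, cost, and safety, which is why the strongest cases combine several operators rather than relying on one.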

Case study: AI-guided discovery of kagome superconductors

The paper reports the experimental discovery of bulk superconductivity in YRu₃B₂ and LuRu₃B₂, two kagome-lattice compounds predicted through machine-learning-accelerated high-throughput screening combined with first-principles calculations.³

The logic is precise. Kagome lattices can host quasiflat electronic bands; quasiflat bands can increase density of states near the Fermi level; increased density of states can strengthen pairing channels under suitable interactions; therefore kagome compounds are plausible candidates for superconductivity. But that physical intuition does not identify a material by itself. The materials space is combinatorial. Machine learning is used as a filter over that space, while density-functional theory and electron-phonon calculations impose physics-based constraints before synthesis.

The experimental result is modest in critical temperature but strong in epistemic value. The compounds show superconducting transitions around 0.81 K for YRu₃B₂ and 0.95 K for LuRu₃B₂, with bulk superconductivity confirmed through magnetization, specific heat, and electrical transport. The paper also reports nearly complete superconducting volume fractions, making the result more than a surface or impurity effect.⁴

The discrepancy between prediction and measurement is as important as the positive discovery. The predicted critical temperatures were higher than the measured values. The authors explain the reduction by electronic and vibrational details: relative to LaRu₃Si₂, the Y and Lu borides show more dispersive Ru-derived quasiflat bands, lower density of states at the Fermi level, and harder phonon spectra, which reduce electron-phonon coupling. In other words, AI narrowed the search, but physics explained the error.
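To see why a weaker electron-phonon coupling depresses the transition temperature so sharply, a standard estimate for conventional superconductors is the Allen-Dynes form of the McMillan formula, Tc = (ω_log / 1.2) · exp[−1.04 (1 + λ) / (λ − μ*(1 + 0.62 λ))]. The text above does not state which expression the authors used, so the sketch below is only an illustration of the mechanism, and every input value in it is a placeholder rather than a number from the paper.

```python
import math


def allen_dynes_tc(omega_log_K: float, lam: float, mu_star: float = 0.10) -> float:
    """Allen-Dynes estimate of the critical temperature, in kelvin.

    omega_log_K: logarithmic-average phonon frequency, expressed in kelvin
    lam:         electron-phonon coupling constant (lambda)
    mu_star:     Coulomb pseudopotential (0.10 is a conventional choice, assumed here)
    """
    denom = lam - mu_star * (1.0 + 0.62 * lam)
    if denom <= 0:
        # Coupling too weak to overcome Coulomb repulsion in this estimate.
        return 0.0
    return (omega_log_K / 1.2) * math.exp(-1.04 * (1.0 + lam) / denom)


# Placeholder inputs: same phonon scale, two coupling strengths.
for lam in (0.6, 0.4):
    print(f"lambda = {lam:.1f} -> Tc estimate = {allen_dynes_tc(300.0, lam):.2f} K")
```

With these placeholder inputs the estimate falls from roughly 7 K to a little over 1 K when λ drops from 0.6 to 0.4 at a fixed phonon scale. Because λ sits inside the exponential, a modest loss of density of states at the Fermi level or a stiffening of the phonons can cost most of the predicted Tc, and this is the kind of physics-based check that sits between the machine-learning filter and the synthesis furnace.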

This is the correct division of labor. AI proposes; first-principles theory constrains; experiment decides; theory then repairs the model. The discovery is not "AI found a superconductor" in the crude sense. It is a prediction-to-realization pipeline in which AI made the search tractable.

The kagome-superconductor result is therefore not an isolated curiosity. It is one instance of a broader change in the organization of scientific work. Across several laboratory domains, AI is becoming useful when it is connected to experiment rather than separated from it: models propose, instruments measure, robotic systems execute, and the resulting data correct the next cycle of prediction.

Other laboratory fields where the same pattern is already visible

Inorganic materials and self-driving synthesis

The A-Lab at Berkeley/Lawrence Berkeley National Laboratory is one of the clearest examples of AI entering the physical laboratory. In the Nature paper introducing it, the system used computations, literature-derived knowledge, machine learning, active learning, robotics, and X-ray diffraction analysis to synthesize inorganic powders. Over 17 days of continuous operation, it realized 36 compounds from 57 targets, with recipes proposed by language models trained on synthesis literature and optimized through active learning.⁵

GNoME, Google DeepMind’s graph-network system for materials exploration, shows the computational side of the same pipeline. The Nature paper reports more than 2.2 million structures predicted to be stable with respect to previous work, including 381,000 new entries on the updated convex hull. Those predictions are not equivalent to synthesized materials; the authors explicitly note remaining problems such as phase transitions, dynamic stability, configurational entropy, and synthesizability. The scientific direction is nevertheless clear: AI expands the candidate universe, while autonomous labs test which candidates are physically realizable.⁶
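What "on the convex hull" means can be shown with a two-element toy system. The sketch below builds the lower convex hull of formation energy against composition and reports a candidate's energy above the hull; a negative value means the candidate would become a new hull entry. The compositions and energies are invented for illustration, and real phase diagrams are multi-component and are computed with dedicated materials-science tooling rather than this hand-rolled routine.

```python
Point = tuple[float, float]  # (fraction of element B, formation energy per atom in eV)


def lower_hull(points: list[Point]) -> list[Point]:
    """Lower convex hull of (composition, energy) points, ordered left to right."""
    # Keep only the lowest-energy entry at each composition.
    best: dict[float, float] = {}
    for x, e in points:
        best[x] = min(e, best.get(x, e))
    hull: list[Point] = []
    for p in sorted(best.items()):
        # Drop the last vertex while it lies on or above the chord to p.
        while len(hull) >= 2:
            (x1, y1), (x2, y2) = hull[-2], hull[-1]
            cross = (x2 - x1) * (p[1] - y1) - (y2 - y1) * (p[0] - x1)
            if cross <= 0:
                hull.pop()
            else:
                break
        hull.append(p)
    return hull


def energy_above_hull(x: float, energy: float, hull: list[Point]) -> float:
    """Energy of a candidate relative to the hull at the same composition."""
    for (x1, y1), (x2, y2) in zip(hull, hull[1:]):
        if x1 <= x <= x2:
            on_hull = y1 + (y2 - y1) * (x - x1) / (x2 - x1)
            return energy - on_hull
    raise ValueError("composition outside the range spanned by the hull")


# Hypothetical known phases: two elements and two compounds.
known = [(0.0, 0.0), (0.25, -0.1), (0.5, -0.4), (1.0, 0.0)]
hull = lower_hull(known)
print(hull)
print(energy_above_hull(0.25, -0.1, hull))  # positive: metastable against decomposition
print(energy_above_hull(0.75, -0.3, hull))  # negative: would extend the hull
```

Hull stability of this kind is a zero-temperature thermodynamic statement about computed energies. It says nothing about vibrational stability, entropy at synthesis temperature, or whether a practical route to the phase exists, which is why the caveats listed above matter.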

Organic chemistry and autonomous experimental execution

Coscientist, reported in Nature, is an LLM-driven system that designs, plans, and performs chemistry experiments using tools such as documentation search, code execution, and laboratory automation. The reported demonstrations include reaction optimization for palladium-catalysed cross-couplings. The relevant point is not that an LLM understands chemistry like a chemist; the concrete advance is orchestration. It connects protocol reasoning, instrument control, and experimental feedback into a semi-autonomous workflow.⁷

This matters because much of chemistry is not only theory but procedural knowledge: temperatures, solvents, purification steps, failed routes, instrument settings, and tacit heuristics. A model connected to the literature and to automated execution can turn such knowledge into a more searchable design space.

Protein structure, protein design, and wet-lab validation

Protein science is the field where the AI-to-lab pathway is most visible to non-specialists. AlphaFold changed the cost of moving from sequence to plausible structure, while generative systems such as RFdiffusion move in the opposite direction: from desired structure or function toward candidate proteins.

The RFdiffusion Nature paper reports experimental characterization of hundreds of designed symmetric assemblies, metal-binding proteins, and protein binders, including cryo-EM confirmation of a designed binder in complex with influenza haemagglutinin that closely matched the design model.⁸ This is not just computational annotation; it is generative design followed by laboratory expression, structural characterization, and functional testing.

The first-principles framing is simple. Proteins are physical polymers. Their function depends heavily on three-dimensional structure and interaction surfaces. If a model can learn constraints mapping sequence, structure, and binding geometry, it can propose objects that did not exist in nature. The wet lab remains decisive because folding, expression, solubility, affinity, immunogenicity, and cellular context are not fully captured by the model.

Antibiotic discovery

AI is also changing antimicrobial discovery, where the search space is enormous and conventional screening is slow. A 2025 Nature Microbiology study used deep learning to mine 233 archaeal proteomes, identifying 12,623 candidate antimicrobial molecules. The researchers synthesized 80 archaeasins, reported antimicrobial activity for 93% of them in vitro against several pathogens, and validated archaeasin-73 in mouse infection models against A. baumannii.⁹

This is a strong laboratory example because the AI contribution is bounded and testable. It does not claim to solve antibiotic resistance. It identifies candidate peptides from a biological search space that would be inefficient to explore manually, then subjects them to synthesis, in vitro assays, and in vivo validation. The discovery claim is credible only because the computational filter is followed by biological measurement.

Fusion-plasma control

AI contributes not only by proposing new matter but by controlling difficult experimental regimes. Deep reinforcement learning has been used to control tokamak plasmas. In one Nature study, an RL-designed magnetic controller learned in simulation and was experimentally verified on the TCV tokamak, controlling diverse plasma configurations, including advanced shapes and two simultaneous plasma droplets.¹⁰

A later Nature paper applied AI control to tearing-instability avoidance in DIII-D, the largest magnetic fusion facility in the United States. The controller maintained tearing likelihood under a threshold even under difficult operating conditions, enabling the plasma to track a stable path in time-varying operational space.¹¹

This is discovery-adjacent rather than discovery in the materials sense. The AI system does not discover a molecule or compound; it expands the reachable experimental state space. That can still accelerate science because many physical questions can only be asked if the apparatus can be held in the required regime long enough to measure it.

What is really new?

The novelty is not that computers assist science. Numerical simulation, database search, and statistical inference have been part of science for decades. The novelty is the integration of three layers that were historically separated.

First, AI models can represent high-dimensional candidate spaces: crystal structures, molecules, proteins, recipes, spectra, or control policies. Second, laboratory automation can execute enough experiments to make active learning meaningful. Third, the feedback loop can be closed, so failed experiments are not merely negative results but training signals.

This creates a different discovery regime. In the old regime, the bottleneck was often human candidate selection. In the new regime, the bottleneck shifts toward measurement quality, robotic reliability, data provenance, safety constraints, and the ability to distinguish genuine novelty from database artefacts.

Limits and failure modes

The central risk is category confusion. A prediction is not a discovery. A stable point on a computed convex hull is not a synthesized material. A designed binder is not a therapeutic. An antimicrobial peptide active in a mouse model is not an approved antibiotic. A tokamak controller validated in one device is not a commercial fusion reactor.

There are also technical failure modes. Materials models can misjudge synthesizability. Biological models can miss toxicity, dynamics, post-translational modifications, or cellular context. LLM-based laboratory agents can generate plausible but invalid protocols unless constrained by tools, safety rules, and verification. Autonomous laboratories can optimize what they measure rather than what scientists actually care about.

Therefore, the correct epistemic stance is neither rejection nor hype. AI should be treated as a new experimental architecture whose outputs must be audited through physical measurement, mechanistic interpretation, and reproducibility.

The laboratory as an AI-native architecture

The AI-native laboratory has six technical layers, supervised by human governance.

flowchart TB
    D[Data layer<br/>literature, databases, prior experiments] --> M[Model layer<br/>surrogates, generators, predictors]
    M --> P[Planning layer<br/>active learning, constraints, safety checks]
    P --> R[Robotic / procedural execution]
    R --> I[Instrumentation<br/>XRD, spectroscopy, microscopy, assays, sensors]
    I --> Q[Quality and provenance layer<br/>calibration, metadata, uncertainty]
    Q --> D
    Q --> H[Human governance<br/>mechanism, ethics, publication, replication]
    H --> P

Figure 2: Reference architecture for an AI-native laboratory.

The data layer contains literature, experimental logs, structures, assays, spectra, and failed attempts. The model layer proposes candidates or policies. The planning layer chooses experiments under cost, safety, and uncertainty constraints. The execution layer performs experiments. The instrumentation layer measures outcomes. The quality layer records provenance, calibration, uncertainty, and negative results. Human scientists remain necessary because model optimization is not equivalent to scientific understanding.
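The quality and provenance layer is the easiest to underspecify, so a minimal sketch of what it might record is given here. The field names and the admission rule are assumptions made for illustration, not a published schema: the idea is only that a failed attempt is kept as a training signal, while a measurement without calibration or uncertainty is kept out of the data layer.

```python
from dataclasses import dataclass
from typing import Optional


@dataclass(frozen=True)
class ExperimentRecord:
    """One pass through the loop, as the quality layer might store it (hypothetical schema)."""

    candidate_id: str
    protocol_id: str
    instrument: str
    calibration_id: Optional[str]  # None if the instrument state was not recorded
    value: Optional[float]         # measured quantity, if any
    uncertainty: Optional[float]   # one-sigma uncertainty on `value`
    succeeded: bool                # False for a failed synthesis or null result


def usable_for_training(rec: ExperimentRecord) -> bool:
    """Admit a record to the data layer only if its provenance is complete."""
    if rec.calibration_id is None:
        return False  # an uncalibrated measurement is not evidence
    if rec.succeeded:
        # A positive result must carry both a value and its uncertainty.
        return rec.value is not None and rec.uncertainty is not None
    return True  # a documented failure is still a training signal


records = [
    ExperimentRecord("cand-001", "prot-A", "XRD", "cal-17", 0.92, 0.03, True),
    ExperimentRecord("cand-002", "prot-A", "XRD", "cal-17", None, None, False),
    ExperimentRecord("cand-003", "prot-B", "XRD", None, 0.88, 0.05, True),
]
print([r.candidate_id for r in records if usable_for_training(r)])
```

A rule of this kind is a governance decision as much as a technical one, which is one reason the architecture routes the quality layer to human scientists as well as back to the data layer.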

Conclusion

AI contributes to scientific discovery when it makes the conjecture-test-correction loop faster, broader, and more instrumented. The kagome-superconductor case shows the pattern in quantum materials: machine learning and first-principles calculations selected candidates; laboratory synthesis and measurements established the fact; physical theory explained the mismatch between predicted and observed critical temperatures.

The same structure is visible in self-driving inorganic synthesis, autonomous organic chemistry, protein design, antimicrobial discovery, and fusion-plasma control. The common denominator is not AI creativity. It is controlled search under constraints, joined to physical feedback.

The next scientific advantage will therefore belong less to isolated models than to institutions that can build reliable discovery loops: curated data, calibrated instruments, automated laboratories, uncertainty-aware models, safety constraints, and scientists able to turn model output into mechanistic knowledge.

Footnotes

1. Mustaf, R. A., Sajilesh, S. K. P., Mishra, S., Deng, J., Jiang, Y., Hiorth, K. H., Lamponen, E. O., Gutierrez-Amigo, M., Törmä, P., Marques, M. A. L., Bernevig, B. A., & Morosan, E. (2026). Machine-learning-guided discovery of kagome superconductors YRu₃B₂ and LuRu₃B₂. Physical Review Research, 8, 023308.
2. The Royal Swedish Academy of Sciences. (2024). Press release: The Nobel Prize in Chemistry 2024. NobelPrize.org.
3. Mustaf, R. A., Sajilesh, S. K. P., Mishra, S., Deng, J., Jiang, Y., Hiorth, K. H., Lamponen, E. O., Gutierrez-Amigo, M., Törmä, P., Marques, M. A. L., Bernevig, B. A., & Morosan, E. (2026). Machine-learning-guided discovery of kagome superconductors YRu₃B₂ and LuRu₃B₂. Physical Review Research, 8, 023308.
4. Mustaf, R. A., Sajilesh, S. K. P., Mishra, S., Deng, J., Jiang, Y., Hiorth, K. H., Lamponen, E. O., Gutierrez-Amigo, M., Törmä, P., Marques, M. A. L., Bernevig, B. A., & Morosan, E. (2026). Machine-learning-guided discovery of kagome superconductors YRu₃B₂ and LuRu₃B₂. Physical Review Research, 8, 023308.
5. Szymanski, N. J., Rendy, B., Fei, Y., Kumar, R. E., He, T., Milsted, D., McDermott, M. J., Gallant, M., Cubuk, E. D., Merchant, A., Kim, H., Jain, A., Bartel, C. J., Persson, K. A., Zeng, Y., & Ceder, G. (2023). An autonomous laboratory for the accelerated synthesis of inorganic materials. Nature, 624, 86–91.
6. Merchant, A., Batzner, S., Schoenholz, S. S., Aykol, M., Cheon, G., & Cubuk, E. D. (2023). Scaling deep learning for materials discovery. Nature, 624, 80–85.
7. Boiko, D. A., MacKnight, R., Kline, B., & Gomes, G. (2023). Autonomous chemical research with large language models. Nature, 624, 570–578.
8. Watson, J. L., Juergens, D., Bennett, N. R., Trippe, B. L., Yim, J., Eisenach, H. E., Ahern, W., Borst, A. J., Ragotte, R. J., Milles, L. F., Wicky, B. I. M., Hanikel, N., Pellock, S. J., Courbet, A., Sheffler, W., Wang, J., Venkatesh, P., Sappington, I., Torres, S. V., Lauko, A., De Bortoli, V., Mathieu, E., Ovchinnikov, S., Barzilay, R., Jaakkola, T. S., DiMaio, F., Baek, M., & Baker, D. (2023). De novo design of protein structure and function with RFdiffusion. Nature, 620, 1089–1100.
9. Torres, M. D. T., Wan, F., & de la Fuente-Nunez, C. (2025). Deep learning reveals antibiotics in the archaeal proteome. Nature Microbiology, 10, 2153–2167.
10. Degrave, J., Felici, F., Buchli, J., Neunert, M., Tracey, B. D., Carpanese, F., Ewalds, T., Hafner, R., Abdolmaleki, A., de Las Casas, D., Donner, C., Fritz, L., Galperti, C., Huber, A., Keeling, J., Tsimpoukelli, M., Kay, J., Merle, A., Moret, J.-M., Noury, S., Pesamosca, F., Pfau, D., Sauter, O., Sommariva, C., Coda, S., Duval, B., Fasoli, A., Kohli, P., Kavukcuoglu, K., Hassabis, D., & Riedmiller, M. (2022). Magnetic control of tokamak plasmas through deep reinforcement learning. Nature, 602, 414–419.
11. Seo, J., Kim, S., Jalalvand, A., Conlin, R., Rothstein, A., Abbate, J., Erickson, K., Wai, J., Shousha, R., & Kolemen, E. (2024). Avoiding fusion plasma tearing instability with deep reinforcement learning. Nature, 626, 746–751.
