An analysis of how artificial intelligence is changing scientific discovery by closing the loop between prediction, laboratory experimentation, measurement, and model correction, with examples from superconductors, self-driving laboratories, protein design, antibiotics, and fusion-plasma control.
Introduction
For more than a century, the discovery of superconductors has been only partially theory-driven. The physical phenomenon is well defined: below a critical temperature, electrical resistance vanishes and magnetic flux is expelled. Yet the route from that definition to a new material is not deductive. Even after BCS theory clarified the role of electron-phonon coupling in conventional superconductivity, the actual discovery of new superconducting families remained largely empirical, contingent, and often serendipitous. Cuprates, pnictides, heavy-fermion superconductors, organic superconductors, nickelates, and even MgB2 were not found by a complete predictive theory that pointed directly to a compound before experimental work. The space of possible materials is too large, the relevant interactions are too complex, and accurate first-principles calculations are too expensive to apply exhaustively.
The paper at the center of this article is interesting because it reports a different kind of discovery pipeline. In "Machine-learning-guided discovery of kagome superconductors YRu3B2 and LuRu3B2", Rose Albu Mustaf and co-authors report the experimental discovery of bulk superconductivity in two kagome-lattice compounds, YRu3B2 and LuRu3B2. The compounds were not invented by the model in isolation: they were selected through machine-learning-accelerated high-throughput screening, then refined through first-principles calculations, synthesized, and experimentally characterized. The measured superconducting critical temperatures are low, 0.81 K for YRu3B2 and 0.95 K for LuRu3B2, but the epistemic result is stronger than the temperature scale might suggest: superconductivity was confirmed by magnetization, specific heat, and electrical transport measurements, with nearly complete superconducting volume fractions.
The material class matters. Kagome lattices are geometrically distinctive because their electronic structures can host quasiflat bands, and quasiflat bands may enhance the density of states near the Fermi level. In a superconductor, density of states, phonon spectra, electron-phonon coupling, orbital character, and band geometry all affect whether a superconducting phase appears and what critical temperature it can reach. The paper therefore does not merely say that a machine-learning model ranked candidates. It connects a specific structural motif, the Ru kagome network, to a specific physical hypothesis about superconductivity; then it tests that hypothesis through the full chain of synthesis, diffraction, thermodynamics, transport, and density-functional-theory-based interpretation.
The most important feature of the result is its division of labor. Machine learning reduced the candidate space. First-principles calculations imposed physical constraints. Laboratory synthesis converted a candidate into matter. Magnetization, heat-capacity, and resistivity measurements established the superconducting transition. Subsequent electronic-structure and phonon calculations explained why the observed critical temperatures were lower than the original predictions: compared with LaRu3Si2, the Y and Lu borides have more dispersive Ru-derived quasiflat bands, a reduced density of states at the Fermi level, and harder phonon spectra, all of which weaken electron-phonon coupling. The error is therefore not an embarrassment to the method; it is part of the scientific loop. Prediction, deviation, measurement, and mechanistic correction belong to the same process.
This is the broader point. Artificial intelligence contributes to scientific discovery when it changes the economics and architecture of the discovery loop: it reduces the cost of searching large hypothesis spaces, compresses experimental data into usable signals, selects the next experiment, and, increasingly, controls laboratory equipment. But a scientific claim still becomes knowledge only when it survives contact with measurement, replication, and mechanistic criticism. The kagome-superconductor case is therefore a useful anchor for a wider argument: AI is not replacing experiment with prediction; it is reorganizing scientific discovery into a tighter cybernetic system in which models, simulations, instruments, robots, and human scientists form a feedback loop. Comparable closed-loop patterns are now visible in inorganic materials synthesis, autonomous organic chemistry, protein design, antimicrobial discovery, and fusion-plasma control.
The bedrock model: what counts as scientific discovery?
At the most elementary level, a scientific discovery is not a text, a prediction, or a ranked list. It is a stable increase in reliable knowledge about the world. That requires at least four elements: a hypothesis, a predicted observable, an intervention or measurement, and a correction mechanism when the observation disagrees with the hypothesis.
AI can improve each element, but it cannot abolish the loop. If a model predicts a molecule, a material, a protein, or a plasma-control policy, that output remains a conjecture until it is connected to physical measurement. In this sense, AI is not a replacement for experiment; it is a means of changing the search strategy before experiment and the learning strategy after experiment.
The 2024 Nobel Prize in Chemistry is useful not as an appeal to authority, but as a signal that AI has crossed from computational assistance into core scientific infrastructure. The prize recognized computational protein design and protein-structure prediction: David Baker for designing new proteins, and Demis Hassabis and John Jumper for AlphaFold-based structure prediction, which the Nobel committee described as solving a decades-old protein-structure problem.
From search to closed-loop discovery
The classical laboratory workflow is linear: choose a candidate, synthesize it, measure it, publish the result, and let other researchers iterate. The AI-enabled workflow is recursive. A model proposes candidates; a robot or experimental team tests them; instruments generate data; machine-learning models interpret the data; active learning selects the next experiment; and the loop continues until a stopping criterion is reached.
The contribution of AI is therefore best understood as a set of operators applied to the discovery process:
- Search compression: reduce a huge candidate space to a smaller set worth testing.
- Surrogate modelling: approximate expensive simulations or experiments.
- Active learning: choose the next experiment by expected information gain.
- Instrument interpretation: convert spectra, diffraction patterns, microscopy, or sensor streams into structured evidence.
- Robotic execution: perform experiments reproducibly and continuously.
- Control: steer unstable physical systems in real time.
- Scientific agency: coordinate literature search, code, protocols, instruments, and documentation.
The strongest cases combine several of these operators. A ranking model alone is useful; a ranking model connected to robotic experimentation and measurement is a different type of scientific machine.
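The recursive workflow can be made concrete with a toy loop. The following Python sketch is illustrative only: `run_experiment` is a hypothetical stand-in for synthesis and measurement, the nearest-neighbour surrogate is deliberately crude, and the selection rule is a simple optimistic score (estimate plus distance-based uncertainty) rather than a true expected-information-gain calculation. None of the names or numbers come from the systems discussed in this article.

```python
import random

def run_experiment(x, rng):
    """Hypothetical stand-in for synthesis plus measurement (noisy response)."""
    return -(x - 0.62) ** 2 + rng.gauss(0.0, 0.005)

def predict(x, observations):
    """Crude surrogate: value of the nearest tested point, distance as uncertainty."""
    nearest_x, nearest_y = min(observations, key=lambda obs: abs(obs[0] - x))
    return nearest_y, abs(nearest_x - x)

def choose_next(candidates, observations, kappa=1.0):
    """Pick the untested candidate with the highest optimistic score."""
    tested = {x for x, _ in observations}
    untested = [x for x in candidates if x not in tested]

    def score(x):
        estimate, uncertainty = predict(x, observations)
        return estimate + kappa * uncertainty

    return max(untested, key=score)

def closed_loop(candidates, budget, seed=0):
    """Propose, test, record, and repeat until the experiment budget is spent."""
    rng = random.Random(seed)
    first = candidates[0]
    observations = [(first, run_experiment(first, rng))]
    while len(observations) < min(budget, len(candidates)):
        x = choose_next(candidates, observations)
        observations.append((x, run_experiment(x, rng)))
    return observations

candidates = [i / 50 for i in range(51)]
history = closed_loop(candidates, budget=12)
best_x, best_y = max(history, key=lambda obs: obs[1])
print(f"best candidate so far: x={best_x:.2f}")
```

Each pass through the loop uses every earlier measurement, including disappointing ones, to decide where to look next. A real system would replace the stand-ins with a trained model, a robot, and an instrument.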
Case study: AI-guided discovery of kagome superconductors
The paper reports the experimental discovery of bulk superconductivity in YRu3B2 and LuRu3B2, two kagome-lattice compounds predicted through machine-learning-accelerated high-throughput screening combined with first-principles calculations.
The logic is precise. Kagome lattices can host quasiflat electronic bands; quasiflat bands can increase density of states near the Fermi level; increased density of states can strengthen pairing channels under suitable interactions; therefore kagome compounds are plausible candidates for superconductivity. But that physical intuition does not identify a material by itself. The materials space is combinatorial. Machine learning is used as a filter over that space, while density-functional theory and electron-phonon calculations impose physics-based constraints before synthesis.
The experimental result is modest in critical temperature but strong in epistemic value. The compounds show superconducting transitions around 0.81 K for YRu3B2 and 0.95 K for LuRu3B2, with bulk superconductivity confirmed through magnetization, specific heat, and electrical transport. The paper also reports nearly complete superconducting volume fractions, making the result more than a surface or impurity effect.
The discrepancy between prediction and measurement is as important as the positive discovery. The predicted critical temperatures were higher than the measured values. The authors explain the reduction by electronic and vibrational details: relative to LaRu3Si2, the Y and Lu borides show more dispersive Ru-derived quasiflat bands, lower density of states at the Fermi level, and harder phonon spectra, which reduce electron-phonon coupling. In other words, AI narrowed the search, but physics explained the error.
This is the correct division of labor. AI proposes; first-principles theory constrains; experiment decides; theory then repairs the model. The discovery is not "AI found a superconductor" in the crude sense. It is a prediction-to-realization pipeline in which AI made the search tractable.
The kagome-superconductor result is therefore not an isolated curiosity. It is one instance of a broader change in the organization of scientific work. Across several laboratory domains, AI is becoming useful when it is connected to experiment rather than separated from it: models propose, instruments measure, robotic systems execute, and the resulting data correct the next cycle of prediction.
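The direction of the correction described in this case study, in which weaker electron-phonon coupling lowers the critical temperature even when the phonons are harder, can be illustrated with the Allen-Dynes form of the McMillan formula. This is a standard estimate of the critical temperature of phonon-mediated superconductors from the electron-phonon coupling constant λ, a logarithmic average phonon frequency, and a Coulomb pseudopotential μ*. Whether the authors used this exact expression is an assumption here, and the input values below are invented for illustration rather than taken from the paper.

```python
import math

def allen_dynes_tc(lam, omega_log_k, mu_star=0.10):
    """Allen-Dynes-modified McMillan estimate of Tc in kelvin.

    lam          electron-phonon coupling constant (dimensionless)
    omega_log_k  logarithmic average phonon frequency, expressed in kelvin
    mu_star      Coulomb pseudopotential (0.10 is a common textbook choice)
    """
    denominator = lam - mu_star * (1.0 + 0.62 * lam)
    if denominator <= 0.0:
        return 0.0  # outside the range where the formula is meaningful
    return (omega_log_k / 1.2) * math.exp(-1.04 * (1.0 + lam) / denominator)

# Illustrative inputs only (assumptions, not values from the paper)
tc_soft = allen_dynes_tc(lam=0.60, omega_log_k=200.0)  # softer phonons, stronger coupling
tc_hard = allen_dynes_tc(lam=0.40, omega_log_k=300.0)  # harder phonons, weaker coupling
print(f"{tc_soft:.2f} K vs {tc_hard:.2f} K")
```

With these made-up inputs the estimate falls several-fold even though the phonon frequency scale rises, because λ enters through the exponent while the frequency enters only as a prefactor.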
Other laboratory fields where the same pattern is already visible
Inorganic materials and self-driving synthesis
The A-Lab at Berkeley/Lawrence Berkeley National Laboratory is one of the clearest examples of AI entering the physical laboratory. In the Nature paper introducing it, the system used computations, literature-derived knowledge, machine learning, active learning, robotics, and X-ray diffraction analysis to synthesize inorganic powders. Over 17 days of continuous operation, it realized 36 compounds from 57 targets, with recipes proposed by language models trained on synthesis literature and optimized through active learning.
GNoME, Google DeepMind’s graph-network system for materials exploration, shows the computational side of the same pipeline. The Nature paper reports more than 2.2 million structures predicted stable relative to previous work, with 381,000 new entries on the updated convex hull. Those predictions are not equivalent to manufactured materials; the authors explicitly note remaining problems such as phase transitions, dynamic stability, configurational entropy, and synthesizability. The scientific direction is nevertheless clear: AI expands the candidate universe, while autonomous labs test which candidates are physically realizable.
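What it means for a structure to sit on, above, or below a convex hull can be shown with a toy binary system. The sketch below is a simplification under stated assumptions: real stability analyses work in multi-component composition spaces with computed energies, whereas this example uses a two-element system A(1-x)B(x) with invented formation energies per atom. The function names are hypothetical.

```python
def cross(o, a, b):
    """Positive when the path o -> a -> b turns counter-clockwise."""
    return (a[0] - o[0]) * (b[1] - o[1]) - (a[1] - o[1]) * (b[0] - o[0])

def lower_hull(phases):
    """Lower convex hull of (fraction_of_B, formation_energy) points."""
    lowest = {}
    for x, e in phases:
        lowest[x] = min(e, lowest.get(x, e))  # keep the best phase per composition
    hull = []
    for p in sorted(lowest.items()):
        while len(hull) >= 2 and cross(hull[-2], hull[-1], p) <= 0:
            hull.pop()  # previous point lies on or above the new edge
        hull.append(p)
    return hull

def energy_above_hull(x, e, hull):
    """Energy of a candidate relative to the hull at its composition."""
    for (x0, e0), (x1, e1) in zip(hull, hull[1:]):
        if x0 <= x <= x1:
            return e - (e0 + (e1 - e0) * (x - x0) / (x1 - x0))
    raise ValueError("composition outside the range covered by the hull")

# Invented energies (eV per atom); the pure elements define zero
known_phases = [(0.0, 0.0), (0.25, -0.10), (0.50, -0.40), (0.75, -0.30), (1.0, 0.0)]
hull = lower_hull(known_phases)

unstable = energy_above_hull(0.25, -0.10, hull)    # positive: decomposes into neighbours
new_entry = energy_above_hull(0.25, -0.25, hull)   # negative: lies below the known hull
print(hull, round(unstable, 3), round(new_entry, 3))
```

A candidate with a negative value lies below the previously known hull, which is the sense in which a prediction is "stable relative to previous work". Whether it can actually be made is a separate question that only synthesis answers.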
Organic chemistry and autonomous experimental execution
Coscientist, reported in Nature, is an LLM-driven system that designs, plans, and performs chemistry experiments using tools such as documentation search, code execution, and laboratory automation. The reported demonstrations include reaction optimization for palladium-catalysed cross-couplings. The relevant point is not that an LLM understands chemistry like a chemist; the concrete advance is orchestration. It connects protocol reasoning, instrument control, and experimental feedback into a semi-autonomous workflow.
This matters because much of chemistry is not only theory but procedural knowledge: temperatures, solvents, purification steps, failed routes, instrument settings, and tacit heuristics. A model connected to the literature and to automated execution can turn such knowledge into a more searchable design space.
Protein structure, protein design, and wet-lab validation
Protein science is the field where the AI-to-lab pathway is most visible to non-specialists. AlphaFold changed the cost of moving from sequence to plausible structure, while generative systems such as RFdiffusion move in the opposite direction: from desired structure or function toward candidate proteins.
The RFdiffusion Nature paper reports experimental characterization of hundreds of designed symmetric assemblies, metal-binding proteins, and protein binders, including cryo-EM confirmation of a designed binder in complex with influenza haemagglutinin that closely matched the design model. This is not just computational annotation; it is generative design followed by laboratory expression, structural characterization, and functional testing.
The first-principles framing is simple. Proteins are physical polymers. Their function depends heavily on three-dimensional structure and interaction surfaces. If a model can learn constraints mapping sequence, structure, and binding geometry, it can propose objects that did not exist in nature. The wet lab remains decisive because folding, expression, solubility, affinity, immunogenicity, and cellular context are not fully captured by the model.
Antibiotic discovery
AI is also changing antimicrobial discovery, where the search space is enormous and conventional screening is slow. A 2025 Nature Microbiology study used deep learning to mine 233 archaeal proteomes, identifying 12,623 candidate antimicrobial molecules. The researchers synthesized 80 archaeasins, reported antimicrobial activity for 93% of them in vitro against several pathogens, and validated archaeasin-73 in mouse infection models against A. baumannii.
This is a strong laboratory example because the AI contribution is bounded and testable. It does not claim to solve antibiotic resistance. It identifies candidate peptides from a biological search space that would be inefficient to explore manually, then subjects them to synthesis, in vitro assays, and in vivo validation. The discovery claim is credible only because the computational filter is followed by biological measurement.
Fusion-plasma control
AI contributes not only by proposing new matter but by controlling difficult experimental regimes. Deep reinforcement learning has been used to control tokamak plasmas. In one Nature study, an RL-designed magnetic controller learned in simulation and was experimentally verified on the TCV tokamak, controlling diverse plasma configurations, including advanced shapes and two simultaneous plasma droplets.
A later Nature paper applied AI control to tearing-instability avoidance in DIII-D, the largest magnetic fusion facility in the United States. The controller maintained tearing likelihood under a threshold even under difficult operating conditions, enabling the plasma to track a stable path in time-varying operational space.
This is discovery-adjacent rather than discovery in the materials sense. The AI system does not discover a molecule or compound; it expands the reachable experimental state space. That can still accelerate science because many physical questions can only be asked if the apparatus can be held in the required regime long enough for measurements to be taken.
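The idea of keeping a predicted instability likelihood under a threshold while still pursuing performance can be sketched as constrained action selection. The code below is not the published controller, which is a trained reinforcement-learning policy; it is a greedy one-step filter with hypothetical toy models (`toy_risk`, `toy_performance`) that stand in for learned predictors and contain no tokamak physics.

```python
def choose_action(state, actions, predict_risk, predict_performance, threshold):
    """Greedy one-step choice: best predicted performance among safe actions."""
    safe = [a for a in actions if predict_risk(state, a) < threshold]
    if not safe:
        # No action satisfies the constraint: fall back to the least risky one.
        return min(actions, key=lambda a: predict_risk(state, a))
    return max(safe, key=lambda a: predict_performance(state, a))

# Hypothetical toy models (assumptions for illustration only)
def toy_risk(state, power):
    """Predicted instability likelihood grows with actuator power."""
    return min(1.0, state["sensitivity"] * power)

def toy_performance(state, power):
    """Predicted performance proxy: more power is better."""
    return power

power_levels = [0.2, 0.4, 0.6, 0.8, 1.0]
calm = choose_action({"sensitivity": 0.7}, power_levels, toy_risk, toy_performance, 0.5)
touchy = choose_action({"sensitivity": 1.0}, power_levels, toy_risk, toy_performance, 0.5)
extreme = choose_action({"sensitivity": 3.0}, power_levels, toy_risk, toy_performance, 0.5)
print(calm, touchy, extreme)
```

As the toy plasma becomes more sensitive, the selected power level drops, and when nothing is safe the rule retreats to the least risky option. The constraint shapes which operating points remain reachable.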
What is really new?
The novelty is not that computers assist science. Numerical simulation, database search, and statistical inference have been part of science for decades. The novelty is the integration of three layers that were historically separated.
First, AI models can represent high-dimensional candidate spaces: crystal structures, molecules, proteins, recipes, spectra, or control policies. Second, laboratory automation can execute enough experiments to make active learning meaningful. Third, the feedback loop can be closed, so failed experiments are not merely negative results but training signals.
This creates a different discovery regime. In the old regime, the bottleneck was often human candidate selection. In the new regime, the bottleneck shifts toward measurement quality, robotic reliability, data provenance, safety constraints, and the ability to distinguish genuine novelty from database artefacts.
Limits and failure modes
The central risk is category confusion. A prediction is not a discovery. A stable point on a computed convex hull is not a synthesized material. A designed binder is not a therapeutic. An antimicrobial peptide active in a mouse model is not an approved antibiotic. A tokamak controller validated in one device is not a commercial fusion reactor.
There are also technical failure modes. Materials models can misjudge synthesizability. Biological models can miss toxicity, dynamics, post-translational modifications, or cellular context. LLM-based laboratory agents can generate plausible but invalid protocols unless constrained by tools, safety rules, and verification. Autonomous laboratories can optimize what they measure rather than what scientists actually care about.
Therefore, the correct epistemic stance is neither rejection nor hype. AI should be treated as a new experimental architecture whose outputs must be audited through physical measurement, mechanistic interpretation, and reproducibility.
The laboratory as an AI-native architecture
The AI-native laboratory has six core layers.
The data layer contains literature, experimental logs, structures, assays, spectra, and failed attempts. The model layer proposes candidates or policies. The planning layer chooses experiments under cost, safety, and uncertainty constraints. The execution layer performs experiments. The instrumentation layer measures outcomes. The quality layer records provenance, calibration, uncertainty, and negative results. Human scientists remain necessary because model optimization is not equivalent to scientific understanding.
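One way to picture how these layers meet is a single record that every experiment leaves behind. The sketch below is a hypothetical schema, not one used by any laboratory named in this article: the field names, identifiers, and example values are all assumptions chosen to show provenance, calibration, and failed runs being kept alongside successes.

```python
import json
from dataclasses import dataclass, field, asdict
from datetime import datetime, timezone
from typing import Optional

@dataclass
class ExperimentRecord:
    candidate_id: str            # data layer: what was tested
    proposed_by: str             # model layer: which model or person proposed it
    protocol: dict               # planning layer: chosen recipe and constraints
    executed_by: str             # execution layer: robot or operator
    instrument: str              # instrumentation layer: what measured the outcome
    calibration_id: str          # quality layer: calibration in force at the time
    outcome: str = "pending"     # "success", "failure", or "pending"
    measurement: Optional[dict] = None
    recorded_at: str = field(
        default_factory=lambda: datetime.now(timezone.utc).isoformat()
    )

def training_rows(records):
    """Keep every finished run, failures included, as a training signal."""
    return [asdict(r) for r in records if r.outcome != "pending"]

# Invented example entries
log = [
    ExperimentRecord("cand-001", "ranker-v1", {"anneal_c": 900}, "robot-A",
                     "xrd-1", "cal-07", outcome="failure",
                     measurement={"target_phase_fraction": 0.0}),
    ExperimentRecord("cand-002", "ranker-v1", {"anneal_c": 1000}, "robot-A",
                     "xrd-1", "cal-07", outcome="success",
                     measurement={"target_phase_fraction": 0.9}),
    ExperimentRecord("cand-003", "ranker-v1", {"anneal_c": 1100}, "robot-A",
                     "xrd-1", "cal-07"),
]
rows = training_rows(log)
print(json.dumps(rows, indent=2))
```

Keeping the failed run in `rows` is the design choice that matters: a negative result with known calibration and protocol is usable evidence for the next round of model fitting.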
Conclusion
AI contributes to scientific discovery when it makes the conjecture-test-correction loop faster, broader, and more instrumented. The kagome-superconductor case shows the pattern in quantum materials: machine learning and first-principles calculations selected candidates; laboratory synthesis and measurements established the fact; physical theory explained the mismatch between predicted and observed critical temperatures.
The same structure is visible in self-driving inorganic synthesis, autonomous organic chemistry, protein design, antimicrobial discovery, and fusion-plasma control. The common denominator is not AI creativity. It is controlled search under constraints, joined to physical feedback.
The next scientific advantage will therefore belong less to isolated models than to institutions that can build reliable discovery loops: curated data, calibrated instruments, automated laboratories, uncertainty-aware models, safety constraints, and scientists able to turn model output into mechanistic knowledge.