The Leiden Declaration and the Governance of AI-Assisted Mathematics

How mathematicians are answering the new cognitive infrastructure of proof

The Leiden Declaration on Artificial Intelligence and Mathematics is a significant collective response from mathematicians to the rapid entry of generative AI, formal proof search, and machine-assisted discovery into mathematical research. It should not be understood as a rejection of artificial intelligence, nor as a purely technical statement about proof assistants. Its deeper purpose is to defend the conditions under which mathematics remains a reliable, intelligible, and humanly accountable form of knowledge. This article reads the declaration in the context of recent developments discussed in earlier articles on this blog: the emergence of AI as a new cognitive infrastructure for science, the symbolic importance of AI-generated progress on classical mathematical problems, and the growing role of formal proof environments coupled with large language models. These developments show that the main transformation is not simply that AI can produce mathematical text or assist with computation. The deeper transformation is that AI lowers the cost of generating plausible conjectures, proof sketches, formal fragments, and candidate results, while leaving the cost of validation, interpretation, attribution, and judgment largely in human hands. The Leiden Declaration therefore addresses a structural asymmetry. As mathematical production becomes cheaper and more scalable, review, verification, and meaning become the scarce resources. The declaration insists that proof is not only a certificate of correctness but also a vehicle of understanding; that authorship must remain tied to human responsibility; that attribution is essential to preserve the intellectual lineage of ideas; and that AI-assisted mathematics must be governed by stronger standards of disclosure, rigor, and independent verification. The article also emphasizes the declaration’s institutional and political dimension. AI companies increasingly treat mathematical publications, formal libraries, and proof systems as valuable resources for training and evaluating general-purpose models. This creates a new dependency risk: if the future infrastructure of mathematical research is controlled by proprietary platforms, the mathematical community may lose autonomy over access, reproducibility, disclosure, and the governance of its own cognitive tools. For this reason, the declaration’s call for public computational infrastructure is not merely a funding request; it is a demand for epistemic sovereignty. The central thesis of this article is that the Leiden Declaration is best read as a governance document for the age of AI-assisted proof. It accepts that AI will become part of mathematical work, but it refuses to let mathematical values be silently redesigned by technical capability, commercial incentives, or publication pressure. The future of mathematics will not be determined only by whether machines can generate correct proofs. It will be determined by whether human communities can preserve proof as understanding, authorship as responsibility, attribution as intellectual memory, and research as a disciplined search for meaning rather than an industrial process for producing valid-looking output.
machine learning
mathematics
research
🇬🇧
Author

Antonio Montano

Published

June 4, 2026

Modified

June 4, 2026

The Leiden Declaration on Artificial Intelligence and Mathematics is a significant collective response from mathematicians to the rapid entry of generative AI, formal proof search, and machine-assisted discovery into mathematical research. It should not be understood as a rejection of artificial intelligence, nor as a purely technical statement about proof assistants. Its deeper purpose is to defend the conditions under which mathematics remains a reliable, intelligible, and humanly accountable form of knowledge. This article reads the declaration in the context of recent developments discussed in earlier articles on this blog: the emergence of AI as a new cognitive infrastructure for science, the symbolic importance of AI-generated progress on classical mathematical problems, and the growing role of formal proof environments coupled with large language models. These developments show that the main transformation is not simply that AI can produce mathematical text or assist with computation. The deeper transformation is that AI lowers the cost of generating plausible conjectures, proof sketches, formal fragments, and candidate results, while leaving the cost of validation, interpretation, attribution, and judgment largely in human hands. The Leiden Declaration therefore addresses a structural asymmetry. As mathematical production becomes cheaper and more scalable, review, verification, and meaning become the scarce resources. The declaration insists that proof is not only a certificate of correctness but also a vehicle of understanding; that authorship must remain tied to human responsibility; that attribution is essential to preserve the intellectual lineage of ideas; and that AI-assisted mathematics must be governed by stronger standards of disclosure, rigor, and independent verification. The article also emphasizes the declaration’s institutional and political dimension. AI companies increasingly treat mathematical publications, formal libraries, and proof systems as valuable resources for training and evaluating general-purpose models. This creates a new dependency risk: if the future infrastructure of mathematical research is controlled by proprietary platforms, the mathematical community may lose autonomy over access, reproducibility, disclosure, and the governance of its own cognitive tools. For this reason, the declaration’s call for public computational infrastructure is not merely a funding request; it is a demand for epistemic sovereignty. The central thesis of this article is that the Leiden Declaration is best read as a governance document for the age of AI-assisted proof. It accepts that AI will become part of mathematical work, but it refuses to let mathematical values be silently redesigned by technical capability, commercial incentives, or publication pressure. The future of mathematics will not be determined only by whether machines can generate correct proofs. It will be determined by whether human communities can preserve proof as understanding, authorship as responsibility, attribution as intellectual memory, and research as a disciplined search for meaning rather than an industrial process for producing valid-looking output.

A declaration after the threshold

The recent discussion on artificial intelligence and mathematics has changed tone because the empirical situation has changed.

For decades, automated theorem proving and proof assistants were important but specialized tools. They could verify formal derivations, mechanize known arguments, and support large formalization projects. They were extraordinary instruments, but they did not obviously threaten the social structure of mathematical research. They did not, by themselves, make mathematical discovery appear industrially scalable.

The last wave of developments is different. In a previous article, I described this as the emergence of a new cognitive infrastructure for science: not merely another productivity tool, but a layered system in which language models, formal proof environments, search agents, human reviewers, shared mathematical libraries, and institutional publication processes begin to form a composite research architecture.1

In another article, I used the OpenAI unit-distance result as a symbolic threshold. The crucial point was not only that an AI system contributed to a result connected to a classical Erdős problem. The crucial point was that the system appeared to generate a non-obvious mathematical construction in an abstract search space where the relevant difficulty is not routine computation but structural invention.2

That is why the Leiden Declaration matters. It arrives after the psychological threshold. It is not asking whether AI will enter mathematics. It assumes that AI has already entered mathematics, and then asks a harder question:

What must be preserved so that mathematics remains mathematics?

The answer given by the declaration is not reducible to correctness. Correctness is necessary, but it is not enough.

The first-principles problem: mathematics is not only output

At the lowest level, mathematics can be described as the production of statements and proofs. A theorem states that, under specified assumptions, a conclusion follows. A proof is a finite structure of valid inferences connecting assumptions to conclusion.

From that narrow viewpoint, an AI system that generates correct formal proofs looks like an almost perfect mathematical machine. It can search, propose, formalize, test, repair, and repeat. If proof is only derivation, and if derivation can be checked mechanically, then why should the identity of the generator matter?

The Leiden Declaration rejects this reduction. Its starting point is that mathematical research has values that cannot be collapsed into the binary predicate proved / not proved. The declaration identifies proof, attribution, transparency, independent verification, evaluation of depth and significance, and the formation of human mathematical judgment as characteristic values of the discipline.3

This is a first-principles distinction. A proof object answers one question:

Does this conclusion follow from these premises by valid inference?

Mathematical practice answers a larger set of questions:

What is the right statement? Why is it true? Which definitions make the structure visible? Who discovered the idea? Who is responsible for the argument? Can other mathematicians inspect it? Does it explain anything? Does it open a new direction? Should scarce attention be spent on it? Can students learn from it? Does it strengthen or weaken the research community?

Artificial intelligence can improve the first task, the generation or verification of formal derivations, while also transforming the second and broader layer: the human practice through which mathematical results acquire meaning, attribution, responsibility, pedagogical value, and disciplinary significance.

The new production function of mathematics

Recent AI developments change the economics of mathematical work. The older bottlenecks were human time, human memory, manual symbolic manipulation, slow literature search, and the difficulty of constructing proofs. AI systems reduce some of these costs. They can generate candidate lemmas, proof sketches, counterexamples, synthetic explanations, formal proof fragments, and large numbers of variants.

The DeepMind paper on AI-driven formal proof search is especially important because it is not merely an anecdote about one impressive problem. It describes a reusable architecture: language-model generation coupled with formal verification in Lean, evaluated at scale on open mathematical problems and conjectures.4

The OpenAI unit-distance result is different but complementary. It suggests that model-driven exploration may generate mathematical constructions that are not merely rephrasings of known proofs, but new objects requiring expert digestion, verification, and contextualization.5

Together, these developments imply a shift:

old scarcity: candidate mathematical production, new scarcity: validation, attribution, interpretation, significance, and governance.

This is exactly the problem anticipated in the article on cognitive infrastructure. When generation becomes cheap, judgment becomes the scarce resource.6

The Leiden Declaration is therefore not a reaction against technology as such. It is a governance response to a change in scarcity.

What the Leiden Declaration says

The declaration is organized around values, threats, and recommendations. Its structure is important because it does not begin with tools. It begins with the discipline.

Proof, certainty, and understanding

The declaration states that proof is central not only because it confers certainty, but also because it gives understanding.7

This distinction is decisive. A machine-checkable proof may establish that a formal statement follows from formal assumptions. But the mathematical community must still determine whether the formal statement corresponds to the intended informal theorem, whether the definitions capture the right objects, whether the proof explains the phenomenon, and whether the result deserves attention.

This is the same distinction developed in the previous article on cognitive infrastructure: formal verification is necessary but insufficient. A proof assistant can check deduction, but it cannot by itself decide meaning, taste, relevance, or mathematical fertility.8

The declaration therefore resists both naïve skepticism and naïve automation. It does not deny the value of formal tools. It denies that formal success exhausts mathematical value.

Human authorship and responsibility

One of the strongest principles in the declaration is that credit and responsibility remain human. AI systems should not receive authorship.9 This is not sentimentality. It follows from the logic of responsibility.

Authorship in mathematics has at least three functions:

credit: who deserves recognition? responsibility: who stands behind the correctness? accountability: who can answer criticism and repair errors?

An AI system cannot assume responsibility in the human, institutional, and ethical sense. It cannot be held accountable by a journal, a department, a funding body, or a community of peers. It does not bear reputational risk in the way a mathematician does. It does not participate in the long-term life of the discipline.

Therefore, the declaration’s insistence on human authorship is not an arbitrary rule. It is a structural requirement for accountability.

Attribution as active labor

The declaration also emphasizes attribution. Because current AI systems often synthesize from large bodies of prior work without reliable citation, mathematicians using such systems have a stronger duty to identify and credit the human sources behind ideas.10

This point is important because attribution is not administrative decoration. In mathematics, attribution reconstructs intellectual dependency. It tells the community where an idea came from, which earlier methods it extends, which results it depends on, and how a new contribution should be situated.

If AI-generated text or proof sketches blur these dependencies, then the problem is not only unfairness to authors. It is damage to the map of mathematical knowledge. A field advances not merely by adding results, but by preserving the lineage of concepts.

The pressure on review

The declaration warns that automated techniques can produce plausible but unreliable arguments, including in formalized settings where the problem may lie in the translation between human concepts and computer encodings.11

This is the operational risk. If AI makes it cheap to produce plausible mathematical manuscripts, then journals, referees, arXiv moderators, conference organizers, and informal expert networks face a scaling problem. The number of submissions can grow faster than the available human capacity to evaluate them.

This creates a failure mode:

%%{init: {"theme": "neo", "look": "handDrawn", "layout": "elk"}}%%

flowchart TD
    A["AI lowers the unit cost of producing mathematical artifacts"] --> B["More candidate proofs, conjectures, examples, counterexamples, and papers"]
    B --> C["A larger fraction of output is plausible enough to demand expert attention"]
    C --> D["Human review capacity becomes the bottleneck"]
    D --> E["Referees, editors, moderators, and informal expert networks are overloaded"]
    E --> F["Filtering quality weakens or becomes slower"]
    F --> G["Errors, weak claims, duplicate results, and unattributed ideas enter circulation"]
    G --> H["The literature becomes noisier and harder to trust"]
    H --> I["Future work may build on unstable foundations"]
    I --> J["The cost saved during generation reappears as higher verification and cleanup cost"]
Figure 1: The review bottleneck created by cheap AI-assisted mathematical generation

The threat is not only that individual papers may be wrong. The threat is that the review system may be forced to absorb an adversarially large quantity of plausible mathematical artifacts.

In this sense, AI-generated mathematics creates a problem similar to cybersecurity: the attacker’s unit cost falls, while the defender’s verification cost remains high.

Standards of rigor for AI-assisted results

For mathematical organizations and funders, the declaration recommends policies for publishing and review, including disclosure of tools and computational resources, attribution, authorship rules, and standards of conduct.12

It also recommends maintaining rigor through measures such as human descriptions of central arguments, formal verification where appropriate, cross-checking of theoretical and computational results, and external pre-submission review.13 This is a practical compromise. The declaration does not say that AI-assisted results should be rejected. It says that they should be evaluated according to the specific risks introduced by the method.

That principle can be stated generally:

the stronger the automation, the stronger the required disclosure, verification, and human explanation.

A result obtained through ordinary human reasoning requires proof. A result obtained through opaque AI search may require proof plus provenance, tool disclosure, formal checks, independent review, and a human-readable account of the central idea.

This is not bureaucracy. It is epistemic risk control.

The declaration is also about power

A narrow reading would treat the Leiden Declaration as a document about proof rules. That reading is incomplete.

The declaration is also about institutional power. It notes that commercial AI developers use mathematical publications and formal mathematical libraries as training data, and that automatically checkable proofs are attractive because they provide scalable feedback for training general-purpose models.14

This matters because mathematics has become strategically useful to AI companies in two ways:

  1. Mathematics is a benchmark of reasoning. If a model performs well in mathematics, companies can use that performance to support broader claims about intelligence.

  2. Formal mathematics can become a training substrate. A formal proof assistant provides a verifier. The model proposes; the verifier checks; the training loop receives a relatively clean signal. This makes mathematics not only a domain of application, but part of the industrial machinery of AI development.

The declaration therefore asks mathematicians to recognize that their work is now entangled with industrial systems whose objectives may not align with the values of mathematical research.

This is not anti-industry rhetoric. It is a structural observation about asymmetric incentives. Academic mathematics values truth, explanation, attribution, autonomy, and durable understanding. Commercial AI companies may value capability, scale, market position, proprietary advantage, and publicity. These value systems can overlap, but they are not identical.

Public infrastructure as epistemic sovereignty

One of the most strategically important recommendations is the call to invest in public computational infrastructure.15 This is not a secondary policy point. It is central.

If the future of AI-assisted mathematics depends entirely on proprietary models, proprietary compute, proprietary datasets, and proprietary evaluation pipelines, then the mathematical community loses control over part of its own cognitive infrastructure.

The result would be a dependency chain:

%%{init: {"theme": "neo", "look": "handDrawn", "layout": "elk"}}%%

flowchart TD
    A["Mathematical research"] --> B["AI-assisted search and formalization"]
    B --> C["Dependence on proprietary tools, models, datasets, and compute"]
    C --> D["Research workflows become mediated by industrial platforms"]
    D --> E["External governance over access, cost, disclosure, and reproducibility"]
    E --> F["Reduced autonomy of the mathematical community"]
    F --> G["Weaker public control over the cognitive infrastructure of proof"]
Figure 2: The dependency chain created by proprietary AI infrastructure in mathematical research

For a discipline built on transparency and independent verification, this is a dangerous architecture. Public computational infrastructure is therefore not just a question of fairness or access. It is a condition for preserving the reproducibility and autonomy of mathematical research. This connects directly with the broader thesis of cognitive infrastructure. Once AI becomes embedded in research workflows, the governance of infrastructure becomes the governance of thought.

Don’t believe the hype is not skepticism; it is calibration

The declaration explicitly advises policymakers not to rely on press releases or popular reporting when evaluating AI capabilities in mathematics.16 This should not be misunderstood as denial of AI progress. The recent evidence is real. AI-assisted mathematics is advancing quickly. The OpenAI unit-distance result and the DeepMind formal proof-search work are not trivial demonstrations.

But accurate calibration requires separating four different claims:

  • AI can help produce mathematical results.
  • AI can autonomously produce reliable mathematics.
  • AI can replace mathematical communities.
  • AI-generated mathematics should be trusted without human governance.

These claims have very different epistemic status:

  • The first claim is increasingly supported.
  • The second is context-dependent and still limited.
  • The third does not follow from the first two.
  • The fourth is false.

The declaration’s position is strongest when read this way. It is not saying that nothing happened. It is saying that something important happened, and therefore the community must become more precise, not less.

The human role moves upward, but does not disappear

The most plausible future is not a simple replacement of mathematicians by machines. It is a redistribution of mathematical labor.

AI systems may increasingly perform:

  • candidate generation,
  • lemma search,
  • example construction,
  • counterexample search,
  • formal proof attempts,
  • translation between informal and formal proof,
  • literature summarization,
  • proof repair,
  • large-scale conjecture testing.

Human mathematicians will still be needed for:

  • problem selection,
  • definition formation,
  • interpretation,
  • judgment of significance,
  • explanation,
  • responsibility,
  • ethical assessment,
  • field-level prioritization,
  • education,
  • community governance.

This is not a demotion of human mathematicians. It is a migration toward higher-order judgment. But that migration is not automatic. If education is weakened, if early-career researchers become dependent on tools before acquiring taste, and if institutions reward volume over insight, then the community may lose the very judgment it needs most.

The declaration is therefore also a document about apprenticeship. Mathematical culture is not transmitted only through final proofs. It is transmitted through failed attempts, examples, informal explanations, seminars, mentorship, taste, and the slow formation of intuition. If AI removes too much of that process too early, the short-term productivity gain may produce a long-term deficit in human understanding.

A useful model: proof pipeline versus meaning pipeline

We can represent the emerging AI-assisted research process as two coupled pipelines.

%%{init: {"theme": "neo", "look": "handDrawn", "layout": "elk"}}%%

flowchart TD
    A["Problem or conjecture"] --> B["AI-assisted exploration"]
    B --> C["Candidate construction or proof sketch"]
    C --> D["Formalization or computational check"]
    D --> E["Verified or refuted technical artifact"]

    E --> F["Human mathematical explanation"]
    F --> G["Attribution and provenance"]
    G --> H["Assessment of depth and significance"]
    H --> I["Peer review and publication"]
    I --> J["Integration into mathematical knowledge"]

    E -. insufficient by itself .-> X["Technically valid but possibly meaningless output"]
Figure 3: From proof artifact to mathematical knowledge: the proof pipeline and the meaning pipeline

The first pipeline produces proof artifacts. The second pipeline produces mathematical knowledge. The Leiden Declaration is, in effect, a defense of the second pipeline.

Why this matters beyond mathematics

Mathematics is the cleanest test case because mathematical truth has unusually strict validation conditions. But the same pattern will appear elsewhere.

In software engineering, AI can generate code, but architecture, maintainability, security, and accountability remain human governance problems. In cybersecurity, AI can generate attack paths and defensive hypotheses, but prioritization, risk ownership, and operational response remain institutional problems. In enterprise architecture, AI can generate documentation, process maps, and design alternatives, but transformation quality depends on stakeholder judgment, constraints, incentives, and strategic coherence. In scientific research, AI can generate hypotheses, simulations, and candidate explanations, but experimental design, causal interpretation, and ethical deployment remain human responsibilities.

The mathematics case is therefore not isolated. It is a preview. When generative systems reduce the cost of producing plausible intellectual artifacts, every serious knowledge domain must strengthen the institutions that evaluate meaning.

Conclusion: the answer is governance, not nostalgia

The Leiden Declaration is important because it refuses the two easy errors:

  • The first error is technological fatalism: AI will transform mathematics, therefore the community must simply adapt to whatever industry builds.

  • The second error is nostalgic denial: AI does not understand mathematics like humans do, therefore nothing fundamental has changed.

Both positions are weak. From first principles, a technology does not need to reproduce human cognition internally in order to reorganize human activity externally. Calculators did not need number sense. Compilers did not need software engineering judgment. Search engines did not need scholarship. What matters is whether a system can perform enough of a formerly scarce function at sufficient scale to change the economics of the surrounding practice.

AI is beginning to do that in mathematics. The Leiden Declaration is the mathematical community’s attempt to say: if the economics of proof, search, and formalization are changing, then the values of mathematics must be made explicit before they are accidentally redesigned by tools, platforms, incentives, and markets.

That is the correct response. Not refusal or surrender but governance.

The future of mathematics will not be determined only by whether AI systems can prove theorems. It will be determined by whether human communities can preserve proof as understanding, authorship as responsibility, attribution as intellectual memory, and research as a disciplined search for meaning rather than an industrial process for generating valid-looking output.

See also longforms

See also posts

Back to top

Footnotes

  1. Montano A. (2026). The New Cognitive Infrastructure of Science. Author’s blog. URL↩︎

  2. Montano A. (2026). The Erdős Moment of AI. Author’s blog. URL↩︎

  3. Leiden Declaration on Artificial Intelligence and Mathematics, section About our values. URL↩︎

  4. Tsoukalas, G., Kovsharov, A., Shirobokov, S., Surina, A., Firsching, M., Bérczi, G., Ruiz, F. J. R., Suggala, A., Wagner, A. Z., Wieser, E., Yu, L., Huang, A., Horváth, M. Z., Ferrauiolo, A., Michalewski, H., Grosu, C., Hubert, T., Balog, M., Kohli, P., & Chaudhuri, S. (2026). Advancing mathematics research with AI-driven formal proof search. arXiv. DOI↩︎

  5. OpenAI (2026). An OpenAI model has disproved a central conjecture in discrete geometry. Company website. URL↩︎

  6. Montano A. (2026). The New Cognitive Infrastructure of Science. Author’s blog. URL↩︎

  7. Leiden Declaration on Artificial Intelligence and Mathematics, section About our values. URL↩︎

  8. Montano A. (2026). The New Cognitive Infrastructure of Science. Author’s blog. URL↩︎

  9. Leiden Declaration on Artificial Intelligence and Mathematics, recommendation Affirm the humanity of authorship. URL↩︎

  10. Leiden Declaration on Artificial Intelligence and Mathematics, recommendation Put effort into proper attribution. URL↩︎

  11. Leiden Declaration on Artificial Intelligence and Mathematics, section Potential threats. URL↩︎

  12. Leiden Declaration on Artificial Intelligence and Mathematics, recommendation Take the lead on policies for publishing and reviewing. URL↩︎

  13. Leiden Declaration on Artificial Intelligence and Mathematics, recommendation Maintain standards of rigor. URL↩︎

  14. Leiden Declaration on Artificial Intelligence and Mathematics, section Recommendations for commercial artificial intelligence. URL↩︎

  15. Leiden Declaration on Artificial Intelligence and Mathematics, recommendation Invest in public computational infrastructure. URL↩︎

  16. Leiden Declaration on Artificial Intelligence and Mathematics, recommendation Don’t believe the hype. URL↩︎