The Leiden Declaration and the Governance of AI-Assisted Mathematics

Abstract

This article analyzes the Leiden Declaration on Artificial Intelligence and Mathematics as a governance document for the emerging age of AI-assisted proof. Its central thesis is that the declaration should not be read as a rejection of artificial intelligence, nor as a narrow technical statement about proof assistants. It is a collective attempt by the mathematical community to make explicit the values that must survive as generative AI, formal proof search, automated conjecture generation, and machine-assisted discovery become part of ordinary mathematical work.

The article starts from a changed empirical situation. For decades, automated theorem proving and proof assistants were powerful but specialized tools. They could mechanize known arguments, verify formal derivations, and support formalization projects, but they did not obviously reorganize the social structure of mathematical research. The recent wave is different. Large language models, search agents, formal proof systems, mathematical libraries, human reviewers, and publication processes are beginning to form a composite cognitive infrastructure for mathematics. AI is no longer only a calculator, database, or verifier. It can generate candidate lemmas, proof sketches, examples, counterexamples, formal fragments, reformulations, and possible research directions.

The Leiden Declaration matters because it arrives after this threshold. It does not ask whether AI will enter mathematics. It assumes that AI has already entered mathematics, and then asks what must be preserved so that mathematics remains mathematics. The article argues that this question cannot be answered by correctness alone. A formally checked derivation may prove that a conclusion follows from a set of assumptions, but mathematical practice asks more: whether the statement is the right one, whether the definitions illuminate the structure, whether the proof explains anything, whether the idea has proper attribution, whether the argument can be inspected by humans, whether it opens a new direction, and whether it strengthens the discipline’s capacity for judgment.

The first-principles distinction is therefore between proof as artifact and mathematics as practice. A proof object answers the narrow logical question of valid inference. Mathematical research also includes meaning, explanation, responsibility, provenance, pedagogy, significance, taste, and community memory. The article presents the Leiden Declaration as a defense of this wider layer. AI systems may improve the production or verification of derivations, but they do not by themselves settle the human, institutional, and epistemic questions through which mathematical results become knowledge.

The article then analyzes the new production function of mathematics. AI reduces the cost of generating plausible mathematical artifacts. It can produce many candidate proof sketches, search paths, formalizations, and variants. This changes the scarcity structure of the discipline. The old bottlenecks were human time, symbolic manipulation, literature search, and manual exploration. The new bottlenecks become validation, attribution, interpretation, significance assessment, and governance. In this sense, the Leiden Declaration is not a nostalgic response to technology. It is a governance response to a shift in scarcity.

The article connects this shift to recent emblematic developments in AI-assisted mathematics, including AI-driven formal proof search in Lean and model-generated progress on difficult mathematical problems. These cases are not treated as proof that machines have replaced mathematicians. They are treated as evidence that AI systems can participate in the production of mathematically relevant artifacts at a scale and speed that existing review systems were not designed to absorb. Once candidate generation becomes cheap, human judgment becomes more valuable, not less.

A central part of the article examines the declaration’s account of proof, certainty, and understanding. The declaration insists that proof is central not only because it provides certainty, but because it gives understanding. This is decisive. A machine-checkable proof may establish formal validity, but it may still fail to explain why a theorem is true, how it relates to earlier ideas, which definitions matter, or why the result deserves attention. Formal verification is therefore necessary but insufficient. It checks deduction; it does not by itself supply meaning, taste, relevance, or mathematical fertility.

The article then turns to human authorship and responsibility. The declaration’s insistence that AI systems should not receive authorship is interpreted not as sentimentality, but as a structural requirement of accountability. Authorship in mathematics performs at least three functions: credit, responsibility, and accountability. An AI system cannot stand behind a theorem, respond to criticism, accept institutional responsibility, repair errors in the social sense, bear reputational risk, or participate in the long-term life of the discipline. Human authorship therefore remains necessary because mathematical publication is not merely the release of a text; it is an accountable act.

Attribution is treated as another essential value. Because generative AI may synthesize from large bodies of prior work without reliable citation, mathematicians who use AI tools have a stronger duty to reconstruct the intellectual lineage of ideas. Attribution is not administrative decoration. It is part of the map of mathematical knowledge. It tells the community where a method came from, which earlier results it depends on, which concepts it extends, and how a new contribution should be situated. If AI-generated text or proof sketches blur those dependencies, the damage is not only unfairness to individual authors. It is degradation of the discipline’s memory.

The article gives particular attention to the review bottleneck. If AI lowers the unit cost of producing plausible mathematical artifacts, then journals, referees, arXiv moderators, editors, conference organizers, and informal expert networks may face a scaling problem. The number of submissions, proof sketches, conjectures, and candidate results can grow faster than the available human capacity to evaluate them. The result may be slower review, weaker filtering, more duplicated claims, noisier literature, and a higher risk that future work builds on unstable foundations. The article compares this asymmetry to cybersecurity: the cost of producing problematic artifacts falls, while the cost of verification remains high.

The declaration’s proposed standards of rigor are therefore interpreted as epistemic risk controls. AI-assisted results should not be rejected merely because AI was used, but the stronger the automation, the stronger the required disclosure, verification, provenance, and human explanation. A conventionally produced proof requires proof. A result obtained through opaque AI search may require proof plus tool disclosure, computational-resource disclosure, prompt or workflow documentation where relevant, formal verification where appropriate, independent checking, cross-validation against theoretical or computational evidence, and a human-readable account of the central idea.

The article then expands the analysis from proof rules to institutional power. The Leiden Declaration is also about the political economy of mathematical research. AI companies increasingly treat mathematical publications, formal proof libraries, and proof assistants as resources for training and evaluating general-purpose models. Mathematics becomes useful to AI firms in two ways: as a benchmark of reasoning and as a training substrate, because formal proof environments provide relatively clean feedback loops. The model proposes; the verifier checks; the training process receives a signal. Mathematics is therefore not only a domain affected by AI. It is becoming part of the industrial machinery of AI development.

This creates an incentive asymmetry. Academic mathematics values truth, explanation, attribution, autonomy, reproducibility, and durable understanding. Commercial AI systems may prioritize capability, scale, proprietary advantage, market position, publicity, and control of infrastructure. These values can overlap, but they are not identical. The declaration asks mathematicians to recognize that their work is now entangled with industrial systems whose objectives may not be aligned with the long-term values of mathematical research.

Public computational infrastructure is therefore presented as one of the declaration’s most strategic recommendations. The article interprets this not as a secondary funding request, but as a demand for epistemic sovereignty. If AI-assisted mathematics depends entirely on proprietary models, proprietary compute, proprietary datasets, proprietary interfaces, and proprietary evaluation pipelines, then the mathematical community loses control over part of its own cognitive infrastructure. Access, reproducibility, disclosure, auditability, cost, and tool behavior become mediated by external platforms. For a discipline built on transparency and independent verification, this is a dangerous architecture.

The article also clarifies the declaration’s skepticism toward hype. “Don’t believe the hype” is not denial of progress. It is calibration. Four claims must be separated: AI can help produce mathematical results; AI can autonomously produce reliable mathematics; AI can replace mathematical communities; and AI-generated mathematics should be trusted without human governance. The first claim is increasingly supported. The second is context-dependent and limited. The third does not follow from the first two. The fourth is false. The declaration is strongest when read as a demand for precise calibration rather than either technological fatalism or nostalgic denial.

The article then describes the likely redistribution of mathematical labor. AI systems may increasingly perform candidate generation, lemma search, example construction, counterexample search, formal proof attempts, translation between informal and formal proof, literature summarization, proof repair, and large-scale conjecture testing. Human mathematicians will still be needed for problem selection, definition formation, interpretation, judgment of significance, explanation, responsibility, field-level prioritization, education, ethical assessment, and community governance. The human role moves upward, but it does not disappear.

This upward movement is not automatically benign. The article warns that mathematical culture is transmitted through more than final proofs. It is transmitted through failed attempts, informal explanations, seminars, mentorship, examples, taste, intuition, and apprenticeship. If AI tools remove too much of this formative process too early, short-term productivity may produce long-term weakness in human mathematical judgment. The declaration is therefore also a document about education and apprenticeship: the community must preserve the processes through which people learn to distinguish deep arguments from shallow ones.

A useful conceptual model in the article is the distinction between the proof pipeline and the meaning pipeline. The proof pipeline runs from problem or conjecture to AI-assisted exploration, candidate construction, formalization or computational check, and a verified or refuted technical artifact. The meaning pipeline begins where the proof artifact is not enough: human explanation, attribution, provenance, assessment of depth and significance, peer review, publication, and integration into mathematical knowledge. The Leiden Declaration is, in effect, a defense of the meaning pipeline.

The article generalizes the lesson beyond mathematics. Mathematics is the cleanest test case because it has unusually strict validation conditions, but the same structure will appear in software engineering, cybersecurity, enterprise architecture, scientific research, law, medicine, and policy. In each domain, generative systems can lower the cost of producing plausible intellectual artifacts. But architecture, maintainability, risk ownership, causal interpretation, accountability, ethical deployment, and strategic coherence remain human and institutional responsibilities. The mathematics case is therefore a preview of a broader governance problem: when generation becomes cheap, institutions must become better at evaluating meaning.

The conclusion rejects two weak responses. The first is technological fatalism: AI will transform mathematics, so the community must simply adapt to whatever industry builds. The second is nostalgic denial: AI does not understand mathematics like humans do, so nothing fundamental has changed. From first principles, a technology does not need to reproduce human cognition internally in order to reorganize human activity externally. Calculators did not need number sense; compilers did not need software-engineering judgment; search engines did not need scholarship. What matters is whether a system performs enough of a formerly scarce function at sufficient scale to change the surrounding practice.

The article concludes that AI is beginning to do exactly that in mathematics. The Leiden Declaration is therefore the correct kind of response: not refusal, not surrender, but governance. The future of mathematics will not be determined only by whether AI systems can prove theorems. It will be determined by whether human communities can preserve proof as understanding, authorship as responsibility, attribution as intellectual memory, public infrastructure as epistemic sovereignty, and research as a disciplined search for meaning rather than an industrial process for generating valid-looking output.

A reading of the Leiden Declaration as a governance framework for AI-assisted mathematics, defending proof as understanding, authorship as responsibility, attribution as intellectual memory, and public infrastructure as epistemic sovereignty.

A declaration after the threshold

The recent discussion of artificial intelligence and mathematics has changed tone because the empirical situation has changed.

For decades, automated theorem proving and proof assistants were important but specialized tools. They could verify formal derivations, mechanize known arguments, and support large formalization projects. They were extraordinary instruments, but they did not obviously threaten the social structure of mathematical research. They did not, by themselves, make mathematical discovery appear industrially scalable.

The last wave of developments is different. In a previous article, I described this as the emergence of a new cognitive infrastructure for science: not merely another productivity tool, but a layered system in which language models, formal proof environments, search agents, human reviewers, shared mathematical libraries, and institutional publication processes begin to form a composite research architecture.¹

In another article, I used the OpenAI unit-distance result as a symbolic threshold. The crucial point was not only that an AI system contributed to a result connected to a classical Erdős problem. The crucial point was that the system appeared to generate a non-obvious mathematical construction in an abstract search space where the relevant difficulty is not routine computation but structural invention.²

That is why the Leiden Declaration matters. It arrives after the psychological threshold. It is not asking whether AI will enter mathematics. It assumes that AI has already entered mathematics, and then asks a harder question:

What must be preserved so that mathematics remains mathematics?

The answer given by the declaration is not reducible to correctness. Correctness is necessary, but it is not enough.

The first-principles problem: mathematics is not only output

At the lowest level, mathematics can be described as the production of statements and proofs. A theorem states that, under specified assumptions, a conclusion follows. A proof is a finite structure of valid inferences connecting assumptions to conclusion.
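In a proof assistant this narrow description becomes literal. As a minimal illustration in Lean 4 (the theorem name `and_swap_example` and the choice of statement are assumptions made here for brevity, not taken from the declaration), the statement is a type and the proof is a term that the checker either accepts or rejects:

```lean
-- Assumption: p and q stand for arbitrary propositions.
-- Statement: from a proof of p ∧ q we can conclude q ∧ p.
theorem and_swap_example (p q : Prop) (h : p ∧ q) : q ∧ p :=
  -- The proof term pairs the second component of h with the first.
  ⟨h.2, h.1⟩
```

The checker confirms that the conclusion follows from the assumption. It says nothing about whether the statement was worth proving.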

From that narrow viewpoint, an AI system that generates correct formal proofs looks like an almost perfect mathematical machine. It can search, propose, formalize, test, repair, and repeat. If proof is only derivation, and if derivation can be checked mechanically, then why should the identity of the generator matter?

The Leiden Declaration rejects this reduction. Its starting point is that mathematical research has values that cannot be collapsed into the binary predicate proved / not proved. The declaration identifies proof, attribution, transparency, independent verification, evaluation of depth and significance, and the formation of human mathematical judgment as characteristic values of the discipline.³

This is a first-principles distinction. A proof object answers one question:

Does this conclusion follow from these premises by valid inference?

Mathematical practice answers a larger set of questions:

What is the right statement? Why is it true? Which definitions make the structure visible? Who discovered the idea? Who is responsible for the argument? Can other mathematicians inspect it? Does it explain anything? Does it open a new direction? Should scarce attention be spent on it? Can students learn from it? Does it strengthen or weaken the research community?

Artificial intelligence can improve the first task, the generation or verification of formal derivations, while also transforming the second and broader layer: the human practice through which mathematical results acquire meaning, attribution, responsibility, pedagogical value, and disciplinary significance.

The new production function of mathematics

Recent AI developments change the economics of mathematical work. The older bottlenecks were human time, human memory, manual symbolic manipulation, slow literature search, and the difficulty of constructing proofs. AI systems reduce some of these costs. They can generate candidate lemmas, proof sketches, counterexamples, synthetic explanations, formal proof fragments, and large numbers of variants.

The DeepMind paper on AI-driven formal proof search is especially important because it is not merely an anecdote about one impressive problem. It describes a reusable architecture: language-model generation coupled with formal verification in Lean, evaluated at scale on open mathematical problems and conjectures.⁴

The OpenAI unit-distance result is different but complementary. It suggests that model-driven exploration may generate mathematical constructions that are not merely rephrasings of known proofs, but new objects requiring expert digestion, verification, and contextualization.⁵

Together, these developments imply a shift:

old scarcity: candidate mathematical production
new scarcity: validation, attribution, interpretation, significance, and governance

This is exactly the problem anticipated in the article on cognitive infrastructure. When generation becomes cheap, judgment becomes the scarce resource.⁶

The Leiden Declaration is therefore not a reaction against technology as such. It is a governance response to a change in scarcity.

What the Leiden Declaration says

The declaration is organized around values, threats, and recommendations. Its structure is important because it does not begin with tools. It begins with the discipline.

Proof, certainty, and understanding

The declaration states that proof is central not only because it confers certainty, but also because it gives understanding.⁷

This distinction is decisive. A machine-checkable proof may establish that a formal statement follows from formal assumptions. But the mathematical community must still determine whether the formal statement corresponds to the intended informal theorem, whether the definitions capture the right objects, whether the proof explains the phenomenon, and whether the result deserves attention.

This is the same distinction developed in the previous article on cognitive infrastructure: formal verification is necessary but insufficient. A proof assistant can check deduction, but it cannot by itself decide meaning, taste, relevance, or mathematical fertility.⁸

The declaration therefore resists both naïve skepticism and naïve automation. It does not deny the value of formal tools. It denies that formal success exhausts mathematical value.
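A small Lean 4 example makes the gap between a formal statement and the intended informal theorem concrete. It is an illustration chosen here, not one drawn from the declaration: subtraction on the natural-number type `Nat` is truncated at zero, so an expression that looks like ordinary arithmetic means something different once encoded.

```lean
-- On Nat, subtraction stops at zero instead of going negative.
example : (2 - 3 : Nat) = 0 := by decide

-- As a result, the familiar identity (a - b) + b = a fails for a = 2, b = 3:
-- the left-hand side evaluates to 0 + 3 = 3.
example : (2 - 3 + 3 : Nat) ≠ 2 := by decide
```

Both lines are accepted by the checker. Formal success here is real, yet a reader must still confirm that the encoded statement says what the informal theorem was meant to say.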

Human authorship and responsibility

One of the strongest principles in the declaration is that credit and responsibility remain human. AI systems should not receive authorship.⁹ This is not sentimentality. It follows from the logic of responsibility.

Authorship in mathematics has at least three functions:

credit: who deserves recognition?
responsibility: who stands behind the correctness?
accountability: who can answer criticism and repair errors?

An AI system cannot assume responsibility in the human, institutional, and ethical sense. It cannot be held accountable by a journal, a department, a funding body, or a community of peers. It does not bear reputational risk in the way a mathematician does. It does not participate in the long-term life of the discipline.

Therefore, the declaration’s insistence on human authorship is not an arbitrary rule. It is a structural requirement for accountability.

Attribution as active labor

The declaration also emphasizes attribution. Because current AI systems often synthesize from large bodies of prior work without reliable citation, mathematicians using such systems have a stronger duty to identify and credit the human sources behind ideas.¹⁰

This point is important because attribution is not administrative decoration. In mathematics, attribution reconstructs intellectual dependency. It tells the community where an idea came from, which earlier methods it extends, which results it depends on, and how a new contribution should be situated.

If AI-generated text or proof sketches blur these dependencies, then the problem is not only unfairness to authors. It is damage to the map of mathematical knowledge. A field advances not merely by adding results, but by preserving the lineage of concepts.

The pressure on review

The declaration warns that automated techniques can produce plausible but unreliable arguments, including in formalized settings where the problem may lie in the translation between human concepts and computer encodings.¹¹

This is the operational risk. If AI makes it cheap to produce plausible mathematical manuscripts, then journals, referees, arXiv moderators, conference organizers, and informal expert networks face a scaling problem. The number of submissions can grow faster than the available human capacity to evaluate them.

This creates a failure mode:

%%{init: {"theme": "neo", "look": "handDrawn", "layout": "elk"}}%%

flowchart TD
    A["AI lowers the unit cost of producing mathematical artifacts"] --> B["More candidate proofs, conjectures, examples, counterexamples, and papers"]
    B --> C["A larger fraction of output is plausible enough to demand expert attention"]
    C --> D["Human review capacity becomes the bottleneck"]
    D --> E["Referees, editors, moderators, and informal expert networks are overloaded"]
    E --> F["Filtering quality weakens or becomes slower"]
    F --> G["Errors, weak claims, duplicate results, and unattributed ideas enter circulation"]
    G --> H["The literature becomes noisier and harder to trust"]
    H --> I["Future work may build on unstable foundations"]
    I --> J["The cost saved during generation reappears as higher verification and cleanup cost"]

Figure 1: The review bottleneck created by cheap AI-assisted mathematical generation

The threat is not only that individual papers may be wrong. The threat is that the review system may be forced to absorb an adversarially large quantity of plausible mathematical artifacts.

In this sense, AI-generated mathematics creates a problem similar to cybersecurity: the attacker’s unit cost falls, while the defender’s verification cost remains high.
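A toy calculation shows how quickly this asymmetry compounds. The sketch below is an illustration under stated assumptions, not a model from the declaration: the function `simulate_backlog`, the 5% monthly growth in submissions, and the fixed capacity of 100 reviews per month are hypothetical numbers chosen only for the example.

```python
def simulate_backlog(submissions_per_month, growth_rate, review_capacity, months):
    """Return the unreviewed backlog at the end of each month.

    All parameters are hypothetical: submissions grow geometrically,
    while review capacity stays fixed.
    """
    backlog = 0.0
    history = []
    arrivals = float(submissions_per_month)
    for _ in range(months):
        # New submissions arrive, reviewers clear what they can.
        backlog = max(0.0, backlog + arrivals - review_capacity)
        history.append(backlog)
        # Cheaper generation means more submissions next month.
        arrivals *= 1.0 + growth_rate
    return history


steady = simulate_backlog(100, 0.00, 100, 12)
growing = simulate_backlog(100, 0.05, 100, 12)
print("steady backlog after a year: ", round(steady[-1]))
print("growing backlog after a year:", round(growing[-1]))
```

With these assumed numbers the steady scenario never accumulates a backlog, while 5% monthly growth leaves roughly 390 unreviewed submissions after a year, almost four months of reviewing capacity.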

Standards of rigor for AI-assisted results

For mathematical organizations and funders, the declaration recommends policies for publishing and review, including disclosure of tools and computational resources, attribution, authorship rules, and standards of conduct.¹²

It also recommends maintaining rigor through measures such as human descriptions of central arguments, formal verification where appropriate, cross-checking of theoretical and computational results, and external pre-submission review.¹³ This is a practical compromise. The declaration does not say that AI-assisted results should be rejected. It says that they should be evaluated according to the specific risks introduced by the method.

That principle can be stated generally:

the stronger the automation, the stronger the required disclosure, verification, and human explanation.

A result obtained through ordinary human reasoning requires proof. A result obtained through opaque AI search may require proof plus provenance, tool disclosure, formal checks, independent review, and a human-readable account of the central idea.
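One way to read this principle is as a cumulative checklist. The following sketch is a hypothetical encoding, not a policy stated in the declaration: the three tiers of the `Automation` enum and the assignment of requirements to tiers are assumptions made for illustration, while the individual requirements are the ones named above.

```python
from enum import IntEnum


class Automation(IntEnum):
    """Hypothetical automation tiers, ordered from least to most automated."""
    HUMAN_REASONING = 0
    AI_ASSISTED = 1
    OPAQUE_AI_SEARCH = 2


# Evidence added at each tier. The tier boundaries are an assumption made
# for illustration; the individual items come from the surrounding text.
ADDED_EVIDENCE = {
    Automation.HUMAN_REASONING: ["proof"],
    Automation.AI_ASSISTED: ["tool disclosure", "provenance"],
    Automation.OPAQUE_AI_SEARCH: [
        "formal checks",
        "independent review",
        "human-readable account of the central idea",
    ],
}


def required_evidence(level: Automation) -> list[str]:
    """Accumulate every requirement up to and including the given tier."""
    required = []
    for tier in Automation:
        if tier <= level:
            required.extend(ADDED_EVIDENCE[tier])
    return required


for level in Automation:
    print(level.name, "->", required_evidence(level))
```

The design choice worth noting is monotonicity: moving to a more automated tier can only add obligations, never remove them.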

This is not bureaucracy. It is epistemic risk control.

The declaration is also about power

A narrow reading would treat the Leiden Declaration as a document about proof rules. That reading is incomplete.

The declaration is also about institutional power. It notes that commercial AI developers use mathematical publications and formal mathematical libraries as training data, and that automatically checkable proofs are attractive because they provide scalable feedback for training general-purpose models.¹⁴

This matters because mathematics has become strategically useful to AI companies in two ways:

Mathematics is a benchmark of reasoning. If a model performs well in mathematics, companies can use that performance to support broader claims about intelligence.
Formal mathematics can become a training substrate. A formal proof assistant provides a verifier. The model proposes; the verifier checks; the training loop receives a relatively clean signal. This makes mathematics not only a domain of application, but part of the industrial machinery of AI development.
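The propose, check, and signal loop can be sketched in a few lines. What follows is a deliberately toy stand-in, not a description of any company's training system: `propose` guesses a factor pair at random where a real system would use a generative model, `verify` multiplies two integers where a real system would run a proof checker, and all function names are hypothetical.

```python
import random


def propose(n: int, rng: random.Random) -> tuple[int, int]:
    """Stand-in for a generative model: guess a factor pair of n, unchecked."""
    a = rng.randint(2, n - 1)
    return a, n // a


def verify(n: int, candidate: tuple[int, int]) -> bool:
    """Stand-in for a proof checker: accept only exact nontrivial factorizations."""
    a, b = candidate
    return a > 1 and b > 1 and a * b == n


def collect_signal(n: int, attempts: int, seed: int = 0) -> list[tuple[tuple[int, int], int]]:
    """Run the propose/verify loop and record a 0/1 reward per candidate."""
    rng = random.Random(seed)
    log = []
    for _ in range(attempts):
        candidate = propose(n, rng)
        reward = 1 if verify(n, candidate) else 0
        log.append((candidate, reward))
    return log


log = collect_signal(91, attempts=50)
print(sum(reward for _, reward in log), "of", len(log), "candidates verified")
```

The point of the sketch is the shape of the loop: the verifier is cheap and unambiguous, so every guess, right or wrong, yields a clean reward that a training process could consume.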

The declaration therefore asks mathematicians to recognize that their work is now entangled with industrial systems whose objectives may not align with the values of mathematical research.

This is not anti-industry rhetoric. It is a structural observation about asymmetric incentives. Academic mathematics values truth, explanation, attribution, autonomy, and durable understanding. Commercial AI companies may value capability, scale, market position, proprietary advantage, and publicity. These value systems can overlap, but they are not identical.

Public infrastructure as epistemic sovereignty

One of the most strategically important recommendations is the call to invest in public computational infrastructure.¹⁵ This is not a secondary policy point. It is central.

If the future of AI-assisted mathematics depends entirely on proprietary models, proprietary compute, proprietary datasets, and proprietary evaluation pipelines, then the mathematical community loses control over part of its own cognitive infrastructure.

The result would be a dependency chain:

%%{init: {"theme": "neo", "look": "handDrawn", "layout": "elk"}}%%

flowchart TD
    A["Mathematical research"] --> B["AI-assisted search and formalization"]
    B --> C["Dependence on proprietary tools, models, datasets, and compute"]
    C --> D["Research workflows become mediated by industrial platforms"]
    D --> E["External governance over access, cost, disclosure, and reproducibility"]
    E --> F["Reduced autonomy of the mathematical community"]
    F --> G["Weaker public control over the cognitive infrastructure of proof"]

Figure 2: The dependency chain created by proprietary AI infrastructure in mathematical research

For a discipline built on transparency and independent verification, this is a dangerous architecture. Public computational infrastructure is therefore not just a question of fairness or access. It is a condition for preserving the reproducibility and autonomy of mathematical research. This connects directly with the broader thesis of cognitive infrastructure. Once AI becomes embedded in research workflows, the governance of infrastructure becomes the governance of thought.

“Don’t believe the hype” is not skepticism; it is calibration

The declaration explicitly advises policymakers not to rely on press releases or popular reporting when evaluating AI capabilities in mathematics.¹⁶ This should not be misunderstood as denial of AI progress. The recent evidence is real. AI-assisted mathematics is advancing quickly. The OpenAI unit-distance result and the DeepMind formal proof-search work are not trivial demonstrations.

But accurate calibration requires separating four different claims:

AI can help produce mathematical results.
AI can autonomously produce reliable mathematics.
AI can replace mathematical communities.
AI-generated mathematics should be trusted without human governance.

These claims have very different epistemic status:

The first claim is increasingly supported.
The second is context-dependent and still limited.
The third does not follow from the first two.
The fourth is false.

The declaration’s position is strongest when read this way. It is not saying that nothing happened. It is saying that something important happened, and therefore the community must become more precise, not less.

The human role moves upward, but does not disappear

The most plausible future is not a simple replacement of mathematicians by machines. It is a redistribution of mathematical labor.

AI systems may increasingly perform:

candidate generation,
lemma search,
example construction,
counterexample search,
formal proof attempts,
translation between informal and formal proof,
literature summarization,
proof repair,
large-scale conjecture testing.

Human mathematicians will still be needed for:

problem selection,
definition formation,
interpretation,
judgment of significance,
explanation,
responsibility,
ethical assessment,
field-level prioritization,
education,
community governance.

This is not a demotion of human mathematicians. It is a migration toward higher-order judgment. But that migration is not automatic. If education is weakened, if early-career researchers become dependent on tools before acquiring taste, and if institutions reward volume over insight, then the community may lose the very judgment it needs most.

The declaration is therefore also a document about apprenticeship. Mathematical culture is not transmitted only through final proofs. It is transmitted through failed attempts, examples, informal explanations, seminars, mentorship, taste, and the slow formation of intuition. If AI removes too much of that process too early, the short-term productivity gain may produce a long-term deficit in human understanding.

A useful model: proof pipeline versus meaning pipeline

We can represent the emerging AI-assisted research process as two coupled pipelines.

%%{init: {"theme": "neo", "look": "handDrawn", "layout": "elk"}}%%

flowchart TD
    A["Problem or conjecture"] --> B["AI-assisted exploration"]
    B --> C["Candidate construction or proof sketch"]
    C --> D["Formalization or computational check"]
    D --> E["Verified or refuted technical artifact"]

    E --> F["Human mathematical explanation"]
    F --> G["Attribution and provenance"]
    G --> H["Assessment of depth and significance"]
    H --> I["Peer review and publication"]
    I --> J["Integration into mathematical knowledge"]

    E -. insufficient by itself .-> X["Technically valid but possibly meaningless output"]

Figure 3: From proof artifact to mathematical knowledge: the proof pipeline and the meaning pipeline

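The first pipeline, running from the problem to a verified or refuted technical artifact, produces proof artifacts. The second, running from human explanation to integration into mathematical knowledge, produces mathematical knowledge. The dotted edge in the figure marks what remains when the first runs without the second: output that is technically valid but possibly meaningless. The Leiden Declaration is, in effect, a defense of the second pipeline.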
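The distinction between the two pipelines can be made concrete with a minimal sketch. The stage names below are taken from Figure 3, but the class, the function, and the status strings are hypothetical illustrations, not part of the declaration or of any existing tool. As a simplification, the sketch does not enforce the order of the meaning stages; it only records which of them have been completed.

```python
from dataclasses import dataclass, field

# Stage names follow Figure 3; everything else here is a hypothetical model.
MEANING_STAGES = (
    "human_explanation",
    "attribution_and_provenance",
    "depth_and_significance",
    "peer_review_and_publication",
)


@dataclass
class Artifact:
    """A candidate result moving through both pipelines."""
    statement: str
    formally_checked: bool = False  # outcome of the proof pipeline
    completed_stages: set[str] = field(default_factory=set)

    def complete(self, stage: str) -> None:
        """Record that humans have signed off on one meaning-pipeline stage."""
        if stage not in MEANING_STAGES:
            raise ValueError(f"unknown stage: {stage}")
        self.completed_stages.add(stage)


def status(artifact: Artifact) -> str:
    """Classify an artifact by how far it has travelled."""
    if not artifact.formally_checked:
        return "unverified candidate"
    if all(stage in artifact.completed_stages for stage in MEANING_STAGES):
        return "integrated mathematical knowledge"
    # Checked, but the meaning pipeline is incomplete (node X in Figure 3).
    return "technically valid but possibly meaningless output"
```

In this model a formal check alone never yields the final status: an artifact with `formally_checked=True` and no completed meaning stages is classified as technically valid but possibly meaningless, which is the case the dotted edge in the figure is meant to capture.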

Why this matters beyond mathematics

Mathematics is the cleanest test case because mathematical truth has unusually strict validation conditions. But the same pattern will appear elsewhere.

In software engineering, AI can generate code, but architecture, maintainability, security, and accountability remain human governance problems. In cybersecurity, AI can generate attack paths and defensive hypotheses, but prioritization, risk ownership, and operational response remain institutional problems. In enterprise architecture, AI can generate documentation, process maps, and design alternatives, but transformation quality depends on stakeholder judgment, constraints, incentives, and strategic coherence. In scientific research, AI can generate hypotheses, simulations, and candidate explanations, but experimental design, causal interpretation, and ethical deployment remain human responsibilities.

The mathematics case is therefore not isolated. It is a preview. When generative systems reduce the cost of producing plausible intellectual artifacts, every serious knowledge domain must strengthen the institutions that evaluate meaning.

Conclusion: the answer is governance, not nostalgia

The Leiden Declaration is important because it refuses the two easy errors:

The first error is technological fatalism: AI will transform mathematics, therefore the community must simply adapt to whatever industry builds.
The second error is nostalgic denial: AI does not understand mathematics as humans do, therefore nothing fundamental has changed.

Both positions are weak. From first principles, a technology does not need to reproduce human cognition internally in order to reorganize human activity externally. Calculators did not need number sense. Compilers did not need software engineering judgment. Search engines did not need scholarship. What matters is whether a system can perform enough of a formerly scarce function at sufficient scale to change the economics of the surrounding practice.

AI is beginning to do that in mathematics. The Leiden Declaration is the mathematical community’s attempt to say: if the economics of proof, search, and formalization are changing, then the values of mathematics must be made explicit before they are accidentally redesigned by tools, platforms, incentives, and markets.

That is the correct response: not refusal, not surrender, but governance.

The future of mathematics will not be determined only by whether AI systems can prove theorems. It will be determined by whether human communities can preserve proof as understanding, authorship as responsibility, attribution as intellectual memory, and research as a disciplined search for meaning rather than an industrial process for generating valid-looking output.

Footnotes

Montano, A. (2026). The New Cognitive Infrastructure of Science. Author’s blog. URL
Montano, A. (2026). The Erdős Moment of AI. Author’s blog. URL
Leiden Declaration on Artificial Intelligence and Mathematics, section About our values. URL
Tsoukalas, G., Kovsharov, A., Shirobokov, S., Surina, A., Firsching, M., Bérczi, G., Ruiz, F. J. R., Suggala, A., Wagner, A. Z., Wieser, E., Yu, L., Huang, A., Horváth, M. Z., Ferrauiolo, A., Michalewski, H., Grosu, C., Hubert, T., Balog, M., Kohli, P., & Chaudhuri, S. (2026). Advancing mathematics research with AI-driven formal proof search. arXiv. DOI
OpenAI (2026). An OpenAI model has disproved a central conjecture in discrete geometry. Company website. URL
Montano, A. (2026). The New Cognitive Infrastructure of Science. Author’s blog. URL
Leiden Declaration on Artificial Intelligence and Mathematics, section About our values. URL
Montano, A. (2026). The New Cognitive Infrastructure of Science. Author’s blog. URL
Leiden Declaration on Artificial Intelligence and Mathematics, recommendation Affirm the humanity of authorship. URL
Leiden Declaration on Artificial Intelligence and Mathematics, recommendation Put effort into proper attribution. URL
Leiden Declaration on Artificial Intelligence and Mathematics, section Potential threats. URL
Leiden Declaration on Artificial Intelligence and Mathematics, recommendation Take the lead on policies for publishing and reviewing. URL
Leiden Declaration on Artificial Intelligence and Mathematics, recommendation Maintain standards of rigor. URL
Leiden Declaration on Artificial Intelligence and Mathematics, section Recommendations for commercial artificial intelligence. URL
Leiden Declaration on Artificial Intelligence and Mathematics, recommendation Invest in public computational infrastructure. URL
Leiden Declaration on Artificial Intelligence and Mathematics, recommendation Don’t believe the hype. URL
