AI Adoption, Productivity, and the Missing Middle

Abstract

This article reviews Banca d’Italia’s QEF 1009 on artificial intelligence adoption, productivity, and policy design. Its central argument is that the paper’s main contribution is not a simple forecast of AI-driven growth, but a disciplined distinction between nominal AI adoption, deep organizational integration, measurable productivity, and domestic value-added capture. QEF 1009 documents a rapid increase in AI use among Italian firms, from 27 percent in 2025 to 32 percent at the beginning of 2026, while showing that intensive integration remains limited, at about 5 percent. Its short-run firm-level econometric analysis does not yet identify systematic effects on revenue per employee, employment, or investment, despite strong task-level evidence from the international literature on writing, coding, customer service, and professional work.

The review interprets this apparent tension through a production-system lens. A task-level gain becomes firm-level productivity only when it relaxes a binding constraint, propagates through adjacent workflow redesign, and is supported by complementary assets such as clean data, integration architecture, process ownership, model governance, cybersecurity, managerial capability, and worker skills. This explains why AI adoption may follow a productivity J-curve: integration costs, organizational change, and intangible investment can precede measurable gains.

The article extends the Banca d’Italia argument from productivity potential to value-added realization. AI can generate durable national economic value only if embedded in an ecosystem capable of translating digital innovation into vertical applications, scalable suppliers, interoperable data infrastructures, accessible compute, procurement channels, skills, regulatory certainty, and operating-model transformation. This is especially relevant for Italy, where scientific, industrial, and supercomputing assets exist, but the conversion chain from research and compute to firm-level adoption and domestic value capture remains fragmented.

The appendix formalizes the same logic by distinguishing additive, bottleneck, serial-cycle-time, queueing, Leontief, and O-ring production structures. These models show why large local AI gains may be diluted, delayed, amplified, or destroyed before appearing in firm-level productivity statistics. The resulting policy implication is restrictive: public support should not merely increase AI adoption counts, but remove the conversion bottlenecks that prevent firms from turning AI tools into process productivity, firm productivity, and domestic value added.

A review of Banca d’Italia QEF 1009 on AI adoption, productivity, and the ecosystem required to convert task-level gains into firm productivity, industrial capability, and domestic value added.

Introduction: the main result of the paper

Banca d’Italia’s Questioni di Economia e Finanza n. 1009, L’adozione dell’intelligenza artificiale: effetti su produttività e politiche a sostegno, is a useful paper because it refuses the simplest narrative on artificial intelligence. It does not say that AI adoption automatically raises productivity, and it does not say that the absence of immediate productivity gains proves that AI is economically irrelevant. Its central proposition is more precise: AI may materially raise Italian productivity over the next decade, but only if adoption becomes broad, deep, organizationally integrated, and supported by complementary assets such as data, skills, compute capacity, managerial capability, vertical applications, and regulatory certainty.¹

The headline facts are deliberately asymmetric. On the one hand, adoption is rising. According to the Invind survey of firms with at least 20 employees, the share of Italian firms using AI doubled in 2025 to 27 percent and reached 32 percent at the beginning of 2026. On the other hand, intensive integration remains rare, at about 5 percent. Therefore, the first empirical distinction is not between adoption and non-adoption, but between shallow use and deep integration.

The second main result is that the paper does not yet find systematic short-run productivity effects at the firm level. Its econometric analysis, based on Invind data for 2022–2026 and a staggered difference-in-differences design on a balanced panel of around 900 firms, finds no systematic effect of AI adoption on revenue per employee, employment, or investment. The paper does not interpret this as evidence that AI has no economic value. It interprets it as evidence that firm-level productivity requires time, integration, and reorganization.

The third main result is long-run and model-based. Simulations suggest that, over a ten-year horizon, AI adoption could increase Italian labour productivity growth by approximately 0.2 percentage points per year under slow adoption, 0.7 under medium adoption, and 1.1 under rapid adoption. The policy implication is direct: the difference between slow and rapid adoption is not a minor modelling detail; it is the difference between AI as a marginal tool and AI as a macroeconomic productivity lever.

The fourth result concerns policy. The paper argues that public intervention can be justified, but not as generic enthusiasm for AI. The justification is economic: information failures, coordination failures, missing standards, fragmented demand, externalities, network effects, and insufficient supply of vertical applications can cause under-adoption relative to the social optimum. The paper is therefore skeptical of broad, non-selective subsidies. It prefers a coordinated architecture: adoption support, proof-of-concept funding, stronger technology-transfer institutions, public procurement, data spaces, regulatory sandboxes, compute access, and human-capital investment.

The most important concept in the paper is the missing middle. The AI market may polarize between generic tools, which are accessible but weakly integrated into firm processes, and bespoke enterprise implementations, which are available mainly to large firms. For Italy, with its dense population of small and medium-sized enterprises, the economically decisive layer is in between: reusable, sector-specific, process-aware AI applications that can be adopted by firms that do not have the capital, data-science staff, or organizational slack of large corporations.

What the paper is really asking

The paper is not asking whether ChatGPT, coding assistants, or AI agents can make a single worker faster. That question has already received positive answers in parts of the empirical literature. The paper asks a harder question: when does a local improvement in a task become a measurable improvement in firm productivity, and when does firm-level adoption become macroeconomic growth?

The distinction is fundamental. A firm is not a collection of isolated tasks. It is a production system. If AI improves one task but the bottleneck lies elsewhere, aggregate output may not increase. If a sales team writes faster emails but the delivery process, pricing process, compliance process, or product configuration process remains unchanged, the gain may be absorbed as slack, quality improvement, or local convenience rather than revenue per employee. If an engineer produces code faster but testing, release governance, architecture review, cybersecurity validation, or customer acceptance remain the limiting factors, measured productivity may not rise proportionally.

This is why the paper’s first-principles logic is stronger than a simple adoption survey. Productivity is output divided by input. AI changes productivity only if it changes the production function, the feasible output set, or the input requirement of binding constraints. Mere tool availability is not sufficient.
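The bottleneck argument can be illustrated with a minimal sketch, assuming a purely serial process whose throughput equals the capacity of its slowest stage. The stage names and capacities are hypothetical and are not taken from QEF 1009.

```python
def throughput(capacities):
    """Units per day of a serial process: limited by the slowest stage."""
    return min(capacities.values())

# Hypothetical stage capacities in units per day (illustrative only).
stages = {"drafting": 40, "compliance_review": 25, "delivery": 30}
baseline = throughput(stages)

# AI makes drafting 50 percent faster, but drafting is not the bottleneck.
faster_drafting = dict(stages, drafting=stages["drafting"] * 1.5)

# AI makes compliance review 50 percent faster: the bottleneck moves to delivery.
faster_review = dict(stages, compliance_review=stages["compliance_review"] * 1.5)

print(baseline, throughput(faster_drafting), throughput(faster_review))
```

Under these assumptions, the same 50 percent local gain leaves output unchanged in one case and raises it by 20 percent in the other, because only the second intervention touches the binding constraint, and even then the gain is capped by the next constraint in line.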

Method used in the paper

The paper combines four layers of evidence:

It uses adoption surveys. The main Italian evidence comes from Invind, Banca d’Italia’s survey of firms with at least 20 employees. Comparable European evidence comes from Eurostat data on firms with at least 10 employees. These sources allow the authors to distinguish adoption rates, intensity of adoption, functions affected, and barriers to adoption.
It reviews task-level microeconomic evidence from the literature. The cited studies report strong productivity gains in specific tasks such as writing, customer service, coding, consulting-type work, and online retail. This evidence is causal or quasi-causal in many cases, but it is task-bound. The paper uses it to establish that AI can improve local efficiency, while warning that local efficiency is not equivalent to firm productivity.
It performs a firm-level econometric analysis using Invind data for 2022–2026. The method is a staggered difference-in-differences design. Firms adopted AI in different years, specifically 2024, 2025, or 2026. For each adoption cohort, the control group consists of firms that had not yet adopted AI in the same period. The panel includes around 900 firms and compares firms while controlling for sector, size class, geographical area, and propensity to invest in advanced technologies as of 2022.

The estimator produces treatment effects by cohort and period, then aggregates them into an average estimate and an event-study profile. The paper also states that covariates are incorporated through a doubly robust method and that inference is based on wild bootstrap. In substance, the empirical question is: conditional on observable similarity, do adopters show a different trajectory from comparable non-adopters after adoption?
It uses long-run macroeconomic simulations, but it does so in two distinct steps that should not be collapsed. First, QEF 1009 uses external task-based macroeconomic benchmarks, especially Acemoglu, and Aghion and Bunel, to show why AI-productivity estimates are highly sensitive to assumptions about task exposure, economically convenient adoption, efficiency gains, and labour shares.² ³ Second, it moves to OECD-style and Italy-specific simulations based on sectoral AI exposure, input-output linkages, and general-equilibrium propagation. For Italy, the paper reports simulations using a model of shock transmission through the Italian production network, with updated input-output tables and elastic labour supply. The Italy-specific estimates are therefore not direct extrapolations from Acemoglu or from Aghion and Bunel; those studies function as benchmark decompositions, while the Italian scenarios are built on the production-network modelling reported by Banca d’Italia.
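The cohort-by-period comparison used in the firm-level analysis can be sketched in a few lines. This is a deliberately stripped-down illustration, assuming a noise-free synthetic panel, not-yet-treated firms as controls, and a simple unweighted average of the cohort-period effects. It omits the covariates, the doubly robust adjustment, and the wild-bootstrap inference described above, and it should not be read as the paper's estimator. All firm data and the `true_effect` parameter are hypothetical.

```python
from statistics import mean

YEARS = range(2022, 2027)

def make_panel(true_effect):
    """Hypothetical panel: outcome = firm level + common trend + adoption effect."""
    cohorts = [2024, 2025, 2026, None, None]  # None = never adopts in the window
    panel = []
    for firm_id, cohort in enumerate(cohorts):
        level = 100 + 5 * firm_id
        outcomes = {}
        for year in YEARS:
            treated = cohort is not None and year >= cohort
            outcomes[year] = level + 2 * (year - 2022) + (true_effect if treated else 0.0)
        panel.append({"cohort": cohort, "y": outcomes})
    return panel

def att(panel, cohort, year):
    """Effect for one adoption cohort in one year, relative to the year before adoption."""
    base = cohort - 1

    def change(firm):
        return firm["y"][year] - firm["y"][base]

    treated = [change(f) for f in panel if f["cohort"] == cohort]
    # Controls: firms that have not adopted by `year` (later cohorts or never adopters).
    controls = [change(f) for f in panel if f["cohort"] is None or f["cohort"] > year]
    return mean(treated) - mean(controls)

panel = make_panel(true_effect=3.0)
estimates = {
    (g, t): att(panel, g, t)
    for g in (2024, 2025, 2026)
    for t in YEARS
    if t >= g
}
average = mean(list(estimates.values()))
print(estimates)
print("average effect:", average)
```

Because the synthetic data contain a common trend and no noise, every cohort-period estimate recovers the assumed effect. In real data the same comparison is only as credible as the assumption that adopters and not-yet-adopters would otherwise have followed parallel trajectories.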

The logic of the long-run model is this: AI raises efficiency in exposed tasks; exposed tasks belong to sectors; sectors are connected through input-output relationships; productivity shocks in central sectors propagate more strongly than shocks in peripheral sectors; the aggregate effect depends on adoption speed, adoption depth, sectoral structure, labour reallocation, and capital deepening.

Limitations of the method

The paper is careful about its own limits, and those limits are not cosmetic:

Measurement. AI adoption is not a scalar variable. It may mean occasional use of a chatbot, systematic use of generative AI in office work, embedded AI in production systems, predictive maintenance, AI-assisted coding, automated customer support, AI-enabled document processing, or end-to-end workflow redesign. Treating these forms as one binary variable risks mixing technologies, intensities, and organizational meanings.
Time. The available firm-level window is short. If AI behaves like a general-purpose technology, the initial phase can contain integration costs, training costs, process redesign, data preparation, software integration, governance changes, and managerial learning. In that case, the absence of an immediate productivity effect is not surprising.
Selection. Firms that adopt AI earlier may already be more dynamic, more digitalized, better managed, more capitalized, or more exposed to AI-suitable tasks. The difference-in-differences design attempts to control for observable factors, but it cannot eliminate all unobservable differences in managerial quality, data maturity, organizational discipline, or latent growth trajectory.
Aggregation. Task-level evidence cannot be mechanically scaled to firm-level productivity. A 40 percent time saving in one task does not imply a 40 percent firm-level gain. The relevant variables are the task’s share of total cost, its role in the workflow, whether it is a bottleneck, whether output demand expands, whether quality changes, and whether complementary tasks are redesigned.
Macro simulations. The long-run estimates depend heavily on assumptions: the share of tasks exposed to AI, the share where adoption is economically convenient, the size of efficiency gains, the labour share in exposed activities, sectoral elasticities, price transmission, mark-ups, and labour mobility. The paper explicitly shows that different parameter choices generate very different annual TFP growth estimates.
Domestic value-added assumption. The paper notes that if productivity improvements occur abroad in sectors from which Italy imports intermediate inputs, the model may overstate some domestic value-added effects. This matters because AI supply chains and AI services are international.
Market power. Some simulations assume that productivity gains are transmitted to prices through constant mark-ups. But if AI suppliers or adopting firms retain gains as profits, propagation through lower prices and cheaper intermediate inputs may be weaker.
Labour reallocation. A model can allow labour to move across sectors, but the Italian labour market may be constrained by skills mismatch, geographic immobility, contractual rigidity, and sector-specific human capital. If labour cannot move toward expanding AI-complementary activities, macro gains may be smaller or slower.
Policy evaluation. The international policy comparison is informative, but some parts are necessarily qualitative. Public AI investment figures are not always comparable across countries, because they may mix actual expenditure, commitments, defence spending, research funding, infrastructure funding, and political announcements. The paper’s policy table is useful as a map, not as a precise causal ranking.
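The aggregation caveat can be made concrete with a back-of-the-envelope calculation. The sketch assumes that output is unchanged, that nothing else in the workflow is redesigned, and that the hours saved are fully released. The 10 percent task share is a hypothetical value; only the 40 percent task-level saving echoes the figure used above.

```python
def firm_level_time_saving(task_share, task_saving):
    """Fraction of total labour time saved when a single task becomes faster.

    task_share: fraction of total labour time spent on the task (assumed).
    task_saving: fraction of that task's time removed by AI.
    """
    return task_share * task_saving

def implied_productivity_gain(task_share, task_saving):
    """Gain in output per hour if output is constant and saved hours are released."""
    remaining_time = 1.0 - firm_level_time_saving(task_share, task_saving)
    return 1.0 / remaining_time - 1.0

share, saving = 0.10, 0.40
time_saved = firm_level_time_saving(share, saving)
gain = implied_productivity_gain(share, saving)
print(f"time saved: {time_saved:.1%}, productivity gain: {gain:.1%}")
```

With these inputs a 40 percent task-level saving translates into roughly a 4 percent firm-level effect, and even that is an upper bound if the saved hours are absorbed as slack rather than released or redeployed.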

Reported facts

Adoption in Italy

The paper reports that AI adoption among Italian firms with at least 20 employees rose rapidly. It doubled in 2025 to 27 percent and reached 32 percent at the beginning of 2026. This is the most visible positive fact in the paper.

But the intensive-use figure is far lower. Only about 5 percent of firms report intensive integration of AI into business processes. This is the central diagnostic fact: adoption is spreading, but deep operational transformation remains rare.

The paper also reports that AI adoption is more common among larger firms and in services. This is consistent with a simple cost-and-capability interpretation: larger firms have more data, more standardized processes, larger fixed-cost absorption capacity, more IT staff, and more managerial bandwidth to absorb new technologies.

Compared with Europe, Italy remains behind. Using comparable 2025 data for firms with at least 10 employees, Italy’s adoption rate is four percentage points below the EU average and almost ten percentage points below Germany’s.

Where firms use AI

The most common business functions affected by AI are commercial activities, production of goods and services, and administrative compliance. The paper’s language is important: AI is being used mainly to optimize existing phases of processes, not to create new products or services.

Around 30 percent of firms use generative AI applications, and text generation dominates. Among these firms, a little more than half also use other applications such as customer or employee chatbots, AI agents, or code-generation tools.

The pattern suggests a hierarchy of ease. Text generation is the low-friction entry point because it can be adopted with limited systems integration. Chatbots, agents, code generation, and process automation require more data, governance, workflow embedding, and monitoring.

Task-level productivity facts

The paper reports several strong task-level effects from the literature. For writing and text-production tasks, one cited study reports an average completion-time reduction of around 40 percent and a quality increase of around 18 percent. For customer service, another study reports an average productivity increase of around 14 percent. For coding tasks, a field experiment reports productivity gains of around 55 percent, measured through lines of code produced. For consulting-type professional tasks, users of AI complete more tasks and do so faster, with execution times around 25 percent lower.

These facts establish that AI can raise local task efficiency. But the paper’s review is careful: task-level productivity is not equivalent to firm-level productivity. The translation from one to the other depends on process architecture.

Firm-level productivity facts

The paper’s own firm-level analysis does not identify systematic short-run effects of AI adoption on revenue per employee, employment, or investment.

This is one of the most important results. It creates an apparent contradiction: task-level gains are large, but firm-level effects are not yet visible. The paper resolves the contradiction through system logic. Firms require complementary adjustments before local task gains become aggregate productivity gains.

The same pattern appears in survey responses. Among firms using AI, 70 percent report that adoption has not yet affected labour productivity. Expectations are more positive: around half expect a positive productivity effect over the next three years, while 35 percent expect no change.

This is a useful empirical split between realized effects and expected effects. Realized effects are still limited; expected effects are materially more optimistic.

Long-run productivity estimates

The paper reports a wide range of long-run estimates because the results depend on assumptions.

To frame long-run uncertainty, QEF 1009 compares two external task-based macroeconomic benchmarks. Under Acemoglu’s more conservative assumptions, AI contributes approximately 0.07 percentage points to annual TFP growth; under Aghion and Bunel’s more optimistic assumptions, the corresponding figure is approximately 0.68 percentage points.⁴ ⁵ QEF 1009 uses this contrast as a decomposition exercise, not as a final forecast for Italy. The gap between the two estimates is driven by different assumptions about the share of tasks exposed to AI, the fraction of exposed tasks for which adoption is economically convenient, the size of the efficiency gain when adoption occurs, and the labour share in exposed activities. The paper then turns to OECD-style estimates and to Italy-specific production-network simulations, where the reported annual labour-productivity gains for Italy range from 0.2 percentage points under slow adoption to 1.1 percentage points under rapid adoption.
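The sensitivity of the task-based benchmarks follows from their multiplicative structure. The sketch below uses a simplified reading in which the cumulative TFP gain is the product of the four assumptions listed above, spread evenly over ten years. The parameter values are hypothetical, chosen only so that the two products land near the 0.07 and 0.68 figures quoted above; they are not presented as the parameters used by the cited authors or by QEF 1009.

```python
def annual_tfp_gain_pp(exposed_share, adoption_share, efficiency_gain,
                       labour_share, years=10):
    """Annual TFP contribution in percentage points over the horizon.

    Simplified assumption: cumulative gain = product of the four shares.
    """
    cumulative = exposed_share * adoption_share * efficiency_gain * labour_share
    return 100 * cumulative / years

# Hypothetical parameter sets (illustrative only, not the authors' calibrations).
conservative = annual_tfp_gain_pp(
    exposed_share=0.20, adoption_share=0.23, efficiency_gain=0.27, labour_share=0.53
)
optimistic = annual_tfp_gain_pp(
    exposed_share=0.40, adoption_share=0.80, efficiency_gain=0.40, labour_share=0.53
)
print(f"conservative: {conservative:.2f} pp, optimistic: {optimistic:.2f} pp")
```

Because the terms multiply, moderately different views on exposure, adoption, and efficiency compound into a roughly tenfold difference in the headline number, which is why the comparison works better as a decomposition than as a forecast.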

The OECD estimates for G7 economies suggest annual labour-productivity gains from AI adoption between 0.2 and 1.3 percentage points. For Italy, the OECD-style estimate is approximately 0.19 percentage points under slow adoption and 0.89 under rapid and extensive adoption.

The paper’s own Italian simulations report annual labour-productivity gains over a decade of 0.2 percentage points under slow adoption, 0.7 under medium adoption, and 1.1 under rapid adoption. It also reports a slight positive employment effect, around one tenth of a percentage point in each scenario.
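The practical meaning of these annual figures is easier to see once they are compounded. The following sketch approximates the productivity level after ten years relative to a no-AI baseline, assuming the annual gain is constant and that baseline growth is small enough for the additional growth to compound on its own. Only the three annual rates come from the text; the compounding convention is an assumption of this sketch.

```python
def cumulative_level_gain(annual_pp, years=10):
    """Approximate level gain versus baseline after compounding an annual gain."""
    return (1 + annual_pp / 100) ** years - 1

scenarios = {"slow": 0.2, "medium": 0.7, "rapid": 1.1}
levels = {name: cumulative_level_gain(pp) for name, pp in scenarios.items()}

for name, gain in levels.items():
    print(f"{name}: about {gain:.1%} higher productivity level after ten years")
```

Under these assumptions the gap between slow and rapid adoption is the gap between a productivity level about 2 percent higher and one more than 11 percent higher after a decade.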

The decisive variable is therefore not whether AI exists, but whether adoption becomes fast, deep, and complementary.

Sectoral propagation facts

The paper decomposes the contribution to GDP growth by sector under a rapid-adoption scenario. The largest contribution comes from manufacturing, about 2 percentage points over the decade. Commerce follows, with about 1 percentage point. Professional, scientific, and technical activities contribute about 0.8 percentage points.

The professional-services result is especially interesting. The paper notes that this sector has centrality in the production network because it serves and connects multiple supply chains. As a consequence, a productivity improvement there can generate indirect effects beyond its own sectoral size.

This is one of the most enterprise-relevant points in the paper. AI adoption in a central service function can have leverage if that function feeds many downstream production processes. In enterprise-architecture language, the productivity effect depends not only on local cost share but also on architectural centrality.
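The role of centrality can be illustrated with a toy production network. The sketch assumes a Cobb-Douglas economy in which, to a first-order approximation, the aggregate effect of a sector's productivity improvement equals its sales-to-GDP ratio (its Domar weight), computed from final-demand shares and an input-output matrix. The three sectors, all shares, and the function name `domar_weights` are hypothetical; this is not the model used by Banca d’Italia.

```python
def domar_weights(final_shares, input_shares, iterations=200):
    """Solve weights[j] = final_shares[j] + sum_i weights[i] * input_shares[i][j].

    input_shares[i][j] is the share of sector i's sales spent on inputs from j.
    Fixed-point iteration converges because every row sums to less than one.
    """
    n = len(final_shares)
    weights = list(final_shares)
    for _ in range(iterations):
        weights = [
            final_shares[j] + sum(weights[i] * input_shares[i][j] for i in range(n))
            for j in range(n)
        ]
    return weights

sectors = ["manufacturing", "commerce", "professional_services"]
final_shares = [0.5, 0.4, 0.1]   # hypothetical shares of final demand
input_shares = [
    [0.2, 0.1, 0.2],             # manufacturing's purchases from each sector
    [0.1, 0.0, 0.2],             # commerce's purchases
    [0.0, 0.0, 0.1],             # professional services' purchases
]

weights = domar_weights(final_shares, input_shares)
for name, share, weight in zip(sectors, final_shares, weights):
    print(f"{name}: final-demand share {share:.2f}, aggregate influence {weight:.2f}")
```

In this toy economy professional services account for a tenth of final demand but carry an aggregate influence more than three times as large, because their output is embedded in the other two sectors.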

Barriers to adoption

The paper reports that, in 2025, around 10 percent of Italian firms had not adopted AI despite having considered it. Among these firms, only a residual minority claim that AI is not useful for their production processes.

The main barrier is lack of skills. But the paper does not reduce the problem to skills alone. Firms also report broader technological and organizational unreadiness, insufficient data integration, regulatory uncertainty, and risks associated with adoption.

This matters because a single-policy response would be structurally inadequate. A training subsidy does not solve missing data standards. A tax credit does not solve regulatory uncertainty. A chatbot voucher does not solve process redesign. A compute credit does not solve the absence of vertical applications.

Policy facts

The paper’s policy section distinguishes three families of intervention:

Demand-side support. This includes information, advisory services, testbeds, demonstrators, Digital Innovation Hubs, Competence Centers, pilots, proof-of-concept projects, grants, and vouchers. The purpose is to help firms discover economically meaningful use cases and reduce initial adoption risk.
Supply-side support. This includes support for startups, scale-ups, applied research, technology transfer, public procurement, and the development of vertical applications. The goal is to solve the missing-middle problem by creating scalable solutions between generic AI tools and expensive bespoke implementations.
Enabling infrastructure. This includes data spaces, interoperability standards, regulatory sandboxes, legal certainty, access to high-performance computing, compute credits, technical support, upskilling, reskilling, management training, and talent attraction.

The paper is explicit that the Italian policy landscape contains useful institutions but lacks sufficient coherence and scale. Digital Innovation Hubs and the Italian Competence Centers, the Centri di Competenza ad alta specializzazione created within the Industry 4.0 policy architecture, provide territorial coverage, but their funding is described as limited, unstable, and fragmented. The paper reports cumulative funding for Italian Competence Centers of €186 million over 2019–2025, and contrasts this with annual public resources of €2.2 billion for Germany’s Fraunhofer network and £320 million for the UK Catapult centres in 2023.

Innovation ecosystem facts

The paper reports that Italy has scientific capabilities but weak transformation of science into industrial AI applications. In 2022, highly cited AI scientific articles produced by researchers resident in Italy were a little more than half the German number, while AI patent applications were only about one-twentieth of Germany’s.

This gap is relevant because productivity from AI does not arise only from buying tools. It also depends on the domestic ecosystem’s capacity to adapt models, build sector-specific applications, integrate them into production systems, and diffuse them through suppliers.

Compute infrastructure facts

Italy has significant high-performance computing assets. The paper discusses Eni’s HPC6 and Cineca’s Leonardo.

HPC6 is presented as a major industrial supercomputing infrastructure, used for energy applications, simulation, and potentially AI and machine-learning workloads. Leonardo, hosted by Cineca at the Bologna Technopole, is a public supercomputing infrastructure primarily oriented toward scientific research but increasingly relevant for AI.

However, the paper’s key point is not that Italy lacks compute. The point is that compute must be made usable by firms. The paper reports that, in 2024, only about 5 percent of Leonardo’s computing power was used for industrial projects. The planned IT4LIA AI Factory is therefore important because it aims to convert research-oriented compute capacity into a more accessible industrial AI infrastructure.

This distinction is critical. Supercomputing capacity is not automatically an industrial-policy asset. It becomes one only when firms can access it through practical interfaces, technical support, credits, datasets, tools, and sector-specific experimentation paths.

Review assessment

The strength of the paper is its system-level framing. Many AI-adoption studies stop at survey rates. Many productivity studies stop at task experiments. Many policy papers stop at generic recommendations. QEF 1009 is more useful because it connects adoption depth, task productivity, firm-level measurement, macro propagation, and policy design.

Its most important analytical move is the distinction between local task acceleration and aggregate productivity. This distinction should be obvious, but it is often lost in public discussion. A production system improves when bottlenecks move, constraints are relaxed, throughput rises, quality-adjusted output increases, or input requirements fall. A tool that makes one task faster is only a candidate productivity improvement until the surrounding process changes.

The paper is also valuable because it avoids naïve techno-optimism. It does not infer long-run productivity from impressive demonstrations. At the same time, it avoids premature pessimism. It does not infer low potential from absent short-run firm-level effects. The resulting position is empirically cautious but strategically serious.

The weakest part is unavoidable: the policy architecture is more convincing than the policy measurement. International comparisons of AI policy maturity are useful, but public AI spending, institutional capacity, regulatory experimentation, and ecosystem quality are difficult to compare. A table can rank policy areas, but it cannot fully measure execution quality.

Another limitation is that the firm-level empirical analysis necessarily compresses heterogeneous forms of AI adoption into a tractable treatment variable. The real economic distinction is not adopter versus non-adopter. It is more granular: experimental use, function-level adoption, workflow integration, process redesign, product innovation, automated decision support, and operating-model transformation. Future empirical work will need adoption-intensity measures closer to enterprise architecture than to procurement labels.

Implications for firms

For enterprise decision makers, the paper implies that AI adoption should not be governed as a software rollout. It should be governed as an operating-model transformation.

Business constraint. The first managerial question is not “Which AI tool should we buy?” but “Which business constraint can AI relax?” If the constraint is document throughput, customer-response latency, engineering cycle time, compliance review, maintenance prediction, sales qualification, or software delivery, then adoption must be designed around that constraint.
Complementary assets. The second question is whether the firm has the necessary complements. AI requires process ownership, clean data, integration architecture, security governance, model-risk management, human supervision, training, and measurable KPIs. Without these complements, adoption may remain visible but shallow.
Process advantage. The third question is whether the firm is merely buying generic capability or building process advantage. Generic AI tools may be useful, but they are easy for competitors to buy as well. Durable productivity gains are more likely when AI is embedded into firm-specific processes, data assets, decision rights, and supplier/customer interfaces.
Ecosystem integration. The fourth question is whether adoption is isolated or ecosystemic. In supply chains, the value of AI may depend on shared data standards, interoperable systems, and adoption by upstream or downstream partners. This is precisely where public policy, procurement, and sectoral data spaces can matter.

From AI adoption to value added: the ecosystem condition

The Banca d’Italia paper can be pushed one step further. Its central empirical distinction is between adoption and productivity. The next distinction is between productivity potential and value-added realization. AI may improve tasks; it may even improve firm-level productivity after organizational adjustment; but a country captures durable value added only when it has an ecosystem able to translate digital innovation into products, processes, services, intellectual property, exports, margins, and new firms.

The first-principles mechanism is simple. Value added is the difference between output value and intermediate input cost. AI increases value added only if it either raises the value of output, lowers the real cost of inputs, compresses cycle time, increases quality-adjusted throughput, enables new products, or allows firms to perform activities that were previously technically or economically infeasible. A chatbot subscription, an AI coding assistant, or an isolated document-generation tool is not yet this mechanism. It is an input. The economic question is whether that input is absorbed into a production architecture.
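A small numerical sketch makes the accounting explicit. It assumes a hypothetical firm with figures in thousands of euros per year; none of the numbers come from QEF 1009.

```python
def value_added(output_value, intermediate_inputs):
    """Value added = output value minus the cost of intermediate inputs."""
    return output_value - sum(intermediate_inputs.values())

# Hypothetical firm before any AI purchase.
inputs = {"materials": 600, "services": 150}
before = value_added(1000, inputs)

# The firm buys an AI subscription and changes nothing else:
# the tool is one more intermediate input.
with_tool = dict(inputs, ai_subscription=20)
tool_only = value_added(1000, with_tool)

# Same subscription, but process redesign raises output value
# and reduces purchased services.
integrated_inputs = dict(with_tool, services=130)
integrated = value_added(1050, integrated_inputs)

print(before, tool_only, integrated)
```

In the tool-only case value added falls by the cost of the subscription. It rises only in the third case, where the tool is absorbed into a changed production process.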

This is why the relevant unit of analysis is not the single model, and not even the single adopting firm. The relevant unit is the innovation-to-value chain: research, compute, data, model access, software engineering, domain knowledge, process redesign, financing, procurement, regulation, cybersecurity, management capability, labour skills, and market scaling. If one link is missing, the chain leaks value. Scientific capability may fail to become industrial application. Industrial data may remain locked in silos. Compute may be available but unusable by SMEs. Startups may produce prototypes but fail to scale. Firms may adopt tools but not redesign processes. Public funds may subsidize purchases without generating reusable capabilities.

The evidence is consistent with this interpretation. EIB data show that European firms have almost closed the headline adoption gap with US firms in advanced digital technologies: 77 percent of EU firms use advanced digital technologies against 78 percent of US firms; generative AI use is also similar, at 37 percent in the EU and 36 percent in the US. But the deeper difference is integration. Among firms using AI, 81 percent of US firms use it in at least two business processes, compared with 55 percent of EU firms. The productivity issue is therefore not merely adoption count; it is breadth and depth of integration.⁶

This supports the Banca d’Italia result. If Italian firms are adopting AI but mainly for limited, experimental, or existing-process optimization, measured firm productivity may remain weak in the short run. The local task gain exists, but the production system has not yet been reconfigured. The ecosystem problem is exactly this conversion gap: from tool use to process integration, from process integration to productivity, and from productivity to national value added.

The Italian data make the conversion gap visible. The European Commission’s 2025 Digital Decade country profile reports that 70.2 percent of Italian SMEs had at least a basic level of digital intensity, while only 8.2 percent of Italian enterprises had adopted AI. The same profile states that Italy has strengths in strategic technologies such as quantum and semiconductors, but also that the startup ecosystem remains underdeveloped, with only nine unicorns, a number that the Commission judges not commensurate with the size of the Italian economy.⁷

The European Innovation Scoreboard gives the same diagnosis in a different vocabulary. Italy is classified as a Moderate Innovator, with performance equal to 93 percent of the EU average in 2025. Its relative weaknesses include tertiary education, business-sector R&D expenditure, and job-to-job mobility of human resources in science and technology. Its lowest-ranked indicators include high-speed internet access and employed ICT specialists.⁸ These are not peripheral indicators. They are precisely the complementary assets required to turn AI into business value.

The OECD evidence reinforces the point. It notes that AI adopters are often more productive than non-adopters, but that this advantage shrinks once one accounts for digital capabilities, cloud adoption, ICT specialists and worker skills. In other words, AI is correlated with productivity partly because better firms adopt it first. The policy implication is not that AI is irrelevant, but that AI returns are conditional on complementary investments.⁹

The OECD also stresses that productivity gains may follow a J-curve. In the first phase, measured productivity can be flat or even temporarily weaker because firms incur integration costs before they obtain visible benefits: they must clean and govern data, adapt processes, integrate AI with existing systems, train workers, redesign managerial routines, introduce controls, and absorb experimentation failures. These costs are real and often appear immediately. The benefits, by contrast, appear only later, when the new operating model stabilizes and the complementary intangible investments begin to work together. The “J” shape therefore describes a temporal pattern: an initial dip or muted effect, followed by acceleration if adoption becomes deep enough.

%%{init: {"theme": "neutral", "look": "handDrawn", "layout": "elk"}}%%
flowchart TD
  A["AI adoption starts"] --> B["Integration phase"]
  B --> C["Measured productivity<br/>flat or temporarily weaker"]
  C --> D["Complementary investments mature"]
  D --> E["Workflow redesign stabilizes"]
  E --> F["Productivity gains become visible"]

  B -. "Immediate costs" .-> B1["Data cleaning<br/>System integration<br/>Training<br/>Governance<br/>Process redesign"]
  D -. "Intangible capital" .-> D1["Reusable data assets<br/>New routines<br/>AI-capable workers<br/>Managerial learning"]
  F -. "Observed outcomes" .-> F1["Higher throughput<br/>Lower error rate<br/>Shorter cycle time<br/>Higher value added per worker"]

  classDef cost fill:#fff3cd,stroke:#7a5c00,stroke-width:1px;
  classDef gain fill:#e7f6ec,stroke:#247a3b,stroke-width:1px;
  classDef neutral fill:#eef2f7,stroke:#4a5568,stroke-width:1px;

  class B,C,B1 cost;
  class D,E,D1 neutral;
  class F,F1 gain;

Figure 1: AI productivity J-curve: integration costs precede measurable gains

This means that an AI policy focused only on the demand side is structurally incomplete. Subsidizing firms to buy AI tools may raise adoption statistics but not necessarily value added. To create value, an ecosystem must perform at least six conversion functions:

It must convert research into usable technology. Universities, public laboratories and research centres generate knowledge, but firms need deployable components, reference architectures, tested models, APIs, documentation, support services and sector-specific validation. Without technology-transfer intermediaries, knowledge remains upstream.
It must convert data into computable assets. AI systems do not operate on data in the abstract. They require structured, governed, interoperable, legally usable, secure and semantically meaningful datasets. This is why data spaces, data standards, metadata, ontologies, consent architectures, cyber controls and data-sharing governance are productive infrastructure, not bureaucratic accessories.
It must convert compute into accessible service. Supercomputers and AI factories matter only if firms can use them. For most SMEs, the bottleneck is not theoretical FLOPS; it is access procedure, technical support, cloud-native tooling, model-development environments, data pipelines, inference services, cost predictability and trusted assistance. The European AI Factories initiative is relevant because it explicitly combines computing power, data and talent, and because its one-stop-shop model is designed for startups, SMEs and researchers rather than only for academic high-performance computing users.¹⁰
It must convert generic models into vertical applications. The economically useful layer is not only the frontier model. It is the process-aware application: AI for maintenance planning, energy forecasting, claims management, technical-document control, quality inspection, production scheduling, procurement analysis, regulatory compliance, software modernization, design automation, or customer operations. This is the missing middle identified by the Banca d’Italia paper: the market risks being split between generic tools with shallow integration and expensive custom projects accessible only to large firms.
It must convert adoption into operating-model change. AI requires process redesign, decision-right redesign, KPI redesign, human-in-the-loop controls, model-risk management, cybersecurity, integration with ERP/CRM/MES/SCADA/document systems, and training of managers and operators. If these changes do not happen, AI remains a productivity option, not a productivity fact.
It must convert prototypes into scale. This is where finance, procurement, standards and market access become decisive. Startups and specialized vendors need early demand, reference customers, public procurement, venture debt, patient capital, regulatory sandboxes and access to industrial data. Without this layer, pilots do not become products, and products do not become industrial diffusion.

The global R&D structure shows why this matters. The European Commission’s 2025 Industrial R&D Investment Scoreboard reports that the world’s top 2,000 R&D investors account for more than 90 percent of business-sector R&D. In the AI-relevant ICT software sector, the asymmetry is large: US software companies invested €276 billion in R&D in 2024, about 17 times the investment of EU software companies in the same sector. US ICT-software firms also increased capital expenditure by 50.6 percent in 2024, reaching €279.7 billion, as they built infrastructure for advanced AI.¹¹

This fact changes the interpretation of European and Italian AI policy. If frontier-model production and hyperscale AI infrastructure are concentrated elsewhere, then Europe and Italy cannot rely on passive adoption. They must specialize in the layers where value can still be captured: industrial data, domain models, trusted applications, cybersecurity, robotics, embedded AI, regulated-sector AI, engineering tools, public-sector procurement, and AI-enabled modernization of existing industrial strengths.

This is not a defensive argument. It is a value-chain argument. A country does not need to dominate every layer of AI to capture value, but it must control enough complementary layers to avoid becoming merely a purchaser of foreign general-purpose technology. If the model, cloud, chip, platform, application, data pipeline and integration partner are all external, domestic firms may gain some efficiency, but strategic margins, learning, intellectual property and ecosystem spillovers accrue elsewhere.

The European Commission’s AI Continent Action Plan implicitly recognizes this. It combines the InvestAI initiative, intended to mobilize €200 billion for AI investment in Europe, with AI Factories, data access, skills and regulatory simplification. The Data Union Strategy is framed as a way to create a data market able to scale AI solutions, while AI Factories are meant to connect compute, data, talent, SMEs, startups, industry and finance.¹² Italy’s IT4LIA AI Factory, with a reported total cost of €430 million and a link to the Leonardo supercomputer, is relevant only if it becomes such a translation infrastructure: not simply a machine room, but an industrial service platform.¹³

A practical policy test follows. A policy measure should not be evaluated only by asking whether it increases AI adoption. It should be evaluated by asking which conversion bottleneck it removes. Does it create reusable vertical applications? Does it improve data interoperability? Does it increase managerial capability? Does it help SMEs redesign processes? Does it create reference implementations? Does it strengthen domestic suppliers? Does it improve access to compute? Does it reduce regulatory uncertainty? Does it generate measurable value added rather than subsidized software consumption?

For firms, the same test applies internally. An AI project should not be approved because it uses a fashionable model. It should be approved because it changes a constraint in the operating model. The minimum business case should identify the bottleneck, the process owner, the data source, the integration point, the control model, the expected throughput or quality gain, the adoption path, the cybersecurity and compliance controls, and the metric by which value added will be measured.

This is the deeper lesson to add to QEF 1009. AI productivity is not produced by models alone. It is produced by an ecosystem that makes models economically operative. Italy’s challenge is therefore not only to increase AI adoption, but to build the institutional, industrial and managerial machinery that turns digital innovation into domestic value added.

Conclusion

Banca d’Italia QEF 1009 should be read less as a paper on artificial intelligence in the narrow technical sense and more as a paper on absorptive capacity. Its central result is not simply that Italian firms are adopting AI, nor that AI may add between 0.2 and 1.1 percentage points to annual labour-productivity growth under different long-run scenarios. The more important result is conditional: those gains require a transition from shallow use to deep integration.

The paper’s empirical structure is therefore correctly cautious. At the task level, the evidence already shows large gains in writing, coding, customer service and other bounded activities. At the firm level, however, the paper does not yet identify systematic short-run effects on revenue per employee, employment or investment. This is not a contradiction. A task is not a firm, and a tool is not a production system. Productivity rises when the binding constraints of the system move: when workflows are redesigned, data become usable, decisions are reorganized, bottlenecks are removed, and complementary investments mature.

The extension proposed in this review sharpens the same argument. Even productivity is not the final variable. The final variable is value added. A country captures value from AI only if it has an ecosystem that translates digital innovation into domestic economic substance: reusable applications, specialized suppliers, data infrastructures, compute access, managerial capability, skilled labour, technology transfer, procurement channels, standards, finance and regulatory certainty. Without that ecosystem, AI adoption may still occur, but a large part of the surplus may accrue elsewhere: to foreign platforms, hyperscalers, frontier-model providers, software vendors and system integrators.

This is the strategic risk for Italy and, more broadly, for Europe. The problem is not the absence of isolated assets. Italy has industrial depth, research institutions, competence centres, supercomputing capacity and sectoral know-how. The weakness is orchestration. Scientific output does not automatically become patentable technology; compute capacity does not automatically become usable SME infrastructure; startup formation does not automatically become scale-up; generic models do not automatically become vertical applications; and software adoption does not automatically become operating-model transformation.

The policy implication is consequently stricter than a generic call for more AI investment. Measures should be judged by the conversion bottleneck they remove. A useful policy is one that helps firms identify high-value use cases, access usable data and compute, test solutions in realistic environments, adopt open and interoperable standards, reduce compliance uncertainty, finance pilots, scale domestic suppliers, and train managers and workers to redesign processes around the technology. A weak policy is one that merely increases subsidized software consumption while leaving the production system unchanged.

The same discipline applies inside firms. An AI project should not be justified because it uses a frontier model or because competitors are experimenting with similar tools. It should be justified because it modifies a measurable operational constraint: cycle time, quality, throughput, error rate, working-capital intensity, engineering velocity, compliance cost, customer-response latency, maintenance accuracy, or product-development capacity. The minimum managerial question is not “Which AI tool should we buy?” but “Which part of the operating model becomes economically different after adoption?”

The most useful warning of QEF 1009 is historical. The ICT revolution did not reward countries and firms merely because they purchased computers and software. It rewarded those that reorganized work, management, data, supply chains and business models around the new technology. AI is likely to impose the same discipline, but with faster competitive pressure and higher dependence on intangible assets.

The final thesis is therefore simple. AI will not become Italian productivity by diffusion alone. It will become productivity only through integration; it will become value added only through an ecosystem capable of transforming models, data and compute into industrial capability. Italy’s challenge is not merely to adopt AI faster. It is to build the machinery that converts AI adoption into durable domestic value.

Appendix A — From Task Gain to Firm Productivity

Purpose of the appendix

This appendix formalizes the central productivity problem raised by Banca d’Italia QEF 1009: the empirical literature already documents sizeable AI-related gains in individual tasks, while short-run firm-level evidence is still weak or inconclusive. This is not a contradiction. A task is not a firm; a local acceleration is not necessarily a throughput increase; a tool is not automatically a new production function.¹⁴

The appendix explains why task-level AI gains may fail to appear immediately in firm productivity statistics, and why the passage from local gain to aggregate productivity requires process redesign, complementary assets, demand, managerial capability and organizational absorption.

The core thesis is the following:

A task-level productivity gain becomes firm-level productivity only when it relaxes a binding constraint in the firm’s production system or when the firm reorganizes adjacent tasks so that the local gain propagates into output, quality, cost, speed, or value added.

This means that AI adoption should not be evaluated only by asking whether workers are faster when using AI. The relevant question is whether the firm’s production system becomes economically different after AI adoption.

Definitions

Let a firm produce output Y using labour L, capital K, intermediate inputs M, data D, organizational capital O, software systems S, and managerial practices G. In reduced form:

Y = F(L, K, M, D, O, S, G)

AI can affect this production function in several ways. It can reduce the labour time required for a task, increase output quality, reduce errors, improve prediction, automate classification, accelerate coding, compress document-processing time, improve customer response, optimize scheduling, or enable new products and services.

However, firm productivity is not the same object as task speed. Labour productivity is typically measured as:

\text{Labour Productivity} = \frac{Y}{L}

or, in value terms:

\text{Labour Productivity} = \frac{\text{Value Added}}{\text{Employment or Hours Worked}}

Therefore, a local AI gain affects measured productivity only if it changes the numerator, the denominator, or both. A worker completing a draft faster does not automatically increase value added. The saved time must be reallocated to productive work, output demand must expand, quality-adjusted output must rise, or labour input must fall without reducing output. If the saved time is absorbed by waiting, rework, coordination, review, compliance, or idle capacity, measured productivity may not move.
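
The numerator and denominator logic can be illustrated with a minimal Python sketch. All figures are hypothetical and chosen only for readability: a baseline of 1,000,000 euros of value added over 20,000 hours, and 1,000 hours freed by AI. The function name `labour_productivity` and the three cases are illustrative assumptions, not quantities taken from QEF 1009.

```python
def labour_productivity(value_added: float, hours: float) -> float:
    """Value added per hour worked."""
    return value_added / hours


# Hypothetical baseline (assumed for illustration only).
BASE_VALUE_ADDED = 1_000_000.0  # euros
BASE_HOURS = 20_000.0           # hours worked
SAVED_HOURS = 1_000.0           # hours freed by AI

baseline = labour_productivity(BASE_VALUE_ADDED, BASE_HOURS)

# Case 1: saved time is absorbed by waiting, rework, review or idle capacity.
# Neither the numerator nor the denominator changes.
absorbed = labour_productivity(BASE_VALUE_ADDED, BASE_HOURS)

# Case 2: saved time is redeployed to work that yields the baseline
# value added per hour, so the numerator rises.
redeployed = labour_productivity(
    BASE_VALUE_ADDED + SAVED_HOURS * baseline, BASE_HOURS
)

# Case 3: labour input falls by the saved hours while output is unchanged,
# so the denominator falls.
reduced_input = labour_productivity(BASE_VALUE_ADDED, BASE_HOURS - SAVED_HOURS)

for label, value in [
    ("baseline", baseline),
    ("saved time absorbed", absorbed),
    ("saved time redeployed", redeployed),
    ("labour input reduced", reduced_input),
]:
    print(f"{label:<24}{value:8.2f} euros per hour")
```

Only the second and third cases move measured productivity. The first case is the one in which a genuine task-level gain leaves the statistic unchanged.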

This is the first-principles reason why QEF 1009 can report strong task-level evidence and still find no systematic short-run effect at firm level.

The four levels of AI productivity

AI productivity should be analysed at four distinct levels.

Level	Unit of observation	Typical metric	What can go wrong
Task	A bounded activity such as drafting, coding, summarizing, classifying, replying, translating or searching	Time saved, output quality, error rate, units per hour	Gain remains local and does not affect final output
Process	A sequence of tasks producing a business result	Cycle time, throughput, first-pass yield, rework, cost-to-serve	Adjacent tasks, approvals, queues or systems become bottlenecks
Firm	The whole production system	Revenue per employee, value added per worker, TFP, margin, investment	Gains are too small, too dispersed, or offset by integration costs
Economy	Network of sectors and firms	Aggregate labour productivity, TFP, GDP, real wages	Spillovers depend on adoption depth, sector centrality, capital, skills and data infrastructure

The error to avoid is moving directly from the first row to the third or fourth row. A task experiment can prove that AI increases local efficiency. It does not prove, by itself, that firms or countries will see proportional productivity growth.

The additive benchmark: when task gains aggregate smoothly

The simplest case is an additive production or cost structure. Suppose a business process is composed of n tasks. Let task i account for a share s_i of total process labour cost, where:

\sum_{i=1}^{n} s_i = 1

If AI reduces the labour time required for task i by a fraction g_i, then the maximum direct proportional labour-cost saving, under a separable additive assumption, is approximately:

\frac{C - C'}{C} \approx \sum_{i=1}^{n} s_i g_i

This formula gives an upper bound under favourable conditions. It assumes that tasks are separable, saved time can be productively redeployed or removed, quality is unchanged or improved, and no new coordination or review costs are introduced.

For example, if document drafting represents 10 percent of the labour cost of a process and AI reduces drafting time by 40 percent, the direct process-level labour-cost reduction is not 40 percent. It is at most:

0.10 \times 0.40 = 0.04

or 4 percent.

This is before considering validation, editing, compliance review, data entry, customer approval, system integration, and managerial oversight. The same logic applies to coding. If code writing becomes faster but testing, security review, deployment, architecture governance, user acceptance and release management do not change, the software-delivery process will not accelerate proportionally.

The additive benchmark is useful because it disciplines exaggerated claims. A large task gain can imply a modest process gain if the task has a small share of total cost or total cycle time.
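
The additive benchmark can be checked with a short Python sketch that reproduces the drafting example. The function name `additive_saving` and the two-task split (drafting versus everything else) are illustrative assumptions; the sketch implements only the separable upper bound, with no review or coordination costs.

```python
def additive_saving(shares, gains):
    """Upper-bound proportional labour-cost saving: sum of s_i * g_i.

    shares: cost share of each task (must sum to 1)
    gains:  fraction of labour time saved by AI on each task
    """
    if len(shares) != len(gains):
        raise ValueError("shares and gains must have the same length")
    if abs(sum(shares) - 1.0) > 1e-9:
        raise ValueError("cost shares must sum to 1")
    return sum(s * g for s, g in zip(shares, gains))


# Drafting is 10% of process labour cost and AI cuts drafting time by 40%.
# The remaining 90% of the process is assumed unchanged.
saving = additive_saving([0.10, 0.90], [0.40, 0.0])
print(f"Maximum direct labour-cost saving: {saving:.2%}")
```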

The bottleneck case: when task gains do not propagate

Many firms are not additive systems. They are constrained systems. Their output is determined by bottlenecks.

Let a process have several stages, each with capacity c_i. Overall throughput is:

T = \min(c_1, c_2, \ldots, c_n)

If AI increases the capacity of a non-bottleneck stage, total throughput does not change. The local team may work faster, but final output remains constrained by another stage.

For example, suppose an engineering organization has the following weekly capacities:

Stage	Weekly capacity before AI
Requirements clarification	40 items
Coding	80 items
Security review	30 items
Testing	35 items
Release approval	25 items

The system throughput is 25 items per week, because release approval is the bottleneck. If AI-assisted coding raises coding capacity from 80 to 120 items per week, total throughput remains 25 items per week. The local coding gain is real, but the firm-level productivity gain is zero unless the bottleneck also moves.

This is one reason why AI projects often appear useful to individuals while remaining invisible in firm-level productivity measures. The local task is improved, but the constraint is elsewhere.

%%{init: {"theme": "neutral", "look": "handDrawn", "layout": "elk"}}%%
flowchart TD
  R["Requirements<br/>40 items/week"] --> C["Coding<br/>80 items/week"]
  C --> S["Security review<br/>30 items/week"]
  S --> T["Testing<br/>35 items/week"]
  T --> A["Release approval<br/>25 items/week"]
  A --> O["Delivered output<br/>25 items/week"]

  C -. "AI-assisted coding" .-> C2["Coding after AI<br/>120 items/week"]
  C2 -. "No throughput change<br/>because release approval is still binding" .-> A

  classDef bottleneck fill:#fff3cd,stroke:#7a5c00,stroke-width:2px;
  class A bottleneck;

Figure 2: Why improving a non-bottleneck task may not increase firm throughput
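
The bottleneck logic can be reproduced with a few lines of Python, using the weekly capacities from the table above. The function name `throughput` is an illustrative choice, and the final scenario, in which release approval rises to 45 items per week, is a hypothetical addition meant only to show that the constraint then moves to the next-tightest stage.

```python
capacities = {
    "Requirements clarification": 40,
    "Coding": 80,
    "Security review": 30,
    "Testing": 35,
    "Release approval": 25,
}


def throughput(caps):
    """Return (bottleneck stage, items per week) for a chain of stages."""
    stage = min(caps, key=caps.get)
    return stage, caps[stage]


before = throughput(capacities)

# AI-assisted coding: capacity rises from 80 to 120 items per week.
after_coding_ai = throughput({**capacities, "Coding": 120})

# Hypothetical alternative: the bottleneck itself is relaxed to 45 items.
after_release_ai = throughput({**capacities, "Release approval": 45})

print("Before AI:               ", before)
print("AI on coding:            ", after_coding_ai)
print("AI on release approval:  ", after_release_ai)
```

In the last scenario throughput rises only to 30 items per week, because security review becomes the new binding stage. Relaxing one constraint exposes the next.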

The serial-cycle-time case: when local time saving is diluted

A process may be sequential rather than capacity-constrained. Suppose total cycle time is the sum of task times:

\mathcal{T} = \tau_1 + \tau_2 + \ldots + \tau_n

If AI reduces task j from \tau_j to \tau_j', the proportional reduction in total cycle time is:

\frac{\mathcal{T} - \mathcal{T}'}{\mathcal{T}} = \frac{\tau_j - \tau_j'}{\mathcal{T}}

The gain is therefore bounded by the task’s share of total cycle time.

Consider a sales proposal process:

Step	Time before AI
Customer qualification	1 day
Technical scoping	3 days
Draft proposal	2 days
Pricing approval	4 days
Legal review	5 days
Customer negotiation	10 days

Total cycle time is 25 days. If AI reduces proposal drafting from 2 days to 0.5 days, the local drafting gain is 75 percent, but total cycle time falls from 25 days to 23.5 days. The process-level gain is 6 percent.

The AI tool may still be valuable: quality may improve, the proposal team may handle more opportunities, or salespeople may focus on higher-value work. But the measured cycle-time effect is limited unless pricing, legal review, negotiation and scoping are also redesigned.
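
The dilution of a local time saving can be verified with a minimal Python sketch based on the sales-proposal table above. The function name `cycle_time_effect` is an illustrative assumption, and the sketch assumes a strictly serial process with no overlap between steps.

```python
step_days = {
    "Customer qualification": 1.0,
    "Technical scoping": 3.0,
    "Draft proposal": 2.0,
    "Pricing approval": 4.0,
    "Legal review": 5.0,
    "Customer negotiation": 10.0,
}


def cycle_time_effect(steps, step, new_days):
    """Return (local gain, process-level gain, new total cycle time)."""
    total = sum(steps.values())
    saved = steps[step] - new_days
    return saved / steps[step], saved / total, total - saved


local_gain, process_gain, new_total = cycle_time_effect(
    step_days, "Draft proposal", 0.5
)
print(f"Local drafting gain: {local_gain:.0%}")
print(f"Process-level gain:  {process_gain:.0%}")
print(f"New cycle time:      {new_total} days")
```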

The queueing case: when small task gains can be large

The opposite can also happen. A modest AI gain can have a large process effect if it relaxes a highly utilized queue.

If a service stage receives work at rate \lambda and processes it at rate \mu, utilization is:

\rho = \frac{\lambda}{\mu}

In the simple M/M/1 case — a stylized single-server queue with random arrivals, random service times, and stable capacity such that \lambda < \mu — the expected time in the system is:

W = \frac{1}{\mu - \lambda}

and the expected waiting time in queue is:

W_q = \frac{\rho}{\mu - \lambda}

These formulas should not be read as a literal model of every business process. They are useful because they show the non-linear mechanism. As \lambda approaches \mu, the denominator becomes small and waiting time rises sharply. Therefore, when a process stage is close to saturation, even a modest AI-driven increase in service capacity \mu, or a reduction in avoidable arrivals \lambda, can produce a disproportionately large reduction in delay.
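
The non-linearity can be made concrete with a short Python sketch of the M/M/1 formulas above. The arrival rate of 9.5 items per hour and the service rates of 10 and 11 items per hour are hypothetical values, chosen only to place the stage close to saturation; the function name `mm1_times` is likewise an illustrative assumption.

```python
def mm1_times(lam: float, mu: float):
    """Expected time in system W and waiting time in queue W_q for M/M/1."""
    if lam >= mu:
        raise ValueError("the queue is unstable unless lam < mu")
    rho = lam / mu
    w = 1.0 / (mu - lam)
    w_q = rho / (mu - lam)
    return w, w_q


LAM = 9.5  # arrivals per hour (hypothetical)

w_before, wq_before = mm1_times(LAM, mu=10.0)  # utilization 95%
w_after, wq_after = mm1_times(LAM, mu=11.0)    # service capacity +10%

reduction = 1.0 - w_after / w_before
print(f"Time in system before: {w_before:.2f} h")
print(f"Time in system after:  {w_after:.2f} h")
print(f"Reduction in delay:    {reduction:.0%}")
```

Under these assumed values, a 10 percent increase in service capacity cuts the expected time in the system by roughly two thirds. The same capacity increase applied to a lightly loaded stage would have a far smaller effect.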

The practical implication is that AI has high leverage when it increases capacity at a stage operating near saturation. Examples include:

customer-support triage;
claims review;
compliance screening;
software bug classification;
invoice exception handling;
maintenance-ticket prioritization;
engineering change-request review.

In these cases, AI does not need to automate the whole process to create value. It only needs to reduce service time, improve routing accuracy, or lower the arrival rate of avoidable exceptions at the overloaded stage.

This explains why some AI deployments show immediate measurable effects while others do not. The decisive variable is not only the technical performance of AI, but the location of the task inside the process architecture.

The Leontief case: perfect complementarity

In some processes, tasks are strict complements. Output is limited by the least available required input or completed step:

Y = \min \left(\frac{x_1}{a_1}, \frac{x_2}{a_2}, \ldots, \frac{x_n}{a_n}\right)

If AI improves one input but another input remains binding, output does not increase.

This is the production-theory version of the Banca d’Italia paper’s warning. AI may strongly improve one task, but if the production process requires multiple complementary tasks in fixed proportions, the gain may not translate into aggregate output.

Examples are common in regulated or safety-critical environments:

A legal memo drafted by AI still requires lawyer review.
A pharmaceutical document summarized by AI still requires validated regulatory submission.
A software patch generated by AI still requires testing, security validation and deployment control.
An industrial maintenance recommendation still requires spare parts, technician availability and safe work permits.
A credit-risk analysis still requires policy compliance, explainability and approval authority.

In such processes, AI can reduce the cost of one component, but the firm captures full productivity only when complementary steps are also changed.
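
A minimal Python sketch of the Leontief function illustrates the point with the industrial-maintenance example. The available quantities and per-job requirements are hypothetical, and the names `leontief_output`, `available` and `required_per_job` are illustrative assumptions rather than anything specified in the source.

```python
def leontief_output(inputs, requirements):
    """Y = min(x_i / a_i): output is set by the scarcest required input."""
    return min(inputs[name] / requirements[name] for name in requirements)


# Hypothetical weekly availability for a maintenance process.
available = {
    "recommendations": 50,    # maintenance recommendations issued
    "technician_hours": 120,
    "spare_part_kits": 40,
    "work_permits": 30,
}

# Hypothetical requirement per completed maintenance job.
required_per_job = {
    "recommendations": 1,
    "technician_hours": 4,
    "spare_part_kits": 1,
    "work_permits": 1,
}

jobs_before = leontief_output(available, required_per_job)

# AI doubles the number of recommendations; nothing else changes.
jobs_after = leontief_output(
    {**available, "recommendations": 100}, required_per_job
)

print(f"Completed jobs before AI: {jobs_before:.0f}")
print(f"Completed jobs after AI:  {jobs_after:.0f}")
```

In this stylized case output stays at 30 jobs, because technician hours and work permits remain binding however many recommendations are produced.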

The O-ring case: quality complementarity and residual human work

The O-ring model describes production processes in which quality failures in any one task can degrade or destroy the value of the whole output.¹⁵ The name comes from the Challenger disaster, where the failure of one small component had catastrophic system-level consequences.

In a simplified O-ring production function, output quality is multiplicative:

Q = \prod_{i=1}^{n} q_i

where q_i is the quality or reliability of task i. In such a system, raising the quality of one task may have limited value if another task remains unreliable.

This matters for AI because many business processes are not pure volume processes. They are reliability processes. Examples include cybersecurity incident response, financial reporting, medical documentation, engineering design, grid operations, legal advice, software release, and regulatory compliance.

An AI system may generate a useful draft, but if the residual human or institutional step is responsible for correctness, liability, approval, or safety, the task is not eliminated. It is transformed. The productivity gain depends on how much human work remains, how much review is required, and whether AI reduces or increases the risk of hidden errors.

Recent work on O-ring automation makes this point directly: when tasks are quality complements, exposure scores that simply add automatable tasks can overstate displacement or productivity effects. If one unautomated task remains a binding bottleneck, automation of other tasks may reallocate human attention to the bottleneck rather than eliminate the job or proportionally raise output.¹⁶

For enterprise adoption, the implication is that AI should not be assessed only by the share of tasks it can accelerate or automate. In O-ring processes, the economically decisive question is whether AI changes the reliability of the whole chain. If AI accelerates drafting but increases verification burden, hidden-error risk, or approval friction, the apparent task gain may disappear at process level. If, instead, AI improves both the task and the reliability of validation, for example by reducing omissions, standardizing checks, surfacing anomalies, or improving traceability, the productivity effect can be materially larger.
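
The multiplicative structure can be explored with a brief Python sketch. The four reliability values are hypothetical, as are the stage labels in the comments. The sketch shows one property of the simplified function above: chain quality can never exceed the reliability of its weakest task, so perfecting an already reliable step leaves the chain constrained by the unreliable one.

```python
from math import prod


def chain_quality(task_quality):
    """Simplified O-ring output quality: the product of task reliabilities."""
    return prod(task_quality)


# Hypothetical reliabilities: drafting, data extraction, validation, approval.
baseline = [0.95, 0.99, 0.80, 0.98]

# AI makes drafting perfectly reliable; validation is unchanged.
better_drafting = [1.00, 0.99, 0.80, 0.98]

# AI instead supports validation, raising its reliability to 0.95.
better_validation = [0.95, 0.99, 0.95, 0.98]

q_base = chain_quality(baseline)
q_draft = chain_quality(better_drafting)
q_valid = chain_quality(better_validation)

print(f"Baseline chain quality:      {q_base:.3f}")
print(f"Perfect drafting:            {q_draft:.3f}")
print(f"More reliable validation:    {q_valid:.3f}")
print(f"Ceiling set by weakest task: {min(better_drafting):.2f}")
```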

Why task experiments are still important

The previous sections should not be read as skepticism about AI’s technical usefulness. Task-level evidence is essential because it proves that AI can change the local production frontier.

The empirical evidence cited by QEF 1009 is substantial. In writing tasks, experimental evidence shows large reductions in completion time and improvements in output quality. In customer service, deployment of a generative AI assistant increased productivity, especially for less experienced workers. In coding, field-experiment evidence shows large increases in code output. These are real effects.¹⁷ ¹⁸ ¹⁹

But their interpretation must be precise. They show that AI can improve bounded tasks under specific conditions. They do not prove that firms will automatically obtain proportional gains in revenue per employee, value added per worker, or total factor productivity.

Task experiments answer the question:

Can AI make this activity faster, cheaper, or better?

Firm productivity requires a second question:

Does improving this activity change the economics of the whole production system?

Both questions are necessary. The first establishes technical-economic potential. The second establishes organizational-economic realization.

Why firm-level gains may appear with a lag

The productivity J-curve provides a useful interpretation of the short-run evidence. General-purpose technologies often require complementary intangible investments: process redesign, training, data preparation, organizational change, new managerial routines, software integration and new business models. These investments may appear first as costs, while benefits appear later.²⁰

This is especially plausible for AI. A firm that wants to use AI beyond generic office productivity may need to:

clean and govern data;
redesign workflows;
integrate AI into enterprise systems;
define human oversight;
establish model-risk controls;
train workers and managers;
adapt KPIs;
change decision rights;
renegotiate supplier interfaces;
validate security and compliance controls;
develop monitoring for drift, hallucination, leakage and bias.

During this phase, measured productivity may stagnate or even decline. The firm is investing in organizational capital, but accounting systems may treat much of that investment as current cost. Later, if the new operating model works, productivity gains become visible.
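
The timing argument can be sketched as a stylized path in Python. Everything in this sketch is an assumption made for illustration: the functional forms (an exponentially decaying integration cost and a logistic ramp-up of benefits), the parameter values, and the index of 100 for pre-adoption productivity. It is not an estimate and is not drawn from QEF 1009 or the OECD.

```python
import math

PRE_ADOPTION_LEVEL = 100.0  # index of measured productivity before adoption


def measured_productivity(t, cost=6.0, cost_decay=0.5,
                          gain=15.0, midpoint=4.0, steepness=1.2):
    """Stylized measured productivity t periods after adoption starts.

    Integration costs are front-loaded and fade over time, while gains
    arrive later as complementary investments mature.
    """
    integration_cost = cost * math.exp(-cost_decay * t)
    realized_gain = gain / (1.0 + math.exp(-steepness * (t - midpoint)))
    return PRE_ADOPTION_LEVEL - integration_cost + realized_gain


path = [measured_productivity(t) for t in range(9)]
break_even = next(t for t, level in enumerate(path) if level > PRE_ADOPTION_LEVEL)

for t, level in enumerate(path):
    print(f"period {t}: {level:6.2f}")
print(f"First period above the pre-adoption level: {break_even}")
```

With these assumed parameters, measured productivity sits below the pre-adoption level for the first periods and exceeds it only later. Different parameters would shift the depth and length of the dip, which is why the short-run evidence alone cannot settle the long-run effect.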

This is directly analogous to the earlier ICT experience. The economic literature on information technology found that productivity effects were strongest when IT was combined with workplace reorganization, decentralization, skilled labour and managerial capability.²¹ ²²

The implication for AI is strict: buying AI tools is not equivalent to adopting AI as a productive technology. Productive adoption requires complementary reconfiguration.

%%{init: {"theme": "neutral", "look": "handDrawn", "layout": "elk"}}%%
flowchart TD
  A["AI adoption starts"] --> B["Integration phase"]
  B --> C["Measured productivity<br/>flat or temporarily weaker"]
  C --> D["Complementary investments mature"]
  D --> E["Workflow redesign stabilizes"]
  E --> F["Productivity gains become visible"]

  B -. "Immediate costs" .-> B1["Data cleaning<br/>System integration<br/>Training<br/>Governance<br/>Process redesign"]
  D -. "Intangible capital" .-> D1["Reusable data assets<br/>New routines<br/>AI-capable workers<br/>Managerial learning"]
  F -. "Observed outcomes" .-> F1["Higher throughput<br/>Lower error rate<br/>Shorter cycle time<br/>Higher value added per worker"]

  classDef cost fill:#fff3cd,stroke:#7a5c00,stroke-width:1px;
  classDef gain fill:#e7f6ec,stroke:#247a3b,stroke-width:1px;
  classDef neutral fill:#eef2f7,stroke:#4a5568,stroke-width:1px;

  class B,C,B1 cost;
  class D,E,D1 neutral;
  class F,F1 gain;

Figure 3: AI productivity J-curve: integration costs precede measurable gains

Why AI adoption intensity matters more than adoption count

A binary adoption variable is analytically weak. A firm that occasionally uses a public chatbot and a firm that embeds AI into customer operations, software delivery, quality control and forecasting are both adopters, but they are not economically comparable.

The U.S. Census Bureau’s AI diffusion work is useful because it separates firm use, business-function deployment and worker-task use. It reports that AI use is still often limited in scope: among adopting firms, many use AI in only a few business functions, and common generative-AI tasks include writing, document analysis and information search.²³

This confirms the need for a multi-level adoption taxonomy. From a productivity standpoint, the relevant ladder is:

Level	Adoption form	Productivity interpretation
1	Individual experimentation	Possible local time saving
2	Function-level use	Departmental efficiency possible
3	Workflow integration	Process productivity possible
4	Operating-model redesign	Firm productivity plausible
5	Product/service innovation	Value-added growth plausible
6	Ecosystem integration	Sectoral spillovers possible

QEF 1009’s distinction between broad adoption and intensive integration points in the same direction. The economically relevant question is not only how many firms use AI, but how deeply AI is embedded in value-producing processes.

A numerical example: why a large coding gain may not become firm productivity

Consider a software team delivering enterprise applications. Suppose total delivery effort is distributed as follows:

Activity	Share of total effort
Requirements and functional analysis	20%
Coding	25%
Testing and quality assurance	20%
Security and architecture review	10%
Deployment and release management	10%
User acceptance and change management	15%

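Assume AI cuts coding effort by 50 percent, which is equivalent to doubling coding output per hour. (A 50 percent increase in output per hour would cut coding effort by only one third.) If coding represents 25 percent of total effort, the maximum direct effort reduction is: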

$$0.25 \times 0.50 = 0.125$$

or 12.5 percent of total effort.

But this is still an upper bound. If testing and security review must expand because AI-generated code requires more verification, part of the gain is offset. If release management remains monthly, faster coding does not change release frequency. If requirements are unstable, faster coding may increase rework. If user acceptance is slow, throughput remains constrained.
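
The arithmetic above, and the way verification overhead erodes it, can be sketched in a few lines. The effort shares are those of the table. The sketch reads the AI gain as a 50 percent cut in coding effort, which is the reading under which the 12.5 percent figure holds. The 20 percent expansion of testing and security review is a hypothetical value chosen only for illustration, and `total_effort` is an illustrative helper name.

```python
# Effort shares from the table above (fractions of total delivery effort).
shares = {
    "requirements": 0.20,
    "coding": 0.25,
    "testing": 0.20,
    "security_review": 0.10,
    "release": 0.10,
    "acceptance": 0.15,
}


def total_effort(shares, multipliers):
    """Total effort after applying a per-activity effort multiplier.

    A multiplier of 0.5 halves an activity's effort; 1.2 expands it by 20%.
    Activities without a multiplier are left unchanged.
    """
    return sum(share * multipliers.get(name, 1.0)
               for name, share in shares.items())


# Upper bound: coding effort halves and nothing else changes.
upper_bound = 1.0 - total_effort(shares, {"coding": 0.5})

# Hypothetical offset: testing and security review each expand by 20%
# to verify AI-generated code (an assumption, not a source figure).
with_offsets = 1.0 - total_effort(
    shares, {"coding": 0.5, "testing": 1.2, "security_review": 1.2}
)

print(f"upper-bound saving: {upper_bound:.3f}")
print(f"saving with verification overhead: {with_offsets:.3f}")
```

Under these assumptions the saving falls from 12.5 percent to 6.5 percent of total effort, before any effect of release cadence, unstable requirements or slow user acceptance is considered.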

Therefore, the firm-level result may be:

Scenario	Local coding gain	Process redesign?	Firm-level effect
Tool only	High	No	Small or invisible
Tool plus testing automation	High	Partial	Moderate
Tool plus DevSecOps redesign	High	Yes	Large
Tool plus product/platform strategy	High	Yes, plus reuse	Potentially structural

The task gain is necessary but not sufficient. The productivity result depends on the architecture around the task.

A numerical example: customer service

Consider a customer-support operation. Suppose an AI assistant helps agents resolve more tickets per hour. Empirical evidence from a large customer-support deployment shows a significant productivity increase from AI assistance.²⁴

But the firm-level effect depends on what happens next. If demand is elastic and faster support improves customer retention, output value may rise. If the firm has a backlog, faster resolution may reduce waiting time and improve service levels. If the firm can reallocate agents to higher-value cases, labour productivity may improve. If the firm simply lets agents absorb the saved time without changing staffing, routing, service-level agreements or customer-success processes, the measured productivity gain may be small or absent.

The process-level mechanism can therefore differ:

Mechanism	Productivity channel
Faster average handling time	More tickets per agent
Better answer quality	Lower repeat contacts
Improved routing	Fewer transfers
Agent coaching	Faster learning curve for junior workers
Knowledge capture	Lower dependence on tacit senior expertise
Customer retention	Higher revenue per customer
Staffing optimization	Lower cost-to-serve

The same AI tool can therefore produce different firm outcomes depending on operating-model design.
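
A minimal sketch of this point, combining two of the mechanisms in the table (shorter handling time and fewer repeat contacts). All volumes, rates and the helper name `agent_hours` are hypothetical and are not drawn from the cited study. The sketch compares issues resolved per paid agent hour when freed hours are left idle and when they are redeployed or removed.

```python
def agent_hours(unique_issues, repeat_rate, handling_minutes):
    """Agent hours needed when each issue generates (1 + repeat_rate) contacts."""
    contacts = unique_issues * (1 + repeat_rate)
    return contacts * handling_minutes / 60


issues = 10_000  # hypothetical monthly volume of distinct customer issues

# Hypothetical before/after parameters for handling time and repeat contacts.
baseline_hours = agent_hours(issues, repeat_rate=0.30, handling_minutes=12)
ai_hours = agent_hours(issues, repeat_rate=0.20, handling_minutes=10)
freed_hours = baseline_hours - ai_hours

# Case 1: staffing and demand unchanged, freed time is simply absorbed.
issues_per_paid_hour_absorbed = issues / baseline_hours

# Case 2: freed hours are redeployed elsewhere or no longer paid for.
issues_per_paid_hour_redeployed = issues / ai_hours

print(freed_hours, issues_per_paid_hour_absorbed, issues_per_paid_hour_redeployed)
```

In the first case the measured ratio of issues to paid hours does not move, even though 600 agent hours have been freed. In the second it rises from about 3.85 to 5.0. The tool is the same; only the operating-model response differs.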

From productivity to value added

Even firm-level productivity is not the end of the analysis. For the national economy, the question is whether AI increases domestic value added.

A firm may adopt AI and improve productivity while much of the surplus accrues to external suppliers through software subscriptions, cloud fees, model API costs, consulting fees, or platform rents. Conversely, a domestic ecosystem of vertical AI suppliers, integrators, data providers and AI-enabled industrial firms can retain more of the surplus.

This is why the ecosystem argument in the main article matters. Task-level gains become firm productivity through process integration. Firm productivity becomes domestic value added through supplier ecosystems, intellectual property, scalable applications, industrial data, exportable services and retained margins.

The chain is:

%%{init: {"theme": "neutral", "layout": "elk"}}%%
flowchart TD
  A["AI improves a task"] --> B{"Is the task economically material?"}
  B -- "No" --> B1["Local convenience, weak productivity effect"]
  B -- "Yes" --> C{"Is it a bottleneck or part of a redesigned workflow?"}
  C -- "No" --> C1["Gain diluted by adjacent tasks"]
  C -- "Yes" --> D["Process-level productivity"]
  D --> E{"Does the firm reorganize roles, systems, data and KPIs?"}
  E -- "No" --> E1["Partial gain, J-curve delay"]
  E -- "Yes" --> F["Firm-level productivity"]
  F --> G{"Is there domestic ecosystem capture?"}
  G -- "No" --> G1["Efficiency gain with external rent leakage"]
  G -- "Yes" --> H["Domestic value added"]

Figure 4: From task-level AI gain to domestic value added

The diagram shows why simple AI adoption statistics are insufficient. The value chain contains multiple conversion gates. Failure at any gate weakens the measured economic result.
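
The gate structure of Figure 4 can be written as a short function. This is only a sketch: the `Project` fields and the function name are illustrative, and the yes/no answers stand in for judgments that are graded rather than binary in practice. The returned labels are the terminal nodes of the diagram.

```python
from dataclasses import dataclass


@dataclass
class Project:
    material: bool                  # Is the task economically material?
    bottleneck_or_redesigned: bool  # Bottleneck or part of a redesigned workflow?
    firm_reorganized: bool          # Roles, systems, data and KPIs reorganized?
    domestic_capture: bool          # Is there domestic ecosystem capture?


def conversion_outcome(p: Project) -> str:
    """Return the terminal node of Figure 4 reached by a project."""
    if not p.material:
        return "Local convenience, weak productivity effect"
    if not p.bottleneck_or_redesigned:
        return "Gain diluted by adjacent tasks"
    if not p.firm_reorganized:
        return "Partial gain, J-curve delay"
    if not p.domestic_capture:
        return "Efficiency gain with external rent leakage"
    return "Domestic value added"


# Example: a material, bottleneck-relieving project in a firm that has not
# yet reorganized stops at the third gate.
print(conversion_outcome(Project(True, True, False, True)))
```

Because the gates are evaluated in sequence, a later "yes" cannot compensate for an earlier "no", which is the sense in which failure at any gate weakens the measured result.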

The role of complementary assets

The transition from task gain to firm productivity depends on complementary assets. These include:

Complementary asset	Why it matters
Data quality	AI cannot improve decisions if input data are incomplete, inconsistent, unlawful or semantically unstable
Process ownership	Someone must redesign the workflow and be accountable for measurable outcomes
Integration architecture	AI must connect to ERP, CRM, MES, PLM, ticketing, document systems, data platforms or operational systems
Human oversight	High-risk or high-value outputs require review, approval and accountability
Skills	Workers and managers must know how to use, supervise and challenge AI outputs
Managerial practices	Benefits depend on goal setting, monitoring, decentralization, incentives and execution discipline
Cybersecurity	AI introduces risks around data leakage, prompt injection, model misuse, supply chain and access control
Compliance	Legal constraints shape what can be automated, augmented or merely recommended
Change management	Workers must adopt the new process rather than route around it
Measurement	The firm must know whether cycle time, error rate, throughput, quality or value added changed

The OECD emphasizes the same point for SMEs: aggregate productivity gains require not only broader AI adoption, but also complementary investments in skills, data, cloud and organizational capability.²⁵

This is particularly relevant for Italy, because the productive structure contains many SMEs that may lack internal data-science teams, mature enterprise architecture, process documentation, advanced cybersecurity functions or structured change-management capacity.

Why some empirical studies find firm-level effects

The absence of systematic short-run effects in QEF 1009 does not imply that firm-level AI productivity effects never exist. Other studies using different data, identification strategies and firm populations find positive effects.

For example, EIB/BIS research using matched EIBIS–ORBIS data on more than 12,000 non-financial firms in the EU and United States estimates that AI adoption increases labour productivity by about 4 percent, with gains driven by capital deepening rather than short-run employment reduction.²⁶

This evidence is compatible with QEF 1009 rather than contradictory to it. It suggests that positive effects may appear where AI adoption is sufficiently material, capital-complemented, and embedded in firms with adequate absorptive capacity. The difference lies in sample, time horizon, adoption depth, identification strategy and measurement.

The correct synthesis is:

task-level AI productivity gains are already documented;
firm-level gains are possible but heterogeneous;
short-run aggregate evidence remains mixed;
deep integration and complementary assets are the likely mediating variables;
policy should therefore target the adoption-to-productivity conversion mechanism, not adoption counts alone.

A diagnostic model for firms

A firm evaluating an AI project should classify the project according to four questions:

Is the task material? A task is material if it accounts for a significant share of cost, time, quality risk, customer value, working capital, compliance exposure or revenue generation. If it is not material, even a large AI gain will have limited firm-level effect.
Is the task a bottleneck? If the task constrains throughput, lead time, quality or decision speed, AI has high leverage. If it is not a bottleneck, the gain may remain local.
Are adjacent tasks redesigned? AI may accelerate one step while creating new burdens in review, validation, exception handling or integration. Adjacent tasks must be redesigned to preserve the gain.
Is the gain measured at process level? The relevant metric should not be number of AI users or number of prompts. It should be cycle time, throughput, cost-to-serve, first-pass yield, conversion rate, customer retention, revenue per employee, error rate, inventory turns, engineering velocity or value added.

This produces a simple decision matrix.

Task materiality	Bottleneck status	Adjacent redesign	Expected productivity effect
Low	No	No	Local convenience
High	No	No	Bounded task saving
High	Yes	No	Bottleneck relief may be offset
High	Yes	Yes	Process productivity plausible
High	Yes	Yes, plus operating-model change	Firm productivity plausible
High	Yes	Yes, plus ecosystem integration	Value-added and spillover effects plausible
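
For screening a portfolio of projects, the matrix can be encoded as a lookup table. The sketch below is one possible encoding, not a prescribed tool: the names `Redesign`, `MATRIX` and `expected_effect` are illustrative, and combinations that the matrix does not tabulate return `None` rather than a guessed label.

```python
from enum import Enum


class Redesign(Enum):
    NONE = "No"
    ADJACENT = "Yes"
    OPERATING_MODEL = "Yes, plus operating-model change"
    ECOSYSTEM = "Yes, plus ecosystem integration"


# Rows of the decision matrix:
# (materiality high?, bottleneck?, redesign level) -> expected effect.
MATRIX = {
    (False, False, Redesign.NONE): "Local convenience",
    (True, False, Redesign.NONE): "Bounded task saving",
    (True, True, Redesign.NONE): "Bottleneck relief may be offset",
    (True, True, Redesign.ADJACENT): "Process productivity plausible",
    (True, True, Redesign.OPERATING_MODEL): "Firm productivity plausible",
    (True, True, Redesign.ECOSYSTEM): "Value-added and spillover effects plausible",
}


def expected_effect(material, bottleneck, redesign):
    """Look up a project in the matrix; None means the row is not tabulated."""
    return MATRIX.get((material, bottleneck, redesign))


print(expected_effect(True, True, Redesign.ADJACENT))
```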

Implications for policy evaluation

The same logic applies to public policy. A policy that increases the number of firms using AI may not increase productivity if it only subsidizes generic tool consumption.

A good policy should remove one or more conversion bottlenecks:

Conversion bottleneck	Policy response	Productivity logic
Firms do not know where AI creates value	Advisory services, demonstrators, testbeds	Helps identify material tasks and bottlenecks
Firms cannot fund experiments	Pilot grants, vouchers, proof-of-concept support	Reduces fixed-cost barrier
Firms lack vertical applications	Startup support, procurement, applied R&D	Converts generic AI into sector tools
Firms lack usable data	Data spaces, standards, interoperability	Makes AI inputs reliable and scalable
Firms lack compute access	AI Factories, compute credits, technical support	Lowers infrastructure barrier
Firms lack skills	Training, reskilling, management education	Builds absorption capacity
Firms face regulatory uncertainty	Sandboxes, guidance, living labs	Reduces wait-and-see behaviour
Supplier rents absorb subsidies	Open standards, competitive procurement	Preserves contestability

The productivity test for policy is therefore not "did adoption increase?" but "did the policy increase the probability that AI adoption changes production?"

Implications for measuring AI success

A firm or policymaker should distinguish leading indicators from final indicators.

Leading indicators include:

number of AI use cases identified;
number of pilots launched;
number of pilots moved into production;
share of AI projects integrated with enterprise systems;
number of business functions using AI;
number of workers trained;
data-quality improvements;
reduction in manual handoffs;
adoption of reusable components;
use of compute or AI-factory services;
development of vertical applications.

Final indicators include:

revenue per employee;
value added per worker;
total factor productivity;
cost-to-serve;
cycle time;
first-pass yield;
error rate;
customer retention;
operating margin;
innovation output;
exportable AI-enabled services or products.

The mistake is to treat leading indicators as final indicators. Adoption is a leading indicator. Productivity is an outcome. Value added is the economic result.

Conclusion of the appendix

The central lesson is that AI productivity is a system property.

AI can make a task faster, but the firm benefits materially only if the task is economically important, if it relaxes a constraint, if adjacent tasks are redesigned, if data and systems are integrated, if workers and managers adapt, and if the gain is measured at process and firm level.

This explains the empirical pattern emphasized by Banca d’Italia QEF 1009: strong task-level evidence, limited short-run firm-level evidence, and potentially large long-run gains under deep adoption. It also explains why the policy problem is not merely to increase AI adoption, but to increase the conversion rate from adoption to productivity.

The final principle is operational:

AI does not create productivity because it is used. It creates productivity when it changes the production system.

Footnotes

1. Bellomarini, L., Bertolotti, F., Citino, L., Cassinis, M. G., D’Amuri, F., Del Prete, S., Formai, S., Mirenda, L., Rigon, M., & Russo Russo, A. (2026). L’adozione dell’intelligenza artificiale: effetti su produttività e politiche a sostegno. Banca d’Italia, Questioni di Economia e Finanza, 1009.
2. Acemoglu, D. (2025). The simple macroeconomics of AI. Economic Policy, 40(121), 13–58.
3. Aghion, P., & Bunel, S. (2024). AI and growth: Where do we stand? Federal Reserve Bank of San Francisco.
4. Acemoglu, D. (2025). The simple macroeconomics of AI. Economic Policy, 40(121), 13–58.
5. Aghion, P., & Bunel, S. (2024). AI and growth: Where do we stand? Federal Reserve Bank of San Francisco.
6. European Investment Bank. (2025). How are EU firms faring with AI, big data and other digital tools? European Investment Bank.
7. European Commission. (2025). Italy 2025 Digital Decade Country Report. Shaping Europe’s Digital Future.
8. European Commission. (2025). Italy — European Innovation Scoreboard 2025 country profile. European Commission, Directorate-General for Research and Innovation.
9. OECD. (2025). AI adoption by small and medium-sized enterprises. OECD Publishing.
10. European Commission. (2026). AI Factories. Shaping Europe’s Digital Future.
11. Nindl, E., et al. (2025). The 2025 EU Industrial R&D Investment Scoreboard. European Commission, Joint Research Centre and Directorate-General for Research and Innovation.
12. European Commission. (2025). AI Continent Action Plan. European Commission.
13. Agenzia per la Cybersicurezza Nazionale. (2024). IT4LIA, Italy will host one of the first AI Factories. Agenzia per la Cybersicurezza Nazionale.
14. Bellomarini, L., Bertolotti, F., Citino, L., Cassinis, M. G., D’Amuri, F., Del Prete, S., Formai, S., Mirenda, L., Rigon, M., & Russo Russo, A. (2026). L’adozione dell’intelligenza artificiale: effetti su produttività e politiche a sostegno. Banca d’Italia, Questioni di Economia e Finanza, 1009.
15. Kremer, M. (1993). The O-ring theory of economic development. The Quarterly Journal of Economics, 108(3), 551–575.
16. Gans, J. S., & Goldfarb, A. (2026). O-ring automation. National Bureau of Economic Research, Working Paper, 34639.
17. Noy, S., & Zhang, W. (2023). Experimental evidence on the productivity effects of generative artificial intelligence. Science, 381(6654), 187–192.
18. Brynjolfsson, E., Li, D., & Raymond, L. (2025). Generative AI at work. The Quarterly Journal of Economics, 140(2), 889–942.
19. Gambacorta, L., Qiu, H., Shan, S., & Rees, D. M. (2024). Generative AI and labour productivity: A field experiment on coding. Bank for International Settlements, BIS Working Papers, 1208.
20. Brynjolfsson, E., Rock, D., & Syverson, C. (2021). The productivity J-curve: How intangibles complement general purpose technologies. American Economic Journal: Macroeconomics, 13(1), 333–372.
21. Bresnahan, T. F., Brynjolfsson, E., & Hitt, L. M. (2002). Information technology, workplace organization, and the demand for skilled labor: Firm-level evidence. The Quarterly Journal of Economics, 117(1), 339–376.
22. Bloom, N., Sadun, R., & Van Reenen, J. (2012). Americans do IT better: US multinationals and the productivity miracle. American Economic Review, 102(1), 167–201.
23. Bonney, K., Breaux, C. L., Dinlersoz, E., Foster, L. S., Haltiwanger, J. C., & Pande, A. A. (2026). The microstructure of AI diffusion: Evidence from firms, business functions, and worker tasks. U.S. Census Bureau, Center for Economic Studies Working Paper, CES-WP-26-25.
24. Brynjolfsson, E., Li, D., & Raymond, L. (2025). Generative AI at work. The Quarterly Journal of Economics, 140(2), 889–942.
25. OECD. (2025). AI adoption by small and medium-sized enterprises. OECD Publishing.
26. Aldasoro, I., Gambacorta, L., Pal, R., Revoltella, D., Weiss, C., & Wolski, M. (2026). AI adoption, productivity and employment: Evidence from European firms. European Investment Bank, Economics Working Paper, 2026/02.

Reuse

CC BY-NC-ND 4.0

Citation

BibTeX citation:

@online{montano2026,
  author = {Montano, Antonio},
  title = {AI {Adoption,} {Productivity,} and the {Missing} {Middle}},
  date = {2026-06-12},
  url = {https://antomon.github.io/longforms/ai-adoption-productivity-missing-middle-banca-italia-qef-1009/},
  langid = {en},
  abstract = {This article reviews Banca d’Italia’s QEF 1009 on
    artificial intelligence adoption, productivity, and policy design.
    Its central argument is that the paper’s main contribution is not a
    simple forecast of AI-driven growth, but a disciplined distinction
    between nominal AI adoption, deep organizational integration,
    measurable productivity, and domestic value-added capture. QEF 1009
    documents a rapid increase in AI use among Italian firms, from 27
    percent in 2025 to 32 percent at the beginning of 2026, while
    showing that intensive integration remains limited, at about 5
    percent. Its short-run firm-level econometric analysis does not yet
    identify systematic effects on revenue per employee, employment, or
    investment, despite strong task-level evidence from the
    international literature on writing, coding, customer service, and
    professional work. The review interprets this apparent tension
    through a production-system lens. A task-level gain becomes
    firm-level productivity only when it relaxes a binding constraint,
    propagates through adjacent workflow redesign, and is supported by
    complementary assets such as clean data, integration architecture,
    process ownership, model governance, cybersecurity, managerial
    capability, and worker skills. This explains why AI adoption may
    follow a productivity J-curve: integration costs, organizational
    change, and intangible investment can precede measurable gains. The
    article extends the Banca d’Italia argument from productivity
    potential to value-added realization. AI can generate durable
    national economic value only if embedded in an ecosystem capable of
    translating digital innovation into vertical applications, scalable
    suppliers, interoperable data infrastructures, accessible compute,
    procurement channels, skills, regulatory certainty, and
    operating-model transformation. This is especially relevant for
    Italy, where scientific, industrial, and supercomputing assets
    exist, but the conversion chain from research and compute to
    firm-level adoption and domestic value capture remains fragmented.
    The appendix formalizes the same logic by distinguishing additive,
    bottleneck, serial-cycle-time, queueing, Leontief, and O-ring
    production structures. These models show why large local AI gains
    may be diluted, delayed, amplified, or destroyed before appearing in
    firm-level productivity statistics. The resulting policy implication
    is restrictive: public support should not merely increase AI
    adoption counts, but remove the conversion bottlenecks that prevent
    firms from turning AI tools into process productivity, firm
    productivity, and domestic value added.}
}

For attribution, please cite this work as:

Montano, Antonio. 2026. “AI Adoption, Productivity, and the Missing Middle.” June 12. https://antomon.github.io/longforms/ai-adoption-productivity-missing-middle-banca-italia-qef-1009/.