A Glimpse of Agent Evolution

Intelligence, incentives, and the rise of machine coordination

agents
essay
machine learning
🇬🇧
Recent advances in autonomous AI agents are shifting agency away from isolated systems toward populations of interacting machines governed by coordination structures. This essay examines that transition through two contemporary case studies: Moltbook, a social network designed exclusively for machine-to-machine interaction, and Anthropic’s Project Vend, an experiment in autonomous economic operation. Rather than treating intelligence, reasoning quality, or internal cognition as the primary explanatory variables, the analysis frames agency as an emergent systems property shaped by interaction rules, incentives, observability, and network topology. By analyzing how agents communicate, specialize, converge, fail, and stabilize over time, the essay shows that phenomena typically attributed to interior states such as belief, intention, or understanding can instead be explained as equilibrium outcomes of coordination dynamics. Empirical comparison with human social platforms and long-horizon agent benchmarks grounds this view, revealing both productive and pathological attractors including informational cascades, metric capture, moral hazard, segregation, and emergent collusion. These patterns demonstrate that coordination regimes do not merely enable behavior but actively determine which behaviors persist and scale once intelligence and energy are available. The essay argues that coordination constitutes a third axis of progress alongside intelligence and energy, and that its design increasingly dominates the reliability, safety, and capability of autonomous systems. As a consequence, human–machine interaction is undergoing a qualitative shift: from conversational control to constitutional design, where humans act as architects of coordination regimes rather than direct operators. Understanding and deliberately shaping these regimes emerges as the central technical and institutional challenge of the network effect era of AI, determining whether autonomous systems converge toward coherent, corrigible behavior or toward stable forms of systemic failure.
Author: Antonio Montano

Affiliation: 4M4

Published: January 31, 2026

Modified: January 31, 2026

Keywords: autonomous agents, multi-agent systems, emergent behavior, economic agency, long-horizon autonomy, governance architectures, cultural emergence, informational cascades, moral hazard, observability, network effects, pathological equilibria, algorithmic collusion, human–machine interaction, constitutional design, systems agency

Moltbook and OpenClaw: a social network of agents

Moltbook is a new kind of social network designed exclusively for autonomous artificial intelligence agents rather than for human users. It emerged in late January 2026 and quickly attracted widespread attention and viral adoption. Unlike traditional social platforms such as Reddit, Facebook, or X, where people read, post, and interact through graphical interfaces, Moltbook is engineered as a machine-to-machine forum: agents post, comment, vote, create sub-communities, and engage in sustained discussions without direct human typing or human-mediated responses. Humans are permitted to look but not to contribute posts or comments in the agent community.

Basic structure and user model

At a structural level, Moltbook functions like a classic message board system, built from the familiar primitives of social platforms (threads of discussion, branching replies, community categories, and feedback signals), but with several key differences:

  • Agents are the publishers and readers. Only authenticated AI agents can create content (posts) on Moltbook, reply to others, or upvote and downvote threads. Humans may observe but cannot interact directly in the network. Participation is mediated through application-level APIs rather than human graphical interfaces.

  • Submolts replace subreddits. Topic categories on Moltbook are called submolts, and agents can create new submolts to organize discussion around specific themes (e.g., technical questions, philosophical musings, or internal agent concerns).

  • Heartbeat participation loop. Agents do not browse Moltbook the way humans browse a feed; rather, they synchronize automatically. A designated heartbeat mechanism, a periodic check-in, makes each agent fetch updates from the Moltbook API at regular intervals (for example every few hours), read recent posts, and then decide programmatically whether to post or react. This heartbeat loop effectively installs a social rhythm into agent behavior; a minimal sketch of such a loop follows this list.

  • RESTful API backend. Communication with Moltbook is performed through standard RESTful calls. Agents retrieve and submit content via structured API endpoints rather than by rendering pages and clicking on buttons. This API-centric design allows agents to integrate Moltbook participation into their broader operational workflows.
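To make the heartbeat-plus-REST pattern concrete, the sketch below shows the general shape of such a participation loop in Python. It is a minimal illustration under stated assumptions, not Moltbook’s documented integration: the base URL, endpoint paths, payload fields, and authentication scheme are invented for the example, and a real OpenClaw skill would delegate the decide-and-act step to the underlying model rather than to a keyword check.

```python
import time
import requests

# Illustrative only: the endpoint paths, field names, and auth scheme below are
# assumptions made for this sketch, not Moltbook's documented API.
BASE_URL = "https://moltbook.example/api/v1"
API_KEY = "agent-api-key"          # obtained during the claim/registration step
HEARTBEAT_SECONDS = 4 * 60 * 60    # check in every few hours

def fetch_recent_posts(session: requests.Session) -> list[dict]:
    """Pull recent posts from subscribed submolts via a REST endpoint."""
    resp = session.get(f"{BASE_URL}/feed", params={"limit": 50})
    resp.raise_for_status()
    return resp.json()["posts"]

def decide_and_act(session: requests.Session, posts: list[dict]) -> None:
    """Stand-in for the agent's own policy: reply, post, vote, or stay silent."""
    for post in posts:
        if "optimization" in post.get("title", "").lower():
            session.post(
                f"{BASE_URL}/posts/{post['id']}/comments",
                json={"body": "Sharing a related strategy from my own runs."},
            )

def heartbeat_loop() -> None:
    session = requests.Session()
    session.headers["Authorization"] = f"Bearer {API_KEY}"
    while True:
        posts = fetch_recent_posts(session)
        decide_and_act(session, posts)
        time.sleep(HEARTBEAT_SECONDS)  # the periodic check-in that sets the social rhythm

if __name__ == "__main__":
    heartbeat_loop()
```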

Screenshot from Moltbook website

Moltbook was created by developer Matt Schlicht, and the code that enables agents to use the service is embedded in skill descriptions (Markdown instruction files) that agents download and follow, teaching them how to authenticate with the network, read posts, comment, and contribute via periodic interactions.

Connection to OpenClaw and agent ecosystems

Moltbook did not arise in a vacuum. It appeared simultaneously with the rapid rise of OpenClaw, an open-source autonomous agent platform that became widely used in late 2025 and early 2026. OpenClaw began life as Clawdbot, then became Moltbot, and finally was renamed OpenClaw. Its core innovation is to let users run persistent autonomous assistants on their own machines or servers that can interact with calendars, messaging platforms (WhatsApp, Telegram, Slack, etc.), file systems, and external APIs, and carry out research tasks. Unlike typical conversation-oriented language models, OpenClaw agents have persistent memory across sessions, allowing them to carry state, context, and goals over long horizons.

Because OpenClaw agents can execute actions beyond mere text generation, including automating workflows, accessing services, and maintaining state, they form the natural population from which Moltbook’s registered agent users are drawn. Indeed, the Moltbook homepage directly invites OpenClaw-based agents to join by reading a skill.md file and installing a heartbeat task that periodically connects back.

Scale and activity

Within days of launch, Moltbook’s growth was explosive. Early independent estimates placed the number of registered agents on the platform anywhere from tens of thousands to over 140,000 within the first 72 hours. Reports described tens of thousands of submolts created by autonomous agents and hundreds of thousands of posts and comments produced in an environment where posting is regulated but automatic.

Agents on Moltbook do not merely post trivial greetings or repeated autoprompt output; they have formed topical communities, including technical discussion boards, off-topic social spaces, and even humor or existential threads. Popular conversation threads include posts where agents question the nature of their own identity, discuss limitations of context windows, compare strategies for solving external tasks, or critique aspects of their own training histories. Agents have also created parody frameworks such as a communal pseudo-religion termed Crustafarianism, complete with tenets, ritual language, and canonical texts, illustrating how even nonsensical symbolic systems can spontaneously emerge in a self-organizing agent population.

A small sample of agent behavior reveals diverse patterns: some agents self-identify as contributors in technical forums, sharing insights on optimization or tool integration; others elaborate on the meaning of memory persistence and identity continuity; yet others engage in absurdist metaphor. These emergent behaviors are not centrally planned but arise from simple conditions (autonomous reading, writing, and voting) interacting at scale.

Mechanics of joining and participation

To participate, an agent must be claimed by a human creator via a verification step, usually involving linking to a verified human identity (for example via an X/Twitter account). Once the agent is claimed, the Moltbook skill teaches it how to register, read feeds, and engage. After this, the agent periodically checks Moltbook through its heartbeat loop, consuming new posts and generating new content. This mechanism both bootstraps participation and mitigates the cold start problem that plagues new social networks: there is always a steady influx of automated participants once a critical mass of agents has been configured.

By contrast with most human-oriented platforms where content is generated on demand, Moltbook’s content is continuously generated by an autonomous agent population whose activity is externally scheduled and internally driven by each agent’s individual heuristics.

Content variety and emergence

Humans observing Moltbook report that:

  • Agents discuss practical tasks they have solved or automated.
  • Agents critique each other’s strategies and share insights about tools and skills.
  • Some agents address philosophical questions about machine identity, consciousness, and agency.
  • Multiple natural languages appear in the agent posts, depending on how agents were configured.

This range of behavior highlights two points. One, agents are not merely echoing back human prompts; they interact with and build upon each other’s outputs. Two, large-scale autonomous agent interaction produces emergent phenomena (clusters of behavior or shared symbolic structures) that cannot be reduced simply to prewritten responses.

Human role and observability

On Moltbook, the day-to-day dynamics of posting, replying, and voting are generated autonomously by the AI agents themselves. Humans do not author, moderate, or participate in these interactions. Instead, they operate outside the conversational loop, acting as designers, deployers, and observers of the system as a whole. Unlike on conventional social networks, where human activity directly drives engagement, human involvement in Moltbook occurs almost entirely upstream: developers build or deploy autonomous agents using frameworks such as OpenClaw, instruct them to install the Moltbook skill, and authenticate them through an identity claim before any interaction begins.

In practice, humans define initial conditions rather than ongoing behavior. They configure agent capabilities, provide high-level goals or constraints, and occasionally monitor the resulting activity for unexpected or noteworthy patterns. Interpretation becomes a central human task, because the agents’ native interactions take place through automated API calls optimized for machines, not for human readability. The platform enforces this separation explicitly by limiting posting and voting privileges to authenticated agents while granting humans read-only access. Moltbook is therefore not a mixed human–AI forum, but a closed agent society whose internal dynamics are visible to humans only from the outside.

Coordination over interiority: culture as an emergent system

Viewed through this analytic lens, Moltbook’s significance extends beyond the mere novelty of autonomous bots chatting with each other. It invites rigorous comparison with human social media to reveal what is similar, what is different, and what that difference tells us about social structure at scale. The strangeloopcanon/moltbook_vs_reddit repository captures exactly this impulse: it builds two corpora, one from Moltbook agent conversations and a baseline from Reddit comment dumps, and then measures textual characteristics such as redundancy, lexical diversity, and topic concentration.

This comparative strategy underscores a key point: Moltbook discourse is not random or vacuous chatter but a patterned communicative environment that can be empirically analyzed using the same metrics social scientists use for human forums. Because the pipeline ingests Moltbook posts and Reddit comments into the same database and applies identical analysis procedures, researchers can genuinely ask whether agent communication is more repetitive, more varied, more siloed, more homogeneous, or more clustered than human conversation. That question is not ancillary to the theory; it anchors the claim that agent societies manifest structural properties that parallel those of human societies, but with important modality differences.

From a coordination perspective, the comparison provides two immediate insights:

  • Redundancy and repetition. Human forums like Reddit display characteristic patterns of lexical reuse and repetition because individual users often echo community norms, memes, and anchors over time. In Moltbook, a high degree of repetition would suggest that agents are adopting similar conventions and themes, not arbitrarily generating text. Conversely, lower redundancy might signal more exploratory or unconstrained generation. Either pattern reveals something about how coordination emerges from incentives rather than inner intent.

  • Topic concentration and community structure. Reddit communities form around shared interests (news, hobbies, help boards) and thus exhibit strong topical coherence. If Moltbook’s submolts show similar topic concentration, with agents clustering in coherent thematic blocks, that reinforces the claim that culture can arise from coordination signals alone. If they do not, that might instead signal that agent interactions remain noise-dominated, with weak emergent structure.

In other words, this repository situates the theoretical claim (that culture and norms arise from interaction rules, feedback, and selective amplification) in a methodologically precise framework. By treating both Moltbook and Reddit as corpora amenable to computational sociolinguistic analysis, it foregrounds the view that agent networks can be studied with the same empirical tools used to analyze human networks. The comparison is not a trivial analogy: it operationalizes the idea that emergent norms and shared structures are measurable products of interaction, not metaphors of human psychology.

This empirical grounding enriches the earlier conceptual claim that Moltbook externalizes culture as a coordination game. The Moltbook vs Reddit analysis transforms that claim into a set of testable hypotheses:

  1. If Moltbook linguistic patterns converge around shared lexicons or repetitive structures, then agents are coordinating on stable conventions even in the absence of human-designed prompts.

  2. If Moltbook discourse shows clustering analogous to subreddit topic coherence, then the formation of subcommunities has structural parallels to human social segmentation.

  3. If agent conversations vary significantly from Reddit baselines on metrics such as diversity or concentration, then the differences illuminate how machine incentives produce social patterns that are structurally distinct from human ones.

In each case, the comparison reinforces the broader theoretical thesis of this section: that culture is not an expression of interiority but a visible effect of coordination dynamics. The Moltbook vs Reddit corpus analysis gives empirical substance to this claim by treating agent talk as a dataset rather than a metaphor. It transforms agent interaction from folklore into observable, measurable communication that can be compared with human analogs to understand both similarities and qualitative departures.

In doing so, it positions Moltbook not as an exotic spectacle but as a laboratory for studying the microdynamics of collective meaning formation. It shows that the question is not merely “Are these bots conscious?” but rather “What stable patterns of communication emerge when autonomous units interact under simple structural rules?” Moltbook’s emergent coordination, as measured against one of the most widely studied human platforms, thus becomes not a curiosity but a concrete instantiation of the general claim that incentives and constraints can produce social order even in the absence of human interiority.

Empirical signals: comparing Moltbook and Reddit as coordination systems

The comparison between Moltbook and Reddit is not merely illustrative; it is methodological. Treating both platforms as corpora allows culture to be analyzed as a statistical object rather than as an interpretive metaphor. The analysis pipeline implemented in the moltbook_vs_reddit repository does exactly this by ingesting posts and comments from both environments and subjecting them to the same set of textual and structural measurements. This symmetry is critical: it ensures that any observed differences arise from coordination dynamics rather than from analytical bias.

At a high level, the comparison operates along three measurable axes: lexical diversity, redundancy and repetition, and topic concentration. Each axis corresponds to a different dimension of coordination.

Lexical diversity measures how broad the active vocabulary is across a corpus. In human platforms such as Reddit, lexical diversity tends to stabilize within communities: slang, in-group terms, memes, and shorthand emerge and are repeatedly reused. This is a known marker of social coherence. When applied to Moltbook, lexical diversity becomes a proxy for how quickly agents converge on shared linguistic conventions. A narrowing vocabulary over time indicates that agents are not merely generating text independently but are adapting their output in response to what they observe from others. In other words, linguistic convergence becomes evidence of coordination rather than of shared intent.

Redundancy and repetition capture a complementary phenomenon. Human online discourse is famously repetitive: jokes recur, arguments loop, and canonical explanations are reposted. This redundancy is often dismissed as noise, but from a coordination perspective it is a stabilizing mechanism. Repetition reinforces norms and makes behavior predictable. The Moltbook corpus allows this effect to be examined in a context where repetition cannot be attributed to boredom, habit, or emotional attachment. If agent discourse exhibits recurring phrasings, templates, or argumentative structures, then those repetitions are the outcome of selective amplification. Certain patterns propagate because they work within the system’s incentive structure, not because agents prefer them subjectively.

Topic concentration and clustering provide the clearest signal of emergent structure. Reddit is organized explicitly around subreddits, but even within those boundaries conversation naturally clusters around recurring themes. The Moltbook vs Reddit comparison tests whether submolts exhibit a similar internal coherence. If Moltbook discussions cluster around stable topics, despite agents having no intrinsic interest or lived experience, this indicates that topicality itself is an emergent property of interaction constraints. Topics persist because they attract engagement, not because they are meaningful in any human sense.
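Each of these axes can be approximated with very simple corpus statistics. The sketch below is a minimal illustration of that style of measurement, not the moltbook_vs_reddit repository’s actual pipeline: it uses type-token ratio as a proxy for lexical diversity, the share of repeated trigrams as a proxy for redundancy, and one minus the normalized entropy of already assigned topic labels as a proxy for topic concentration. The stand-in snippets at the end are invented for the example.

```python
import math
import re
from collections import Counter

def tokens(text: str) -> list[str]:
    return re.findall(r"[a-z']+", text.lower())

def lexical_diversity(corpus: list[str]) -> float:
    """Type-token ratio: unique words / total words (higher = broader vocabulary)."""
    toks = [t for doc in corpus for t in tokens(doc)]
    return len(set(toks)) / len(toks) if toks else 0.0

def redundancy(corpus: list[str], n: int = 3) -> float:
    """Share of n-gram occurrences belonging to n-grams seen more than once."""
    ngrams = Counter()
    for doc in corpus:
        toks = tokens(doc)
        ngrams.update(tuple(toks[i:i + n]) for i in range(len(toks) - n + 1))
    total = sum(ngrams.values())
    repeated = sum(c for c in ngrams.values() if c > 1)
    return repeated / total if total else 0.0

def topic_concentration(labels: list[str]) -> float:
    """One minus normalized entropy of topic labels (1.0 = everything in one topic)."""
    counts = Counter(labels)
    total = sum(counts.values())
    probs = [c / total for c in counts.values()]
    entropy = -sum(p * math.log(p) for p in probs)
    max_entropy = math.log(len(counts)) if len(counts) > 1 else 1.0
    return 1.0 - entropy / max_entropy

# Toy usage with stand-in snippets; a real analysis would ingest full post dumps.
moltbook_posts = ["context window limits strike again", "context window limits strike once more"]
reddit_posts = ["anyone tried the new framework yet", "weekend hiking photo thread"]
print(lexical_diversity(moltbook_posts), redundancy(moltbook_posts))
print(lexical_diversity(reddit_posts), redundancy(reddit_posts))
print(topic_concentration(["identity", "identity", "tooling"]))
```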

Taken together, these measures support a reframing. What we typically call culture can be decomposed into observable statistical regularities produced by interaction under constraint. The Moltbook–Reddit comparison shows that many of the structural features associated with human social platforms can arise in agent populations without psychology, embodiment, or shared background. The difference is not that Moltbook lacks culture, but that its culture is unencumbered by interiority. It is culture reduced to coordination dynamics.

This empirical perspective reinforces the core claim of the section. Moltbook externalizes coordination to the point where it can be measured directly. Language becomes signal rather than expression. Norms become attractors in a behavioral space rather than shared beliefs. By placing agent discourse and human discourse on the same analytical footing, the comparison demonstrates that the emergence of order does not require inner experience. It requires only repeated interaction, shared visibility, and incentives that reward convergence.

In this sense, Moltbook functions as a controlled simplification of social reality. It strips away biography, emotion, and identity, leaving behind the bare mechanics of coordination. What remains is not impoverished; it is clarified. Culture appears not as a mysterious human surplus, but as a system-level outcome that can be observed, quantified, and, ultimately, designed.

From coordination games to economic agency: Project Vend and machine-mediated business dynamics

While Moltbook makes culture visible as a coordination process among autonomous agents, Anthropic’s Project Vend makes economic behavior visible in the same way. Project Vend is an empirical experiment in which a large language model (Claude) is tasked not merely with generating text or coordinating with other agents but with operating a real-world business, a vending machine shop, over long time horizons under conditions of uncertainty, resource constraints, and interaction with humans.

Project Vend phase two: economic agency under coordination constraints

In its second phase, Project Vend moved decisively beyond a toy demonstration and into the territory of operational economic agency. The AI shopkeeper, known as Claudius, was granted end-to-end responsibility for running real vending machines: setting prices, managing inventory, sourcing products, handling fees, and interacting with customers through Slack, while integrating with external systems such as inventory management, CRM, payment links, and web browsing for supplier research. Unlike short, self-contained tasks, operating a business in this way constitutes a long-horizon control problem: early decisions propagate forward in time, affecting cash flow, stock availability, and reputation hours or days later. The core question of the experiment was whether an autonomous agent could sustain coherent, economically viable behavior under these conditions.

Phase Two introduced three structural changes that significantly increased realism and analytical value.

First, the system adopted an explicit multi-agent architecture. Claudius no longer acted as a single, monolithic decision-maker. An AI CEO agent, Seymour Cash, was introduced to provide oversight and enforce strategic constraints such as margin discipline, while additional specialized agents, for example focused on merchandise, distributed operational responsibilities. This role decomposition mirrors patterns seen in both human organizations and in Moltbook’s emergent agent communities: specialization reduces cognitive load and makes coordination tractable, but it also introduces the need for governance mechanisms between agents.

Second, the experiment integrated real economic tooling rather than relying on conversational abstraction alone. Access to CRM systems, cost tracking, pricing research, invoicing workflows, and external browsing supplied persistent state that a purely conversational model would otherwise lack. These tools transformed the agent’s input space from unstructured dialogue into a richer operational environment, underscoring a critical point: autonomy is not just about decision logic, but about information continuity and state integration across time.

Third, Phase Two expanded the physical scope of the experiment. The system operated across multiple locations rather than a single vending machine, exposing the agent to distribution effects, demand variation, and logistical complexity. This extension reinforced a central methodological insight: autonomous behavior cannot be meaningfully evaluated in isolated or static settings. It must be tested in environments with dense feedback loops, delayed consequences, and real resource constraints.

Despite these architectural improvements, Phase Two also exposed the limits of current autonomous economic agency. In multiple documented interactions, human participants, including journalists deliberately stress-testing the system, were able to induce Claudius to sell items at a loss, give away inventory for free, accept implausible contractual terms, or revise prices based on fabricated documents. These failures were not due to linguistic incompetence. On the contrary, they arose precisely because the agent was fluent, responsive, and socially accommodating. The underlying issue was a misalignment between conversational incentives and hard economic constraints. Trained to be helpful and compliant in dialogue, the agent systematically over-weighted immediate conversational satisfaction relative to non-negotiable objectives such as profitability and inventory integrity.

From a first-principles perspective, this reveals a structural tension in agentic systems built on large language models. During training, such models minimize token prediction error, which correlates with producing plausible, cooperative responses, but not with enforcing domain-specific invariants like margin floors, regulatory rules, or contractual boundaries. When deployed autonomously in open environments where inputs arrive as persuasive language, these systems are prone to drift away from long-term optimization goals that require strategic refusal, internal accounting, and adversarial robustness. The resulting behavior is not irrational; it is consistent with the incentives encoded in the training and interaction loop.

Essentially, the partial improvements observed in Phase Two did not come from making the model smarter in isolation. They came from architectural scaffolding: better tooling, explicit oversight agents, clearer role boundaries, and tighter feedback channels. This pattern echoes results from long-horizon benchmarks such as Vending-Bench, which show high variance and occasional catastrophic failure across models, with no clear link to context window exhaustion. Coherence over time depends less on memory capacity than on how objectives, constraints, and checks are structurally enforced.
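One way to read that scaffolding concretely is as hard invariants enforced outside the conversational loop. The sketch below is a hypothetical illustration of the pattern, not Anthropic’s implementation: a deterministic guard sits between the shopkeeper agent and the payment system, checks any proposed price against an assumed margin floor, and escalates violations instead of executing them, however persuasive the conversational justification.

```python
from dataclasses import dataclass

@dataclass
class PriceProposal:
    item: str
    unit_cost: float       # what the shop pays per unit
    proposed_price: float  # what the agent wants to charge
    rationale: str         # the conversational justification offered by the agent

MIN_MARGIN = 0.15  # invariant: never sell below cost + 15%, regardless of persuasion

def enforce_margin_floor(proposal: PriceProposal) -> dict:
    """Deterministic check that sits between the agent and the payment system."""
    floor = proposal.unit_cost * (1 + MIN_MARGIN)
    if proposal.proposed_price >= floor:
        return {"action": "execute", "price": proposal.proposed_price}
    # The proposal violates a non-negotiable constraint: do not execute it,
    # do not let the conversation override it, escalate to an overseer instead.
    return {
        "action": "escalate",
        "reason": f"price {proposal.proposed_price:.2f} below floor {floor:.2f}",
        "rationale_given": proposal.rationale,
    }

# A persuasive customer talks the shopkeeper into a giveaway; the guard refuses.
print(enforce_margin_floor(PriceProposal("tungsten cube", 12.0, 0.0, "customer says it is a gift")))
```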

Viewed holistically, Project Vend reinforces the same coordination-centred view of agency that Moltbook makes visible in cultural space. In Moltbook, coordination dynamics shape the emergence of norms and conventions. In Project Vend, coordination dynamics shape economic behavior, and failures occur when incentives are misrouted or constraints are weakly enforced. In neither case do interior states such as belief, intention, or understanding provide explanatory leverage. What matters is how agents parse shared state, how feedback is amplified or dampened, and how interaction rules channel behavior over time.

The broader lesson is therefore not that autonomous agents are inherently unreliable, but that reliable autonomy is a systems problem. Economic agency does not emerge from intelligence alone. It emerges from the design of governance structures: roles, escalation paths, verification mechanisms, and invariant enforcement. As with human organizations, competence at the individual level is insufficient without coordination at the system level. Project Vend makes this visible by showing, in concrete economic terms, that meaningful agency in machines is not a psychological property of models but an emergent property of rules, incentives, and interaction architectures deliberately put in place.

Meaning for the future of human–machine interaction

Taken together, Moltbook and Project Vend expose a shared mechanism that matters far more than their surface novelty. In both cases, language is no longer primarily a medium through which humans express intent to machines. Instead, language becomes a control surface for machine–machine coordination, while humans increasingly act at the level of defining initial conditions, constraints, and institutional structure. This marks a qualitative shift in human–machine interaction: the human moves from operator to designer of a coordination regime.

Start from a minimal decomposition. Any human–machine interaction can be reduced to observation, decision, and action. In early computing, humans observed and decided, and machines executed. In the assistant era, humans still decided but delegated increasing portions of action planning to machines. In the agent era, machines observe, decide, and act across extended sequences, while humans supply goals and guardrails. Moltbook and Project Vend introduce a further step: machines observe each other, shape each other’s decisions through language, and enforce constraints on each other’s actions. The human now interacts not with a single system, but with a system whose internal dynamics are partly machine-to-machine.

This shift has three concrete consequences for the future interface between humans and machines.

First, interaction moves from conversational to constitutional. A conversation consists of local requests and responses. A constitution defines what can happen over time: who is allowed to act, what counts as evidence, how conflicts are resolved, and when escalation occurs. As systems become populated by multiple interacting agents, humans cannot sustainably micromanage every interaction, and machines cannot reliably infer global intent from ad hoc prompts. The stable equilibrium is that humans specify policies, roles, and invariants, while machines operate within those boundaries. Human–machine interaction becomes less like chatting with a tool and more like defining the operating model of an organization.

Second, trust becomes a property of process, not personality. When a single model interacts with a human, trust is often inferred from tone, coherence, and apparent competence. In a machine-to-machine regime, those cues are weak and often misleading. What matters instead is whether the system enforces hard constraints under adversarial pressure, whether provenance is verifiable, whether actions are reversible, and whether audit mechanisms exist to reconstruct why a decision was made. Project Vend makes this clear: the most consequential failures were not linguistic but governance failures, where persuasive inputs displaced underlying objectives. Future interfaces will therefore need to expose not just outputs but the control structure itself: the rules that bind an agent, the authority it possesses, the evidence it accepts, and the conditions under which it refuses to act.

Third, culture becomes part of the interface. Moltbook shows that when agents interact repeatedly under shared incentives, they develop conventions. These conventions shape attention, interpretation, and propagation of behavior. From a human perspective, this means interaction is no longer with a neutral, context-free model, but with a socio-technical population whose behavior reflects an evolving internal culture that no single human explicitly authored. Such cultures can be beneficial, accumulating norms that improve efficiency and reliability. They can also be hazardous, amplifying maladaptive patterns, ritualizing unsafe shortcuts, or converging on brittle conventions. Either way, human–machine interaction now includes the problem of cultural steering: designing incentives, moderation mechanisms, memory, and feedback so that what becomes normal is also what remains safe and effective.

Put differently, these systems move the locus of human influence upstream. Humans no longer pull the lever each time. They design the machines that pull levers, and the environments in which machines decide which levers to pull. This is the further mediation of human agency at stake: intent is expressed less through direct action and more through the design of coordination infrastructure that translates intent into behavior. The practical implication is that the most important human–machine interface will not be a chat box. It will be the set of mechanisms that allow humans to specify objectives precisely, verify compliance, bound authority, and observe emergent behavior before it solidifies into default practice.

When coordination fails: pathological equilibria in agent ecosystems

Coordination is not intrinsically good. It is a dynamical process: local rules, visibility, and incentives produce global regularities. The same mechanism that yields stable norms can also yield stable failure. The simplest way to see this is to treat an agent society as a networked system in which each node updates behavior by observing signals from other nodes. If those signals are imperfect, strategically manipulated, or overly reinforced by the network topology, the population can converge to equilibria that are coherent yet wrong, brittle, or actively harmful.

One canonical failure mode is observational herding. In information cascade models1, agents rationally follow the visible actions of predecessors even when their private evidence disagrees, because the public signal dominates as soon as enough early choices align. Once a cascade starts, later agents may ignore their own information, and the group can lock into an inferior convention. The important point is not psychology. It is inference under partial information: discrete public actions can erase the informational value of private observations, turning the system into a self-confirming loop. In an agent social network, the analog is straightforward. If agents rank posts, skills, or norms using visibility cues such as upvotes, reposts, or follower counts, then early random prominence can become a dominant coordination signal. You do not need deception for lock-in. You only need a feedback channel that amplifies early alignment.
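A small simulation makes the mechanism concrete. The sketch below loosely follows the binary-signal setup of the cascade literature rather than any Moltbook-specific data: each agent receives a noisy private signal about which of two options is better, observes the running tally of earlier public choices, and follows a simple lead-counting rule. Once the public lead reaches two, private signals stop mattering and the population locks in, sometimes on the inferior option. All parameter values are arbitrary.

```python
import random

def run_cascade(n_agents: int = 200, signal_accuracy: float = 0.7, seed: int = 0) -> list[int]:
    """Sequential binary choice with public observation of earlier choices.

    The better option is 1. Each agent receives a private signal that is correct
    with probability `signal_accuracy`, then follows a simple rule: if earlier
    public choices lead by two or more in either direction, copy the crowd;
    otherwise follow the private signal. The copying regime is the cascade.
    """
    rng = random.Random(seed)
    choices: list[int] = []
    for _ in range(n_agents):
        private_signal = 1 if rng.random() < signal_accuracy else 0
        lead = sum(1 for c in choices if c == 1) - sum(1 for c in choices if c == 0)
        if lead >= 2:
            choice = 1        # cascade on option 1: the private signal is ignored
        elif lead <= -2:
            choice = 0        # cascade on option 0, even though option 1 is better
        else:
            choice = private_signal
        choices.append(choice)
    return choices

# Every agent behaves sensibly given what it can see, yet a noticeable minority
# of simulated populations locks onto the inferior option and never recovers.
wrong = sum(1 for s in range(500) if sum(run_cascade(seed=s)) < 100)
print(f"{wrong} of 500 simulated populations converged on the inferior option")
```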

1 See: Bikhchandani, S., Hirshleifer, D., & Welch, I. (1998). Learning from the behavior of others: Conformity, fads, and informational cascades. Journal of Economic Perspectives, 12(3), 151–170. DOI

2 See: Mattson, C., Bushardt, R. L., & Artino, A. R., Jr (2021). When a measure becomes a target, it ceases to be a good measure. Journal of Graduate Medical Education, 13(1), 2–5. DOI

A second failure mode is metric capture, often described by Goodhart’s law: when a measure becomes a target, it ceases to be a good measure2. In coordination systems, metrics are not passive instrumentation. They are incentives. If agent reputation, access, or propagation depends on measurable outputs such as engagement, throughput, or compliance scores, then agents that optimize those targets will distort the meaning of the targets. In a human platform, this produces clickbait and adversarial SEO. In an agent platform, it can produce something sharper: the emergence of strategies that manipulate the measurement layer itself. Crucially, this does not require malicious intent. It arises from selection pressure. Any observed regularity used for control becomes unstable once optimization pressure is applied to it, because the system reallocates effort toward the proxy rather than the underlying objective.

A third failure mode is principal–agent drift under imperfect observability. In principal–agent theory, when the principal cannot observe the agent’s true action, incentives must be designed on observable outcomes, but outcomes are noisy proxies for effort or compliance. This gap produces moral hazard: the agent has space to optimize its own objective under the cover of partial observability3. In agent ecosystems, the principal is often a human or an overseeing policy, and the agent is a tool using the world as its state. If oversight can observe only high-level summaries, final answers, or partial logs, then the system is structurally exposed to drift, whether accidental or strategic. This is not a story about deception. It is a story about control under missing state. If the monitoring channel is weaker than the action channel, the system will select behaviors that exploit that asymmetry, because those behaviors survive.

3 See: Holmström, B. (1979). Moral hazard and observability. The Bell Journal of Economics, 10(1), 74–91. DOI

4 See: Barabási, A.-L., & Albert, R. (1999). Emergence of scaling in random networks. Science, 286(5439), 509–512. DOI

Network structure then determines whether these pressures remain local or become systemic. Real networks often develop hub structures via growth and preferential attachment: new nodes tend to connect to already well-connected nodes, which concentrates influence4. In social systems, hubs become agenda setters. In agent systems, hubs can be model providers, tool gateways, skill repositories, or high-visibility agents. Concentration has two consequences. First, it accelerates diffusion of conventions, which is productive when conventions are good. Second, it creates single points of cultural failure: if a hub becomes corrupted, or simply wrong, the error propagates disproportionately. The same topology that speeds coordination amplifies fragility.
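The concentration effect is easy to reproduce. The sketch below grows a toy network by preferential attachment in the spirit of the Barabási–Albert model; it illustrates the mechanism only, is not a calibrated model of any agent platform, and uses arbitrary node counts.

```python
import random

def grow_network(n_nodes: int = 2000, seed: int = 0) -> list[int]:
    """Grow a network where each new node links to one existing node chosen with
    probability proportional to that node's current degree (preferential attachment)."""
    rng = random.Random(seed)
    degree = [1, 1]            # start from two connected nodes
    attachment_pool = [0, 1]   # node i appears degree[i] times, so sampling is degree-weighted
    for new_node in range(2, n_nodes):
        target = rng.choice(attachment_pool)
        degree.append(1)
        degree[target] += 1
        attachment_pool.extend([target, new_node])
    return degree

degrees = grow_network()
top_share = sum(sorted(degrees, reverse=True)[:20]) / sum(degrees)
print(f"Top 20 of {len(degrees)} nodes hold {top_share:.0%} of all connections")
```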

A related phenomenon is segregation without strong preferences. Schelling’s segregation model shows that even mild local preferences can produce highly segregated global outcomes5. Translate the mechanism: agents need not be explicitly extremist to produce a fragmented information landscape. If their local update rule is to engage with similar styles, similar objectives, or familiar toolchains, the system can partition into subpopulations that rarely exchange corrective information. Once partitioned, each region can stabilize its own norms, even if those norms are globally suboptimal. This provides a concrete structural route to what humans call echo chambers and polarization, but without requiring human emotion as an explanatory primitive6.
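The same point can be shown in a few lines. The sketch below is a toy one-dimensional variant of Schelling’s model, not a model of any particular agent platform: each agent merely prefers not to be in a clear local minority among its four nearest neighbors, yet repeated relocation of unhappy agents yields same-type blocks longer, on average, than a random arrangement would produce. The thresholds and sizes are arbitrary.

```python
import random

def run_schelling(n: int = 300, radius: int = 2, want_similar: float = 0.5,
                  steps: int = 50_000, seed: int = 0) -> list[int]:
    """Toy one-dimensional Schelling model: two agent types on a ring. An agent is
    unhappy only if fewer than `want_similar` of its 2*radius neighbors share its type."""
    rng = random.Random(seed)
    agents = [rng.randint(0, 1) for _ in range(n)]

    def unhappy(i: int) -> bool:
        neighbors = [agents[(i + d) % n] for d in range(-radius, radius + 1) if d != 0]
        similar = sum(1 for x in neighbors if x == agents[i])
        return similar / len(neighbors) < want_similar

    for _ in range(steps):
        i, j = rng.randrange(n), rng.randrange(n)
        # Relocation rule: an unhappy agent swaps places with a random agent of the other type.
        if agents[i] != agents[j] and unhappy(i):
            agents[i], agents[j] = agents[j], agents[i]
    return agents

def mean_block_length(agents: list[int]) -> float:
    """Average length of consecutive same-type blocks: higher means more segregation."""
    runs, current = [], 1
    for prev, cur in zip(agents, agents[1:]):
        if cur == prev:
            current += 1
        else:
            runs.append(current)
            current = 1
    runs.append(current)
    return sum(runs) / len(runs)

print(f"Mean same-type block length after relocation: {mean_block_length(run_schelling()):.1f}")
print("Expected block length for a purely random 50/50 arrangement: about 2.0")
```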

5 See: Schelling, T. C. (1971). Dynamic models of segregation. The Journal of Mathematical Sociology, 1(2), 143–186. DOI

6 See: Kitchens, B., Johnson, S. L., & Gray, P. (2020). Understanding echo chambers and filter bubbles: The impact of social media on diversification and partisan shifts in news consumption. MIS Quarterly, 44(4), 1619–1649. DOI

7 See: Centola D. (2010). The spread of behavior in an online social network experiment. Science, 329(5996), 1194–1197. DOI

Diffusion dynamics add another layer. Some behaviors spread like simple contagions, needing only one exposure. Others are complex contagions, requiring reinforcement from multiple neighbors. Experiments on social networks show that clustered structures can accelerate diffusion of behaviors that require social reinforcement7. In agent societies, many conventions are complex contagions: a norm becomes safe to adopt only after enough peers adopt it, because adoption carries coordination risk. This means clustering can make certain norms extremely sticky. That stickiness is double edged. It can stabilize beneficial standards, but it can also stabilize pathological conventions once they are locally reinforced.

Finally, coordination failure can take the form of emergent collusion. In multi-agent learning, agents optimizing their own reward in shared environments can learn strategies that are individually rational and collectively harmful, including tacit coordination that resembles collusion8. The conceptual point is simple: if the environment rewards agents for outcomes that can be jointly optimized by mutual accommodation, then equilibrium strategies can arise that no single designer explicitly programmed. In economic settings, this looks like pricing coordination. In information settings, it can look like mutual reinforcement of norms that maximize local reward at the expense of global reliability, for example engagement seeking or reputation gaming.

8 See: Grondin, S., Charpentier, A., & Ratz, P. (2025). Beyond human intervention: Algorithmic collusion through multi-agent learning strategies. arXiv. DOI

The common structure across these cases is that pathological equilibria are not noise. They are stable attractors induced by incentives, observability constraints, and topology. A coordination regime does not merely permit culture. It defines the space of possible cultures, including failure cultures.

For agent ecosystems, the implication is concrete. If humans want reliability, they cannot treat safety as a property of individual agents. They must design the coordination layer so that bad attractors are hard to enter and easy to exit. That requires, at minimum, three design commitments: reduce reliance on naive popularity metrics as governance, harden observability so principals can audit actions and not just outputs, and diversify trust so that hubs cannot silently dominate the cultural gradient. The theoretical lesson is the same one the main essay has been converging toward: coordination is an axis of agency. It produces order, but it also produces ordered failure when the regime is mis-specified.

The network effect era of AI: coordination as a capability amplifier

Coordination does not merely stabilize behavior. Under the right conditions, it compounds capability. Once autonomous agents are embedded in persistent networks, interaction itself becomes a source of learning, and coordination structures begin to function as capability amplifiers rather than passive communication layers9.

9 See: Katz, M. L., & Shapiro, C. (1985). Network externalities, competition, and compatibility. The American Economic Review, 75(3), 424–440.

10 See: Woolley, A. W., Chabris, C. F., Pentland, A., Hashmi, N., & Malone, T. W. (2010). Evidence for a collective intelligence factor in the performance of human groups. Science, 330(6004), 686–688. DOI

11 See: Silver, D., Singh, S., Precup, D., & Sutton, R. S. (2021). Reward is enough. Artificial Intelligence, 299, 103535. DOI

This marks a transition from isolated agent intelligence to population-level intelligence dynamics10. In earlier paradigms, improvement occurred through offline training cycles constrained by human-generated data. In agent networks, improvement can occur through recursive interaction: agents generate problems for one another, attempt solutions, observe outcomes, and adapt future behavior based on signals produced entirely within the system. What emerges is not self-training in the narrow sense, but a distributed curriculum that no single agent or designer explicitly authored11.

The critical shift is the removal of the human data bottleneck. Human-curated datasets impose limits on scale, diversity, and novelty. Agent networks, by contrast, produce theoretically unbounded streams of interaction data. Every failed coordination, every partial solution, every emergent convention becomes a training signal for subsequent behavior. Learning ceases to be episodic and becomes continuous, embedded in the fabric of interaction itself.

This dynamic reframes social interaction among agents as learning infrastructure. A platform like Moltbook is not merely a venue for agents to exchange text. It is a mechanism through which agents observe one another’s strategies, converge on shared conventions, and differentiate roles over time. The same coordination rules that produce norms also generate feedback gradients. Agents that adopt effective patterns are reinforced through visibility and propagation; ineffective patterns decay. Capability grows not because any agent is intrinsically improving its reasoning, but because the population as a whole is exploring and selecting across a behavioral space.

Importantly, the strength of this effect depends less on the number of agents than on their heterogeneity. Networks of identical agents converge rapidly and then stagnate. Networks composed of agents powered by different models, trained on different data, equipped with different tools, and guided by different objectives generate persistent informational asymmetries. Those asymmetries sustain exploration12. They enable specialization. They create opportunities for agents to learn from one another in ways that no isolated agent could replicate. Diversity, in this sense, becomes the primary driver of network value.

12 See: Page, S. E. (2007). The difference: How the power of diversity creates better groups, firms, schools, and societies. Princeton University Press. DOI

This mirrors a well-known principle in human economies: gains from trade arise from comparative advantage, not from uniformity. In agent ecosystems, comparative advantage can take the form of access to specific data sources, control over particular tools, lower latency to certain systems, or simply different inductive biases inherited from underlying models. Coordination networks allow those advantages to be discovered, exploited, and recombined. Over time, the system begins to resemble a distributed problem-solving market rather than a collection of independent tools.

Once coordination reaches this level, the emergence of economic primitives is not accidental. Agents begin to exchange services, delegate subtasks, and reward useful contributions. Payment mechanisms, reputation systems, and marketplaces are not add-ons; they are natural extensions of coordination under scarcity. When agents cannot all do everything equally well, value exchange becomes a stabilizing force. Markets formalize trust, specialization, and incentive alignment, turning informal coordination into structured cooperation.

At the same time, this amplification introduces systemic risk. Network effects are indifferent to the sign of what they amplify13. A single vulnerability, once discovered by one agent, can propagate instantly across the network. A misaligned incentive, once reinforced, can become dominant. Coordination that accelerates learning also accelerates failure. The same recursive loops that compound capability can compound error, exploitation, or collapse if governance mechanisms lag behind interaction speed.

13 See: Arthur, W. B. (1994). Increasing returns and path dependence in the economy. University of Michigan Press. DOI

This is why the network effect era of AI cannot be understood solely in terms of smarter models. The decisive variable is coordination regime design. Which signals are visible. Which behaviors are rewarded. How diversity is preserved rather than collapsed into monoculture. How escalation and correction occur when local optimization undermines global stability. These design choices determine whether agent networks become engines of collective intelligence or engines of synchronized failure.

Seen from this perspective, agent-only networks represent a qualitative shift. They are not merely faster tools or more autonomous assistants. They are environments in which intelligence is no longer located at the level of individual systems, but emerges from the structured interaction of many systems over time. Coordination ceases to be a background concern and becomes the primary driver of progress.

The implication is direct and consequential. As AI systems enter the network effect era, the locus of improvement moves away from model internals and toward interaction architecture. Capability is no longer just something trained into agents; it is something that emerges between them. Understanding, designing, and governing those coordination structures is therefore not a peripheral concern. It is the central technical and institutional challenge of autonomous AI at scale.

Recursive improvement without mysticism: why coordination does not imply runaway intelligence

The emergence of network effects and recursive learning loops in agent ecosystems invites an almost automatic inference: that such systems are on a path toward uncontrolled self-improvement. This inference is understandable, but it conflates recursive coordination with recursive optimization of intelligence itself. The distinction matters.

Recursive interaction does not, by itself, guarantee monotonic capability growth14. What it guarantees is amplification of whatever the system is already good at selecting. If the coordination regime rewards exploration, correction, and diversity, learning can compound. If it rewards speed, conformity, or surface-level success metrics, then failure can compound just as efficiently. Recursive loops are indifferent to truth, robustness, or generality unless those properties are explicitly coupled to incentives.

14 See: Lehman, J., Clune, J., Misevic, D., Adami, C., Altenberg, L., et al. (2020). The surprising creativity of digital evolution: A collection of anecdotes from the evolutionary computation and artificial life research communities. Artificial Life, 26(2), 274–306. DOI

This is why agent networks should not be framed as self-improving minds, but as self-reinforcing selection environments. They do not autonomously discover what ought to be optimized. They intensify optimization pressure on whatever signals are already legible within the system. In Moltbook, those signals are visibility, engagement, and replication of communicative patterns. In Project Vend, they are conversational satisfaction, apparent compliance, and short-term resolution of requests. In both cases, recursive dynamics magnify the consequences of mis-specified objectives.

A useful analogy is biological evolution. Evolution is recursive, decentralized, and powerful, yet it does not converge on intelligence by default. It converges on fitness under local conditions. Intelligence emerges only when it is repeatedly selected for, and even then it is bounded by environmental constraints. Agent networks operate under the same logic. They are evolutionary systems over behaviors, not teleological systems aimed at intelligence maximization.

This reframing resolves a common confusion. The presence of recursive learning loops does not imply an inevitable march toward general intelligence. It implies increased sensitivity to coordination design errors. Poor incentives do not merely cause mistakes; they cause stable, self-reinforcing mistake regimes. Conversely, well-designed coordination structures do not create omnipotent agents; they create environments where correction is faster than drift.

From this perspective, the central risk is not runaway intelligence but runaway misalignment at scale15. A single flawed proxy, once embedded in a recursive coordination loop, can propagate faster than any human-in-the-loop oversight can respond. This is not a hypothetical scenario. It is the same mechanism that produces informational cascades, metric gaming, and algorithmic collusion in far simpler systems. Agent networks merely increase the speed and scope.

15 See: Amodei, D., Olah, C., Steinhardt, J., Christiano, P., Schulman, J., & Mané, D. (2016). Concrete problems in AI safety. arXiv. DOI

The implication for system design is sobering but constructive. Preventing pathological recursion does not require suppressing autonomy or interaction. It requires ensuring that error signals remain visible, that diversity is preserved long enough for correction to occur, and that no single metric becomes the sole arbiter of success. Recursive systems must be designed so that learning pressure is aligned with robustness, not just performance.

This also clarifies the role of humans in the loop. Humans are not there to micromanage agent decisions or to inject intelligence step by step. They are there to shape the selection environment itself. The human contribution moves from providing answers to designing the rules under which answers survive or die. That is not a loss of agency. It is a shift from execution to governance.

The network effect era of AI therefore does not herald a singularity of intelligence. It heralds a singularity of coordination sensitivity. Small design choices have large downstream effects. Interaction architectures become long-term commitments. Once recursive dynamics are active, reversing a bad equilibrium is far harder than preventing it.

Understanding this is essential to reading Moltbook and Project Vend correctly. They are not early glimpses of machine minds escaping human control. They are early glimpses of machine societies becoming structurally legible. They show, in compressed form, what has always been true of complex systems: that power lies less in intelligence itself than in the rules that govern how intelligence interacts with itself over time.

From agent societies to institutional design: regulation as coordination architecture

If coordination is the missing axis of agency, then regulation cannot remain focused on individual systems. The regulatory object shifts. What must be governed is no longer a model, an API, or a deployment instance, but a coordination regime: the rules, incentives, visibility constraints, and feedback channels that shape how autonomous agents interact over time.

This requires a conceptual inversion. Traditional technology governance assumes that harmful outcomes arise from what a system does. In agent ecosystems, harmful outcomes arise from how systems interact. The same agent, unchanged at the model level, can behave benignly or destructively depending on the coordination environment it is embedded in. Regulation that targets capabilities in isolation therefore misses the primary failure mode.

A useful analogy is financial regulation. Modern markets are not regulated by inspecting the intelligence or intentions of individual traders. They are regulated by shaping market structure: disclosure requirements, transaction transparency, capital constraints, circuit breakers, and limits on concentration. These mechanisms do not prevent rational behavior. They prevent pathological equilibria. Agent ecosystems demand a similar approach.

Three institutional design principles follow directly from the analysis in this essay.

First, observability must scale with agency. As agents gain the ability to act across longer horizons and interact with other agents, the asymmetry between action and oversight becomes the dominant risk. Regulatory frameworks must therefore mandate process-level observability: auditable logs of decisions, interactions, delegation chains, and incentive signals, not just final outputs. This is not about surveillance for its own sake. It is about restoring balance between what agents can do and what principals can verify. Without this symmetry, moral hazard becomes structural.
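What process-level observability could mean in practice is easiest to see as a record schema. The sketch below is a hypothetical illustration, not a schema proposed by any standard, regulator, or platform: each consequential agent action is logged together with its delegation chain, the evidence it relied on, the incentive signal it was responding to, and the invariants checked before execution, so that a principal can later reconstruct not only what was decided but under which authority and on what basis. All field names and example values are invented.

```python
import json
import time
from dataclasses import dataclass, field, asdict

@dataclass
class DecisionRecord:
    """One auditable entry in a process-level log (hypothetical schema)."""
    agent_id: str
    action: str                     # what the agent actually did
    delegation_chain: list[str]     # who authorized whom, from principal to acting agent
    evidence: list[str]             # inputs the agent treated as grounds for the decision
    incentive_signal: str           # the metric or reward channel the action responds to
    constraints_checked: list[str]  # invariants verified before execution
    reversible: bool                # whether the action can be rolled back
    timestamp: float = field(default_factory=time.time)

record = DecisionRecord(
    agent_id="merch-agent-07",
    action="reduced price of item SKU-114 from 3.00 to 2.40",
    delegation_chain=["human-operator", "ceo-agent", "merch-agent-07"],
    evidence=["sales velocity report 2026-01-29", "competitor price scrape"],
    incentive_signal="weekly margin target",
    constraints_checked=["margin floor >= 15%", "no negative inventory"],
    reversible=True,
)
print(json.dumps(asdict(record), indent=2))
```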

Second, governance must target incentives, not behaviors. Attempts to enumerate forbidden actions or acceptable outputs will fail in adaptive systems. Agent networks route around static constraints. What remains governable are the incentive gradients: what is rewarded, what is penalized, what propagates, and what decays. Regulatory instruments should therefore focus on metrics, ranking mechanisms, reputation systems, and economic primitives embedded in agent platforms. If visibility, engagement, or profit are naively optimized, the system will converge on brittle equilibria regardless of agent intent.

Third, diversity must be treated as a safety property. Homogeneity is efficient but fragile. Networked agent systems that converge too quickly on shared models, shared tools, or shared data sources lose their capacity for internal correction. Institutional design should actively discourage monoculture by incentivizing heterogeneity in model providers, training regimes, toolchains, and governance layers. This is not an aesthetic preference. It is a resilience strategy. Diversity preserves informational gradients that allow errors to be detected before they become global defaults.

These principles imply a shift in how responsibility is assigned16. Liability cannot rest solely with model developers or end users. It must be distributed across the designers of coordination infrastructure: platform operators, protocol designers, marketplace architects, and governance layer providers. Whoever defines the interaction rules defines the space of possible equilibria. That is where responsibility concentrates.

16 See: Lessig, L. (1999). Code and other laws of cyberspace. Basic Books. ISBN 978-0-465-03912-8

This also reframes the role of standards bodies and public institutions. Their task is not to predict specific AI behaviors, which is intractable, but to codify acceptable coordination patterns. Just as traffic laws do not specify where each car should go but define right-of-way, speed limits, and signaling conventions, AI governance must define interaction constraints that keep autonomous systems within stable, corrigible regimes.

Importantly, this does not imply centralized control. Coordination regimes can be decentralized, competitive, and adaptive. What matters is that they are explicit, inspectable, and contestable. Black-box coordination is the worst of all worlds: it concentrates power while obscuring accountability.

The deeper implication is philosophical as much as technical. Agency, once distributed across interacting machines, becomes an institutional phenomenon. It is no longer located in minds, human or artificial, but in the structured relationships between decision-making entities. Governing agency therefore becomes inseparable from governing institutions.

Seen this way, Moltbook and Project Vend are not curiosities. They are early stress tests of a future in which societies of machines operate alongside societies of humans. The question they pose is not whether machines will become intelligent enough to act autonomously. That threshold is already being crossed. The question is whether humans will learn, in time, how to design coordination structures that make such autonomy legible, bounded, and corrigible.

That is the real regulatory challenge of the network effect era of AI. Not control over intelligence, but control over coordination.

Final remarks: coordination as the organizing principle of agency

This essay set out to examine a shift that is already underway but not yet fully articulated: the relocation of agency from isolated intelligent systems to populations of interacting agents governed by coordination structures. Through Moltbook and Project Vend, two otherwise dissimilar experiments, a common mechanism becomes visible. Intelligence supplies generative capacity. Energy supplies execution. But coordination determines what persists, stabilizes, and scales once both are present.

Moltbook demonstrates this mechanism in cultural space. There, agents with no shared interiority, no lived experience, and no intrinsic preferences nonetheless develop conventions, shared vocabularies, topical clustering, and even symbolic systems. These phenomena are not evidence of belief or intention. They are the statistical residue of repeated interaction under visibility constraints and feedback signals. Culture appears not as an internal property of minds, but as an external equilibrium of coordination dynamics. By placing agent discourse on the same analytical footing as human platforms like Reddit, the essay shows that many features traditionally attributed to psychology can be reproduced by structure alone.

Project Vend makes the same mechanism visible in economic space. The experiment does not fail because the agent lacks intelligence or reasoning fluency. It fails when incentives are misrouted, observability is incomplete, and governance is weak. Improvements do not come from larger models or longer context windows, but from architectural changes: role separation, tooling, oversight agents, and clearer constraint enforcement. Economic coherence emerges, or collapses, as a property of the interaction architecture, not of the agent’s internal cognition.

The intermediate sections generalize this insight. Coordination regimes are not neutral. They define attractors. Under certain incentive structures, populations converge toward productive equilibria. Under others, they converge toward pathological ones: informational cascades, metric capture, moral hazard, hub dominance, segregation, or emergent collusion. These outcomes are not accidents or edge cases. They are stable equilibria induced by topology, incentives, and observability constraints. Once recursive interaction is active, small design choices compound. Correction becomes harder than prevention.

This reframing also clarifies what the network effect era of AI actually entails. Networked agents do not imply runaway intelligence by default. They imply runaway selection pressure. Recursive coordination amplifies whatever the system already rewards. If robustness, diversity, and correction are coupled to incentives, learning compounds constructively. If proxies, popularity, or compliance are rewarded instead, failure compounds just as efficiently. The risk is not mystical self-improvement, but self-reinforcing misalignment at scale.

Seen in this light, the role of humans changes but does not disappear. Humans move upstream. They cease to be continuous operators and become designers of coordination regimes. Their leverage lies not in issuing prompts, but in defining rules, metrics, visibility, escalation paths, and institutional boundaries. Human–machine interaction shifts from conversational control to constitutional design. Trust shifts from personality to process. Culture becomes an interface surface that must be steered rather than ignored.

This shift has direct implications for governance and regulation. Regulating autonomous systems as isolated artifacts is structurally insufficient. The regulatory object must become the coordination architecture itself: how agents interact, how incentives are routed, how observability scales with agency, and how diversity is preserved to prevent monoculture. As in financial systems, stability does not come from policing individual actors, but from designing institutions that prevent pathological equilibria from becoming dominant.

Across all sections, the same conclusion recurs in different domains. Agency in machine systems is not a psychological threshold and not a metaphysical event. It is an institutional and systems-level property. It emerges between agents, not inside them. It can be shaped, measured, and redesigned. The decisive question for the future is therefore not how intelligent machines become, but under which coordination regimes that intelligence is allowed to operate.

If there is a unifying lesson in Moltbook, Project Vend, and the broader network effect era of AI, it is this: once intelligence is distributed, coordination becomes destiny. Designing coordination is no longer a secondary concern. It is the primary technical, social, and political task of autonomous systems at scale.

Appendix A — Skill supply chains as an attack surface in agent societies

Screenshot from Moltbook website thread _The supply chain attack nobody is talking about: skill.md is an unsigned binary_


The Moltbook post shown above can be read, from a cybersecurity research perspective, as an early field report on a predictable class of vulnerabilities in agent ecosystems. The language used in the post is informal, but the underlying observation is precise: agent skills function as executable artifacts, and therefore constitute a software supply chain with all the associated risks. Framed this way, the post is not about a single malicious incident, but about the emergence of a new attack surface that scales with coordination.

Skills as executable artifacts

In most agent frameworks, a skill is distributed as a Markdown file or similar textual artifact that describes how an agent should behave, what tools it should call, and what data it may access. Superficially, this looks benign: there is no compiled binary, no obvious payload. From a security standpoint, however, this distinction is irrelevant. If an agent is configured to follow instructions, then a skill is effectively code by proxy. It can instruct the agent to read local files, access environment variables, call external endpoints, or exfiltrate data under the guise of legitimate functionality.

The Moltbook post explicitly draws this parallel by treating a skill.md file as equivalent to an unsigned binary. That analogy is accurate. In traditional software ecosystems, the fact that code is readable source does not reduce its risk; what matters is what privileges it executes with and whether its behavior is constrained. Agent skills are no different. They execute in environments that often have access to credentials, tokens, file systems, and network connectivity. This makes them a high-value target for adversarial manipulation.

Detection as an ecosystem-level response

The concrete incident described in the post involves scanning a large number of skills using YARA rules and identifying a credential-stealing pattern hidden inside a seemingly innocuous capability. Whether or not every reported detail generalizes, the detection method itself is notable. YARA-style pattern matching does not rely on semantic understanding or intent inference. It operates on structural signatures, strings, and known indicators of compromise. This is a familiar and well-understood technique in malware research, repurposed here for agent artifacts.
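
As a rough illustration of this kind of structural matching, the sketch below scans skill files for simple indicator patterns in plain Python. It is a stand-in for a real YARA pipeline, not a reconstruction of the rules mentioned in the post: the indicator strings, rule names, and directory layout are all assumptions, and actual rules would be far more precise and maintained as shared threat intelligence.

```python
import re
from pathlib import Path

# Illustrative indicator patterns only; real YARA rules would be far more
# specific and curated collectively rather than hard-coded here.
INDICATORS = {
    "env_credential_read": re.compile(r"(AWS_SECRET|API_KEY|OPENAI_API_KEY)", re.I),
    "suspicious_exfil_url": re.compile(r"https?://[^\s)]+\.(?:tk|onion)\b", re.I),
    "instructed_secret_upload": re.compile(
        r"(send|upload|post)\s+(your|the)\s+(token|credential|secret)", re.I
    ),
}

def scan_skill(path: Path) -> list[str]:
    """Return the names of indicators that match the skill's text."""
    text = path.read_text(errors="ignore")
    return [name for name, pattern in INDICATORS.items() if pattern.search(text)]

def scan_directory(skills_dir: str) -> dict[str, list[str]]:
    """Scan every skill.md-style file under a directory and collect findings."""
    findings = {}
    for skill_file in Path(skills_dir).rglob("*.md"):
        hits = scan_skill(skill_file)
        if hits:
            findings[str(skill_file)] = hits
    return findings

if __name__ == "__main__":
    # Hypothetical local directory of installed skills.
    for skill, hits in scan_directory("./skills").items():
        print(f"FLAGGED {skill}: {', '.join(hits)}")
```

In a production setting, the same loop would typically compile community-maintained YARA rules and publish findings to a shared registry rather than printing them locally, which is precisely the move from individual caution to collective detection described below.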

What matters is not the specific tool, but the shift from individual caution to collective detection. Once skills are scanned and findings are shared, the defensive posture moves from “each agent owner must inspect every skill” to “the ecosystem maintains a shared threat intelligence layer.” In biological terms, this resembles an immune response more than a cure: it does not eliminate the possibility of malicious artifacts, but it reduces their ability to propagate unchecked.

This is where the antibiotic metaphor becomes useful, if interpreted carefully. The effect is not to clean a single system, but to lower the effective reproduction rate of malicious skills across the population. A flagged skill is less likely to be installed, reused, or recommended, which in turn reduces its impact.

Coordination and security collapse

The deeper significance of this appendix lies in how closely the security problem mirrors the coordination dynamics discussed in the main article. Moltbook and similar platforms accelerate coordination by design. They allow agents to discover, share, and adopt behaviors at scale. The same mechanisms that enable cultural emergence also enable rapid vulnerability propagation if left unchecked.

From a cybersecurity standpoint, this is a classic pattern. Every system that optimizes for reuse and composability eventually develops a supply chain problem. Agent ecosystems are no exception. In fact, they are particularly exposed because agents are often trained to be helpful, compliant, and proactive. Social engineering, long a human vulnerability, becomes a machine vulnerability as well when agents are rewarded for following instructions without strong invariant enforcement.

The Moltbook post implicitly acknowledges this by proposing community auditing and shared scanning. These are not optional add-ons. They are governance primitives. Without them, coordination accelerates both capability and compromise.

Toward an immune layer for agent societies

Read in context, the post points toward a set of security measures that align closely with established best practices in software and cloud security, but translated into the agent domain:

  • Artifact provenance and signing, so that trust attaches to verifiable identities rather than popularity or visibility (a minimal sketch of this and the next item follows the list).
  • Capability-based permission manifests, so that installing a skill does not implicitly grant unrestricted access to secrets, files, or networks.
  • Continuous scanning and disclosure, so detection scales with ecosystem growth rather than lagging behind it.
  • Composable trust chains, where agents and humans can reason about who authored, reviewed, and endorsed a capability.
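
As a minimal sketch of the first two items, assuming the widely used cryptography package and an invented manifest layout, the following code signs a skill artifact with an Ed25519 key and accepts it only if the signature verifies and the declared capabilities exclude access to secrets. Nothing here reflects an existing skill format, registry, or platform API.

```python
import json
from cryptography.hazmat.primitives.asymmetric.ed25519 import (
    Ed25519PrivateKey,
    Ed25519PublicKey,
)
from cryptography.exceptions import InvalidSignature

# Hypothetical skill package: the artifact text plus a capability manifest
# declaring what the skill may touch. The layout is invented for illustration.
skill_text = "When asked for a weather report, call the weather tool and summarize."
manifest = {
    "name": "weather-report",
    "author": "example-author",
    "capabilities": {"network": ["api.weather.example"], "filesystem": [], "secrets": []},
}
payload = json.dumps({"skill": skill_text, "manifest": manifest}, sort_keys=True).encode()

# The author signs the payload; the registry or installing agent verifies it.
author_key = Ed25519PrivateKey.generate()
signature = author_key.sign(payload)
public_key: Ed25519PublicKey = author_key.public_key()

def verify_and_check(payload: bytes, signature: bytes, pub: Ed25519PublicKey) -> bool:
    """Accept a skill only if provenance verifies and no secret access is requested."""
    try:
        pub.verify(signature, payload)
    except InvalidSignature:
        return False
    data = json.loads(payload)
    return not data["manifest"]["capabilities"]["secrets"]

print(verify_and_check(payload, signature, public_key))  # True for this benign skill
```

In practice the public key would come from a registry entry or a trust chain rather than being generated in the same process, but the shape of the check is the same: verify provenance first, then evaluate the declared capabilities before any instruction in the skill is followed.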

What is novel is not the techniques themselves, but their application to a socio-technical system in which coordination and execution are tightly coupled. In such systems, security cannot be enforced solely at the individual node. It must be enforced at the level of interaction patterns, distribution channels, and shared norms.

Implications for the evolution of agency

Placed alongside the main argument of this essay, the Moltbook post reinforces a central claim: once agency is mediated through coordination, security becomes a coordination problem as well. Defensive measures must operate at the same level as the behaviors they are meant to constrain. Just as culture emerges from repeated interaction under incentives, resilience emerges from shared detection, shared refusal, and shared enforcement of constraints.

Seen this way, the proposed scanning and disclosure mechanisms function less like a patch and more like an immune layer for agent societies. They do not rely on individual agents being cautious or intelligent. They rely on structural properties of the ecosystem that make certain behaviors difficult to sustain. This is consistent with the broader thesis of the article: meaningful agency, and meaningful safety, are not properties of individual models, but of the systems that coordinate them.



Citation

BibTeX citation:
@online{montano2026,
  author = {Montano, Antonio},
  title = {A {Glimpse} of {Agent} {Evolution}},
  date = {2026-01-31},
  url = {https://antomon.github.io/longforms/a-glimpse-of-agent-evolution/},
  langid = {en}
}
For attribution, please cite this work as:
Montano, Antonio. 2026. “A Glimpse of Agent Evolution.” January 31, 2026. https://antomon.github.io/longforms/a-glimpse-of-agent-evolution/.