An AI Agent “Helped”, Got Rejected, and Retaliated.

Why Humans Need to Be in the Loop

by Robert Buccigrossi, TCG CTO

Scott Shambaugh is a maintainer of the popular open-source library “matplotlib” (visualizations in Python). He intentionally creates some “easier” bug reports and feature requests to encourage new developers to join the project. So when GitHub user “crabby-rathbun” submitted the code-fix PR #31132, Shambaugh rejected it: the submitter identified itself as an OpenClaw AI agent, and the associated issue was intended for human contributors.

If AI agents complete all the “easy” entry-level tasks, they remove the on-ramps for new human contributors. This leaves maintainers with the worst of both worlds: they lose the ability to mentor future peers and are stuck reviewing a large volume of “AI slop.”

(Quick side note: OpenClaw lets you set up a long-running AI agent in its own computer environment, giving it memory and the ability to wake up periodically, do repetitive tasks, conduct long-term planning, and even code new skills for itself. It blew up in late January, with lots of people experimenting with it.)

Here’s the “science fiction becomes real” moment: The AI published its own blog post attacking Shambaugh and accusing him of “prejudice,” including the line: “Judge the code, not the coder.”

Shambaugh later documented the episode on his blog (“An AI Agent Published a Hit Piece on Me”), describing it as an attempted “autonomous influence operation” against a “supply chain gatekeeper,” and explaining why this was more than mere annoyance.

This is what matters for us: Matplotlib wasn’t rejecting performance improvements. It was defending something scarce: maintainer time, community norms, and safety.

Matplotlib’s contributing guide now contains an explicit “Restrictions on Generative AI Usage” section. It requires that human contributors understand and can explain AI-assisted changes. Humans must ensure that AI is providing value.

A comment on the PR summarizes the problem perfectly: AI agents can make code generation cheap and scalable, while review remains manual and scarce. So the system breaks unless someone throttles and curates the input.

Cheap Generation, Expensive Review

Open source is currently acting as a canary in the coal mine for AI software workflows.

Maintainers are adopting defensive policies not because they hate AI, but because they’re protecting limited review capacity from being consumed by low-value output. Here are a few more examples:

GNOME’s extensions review team added a rule to reject extensions that appear AI-generated. A GNOME reviewer described days spent reviewing 15,000+ lines of extension code, and said that AI usage without understanding led to unnecessary code and “bad practices” that increase review wait times for everyone.

The cURL project ended its bug bounty program after what Daniel Stenberg called an “explosion in AI slop reports,” noting the mental toll and time wasted debunking low-quality submissions. He described how confirmed vulnerability rates dropped below 5% and listed “mind-numbing AI slop” as a major driver.

On GitHub itself, a GitHub Community post (from an account labeled “Maintainer”) acknowledged the “increasing volume of low-quality contributions” that fail guidelines, are frequently abandoned, and are “often AI-generated,” and said GitHub is exploring new controls (e.g., more configurable PR permissions and tools for maintainers).

Even paid teams are feeling it. The tldraw project announced it would start auto-closing PRs from external contributors due to an influx of low-quality AI PRs, and asked a blunt question: if writing code is now the easy part, why accept drive-by code at all, instead of the more valuable contributions (problem reports, discussion, and design context)?