On June 9, 2026, I posted about burning $500 in Claude usage auditing 20 entrepreneurs at DealCon, and said my agents “update their own training via RSI.” Smart people pushed back. They were right about one thing, wrong about two others, and the whole exchange is a better lesson in building self-improving systems than my original post was. So here is the definitive version — what I mean by RSI, what I don’t, and the receipts.
The fair criticism, stated plainly
No deployed AI model updates its own weights from your usage. When you run Claude, GPT, or Gemini, the model’s training is frozen. Nothing you do in a session retrains it. So when I wrote that my agents “update their own training,” that was the wrong phrase — “training” means weights to anyone technical, and the weights don’t move. The commenters who called that out were correct, and I’m glad they did, because the precise version of the claim is stronger than the sloppy one.
What my system actually does
The recursion in my system lives at the process layer, not the model layer. Every skill in my library ends with the same loop, and one skill — recursive-self-improvement-qa, which you can download and read — exists only to enforce it:
- Do the work to a written definition of done, not to “good enough.”
- Document the run as a meta-article: what was tried, what worked, what failed, what surprised us.
- QA against the definitive article — the canonical standard for that task.
- Rewrite the SOP — the skill file itself — so the lesson is now an instruction, not a memory of a mistake.
- Persist to memory, so the next session starts from everything the last one learned.
The model never gets smarter. The system does: the skills, the SOPs, the checklists, the memory files. Run the same frozen model against a playbook that has improved itself through fifty documented cycles and you get dramatically better output than the same model with a cold prompt. One commenter described this as “agents reading their own notes back, which is memory, not RSI” — and that’s a fair technical description of the mechanism. I’d only add: when the notes systematically rewrite the instructions, and the instructions govern the next run, “notes” undersells it. It’s a flywheel made of documents. Call it process-layer RSI, call it a self-improving system, call it institutional memory with a work ethic. The label matters less than the loop.
Why frozen weights don’t stop the compounding
Here’s the part the “that’s just memory” framing misses. The models leapfrog each other every few weeks — Claude today, Gemini or GPT or Grok tomorrow. If your advantage lives in the model, you lose it at the next release. If your advantage lives in documented, self-improving process — skills, SOPs, definitions of done, meta-articles, memory — then every new model makes your system better the day it ships, because you swap the engine and keep the car. That’s exactly why we built our skill packs as portable markdown files rather than platform-locked workflows. The moat isn’t the model and it isn’t even the prompt. It’s the accumulated, versioned record of what works, structured so an agent can act on it.
The receipts
The strongest comment in the thread said: when the checkable claims are wrong, the uncheckable ones don’t get the benefit of the doubt — post one of the PDFs. Reasonable standard. Here’s everything, public:
- All 20 DealCon Authority Audits — the full 15-page PDFs, named and dated per attendee, are at the bottom of dennisyu.com/dealcon. Not a sample. All of them.
- The skill packs themselves — the same 10-skill system, free, installable on your own Claude in about a minute: the personal edition and the company edition. Reverse-engineer them; that’s the point.
- The 239-task library behind the method, every SOP readable: the Task Library dashboard.
- A measured field result — the January 2026 Maps Visibility System pilot across four Pure Green franchise locations, with before/after grids: the pilot data.
- The build process documented as it happened: how we turned 239 tasks into an AI-runnable skill library.
Scoring the whole thread, both directions
Since the standard is “get the checkable claims right,” let’s apply it evenly.
What I got wrong: “Updates its own training” — imprecise; it’s the instructions and memory that update, never the weights. “Trillion dollar market cap” — wrong term; Anthropic is a private company, so it has a valuation, not a market cap.
What the fact-checkers got wrong: One commenter’s Claude confidently stated Anthropic is “valued around $60–70B, not $1 trillion.” The reality, checkable in thirty seconds: Anthropic closed a Series H on May 28, 2026 at a $965 billion valuation — Fortune covered it as the company nearly becoming the first trillion-dollar private company in history — and CNBC reported it surpassing OpenAI as the world’s most valuable startup, with a confidential IPO filing following on June 1. So my “trillion” was off by about 3.5 percent and one word. The correction was off by roughly $900 billion — because that Claude answered from stale training data without searching. The same commenter’s Claude also said “each conversation starts fresh,” which was true in 2024 and is false now: persistent memory across sessions is a shipping product feature, and it’s load-bearing in everything described above.
That’s the meta-lesson, and it’s worth more than the argument: an AI that doesn’t verify is a confident generator of outdated facts — in both directions. The fix is the same loop this whole article describes. Search before asserting. QA against a standard. Update the document. That’s not hype; that’s the job.
Definition of done for this argument
Use precise words and the disagreement mostly dissolves. Fine-tuning changes weights — nobody’s deployed agent does this from daily use. Memory persists context across sessions — shipping today. Process-layer RSI is memory plus an enforced loop that rewrites the system’s own instructions against a quality standard — that’s what we run, it’s documented, and you can install it yourself and watch the SOPs change. If you can find a claim on this page that doesn’t survive a thirty-second check, I’ll correct it here, in public, with a changelog entry — because that’s literally the system working as designed.
Updated June 11, 2026. This article is maintained by the same loop it describes: each significant critique becomes a revision, and each revision is logged.
