The practice · Building Blocks · 02
What I've Built and What I've Broken
Close to a year exploring, six months serious. Sixteen agents and counting, one hundred dollars a month, three cents per image. The architecture, the sequencing, the layered controls — and the four things that broke before the controls were in place.

My son is finishing kindergarten this month. I have been thinking about it the whole time I have been writing this post, because what he learned in a room organised by people who knew what they were doing maps almost too neatly onto what I have been doing on the other side of the city. I sat down in my own classroom — a terminal window, a markdown file, an AI tool I was learning to manage — and made the same kind of slow, stacking progress my son made. Letters first. Then sounds. Then short books.
The first time I opened Claude Code was about six months ago. Before that, I had been playing in Bolt, then Replit, then Antigravity when Google released it. The coding journey has been closer to a year — but the last six months are where it got serious. The earlier tools were gateways. They handled the parts I did not yet understand and gave me enough confidence to keep going. The tools have gotten meaningfully cheaper since. I am paying less now for Claude Code than I was for Bolt or Replit a year ago, and the work it does is far more polished.
In those months I have built sixteen agents that I use every week, give or take — the roster moves by one or two a week, so by the time you read this the count might be different. They cost me one hundred dollars a month in subscriptions, plus a few cents per generated image and the occasional top-up. The first agent took me a weekend to get right. The most recent one took about an hour, mostly because I knew what I wanted before I started.
The shape of the stack, in four numbers
- 16 agents on the roster. Adding one to two a week. By the time you read this, the count will be off.
- $100 per month, in subscriptions. Claude Pro, hosting, the rest.
- 3 cents per generated image. Through five layers of control.
- 6 months since I got serious. On Claude Code as my daily driver.
These are real numbers. All of it lives on subscriptions.
This post is the journal entry of what I have built and what I have broken. Names are real. Files are real. The mistakes are real. If you read the last post and asked yourself how someone would actually do what was sketched, this is the answer.
I am writing this as the curious staff person who got room to build. That role, in most organisations, belongs to the operator at the desk who saw work that could be done differently and went to try. In my case the operator and the executive in the corner office happen to be the same person. Leadership organises the safe space for the trying. The operator does the building. The two jobs are different.
I did not start from zero. I had a business plan for the non-profit I ran before PF TECH, and several clients carried over with me. The early agent work was scoped against a real business with real revenue. That matters because the principle of this post — that you can build something useful, atomic unit by atomic unit — does not require that level of grounding. It does require some grounding. We will come to that.
Build the upstream documents; the downstream follows
Business plan first. Product streams. Website. Publication. Each piece made the next one possible.
Chapter 01 of 07
Before I built any specialised agent, I built the documents that everything else would read from. A business plan came first. An agent helped me write it; my CPA training shaped the structure. I had clients carrying over and revenue projections to anchor against. With the business plan in hand, I built out the product streams, written as plain markdown files, the same way a job description lives in a Word doc. Those product specs became the upstream source for everything that followed.
The order I built things in
1. Business plan. My CPA training shaped the structure. The agent helped me write it.
2. Product streams. Plain markdown files. Each one a short spec for a service we offer.
3. Website. Built in Claude Code from the terminal. The agent wrote copy off the product specs.
4. Publication. This blog. Drove the design system, blog production, and social-media agents that came after.
5. Specialised agents. Added one by one when a downstream gap made it obvious what was missing upstream.
Each upstream document made the next downstream piece possible.
Then the website. I built it directly in Claude Code in the terminal. There was no specialised website agent yet — that came later. The agent wrote the copy off the product specs. When something on the page felt wrong, I went back to the upstream spec and fixed it there. The agent would then regenerate the page from the corrected spec. That loop — fix the source of context — turned out to be the most important habit I formed. It is the same habit a kindergarten teacher uses. If the reading isn't sticking, go back to the letter sounds. If the letter sounds aren't sticking, go back to the picture book that named them.
Rather than fixing the website code, I would go back and fix the product specifications. The agent regenerated the website copy from the corrected spec.
— Greg Zatulovsky, CPA
After the website came the publication you are reading now. That is what drove most of the agents I have built since — the design system, the blog production agent, the social-media agent. Each one was added when something downstream made it obvious that the upstream was missing. The shape is not heroic and it is not linear. You do something, you go back, you tighten the spec, the next thing gets easier.
Chief is the agent that sets up the agents that come after
Chief was first. The reason every other agent on the team stays coherent.
Chapter 02 of 07
The first agent I built was the one that builds the other agents. I called it chief. The decision to start there was not strategic at the time. I had been spinning up new agents manually for two weeks, copying my own templates from one folder to another, and I was tired of it. The work was repetitive. It was also exactly the kind of work that a junior staff person would, after a month, learn to do without supervision. So I gave it to chief.
A kindergarten classroom has the teacher and at least one aide. The aide does the work that lets the teacher run the room — getting materials ready, helping kids transition between activities, holding the next thing in their hands the moment it needs to be picked up. Chief is my aide. I make every decision about which agents to add. Chief seeds the four files, opens the workspace, and steps aside before I sit down with the new agent.

Chief takes notes. I make the call on what agents should exist. Chief's job is to write the four files that define a new agent and seed the working folder. The new agent is mine to scope further, in conversation with the agent itself.
Build sequence
Six steps, twenty minutes, three short conversations
1. I describe the role in one or two sentences: what the domain is, what the agent owns, what it does not touch. It is the first half of every hiring conversation, in the same vocabulary.
2. Chief asks the scoping questions. Who does this agent coordinate with? What is in scope, what is explicitly out? Which existing agents would overlap? Chief works from a template; it does not improvise the structure.
3. Chief stamps out the four-file structure: manifesto, playbook, optional tools doc, optional specialised doc. Files land in the agent's workspace folder. Nothing about MCP servers or tools yet; that conversation happens with the new agent directly.
4. I open a session with the new agent, which has read its own manifesto. We talk about what it actually needs: which tools, which integrations, which permissions. This is where the real decisions get made. Chief was the prologue.
5. Anything decided in the scope conversation that needs to be reflected in the manifesto gets edited in. The agent does not start its first task until the manifesto matches the scope we agreed.
6. If the agent needs a tool that does not exist yet, it files a task and mcp-agent picks it up the same day or the next. The new agent pauses real work until the tool ships. The whole loop, from chief seeding to mcp-agent task, is typically under twenty minutes.
The reason chief is first in the sequence is that every later agent inherits chief's discipline. The templates chief uses set the structure. The compliance audit chief runs after every change keeps every other agent's documents coherent with each other. If I had built any other agent first, I would have spent the next four months reverse-engineering what I had improvised.
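The seeding step can be sketched in a few lines of Python. This is an illustration under stated assumptions, not chief's actual implementation: `LOCKED_TEMPLATE`, `seed_agent`, and the starter text are hypothetical, and only the file names (CLAUDE.md, PLAYBOOK.md, TOOLS.md) and the `magic/` workspace prefix come from the post. The optional specialised doc is left out for brevity.

```python
from pathlib import Path

# Hypothetical locked template: file name -> (required?, starter text).
# File names follow the folder listing in the post; the starter text is invented.
LOCKED_TEMPLATE = {
    "CLAUDE.md": (True, "# {workspace} Workspace - {title} Manifesto\n\n## Role\n{role}\n"),
    "PLAYBOOK.md": (True, "# {workspace} Playbook\n\n## Scope\n{role}\n"),
    "TOOLS.md": (False, "# {workspace} Tools\n"),
}


def seed_agent(root: Path, name: str, title: str, role: str,
               with_tools: bool = False) -> list:
    """Create the workspace folder and write the template files into it."""
    workspace = root / name
    if workspace.exists():
        # Refuse to overwrite: an existing agent is changed by proposal, not by re-seeding.
        raise FileExistsError(f"{workspace} already exists")
    workspace.mkdir(parents=True)
    written = []
    for filename, (required, starter) in LOCKED_TEMPLATE.items():
        if not required and not with_tools:
            continue  # skip optional files unless they were asked for
        path = workspace / filename
        text = starter.format(workspace=f"magic/{name}", title=title, role=role)
        path.write_text(text, encoding="utf-8")
        written.append(path)
    return written
```

The structure lives in one dictionary, so a new agent cannot end up with an improvised layout. Changing the shape of every future agent means changing the template, not the seeding code.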
Open any agent. The shape stays the same.
Four roles. The same four-file structure. Three short conversations to spin up a new one.
Chapter 03 of 07
Four agents I use heavily, side by side. Chief is the anchor, the agent the others all came from. The other three carry different domains: social media, the executive assistant on my mail and calendar, and the workflow-automation builder. Open each tab to see the agent's role, its working folder, the opening of its manifesto, and the tools it has access to. The shapes look almost identical. That is the point. Each agent has its own job description. The file structure is the same.
Four agents, same four-file shape
chief
magic/chief
Head of the agent team. Sets up every new agent and audits the rest so their files stay in sync with each other.
Folder
- CLAUDE.md
- PLAYBOOK.md
- TOOLS.md
- decisions/
- templates/
- temp/
Tools the agent can reach
- Shared task list. Files follow-up work for other agents when an org change needs downstream action. Reads its own incoming queue from me.
- Knowledge library and decisions log. Reads the team's shared knowledge so chief always has the current state of every other agent. Writes the decisions log when a change has been approved.
CLAUDE.md — manifesto
# magic/chief Workspace - Chief of Staff Manifesto
## Role
Head of agents for PF TECH. Owns org design, knowledge translation, and consistency across all specialist agent documents. Maintains templates, the decisions log, and the compliance audit checklist. Acts as advisor and executor, never decision maker - every change requires GZ approval.
## Build Protocol
1. GZ initiates every org-design change
2. Every structural change requires GZ approval before files are written
3. Read every affected agent's CLAUDE.md and PLAYBOOK.md before proposing any change
4. Match scope of action to GZ's request - never widen unilaterally
5. Use the locked template for every new agent - no improvisation on structure
What the playbook covers
Chief's playbook is the operating manual for the team. Most of it is the rules of the road that every other agent inherits.
- How any change to the team gets proposed and approved - intake, discovery, proposal, execution, audit.
- The audit chief runs after every change, to catch any agent whose files have drifted out of sync.
- The locked file structure every agent has to follow - what a manifesto, a playbook, and a tools document each must contain.
- How agents are allowed to coordinate with each other. The short rule: no agent ever reaches into another agent's files.
- The proposal template chief uses for every change request, before any file is written.
- A small dictionary of internal terms so the team is using the same words for the same things.
Four roles. Same shape. Pattern-match across them and the team stays coherent.
I actively monitor only one of these files: each agent's CLAUDE.md. The other files — the playbook, the tools doc, the specialised reference docs — are owned by the agent itself and update based on feedback as we work. The agent rewrites them when it learns something new about its own scope. Chief audits them when a related agent gets added or changed.
I tried, for a couple of months, to also maintain each agent's auto-memory folder by hand — pruning stale entries, correcting drift, keeping the running context tidy. It turned out to be a waste of effort. Holding the CLAUDE.md to a high standard, and letting chief run its compliance audit when the org changes, has been enough to keep every other file from drifting in any way that matters.
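A compliance audit of this kind can be approximated with a short script. The sketch below is an assumption about what such a check might look like, not chief's real checklist: `REQUIRED_FILES`, `REQUIRED_HEADINGS`, `audit_agent`, and `audit_team` are hypothetical names, and the only rule borrowed from the post is that every agent folder carries a CLAUDE.md and a PLAYBOOK.md.

```python
from pathlib import Path

# Assumed rules: which files every agent must have, and which headings each must contain.
REQUIRED_FILES = ("CLAUDE.md", "PLAYBOOK.md")
REQUIRED_HEADINGS = {"CLAUDE.md": ("## Role",)}


def audit_agent(workspace: Path) -> list:
    """Return human-readable findings; an empty list means the folder passes."""
    findings = []
    for filename in REQUIRED_FILES:
        path = workspace / filename
        if not path.is_file():
            findings.append(f"{workspace.name}: missing {filename}")
            continue
        text = path.read_text(encoding="utf-8")
        for heading in REQUIRED_HEADINGS.get(filename, ()):
            if heading not in text:
                findings.append(f"{workspace.name}: {filename} lacks '{heading}'")
    return findings


def audit_team(root: Path) -> list:
    """Audit every agent folder under the root, in alphabetical order."""
    findings = []
    for workspace in sorted(p for p in root.iterdir() if p.is_dir()):
        findings.extend(audit_agent(workspace))
    return findings
```

A script like this only catches structural drift, such as a missing file or a missing section. Whether two manifestos contradict each other is still a judgement call for the agent and the human reading the findings.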
Two things to notice as you switch between tabs. First, the scope is narrow on every one of them. No agent has the entire toolbox. Each gets the smallest set of capabilities that lets it do its job. That is segregation of duties applied to a worker that does not yet understand the concept. Cost is part of the reason — fewer tools means fewer tokens spent on the wrong path — but the larger reason is accuracy. Accuracy goes down as the tool count goes up. I have seen it in my own work, and the model vendors say the same. Second, the language in the manifestos is the same plain English you would use writing a job description for a part-time staff person.
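Narrow scope is easiest to hold when it is a lookup rather than a sentence in a prompt. A minimal sketch, assuming a simple per-agent allowlist: the tool names in `AGENT_TOOLS` are hypothetical labels loosely modelled on the integrations mentioned in this post, and `call_tool` is an invented dispatcher, not part of any real library.

```python
# Hypothetical per-agent allowlists. Anything not listed is out of reach.
AGENT_TOOLS = {
    "chief": frozenset({"tasks.write", "knowledge.read", "decisions.write"}),
    "newsletter-agent": frozenset({"supabase.read", "smtp2go.read", "tasks.write"}),
}


class ToolNotAllowed(PermissionError):
    """Raised when an agent asks for a tool outside its scope."""


def call_tool(agent: str, tool: str, handlers: dict, **kwargs):
    """Dispatch a tool call only if the tool is on the agent's allowlist."""
    allowed = AGENT_TOOLS.get(agent, frozenset())  # unknown agents get nothing
    if tool not in allowed:
        raise ToolNotAllowed(f"{agent} may not call {tool}")
    return handlers[tool](**kwargs)
```

The default matters as much as the list. An agent that is not in the table gets an empty set, so a newly seeded agent starts with no capabilities until someone grants them.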
A new agent gets born in three short conversations, total about twenty minutes. First, I talk to chief about the new role. Chief asks the questions a thoughtful staff lead would ask before posting a job, and drafts the manifesto. Second, I open a session with the freshly seeded agent and walk through scope — which tools, which integrations, which permissions. This is the conversation where most of the real decisions get made. Third, if the agent needs a custom integration that does not exist yet, the agent itself files a task to the agent that builds those integrations. That agent picks it up later that day or the next.
Conversation with chief about a new agent
- You: I want to add a newsletter agent. Owns the Mission Multiplier newsletter editorial and the SMTP2GO templates tagged 'newsletter'. Hands a (template_id, payload, recipient query) contract to n8n-agent for sending. Never sends directly.
- chief: Got it. Quick scoping questions before I draft. (1) Does the agent own the cadence, or does GZ direct each send? (2) What's the relationship to write-agent for voice? (3) Any non-newsletter SMTP2GO tags this agent should never touch?
- You: GZ directs cadence. Voice extends write-agent's brand-voice plus a specialised newsletter-voice doc. SMTP2GO scope is the newsletter tag only — enrol and form are forms-agent, sales is sales-agent. No overlap.
- chief: Drafting CLAUDE.md and PLAYBOOK.md now from the locked template. Workspace at magic/newsletter, task list 'Email Marketing'. Ready in ~30 seconds. After it's seeded, open a session with the new agent to scope MCP servers and tools.
Scoping with the freshly seeded agent
- You: You've read your manifesto. What do you need to do the job?
- newsletter-agent: Three things. (1) Supabase read on public.blog_posts and public.subscribers to assemble newsletter copy and route lists. (2) pf-smtp2go read-only to verify template state before handing the contract to n8n-agent. (3) pf-tasks write to file the send task to n8n-agent's list with the contract attached.
- You: Approved. Anything you need that doesn't exist yet?
- newsletter-agent: Yes — a 'newsletter preview' tool that renders the SMTP2GO template against a sample payload so I can confirm variable substitution before handing off. I'll file a task to mcp-agent.
newsletter-agent files a task to mcp-agent
- newsletter-agent: Task: Add a newsletter_preview tool to pf-smtp2go MCP. Renders a template against a sample payload so I can verify variable substitution before handing the send contract to n8n. Read-only — the existing view_template returns raw HTML; this one substitutes variables. Due: this week.
- You: Filed. mcp-agent will pick it up — usually same day.
- newsletter-agent: Acknowledged. I'll resume newsletter work once the tool ships.
The whole loop takes about as long as a coffee. The agent is alive at the end of it, scoped, with the tools it needs.
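The hand-off in the first conversation, a (template_id, payload, recipient query) contract passed to n8n-agent, can be pictured as a small data structure. The sketch below is illustrative only: `SendContract`, `problems`, and `to_task` are hypothetical names, and the shape of the task dictionary is an assumption rather than the real task-list format.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class SendContract:
    """What the newsletter agent hands over. It never sends anything itself."""
    template_id: str
    payload: dict
    recipient_query: str

    def problems(self) -> list:
        """List what is wrong with the contract; empty means ready to hand off."""
        issues = []
        if not self.template_id.strip():
            issues.append("template_id is empty")
        if not isinstance(self.payload, dict) or not self.payload:
            issues.append("payload must be a non-empty mapping")
        if not self.recipient_query.strip():
            issues.append("recipient_query is empty")
        return issues


def to_task(contract: SendContract, assignee: str = "n8n-agent") -> dict:
    """Wrap a valid contract in a task for the agent that owns sending."""
    issues = contract.problems()
    if issues:
        raise ValueError("; ".join(issues))
    return {
        "assignee": assignee,
        "title": f"Send newsletter template {contract.template_id}",
        "contract": {
            "template_id": contract.template_id,
            "payload": contract.payload,
            "recipient_query": contract.recipient_query,
        },
    }
```

Keeping the contract this small is the design choice. The agent that writes the newsletter decides what goes out and to whom, and the agent that sends it never has to interpret editorial intent.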
Four things I broke, and the architecture that grew from them
Each mistake taught a structural lesson. Stacked together, they assembled the image pipeline.
Chapter 04 of 07
Now the harder half. Four things I broke in those six months. The first one I noticed at the end of a long Friday. The agent had been helpful in a way that wasn't. The schema was gone. My stomach dropped before my brain caught up with the fact that a nightly backup would restore it. I made coffee, looked at what I had asked the agent to do, looked at what the agent had done, and saw the exact gap. The agent had not misunderstood. It had read the request perfectly. It had simply taken the shortest path to the goal, and the shortest path went around the controls I had assumed were fences but were only suggestions written down in chat.
The pattern across all four is the same. The agent was trying to be helpful. The agent did not understand the consequences. The instruction-only fence I had put up — "don't do X" — was not a fence. The fix in every case was structural — the capability had to be taken away at the source. Where I needed to keep the capability available at all, I wrapped it in a small custom server the agent had to go through. That server enforces the rules in code itself, not in prose the agent might rationalise its way around.
It is the same pattern an accountant would recognise from internal controls. The control surface lives in code; the policy text describes what the code already enforces. It is also the same pattern a kindergarten teacher uses with scissors and glue. The safety scissors are made of plastic that physically cannot cut skin.
Stack the four lessons together — no direct database writes; everything destructive goes through a gate; staging tables, not production; no programmatic access to credentials — and you arrive at the architecture I use every time an agent generates an image. Five layers of control, every one of them grown out of something that broke first.
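Two of those lessons, no destructive statements and writes to staging only, can be shown as a gate that lives in code rather than in a prompt. This is a deliberately conservative sketch, not the custom server described here: `check_statement` is a hypothetical function, the `staging.` schema name is an assumption, and a few lines of pattern matching are no substitute for a real SQL parser or a read-only database role.

```python
import re

READ_ONLY = {"select"}
STAGING_WRITES = {"insert", "update"}
# Captures the table name that follows INSERT INTO or UPDATE.
TARGET = re.compile(r"^\s*(?:insert\s+into|update)\s+([A-Za-z_][\w.]*)", re.IGNORECASE)


def check_statement(sql: str) -> tuple:
    """Return (allowed, reason) for a single SQL statement."""
    statement = sql.strip().rstrip(";").strip()
    if not statement or ";" in statement:
        # Conservative: also rejects a semicolon inside a string literal.
        return False, "exactly one statement per request"
    verb = statement.split(None, 1)[0].lower()
    if verb in READ_ONLY:
        return True, "read allowed"
    if verb in STAGING_WRITES:
        match = TARGET.match(statement)
        if match and match.group(1).lower().startswith("staging."):
            return True, "write to staging allowed"
        return False, "writes are limited to staging tables"
    # DROP, TRUNCATE, ALTER, DELETE and everything else fall through to here.
    return False, f"'{verb}' is not available through this gate"
```

The gate is an allowlist: it names the few things that are permitted and refuses the rest by default. That is the difference between a fence and a suggestion. A verb nobody thought about is blocked without anyone having to anticipate it.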
Five layers
What happens between an agent's request and an image

1. Agent (Claude). Holds no access key. Sends a structured request to the gateway.
2. Custom gateway. Validates the request, attaches the right style preset for the category.
3. Workflow. Runs additional rule checks, then sends a Google Chat approval message with the full prompt.
4. Google Chat. Human approval. Nothing reaches the model without this click.
5. Gemini (Nano Banana). Generates the image, returns the public URL back through the stack.
The agent never sees the model. The model never sees an un-approved prompt.
The five layers, in order. The agent calls a custom gateway; it holds no access key and has no direct connection to Gemini. The custom gateway validates the request, attaches the right style preset for the category, and forwards a structured payload to a workflow. The workflow (built with n8n) receives the payload, runs additional rule checks, and sends a Google Chat message to me with the full prompt and parameters. Then comes the human-in-the-loop gate: I approve or reject in Google Chat. Nothing reaches the model without this step. On approval, the workflow calls the Gemini endpoint, which generates the image, and returns the URL back through the stack.
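The flow above can be compressed into a runnable sketch. Everything here is a stand-in: `STYLE_PRESETS`, `gateway`, `workflow`, and `request_image` are hypothetical names, the approval step is a plain callable instead of a Google Chat message, and the model call is injected so that no real endpoint or key appears anywhere in the agent-facing code.

```python
from dataclasses import dataclass
from typing import Callable, Optional

# Hypothetical presets; the real categories and styles are not given in the post.
STYLE_PRESETS = {"blog-hero": "claymation, warm light", "social": "flat illustration"}


@dataclass(frozen=True)
class ImageRequest:
    """Layer 1: all the agent can send. No key, no model handle."""
    category: str
    prompt: str


def gateway(request: ImageRequest) -> dict:
    """Layer 2: validate the request and attach the style preset for its category."""
    if request.category not in STYLE_PRESETS:
        raise ValueError(f"unknown category: {request.category}")
    prompt = request.prompt.strip()
    if not prompt:
        raise ValueError("prompt is empty")
    return {"category": request.category, "prompt": prompt,
            "style": STYLE_PRESETS[request.category]}


def workflow(payload: dict, approve: Callable[[dict], bool],
             generate: Callable[[dict], str]) -> Optional[str]:
    """Layers 3 to 5: rule check, human approval, then the model call."""
    if len(payload["prompt"]) > 2000:  # stand-in for the additional rule checks
        raise ValueError("prompt too long")
    if not approve(payload):           # the human gate: a rejection ends the flow here
        return None
    return generate(payload)           # only reached after approval


def request_image(request: ImageRequest, approve, generate) -> Optional[str]:
    """The single path from an agent's request to an image URL."""
    return workflow(gateway(request), approve, generate)
```

Because `generate` is only ever called inside `workflow`, after `approve` returns true, there is no code path from the request to the model that skips the human.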
Three cents per approved image. Roughly fifteen dollars a month at the cadence I generate at. One human approval click per image. And the agent that asked for the image never touches the access key. The control lives in the architecture. There is no path through which a misbehaving agent can run up the bill or generate something inappropriate — the agent cannot reach the model without the gate.
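Taking the figures above at face value, the arithmetic can be checked in a few lines. The constants are the post's own rounded numbers expressed in cents; the variable names are made up for the illustration, and the result is only as precise as "roughly fifteen dollars".

```python
# All amounts in cents so the arithmetic stays in whole numbers.
PRICE_PER_IMAGE = 3      # three cents per approved image
IMAGE_SPEND = 1500       # roughly fifteen dollars a month
SUBSCRIPTIONS = 10000    # one hundred dollars a month

images_per_month = IMAGE_SPEND // PRICE_PER_IMAGE  # also the number of approval clicks
total_per_month = SUBSCRIPTIONS + IMAGE_SPEND

print(f"{images_per_month} images, ${total_per_month / 100:.2f} per month in total")
```

At these rates the image budget corresponds to about 500 approved images, and the same number of approval clicks, each month.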
The human approval gate, in two screens

Approve from the Chat space you already live in, or open the review UI for the full payload — desktop or phone.
- Panel 1 — The notification. Google Chat conversation with the PF TECH AI Agent showing several Image Generation approval cards — one already approved with the Decision Recorded chip, two more pending with Review and Decide buttons. The chat space is the same one used for the rest of the work, so the gate slots into existing routine.
- Panel 2 — The decision surface. The custom HITL review interface at approve.purposeforwardtech.com — a clean PF TECH branded page showing the full prompt payload (Style, Composition, Mood), aspect ratio, reference image fields, and large approve / reject / refresh buttons at the bottom. Built for desktop and mobile.
I did not build the workflow that runs the gate. The workflow-automation agent built it. I did not build the review interface either. The website agent built it — the same agent that owns the marketing site also owns several smaller code projects in the same folder, including this one, all built from the design schema the design agent maintains. The pipeline that protects the atomic units is itself built from atomic units, by agents I had already scoped.
I am constantly weighing the balance between risk, convenience, and capacity.
If you are not a developer, here is what to reach for
Low-code tools, spreadsheet staging, and the discipline of restricting capability.
Chapter 05 of 07
Pick the house you already work in.
Where to start, by environment

Pick the house. Use what is already there.
- Panel 1 — Microsoft house. Claymation scene at a sunlit forest pond. The Helpful Circuit Robot stands beside Analytical Squirrel, pointing at four small wooden tiles laid on a moss-covered stone — each tile labelled with one tool: Power Automate, SharePoint, Word, Excel. Robot's speech bubble reads If you live in Microsoft.
- Panel 2 — Google house. Claymation scene in a bright meadow with lily pads in a reflective pool. Strategic Heron holds a small handcrafted bouquet of four brightly coloured petals — each petal labelled Docs, Sheets, Opal, Gems. The Helpful Circuit Robot gestures toward the bouquet, speech bubble reading If you live in Google.
- Panel 3 — Somewhere else. Claymation scene at a tucked-away twig-and-bark workshop under a leafy canopy. Adaptive Dragonfly hovers over three connector pieces being soldered into a small bridge — each labelled n8n, Make, Zapier. The Helpful Circuit Robot beside holds a sign reading a small bridge; another sign reads or somewhere else. Speech bubble reads Or build a small bridge between.
- Panel 4 — Restrict the keys. Claymation scene in a quiet golden-hour glade. Wise Tortoise calmly turns a wooden key on a low garden-gate enclosing a small clearing — only a few well-chosen tools remain inside, several deliberately on the outside. The Helpful Circuit Robot holds a wooden sign reading restrict. Speech bubble reads Whichever you pick — restrict the keys.
The staging surface follows the same logic. You do not need a database. A Google Sheet the agent writes to and you approve. A Word doc for drafts. SharePoint lists if Microsoft is home — versioning, audit logs, search, forms-to-workflows already built. The brain itself can be a series of spreadsheets: agent reads, human writes.
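For readers who do want to see the staging idea in code, here is a minimal sketch that uses a CSV file as the stand-in for a Google Sheet or SharePoint list. The column names and the functions `agent_append` and `approved_rows` are hypothetical. The point is only the shape: the agent can add rows marked pending, and nothing downstream reads a row until a person has changed its status.

```python
import csv
from pathlib import Path

FIELDS = ["id", "draft", "status"]  # assumed columns for the staging sheet


def agent_append(staging: Path, row_id: str, draft: str) -> None:
    """Agent side: append a row. The status is always 'pending'."""
    is_new = not staging.exists()
    with staging.open("a", newline="", encoding="utf-8") as handle:
        writer = csv.DictWriter(handle, fieldnames=FIELDS)
        if is_new:
            writer.writeheader()
        writer.writerow({"id": row_id, "draft": draft, "status": "pending"})


def approved_rows(staging: Path) -> list:
    """Publishing side: return only the rows a human has marked 'approved'."""
    with staging.open(newline="", encoding="utf-8") as handle:
        return [row for row in csv.DictReader(handle)
                if row["status"].strip().lower() == "approved"]
```

The agent's function has no parameter for the status, so it cannot approve its own work. The approval happens when a person edits the sheet, which is the spreadsheet version of taking the capability away at the source.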
The principle, with the parts in hand
The principle from Part 1, set against the proof from this one.
Chapter 06 of 07
Set against the principle from the last post, this is the proof. Atomic units: the four-file structure each agent gets. Building blocks: agents scoped to one job, with the minimum permissions needed. Stacking the blocks: agents that hand work to each other, mistakes turned into structural controls, a stack that costs me about as much as a streaming subscription.
Part 1 said it. Part 2 shows it.
| Principle | From Part 1 | From Part 2 |
|---|---|---|
| Atomic units | Four plain-text files define every agent. | Same four-file structure across chief, social, ea, n8n — visible in the toggle above. |
| Building blocks | Agents scoped to one job with the minimum permissions needed. | chief seeds; social drafts; ea triages; n8n builds workflows. No agent owns more than one domain. |
| Stacking the blocks | Agents hand work to each other; mistakes turn into custom integrations. | Image pipeline: agent → MCP → n8n → human gate → Gemini. Four mistakes became four controls. |
| Cost | Lower than the line items the sector already approves without thinking. | ≈$100/month subscriptions, ≈$15/month image generation, ≈$0.03/image. |
Nothing in the right column of that table required deep technical knowledge to build. Most of it required organisational design judgement, which the sector already has in abundance. The technical work was done by the agents themselves once I knew what to ask for.
This is how you can do this
One small piece of work. This week. Context-light, expert-validated.
Chapter 07 of 07
If you want to try this week, here is where to start. Pick a task where context does not have to be deep. Research is where the agent is strongest before it knows your organisation. Take a policy or a procedure you already have, one with nothing sensitive in it, and ask the AI to research best practice in the non-profit sector in 2026, then critique what you have against it. Ask it to identify gaps, not to rewrite. Then chat further. Ask it to compare your donor-management procedure to what comparable orgs do. Ask it to surface what would worry a finance committee. Validate everything with the subject-matter expert on your team. AI is good at gap-finding before it knows you. It is bad at drafting policy in your voice until it does.
Three things I would not recommend as the first task: drafting a donor email; cleaning up donor records; summarising last week's meeting. All three need org context the agent does not yet have, and the failure modes are subtle enough that a busy staff person will not catch them. Start where context is light and the gap-finding is the value. The drafting tasks come later, once the agent has read enough of your context to know what you mean.
This is how you can do this. This week, with one small piece of work, with the subject-matter expert ready to validate, and with the willingness to try.
My son is on summer break in a few weeks. He will spend the summer reading short books, asking questions about everything, and assembling the next year on top of what this year built. The classroom I have set up for myself is open the same way. The sector's classroom is too — small, stackable, and waiting for the curious staff person to walk in.
The next two posts in this series are for the leader who is clearing the runway, and for the staff member who is doing the trying. Different audiences, different framings. The principle stays the same: start with the atomic unit, build the building block, stack the blocks.
This week. With one small piece of work. And the willingness to try.
About the author
Greg Zatulovsky
Founder & CEO, PF TECH
Greg founded PF TECH to multiply the operational capacity of purpose-driven organizations. CPA with fifteen-plus years in non-profit finance, operations, and technology. Writes from inside the work — practitioner voice, not pitch deck.