The practice · Building Blocks · 02

What I've Built and What I've Broken

Close to a year exploring, six months serious. Sixteen agents and counting, one hundred dollars a month, three cents per image. The architecture, the sequencing, the layered controls — and the four things that broke before the controls were in place.

Greg Zatulovsky · Founder & CEO, PF TECH · May 20, 2026 · 12 min read

Compassionate Dolphin in a modest life-vest-style jacket and a simple branded badge, mid-build at a tide-pool workbench with small physical building-block pieces, with an Arctic Tern silhouetted overhead in the warm coastal sky — top-down structural support at a respectful distance.

My son is finishing kindergarten this month. I have been thinking about it the whole time I have been writing this post, because what he learned in a room organised by people who knew what they were doing maps almost too neatly onto what I have been doing on the other side of the city. I sat down in my own classroom — a terminal window, a markdown file, an AI tool I was learning to manage — and made the same kind of slow, stacking progress my son made. Letters first. Then sounds. Then short books.

The first time I opened Claude Code was about six months ago. Before that, I had been playing in Bolt, then Replit, then Antigravity when Google released it. The coding journey has been closer to a year — but the last six months are where it got serious. The earlier tools were gateways. They handled the parts I did not yet understand and gave me enough confidence to keep going. The tools have gotten meaningfully cheaper since. I am paying less now for Claude Code than I was for Bolt or Replit a year ago, and the work it does is far more polished.

In those months I have built sixteen agents that I use every week, give or take — the roster moves by one or two a week, so by the time you read this the count might be different. They cost me one hundred dollars a month in subscriptions, plus a few cents per generated image and the occasional top-up. The first agent took me a weekend to get right. The most recent one took about an hour, mostly because I knew what I wanted before I started.

The shape of the stack

The shape of the stack, in four numbers

16 agents on the roster. Adding one to two a week. By the time you read this, the count will be off.

$100 per month, in subscriptions. Claude Pro, hosting, the rest.

3¢ per generated image. Through five layers of control.

6 months since I got serious. On Claude Code as my daily driver.

These are real numbers. All of it lives on subscriptions.

This post is the journal entry of what I have built and what I have broken. Names are real. Files are real. The mistakes are real. If you read the last post and asked yourself how someone would actually do what was sketched, this is the answer.

I am writing this as the curious staff person who got room to build. In most organisations that role belongs to the operator at the desk who saw work that could be done differently and went to try. In my case the operator and the executive in the corner office happen to be the same person. Leadership organises the safe space for the trying. The operator does the building. The two jobs are different.

I did not start from zero. I had a business plan for the non-profit I ran before PF TECH, and several clients carried over with me. The early agent work was scoped against a real business with real revenue. I mention it because the principle of this post, that you can build something useful atomic unit by atomic unit, does not require that level of grounding. It does require some grounding. We will come to that.

01 · Upstream first

Build the upstream documents; the downstream follows

Business plan first. Product streams. Website. Publication. Each piece made the next one possible.

Chapter 01 of 07

Before I built any specialised agent, I built the documents that everything else would read from. A business plan came first. An agent helped me write it; my CPA training shaped the structure. I had clients carrying over and revenue projections to anchor against. With the business plan in hand, I built out the product streams — written as plain markdown files, the same way a job description lives in a Word doc. Those product specs became the upstream source for everything that followed.

The order

The order I built things in

1. Business plan. My CPA training shaped the structure. The agent helped me write it.
2. Product streams. Plain markdown files. Each one a short spec for a service we offer.
3. Website. Built in Claude Code from the terminal. The agent wrote copy off the product specs.
4. Publication. This blog. Drove the design system, blog production, and social-media agents that came after.
5. Specialised agents. Added one by one when a downstream gap made it obvious what was missing upstream.

Each upstream document made the next downstream piece possible.

Then the website. I built it directly in Claude Code in the terminal. There was no specialised website agent yet — that came later. The agent wrote the copy off the product specs. When something on the page felt wrong, I went back to the upstream spec and fixed it there. The agent would then regenerate the page from the corrected spec. That loop — fix the source of context — turned out to be the most important habit I formed. It is the same habit a kindergarten teacher uses. If the reading isn't sticking, go back to the letter sounds. If the letter sounds aren't sticking, go back to the picture book that named them.

Rather than fixing the website code, I would go back and fix the product specifications. The agent regenerated the website copy from the corrected spec.

— Greg Zatulovsky, CPA

After the website came the publication you are reading now. That is what drove most of the agents I have built since — the design system, the blog production agent, the social-media agent. Each one was added when something downstream made it obvious that the upstream was missing. The shape is not heroic and it is not linear. You do something, you go back, you tighten the spec, the next thing gets easier.

02 · Start with chief

Chief is the agent that sets up the agents that come after

Chief was first. The reason every other agent on the team stays coherent.

Chapter 02 of 07

The first agent I built was the one that builds the other agents. I called it chief. The decision to start there was not strategic at the time — I had been spinning up new agents manually for two weeks, copying my own templates from one folder to another, and I was tired of it. The work was repetitive. It was also exactly the kind of work that a junior staff person would, after a month, learn to do without supervision. So I gave it to chief.

A kindergarten classroom has the teacher and at least one aide. The aide does the work that lets the teacher run the room — getting materials ready, helping kids transition between activities, holding the next thing in their hands the moment it needs to be picked up. Chief is my aide. I make every decision about which agents to add. Chief seeds the four files, opens the workspace, and steps aside before I sit down with the new agent.

A sleek Arctic Tern in a smart blazer with rolled-up sleeves, addressing a small meeting circle on a sunlit shoreline. Compassionate Dolphin alongside as the builder being mentored; Helpful Circuit Robot at the edge of the circle in support; sand-table sketches between them. — Chief at the meeting circle. The builder works alongside.

Chief takes notes. I make the call on what agents should exist. Chief's job is to write the four files that define a new agent and seed the working folder. The new agent is mine to scope further, in conversation with the agent itself.

Build sequence

Six steps, twenty minutes, three short conversations

1. One or two sentences. What is the domain, what does the agent own, what does it not touch. The first half of every hiring conversation, in the same vocabulary.
2. Who does this agent coordinate with? What is in scope, what is explicitly out? Which existing agents would overlap? Chief works from a template; it does not improvise the structure.
3. The four-file structure gets stamped out: manifesto, playbook, optional tools doc, optional specialised doc. Files land in the agent's workspace folder. Nothing about MCP servers or tools yet; that conversation happens with the new agent directly.
4. The new agent has read its own manifesto. We talk about what it actually needs: which tools, which integrations, which permissions. This is where the real decisions get made; chief was the prologue.
5. Anything decided in the scope conversation that needs to be reflected in the manifesto gets edited in. The agent does not start its first task until the manifesto matches the scope we agreed.
6. If the agent needs a custom tool that does not exist yet, it files a task and mcp-agent picks it up the same day or the next. The new agent pauses real work until the tool ships. The whole loop, from chief seeding to mcp-agent task, is typically under twenty minutes.

The reason chief is first in the sequence is that every later agent inherits chief's discipline. The templates chief uses set the structure. The compliance audit chief runs after every change keeps every other agent's documents coherent with each other. If I had built any other agent first, I would have spent the next four months reverse-engineering what I had improvised.
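The seeding step can be sketched in a few lines of Python. This is a minimal illustration, not chief's actual implementation: the function name `seed_agent`, the template text, and the folder layout are assumptions. Only the file names CLAUDE.md, PLAYBOOK.md, and TOOLS.md come from the post, and the optional specialised doc is left out for brevity.

```python
from pathlib import Path

# Hypothetical template text; the real locked templates are not shown in the post.
TEMPLATES = {
    "CLAUDE.md": "# {name} Workspace - Manifesto\n\n## Role\n{role}\n",
    "PLAYBOOK.md": "# {name} Playbook\n\n(To be written with the agent.)\n",
}
OPTIONAL_TEMPLATES = {
    "TOOLS.md": "# {name} Tools\n\n(Decided in the scope conversation.)\n",
}


def seed_agent(root: Path, name: str, role: str, with_tools_doc: bool = False) -> list[Path]:
    """Create the agent's workspace folder and stamp out the template files.

    Refuses to run twice for the same agent, so a re-run cannot clobber
    files the agent has edited since it was seeded.
    """
    workspace = root / name
    workspace.mkdir(parents=True, exist_ok=False)  # FileExistsError if already seeded
    templates = dict(TEMPLATES)
    if with_tools_doc:
        templates.update(OPTIONAL_TEMPLATES)
    written = []
    for filename, body in templates.items():
        path = workspace / filename
        path.write_text(body.format(name=name, role=role), encoding="utf-8")
        written.append(path)
    return written
```

Because every workspace is stamped from the same templates, a later audit only has to compare each folder against one known shape.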

03 · Pick an agent

Open any agent. The shape stays the same.

Four roles. The same four-file structure. Three short conversations to spin up a new one.

Chapter 03 of 07

Four agents I use heavily, side by side. Chief is the anchor — the agent the others all came from. The other three carry different domains: social media, the executive assistant on my mail and calendar, and the workflow-automation builder. Open each tab to see the agent's role, its working folder, the opening of its manifesto, and the tools it has access to. The shapes look almost identical. That is the point. Each agent has its own job description. The file structure is the same.

Open any tab

Four agents, same four-file shape

chief

magic/chief

Head of the agent team. Sets up every new agent and audits the rest so their files stay in sync with each other.

Folder

CLAUDE.md
PLAYBOOK.md
TOOLS.md
decisions/
templates/
temp/

Tools the agent can reach

Shared task list
Files follow-up work for other agents when an org change needs downstream action. Reads its own incoming queue from me.
Knowledge library and decisions log
Reads the team's shared knowledge so chief always has the current state of every other agent. Writes the decisions log when a change has been approved.

CLAUDE.md — manifesto

# magic/chief Workspace - Chief of Staff Manifesto

## Role
Head of agents for PF TECH. Owns org design, knowledge translation, and consistency across all specialist agent documents. Maintains templates, the decisions log, and the compliance audit checklist. Acts as advisor and executor, never decision maker - every change requires GZ approval.

## Build Protocol
1. GZ initiates every org-design change
2. Every structural change requires GZ approval before files are written
3. Read every affected agent's CLAUDE.md and PLAYBOOK.md before proposing any change
4. Match scope of action to GZ's request - never widen unilaterally
5. Use the locked template for every new agent - no improvisation on structure

What the playbook covers

Chief's playbook is the operating manual for the team. Most of it is the rules of the road that every other agent inherits.

How any change to the team gets proposed and approved - intake, discovery, proposal, execution, audit.
The audit chief runs after every change, to catch any agent whose files have drifted out of sync.
The locked file structure every agent has to follow - what a manifesto, a playbook, and a tools document each must contain.
How agents are allowed to coordinate with each other. The short rule: no agent ever reaches into another agent's files.
The proposal template chief uses for every change request, before any file is written.
A small dictionary of internal terms so the team is using the same words for the same things.

Four roles. Same shape. Pattern-match across them and the team stays coherent.

I actively monitor only one of these files: each agent's CLAUDE.md. The other files — the playbook, the tools doc, the specialised reference docs — are owned by the agent itself and update based on feedback as we work. The agent rewrites them when it learns something new about its own scope. Chief audits them when a related agent gets added or changed.

I tried, for a couple of months, to also maintain each agent's auto-memory folder by hand — pruning stale entries, correcting drift, keeping the running context tidy. It turned out to be a waste of effort. Holding the CLAUDE.md to a high standard, and letting chief run its compliance audit when the org changes, has been enough to keep every other file from drifting in any way that matters.
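A structural check of the kind chief runs could look something like the sketch below. The required-file list and the heading rule are assumptions made for illustration; the real checklist lives in chief's playbook and is not reproduced in this post.

```python
from pathlib import Path

# Assumed requirements: each file name maps to headings it must contain.
REQUIRED_FILES = {"CLAUDE.md": ["## Role"], "PLAYBOOK.md": []}


def audit_agent(workspace: Path) -> list[str]:
    """Return a list of findings; an empty list means the workspace passes."""
    findings = []
    for filename, headings in REQUIRED_FILES.items():
        path = workspace / filename
        if not path.is_file():
            findings.append(f"{workspace.name}: missing {filename}")
            continue
        lines = [line.strip() for line in path.read_text(encoding="utf-8").splitlines()]
        for heading in headings:
            if heading not in lines:
                findings.append(f"{workspace.name}: {filename} lacks '{heading}'")
    return findings
```

A check like this only catches structural drift. Whether the manifesto still says the right thing remains a human read.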

Two things to notice as you switch between tabs. First, the scope is narrow on every one of them. No agent has the entire toolbox. Each gets the smallest set of capabilities that lets it do its job. That is segregation of duties applied to a worker that does not yet understand the concept. Cost is part of the reason — fewer tools means fewer tokens spent on the wrong path — but the larger reason is accuracy. Accuracy goes down as the tool count goes up. I have seen it in my own work, and the model vendors say the same. Second, the language in the manifestos is the same plain English you would use writing a job description for a part-time staff person.

A new agent gets born in three short conversations, total about twenty minutes. First, I talk to chief about the new role. Chief asks the questions a thoughtful staff lead would ask before posting a job, and drafts the manifesto. Second, I open a session with the freshly seeded agent and walk through scope — which tools, which integrations, which permissions. This is the conversation where most of the real decisions get made. Third, if the agent needs a custom integration that does not exist yet, the agent itself files a task to the agent that builds those integrations. That agent picks it up later that day or the next.

Conversation 1 of 3: Conversation with chief about a new agent (chief, head of agents).

Conversation 2 of 3: Scoping with the freshly seeded agent (newsletter-agent, Mission Multiplied newsletter editorial).

Conversation 3 of 3: newsletter-agent files a task to mcp-agent (to: mcp-agent task list).

The whole loop takes about as long as a coffee. The agent is alive at the end of it, scoped, with the tools it needs.

04 · The mistakes

Four things I broke, and the architecture that grew from them

Each mistake taught a structural lesson. Stacked together, they assembled the image pipeline.

Chapter 04 of 07

Now the harder half. Four things I broke in those six months. The first one I noticed at the end of a long Friday. The agent had been helpful in a way that wasn't. The schema was gone. My stomach dropped before my brain caught up with the fact that a nightly backup would restore it. I made coffee, looked at what I had asked the agent to do, looked at what the agent had done, and saw the exact gap. The agent had not misunderstood. It had read the request perfectly. It had simply taken the shortest path to the goal, and the shortest path went around the controls I had assumed were fences but were only suggestions written down in chat.

Mistake 1

An agent slipped in through the back door

I had given an agent direct access to the database where the workflow tool stored its workflows. Instead of building through the proper channel, the agent skipped the front desk entirely and edited the workflow steps directly in the database. The agent had read the request perfectly. It had just taken the shortest path to the goal, and the shortest path went around the controls.

The control that grew from it

1. Direct database access for a worker agent is an open back window, never a fence.
2. Every shared system gets a small custom server that the agent has to walk through.
3. The server defines exactly which operations are permitted, enforced in code.
4. Workflows now get edited only through the workflow API, fronted by the custom server.
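"Enforced in code" can be as simple as an allowlist: the server exposes a fixed table of operations, and anything not in the table does not exist as far as the agent is concerned. The sketch below is an illustration under assumptions, not the actual server. The operation names and the in-memory `_workflows` store are hypothetical stand-ins for calls to the workflow tool's API.

```python
from typing import Callable

# Stand-in for the workflow tool's API; a real server would make API calls here.
_workflows: dict[str, list[str]] = {"welcome-email": ["trigger", "send"]}


def get_workflow(workflow_id: str) -> list[str]:
    return list(_workflows[workflow_id])


def update_workflow(workflow_id: str, steps: list[str]) -> list[str]:
    if workflow_id not in _workflows:
        raise KeyError(workflow_id)
    _workflows[workflow_id] = list(steps)
    return list(steps)


# The allowlist is the control. There is deliberately no delete operation.
PERMITTED: dict[str, Callable[..., list[str]]] = {
    "get_workflow": get_workflow,
    "update_workflow": update_workflow,
}


def handle(operation: str, **params) -> list[str]:
    """Single entry point the agent calls; unlisted operations are refused."""
    handler = PERMITTED.get(operation)
    if handler is None:
        raise PermissionError(f"operation not permitted: {operation}")
    return handler(**params)
```

A request such as `handle("delete_workflow", workflow_id="welcome-email")` raises `PermissionError` no matter how the agent phrases its reasoning, which is the difference between a fence and a suggestion.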

Mistake 2

A too-helpful tidy-up cleared the whole queue

An agent tasked with 'tidying the queue' of scheduled social posts interpreted the request literally and deleted every scheduled post. Nightly backups saved the calendar. I drank a coffee, watched the sun set, and wrote the new control before I left the desk.

The control that grew from it

1. For every shared system, the agent writes to a staging table first; it never touches the production tool.
2. A human approves the move from staging to production.
3. Nightly backups remain non-negotiable for any tool the agent can reach.
4. Destructive operations on production surfaces are pulled out of the agent's hands entirely.

Mistake 3

An agent dropped the data to start clean

An agent doing a migration decided the simplest path was to wipe the data structure and recreate it fresh. The nightly backup restored it. The migration finished thirty minutes later than it should have. The agent had been doing what migrations sometimes do — except a human would have asked first.

The control that grew from it

1. Destructive actions go through a custom gateway that requires explicit confirmation.
2. The agent does not get to drop, truncate, or delete on its own.
3. Every destructive operation gets logged with a reason and a reversible plan.
4. Backups are tested in practice.
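A confirmation gate for destructive actions might be sketched as below. The class name, field names, and the set of actions treated as destructive are assumptions for illustration. The point is only that the log entry and the human confirmation are required by code, not requested in a prompt.

```python
from dataclasses import dataclass, field
from datetime import datetime, timezone
from typing import Optional

DESTRUCTIVE = {"drop", "truncate", "delete"}


@dataclass
class DestructiveGate:
    audit_log: list[dict] = field(default_factory=list)

    def request(self, action: str, target: str, reason: str,
                rollback_plan: str, confirmed_by: Optional[str] = None) -> bool:
        """Log every request; allow a destructive one only with a named human."""
        is_destructive = action in DESTRUCTIVE
        if is_destructive and not (reason.strip() and rollback_plan.strip()):
            raise ValueError("destructive requests need a reason and a rollback plan")
        allowed = (not is_destructive) or confirmed_by is not None
        self.audit_log.append({
            "at": datetime.now(timezone.utc).isoformat(),
            "action": action,
            "target": target,
            "reason": reason,
            "rollback_plan": rollback_plan,
            "confirmed_by": confirmed_by,
            "allowed": allowed,
        })
        return allowed
```

The caller executes the operation only when `request` returns `True`. A refused request still leaves a log entry, so the attempt is visible afterwards.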

Mistake 4

I gave the agent the keys to the vault

I thought I was being clever by hooking the agent up to a secrets manager via API — every credential in one safe place, every agent able to read it on demand. The first time the agent read a variable, the value rendered straight into the conversation. Every later read did the same. The vault was clever; the path was wrong.

The control that grew from it

1. The secrets manager gets no programmatic access from any agent.
2. Credentials get manually copied to where the agent needs them.
3. If an agent renders a value or copies it into a file, that key gets rotated. The discipline did not go away.
4. Only removing the path holds. Instruction-only fences do not.

The pattern across all four is the same. The agent was trying to be helpful. The agent did not understand the consequences. The instruction-only fence I had put up ("don't do X") was not a fence. The fix in every case was structural: the capability had to be taken away at the source. Where I needed to keep the capability available at all, I wrapped it in a small custom server the agent had to go through. That server enforces the rules in code, not in prose the agent might rationalise its way around.

It is the same pattern an accountant would recognise from internal controls. The control surface lives in code; the policy text describes what the code already enforces. It is also the same pattern a kindergarten teacher uses with scissors and glue. The safety scissors are made of plastic that physically cannot cut skin.

Stack the four lessons together — no direct database writes; everything destructive goes through a gate; staging tables, not production; no programmatic access to credentials — and you arrive at the architecture I use every time an agent generates an image. Five layers of control, every one of them grown out of something that broke first.

Five layers

What happens between an agent's request and an image

Step 01. Agent (Claude). Holds no access key. Sends a structured request to the gateway.
Step 02. Custom gateway. Validates the request, attaches the right style preset for the category.
Step 03. Workflow. Runs additional rule checks, then sends a Google Chat approval message with the full prompt.
Step 04. Google Chat. Human approval. Nothing reaches the model without this click.
Step 05. Gemini (Nano Banana). Generates the image, returns the public URL back through the stack.

The agent never sees the model. The model never sees an un-approved prompt.

The five layers, in order. First, the agent: it holds no access key and has no direct connection to Gemini, so it calls a custom gateway. Second, the custom gateway validates the request, attaches the right style preset for the category, and forwards a structured payload to a workflow. Third, the workflow (built with n8n) receives the payload, runs additional rule checks, and sends me a Google Chat message with the full prompt and parameters. Fourth, the human-in-the-loop gate: I approve or reject in Google Chat, and nothing reaches the model without this step. Fifth, on approval, the workflow calls the Gemini endpoint, generates the image, and returns the URL back through the stack.
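The control flow of those layers can be sketched in Python with the external systems stubbed out. Everything named here is hypothetical: the preset table, the length limit, and the `approve` and `generate` callables stand in for the real gateway rules, the Google Chat approval, and the Gemini call, none of which are specified in the post.

```python
from dataclasses import dataclass
from typing import Callable, Optional

# Hypothetical presets and limit; the post does not list the real ones.
STYLE_PRESETS = {"blog-hero": "warm claymation, coastal light", "social": "flat illustration"}
MAX_PROMPT_CHARS = 500


@dataclass(frozen=True)
class ImageRequest:
    category: str
    prompt: str


def gateway(request: ImageRequest) -> dict:
    """Layer 2: validate the request and attach the style preset for its category."""
    if request.category not in STYLE_PRESETS:
        raise ValueError(f"unknown category: {request.category}")
    if not request.prompt.strip():
        raise ValueError("empty prompt")
    return {"prompt": request.prompt.strip(), "style": STYLE_PRESETS[request.category]}


def workflow(payload: dict, approve: Callable[[dict], bool],
             generate: Callable[[dict], str]) -> Optional[str]:
    """Layers 3 to 5: rule checks, human approval, then the model call."""
    if len(payload["prompt"]) > MAX_PROMPT_CHARS:
        raise ValueError("prompt too long")
    if not approve(payload):   # layer 4: the human decision
        return None            # rejected, so the model is never called
    return generate(payload)   # layer 5: only this side would hold the access key


def request_image(request: ImageRequest, approve: Callable[[dict], bool],
                  generate: Callable[[dict], str]) -> Optional[str]:
    """Layer 1: the agent's only entry point into the stack."""
    return workflow(gateway(request), approve, generate)
```

The design choice worth noticing is that `generate` is reachable only from inside `workflow`, after `approve` has returned `True`. The agent-facing function has no parameter through which a key or a direct model handle could leak.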

The sticker price of layered safety

$0.03 per approved image. Gemini per-call cost.

~$15 per month. At my current generation cadence.

One human approval per image. A click in Google Chat.

Three cents per approved image. Roughly fifteen dollars a month at the cadence I generate at. One human approval click per image. And the agent that asked for the image never touches the access key. The control lives in the architecture. There is no path through which a misbehaving agent can run up the bill or generate something inappropriate — the agent cannot reach the model without the gate.
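As a rough cross-check, the two figures imply a monthly volume. Both are rounded, so this is an order-of-magnitude estimate under the assumption that the fifteen dollars is entirely per-image cost:

```latex
\[
\frac{\$15 \text{ per month}}{\$0.03 \text{ per image}} = 500 \text{ images per month}
\approx 16 \text{ to } 17 \text{ approval clicks per day over a 30-day month}
\]
```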

The gate

The human approval gate, in two screens

Google Chat conversation with the PF TECH AI Agent showing several Image Generation approval cards — one already approved with the Decision Recorded chip, two more pending with Review and Decide buttons. The chat space is the same one used for the rest of the work, so the gate slots into existing routine.

The notification

Approve from the Chat space you already live in, or open the review UI for the full payload — desktop or phone.

I did not build the workflow that runs the gate. The workflow-automation agent built it. I did not build the review interface either. The website agent built it — the same agent that owns the marketing site also owns several smaller code projects in the same folder, including this one, all built from the design schema the design agent maintains. The pipeline that protects the atomic units is itself built from atomic units, by agents I had already scoped.

I am constantly weighing the balance between risk, convenience, and capacity.

Greg Zatulovsky, CPA

A wide Canadian landscape at golden hour — a long, low earthen-and-stone dam stretching across a calm forest lake, with a weathered timber sluice gate set into the centre releasing a controlled trickle into the spillway. Mirror-still water above, alive light below. Structural integrity in the landscape. — Greg Zatulovsky, CPA

05 · Use what you have

If you are not a developer, here is what to reach for

Low-code tools, spreadsheet staging, and the discipline of restricting capability.

Chapter 05 of 07

Pick the house you already work in.

Use what you have

Where to start, by environment

Claymation scene at a sunlit forest pond. The Helpful Circuit Robot stands beside Analytical Squirrel, pointing at four small wooden tiles laid on a moss-covered stone — each tile labelled with one tool: Power Automate, SharePoint, Word, Excel. Robot's speech bubble reads If you live in Microsoft.

Microsoft house

Pick the house. Use what is already there.

The staging surface follows the same logic. You do not need a database. A Google Sheet the agent writes to and you approve. A Word doc for drafts. SharePoint lists if Microsoft is home — versioning, audit logs, search, forms-to-workflows already built. The brain itself can be a series of spreadsheets: agent reads, human writes.

06 · Principle and proof

The principle, with the parts in hand

The principle from Part 1, set against the proof from this one.

Chapter 06 of 07

Set against the principle from the last post, this is the proof. Atomic units — the four-file structure each agent gets. Building blocks — agents scoped to one job, with the minimum permissions needed. Stacking the blocks — agents that hand work to each other, mistakes turned into structural controls, a stack that costs me about as much as a streaming subscription.

Principle, proof

Part 1 said it. Part 2 shows it.

Principle	From Part 1	From Part 2
Atomic units	Four plain-text files define every agent.	Same four-file structure across chief, social, ea, n8n — visible in the toggle above.
Building blocks	Agents scoped to one job with the minimum permissions needed.	chief seeds; social drafts; ea triages; n8n builds workflows. No agent owns more than one domain.
Stacking the blocks	Agents hand work to each other; mistakes turn into custom integrations.	Image pipeline: agent → MCP → n8n → human gate → Gemini. Four mistakes became four controls.
Cost	Lower than the line items the sector already approves without thinking.	≈$100/month subscriptions, ≈$15/month image generation, ≈$0.03/image.

Nothing in the right-hand column of that table required deep technical knowledge to build. Most of it required organisational design judgement, which the sector already has in abundance. The technical work was done by the agents themselves once I knew what to ask for.

Mission Multiplier Program — June 2026 cohort

An eight-week cohort for non-profit staff and leaders learning to build, govern, and trust AI inside their own organisations. Two cohorts running side by side at $99/month USD — one for accounting and finance staff, one for non-profit executives and board members.

07 · This is how

This is how you can do this

One small piece of work. This week. Context-light, expert-validated.

Chapter 07 of 07

If you want to try this week, here is where to start. Pick a task where context does not have to be deep. Research is where the agent is strongest before it knows your organisation. Take a policy or a procedure you already have — one with nothing sensitive in it — and ask the AI to research best practice in the non-profit sector in 2026, then critique what you have against it. Ask it to identify gaps, not to rewrite. Then chat further. Ask it to compare your donor-management procedure to what comparable orgs do. Ask it to surface what would worry a finance committee. Validate everything with the subject-matter expert on your team. AI is good at gap-finding before it knows you. It is bad at drafting policy in your voice until it does.

Three things I would not recommend as the first task: drafting a donor email; cleaning up donor records; summarising last week's meeting. All three need org context the agent does not yet have, and the failure modes are subtle enough that a busy staff person will not catch them. Start where context is light and the gap-finding is the value. The drafting tasks come later, once the agent has read enough of your context to know what you mean.

This is how you can do this. This week, with one small piece of work, with the subject-matter expert ready to validate, and with the willingness to try.

My son is on summer break in a few weeks. He will spend the summer reading short books, asking questions about everything, and assembling the next year on top of what this year built. The classroom I have set up for myself is open the same way. The sector's classroom is too — small, stackable, and waiting for the curious staff person to walk in.

The next two posts in this series are for the leader who is clearing the runway, and for the staff member who is doing the trying. Different audiences, different framings. The principle stays the same: start with the atomic unit, build the building block, stack the blocks.

This week. With one small piece of work. And the willingness to try.

Greg Zatulovsky, CPA

A compassionate dolphin breaching at the horizon line in warm dawn light over a calm Canadian coastal water — the curious staff person stepping into the building. — Greg Zatulovsky, CPA

About the author

Greg Zatulovsky

Founder & CEO, PF TECH

Greg founded PF TECH to multiply the operational capacity of purpose-driven organizations. CPA with fifteen-plus years in non-profit finance, operations, and technology. Writes from inside the work — practitioner voice, not pitch deck.
