So You Need an AI Policy?

Wayne Grigsby
Software Engineer

Today I talked to a guy who runs a small consulting firm. He does CFO work for small businesses. About thirty people on staff. He is, by his own description, not technical.

He has, running in production right now, a custom MCP server he vibe-coded with the Claude Agent SDK. OAuth 2.0. Role-based access. Audit logging. A client-facing agent in Slack, wired into QuickBooks for eight of his ten clients. Touching their financial data.

Halfway through the call, he said it himself, flat out:

"I'm also not technical. I'm vibe coding all of this stuff."

I wrote about this inevitability in When Everyone Can Build What Matters. This manifestation, though, is making me think about security and governance more than I'd planned to.

Pull back from this CFO for a moment. Think about your own organization. Your marketing team is running ChatGPT through personal accounts to draft customer-facing copy. Two of your data scientists are paying for Cursor on personal cards because procurement took six weeks last time. Someone in revops wired up an n8n flow last month that reads from your CRM and nobody signed off on it. People are plugging in platform-driven MCP servers, or vibe-coding their own. And that's before we count the shadow AI on personal phones.

None of it got vetted. None of it would pass an audit. This doesn't make any of these people bad actors. They're trying to use modern tools to build solutions and drive meaningful impact. This isn't new.

"They were so preoccupied with whether or not they could, they didn't stop to think if they should." — Dr. Ian Malcolm, Jurassic Park

The CFO I spoke to today never asked. If he'd been required to ask "is there a vetted platform for this before I build my own?" he'd have found one in thirty seconds. He didn't have a policy. So he wasn't forced to ask.

Your policy is upstream of the tools, not the other way around. Most orgs do this backwards. They buy the platform, then write policy that retroactively blesses what they bought, hoping it will somehow prevent further tool sprawl.

This post is about writing the policy first. The questions it forces you to answer. Sitting down to write one is the easy part.

Three Failure Modes You'll Find in the Wild

Most AI policies fail in one of three ways. All three produce the illusion of a policy without producing one.

The friction wall. A Big 4 consulting firm I used to work at. You couldn't install Python without three approvals and a prayer. The wall wasn't AI-specific. It was how the org treated all developer tooling and emerging tech. So when AI showed up, it met the same gauntlet, and people stopped asking. They opened Claude on personal phones. They pasted client work into ChatGPT on personal accounts. The anxiety underneath was legal exposure, and the policy looked responsible because it was hyper-restrictive. But restrictive without alternatives is just a wall. Walls produce shadow infrastructure and distrust.

The silence. A global nonprofit I spent years inside. No official AI policy at all, to this day. Data was moving. Tools were being used. Nobody was watching it move. The anxiety underneath was operational caution, the reasonable instinct to not move until you understand the thing. But in 2026, no policy is a policy. It's a vote for whatever happens by default. And what happens by default in 2026 is MCP servers and agents spreading across internal infrastructure with no visibility and no governance.

The verbal ban. A midsize financial firm I've talked to. Leadership said "no AI" in an all-hands once. No document. No alternative. No enablement. Employees nodded and opened Claude on their phone in the parking lot. The anxiety underneath was executive discomfort. A verbal ban is performative governance. It satisfies the need to have done something without producing a working policy.

Three different postures. Three different anxieties. Same end state. No real auditable pathway. Work finds another way.

The River

Work is like water in a river. The force is constant and the energy is real. You can pretend it isn't there. You can try to block it cold. The water gets through anyway. That's physics.

A bare wall stops everything. Including the work you wanted. So shadow channels get carved around it in the dark, and the policy becomes a thing people work around instead of a thing people work with.

But "let it run free" isn't the answer either. An ungoverned river floods.

Think about the Hoover Dam. It's a wall, technically. But it's a wall built with intention. The water still moves. Engineers decided where it goes and how fast. On the way through, it spins turbines and powers cities across three states. The same force that would have eroded canyon walls is doing useful work.

That's the policy you want. Not a barrier. Not a free-for-all. A shaped channel with the work captured on the way through. Logging where the turbines would be. Guardrails inside the path, not bolted around it. The sanctioned route is the path of least resistance, and on the way through, it generates the audit trail and the visibility the unmanaged river never would.

A good policy does three things, in order. Enable the safe path. Surface the risky path. Forbid the wrong path.

Every question from here is the same question. What does your channel look like?

The Stack You're Governing

Before the questions, name what you're governing. Most AI policies skip this. They list rules without naming the system the rules apply to.

Five layers. The policy governs every interface between them.

Five-layer stack: Human → Client → LLM → Tooling Layer → Data & Systems. Identity flows down, data flows up. Risk lives at every interface.

A human picks a client to work in. The client routes prompts to an LLM. The LLM calls tools through the tooling layer. The tools reach into your data and systems. Identity propagates down. Information flows back up. Risk lives at every interface.

Every question that follows is about one of these interfaces. Who can use which client. What data the LLM is allowed to see. What tools the LLM is allowed to call. What the tools can reach. Where the logs go.

Name the layers and the policy has something to govern. Skip them and you're governing fog.
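To make the interfaces concrete, here is the stack as a handful of fields. A sketch, not a schema; every name in it is a placeholder, not a vendor's API.

```python
from dataclasses import dataclass, field

@dataclass
class Interaction:
    # Layer 1: Human. Identity enters here and propagates down.
    user_id: str
    role: str                       # e.g. "contractor", "principal_engineer"
    # Layer 2: Client. The app routing the prompt.
    client: str                     # e.g. "claude_desktop", "cursor"
    # Layer 3: LLM. The model actually called, pinned to a version.
    model: str
    # Layer 4: Tooling layer. What the model was allowed to invoke.
    tool_calls: list[str] = field(default_factory=list)
    # Layer 5: Data & systems. What those tools actually reached.
    systems_touched: list[str] = field(default_factory=list)

# Every question in the next section is a rule about one of these fields:
# who, through which client, to which model, calling which tools, reaching
# which systems, and whether the whole record got logged.
```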

The LLM Is the Smallest Part

When people say "AI policy," they picture the model. Which one are we allowed to use. Anthropic. OpenAI. Google. That's the smallest part.

The model is commodity. Different prices, similar capabilities. Swappable on a Tuesday. The model isn't the product. It never was.

What makes AI useful inside your organization is the tooling layer humans built around it. The tools the model can call. The context you feed it. The skills you've taught it. The prompts it runs under. The evals that catch its mistakes. All of it human-built. Or it should be.

An LLM with no tools is a chatbot. An LLM with the right tools is a coworker. An LLM is only as good as the tools it's given.

So the reframe is this. You're not governing AI. You're governing the tooling layer humans built around the AI. That's a familiar problem. Provenance. Review. Ownership. Versioning. The same disciplines you already use for code.

The vibe-coded MCP server from the top of this post didn't fail at AI. It failed here.

The Questions You Need to Answer

The format is the easy part. Templates and frameworks for tech policy are a Google search away. The hard part is what the policy actually says.

At its core, your policy answers these questions. On paper. Out loud. Before the incident, not after.

Can we even use AI? Start here. Without an answer, the rest of the document is theoretical. Some orgs land on "yes, with constraints." Some land on "not yet, here's why." Both are policies. "Nobody asked" is not.

If yes, which clients are sanctioned? ChatGPT. Claude. Cursor. Copilot. The hundred Cursor variants. Your allowlist is the gate. An empty one matches silence. One that includes everything matches a free-for-all. Neither is a policy. Pick on purpose.

Who can use what? Tiers, not blanket yes or no. A junior contractor and a principal engineer should not have the same access to the same tools, the same data, the same blast radius. Most policies pretend they do.

What data can touch which models? There is no "AI" answer. Only specific model, specific data. PII, customer data, internal IP, regulated data each get a different answer. Write them down.
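Those last three questions can live in a page of configuration. A minimal sketch, assuming a simple tier model; the client names, tiers, and data classes here are illustrative placeholders, not recommendations.

```python
# Which clients are sanctioned: the allowlist is the gate.
ALLOWED_CLIENTS = {"claude_enterprise", "copilot_business"}

# Who can use what: tiers, not a blanket yes or no.
ACCESS_TIERS = {
    "contractor": {"clients": {"copilot_business"}},
    "engineer": {"clients": ALLOWED_CLIENTS},
    "principal_engineer": {"clients": ALLOWED_CLIENTS},
}

# What data can touch which models: no single "AI" answer,
# only a specific data class against a specific model.
DATA_RULES = {
    ("public", "any_approved_model"): "allow",
    ("internal_ip", "enterprise_tenant_model"): "allow",
    ("customer_pii", "enterprise_tenant_model"): "allow_with_review",
    ("regulated", "any_model"): "deny",
}
```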

Where does that data go after? Vendor terms. Retention. Training opt-outs. Geography. The vendor's TOS is your policy whether you wrote one or not.

What gets logged? Prompts. Outputs. Tool calls. Model versions. If you can't reconstruct it, you can't govern it. You're hoping.
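Concretely, a reconstructable interaction is one log record with all of those fields in it. The shape below is an assumption; what matters is that nothing on the list is missing and every field is tied to a person.

```python
import datetime
import json

# One reconstructable log line. All values are illustrative.
record = {
    "ts": datetime.datetime.now(datetime.timezone.utc).isoformat(),
    "user": "jdoe",                             # a person, not "the agent"
    "client": "claude_enterprise",
    "model_version": "example-model-2026-01",   # pin it; models change weekly
    "prompt": "Summarize unpaid invoices for Client A",
    "output": "...",
    "tool_calls": [{"tool": "quickbooks.read_invoices", "status": "ok"}],
}
print(json.dumps(record))
```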

Who pays? Centrally funded or chargeback. Pick one and write it down. If it's free, people will be careless with it.

Who's accountable when AI acts? The human who deployed it. Always. Say it out loud in the document so nobody can pretend later.

What is AI allowed to do, not just say? Read versus write. Autonomous versus human-in-the-loop. Blast radius. This is where most policies are silent. It's also where most incidents will come from.
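Here is one way to write that down, sketched as a tool policy table. The tool names are invented for illustration; the split between read, write, human-in-the-loop, and forbidden is the part that matters.

```python
# Read versus write, autonomous versus human-in-the-loop, per tool.
TOOL_POLICY = {
    "crm.search":          {"access": "read",  "mode": "autonomous"},
    "crm.update_record":   {"access": "write", "mode": "human_in_the_loop"},
    "quickbooks.read":     {"access": "read",  "mode": "autonomous"},
    "quickbooks.pay_bill": {"access": "write", "mode": "forbidden"},
}

def may_run(tool: str, approved_by_human: bool) -> bool:
    rule = TOOL_POLICY.get(tool, {"mode": "forbidden"})
    if rule["mode"] == "autonomous":
        return True
    if rule["mode"] == "human_in_the_loop":
        return approved_by_human
    return False  # unknown or forbidden tools never run
```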

Who wrote your tools, and who owns them? Provenance. Review. Ownership when something breaks. A vibe-coded MCP server is a supply chain question with no supply chain.

What platforms enforce this? A policy is words on paper until a platform captures prompts, outputs, tool calls, and model versions, and ties them to identity. Without that, you can write the most elegant policy in the world and have no way to tell whether anyone followed it. Enforcement isn't intent. It's infrastructure.
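In miniature, enforcement looks like a single chokepoint that every model call goes through, producing records like the one sketched under the logging question. A sketch only; call_model and the log sink below are stand-ins for whatever platform you actually run.

```python
import datetime
import json

def call_model(model: str, prompt: str) -> dict:
    # Stand-in for the real provider or gateway call.
    return {"output": "...", "tool_calls": []}

def append_to_audit_log(line: str) -> None:
    # Stand-in for your real log sink (SIEM, warehouse, append-only store).
    with open("ai_audit.jsonl", "a") as f:
        f.write(line + "\n")

def governed_call(user_id: str, client: str, model: str, prompt: str) -> str:
    # Identity goes in, the audit record comes out as a side effect of use.
    response = call_model(model, prompt)
    record = {
        "ts": datetime.datetime.now(datetime.timezone.utc).isoformat(),
        "user": user_id,
        "client": client,
        "model_version": model,
        "prompt": prompt,
        "output": response["output"],
        "tool_calls": response["tool_calls"],
    }
    append_to_audit_log(json.dumps(record))
    return response["output"]
```

If the sanctioned route runs through something like this, the audit trail is a byproduct of using it. That's the dam doing its job.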

How do you re-evaluate? A one-time procurement review isn't enough. Models change weekly. Static policies become policies people work around.

The guy from the top of this post never got asked any of these. Not one. That's why his stack looks the way it looks. He isn't reckless. He's unguarded. The policy is what would have made him pause. Not stop. Pause. Long enough to ask whether a sanctioned path already existed, and find out it did.

Answering these questions on paper, out loud, with your name next to the answer, is the work. The document is just where the answers go. You don't write a policy and then think. You think by writing the policy.

And then this becomes a procurement document. The platforms you evaluate, the tools you buy, the systems you build all answer to this list now. Not the other way around.

If this list feels like a mountain, good news. You're not starting from zero. Most of these answers already exist somewhere in your org's existing governance. They're just not in the document called "AI Policy."
