Thinking · AI Governance

AI safe enough to scale.

A practical conversation guide for any leader — in IT or in the business — putting AI agents and automated workflows into production. Five questions to work through before and after you ship, in plain language. No theory; this is what we ask ourselves on every engagement.

First principles

Treat agents the way you treat employees.

Identity, access, accountability. An agent that can't be named, logged, and switched off doesn't belong in production.

Give every agent an identity

Manage agents with the same tools you already use for people: a directory entry, defined permissions, and a record of what they did. An agent no one owns is a risk, not a tool.

Grant the least access needed

Give each agent the minimum it needs to do its job, and read-only access wherever you can. Permissions that quietly widen over time are the most common path to a real incident.
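
One way to picture identity and least access together is the minimal sketch below. It is an illustration, not a reference to any particular directory product: the AgentIdentity record, the is_allowed check, and the example agent and owner address are all hypothetical names. The point it shows is deny-by-default, where anything not explicitly granted is refused.

```python
from dataclasses import dataclass, field


@dataclass(frozen=True)
class AgentIdentity:
    """Hypothetical directory record for an agent (illustrative only)."""
    name: str
    owner: str  # the named person or team accountable for the agent
    # Explicit (resource, action) grants; anything not listed is denied.
    grants: frozenset = field(default_factory=frozenset)


def is_allowed(agent: AgentIdentity, resource: str, action: str) -> bool:
    """Deny by default: only an explicit grant permits the action."""
    return (resource, action) in agent.grants


# Assumed example: an invoice agent that only ever needs to read invoices.
invoice_bot = AgentIdentity(
    name="invoice-bot",
    owner="finance-ops@example.com",
    grants=frozenset({("invoices", "read")}),
)
```

Because the record is frozen, widening access means issuing a new record, which gives the change a place to be reviewed rather than letting permissions grow quietly.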

Build where safety is built in

Build on platforms that already handle security (Microsoft 365, Google Workspace, your cloud) and you inherit their controls. Build outside them and you recreate all of it yourself.

Question 01 · Staying in control

Who's in the loop — and when?

Apply one rule to everything and you either smother good work with reviews, or hand off decisions you should have kept. Sort the work by what's at stake first.

Sort work by what's at stake

Group every workflow by how easily it's undone, how much harm a wrong answer does, and how often it runs. Then decide where a human signs off, reviews, or steps aside.

First question: what if the output is wrong?
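
The sorting rule above can be sketched as a small function. This is an assumption-laden illustration: the three oversight levels, the "low/medium/high" harm scale, and the 1,000-runs-per-day cutoff are placeholders we made up for the example, and each organization would set its own.

```python
from enum import Enum


class Oversight(Enum):
    HUMAN_SIGNS_OFF = "a human approves before the action happens"
    HUMAN_REVIEWS = "a human reviews a sample afterwards"
    HUMAN_STEPS_ASIDE = "the agent runs alone, with logging"


def oversight_for(reversible: bool, harm: str, runs_per_day: int) -> Oversight:
    """Map the three sorting questions to a level of human involvement.

    harm is "low", "medium", or "high" (an assumed scale).
    """
    if harm not in {"low", "medium", "high"}:
        raise ValueError(f"unknown harm level: {harm!r}")
    # Hard to undo or seriously harmful: a person decides.
    if harm == "high" or not reversible:
        return Oversight.HUMAN_SIGNS_OFF
    # Moderate harm, or enough volume that small errors add up: spot-check.
    if harm == "medium" or runs_per_day > 1000:
        return Oversight.HUMAN_REVIEWS
    return Oversight.HUMAN_STEPS_ASIDE
```

For instance, drafting internal meeting summaries (reversible, low harm, a few dozen runs a day) lands on "steps aside", while issuing refunds that cannot be clawed back lands on "signs off".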

Give agents a place to grow up

An agent needs a home before it touches production. Decide where it's built and tested, who's responsible for it, and what it must prove before it's allowed to go live.

Named owner · safe staging · path to live

Try hard to break it first

Throw messy inputs, trick questions, and chained actions at it before launch. If it breaks in testing, the test did its job. If nothing breaks at all, push harder.

Test for failure first · replay real sessions

Keep your workflows portable

Work built on one vendor's platform shouldn't be stuck there. Write down what the workflow does, apart from how this particular tool does it, so you can move later.

Write the logic down, separate from the tool

Question 02 · Trusting the data

Can you trust what the AI knows?

AI is only as good as the information you feed it. These are the hardest calls on this list to get right, and the most expensive to get wrong.

Mixing sources changes trust

Two trustworthy sources can add up to a misleading answer once combined. Judge trust at the result, not just at each input on its own.

Check trust at the output, not the input

Right source for the job

Not every source belongs in every use. HR records inside a marketing agent are a leak waiting to happen. Match sources to uses at the access layer.

Sensitivity labels · source-gated access
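
A rough sketch of source-gated access follows. The label names, agent names, and the ALLOWED_LABELS table are hypothetical; in practice the labels would come from whatever sensitivity-labeling scheme you already run. What it shows is the filter sitting at the access layer, so the agent never sees a document it is not cleared for.

```python
# Assumed mapping of agent -> sensitivity labels it may read (illustrative).
ALLOWED_LABELS = {
    "marketing-agent": {"public", "internal"},
    "hr-agent": {"public", "internal", "confidential-hr"},
}


def visible_documents(agent: str, documents: list[dict]) -> list[dict]:
    """Return only the documents whose label the agent is cleared to read.

    Unknown agents get an empty allow-list, so they see nothing.
    """
    allowed = ALLOWED_LABELS.get(agent, set())
    return [doc for doc in documents if doc["label"] in allowed]


corpus = [
    {"title": "Brand guidelines", "label": "internal"},
    {"title": "Salary review 2024", "label": "confidential-hr"},
    {"title": "Press release", "label": "public"},
]
```

Filtering before retrieval, rather than asking the model to ignore what it has already read, is the design choice that matters here.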

Let AI check the human

A confident wrong answer outruns a careful right one. Where the AI is steadier at a task, let it review the person's work, not only the other way around.

Confidence scores · provenance on outputs

Plan for things drifting

Models drift, data drifts, and people drift. Before you go live, set a schedule for re-checking accuracy; don't wait until something has quietly gone wrong.

Regular drift review · quiet retesting

Decide when to upgrade

New models arrive monthly, and not every one is worth the retesting. Write down what a new model has to clear, and revisit that bar each quarter.

Upgrade rules · clear go or no-go

Tag once, inherit everywhere

Hand-labeling data doesn't scale. Tag automatically, have people review the edge cases, and let every agent downstream inherit those labels.

Label once · inherit everywhere

Question 03 · Vendor lock-in

How locked in are you willing to be?

Every AI architecture choice is a bet on a vendor. The real question is whether you're making that bet on purpose — and how long it would take to change it.

Flexibility costs money

Staying model-agnostic costs more to build and less to change later; locking in costs less now and far more the day you must move. Choose the trade on purpose, and write down what leaving would cost.

Model-agnostic by default · locked in by choice

Give the policy teeth

A neutrality policy that no one enforces is just a wish. Back it with architecture reviews, purchasing controls, and a written process for the exceptions you choose to allow.

Exception log · reviewed every quarter

Question 04 · When it breaks

What happens when it breaks?

The question isn't whether an AI system will fail; it will. It's whether you've decided in advance what that failure looks like, and how you keep it small.

Decide how it fails

Choose for each use. A customer chatbot that shuts down on an error just frustrates people; an approval system that keeps running on an error becomes a compliance problem.

Decided per use case · tested quarterly
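
The per-use-case choice can be written down as a small policy table, as in the sketch below. The use-case names, the two failure modes, and the returned action strings are assumptions for illustration. The one deliberate design choice is that anything not listed stops rather than keeps running.

```python
from enum import Enum


class FailureMode(Enum):
    KEEP_SERVING = "degrade and keep helping the user"
    STOP = "halt and wait for a person"


# Assumed policy table: decided per use case, ahead of time.
FAILURE_POLICY = {
    "customer_chatbot": FailureMode.KEEP_SERVING,
    "expense_approval": FailureMode.STOP,
}


def on_error(use_case: str) -> str:
    """Return the pre-agreed action when the AI component errors out."""
    # Anything nobody decided on defaults to the cautious option.
    mode = FAILURE_POLICY.get(use_case, FailureMode.STOP)
    if mode is FailureMode.KEEP_SERVING:
        return "route the conversation to a human agent"
    return "pause the workflow and queue items for manual review"
```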

Name the no-AI calls

Some decisions should never sit with an agent — firing someone, making a medical diagnosis, signing a contract. Write that list down, and be ready to defend it.

Published no-AI list · reviewed yearly

Step down, don't fall over

When the agent isn't sure enough, it should hand off — to a person, a simpler model, or a fixed fallback. Design it to ease down gracefully rather than collapse.

Confidence-gated handoff · clear fallback
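
A confidence-gated handoff might look like the sketch below. It assumes the agent returns a reply together with a confidence score between 0 and 1; the 0.8 threshold, the function name, and the fallback wording are placeholders, and how well a model's score tracks real accuracy has to be checked separately.

```python
from typing import Callable, Tuple

# Assumed shape: an agent takes a question and returns (reply, confidence).
Agent = Callable[[str], Tuple[str, float]]


def answer_or_hand_off(
    question: str,
    agent: Agent,
    threshold: float = 0.8,
    fallback: str = "I'm passing this to a colleague who can help.",
) -> dict:
    """Use the agent's reply only when it is confident enough."""
    reply, confidence = agent(question)
    if confidence >= threshold:
        return {"handled_by": "agent", "reply": reply}
    # Below the bar: step down to a person with a fixed, safe message.
    return {"handled_by": "human", "reply": fallback}
```

The same gate could route to a simpler model instead of a person; the essential part is that the low-confidence path is designed in advance rather than discovered in production.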

Keep a real off switch

Every production agent needs a written stop condition and a named person who can pull the plug. Not in theory — a real switch that someone has actually tested.

Named owner · tested off switch · runbook

Question 05 · Concentration risk

What's your single point of failure?

As AI turns into everyday infrastructure, leaning too hard on one vendor or model becomes a leadership question — not a line item buried in procurement.

Two models for big calls

For high-stakes, hard-to-reverse decisions, run the same question through more than one model and look for agreement. The extra cost is tiny next to the cost of being wrong.

Require agreement on top-tier calls
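
Requiring agreement can be sketched in a few lines. Here each "model" is simply a function that takes a question and returns a short answer; the function name and the simple normalize-then-compare rule are assumptions, and real answers may need a more forgiving comparison than exact text matching.

```python
from typing import Callable, Optional, Sequence


def agreed_answer(
    question: str, models: Sequence[Callable[[str], str]]
) -> Optional[str]:
    """Return the shared answer if every model agrees, else None.

    None means "no consensus": send the decision to a person.
    """
    if len(models) < 2:
        raise ValueError("agreement needs at least two models")
    # Compare answers ignoring case and surrounding whitespace.
    answers = [model(question).strip().lower() for model in models]
    if all(answer == answers[0] for answer in answers):
        return answers[0]
    return None
```

Agreement is a signal, not proof: models from different vendors can still share the same blind spot, which is one more reason the disagreement path should end with a human.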

Map risk by use, not tool

Measure your concentration by use case, not by platform. A support bot and a fraud model carry very different risk and deserve very different designs.

Use-case risk register · matched backups

Always have a manual mode

Design for the outage. The question isn't whether a provider goes down, but how work keeps moving when it does. Every system needs a no-AI way to run.

A no-AI mode · practiced, not hypothetical

The bottom line

Governance isn't the brake. It's the engine.

Teams that govern well ship more AI, not less — because they're not stopping mid-rollout to write policy, or rebuilding what they should have built with guardrails the first time.

Start with what you have

Your identity system, data labels, incident response, change process. Most AI governance is the governance you already run, pointed at a new kind of worker — not a parallel universe to invent.

A conversation, not a checklist

We don't hand you a fifty-page document and walk away. We work through these questions with your team, write down the answers, and revisit them as your use of AI grows.

Earn the right to keep going

Governance isn't a tax you pay before launch. It's how you keep moving quickly without stopping to explain yourself to legal, or rebuilding under pressure later on.