Why this matters more than your last vendor decision
An AI engagement is not the same as buying a website or contracting a marketing agency. AI agents and automation give the people implementing them an enormous amount of leverage over your business. The systems they ship will read across your stack, decide, and act. They'll touch finance, sales, support, ops, sometimes all four.
One degree of error in the design of that system compounds. A workflow that's 95% right but quietly miscategorises 5% of transactions. An agent that handles support beautifully on the happy path but escalates poorly on edge cases. A schema that loses precision in a way nobody catches for six months. The cost of a bad AI engagement isn't measured in the project budget. It's measured in the operational decisions your business made on top of bad data.
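To make the compounding concrete, here is a back-of-envelope sketch. The volumes are assumed for illustration, not taken from any real engagement:

```python
# Illustrative only: assumed volumes, not from any real engagement.
monthly_transactions = 2_000
error_rate = 0.05            # the "95% right" workflow
months_unnoticed = 6

# Every month the miscategorised records pile up silently.
bad_records = int(monthly_transactions * error_rate * months_unnoticed)
print(bad_records)  # 600 quietly miscategorised records feeding your reports
```

Six hundred wrong records is not a bug report; it's two quarters of reporting, forecasting, and decisions built on them.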
So the bar you set for the people running this work has to be higher than the bar you set for any other vendor.
The operator test
The single best signal of a good AI partner is whether the people you're talking to are operators. By operator we mean: they've built and run real software, they've shipped systems that other businesses depend on, and they understand operations from the inside, not from a deck.
Concrete signs:
- They map your business before scoping. Good operators ask to walk through how your operation actually runs, department by department, dataset by dataset, before they tell you what to build. Bad ones ask what you want and write a quote.
- They speak in outcomes, not architecture. Hours saved, headcount freed, error rate reduced, revenue moved. Architecture is the means; outcomes are the point. If the first conversation is mostly about LangChain or LangGraph, the engagement will be too.
- They show working systems. Ask to see what they've shipped. Real production systems, with real users, that have been live for at least six months. Not demos, not proof-of-concepts that died on a slide.
- They take ownership end to end. Discovery, design, build, infrastructure, rollout, training, ongoing operation. If a partner says "we just do the AI part and you handle the integration", they're not the partner. The integration is where most AI projects fail.
- They have opinions. Strong, defensible opinions about what to build and what to refuse. Beware the partner who agrees with everything you say. They're optimising for the contract, not the outcome.
- They build for safety. Logs, traces, evals, human-in-the-loop on anything material, audit trails, cost monitoring, rollback. If safety only comes up when you ask, the work won't be safe.
- They document and hand over. A partner you can fire is a partner you can trust. Anything they build, you should be able to maintain or hand to another team. If they cultivate dependency, walk.
- They have R&D Tax Incentive readiness if you're in Australia. Agentic AI work is textbook R&D under the 43.5% R&D Tax Incentive. A good AU partner documents engagements properly so your tax specialist can claim with confidence. They don't lodge the claim (that's not their job), but they hand over the readiness pack.
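The safety point is concrete enough to sketch. One hypothetical shape for a human-in-the-loop gate with an audit trail: every proposed action gets logged, anything above a materiality threshold is queued for human review instead of executed, and nothing runs without a record. The names and the threshold here are illustrative, not a prescription:

```python
import time
from dataclasses import dataclass, field

@dataclass
class ActionGate:
    """Gate agent actions: log everything, pause anything material."""
    materiality_threshold: float = 500.0  # e.g. dollars at stake; illustrative
    audit_log: list = field(default_factory=list)
    review_queue: list = field(default_factory=list)

    def propose(self, action: str, amount: float) -> str:
        record = {"ts": time.time(), "action": action, "amount": amount}
        if amount >= self.materiality_threshold:
            record["status"] = "pending_review"   # a human decides
            self.review_queue.append(record)
        else:
            record["status"] = "auto_approved"    # safe to execute directly
        self.audit_log.append(record)             # every action leaves a trace
        return record["status"]

gate = ActionGate()
print(gate.propose("refund customer", 40.0))      # auto_approved
print(gate.propose("write off invoice", 5000.0))  # pending_review
print(len(gate.audit_log))                        # 2 (nothing runs unlogged)
```

The point of asking about this in a first conversation is not the code; it's whether the partner can describe something like it unprompted.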
A warning
Realistically, finding someone firing on all of these cylinders is hard. The combination is rare. Most agencies have two or three of these qualities. A few have four. The operators who have all of them exist, but you'll have to look for them.
Your job, as the business hiring this work, is to find a strong operator and trust them to lead the build. The person at the helm of your AI work is going to be responsible for implementing most of your technology and most of your operational solutions for the next few years. The leverage they have on your business is enormous. AI agents amplify that leverage further still.
The same compounding applies to the choice of partner. A small misjudgement in the data model becomes a quarterly reporting headache; a misjudged escalation rule becomes a customer churn problem. So the cost of picking the wrong partner is not just the cost of redoing the work; it's the cost of the operational decisions your business made on top of the wrong work.
This is why we say: pick a partner you'd hire twice. The litmus test is whether you'd want this person running your AI engineering team if they applied for a job. If yes, engage them. If no, keep looking.
What to ask in a first conversation
A short list to take into your first call with any prospective AI partner:
- "Walk me through a recent engagement. What was the business problem, what did you build, what's the current state, and what would you do differently?"
- "How do you decide whether a workflow is suitable for an AI agent versus a deterministic automation?"
- "What goes into your safety and observability layer?"
- "What's the engagement shape and how do you scope it?"
- "Who on your team will actually be doing the work, and what's their background?"
- "How do you document and hand over so we can run it ourselves later?"
- "What's your view on the R&D Tax Incentive for this work?" (Australia)
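If you want to pressure-test the answer to the agent-versus-automation question, here is one hypothetical rubric. The criteria are ours, not a standard, but a good operator's answer will usually hit the same three axes: can the rules be written down, can the output be checked, and does the work require judgment over unstructured input:

```python
def recommend_approach(rules_enumerable: bool,
                       output_verifiable: bool,
                       needs_judgment_on_unstructured_input: bool) -> str:
    """Toy rubric: one way an operator might frame the call. Illustrative only."""
    if rules_enumerable and not needs_judgment_on_unstructured_input:
        return "deterministic automation"   # cheaper, testable, no model drift
    if needs_judgment_on_unstructured_input and output_verifiable:
        return "agent with automated output checks"
    return "agent, with human review on every material action"

print(recommend_approach(True, True, False))   # deterministic automation
print(recommend_approach(False, True, True))   # agent with automated output checks
print(recommend_approach(False, False, True))  # agent, with human review on every material action
```

An answer that jumps straight to "agent" without asking whether the rules could simply be written down is itself a signal.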
You'll learn more from how they answer the first question than from the rest combined. A real operator tells you a story with specifics: names, numbers, what worked, what didn't. A pretender gives you abstractions.
The takeaway
An AI agency engagement is one of the highest-leverage decisions your business will make this decade. The person who runs it is going to shape how a meaningful chunk of your operations runs for years. Take the time to find an operator. Test them with concrete questions. Ask to see real shipped systems. Walk if anything feels off.
The good ones are out there, and they don't have to be from the firms you've already heard of. Often the best operators have small teams, deliberate practices, and a track record they can show you.