Map your own jagged frontier before you buy anything

AI is brilliant at one task and quietly poor at a near-identical one. Vendor demos cannot show you the edge. Here is a cheap, structured way to find where AI reliably helps in your business and where it silently fails.

Good Transformer · 6 min read

AI is uneven in a way that catches people out. It will draft a tricky client email beautifully, then get a simple sum wrong with total confidence. It will summarise a long document well, then invent a fact that was never in it. The capability is not a smooth slope from easy to hard. It is jagged: excellent on one task and quietly poor on a near-identical one that looks no harder. Ethan Mollick, one of the academics behind the term, calls this the "jagged frontier", and for a small firm it has a blunt practical consequence. You cannot tell from the outside where the edge sits. You have to find it in your own work.

This matters most at the point of buying. A vendor demo is built to show the tool on the side of the frontier where it shines. Your business lives on both sides. So before you commit money or a workflow to AI, spend a little time mapping where it reliably helps you and where it lets you down. The mapping is cheap. The mistake it prevents is not.

What "jagged" really means

The clearest evidence comes from a large field experiment. Researchers from Harvard and other institutions ran a pre-registered study with 758 Boston Consulting Group consultants. On tasks that sat inside the technology's strengths, consultants using GPT-4 completed 12.2% more tasks, worked 25.1% faster, and produced work rated around 40% higher in quality. The researchers also set a task chosen to sit just outside that range, the kind where the model is confidently wrong. On that task, consultants using the tool were roughly 19 percentage points less likely to reach a correct answer than colleagues working without it. Same firm, same model, same study. The only thing that changed was which side of the frontier the work sat on.

Mollick, who co-authored that study, makes the point that "unless you use AI a lot, you won't know which is which." The edge is real, it is specific to the task, and it is invisible until you test it. Dr Philippa Hardman, who studies how these tools land in real practice, reaches the same conclusion from her own field: the frontier "must be mapped" task by task, because it does not match anyone's intuition about what should be easy.

The frontier probe: five tasks on purpose

You do not map the edge by reading about it. You map it by running a handful of real tasks and watching closely. We call this the frontier probe. Pick five tasks from your actual week, chosen to spread across the kinds of work you do, and run each one through AI deliberately, with a real example, not a toy one.

A drafting task. Something written: a client update, a proposal section, a job advert. AI is usually strong here.

A summarising task. Feed it a long document or a messy thread and ask for the key points. Watch whether it keeps to what is actually there.

A reasoning task. Something with a definite right answer that needs a few steps: a quote, a rota, a calculation, a logic check. This is where the quiet failures cluster.

A research task. Ask it about your market, a regulation, a competitor. Check every fact it gives you against a source. This is where confident invention shows up.

A judgement task. Something that turns on taste or context: which of three approaches fits this client, how to phrase a difficult message. See whether it helps you think or just produces plausible filler.

Five tasks is enough to see the shape of the edge without turning the exercise into a project. The point is breadth, not volume.

Logging what you find

A probe is only useful if you write down what happened. Keep it crude. For each task, note three things: did the output need heavy fixing or light fixing, did it get anything confidently wrong, and would you trust it on this task next week without checking. A line per task. Ten minutes.
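If a spreadsheet feels like too much ceremony, the log can be sketched in a few lines of Python. This is only an illustration of the three questions above, not a prescribed tool: the `ProbeEntry` name, its fields and the two example rows are hypothetical, invented here to show the shape of a one-line-per-task record.

```python
import csv
import io
from dataclasses import dataclass, asdict, fields


@dataclass
class ProbeEntry:
    """One line of the probe log (hypothetical structure)."""
    task: str                # the real task you ran
    kind: str                # drafting, summarising, reasoning, research or judgement
    fixing: str              # "light" or "heavy"
    confidently_wrong: bool  # did it state anything wrong with confidence?
    trust_unchecked: bool    # would you trust it next week without checking?


def to_csv(entries):
    """Render the log as CSV text, one row per task."""
    buf = io.StringIO()
    writer = csv.DictWriter(
        buf,
        fieldnames=[f.name for f in fields(ProbeEntry)],
        lineterminator="\n",
    )
    writer.writeheader()
    for entry in entries:
        writer.writerow(asdict(entry))
    return buf.getvalue()


# Made-up example rows, for illustration only.
log = [
    ProbeEntry("Client update email", "drafting", "light", False, True),
    ProbeEntry("Quote for a three-day job", "reasoning", "heavy", True, False),
]

print(to_csv(log))
```

The output pastes straight into any spreadsheet. The format matters far less than filling in all three answers for every task while the result is still fresh.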

You learn the edge by walking it, not by watching a demo.

The pattern that emerges is more useful than any single result. Most small firms find a clean split. Drafting and summarising tend to land on the helpful side. Calculation, current facts and fine judgement tend to land on the other, where the work looks right and is sometimes wrong. That split is your map, and it is specific to your business and the way you work.

Turning the map into a do/don't list

The map is only worth making if it changes what you do. Turn it into two short lists.

A do list: the tasks where AI earned its place, where you would happily use it again and the failure modes are easy to catch. These are your safe starting points, and the ones worth building a proper habit around.

A don't-yet list: the tasks where it failed quietly or needed so much fixing it saved nothing. Not banned forever, but not trusted now, and never used without a human checking the output.

Two lists on one page tell your whole team where AI helps and where it is on probation. That is worth far more than a vendor's claim that it does everything.
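For anyone who kept their probe notes in a structured form, the sorting step can be sketched as below. The rule is an assumption made for illustration: a task goes on the don't-yet list if it was confidently wrong or needed heavy fixing, and on the do list otherwise. The field names and example tasks are hypothetical, and your own threshold may reasonably be stricter.

```python
def split_lists(log):
    """Sort logged tasks into a do list and a don't-yet list.

    Assumed rule: confidently wrong output or heavy fixing
    puts a task on probation; everything else is a safe start.
    """
    do, dont_yet = [], []
    for entry in log:
        failed_quietly = entry["confidently_wrong"]
        saved_nothing = entry["fixing"] == "heavy"
        if failed_quietly or saved_nothing:
            dont_yet.append(entry["task"])
        else:
            do.append(entry["task"])
    return do, dont_yet


# Made-up probe results, for illustration only.
log = [
    {"task": "Client update email", "fixing": "light", "confidently_wrong": False},
    {"task": "Summary of supplier thread", "fixing": "light", "confidently_wrong": False},
    {"task": "Quote for a three-day job", "fixing": "heavy", "confidently_wrong": True},
    {"task": "Check on a new regulation", "fixing": "light", "confidently_wrong": True},
]

do, dont_yet = split_lists(log)
print("Do:", do)
print("Don't yet:", dont_yet)
```

Note that the regulation check lands on the don't-yet list even though it needed little fixing. Under this rule a single confident error is enough, which matches the idea that quiet failures are the ones to guard against.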

The honest limits

The map decays. These tools change every few months, and a task that sat firmly on the don't-yet list can move onto the do list with a single model update. That is the argument for re-running a short probe every quarter rather than treating one map as settled. It is also the argument against deciding once that "AI can't do X" and never checking again, which is its own quiet risk.
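Comparing one quarter's map with the next can be as simple as a set comparison. The sketch below assumes each map is stored as two sets of task names under the hypothetical keys `do` and `dont_yet`; the function name and example tasks are likewise invented for illustration.

```python
def moved(previous, current):
    """Return tasks that changed list between two probes.

    promoted: were on don't-yet, now on do.
    demoted:  were on do, now on don't-yet.
    """
    promoted = sorted(previous["dont_yet"] & current["do"])
    demoted = sorted(previous["do"] & current["dont_yet"])
    return promoted, demoted


# Made-up maps from two quarterly probes, for illustration only.
last_quarter = {
    "do": {"Client update email", "Summary of supplier thread"},
    "dont_yet": {"Quote for a three-day job", "Check on a new regulation"},
}
this_quarter = {
    "do": {"Client update email", "Quote for a three-day job"},
    "dont_yet": {"Summary of supplier thread", "Check on a new regulation"},
}

promoted, demoted = moved(last_quarter, this_quarter)
print("Promoted:", promoted)
print("Demoted:", demoted)
```

Movement in either direction is worth noticing. A promoted task is a new place to build a habit, and a demoted one is a reminder that an update can also make a trusted task less reliable.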

One more caution. A probe of five tasks tells you about those five tasks. It is a sample, not a survey. Treat the do list as a set of confident starting points, not a complete account of everything AI could do for you. The goal is not a perfect map. It is enough of a map to stop you buying on a demo and to point your first real efforts at the ground where they will work.

What to do this week

Block half an hour. Pick your five tasks, run the probe with real examples, log a line each, and write the two lists. You will end the session knowing more about where AI fits your business than any sales call could tell you, and you will have spent nothing but the half hour.

Choosing where to point AI first, on evidence rather than hype, is the heart of the AI Reality Check Sprint: mapping the frontier across a team's real work and leaving them with a defensible order to work through. If that is the decision in front of you, book a business call and we will run it together.
