AI adoption, Use cases, Leadership

How to choose your first serious AI use case

The best first AI project is rarely the most futuristic. A five-part test, VALUE, for picking a use case that is frequent, worthwhile and safe to get wrong.

Good Transformer · 7 min read

The most common way to choose a first AI project is to pick the one that sounded most impressive in a demo. A system that reads every contract, an assistant that knows the whole business, an agent that handles the inbox end to end. It is the wrong instinct, and an expensive one, because the first project does not just succeed or fail on its own terms. It sets what everyone in the organisation comes to believe AI can do.

Choose something ambitious and brittle, and you teach your team that AI overpromises. Choose something useful and well-judged, and you build the appetite and the skill to go further. So the first use case is worth choosing deliberately, against criteria, rather than by whichever idea has the most gravity in the room.

Why the futuristic option usually disappoints

There is good evidence that where you point AI matters more than whether you use it. In a field experiment with 758 Boston Consulting Group consultants, Harvard and MIT researchers found that on tasks well suited to the technology, consultants using GPT-4 completed 12.2% more tasks, finished them 25.1% faster, and produced higher-quality work. On a task deliberately chosen to sit just outside what the model did well, the same tool made them 19 percentage points less likely to reach the right answer. The authors called this the "jagged frontier": capability that is excellent on one task and quietly poor on a neighbouring one that looks similar.

For a leader, the lesson is direct. The futuristic project tends to sit on the wrong side of that frontier, where the work is complex, the right answer is contested, and a confident-but-wrong system does real damage. The unglamorous project, the one you almost dismissed, often sits comfortably on the right side.

The frequency of the task matters too. In a separate study of 5,179 customer-support agents, generative AI raised productivity by 14% on average, and around 34% for the least experienced staff. The gains were largest where the task recurred constantly and where the AI could carry good practice from strong performers to weaker ones. A task you do twenty times a day is a better first bet than one you do twice a year, both because the total benefit is larger and because your team gets twenty chances a day to learn.
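The frequency argument is simple arithmetic, and a rough sketch makes the gap visible. Everything numeric here is an assumption for illustration only: the 230 working days, the three minutes saved per daily occurrence and the eight hours saved per rare occurrence are invented, not drawn from either study.

```python
# Illustrative arithmetic only: every figure below is an assumption.
WORKING_DAYS_PER_YEAR = 230  # assumed; adjust for your own calendar


def annual_repetitions(times_per_day: float = 0.0,
                       times_per_year: float = 0.0,
                       working_days: int = WORKING_DAYS_PER_YEAR) -> float:
    """How many times a task recurs in a year."""
    return times_per_day * working_days + times_per_year


def annual_hours_saved(repetitions: float, minutes_saved_each: float) -> float:
    """Total hours returned per year for a given saving per occurrence."""
    return repetitions * minutes_saved_each / 60


# A task done twenty times a day versus one done twice a year.
daily_task_reps = annual_repetitions(times_per_day=20)
rare_task_reps = annual_repetitions(times_per_year=2)

# Assumed savings: 3 minutes per daily occurrence, 8 hours per rare one.
daily_task_hours = annual_hours_saved(daily_task_reps, minutes_saved_each=3)
rare_task_hours = annual_hours_saved(rare_task_reps, minutes_saved_each=480)

print(f"Daily task: {daily_task_reps:.0f} repetitions, {daily_task_hours:.0f} hours saved")
print(f"Rare task:  {rare_task_reps:.0f} repetitions, {rare_task_hours:.0f} hours saved")
```

Under these assumed figures the small daily saving outweighs the large occasional one, and the team also gets thousands of repetitions a year to practise on instead of two.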

The VALUE test

When a leader asks me how to choose, I run candidate tasks through five questions. The first letters spell VALUE, which is the point: you are looking for the use case that converts effort into worth, not the one that converts budget into a press release.

Value. If this works, what is it actually worth? Be concrete: hours returned to fee-earning work, faster turnaround a client would notice, errors avoided. If you cannot describe the prize in a sentence, it is not your first project.

Activity. How often does the task happen? High-frequency work compounds: more total benefit, and far more repetitions for the team to build judgement. A daily task beats a quarterly one even if the quarterly one feels weightier.

Learnability. Can your people tell good output from bad, and get better at producing it? AI raises quality fastest where "good" is knowable and checkable. Where quality is a matter of fine professional judgement that takes years to acquire, expect slower, more supervised gains.

Uncertainty. How sure are you it will work, and how bad is it if it does not? This is the jagged frontier in practical form. Favour tasks that sit inside the technology's strengths and where a wrong output is caught easily and costs little. Save the high-uncertainty, high-consequence work for when you have learned more.

Ease. How hard is it to actually stand up? Consider the tools you already have, the data the task needs, who has to change what they do, and whether anyone has to grant access to sensitive systems. Ease is not everything, but a first project should not also be an IT project.

Pick the frequent, checkable, low-stakes task. That is where AI compounds.

What this looks like in practice

Take an accountancy firm weighing two ideas. The first is an "AI audit assistant" that reviews client records and flags issues. The second is a humble helper that turns messy meeting notes into structured client summaries and first-draft follow-up emails. The audit assistant scores high on Value but poorly on Uncertainty and Learnability: the consequences of a missed issue are severe and the judgement is hard to check at a glance. The summary helper is less exciting, but it is frequent, easy to verify, low-stakes, and runs on tools the firm already has. As a first serious use case, the second wins on VALUE, and it builds the confidence to attempt the first later. (Both are illustrative examples, not a specific firm.)

The pattern repeats across professional services. A recruitment agency is usually better starting with structured candidate summaries than with autonomous candidate outreach. A marketing agency is better starting with research synthesis and first-draft briefs than with fully automated client reporting. The boring choice is frequently the right one.

The honest limits

Two cautions. First, "safe to get wrong" must not collapse into "not worth doing". A first project that no one cares about teaches the team that AI is a toy. The Value test is there precisely to stop you choosing something trivial. Aim for a task that is genuinely useful and forgiving, not one that is merely forgiving.

Second, the frontier moves. A task that sits outside the technology's reliable range today may sit inside it in six months, and a scorecard is a snapshot, not a verdict for all time. Treat your VALUE scores as a decision you revisit, not a label you fix. This connects to a point worth holding throughout: time saved on a task is not the same as value to the business, which is the subject of the next piece in this series.

What to do next

List up to five candidate tasks, drawn from work your team actually does rather than from a vendor's feature list. Score each from one to five on Value, Activity, Learnability, Uncertainty and Ease, with a higher number always the better result (a five on Uncertainty means low risk, not high). Be suspicious of anything that scores high only on Value and low on everything else: that is the seductive, brittle project. Pick the highest total with a manageable downside, and commit to it as a real piece of work with an owner and a way to measure it.
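For teams that prefer a spreadsheet or a script to a worksheet, the scoring steps above can be sketched in a few lines of Python. This is a minimal sketch under stated assumptions, not the scorecard itself. It assumes every criterion is scored so that five is the favourable end (so a five on Uncertainty means low risk). The brittle-project thresholds and the minimum Uncertainty score are hypothetical. The scores given to the two accountancy ideas are invented for illustration.

```python
from dataclasses import dataclass
from typing import List, Optional

CRITERIA = ("value", "activity", "learnability", "uncertainty", "ease")


@dataclass(frozen=True)
class Candidate:
    """One candidate use case, scored 1-5 on each VALUE criterion.

    Assumption: 5 is always the favourable end, so uncertainty=5
    means "likely to work and cheap to get wrong".
    """
    name: str
    value: int
    activity: int
    learnability: int
    uncertainty: int
    ease: int

    def __post_init__(self) -> None:
        for criterion in CRITERIA:
            score = getattr(self, criterion)
            if not 1 <= score <= 5:
                raise ValueError(f"{criterion} must be 1-5, got {score}")

    def total(self) -> int:
        return sum(getattr(self, c) for c in CRITERIA)

    def is_brittle(self, high: int = 4, low: int = 2) -> bool:
        """High on Value, low on everything else (thresholds are assumed)."""
        others = [getattr(self, c) for c in CRITERIA if c != "value"]
        return self.value >= high and all(score <= low for score in others)


def pick_first_use_case(candidates: List[Candidate],
                        min_uncertainty: int = 3) -> Optional[Candidate]:
    """Highest total among candidates with a manageable downside.

    "Manageable" is modelled here as an Uncertainty score of at least
    min_uncertainty, with brittle candidates excluded (both assumptions).
    """
    eligible = [c for c in candidates
                if c.uncertainty >= min_uncertainty and not c.is_brittle()]
    return max(eligible, key=Candidate.total, default=None)


# Hypothetical scores for the two illustrative accountancy ideas.
audit = Candidate("AI audit assistant", value=5, activity=2,
                  learnability=2, uncertainty=1, ease=2)
helper = Candidate("Meeting-notes summary helper", value=3, activity=5,
                   learnability=4, uncertainty=5, ease=5)

choice = pick_first_use_case([audit, helper])
print(choice.name if choice else "No candidate has a manageable downside yet")
```

With these invented scores the summary helper is selected and the audit assistant is flagged as brittle. If no candidate clears the downside threshold, the function returns nothing instead of forcing a choice, which is a prompt to go back and look for more forgiving tasks.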

The tool

To make that comparison properly, I have built the AI Use-Case VALUE Scorecard: a worksheet that scores up to five candidate use cases across all five criteria, with guidance on what each score means, warning thresholds that flag a brittle choice, and a short recommendation section to capture the decision.

Download the AI Use-Case VALUE Scorecard (PDF)

Choosing well is also the heart of an AI Reality Check, the short engagement I run with teams: sorting the candidate use cases, scoring them honestly, and leaving you with a defensible order to work through. It builds naturally on knowing what to hand to AI in the first place, and on the durable AI literacy that lets a leader judge these calls without waiting for a demo.
