Article

Effective AI: stop prompting, start training

Ramon Mens
Ramon Mens | site manager
Article

At Media House, we had an AI process for text summaries that we just couldn't get under control. The prompt was extensive, the output reasonable, but editors were adjusting an average of 30%. The solution? Fine-tuning the model with 150 examples from our own CMS. Result: only 8% needed to be adjusted.

Same task, same editors, only a fraction of the work every day. How that's possible? Monkey see, monkey do.

The prompt as a stopgap

>

.

Most people assume that bad AI output means your prompt isn't good enough. So they tinker further. Add another example, another precondition, another sentence explaining what you actually mean. Before you know it, you're on two A4’s of prompt and the result is almost good. But then again, not always.

For single questions, that works fine. But if you want to perform the same type of task hundreds of times, in a specific style and with consistent output, it no longer works. Compare it to a new colleague who gives you a ten-page’s briefing every morning, instead of just working it in properly.

Provide rather than explain

In fine-tuning, you give a model not instruction but examples. Hundreds of input-output pairs that together say, this is how we do it here. The model picks up the pattern, the tone, the length, the structure. Without having to describe everything explicitly.

Take writing summaries. In a prompt, you have to lay out exactly what you want: three sentences maximum, start with the most important fact, no quotes, businesslike tone. In fine-tuning, you give the model a hundred summaries that you think are good. It sees for itself that they are short, that they open with the news, that there are no quotation marks in them. You don't have to write that down anymore.

Or take headlines. Just try to capture in a prompt what distinguishes a good Autovision headline from a good Metro headline. That's hard to put into words, even for experienced headline writers. But give a model 250 good headlines per title and the difference is there.

Creating a trained model is a chore. You have to collect and clean up examples. On the other hand: most CMS’s are full of useful data that you can export. It's ééone time hard work, versus every day polishing a prompt that just doesn't do what you want.

What it gets you in practice

Back to those summaries. Because at Mediahuis we store all editorial changes in the CMS, we were able to measure exactly how many editors were still polishing the AI output. With the expanded prompt: 30% adjustment on average. After fine-tuning on 150 examples: 8%. That saves time and frustration.

With article headlines, we see the same thing. Every title within Media House now has its own trained model, fed the best headlines as training data. No more endless prompt explaining what makes a good headline. The model has seen it and mimics it. Monkey see, monkey do.

When prompt, when fine-tune?

Fine-tuning is not the solution for everything. Aén press release rewrite? Just prompt. Brainstorming headlines? Prompt. Summarize an interview for internal use? Also prompt, fine.

If you find that you're fine-tuning the same prompt over and over again, for the same kind of task, and yet the results remain erratic? Then it pays to invest in a fine-tune. When explaining no longer works, start demonstrating. You can create your own trained model at OpenAI (https://developers.openai.com/api/docs/guides/supervised-fine-tuning/) but also at the European alternative Mistral https://docs.mistral.ai/capabilities/finetuning.