
6 fuckups when building AI agents

Six recurring mistakes when building AI agents: bad scoping, missing validation, untested prompts, runaway costs, unmeasured impact, and model decay.

6 min read

Key points

  • AI isn't a cure-all — without validation it makes up things that aren't in your data
  • Prompts aren't a game. They're a product artifact — test, log, version them
  • AI costs grow faster than you notice. The goal isn't a wow effect, but price/performance
  • Without measurable value, customers stop using AI — even when it works
  • Models change every two weeks. The architecture must allow swaps without rewriting the whole system

An AI agent looks simple at first glance — grab GPT-4, write a few prompts, add a UI, wrap it as SaaS, ship. The reality? More complicated. And more expensive.

In projects where I help companies deploy AI, I keep seeing the same dead ends. Not because the teams aren't capable — but because some problems aren't visible until you walk into them yourself.

Here's a selection of six fuckups I see in practice. Sharing them so you can avoid them — or at least know you're not alone.

1. Scope: AI isn't a cure-all

Problem: When defining an AI product, the key question is whether AI makes sense at all. And specifically: does it make sense at this particular step?

Example: A client wants an automatic quote builder based on catalog sheets and incoming inquiries. Great idea. But AI without validation makes up products the client doesn't even offer. The result is confusion and lost trust.

Fix: AI alone isn't enough. It has to be combined with full-text search, filtering, and validation at the business-logic level.
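
A minimal sketch of what that validation layer can look like, assuming a hypothetical product catalog and made-up field names: anything the model proposes is checked against the catalog before it reaches the quote, and near-misses fall back to fuzzy matching instead of being passed through.

```python
from difflib import get_close_matches

# Hypothetical catalog keyed by normalized product name.
CATALOG = {
    "steel bolt m8": {"sku": "SB-M8", "unit_price": 2.10},
    "steel bolt m10": {"sku": "SB-M10", "unit_price": 2.60},
    "brass washer 8 mm": {"sku": "BW-8", "unit_price": 0.40},
}

def validate_quote_items(proposed: list[str]) -> tuple[list[dict], list[str]]:
    """Accept only items the catalog actually contains; flag the rest for review."""
    accepted, rejected = [], []
    for item in proposed:
        key = item.strip().lower()
        if key in CATALOG:
            accepted.append(CATALOG[key])
            continue
        # Fuzzy match catches typos and close variants; everything else is
        # rejected instead of being silently sent to the customer.
        close = get_close_matches(key, list(CATALOG), n=1, cutoff=0.85)
        if close:
            accepted.append(CATALOG[close[0]])
        else:
            rejected.append(item)
    return accepted, rejected

# The invented "titanium mega-bolt" ends up in rejected, not in the quote.
accepted, rejected = validate_quote_items(["Steel bolt M8", "titanium mega-bolt"])
```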

2. Quality: garbage in, garbage out

Problem: When your data is bad or incomplete, no model will save you. The same goes for leaning on the model's own knowledge instead of RAG, a missing system prompt, or a prompt that's too generic or, conversely, over-engineered.

Fix: Prompts aren't a game. They're a product artifact. Test, log, version them. Or get advice from someone who has shipped a few AI projects.
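
A minimal sketch of treating prompts as versioned, logged artifacts; the file layout and names here are hypothetical. The point is that every call records which prompt version produced which output, so you can test against it and trace regressions.

```python
import hashlib
import json
import time
from pathlib import Path

PROMPT_DIR = Path("prompts")           # prompt templates live in version control
LOG_FILE = Path("prompt_calls.jsonl")  # one record per model call

def load_prompt(name: str) -> tuple[str, str]:
    """Return the prompt text plus a short hash identifying its version."""
    text = (PROMPT_DIR / f"{name}.txt").read_text(encoding="utf-8")
    return text, hashlib.sha256(text.encode()).hexdigest()[:8]

def log_call(prompt_name: str, version: str, user_input: str, output: str) -> None:
    """Append a call record so any regression can be traced to a prompt version."""
    record = {
        "ts": time.time(),
        "prompt": prompt_name,
        "version": version,
        "input": user_input,
        "output": output,
    }
    with LOG_FILE.open("a", encoding="utf-8") as f:
        f.write(json.dumps(record, ensure_ascii=False) + "\n")
```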

3. Cost: AI isn't free

Problem: Heavy token usage, transaction volume that grows with your dataset, output reranking, multimodal AI = costs fly up. And they do it before you notice.

Example: Reranking on one project originally cost tens of thousands of CZK a month. After switching to jina.ai, the cost dropped to hundreds.

Fix: Don't aim for the smartest AI in the world. The goal isn't a wow effect — it's the best price/performance ratio.
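
The arithmetic is worth doing before the first invoice arrives, not after. A rough sketch with made-up volumes and prices; plug in your own traffic and the current price lists:

```python
def monthly_cost(requests_per_day: int, tokens_per_request: int,
                 usd_per_million_tokens: float) -> float:
    """Rough monthly spend for one component of the pipeline."""
    tokens_per_month = requests_per_day * 30 * tokens_per_request
    return tokens_per_month / 1_000_000 * usd_per_million_tokens

# Illustrative numbers only: the same traffic, two very different bills.
premium = monthly_cost(2_000, 4_000, usd_per_million_tokens=10.0)  # ~2,400 USD/month
budget  = monthly_cost(2_000, 4_000, usd_per_million_tokens=0.1)   # ~24 USD/month
```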

4. Payments: who's paying for this?

Problem: "Let every customer bring their own OpenAI key." Reality? UX fail. And on the other hand: building your own billing and tokenization metering is expensive and demanding.

Example: A token-pricing model for a SaaS with 5 tiers, dozens of scenarios, and different models = weeks of work. And you haven't even started scaling yet.

Fix: For the MVP, forget detailed metering. A flat-rate model plus usage monitoring is plenty.
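
Usage monitoring for an MVP can be a few lines. A sketch with hypothetical names; the idea is to record token counts per customer and flag accounts that outgrow the flat rate, not to build billing infrastructure:

```python
from collections import defaultdict

usage_by_customer: dict[str, int] = defaultdict(int)  # in production, a DB table
SOFT_CAP_TOKENS = 5_000_000  # per customer per month on the flat-rate plan

def record_usage(customer_id: str, prompt_tokens: int, completion_tokens: int) -> None:
    """Add the token counts reported by the model API to the customer's running total."""
    usage_by_customer[customer_id] += prompt_tokens + completion_tokens

def over_cap(customer_id: str) -> bool:
    """Flag heavy users for an upgrade conversation; no hard blocking in the MVP."""
    return usage_by_customer[customer_id] > SOFT_CAP_TOKENS
```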

5. Measurement: AI without metrics is invisible

Problem: The customer often doesn't even know what AI brought them. If you don't show them, they won't know why to keep going.

Example: On one project we delivered an AI tool, but without explaining the value clients stopped using it. Not because it didn't work — but because they had no context.

Fix: Give the customer a demo environment, pre/post metrics, and a clear evaluation of value. And start with education.

6. Updates and decay: AI has a short shelf life

Problem: Models change every two weeks. Quality, latency, and cost all swing. What worked yesterday can be slow, expensive, or outdated today.

Example: We considered moving from GPT-4o to GPT-5 for a specific dataset. Result? GPT-5 was too slow. Gemini 2.5 Flash ended up giving the better quality/speed ratio.

Fix: The architecture must be flexible. Watch new releases, test models, swap them as needed. And plan for it in the roadmap.
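
One way to keep the swap cheap is a thin provider-agnostic wrapper, so the model is a config value rather than something hard-wired across the codebase. A minimal sketch; the adapter bodies are placeholders for whichever SDK calls you actually use:

```python
from typing import Protocol

class ChatModel(Protocol):
    def complete(self, system: str, user: str) -> str: ...

class OpenAIChat:
    """Adapter wrapping the OpenAI SDK behind the shared interface."""
    def __init__(self, model: str = "gpt-4o"):
        self.model = model
    def complete(self, system: str, user: str) -> str:
        raise NotImplementedError("call the OpenAI SDK here")

class GeminiChat:
    """Adapter wrapping the Gemini SDK behind the same interface."""
    def __init__(self, model: str = "gemini-2.5-flash"):
        self.model = model
    def complete(self, system: str, user: str) -> str:
        raise NotImplementedError("call the Gemini SDK here")

# Swapping providers becomes a config change, not a rewrite.
PROVIDERS = {"openai": OpenAIChat, "gemini": GeminiChat}

def get_model(provider: str) -> ChatModel:
    return PROVIDERS[provider]()
```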

In closing

The biggest problems when building AI agents don't happen in the code. They happen in bad product design. Insufficient input validation. Overestimating AI models. Missing measurement. And lack of common sense.

If you want to avoid these dead ends — or you recognized yourself in one — I'd be glad to talk it through. Or at least save you a few thousand extra tokens.

Schedule a consultation →
