
Context Is a Better Upgrade Than a Bigger Model

Most teams still overfocus on model choice. In practice, better results usually come from better context, cleaner workflows, and stronger system design.

Max Kelly | February 13, 2026 | 6 min read

Every few weeks, the AI conversation seems to reset around the same question:

Which model is best now?

That is not a useless question.

Model quality absolutely matters. Better models usually mean better judgment, better writing, and fewer obvious mistakes.

But I think a lot of people are blaming the wrong layer.

They are overestimating how much their results are limited by model choice and underestimating how much they are limited by system design.

In real workflows, the biggest quality jump usually does not come from moving from model A to model B.

It comes from giving the model better context.

That means:

  • cleaner inputs
  • better retrieval
  • less irrelevant information
  • clearer instructions
  • better tool access
  • better workflow structure

That is the layer most people are still underbuilding.

I have seen the same pattern many times now.

Someone tries AI inside a real workflow, gets mediocre output, assumes the model is the problem, switches models, gets a small improvement, then hits the same wall again.

Usually the wall is not intelligence.

It is setup.

Why people overfocus on the model

The model is the easiest thing to compare.

It has a name. It has benchmark scores. It has public hype cycles. It has a release date.

Context quality is harder to see.

There is no leaderboard for:

  • whether your notes are structured well
  • whether your agent is pulling the right files
  • whether your retrieval pipeline is noisy
  • whether the system is carrying too much stale information
  • whether your workflow is asking the model to do too much in one step

So people default to the visible variable.

But if your system is feeding the model bad context, the best model in the world will still underperform.

The common failure pattern

This is the pattern I keep seeing:

  1. Someone tries AI in a workflow.
  2. The output feels generic or wrong.
  3. They conclude the model is not good enough.
  4. They switch models.
  5. The results improve a little, but not enough.
  6. They conclude AI is overhyped.

Usually the real issue is upstream.

The system gave the model one or more of the following:

  • too little context
  • too much context
  • the wrong context
  • context in the wrong format
  • context with no prioritization
  • no memory of previous corrections

That is not a model problem.

That is a systems problem.

More context is not the same as better context

This is where a lot of teams still get sloppy.

They assume that if context helps, then more context must help more.

Usually it does not.

Long context windows are useful, but they are not magic.

If you keep shoveling every note, transcript, file, and conversation into the same prompt, quality often gets worse, not better.

Why?

Because the system now has to reason inside a bloated working set:

  • more noise
  • more contradictions
  • more stale information
  • more opportunities to latch onto the wrong detail

Good AI systems do not just accumulate context.

They manage it.

They decide:

  • what should always be present
  • what should only be retrieved when needed
  • what should be summarized
  • what should be dropped
  • what should be written back into memory for next time
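
Those decisions can be sketched in a few lines of Python. This is only an illustration under stated assumptions, not a reference implementation: the ContextItem fields (pinned, relevance, stale), the word-count token estimate, and the budget value are all hypothetical stand-ins for whatever your own system tracks.

```python
from dataclasses import dataclass


@dataclass
class ContextItem:
    text: str
    pinned: bool = False     # hypothetical flag: "should always be present"
    relevance: float = 0.0   # hypothetical retriever score between 0 and 1
    stale: bool = False      # hypothetical flag: outdated, should be dropped


def rough_tokens(text: str) -> int:
    # Crude assumption: one token per whitespace-separated word.
    return len(text.split())


def assemble_context(items: list[ContextItem], budget: int) -> list[str]:
    """Keep pinned items, then add the most relevant fresh items that fit."""
    fresh = [item for item in items if not item.stale]
    pinned = [item for item in fresh if item.pinned]
    optional = sorted(
        (item for item in fresh if not item.pinned),
        key=lambda item: item.relevance,
        reverse=True,
    )
    chosen: list[str] = []
    used = 0
    for item in pinned + optional:
        cost = rough_tokens(item.text)
        # Pinned items are always included, even if they exceed the budget.
        if item.pinned or used + cost <= budget:
            chosen.append(item.text)
            used += cost
    return chosen
```

The point of the sketch is the ordering of decisions: drop what is stale, keep what is pinned, and let everything else compete for a limited budget instead of being loaded by default.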

That is where a lot of the real leverage lives.

The best systems treat context like infrastructure

Most people still treat context like a prompt problem.

It is closer to an infrastructure problem.

You need to decide how information moves through the system.

For example:

  • What stable facts should always be available?
  • What recent work should be easy to recall?
  • What source documents should stay immutable?
  • What should be searchable instead of always loaded?
  • What corrections should persist across sessions?

Once you think that way, the quality conversation changes.

You stop asking:

Which model should I switch to?

And start asking:

Why is this system making the wrong thing easy for the model to see?

That is a much better question.

It is also the kind of question that actually leads to improvement.

Where context improvements usually beat model upgrades

In practice, I would usually look at these before reaching for a different model:

1. Retrieval quality

Is the system finding the right information at the right moment?

If retrieval is weak, the model is guessing too often.
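
One way to make retrieval quality visible is to score it against a small hand-labeled set. The sketch below computes precision and recall over the top k results; the document IDs and the idea of a hand-labeled "relevant" set are assumptions for illustration, not something any particular retrieval tool provides.

```python
def precision_recall_at_k(
    retrieved: list[str], relevant: set[str], k: int
) -> tuple[float, float]:
    """Score the top-k retrieved IDs against a hand-labeled relevant set."""
    top_k = retrieved[:k]
    hits = sum(1 for doc_id in top_k if doc_id in relevant)
    # Precision: how much of what was retrieved is useful.
    precision = hits / len(top_k) if top_k else 0.0
    # Recall: how much of what is useful was retrieved.
    recall = hits / len(relevant) if relevant else 0.0
    return precision, recall
```

Low precision means the model is reading noise. Low recall means it is missing what it needed and has to guess.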

2. Input structure

Are notes, specs, docs, and tasks stored in a form the system can actually use?

Messy source material produces messy outputs.

3. Working memory

Does the system know what it just did, what failed, what changed, and what the current objective is?

4. Persistent memory

Are useful preferences, corrections, and stable facts being saved anywhere?

If not, the system keeps relearning the same lesson from scratch.
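
A minimal sketch of what "saved anywhere" could mean, assuming a single JSON file is enough. The CorrectionStore class, its method names, and the file layout are all hypothetical; a real system might use a database or a notes file instead.

```python
import json
from pathlib import Path


class CorrectionStore:
    """Hypothetical file-backed store for corrections and preferences."""

    def __init__(self, path: Path) -> None:
        self.path = path

    def load(self) -> dict[str, str]:
        # A missing file simply means nothing has been remembered yet.
        if not self.path.exists():
            return {}
        return json.loads(self.path.read_text(encoding="utf-8"))

    def remember(self, key: str, note: str) -> None:
        # Later notes under the same key overwrite earlier ones.
        notes = self.load()
        notes[key] = note
        self.path.write_text(json.dumps(notes, indent=2), encoding="utf-8")

    def as_prompt_block(self) -> str:
        # Render the notes as a bullet list that can be injected into a prompt.
        notes = self.load()
        return "\n".join(f"- {key}: {note}" for key, note in sorted(notes.items()))
```

Even something this small changes the behavior: a correction made once is loaded into the next session instead of being repeated by hand.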

5. Workflow design

Are you asking one model to do too much in one shot?

Often the answer is not "better model."

It is "break the workflow into better steps."
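
As a rough sketch of that idea, the function below runs a task through a list of smaller prompts, feeding each output into the next step. The run_steps helper, the "{input}" placeholder convention, and the fake_model stand-in are assumptions made for illustration; swap in whatever model call you actually use.

```python
from typing import Callable

ModelFn = Callable[[str], str]


def run_steps(model: ModelFn, task: str, step_prompts: list[str]) -> list[str]:
    """Run each step prompt in order, passing the previous output forward."""
    outputs: list[str] = []
    previous = task
    for template in step_prompts:
        # Each template is expected to contain an "{input}" placeholder.
        prompt = template.format(input=previous)
        previous = model(prompt)
        outputs.append(previous)
    return outputs


def fake_model(prompt: str) -> str:
    # Stand-in so the sketch runs without any API; a real call would go here.
    return f"<{prompt}>"
```

Keeping every intermediate output also makes failures easier to locate, since you can see which step went wrong rather than only the final result.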

6. Tooling and harness quality

What tools can the system call? How is context loaded? How are results returned? How are errors handled?

Two systems can use the same model and produce very different output because the harness around the model is very different.
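
To make the error-handling part concrete, here is a small sketch of a tool-calling wrapper that always returns a structured result, so a failed call becomes information the model can act on instead of a crash. The call_tool function and the "status"/"detail" result shape are hypothetical conventions, not any framework's API.

```python
from typing import Callable

Tool = Callable[[str], str]


def call_tool(tools: dict[str, Tool], name: str, argument: str) -> dict[str, str]:
    """Return a structured result the model can read, even on failure."""
    tool = tools.get(name)
    if tool is None:
        return {"status": "error", "detail": f"unknown tool: {name}"}
    try:
        return {"status": "ok", "detail": tool(argument)}
    except Exception as exc:  # broad on purpose: report the failure, keep the loop alive
        return {"status": "error", "detail": f"{type(exc).__name__}: {exc}"}
```

A harness that swallows errors or returns raw stack traces gives the same model much less to work with than one that reports failures in a consistent shape.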

A useful mental model

If you want a simple mental model, think about output quality as something like:

model × context × workflow × tools

People obsess over the first variable because it is the easiest one to shop for.

But if the other three are weak, multiplying the first one only gets you so far.
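
A quick worked example of the product above, using made-up scores between 0 and 1. The numbers are purely illustrative assumptions, not measurements of any real model or system.

```python
def quality(model: float, context: float, workflow: float, tools: float) -> float:
    # Toy version of the mental model: the factors multiply.
    return model * context * workflow * tools


# Illustrative scores only, chosen to show the shape of the argument.
baseline = quality(model=0.8, context=0.5, workflow=0.5, tools=0.5)
better_model = quality(model=0.9, context=0.5, workflow=0.5, tools=0.5)
better_context = quality(model=0.8, context=0.8, workflow=0.5, tools=0.5)

print(round(baseline, 4), round(better_model, 4), round(better_context, 4))
```

With these toy numbers, upgrading the model moves the result from about 0.10 to about 0.11, while fixing context alone moves it to about 0.16. The weakest factors are where the cheapest gains are.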

This is also why the same model can feel mediocre in one product and excellent in another.

The intelligence layer may be similar.

The environment around it is not.

What to improve this week

If you are using AI in real work and the results feel inconsistent, I would not start by changing models.

I would audit the context layer.

Ask:

  1. What information does this system actually need to do the job well?
  2. What information is it carrying that does not help?
  3. What keeps getting repeated manually that should become persistent context?
  4. What sources are authoritative?
  5. What should be retrieved on demand instead of always injected?
  6. Where are outputs failing because the workflow is too compressed?

That exercise usually produces more value than another round of model shopping.

The real opportunity

The next wave of advantage is not going to come from having access to a frontier model.

That advantage compresses quickly as the same models become available to everyone.

The bigger advantage comes from building systems that:

  • know what matters
  • retrieve the right information
  • preserve useful memory
  • carry less noise
  • operate with better workflow discipline

That is harder to screenshot than a benchmark chart.

But it is where the real performance gains are coming from.

So yes, better models matter.

But if your context layer is weak, you are solving the wrong problem first.

That is the piece I think most people still have backwards.