Context rot is slowing down your AI agent: How to fix it

See how LogRocket's Galileo AI surfaces the most severe issues for you

No signup required

Check it out

Context rot is the gradual decline in an LLM or AI agent’s output quality as a session accumulates too much conversation history, stale instructions, tool output, failed attempts, and project context. It does not only happen when the context window is completely full. In practice, agent behavior can start degrading earlier as relevant information competes with noise.

Context rot is slowing down your AI agent: How to fix it

You have probably seen this if you use coding agents heavily. One minute, your agent is on a roll, writing code and generating useful tests. A few turns later, it forgets that you told it not to auto-commit changes, repeats a failed fix, or starts mixing together old and new requirements.

That degradation is not always a model capability problem. Often, it is a context management problem.

In this article, we’ll look at what causes context rot, how to recognize it in AI coding agents, and how to reduce it with fresh sessions, smaller prompts, prompt anchoring, compaction, persistent context files, plan files, and retrieval-augmented generation.

🚀 Sign up for The Replay newsletter

The Replay is a weekly newsletter for dev and engineering leaders.

Delivered once a week, it's your curated guide to the most important conversations around frontend dev, emerging AI tools, and the state of modern software.

What causes context rot?

The first time I noticed context rot was early in the AI boom. I would get stuck in loops with ChatGPT, asking it to fix something. It would say it had fixed the issue even when it had not. Sometimes, the next attempt made the problem worse.

At the time, the practical fix was simple: start a new conversation. Even before the term “context rot” was widely used, many users discovered that a fresh session often restored output quality.

The reason is that every model has a context window. You can think of the context window as the model’s active working memory for a conversation or task. It includes the user’s prompt, previous messages, tool output, system instructions, project files, agent rules, and any other information the model receives at runtime.

A context window has a fixed size, although that size varies by model.

[Insert image: LLM leaderboard from Artificial Analysis]

Modern models now offer much larger context windows, with some frontier models reaching into the million-token range. But larger context windows do not remove the tradeoff. They only give the model more room before the prompt becomes crowded.

The issue is that your input is not just what you type. In an agentic workflow, the model may also receive:

System prompts
Tool descriptions
Agent instructions
Retrieved files
Error logs
Terminal output
Prior failed attempts
Project-specific context files
Conversation summaries
Intermediate reasoning artifacts

As that context grows, the model has to decide which information matters now, which information is outdated, and which instructions should take priority. Long-context research has repeatedly shown that models do not always use long inputs uniformly. Chroma’s 2025 context rot research, for example, tested 18 LLMs and found nonuniform behavior as input length increased. The practical takeaway is clear: more context is not always better.

This is why context engineering matters. As covered in LogRocket’s guide to context engineering for IDEs: AGENTS.md and agent skills, coding agents perform better when they receive the right context at the right level of specificity, rather than every possible piece of project information at once.

How to recognize context rot

Context rot is difficult to spot because the model usually does not fail all at once. You notice friction before you understand what is happening.

A well-known early signal comes from the 2023 “Lost in the Middle” paper. The researchers found that models often perform best when relevant information appears at the beginning or end of the input, and worse when the relevant information appears in the middle of a long context.

This helps explain a common agent failure pattern: the agent remembers the original prompt and the latest message, but loses track of important decisions made in the middle of the session.

Here are the most common signs of context rot:

Symptom	What it looks like	Likely cause
Ignoring previous constraints	The agent stops following rules you set earlier, such as “do not auto-commit” or “avoid changing public APIs”	Important instructions are buried under newer context
Repeated mistakes	The agent keeps suggesting a fix you already rejected	Failed attempts remain in context and continue influencing the model
Increasing need to restate instructions	You have to paste the same rules every few messages	The session has lost a stable working memory
Declining output quality	Responses become shorter, more generic, or less precise	Relevant project details are competing with noise
Hallucinated recent facts	The model blends earlier and newer details into a false compromise	Conflicting instructions or stale context are unresolved
Broken reasoning chains	The agent handles step one and step four, but skips the logic in between	Multi-step task state is no longer coherent

These symptoms are especially common in long coding sessions because agents generate large amounts of intermediate context. They read files, run tests, inspect logs, make edits, backtrack, and try alternatives. All of that history can become useful context, but it can also become noise.

Proven ways to fix context rot

Context rot is not fully avoidable in long-running AI workflows. However, it is manageable. The goal is to treat context as a limited engineering resource, not an infinite bucket.

Over 200k developers use LogRocket to create better digital experiences

Learn more →

The table below summarizes the main techniques:

Technique	Best for	Main tradeoff
Fresh sessions	Switching tasks or recovering from degraded output	Requires a clean handoff summary
Smaller prompts	Keeping the agent focused on the current step	Requires more deliberate task planning
Trimmed inputs	Debugging with logs, stack traces, or large files	Requires manual cleanup before prompting
Prompt anchoring	Preserving critical constraints	Helps, but does not guarantee compliance
Compaction	Continuing long sessions with less history	Summaries may omit important details
Persistent context files	Reusing project rules across sessions	Large files can become their own context burden
Plan files	Long-running implementation tasks	Requires the agent to keep the plan updated
RAG or vector search	Large documentation or knowledge bases	Adds infrastructure and retrieval tuning overhead

Start fresh sessions

Starting a fresh session is still the simplest way to recover from context rot.

Each session should map to a specific task. For example, a session focused on debugging a production error should end once that error is understood and fixed. A new refactor, feature, or architecture decision should start in a separate session with a concise summary of the previous outcome.

This works because a new session starts with a cleaner context window. The model is no longer carrying every failed attempt, irrelevant file, stale assumption, or abandoned direction from the previous thread.

Fresh sessions work best when you pair them with a short handoff summary:

We fixed the ProductList rendering issue. The root cause was that products was undefined before the API response resolved. The component now defaults products to an empty array and renders a loading state while fetching. Next, we need to add tests for empty, loading, and populated states.

This keeps the useful context while dropping the noise.

Keep context small and focused

Even if you dedicate each session to one task, complexity still matters. Dumping the entire project roadmap into the prompt can overload the session before the agent has done any useful work.

For example, this prompt is likely to create context problems:

You are an expert developer. I need you to build a full-stack ecommerce app. First, design the database schema for users and products. Then, write the authentication logic using JWT. After that, create the React components for the storefront, and finally, set up the Stripe API for payments. Start with the database.

The issue is not that the request is impossible. The issue is that the agent now has to track database design, authentication, frontend architecture, and payments while it is supposed to focus only on the schema.

A stronger version scopes the immediate milestone:

You are an expert developer. Phase 1 is the database schema. Focus only on designing SQL tables for users and products. Do not write frontend, authentication, or payment code yet.

By narrowing the prompt, you reduce the amount of irrelevant information competing for attention. This also makes the output easier to evaluate because the agent has a clear boundary.

For larger teams, this same idea applies at the workflow level. LogRocket’s AI-assisted development governance guide makes a similar point: AI coding tools need project context, but they also need guardrails that keep work scoped, reviewable, and enforceable.

Strip unnecessary content when pasting input

Pasted input can quietly accelerate context rot. Error logs, stack traces, generated files, and copied documentation often include far more detail than the agent needs.

Consider this error log:

(node:12345) [DEP0040] DeprecationWarning: The punycode module is deprecated.
[10:42:01] Starting 'build'...
[10:42:05] Finished 'build' after 4.2 s
TypeError: Cannot read properties of undefined (reading 'map')
    at ProductList (src/components/ProductList.tsx:14:25)
    at renderWithHooks (node_modules/react-dom/cjs/react-dom.development.js:15486:18)
...[150 more lines of internal stack trace]...

For this issue, the agent probably needs only the error type, the file, and the line number:

TypeError: Cannot read properties of undefined (reading 'map')
    at ProductList (src/components/ProductList.tsx:14:25)

The rest of the stack trace may be useful in some cases, but it usually adds noise. Before pasting large outputs into an agent session, remove:

Unrelated warnings
Repeated stack frames
Old logs from previous runs
Generated code
Large dependency output
Comments unrelated to the bug
Files the agent does not need to inspect yet

This is a simple habit, but it has a large impact. Clean input gives the model fewer irrelevant tokens to weigh.

Use prompt anchoring for critical rules

Models often pay more attention to the beginning and end of a prompt than to information buried in the middle. You can use this tendency by placing critical constraints at the edges of your prompt.

For example:

SESSION RULE:
Do not auto-commit changes. Wait for explicit approval before committing.

TASK:
Refactor the authentication module to reduce duplicated validation logic.

REMINDER:
Do not auto-commit changes unless I explicitly instruct you to commit.

Prompt anchoring is especially useful for non-negotiable constraints, such as:

Do not commit changes
Do not modify database migrations
Do not change public API names
Do not install new dependencies
Do not edit generated files
Ask before deleting files

This technique is not a guarantee. If a rule is truly critical, pair it with tool permissions, hooks, tests, or repository-level policies. But anchoring still improves the odds that the model will preserve the rule across a longer task.

Compact and summarize long sessions

Compaction replaces part of the conversation history with a concise summary. The goal is to preserve the important decisions, current task state, and next steps while removing noisy intermediate turns.

This is useful when a session is still valuable, but the accumulated history is starting to work against you.

Tools such as Claude Code and OpenAI Codex support compaction workflows. Claude Code documents /compact as a slash command, and the Codex CLI docs describe /compact as a way to summarize the conversation so far and replace earlier turns with a concise summary.

If your agent does not support compaction directly, you can do it manually with a prompt like this:

We are about to move this task to a fresh session. Please summarize the current state of the work.

Include:
- The original goal
- The files changed so far
- Key technical decisions
- Current architecture
- Bugs fixed
- Tests added or still needed
- The exact next step

Exclude:
- Failed approaches that are no longer relevant
- Repeated error logs
- Old implementation ideas we rejected

Then copy the summary into a new session.

The key is to make the summary operational, not conversational. A good compaction summary should tell the next session what to do, what not to repeat, and what state the project is currently in.

Manually compact at natural checkpoints

Automatic compaction can happen at awkward moments, depending on the tool. It may trigger in the middle of debugging, planning, or implementation. Even when the compaction works, the resulting summary may omit details you were still using.

A better habit is to compact manually at natural checkpoints:

After a bug is diagnosed
After a design decision is made
After a test suite passes
After a feature milestone is complete
Before switching from planning to implementation
Before starting a risky refactor

For example, instead of waiting until the agent is already struggling, you can run /compact after finishing the schema design and before moving into authentication work.

This makes the summary cleaner because the task state is easier to describe. It also reduces the chance that unresolved errors, half-finished ideas, or temporary debugging output will become part of the agent’s long-term working context.

Use persistent context files

Fresh sessions and compaction help manage individual conversations. Persistent context files help preserve project knowledge across sessions.

These files are usually Markdown files such as CLAUDE.md, AGENTS.md, or tool-specific equivalents. They live in your project directory and tell the agent how to work inside the repository.

A useful context file might include:

Project architecture
Key dependencies and versions
Build and test commands
Naming conventions
Folder structure
Security constraints
Files the agent should not edit
Review steps before completion

For example:

# Project context: Ecommerce MCP dashboard

## Tech stack

- Framework: Next.js 15 App Router
- Language: TypeScript
- Database: Prisma with PostgreSQL
- Styling: Tailwind CSS and shadcn/ui

## Core rules

- Use Server Components by default.
- Add `use client` only when client-side interactivity is required.
- Follow the feature-based folder structure in `src/features`.
- All API routes must use the error-handling wrapper in `src/lib/api-wrapper.ts`.
- Do not modify database migrations without explicit approval.

## Build and test

- Run the build with `npm run build`
- Run tests with `npm test`

Persistent files give each new session a stable baseline. Instead of pasting the same rules repeatedly, you encode them once in the repository.

However, these files can also become bloated. A long CLAUDE.md or AGENTS.md can create the same problem it is supposed to solve: too much context, not enough signal. Keep these files lean, specific, and current.

For more on structuring these files, see LogRocket’s guide to context engineering for IDEs: AGENTS.md and agent skills.

Use a plan file for long-running tasks

For complex implementation work, a persistent context file is not enough. You also need a progress tracker.

A plan file, often named plan.md, PLAN.md, or PLANS.md, gives the agent a structured place to record what it is doing and what remains. This is especially useful when you compact the session, restart the agent, or move the task into a fresh conversation.

A practical plan file should include:

Section	Purpose
Goal	Defines the user-visible outcome
Current state	Explains the relevant architecture and files
Milestones	Breaks the task into implementation stages
Progress	Tracks completed and remaining work
Decision log	Records important choices and why they were made
Validation	Lists the exact commands and expected results
Next action	Tells the agent where to resume

A simple version might look like this:

# Plan: Add saved payment methods

## Goal

Allow users to save a payment method and select it during checkout.

## Current state

Checkout currently creates a one-time Stripe payment intent in `src/features/checkout/actions/create-payment.ts`.

## Milestones

1. Add a saved payment method table
2. Add an API action for listing saved methods
3. Update checkout UI to select a saved method
4. Add tests for empty and populated states

## Progress

- [x] Reviewed current checkout flow
- [x] Identified Stripe integration point
- [ ] Add database model
- [ ] Add UI state handling

## Decision log

- Use a separate `SavedPaymentMethod` table instead of storing metadata on `User` because users may have multiple saved methods.

## Validation

Run:

    npm test
    npm run build

## Next action

Add the `SavedPaymentMethod` Prisma model and generate the migration.

With this setup, you can tell the agent:

Continue from plan.md. Read the current progress, identify the next incomplete milestone, and implement only that step.

OpenAI’s Cookbook describes a more advanced version of this pattern with PLANS.md for multi-hour Codex tasks. The important idea is that the plan becomes a living document. It should be updated as the agent learns, changes direction, or completes milestones.

Externalize memory with RAG and vector databases

Persistent context files work well for concise project rules, but they do not scale cleanly to large documentation sets. If your agent has to read hundreds of pages of product specs, architecture docs, support tickets, or API references, loading everything into the prompt is inefficient.

This is where retrieval-augmented generation, or RAG, can help.

RAG uses external data retrieval to give the model only the most relevant chunks of information at runtime. A typical RAG workflow looks like this:

Split documents into smaller chunks
Create embeddings for those chunks
Store them in a vector database
Retrieve relevant chunks based on the current task
Feed only those chunks into the agent’s context

The advantage is that the agent no longer needs to ingest every document at once. It can retrieve the specific pieces of context needed for the current task.

For agentic systems, retrieval can also be exposed through tools, including MCP servers. LogRocket’s tutorial on building your first MCP server with Node.js shows how MCP can give models structured access to files, databases, and APIs.

However, RAG is not a default solution for every project. It adds infrastructure, retrieval tuning, chunking decisions, and quality-control work. It can also introduce its own failure modes if stale or low-authority documents are retrieved alongside current guidance.

Use RAG when:

The project has too much documentation for context files
The agent needs to query large internal knowledge bases
Different tasks require different subsets of context
Documentation changes frequently
You need runtime retrieval rather than static instructions

For smaller projects, a concise AGENTS.md, CLAUDE.md, or plan file is usually enough.

Best practices for preventing context rot

The techniques above help recover from context rot, but the better strategy is to slow it down from the beginning.

Here are the habits that matter most:

Practice	How it helps
Use atomic workflows	Smaller tasks create less context noise
Start new sessions by task	Each session begins with a cleaner working set
Trim pasted input	The model receives only what it needs
Compact at milestones	Summaries are cleaner and more useful
Keep context files lean	Persistent rules stay high-signal
Maintain a plan file	Long tasks remain resumable
Separate rules from references	Agents can distinguish instructions from documentation
Validate with tests and diffs	Correctness does not depend on the agent’s memory

One useful rule is “one task, one working context.” If a session starts as a debugging session, do not let it turn into a refactor, then a redesign, then a documentation update. Finish the task, summarize the result, and start a new session for the next scope.

This is also where multi-agent workflows need caution. Splitting work across agents can help, but only when each agent has enough shared context to coordinate safely. As LogRocket’s experiment on whether splitting work across AI agents actually saves time showed, parallelism can shift the bottleneck to review if the agents are not aligned.

Conclusion

Context rot is a side effect of how LLMs and AI agents process long, noisy, and changing inputs. Larger context windows help, but they do not eliminate the need for context management.

The practical fix is not to give the model everything. It is to give the model the right information at the right time.

For day-to-day work, that means starting fresh sessions when tasks change, keeping prompts focused, trimming pasted logs, anchoring critical rules, and compacting before the session gets messy. For larger projects, it means externalizing stable knowledge into CLAUDE.md, AGENTS.md, plan files, or retrieval systems.

When you treat context as a limited resource, AI agents become more reliable. They make fewer repeated mistakes, follow constraints more consistently, and recover more cleanly when a task spans multiple sessions.