Context rot is the gradual decline in an LLM or AI agent’s output quality as a session accumulates too much conversation history, stale instructions, tool output, failed attempts, and project context. It does not only happen when the context window is completely full. In practice, agent behavior can start degrading earlier as relevant information competes with noise.
You have probably seen this if you use coding agents heavily. One minute, your agent is on a roll, writing code and generating useful tests. A few turns later, it forgets that you told it not to auto-commit changes, repeats a failed fix, or starts mixing together old and new requirements.
That degradation is not always a model capability problem. Often, it is a context management problem.
In this article, we’ll look at what causes context rot, how to recognize it in AI coding agents, and how to reduce it with fresh sessions, smaller prompts, prompt anchoring, compaction, persistent context files, plan files, and retrieval-augmented generation.
The Replay is a weekly newsletter for dev and engineering leaders.
Delivered once a week, it's your curated guide to the most important conversations around frontend dev, emerging AI tools, and the state of modern software.
The first time I noticed context rot was early in the AI boom. I would get stuck in loops with ChatGPT, asking it to fix something. It would say it had fixed the issue even when it had not. Sometimes, the next attempt made the problem worse.
At the time, the practical fix was simple: start a new conversation. Even before the term “context rot” was widely used, many users discovered that a fresh session often restored output quality.
The reason is that every model has a context window. You can think of the context window as the model’s active working memory for a conversation or task. It includes the user’s prompt, previous messages, tool output, system instructions, project files, agent rules, and any other information the model receives at runtime.
A context window has a fixed size, although that size varies by model.
[Insert image: LLM leaderboard from Artificial Analysis]
Modern models now offer much larger context windows, with some frontier models reaching into the million-token range. But larger context windows do not remove the tradeoff. They only give the model more room before the prompt becomes crowded.
The issue is that your input is not just what you type. In an agentic workflow, the model may also receive:
As that context grows, the model has to decide which information matters now, which information is outdated, and which instructions should take priority. Long-context research has repeatedly shown that models do not always use long inputs uniformly. Chroma’s 2025 context rot research, for example, tested 18 LLMs and found nonuniform behavior as input length increased. The practical takeaway is clear: more context is not always better.
This is why context engineering matters. As covered in LogRocket’s guide to context engineering for IDEs: AGENTS.md and agent skills, coding agents perform better when they receive the right context at the right level of specificity, rather than every possible piece of project information at once.
Context rot is difficult to spot because the model usually does not fail all at once. You notice friction before you understand what is happening.
A well-known early signal comes from the 2023 “Lost in the Middle” paper. The researchers found that models often perform best when relevant information appears at the beginning or end of the input, and worse when the relevant information appears in the middle of a long context.
This helps explain a common agent failure pattern: the agent remembers the original prompt and the latest message, but loses track of important decisions made in the middle of the session.
Here are the most common signs of context rot:
| Symptom | What it looks like | Likely cause |
|---|---|---|
| Ignoring previous constraints | The agent stops following rules you set earlier, such as “do not auto-commit” or “avoid changing public APIs” | Important instructions are buried under newer context |
| Repeated mistakes | The agent keeps suggesting a fix you already rejected | Failed attempts remain in context and continue influencing the model |
| Increasing need to restate instructions | You have to paste the same rules every few messages | The session has lost a stable working memory |
| Declining output quality | Responses become shorter, more generic, or less precise | Relevant project details are competing with noise |
| Hallucinated recent facts | The model blends earlier and newer details into a false compromise | Conflicting instructions or stale context are unresolved |
| Broken reasoning chains | The agent handles step one and step four, but skips the logic in between | Multi-step task state is no longer coherent |
These symptoms are especially common in long coding sessions because agents generate large amounts of intermediate context. They read files, run tests, inspect logs, make edits, backtrack, and try alternatives. All of that history can become useful context, but it can also become noise.
Context rot is not fully avoidable in long-running AI workflows. However, it is manageable. The goal is to treat context as a limited engineering resource, not an infinite bucket.
The table below summarizes the main techniques:
| Technique | Best for | Main tradeoff |
|---|---|---|
| Fresh sessions | Switching tasks or recovering from degraded output | Requires a clean handoff summary |
| Smaller prompts | Keeping the agent focused on the current step | Requires more deliberate task planning |
| Trimmed inputs | Debugging with logs, stack traces, or large files | Requires manual cleanup before prompting |
| Prompt anchoring | Preserving critical constraints | Helps, but does not guarantee compliance |
| Compaction | Continuing long sessions with less history | Summaries may omit important details |
| Persistent context files | Reusing project rules across sessions | Large files can become their own context burden |
| Plan files | Long-running implementation tasks | Requires the agent to keep the plan updated |
| RAG or vector search | Large documentation or knowledge bases | Adds infrastructure and retrieval tuning overhead |
Starting a fresh session is still the simplest way to recover from context rot.
Each session should map to a specific task. For example, a session focused on debugging a production error should end once that error is understood and fixed. A new refactor, feature, or architecture decision should start in a separate session with a concise summary of the previous outcome.
This works because a new session starts with a cleaner context window. The model is no longer carrying every failed attempt, irrelevant file, stale assumption, or abandoned direction from the previous thread.
Fresh sessions work best when you pair them with a short handoff summary:
We fixed the ProductList rendering issue. The root cause was that products was undefined before the API response resolved. The component now defaults products to an empty array and renders a loading state while fetching. Next, we need to add tests for empty, loading, and populated states.
This keeps the useful context while dropping the noise.
Even if you dedicate each session to one task, complexity still matters. Dumping the entire project roadmap into the prompt can overload the session before the agent has done any useful work.
For example, this prompt is likely to create context problems:
You are an expert developer. I need you to build a full-stack ecommerce app. First, design the database schema for users and products. Then, write the authentication logic using JWT. After that, create the React components for the storefront, and finally, set up the Stripe API for payments. Start with the database.
The issue is not that the request is impossible. The issue is that the agent now has to track database design, authentication, frontend architecture, and payments while it is supposed to focus only on the schema.
A stronger version scopes the immediate milestone:
You are an expert developer. Phase 1 is the database schema. Focus only on designing SQL tables for users and products. Do not write frontend, authentication, or payment code yet.
By narrowing the prompt, you reduce the amount of irrelevant information competing for attention. This also makes the output easier to evaluate because the agent has a clear boundary.
For larger teams, this same idea applies at the workflow level. LogRocket’s AI-assisted development governance guide makes a similar point: AI coding tools need project context, but they also need guardrails that keep work scoped, reviewable, and enforceable.
Pasted input can quietly accelerate context rot. Error logs, stack traces, generated files, and copied documentation often include far more detail than the agent needs.
Consider this error log:
(node:12345) [DEP0040] DeprecationWarning: The punycode module is deprecated.
[10:42:01] Starting 'build'...
[10:42:05] Finished 'build' after 4.2 s
TypeError: Cannot read properties of undefined (reading 'map')
at ProductList (src/components/ProductList.tsx:14:25)
at renderWithHooks (node_modules/react-dom/cjs/react-dom.development.js:15486:18)
...[150 more lines of internal stack trace]...
For this issue, the agent probably needs only the error type, the file, and the line number:
TypeError: Cannot read properties of undefined (reading 'map')
at ProductList (src/components/ProductList.tsx:14:25)
The rest of the stack trace may be useful in some cases, but it usually adds noise. Before pasting large outputs into an agent session, remove:
This is a simple habit, but it has a large impact. Clean input gives the model fewer irrelevant tokens to weigh.
Models often pay more attention to the beginning and end of a prompt than to information buried in the middle. You can use this tendency by placing critical constraints at the edges of your prompt.
For example:
SESSION RULE: Do not auto-commit changes. Wait for explicit approval before committing. TASK: Refactor the authentication module to reduce duplicated validation logic. REMINDER: Do not auto-commit changes unless I explicitly instruct you to commit.
Prompt anchoring is especially useful for non-negotiable constraints, such as:
This technique is not a guarantee. If a rule is truly critical, pair it with tool permissions, hooks, tests, or repository-level policies. But anchoring still improves the odds that the model will preserve the rule across a longer task.
Compaction replaces part of the conversation history with a concise summary. The goal is to preserve the important decisions, current task state, and next steps while removing noisy intermediate turns.
This is useful when a session is still valuable, but the accumulated history is starting to work against you.
Tools such as Claude Code and OpenAI Codex support compaction workflows. Claude Code documents /compact as a slash command, and the Codex CLI docs describe /compact as a way to summarize the conversation so far and replace earlier turns with a concise summary.
If your agent does not support compaction directly, you can do it manually with a prompt like this:
We are about to move this task to a fresh session. Please summarize the current state of the work. Include: - The original goal - The files changed so far - Key technical decisions - Current architecture - Bugs fixed - Tests added or still needed - The exact next step Exclude: - Failed approaches that are no longer relevant - Repeated error logs - Old implementation ideas we rejected
Then copy the summary into a new session.
The key is to make the summary operational, not conversational. A good compaction summary should tell the next session what to do, what not to repeat, and what state the project is currently in.
Automatic compaction can happen at awkward moments, depending on the tool. It may trigger in the middle of debugging, planning, or implementation. Even when the compaction works, the resulting summary may omit details you were still using.
A better habit is to compact manually at natural checkpoints:
For example, instead of waiting until the agent is already struggling, you can run /compact after finishing the schema design and before moving into authentication work.
This makes the summary cleaner because the task state is easier to describe. It also reduces the chance that unresolved errors, half-finished ideas, or temporary debugging output will become part of the agent’s long-term working context.
Fresh sessions and compaction help manage individual conversations. Persistent context files help preserve project knowledge across sessions.
These files are usually Markdown files such as CLAUDE.md, AGENTS.md, or tool-specific equivalents. They live in your project directory and tell the agent how to work inside the repository.
A useful context file might include:
For example:
# Project context: Ecommerce MCP dashboard ## Tech stack - Framework: Next.js 15 App Router - Language: TypeScript - Database: Prisma with PostgreSQL - Styling: Tailwind CSS and shadcn/ui ## Core rules - Use Server Components by default. - Add `use client` only when client-side interactivity is required. - Follow the feature-based folder structure in `src/features`. - All API routes must use the error-handling wrapper in `src/lib/api-wrapper.ts`. - Do not modify database migrations without explicit approval. ## Build and test - Run the build with `npm run build` - Run tests with `npm test`
Persistent files give each new session a stable baseline. Instead of pasting the same rules repeatedly, you encode them once in the repository.
However, these files can also become bloated. A long CLAUDE.md or AGENTS.md can create the same problem it is supposed to solve: too much context, not enough signal. Keep these files lean, specific, and current.
For more on structuring these files, see LogRocket’s guide to context engineering for IDEs: AGENTS.md and agent skills.
For complex implementation work, a persistent context file is not enough. You also need a progress tracker.
A plan file, often named plan.md, PLAN.md, or PLANS.md, gives the agent a structured place to record what it is doing and what remains. This is especially useful when you compact the session, restart the agent, or move the task into a fresh conversation.
A practical plan file should include:
| Section | Purpose |
|---|---|
| Goal | Defines the user-visible outcome |
| Current state | Explains the relevant architecture and files |
| Milestones | Breaks the task into implementation stages |
| Progress | Tracks completed and remaining work |
| Decision log | Records important choices and why they were made |
| Validation | Lists the exact commands and expected results |
| Next action | Tells the agent where to resume |
A simple version might look like this:
# Plan: Add saved payment methods
## Goal
Allow users to save a payment method and select it during checkout.
## Current state
Checkout currently creates a one-time Stripe payment intent in `src/features/checkout/actions/create-payment.ts`.
## Milestones
1. Add a saved payment method table
2. Add an API action for listing saved methods
3. Update checkout UI to select a saved method
4. Add tests for empty and populated states
## Progress
- [x] Reviewed current checkout flow
- [x] Identified Stripe integration point
- [ ] Add database model
- [ ] Add UI state handling
## Decision log
- Use a separate `SavedPaymentMethod` table instead of storing metadata on `User` because users may have multiple saved methods.
## Validation
Run:
npm test
npm run build
## Next action
Add the `SavedPaymentMethod` Prisma model and generate the migration.
With this setup, you can tell the agent:
Continue from plan.md. Read the current progress, identify the next incomplete milestone, and implement only that step.
OpenAI’s Cookbook describes a more advanced version of this pattern with PLANS.md for multi-hour Codex tasks. The important idea is that the plan becomes a living document. It should be updated as the agent learns, changes direction, or completes milestones.
Persistent context files work well for concise project rules, but they do not scale cleanly to large documentation sets. If your agent has to read hundreds of pages of product specs, architecture docs, support tickets, or API references, loading everything into the prompt is inefficient.
This is where retrieval-augmented generation, or RAG, can help.
RAG uses external data retrieval to give the model only the most relevant chunks of information at runtime. A typical RAG workflow looks like this:
The advantage is that the agent no longer needs to ingest every document at once. It can retrieve the specific pieces of context needed for the current task.
For agentic systems, retrieval can also be exposed through tools, including MCP servers. LogRocket’s tutorial on building your first MCP server with Node.js shows how MCP can give models structured access to files, databases, and APIs.
However, RAG is not a default solution for every project. It adds infrastructure, retrieval tuning, chunking decisions, and quality-control work. It can also introduce its own failure modes if stale or low-authority documents are retrieved alongside current guidance.
Use RAG when:
For smaller projects, a concise AGENTS.md, CLAUDE.md, or plan file is usually enough.
The techniques above help recover from context rot, but the better strategy is to slow it down from the beginning.
Here are the habits that matter most:
| Practice | How it helps |
|---|---|
| Use atomic workflows | Smaller tasks create less context noise |
| Start new sessions by task | Each session begins with a cleaner working set |
| Trim pasted input | The model receives only what it needs |
| Compact at milestones | Summaries are cleaner and more useful |
| Keep context files lean | Persistent rules stay high-signal |
| Maintain a plan file | Long tasks remain resumable |
| Separate rules from references | Agents can distinguish instructions from documentation |
| Validate with tests and diffs | Correctness does not depend on the agent’s memory |
One useful rule is “one task, one working context.” If a session starts as a debugging session, do not let it turn into a refactor, then a redesign, then a documentation update. Finish the task, summarize the result, and start a new session for the next scope.
This is also where multi-agent workflows need caution. Splitting work across agents can help, but only when each agent has enough shared context to coordinate safely. As LogRocket’s experiment on whether splitting work across AI agents actually saves time showed, parallelism can shift the bottleneck to review if the agents are not aligned.
Context rot is a side effect of how LLMs and AI agents process long, noisy, and changing inputs. Larger context windows help, but they do not eliminate the need for context management.
The practical fix is not to give the model everything. It is to give the model the right information at the right time.
For day-to-day work, that means starting fresh sessions when tasks change, keeping prompts focused, trimming pasted logs, anchoring critical rules, and compacting before the session gets messy. For larger projects, it means externalizing stable knowledge into CLAUDE.md, AGENTS.md, plan files, or retrieval systems.
When you treat context as a limited resource, AI agents become more reliable. They make fewer repeated mistakes, follow constraints more consistently, and recover more cleanly when a task spans multiple sessions.

Learn about TypeScript v6’s breaking changes, new ES2025 features, and deprecated options. A complete migration guide from v5 to prepare for v7.

Learn how Vite+ unifies Vite, Vitest, Oxlint, Oxfmt, Rolldown, and Node.js management in one CLI.

AI companies are buying developer tools as coding agents turn runtimes, package managers, and linters into strategic infrastructure.

Learn how AI-assisted development governance uses rules, agents, hooks, and protocols to help AI coding tools produce safer, more consistent code.
Would you be interested in joining LogRocket's developer community?
Join LogRocket’s Content Advisory Board. You’ll help inform the type of content we create and get access to exclusive meetups, social accreditation, and swag.
Sign up now