A field guide · developer tooling

Matching the Claude App in VS Code

Why the same model can feel sharper in the chat app than in your editor — and the handful of moves that close the gap.

By Majid Mazouchi

You pick the same model in VS Code that you use in the Claude app, ask a similar question, and the editor's answer feels thinner. It is a common and frustrating experience — and the usual conclusion ("the IDE version must be a weaker model") is almost always wrong.

The model is identical. What differs is the harness: the invisible layer around the model that decides what it gets told, how much of your code it actually sees, and how it is allowed to work. Think of it like a motor and its inverter. Swap a great inverter for a crude one and the same machine performs worse — not because the magnets changed, but because what reaches the machine changed.

This guide explains the harness in plain terms, lets you play with the three things you actually control, and ends with a checklist you can keep next to your keyboard.

01 It's the harness, not the model

Three things shape every request: a system prompt (background instructions you never see), the context (how much of your project is fed in), and the model tier itself. The Claude app sets all three generously. A bare IDE setup often trims them to save tokens. Toggle between them below and watch what the model receives.

Figure 1 (interactive): Same model, three harnesses. Panels: system prompt, context fed, model tier, and the resulting output.
The model never changes across these three. Only what reaches it does. The "tuned" editor recovers most of the app's quality by restoring the same inputs.
Practical note: Most "the IDE is dumber" complaints disappear once context is restored. Before blaming the model, ask: did it actually see the files it needed?
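
To make the idea concrete, here is a minimal sketch of a harness as a plain data structure. Everything in it is an assumption made for illustration: the `Harness` class, the `build_request` helper, the file names, and the placeholder model name are hypothetical and do not describe how the Claude app or any editor extension actually assembles a request.

```python
from dataclasses import dataclass, field


@dataclass
class Harness:
    """What surrounds the model on a request (all fields are illustrative)."""
    name: str
    system_prompt: str  # background instructions the user never sees
    context_files: list[str] = field(default_factory=list)  # files actually fed in
    model: str = "same-model"  # placeholder identifier, identical across harnesses


def build_request(harness: Harness, question: str) -> dict:
    """Assemble the inputs the model would receive for one question."""
    return {
        "model": harness.model,
        "system": harness.system_prompt,
        "context": list(harness.context_files),
        "question": question,
    }


# Two hypothetical setups: same model, very different inputs.
app = Harness("app", "long, detailed instructions",
              ["api.py", "models.py", "tests/test_api.py"])
bare_ide = Harness("bare IDE", "short instructions", ["api.py"])

for harness in (app, bare_ide):
    request = build_request(harness, "Why does test_api fail?")
    print(f"{harness.name}: model={request['model']}, "
          f"files seen={len(request['context'])}")
```

The model field is the same in both requests. The only difference is how much the model is told and shown, which is the gap the rest of this guide works on.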

02 The three levers you control

You can't see the system prompt, but you can influence all three inputs. Drag the sliders to feel how they combine. Context is weighted heaviest because it is where most real-world quality is won or lost.

Figure 2 (interactive): Tune the levers → estimate the match. Readout: illustrative match to the app, shown as a percentage.
A teaching model, not a benchmark. The point is the shape: drop context and the needle falls fast; a top model with starved context still underperforms a mid model that can see everything.
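
The shape of that teaching model can be sketched as a simple weighted sum. The weights below are assumptions chosen only so that context counts most, as described above. They are not the figure's actual formula and not a measurement of anything.

```python
# Hypothetical weights: context counts most, as the text describes.
WEIGHTS = {"context": 0.5, "model": 0.3, "prompt": 0.2}


def estimate_match(context: float, model: float, prompt: float) -> float:
    """Return an illustrative 0..1 match score from three 0..1 lever settings."""
    levers = {"context": context, "model": model, "prompt": prompt}
    for name, value in levers.items():
        if not 0.0 <= value <= 1.0:
            raise ValueError(f"{name} must be between 0 and 1, got {value}")
    return sum(WEIGHTS[name] * value for name, value in levers.items())


# A top-tier model that sees little vs. a mid-tier model that sees everything.
top_model_starved = estimate_match(context=0.2, model=1.0, prompt=1.0)
mid_model_full = estimate_match(context=1.0, model=0.6, prompt=1.0)
print(f"top model, starved context: {top_model_starved:.2f}")
print(f"mid model, full context:    {mid_model_full:.2f}")
```

With these made-up weights, the starved top model scores 0.60 and the well-fed mid model scores 0.88. The numbers mean nothing on their own; the ordering is the point.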

03 Lever one: really run the same model

The fastest silent downgrade is the default model. In Claude Code, set it on purpose with /model. In GitHub Copilot, pick the model from the chat dropdown — recent Claude flagships are available there, with very large context windows. If your app habit is the top-tier model, match it; don't let a cheaper default decide your output quality.

Also switch on extended thinking for genuinely hard problems. The app leans on it; editor tools often leave it off by default.

Practical note: Treat model choice as deliberate per task: top tier for design and tricky debugging, a lighter model for boilerplate to conserve usage.

04 Lever two: feed context on purpose

This is the big one. A project notes file at your repo root acts like a private, persistent system prompt — your stack, conventions, and rules, loaded every session so you never re-explain them. Claude Code reads CLAUDE.md; Copilot reads .github/copilot-instructions.md.
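
If you want a starting point, a small script can scaffold both files. This is only a sketch: the two paths come from the paragraph above, but the section headings and placeholder text in `STARTER_NOTES` are hypothetical, as is the `write_project_notes` helper. Replace the placeholders with your real stack and rules.

```python
from pathlib import Path

# Paths come from the text above; the file contents are a hypothetical starting point.
NOTES_PATHS = ["CLAUDE.md", ".github/copilot-instructions.md"]

STARTER_NOTES = """\
# Project notes

## Stack
- (language, framework, and versions go here)

## Conventions
- (naming, formatting, and testing rules go here)

## Rules
- (things the assistant should always or never do)
"""


def write_project_notes(repo_root: str, overwrite: bool = False) -> list[Path]:
    """Create the notes files under repo_root, skipping existing ones by default."""
    written = []
    for relative in NOTES_PATHS:
        target = Path(repo_root) / relative
        if target.exists() and not overwrite:
            continue  # never clobber notes someone already wrote
        target.parent.mkdir(parents=True, exist_ok=True)
        target.write_text(STARTER_NOTES, encoding="utf-8")
        written.append(target)
    return written
```

Calling `write_project_notes(".")` from the repo root creates whichever of the two files is missing and returns the paths it wrote. Existing files are left alone unless you pass `overwrite=True`.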

Two more habits: use file references (type @ in Claude Code, # in Copilot) to hand the model the exact files instead of letting it guess; and add an ignore file to keep large, irrelevant artifacts out of the window.
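
Before writing ignore rules, it helps to know which files are actually big. The sketch below lists the largest files under a directory so you can decide what to exclude. The `largest_files` helper and its defaults are assumptions made for illustration; it does not write an ignore file or assume any particular tool's ignore syntax.

```python
import os


def largest_files(
    root: str,
    top_n: int = 10,
    skip_dirs: tuple[str, ...] = (".git",),
) -> list[tuple[int, str]]:
    """Return the top_n largest files under root as (size_in_bytes, relative_path)."""
    sizes = []
    for dirpath, dirnames, filenames in os.walk(root):
        # Prune directories in place so os.walk does not descend into them.
        dirnames[:] = [d for d in dirnames if d not in skip_dirs]
        for filename in filenames:
            full_path = os.path.join(dirpath, filename)
            try:
                size = os.path.getsize(full_path)
            except OSError:
                continue  # unreadable or vanished file; skip it
            sizes.append((size, os.path.relpath(full_path, root)))
    return sorted(sizes, reverse=True)[:top_n]
```

Running `largest_files(".")` at the repo root gives a quick list of candidates, typically build output, datasets, or lock files, to review before excluding them.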

There's a quieter trap: a long-running conversation. As a thread fills with old back-and-forth, the model spends its attention budget on stale material and reasons worse. Starting fresh is a feature, not a reset. Try it:

Figure 3 (interactive): Why starting a fresh thread helps. Readout: share of the context window in use.
Each message piles onto the context window. Past a point the signal you care about is buried in old chatter. Clearing restores room, and with it reasoning quality, for the task in front of you.
Practical note: One task per thread. When you switch problems, clear or open a new conversation rather than continuing a bloated one.
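
The arithmetic behind a filling thread is simple enough to sketch. The window size, the four-characters-per-token rule of thumb, and the 50% threshold below are all assumptions made for illustration. Real tools count tokens with an actual tokenizer and have their own limits.

```python
CONTEXT_WINDOW_TOKENS = 200_000  # hypothetical window size
CHARS_PER_TOKEN = 4  # rough rule of thumb, not a real tokenizer


def estimate_tokens(text: str) -> int:
    """Very rough token estimate based on character count."""
    return len(text) // CHARS_PER_TOKEN


def window_fill(messages: list[str]) -> float:
    """Return the estimated fraction (0..1) of the window used by a thread."""
    used = sum(estimate_tokens(message) for message in messages)
    return min(1.0, used / CONTEXT_WINDOW_TOKENS)


def should_start_fresh(messages: list[str], threshold: float = 0.5) -> bool:
    """Suggest a new thread once the estimated fill reaches the threshold."""
    return window_fill(messages) >= threshold


# A short question costs almost nothing; a pasted log can eat half the window.
thread = ["Why does the login test fail?", "x" * 400_000]
print(f"window used: {window_fill(thread):.0%}")
print(f"start a fresh thread? {should_start_fresh(thread)}")
```

The threshold is a judgment call. The habit matters more than the number: check how full the thread is before piling a new problem onto it.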

05 Lever three: use the agentic harness, not autocomplete

Inline autocomplete reacts to keystrokes. The app-grade experience comes from the agent: it reads the codebase, drafts a plan, edits across files, runs commands, checks output, and loops. For the closest match to the app, use Anthropic's official Claude Code extension (it bundles the full engine, adds a side-by-side diff viewer, editable plan review, and conversation tabs). In Copilot, switch to Agent mode rather than the Ask or Edit modes.

Then extend it the way the app is extended: add tool servers (MCP) for things like browser access, and package repeated workflows as reusable skills so the model doesn't re-explore a path you've already mapped.

Practical note: Scope the problem in a cheap "ask" turn first, then switch to agent mode for execution. You get the planning quality without burning iteration budget on exploration.

06 The keep-it-by-your-keyboard checklist

07 References & further reading