Method 05

Search First, Read Later: Cut Per-Task Token Usage 20–40%

Reading entire files to find one function is one of the most common token-wasting habits among AI coding tool users. It's the equivalent of buying the whole grocery store when all you need is a carton of milk. By searching first and reading only what you need, you cut the fat without cutting capability.

Token Savings: 20–40%

The "Read Everything" Trap

Picture this: you need to understand how a single function works in your codebase. You tell your AI coding tool "read the auth service file." It loads an 800-line file. You look at 30 lines. The other 770 are irrelevant. But the model processed every single one of them — and you paid for every token.

This habit is so common it's practically muscle memory. You know the file name, so you open it. But in an AI coding context, "opening a file" means streaming its full contents into the model's context window. An 800-line TypeScript file with comments and type annotations can easily be 8,000–12,000 tokens. Multiply that by 10, 20, or 50 reads in a single coding session, and you've burned through hundreds of thousands of tokens on lines you never even looked at.
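To make that multiplication concrete, here is a back-of-the-envelope sketch. It assumes 10 to 15 tokens per line, which is simply the 8,000–12,000 range above divided by 800 lines, not a measured tokenizer result. The function name is hypothetical.

```python
def session_read_cost(lines_per_file: int, reads: int,
                      tokens_per_line: tuple[int, int] = (10, 15)) -> tuple[int, int]:
    """Return a (low, high) token estimate for repeated full-file reads."""
    low, high = tokens_per_line
    return lines_per_file * low * reads, lines_per_file * high * reads


for reads in (10, 20, 50):
    low, high = session_read_cost(lines_per_file=800, reads=reads)
    print(f"{reads} full reads: {low:,} to {high:,} tokens")
```

Real files vary a lot, so treat the output as an order-of-magnitude guide rather than a bill.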

The trap is seductive because it feels fast. "Just read the file" is one sentence. But the token cost is hidden. You don't see a dollar sign flash every time the model ingests a file. By the end of the month, those invisible costs add up: 20–40% of your bill went to tokens you never needed.

Here's the good news: most modern AI coding tools have built-in search capabilities that run locally, return results in milliseconds, and add only the matching lines to the model's context instead of whole files. The problem isn't tooling; it's habit. And habits can be changed.

The Search-First Workflow

Replace "read the file" with a three-step pattern. Each step gets you closer to the exact code you need, without loading anything you don't.

Step 1: Grep for the Symbol

Start with a content search to find exactly where your target lives. Instead of guessing which file contains the function, let grep tell you. The tool's Grep search matches file contents using regex. It runs locally, returns in an instant, and puts only the matching lines into context.

      # Find where handleLogin is defined or called (-r: recurse, -n: show line numbers)
      grep -rn "handleLogin" src/
      # Example result: src/auth/service.ts:142:<matching line>, so you go right to the line

This one command replaces the "read auth service" guess. You now know the exact file and the exact line number, and the only thing that entered the model's context was the handful of matching lines.
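If you want to see what a content search boils down to, here is a minimal Python sketch of the same idea. It is not how any particular tool implements Grep; the function name, the default file suffixes, and the `src` directory are assumptions for illustration.

```python
import re
from pathlib import Path


def grep(pattern: str, root: str, suffixes: tuple[str, ...] = (".ts", ".tsx")):
    """Yield (path, line_number, line) for every line matching the regex."""
    regex = re.compile(pattern)
    for path in sorted(Path(root).rglob("*")):
        if not path.is_file() or path.suffix not in suffixes:
            continue
        text = path.read_text(encoding="utf-8", errors="ignore")
        for number, line in enumerate(text.splitlines(), start=1):
            if regex.search(line):
                yield str(path), number, line.strip()


if __name__ == "__main__":
    # Hypothetical project layout: prints nothing if there is no src/ directory.
    for path, number, line in grep(r"handleLogin", "src"):
        print(f"{path}:{number}: {line}")
```

The point to notice is the return value: a file path, a line number, and one line of text per match. That is all the model needs to decide what to read next.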

Step 2: Glob for File Patterns

When you don't know the symbol name but know the file structure, use Glob. It matches filenames by pattern and is invaluable for narrowing scope before you search.

      # Glob tool patterns (tool-call notation, not a shell command)

      # Find all test files related to auth
      glob "**/*auth*.test.ts"

      # Find all config files in the project
      glob "**/config*.{ts,js,json}"

Glob is your scope-narrowing tool. Use it before grep when you know the file pattern but not the exact path. Together, Glob + Grep give you surgical precision before you ever hit Read.
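The same filename matching can be sketched in Python with `pathlib`. This is an illustration of the idea under stated assumptions, not the Glob tool itself: `find_files` is a hypothetical helper, and because Python's glob syntax has no `{ts,js,json}` brace expansion, each extension is listed as its own pattern.

```python
from pathlib import Path


def find_files(root: str, patterns: list[str]) -> list[str]:
    """Return sorted paths under root whose file name matches any pattern."""
    matches: set[Path] = set()
    for pattern in patterns:
        # rglob searches root and every subdirectory, like a leading "**/".
        matches.update(p for p in Path(root).rglob(pattern) if p.is_file())
    return sorted(str(p) for p in matches)


if __name__ == "__main__":
    # Hypothetical project layout: both calls return [] if src/ does not exist.
    print(find_files("src", ["*auth*.test.ts"]))
    print(find_files("src", ["config*.ts", "config*.js", "config*.json"]))
```

Only file names come back, never file contents, which is why this step is so cheap.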

Step 3: Read With Line Offsets

Now that you know the exact file and line range, Read only the relevant section. Many Read tools accept offset and limit parameters; use them.

      # Read only the relevant function, not the whole file
      Read src/auth/service.ts offset=130 limit=50
      # 50 lines instead of 800: about a 94% reduction in tokens for this read

The difference is dramatic. Instead of loading 800 lines (8,000–12,000 tokens), you load 50 lines (500–800 tokens). Per read. Across a session with 20 file reads, that's the difference between 200,000 tokens and 15,000 tokens — an order of magnitude less.
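A partial read is simple to picture in code. The sketch below assumes a 1-based `offset` (your tool's convention may differ) and uses a hypothetical helper name, `read_slice`. It is meant to show the mechanics, not to mirror any tool's implementation.

```python
from itertools import islice


def read_slice(path: str, offset: int, limit: int) -> list[str]:
    """Return up to `limit` lines starting at 1-based line `offset`."""
    start = max(offset - 1, 0)
    with open(path, encoding="utf-8") as handle:
        # islice skips the first `start` lines without keeping them in memory.
        return [line.rstrip("\n") for line in islice(handle, start, start + limit)]
```

Calling `read_slice("src/auth/service.ts", offset=130, limit=50)` on the example file would return lines 130 through 179, a window around the line 142 hit from Step 1.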

Real Examples: Before vs. After

Nothing makes the case like real numbers. Here are four common scenarios with approximate token counts.

Scenario	Before (Full Read)	After (Search First)	Savings
Find a function definition: 800-line auth service, need one 40-line function	Read entire file (~10,000 tokens)	Grep for function name, Read offset=130 limit=50 (~750 tokens)	92%
Understand a React component: 500-line Dashboard.tsx, need only the useEffect hooks	Read entire file (~7,000 tokens)	Grep for useEffect, Read targeted sections (~1,200 tokens)	83%
Locate all API route handlers: project with 15 route files, need to find POST /users	Read all 15 files (~60,000 tokens)	Grep for "POST.*users" across routes/ dir (~2,000 tokens)	97%
Find where a CSS class is used: 20 components, need to find .btn-primary usages	Read all 20 component files (~45,000 tokens)	Grep for "btn-primary" across src/ (~1,500 tokens)	97%

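These aren't cherry-picked edge cases. These are everyday tasks that AI coding tool users perform dozens of times per session. Per-read savings of 80–97% don't translate one-to-one into your bill, because file reads are only part of a session's token use; prompts, responses, and other tool output make up the rest. Across a full day of coding, the cumulative savings land in the 20–40% range, sometimes higher in large codebases.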
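If you want to check the Savings column yourself, the calculation is just the share of tokens you avoided. The token counts below are the approximate figures from the table above; `savings_percent` is a hypothetical helper name.

```python
# (scenario, tokens before, tokens after), copied from the table above
SCENARIOS = [
    ("Find a function definition", 10_000, 750),
    ("Understand a React component", 7_000, 1_200),
    ("Locate all API route handlers", 60_000, 2_000),
    ("Find where a CSS class is used", 45_000, 1_500),
]


def savings_percent(before: int, after: int) -> float:
    """Share of tokens avoided by the search-first approach, in percent."""
    return (before - after) / before * 100


for name, before, after in SCENARIOS:
    print(f"{name}: {savings_percent(before, after):.1f}% fewer tokens")
```

The results agree with the table to within rounding. You can swap in numbers from your own sessions to see where your biggest wins are.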

Tool-Specific Tips

Every major AI coding tool has search built in. Here's how to use it effectively in each environment.

Claude Code

Claude Code has first-class Grep and Glob tools built into every session. The search itself runs locally rather than through the model, so you pay only for the matches that come back, not for the files that were scanned. The key is phrasing: instead of saying "read X file", say "find where Y is defined" or "search for Z pattern." Claude Code will typically choose a search tool when your request signals a search rather than a read.

      # Good: invites a Grep search (only the matches enter context)
      "Find where authenticateUser is defined in this project"

      # Bad: invites a full-file Read
      "Read src/auth/service.ts and find authenticateUser"

Cursor

Cursor's codebase-wide search (Cmd+Shift+F, or Ctrl+Shift+F on Windows and Linux) is your fast path to targeted reading. It uses ripgrep under the hood, so the same principle applies: the search runs locally and instantly, and nothing enters the model's context until you decide what to share. Train yourself to open files via search results rather than the file tree. Every file you open from search results already has the relevant line highlighted, so you read less. From there, highlight just the lines you care about before pressing Cmd+K (or Ctrl+K), which keeps the inline request scoped to that selection.

Codex (OpenAI)

Codex's file search works through the sidebar and inline @-mentions. Use @file to narrow scope before asking questions. For example, "@auth/*.ts where is the login handler?" is far more efficient than "read all auth files." The @-mention syntax tells Codex to search rather than load, saving tokens before the model even processes your request.

Universal rule: If your request starts with "read" or "open," stop and rephrase. Start with "find," "search," "locate," or "where is." The verb you use determines the tool the model picks — and the tokens you'll pay. This habit also directly preserves context headroom: every file you don't read is headroom you don't lose. Combined with model routing (Haiku for searches), the per-task savings compound.

Building the Habit

Knowing the technique is easy. Replacing the muscle memory of "just read the file" is harder. Here are practical strategies that work.

1. Add a Post-It to Your Monitor

Literally. Write "SEARCH FIRST" on a sticky note and put it where you'll see it. Half the battle is catching yourself before you type "read the file." After about two weeks, the habit will be automatic — but those first two weeks need a visual trigger.

2. Audit One Session Per Week

Pick one AI coding session per week and review how many times you read entire files versus searching first. Count the full-file reads. Multiply by the average file size in your codebase. That's your weekly token waste figure. Watching that number shrink week over week is the best motivation you'll find.

3. Use the 5-Second Rule

Before every file read, pause for five seconds and ask: "Do I need the whole file, or just a section?" If the answer is "just a section," search first. Five seconds of thinking saves thousands of tokens. Few optimization techniques have a better effort-to-reward ratio.

4. Make It a Team Norm

If you work on a team, add "search before read" to your AI coding guidelines. When reviewing PRs or discussing approaches, call out when someone describes reading an entire file for a targeted task. Peer reinforcement accelerates habit formation more than any individual effort.

5. Track Your Tool's Session Stats

Claude Code can report token usage for the current session (for example, with the /cost command). Cursor and Codex have usage dashboards. Check them at the end of each day. If your daily token count dropped after adopting search-first, that's concrete evidence the habit is working. Positive reinforcement is powerful.

When Full Reads Actually Make Sense

Search-first is a rule, not a religion. There are legitimate times when reading an entire file is the right call. Here's how to recognize them.

Scenario	Search First or Full Read?	Why
Architecture review of a core module	Full Read	You genuinely need the entire file to understand structure, patterns, and dependencies. The token cost is justified by the need for complete context.
Full-file refactor (e.g., rewrite a component)	Full Read	If you're replacing or restructuring most of a file, the model needs to see everything to produce a coherent rewrite. Partial reads create blind spots.
Finding a single function definition	Search First	You need 30 lines out of 800. Grep for the function name, Read with offset. The full file is noise, not signal.
Debugging a specific error	Search First	Search for the error message or stack trace line. Read the surrounding 30–50 lines. Reading the whole file before you know where the bug is wastes tokens and attention.
Onboarding to a new codebase	Mix	Read entry points and key files in full to build a mental model. Search for specific patterns as you explore. Don't read every file — be strategic about which ones deserve full attention.
Writing unit tests for a module	Search First	Grep for export signatures, then Read targeted function bodies. You need the public API surface, not every private helper's internals.

The grey area shrinks with practice. After a few weeks of search-first, you'll develop an instinct for when a full read is genuinely needed versus when it's just old habits creeping back. When in doubt, search first — you can always expand to a full read, but you can never un-burn tokens you've already spent.

FAQ

Is searching first really faster than reading the whole file?

Almost always, yes. A grep search is practically instant and adds only the matching lines to context (the search itself runs as a local tool, not through the model). A full file read of 800 lines sends all 800 lines through the model's context window, costing both tokens and latency. The only exception is when you genuinely need the entire file, as with architecture reviews or full-file refactors. In those cases, reading the whole file is the right call. For everything else, search first.

Doesn't Claude Code already search on its own?

Claude Code has excellent built-in search tools (Grep, Glob), but which one it reaches for depends heavily on how you phrase the request. If you say "read auth.ts and find the login function," it is likely to read the whole file. If you say "find where login is defined in the auth module," it will typically grep first. Training yourself to ask search-oriented questions is the key habit change.

When should I use Grep and when should I use Glob?

Use Grep when you know what you're looking for (a function name, a string, a regex pattern). Use Glob when you're looking for files by name or pattern (*.ts, **/*.test.ts, config*.json). The two tools complement each other: glob first to narrow the file scope, then grep inside those files. Together they let you drill down to exactly the right location before reading anything.
