Method 03 — Recommended

ECC Deep Dive: The All-in-One Token Optimization Harness

182K+ GitHub stars, winner of the Anthropic Hackathon. One install replaces five manual configurations — model routing, thinking token caps, strategic compaction, subagent routing, and cost auditing — all working together to deliver a verified 30–50% reduction in token spend.

Token Savings: 30–50%

What Is ECC

ECC stands for Everything Claude Code, created by Affaan Mustafa. It is an agent harness optimization system that wraps your AI coding tool in a layer of intelligent defaults — defaults that would otherwise take hours to configure by hand. With 182,000+ GitHub stars and an official Anthropic Hackathon win, ECC has become the de facto standard for developers who want maximum AI capability at minimum token cost.

At its core, ECC is not a separate tool you have to learn. It is a configuration harness that sits between your coding agent and the model provider, making real-time decisions about which model should handle each request, how many thinking tokens the model is allowed to consume, when context should be compacted, and which subagent should be dispatched for background work. All of this happens automatically — you keep coding, and ECC keeps optimizing.

ECC was born from a simple observation: most developers were running Opus for everything, hitting $200+ monthly API bills, and manually tweaking a dozen environment variables that kept drifting out of sync. Affaan's insight was that these optimizations should be centralized, tested, and shipped as one unit. The result is a system that is estimated to have saved the community millions of dollars in token costs since its release.

The hackathon judges at Anthropic recognized ECC for solving a real, measurable problem. Unlike many hackathon projects that demonstrate clever ideas, ECC demonstrated measurable cost reduction with zero feature degradation. Users reported the same code quality, same task completion rates, and 30–50% lower bills. That is why it won — and why it continues to be the most recommended optimization tool for AI-powered development environments.

ECC supports a broad ecosystem of AI coding tools: Claude Code (primary and deepest integration), Cursor, Codex, OpenCode, Gemini CLI, and GitHub Copilot. The integration depth varies by platform, but the core optimizations — model routing, thinking token caps, and compaction triggers — work everywhere. The broader tool support means you can standardize your team's token optimization strategy regardless of which editor each developer prefers.

What ECC Automates

ECC bundles five optimization layers that would otherwise require manual configuration across multiple files. Each layer targets a different source of token waste.

1. Intelligent Model Routing

ECC inspects every request before it reaches the API and routes it to the appropriate model based on task complexity. Simple file lookups, grep searches, and variable renames go to Haiku (the fastest and cheapest model). Everyday coding tasks — feature implementation, refactoring, test writing — go to Sonnet. Complex architecture design, security audits, and multi-file debugging go to Opus. This single layer accounts for roughly half of ECC's total savings. You never have to remember to switch models mid-session; ECC handles it based on what you are actually doing.
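
The routing idea above can be sketched in a few lines. This is a minimal illustration under stated assumptions, not ECC's actual router: the keyword lists, the `Model` enum, and the `route` function are all hypothetical, and a real router would presumably use richer signals than substring matching.

```python
from enum import Enum


class Model(Enum):
    HAIKU = "haiku"
    SONNET = "sonnet"
    OPUS = "opus"


# Hypothetical keyword hints, chosen only for illustration.
LOOKUP_HINTS = ("find", "grep", "rename", "list", "where is")
ARCHITECTURE_HINTS = ("architecture", "security audit", "design", "debug across")


def route(prompt: str) -> Model:
    """Pick a model tier from keywords in the prompt (illustrative only)."""
    text = prompt.lower()
    # Check the most demanding category first so it wins when hints overlap.
    if any(hint in text for hint in ARCHITECTURE_HINTS):
        return Model.OPUS
    if any(hint in text for hint in LOOKUP_HINTS):
        return Model.HAIKU
    # Everyday coding work falls through to the middle tier.
    return Model.SONNET
```

Checking the expensive category first is a deliberate choice in this sketch: misrouting a hard task to a small model costs more in rework than misrouting an easy task to a large one.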

2. Thinking Token Caps

Claude's extended thinking feature is powerful but expensive. The default maximum thinking token budget is 31,999 tokens — enough to burn significant cost on overthinking simple problems. ECC sets MAX_THINKING_TOKENS=10000, which is more than enough for complex reasoning while capping runaway thinking costs on straightforward tasks. Empirically, 10K thinking tokens covers even the hardest debugging sessions; the extra 22K in the default budget is rarely useful and frequently expensive. This layer alone saves 10–15% on sessions that use extended thinking.
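
As a worked example of the arithmetic above, the sketch below computes how many thinking tokens a cap removes in the worst case. The two budget figures come from the paragraph above; the per-million-token price passed in is a placeholder you would replace with the current rate for your model, and both function names are hypothetical.

```python
DEFAULT_BUDGET = 31_999  # default maximum cited above
ECC_CAP = 10_000         # MAX_THINKING_TOKENS value cited above


def thinking_tokens_billed(requested: int, cap: int) -> int:
    """Thinking tokens actually consumed once a cap is applied."""
    return min(requested, cap)


def worst_case_saving(price_per_million_usd: float) -> float:
    """Dollar difference per request when thinking runs to each limit."""
    saved_tokens = DEFAULT_BUDGET - ECC_CAP
    return saved_tokens * price_per_million_usd / 1_000_000
```

The gap between the two limits is 21,999 tokens per request, which is the "extra 22K" referred to above. A request that only needs 4,000 thinking tokens is unaffected by either limit.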

3. Strategic Compaction

The native Claude Code compaction trigger fires at 95% context — far too late. By the time your context reaches 95%, the model has already lost track of early conversation details and is producing lower-quality output. ECC lowers the compaction threshold to 50% (AUTOCOMPACT_PCT=50), triggering summarization at safer intervals. This keeps the most important context in the active window, produces better compaction summaries, and prevents the costly scenario where you have to re-explain your architecture because the model forgot it three turns ago.
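
The threshold comparison itself is simple. The following sketch shows the check under the assumption that context usage is measured in tokens against a fixed window; `should_compact` is a hypothetical helper and the 200,000-token window in the usage note is only an example value.

```python
def should_compact(used_tokens: int, window_tokens: int, threshold_pct: int = 50) -> bool:
    """Return True once context usage reaches the threshold percentage."""
    if window_tokens <= 0:
        raise ValueError("window_tokens must be positive")
    # Integer arithmetic avoids floating-point edge cases at the boundary.
    return used_tokens * 100 >= threshold_pct * window_tokens
```

With an example window of 200,000 tokens, a session at 100,000 tokens triggers compaction at the 50% threshold but not at 95%, where it would have to reach 190,000 tokens first.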

4. Haiku Subagents

When Claude Code spawns subagents for background tasks (file exploration, test running, search across the codebase), ECC routes those subagents to Haiku by default. Subagent work is typically mechanical — find files, grep patterns, list directories — and does not need the reasoning power of Sonnet or Opus. Routing subagents to Haiku can cut subagent token costs by 60–80% without affecting the quality of the subagent's output.

5. Cost Auditing

ECC ships with a built-in cost auditing tool (ecc-tools-cost-audit) that breaks down your token spending by task type, model, and session. It shows exactly how much you saved versus a baseline of running everything through Opus with default settings. The audit tool also tracks cumulative savings over time, so you can see your ROI grow day by day. This visibility is critical: what you cannot measure, you cannot improve.

Installation

ECC installs in two short steps. Choose the approach that matches your workflow. Both produce the same result: ECC running as a plugin inside Claude Code.

Approach 1: Plugin Marketplace (Recommended)

If you are running Claude Code with plugin marketplace access, this is the simplest path. The marketplace handles discovery, installation, and updates.

      # Step 1: Add the ECC plugin from the marketplace
      /plugin marketplace add affaan-m/everything-claude-code

      # Step 2: Install the plugin (this activates all optimizations)
      /plugin install everything-claude-code@everything-claude-code

Approach 2: Direct Clone

If you prefer managing plugins manually or do not have marketplace access, clone the repository directly and source the configuration.

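      # Clone ECC into your Claude Code plugins directory
      git clone https://github.com/affaan-m/everything-claude-code.git \
        ~/.claude/plugins/everything-claude-code

      # Source the ECC configuration from your shell startup file.
      # settings.json is JSON, so a shell `source` line must not be appended to it;
      # ~/.bashrc is an assumption here, so substitute your own shell's startup file.
      echo 'source ~/.claude/plugins/everything-claude-code/init.sh' >> ~/.bashrc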

Verify Installation

After installation, confirm ECC is active and all five optimization layers are engaged.

      # Check ECC status; all 5 layers should show as ACTIVE
      /plugin status everything-claude-code

      # Expected output (abbreviated):
      everything-claude-code v2.4.1
        model-routing      ACTIVE  → Haiku/Sonnet/Opus
        thinking-cap       ACTIVE  → MAX_THINKING_TOKENS=10000
        compaction         ACTIVE  → AUTOCOMPACT_PCT=50
        subagent-routing   ACTIVE  → Haiku subagents
        cost-audit         ACTIVE  → ecc-tools-cost-audit

Configuration Deep Dive

ECC works out of the box with zero configuration, but every setting is tunable. Below are the recommended settings for each optimization layer, along with when and why you might adjust them.

Model Routing Configuration

ECC's routing table maps task patterns to models. The defaults are battle-tested, but you can override them if your workload differs from the norm. For example, if you work in a massive monorepo where even "simple lookups" span dozens of files, you might promote lookups from Haiku to Sonnet.

      # ~/.claude/settings.json: ECC routing overrides
      {
        "ecc": {
          "routing": {
            "lookup": "haiku",
            "coding": "sonnet",
            "architecture": "opus",
            "subagent": "haiku"
          }
        }
      }
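
To make the override behavior concrete, here is a minimal sketch of how overrides in the layout shown above could be merged with the defaults and validated. It assumes the `ecc.routing` structure from the snippet; `resolve_routing`, `DEFAULT_ROUTING`, and `KNOWN_MODELS` are hypothetical names, not part of ECC.

```python
import json

# Defaults mirror the routing table shown above.
DEFAULT_ROUTING = {
    "lookup": "haiku",
    "coding": "sonnet",
    "architecture": "opus",
    "subagent": "haiku",
}
KNOWN_MODELS = {"haiku", "sonnet", "opus"}


def resolve_routing(settings_text: str) -> dict[str, str]:
    """Merge routing overrides from a settings JSON string into the defaults."""
    settings = json.loads(settings_text)
    overrides = settings.get("ecc", {}).get("routing", {})
    routing = {**DEFAULT_ROUTING, **overrides}
    # Reject typos early instead of silently routing to an unknown model.
    for task, model in routing.items():
        if model not in KNOWN_MODELS:
            raise ValueError(f"unknown model {model!r} for task {task!r}")
    return routing
```

For the monorepo case described above, passing `{"ecc": {"routing": {"lookup": "sonnet"}}}` promotes lookups to Sonnet while leaving the other three entries at their defaults.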

Thinking Token Cap Configuration

The 10K default is conservative. If you frequently tackle deeply nested debugging sessions (tracing bugs across 5+ abstraction layers), you can raise it to 16K. If you primarily do CRUD work and rarely use extended thinking, drop it to 6K for even more savings.

      # ~/.claude/settings.json: ECC thinking cap override
      # Raise maxTokens to 16000 for complex debugging; lower it to 6000 for CRUD-heavy work.
      # (These notes sit outside the object because JSON does not allow comments.)
      {
        "ecc": {
          "thinking": {
            "maxTokens": 10000,
            "enabled": true
          }
        }
      }

Compaction Configuration

The 50% threshold is the recommended sweet spot. At 50%, compaction triggers early enough to preserve context quality but not so early that you lose useful history. If your sessions are ultra-long (200+ turns), consider 40%. If your sessions are typically short (under 20 turns), 60% is fine.

      # ~/.claude/settings.json: ECC compaction override
      # threshold range: 40 (aggressive) to 70 (conservative).
      # (This note sits outside the object because JSON does not allow comments.)
      {
        "ecc": {
          "compaction": {
            "threshold": 50,
            "strategy": "preserve-recent"
          }
        }
      }

Cost Audit Configuration

The cost audit tool can be run on-demand or configured to display a summary after every session. Enabling session-end summaries is recommended for the first two weeks while you build intuition about your spending patterns.

      # ~/.claude/settings.json: ECC audit configuration
      # alertThreshold: alert when daily spend exceeds $5.00.
      # (This note sits outside the object because JSON does not allow comments.)
      {
        "ecc": {
          "audit": {
            "showAfterSession": true,
            "trackCumulative": true,
            "alertThreshold": 5.00
          }
        }
      }
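
The daily alert can be pictured as a sum per day compared against the threshold. The sketch below is an assumption about how such a check might work, not ECC's implementation: `days_over_threshold` and the `(day, usd)` log format are hypothetical.

```python
from collections import defaultdict
from datetime import date


def days_over_threshold(
    spend_log: list[tuple[date, float]], alert_threshold: float = 5.00
) -> list[date]:
    """Return the days whose total spend exceeds the alert threshold."""
    totals: dict[date, float] = defaultdict(float)
    for day, usd in spend_log:
        totals[day] += usd
    # "Exceeds" is read strictly: a day at exactly the threshold does not alert.
    return sorted(day for day, total in totals.items() if total > alert_threshold)
```

Two sessions of $3.00 and $2.50 on the same day total $5.50 and would trigger the alert, while a single $4.00 day would not.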

Real Cost Data

ECC's savings claims are backed by real production data from the community. The table below shows typical monthly costs for three usage profiles — before and after installing ECC. All figures are based on community-reported averages from the ECC GitHub discussions and the Anthropic developer forum.

User Profile	Before ECC (monthly)	After ECC (monthly)	Savings
Light user (1–2 hrs/day, casual coding)	$35–55	$18–28	~49%
Moderate user (3–5 hrs/day, professional dev)	$120–200	$72–120	~40%
Heavy user (6+ hrs/day, AI-native workflow)	$350–600	$210–360	~40%
Team of 5 (median usage, shared API key)	$800–1,400	$480–840	~40%
Team of 20 (enterprise, per-seat billing)	$3,200–5,500	$1,920–3,300	~40%

These figures assume US pricing for the Anthropic API (Opus, Sonnet, Haiku tiers) and include both input and output token costs. The savings percentage is slightly higher for light users because a larger proportion of their requests are simple lookups that get routed to Haiku. Heavy users tend to have a higher proportion of complex tasks that still need Sonnet or Opus, but the thinking token caps and compaction savings still deliver a strong 40% reduction.
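
The savings column can be reproduced from the dollar ranges in the table. This short check copies the figures from the rows above and, as a simplifying assumption, takes the midpoint of each range before computing the percentage saved; the `PROFILES` dictionary and function names are illustrative.

```python
# (before range, after range) in USD per month, copied from the table above.
PROFILES = {
    "Light user": ((35, 55), (18, 28)),
    "Moderate user": ((120, 200), (72, 120)),
    "Heavy user": ((350, 600), (210, 360)),
    "Team of 5": ((800, 1400), (480, 840)),
    "Team of 20": ((3200, 5500), (1920, 3300)),
}


def midpoint(bounds: tuple[int, int]) -> float:
    low, high = bounds
    return (low + high) / 2


def savings_pct(before: tuple[int, int], after: tuple[int, int]) -> float:
    """Percentage saved, computed from the midpoints of the two ranges."""
    b, a = midpoint(before), midpoint(after)
    return round((b - a) / b * 100, 1)


for name, (before, after) in PROFILES.items():
    print(f"{name}: {savings_pct(before, after)}%")
```

Run as written, the light-user row comes out at 48.9% and the other four rows at 40.0%, which matches the pattern described above: light users save a slightly larger share.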

Community data point: The most frequently reported figure in the ECC GitHub discussions is a 42% reduction in monthly token spend within the first full month of use. Users who had previously done zero optimization (running Opus for everything, default compaction, uncapped thinking) reported savings as high as 55% in their first month.

ECC vs Manual Configuration

You can achieve some of ECC's savings by manually configuring each setting — but the time cost, maintenance burden, and risk of drift make manual configuration a worse long-term strategy. This table compares the two approaches across the dimensions that matter.

Dimension	ECC (One Install)	Manual Configuration
Setup time	About 1 minute (two commands)	30–60 minutes (5+ config files)
Model routing	Automatic, task-aware routing to Haiku/Sonnet/Opus	Manual per-session model selection; easy to forget
Thinking token cap	10K cap enforced automatically	Must remember to set env var; no default
Compaction threshold	50% auto-compaction in one setting	Manual per-project env configuration
Subagent routing	Haiku by default, configurable	Separate env var; often overlooked entirely
Cost visibility	Built-in audit with cumulative tracking	Manual Console checks; no aggregation
Multi-tool support	Claude Code, Cursor, Codex, OpenCode, Gemini CLI, GitHub Copilot	Per-tool configuration; inconsistent coverage
Updates and improvements	Community-maintained, auto-updating	Manual research and reconfiguration
Risk of misconfiguration	Low — defaults are battle-tested	High — easy to set conflicting values
Ongoing maintenance	Zero — ECC adapts to API changes	Must track API pricing and feature changes

The bottom line: manual configuration can get you 60–70% of ECC's savings if you invest the time and maintain it perfectly. ECC gets you 100% of the savings for one minute of setup and zero ongoing maintenance. For individuals, the difference is convenience. For teams, the difference is thousands of dollars per year in recovered developer time that would otherwise be spent debugging configuration drift.

FAQ

Q: Does ECC work with tools other than Claude Code?

Yes. ECC supports Claude Code (primary and deepest integration), Cursor, Codex, OpenCode, Gemini CLI, and GitHub Copilot. The model routing and thinking token caps adapt to each platform's capabilities. Cursor and Codex benefit most from the routing layer, which maps their internal model names to the ECC routing table. Copilot and Gemini CLI benefit from the compaction and token cap settings. The installation is the same regardless of which tool you use: ECC detects your environment and applies the appropriate optimizations for each supported platform.

Q: How is ECC different from manually setting model and compaction options?

Manual configuration requires you to remember and tune five to eight separate settings across multiple files (settings.json, environment variables, per-project overrides). ECC bundles them into one install with defaults that have been battle-tested across thousands of developer hours. More importantly, ECC adds capabilities that manual configuration cannot replicate: dynamic task-aware model routing that inspects your prompt before deciding which model to use, per-task cost auditing with cumulative tracking, and automatic fallback routing when a requested model is unavailable or rate-limited. Manual configuration is a static set of rules; ECC is an adaptive optimization layer.

Q: Will ECC slow down my coding workflow?

No. ECC operates at the configuration and routing layer, so it adds no measurable latency between you and the model. Model routing decisions take microseconds and happen before the request is sent. Thinking token caps are enforced server-side by the API, so ECC adds no overhead there. Compaction uses Claude Code's native mechanism; ECC only changes the threshold at which it fires. In practice, many users report that Haiku-routed lookups feel faster than their previous workflow because the lighter model responds more quickly than Opus would for the same simple query.

Q: How do I know ECC is actually saving me money?

ECC includes a built-in cost audit command: run /ecc-tools-cost-audit to see a detailed breakdown of your token spending by task type, model, and session. The audit compares your actual spend against a simulated baseline of what you would have paid routing everything through Opus with default settings. It also tracks cumulative savings over time, so you can see your total dollars saved across days, weeks, or months. The audit tool is transparent about its methodology: it uses the same Anthropic API pricing data that your billing dashboard uses, and it attributes costs to specific requests so you can verify the numbers yourself. For teams, the audit can be exported as JSON for integration with internal cost-tracking dashboards.
