Claude Opus 4.7 vs. Sonnet 4.6 is the within-Anthropic procurement question most legal teams haven't optimized yet, and the unclaimed savings are the cost-recovery gap. Per Anthropic's pricing page, Opus 4.7 lists at $5/M input + $25/M output; Sonnet 4.6 lists at $3/M input + $15/M output — 40% cheaper on both input and output. For high-volume legal workloads, the routing decision between Opus and Sonnet often saves more money than negotiating a vendor discount. The flagship narrative says "use Opus for important work." The operator answer is more nuanced: Sonnet 4.6 handles 60-80% of a typical legal workload at meaningfully lower cost, with Opus 4.7 reserved for tasks where calibration, multi-session memory, or the xhigh effort level genuinely matters. Here's the routing matrix by use case.
What separates Opus 4.7 from Sonnet 4.6 in actual legal work
Both models are part of Anthropic's Claude family. Both ship via the same surfaces (claude.ai, API, Foundry, Bedrock, Vertex per the Anthropic deployment options). The differences that matter for legal work:
Opus 4.7 ships with:

- The xhigh effort level (per Anthropic's 4.7 docs) for finer reasoning-latency control. Claude Code defaults paid plans to xhigh.
- 87.6% SWE-bench Verified for software engineering tasks.
- 94.2% GPQA Diamond for graduate-level reasoning.
- Full multi-session memory persistence for long-horizon matter context.
- Task budgets for deterministic spend per agentic loop.
- Cybersecurity safeguards by default for prohibited-use detection.
- The new 1.0-1.35x tokenizer (more granular legal-prose tokenization).
- Vision input at 3.75 megapixels.
Sonnet 4.6 ships with:

- Strong general-purpose capability at lower latency.
- Comparable calibration on common legal questions (state-bar variations, standard contract review, common research patterns).
- The same 200K context window as Opus 4.7.
- The same deployment surface options.
- Lower per-token cost across the board.
The practical differentiation in legal work shows up in three places:
Calibration on niche legal questions. Opus 4.7's improved calibration (less likely to proceed confidently with bad plans) shows up most on edge cases — recent statute renumberings, jurisdiction-specific procedural variations, novel legal arguments. For high-volume routine legal work (standard contract review, common research, intake processing), Sonnet 4.6 produces calibrated answers at lower cost.
Multi-session memory for long-horizon matters. Opus 4.7's scratchpad/notes-file persistence is the differentiator for matter-spanning work: M&A diligence (5-15 days), multi-day depositions, white-collar matters running months. Sonnet 4.6 doesn't ship with the same persistence architecture.
Reasoning depth on complex single-shot tasks. Opus 4.7 with xhigh effort level produces deeper reasoning traces on novel cause-of-action drafting, constitutional challenges, edge-case appellate strategy. Sonnet 4.6 handles common patterns at speed but doesn't sustain the same reasoning depth on rare edge cases.
The second-order angle: routing the right workload to the right model is a procurement skill, not a model skill. Most firms run all queries on Opus by default and pay 40% more than they need to for the high-volume routine work. The operator move is workload-aware routing — letting the firm's actual practice mix drive model selection per task type.
The routing matrix: which legal tasks get Opus vs Sonnet
Use Opus 4.7 for:
- M&A diligence and matter-spanning legal work where multi-session memory recovers analyst re-priming time. The multi-session memory M&A diligence guide covers the storage architecture.
- Discovery review with task budgets where deterministic per-matter spend matters for budget memos. See the task budgets in discovery deep-dive.
- Novel legal arguments — constitutional challenges, novel causes of action, edge-case appellate strategy. The xhigh effort level produces the deeper reasoning trace that matters here.
- High-stakes single-shot work where one error has outsized consequences. Sanctions cases (per Damien Charlotin's hallucination database, 1,227 documented globally as of early 2026) cluster around moments where calibration mattered and didn't show up. Opus 4.7's calibration improvements pay off most here.
- Visual evidence review and OCR on scanned discovery — the 3.75-megapixel vision input is 3.26x higher fidelity than 4.6's.
- Privileged-context work where cybersecurity safeguards by default reduce the rogue-prompt risk surface (see the cybersecurity safeguards privileged context spoke).
Use Sonnet 4.6 for:
- Standard contract review against common templates. NDAs, MSAs, employment agreements, supply contracts where the patterns are well-established. Sonnet 4.6's calibration handles these at 40% lower cost.
- High-volume intake processing. Initial conflict checks, basic engagement letter generation, standard form completion.
- Common legal research where the question maps to settled doctrine. Routine motion practice, standard discovery requests, common procedural inquiries.
- First-draft generation for partners who will heavily edit. The model doesn't need to nail the final version; it needs to produce a workable starting point.
- Bulk classification and summarization. Document categorization, deposition summary generation, case law summary work.
- Internal communications drafting. Status updates to clients, internal memos, partner-to-associate task delegation.
Mixed routing (use both within a single workflow):
- Brief drafting: Sonnet 4.6 for first-draft generation, Opus 4.7 for the legal-argument layer where novel reasoning matters.
- Discovery pipeline: Sonnet 4.6 for high-volume relevance coding, Opus 4.7 for privilege review and citation extraction where the stakes warrant the calibration.
- M&A diligence: Sonnet 4.6 for boilerplate document classification, Opus 4.7 for the matter-spanning analysis layer.
The operator pattern: build prompt templates that route by task complexity, not by user preference. Most firms find Sonnet handles 60-80% of routine work; Opus handles the 20-40% where the differentiated capability earns its premium.
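The complexity-based routing rule above reduces to a lookup keyed on task type rather than user preference. A minimal sketch — the task labels, model names, and `route_model` helper are all illustrative assumptions, not Anthropic API identifiers:

```python
# Hypothetical task taxonomy; map these labels onto whatever categories
# the firm's intake or matter-management tooling already uses.
SONNET_TASKS = {
    "contract_review", "intake", "bulk_classification",
    "summarization", "first_draft", "internal_comms",
}
OPUS_TASKS = {
    "novel_argument", "privilege_review", "ma_diligence",
    "appellate_strategy", "visual_evidence", "citation_extraction",
}

def route_model(task_type: str) -> str:
    """Route by task complexity, not by user preference."""
    if task_type in OPUS_TASKS:
        return "opus-4.7"
    if task_type in SONNET_TASKS:
        return "sonnet-4.6"
    # Unrecognized tasks escalate to the higher-capability model by default,
    # trading a little cost for calibration on the unknown case.
    return "opus-4.7"
```

The explicit allowlists matter: routing decisions live in one reviewable table, so legal ops can audit which work runs on the cheaper tier.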
Pricing reality: what the routing decision saves
A 25-attorney mid-market firm, baseline assumption: 100M tokens consumed monthly across the firm (roughly 4M tokens per attorney per month, modest agentic usage). 70% input / 30% output split.
100% Opus 4.7 routing:

- Input: 70M tokens × $5/M = $350/month
- Output: 30M tokens × $25/M = $750/month
- Subtotal: $1,100/month, $13,200/year
- Plus Team Standard seats: 25 × $20 × 12 = $6,000/year
- Total: $19,200/year
Workload-aware routing (70% Sonnet, 30% Opus):

- 70M tokens to Sonnet 4.6: 49M input × $3/M = $147 + 21M output × $15/M = $315 = $462/month
- 30M tokens to Opus 4.7: 21M input × $5/M = $105 + 9M output × $25/M = $225 = $330/month
- Subtotal: $792/month, $9,504/year
- Plus Team Standard seats: $6,000/year
- Total: $15,504/year
Savings from routing: $3,696/year for a 25-attorney firm at 100M tokens/month. Scales linearly with usage. A 100-attorney firm at 400M tokens/month saves roughly $14,800/year. A 500-attorney firm at 2B tokens/month saves $74,000/year.
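The arithmetic generalizes to any volume and routing split. A minimal sketch using the list prices cited above; the token volumes and 70/30 input/output split are the baseline assumptions from the 25-attorney example:

```python
# $/M-token list prices as cited above.
OPUS = {"in": 5.00, "out": 25.00}
SONNET = {"in": 3.00, "out": 15.00}

def monthly_model_cost(m_tokens: float, input_share: float,
                       sonnet_share: float) -> float:
    """Blended monthly model cost for a given Sonnet/Opus routing split.

    m_tokens     -- total monthly consumption, in millions of tokens
    input_share  -- fraction of tokens that are input (vs. output)
    sonnet_share -- fraction of the workload routed to Sonnet
    """
    m_in = m_tokens * input_share
    m_out = m_tokens * (1 - input_share)

    def cost(share, price):
        return m_in * share * price["in"] + m_out * share * price["out"]

    return cost(sonnet_share, SONNET) + cost(1 - sonnet_share, OPUS)

all_opus = monthly_model_cost(100, 0.70, sonnet_share=0.0)  # ~$1,100/month
routed = monthly_model_cost(100, 0.70, sonnet_share=0.70)   # ~$792/month
annual_savings = (all_opus - routed) * 12                   # ~$3,696/year
```

Swapping in a firm's actual token volumes and practice-mix split turns the back-of-envelope numbers above into a per-firm budget line.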
The second-order math: the savings compound when the firm uses Sonnet for the high-volume bulk work where its capabilities are genuinely sufficient, and reserves Opus for the complex work where the differential matters. Firms that try to save by routing everything to Sonnet typically see verification cost increase (more iteration cycles on edge cases that Sonnet misses), which can offset the savings.
The third-order: the Opus 4.7 vs Opus 4.6 cost analysis covers the within-Opus tokenizer change. The same tokenizer dynamics apply when comparing Opus 4.7 against Sonnet 4.6, so the actual savings for legal-prose-heavy workloads sit closer to the upper end of estimates.
Operational patterns: how firms actually deploy both models
Three deployment patterns work in practice:
Pattern 1: Default-Opus with Sonnet fallback for bulk work.
The firm runs Opus 4.7 as the default model on claude.ai Team or via API. For specific high-volume workloads (intake processing, document classification, deposition summary work), the firm builds prompt templates that explicitly route to Sonnet 4.6 via direct API calls. The default user experience stays on Opus; the bulk work runs cheaper.
This pattern fits firms with limited prompt-engineering capability — most users never have to think about model selection.
Pattern 2: Default-Sonnet with Opus escalation.
The firm runs Sonnet 4.6 as the default model. Users explicitly invoke Opus 4.7 for tasks marked high-stakes (novel arguments, M&A diligence, privileged work). The default user experience stays on Sonnet; Opus is reserved for explicit escalation.
This pattern fits cost-conscious firms with mature internal AI training. Users learn when to escalate and the firm captures the cost differential as a default.
Pattern 3: Workflow-routed (most common at scale).
The firm builds workflow-specific prompt templates that route to the appropriate model per task. Discovery review pipelines route bulk classification to Sonnet and privilege/citation work to Opus. Brief drafting routes first-draft generation to Sonnet and legal-argument refinement to Opus. M&A diligence routes document categorization to Sonnet and the matter-spanning analysis layer to Opus.
This pattern fits firms with dedicated legal operations or AI engineering capability. The cost savings are largest, and the user experience is most polished, but it requires upfront prompt-engineering investment.
For BigLaw firms deploying through Microsoft Foundry, AWS Bedrock, or Vertex AI, the routing logic typically lives in the firm's middleware layer rather than in user-facing prompts.
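In the middleware pattern, the routing decision attaches to the workflow step itself, so user-facing tools never expose a model picker. A sketch under stated assumptions — the step names and the `dispatch` stand-in are hypothetical, not a real Bedrock/Vertex/Foundry client:

```python
from dataclasses import dataclass

@dataclass(frozen=True)
class Step:
    name: str
    model: str  # "sonnet" for bulk tiers, "opus" for high-stakes tiers

# Each pipeline declares its own per-step routing up front, so the
# Sonnet/Opus split is a reviewable configuration, not a runtime choice.
DISCOVERY_PIPELINE = [
    Step("relevance_coding", "sonnet"),
    Step("privilege_review", "opus"),
    Step("citation_extraction", "opus"),
]

def dispatch(step: Step, payload: str) -> dict:
    # Stand-in for the middleware call into the firm's cloud provider;
    # a real implementation would invoke the chosen provider SDK here.
    return {"step": step.name, "model": step.model, "payload": payload}
```

Because the pipeline is plain data, the same definition can drive both the routing and the per-matter budget memo.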
Recommendation by firm profile
Solo practitioners and small firms (1-10 attorneys): Default-Opus pattern. The cost differential at solo volume is small enough that routing complexity isn't worth it. Claude Pro ($20/month) covers most workloads. For occasional high-volume tasks, fall back to Sonnet via direct API calls — but most solos don't need this optimization.
Mid-market firms (10-50 attorneys): Default-Sonnet with Opus escalation, or workflow-routed if the firm has legal ops capability. Annual savings at 25 attorneys: $3,000-$5,000. Worth the prompt-engineering investment if the firm can sustain it. The Opus 4.7 for legal teams operator read covers the broader procurement context.
BigLaw and AmLaw 100: Workflow-routed deployment. Annual savings at 500 attorneys: $50,000-$100,000+. Build the routing logic into firm middleware; let user-facing tools default to whatever feels natural. Most BigLaw firms running structured AI procurement already have this capability or can build it within a quarter.
By practice area:

- High-volume transactional (employment, real estate, basic commercial) → Sonnet handles 80%+ of workload.
- Complex transactional (M&A, securities, complex commercial) → Opus for matter-spanning work, Sonnet for boilerplate.
- Litigation → Sonnet for high-volume discovery work, Opus for novel argument drafting and privilege review.
- Regulatory and government contracts → mostly Opus, given calibration requirements on rare statutory and regulatory edge cases.
- IP and tech-transactional → mixed, with workload-driven routing.
Pick the deployment pattern that matches the firm's operational capability, not the pattern that promises the largest theoretical savings. A pattern the firm can't sustain delivers worse outcomes than a simpler pattern with smaller savings.
The bottom line: Most firms running Opus 4.7 by default are paying 30-50% more than they need to for high-volume routine legal work that Sonnet 4.6 handles at comparable quality. The right answer is workload-aware routing — Sonnet for high-volume bulk work, Opus for complex single-shot reasoning, multi-session matter work, and high-stakes calibration. Annual savings scale with firm size: $3,000-$5,000 for mid-market, $50,000-$100,000+ for BigLaw. Build the routing logic; the upfront prompt-engineering investment pays back within the first month at any non-trivial usage volume.
AI-Assisted Research. This piece was researched and written with AI assistance, reviewed and edited by Manu Ayala. For deeper takes and the perspective behind the research, follow me on LinkedIn or email me directly.
