Mid-market firms — call it the 50-500 lawyer band — sit in the worst negotiating position in legal AI procurement right now. They're too small to negotiate the Freshfields-style co-build Anthropic announced on April 23, 2026, and too large to run on consumer-tier Claude Pro without controls. Per Anthropic's pricing page, the relevant tiers are Team Standard at $20-25/seat/month and Team Premium at $100-125/seat/month, with Enterprise at $20/seat/month + usage at API rates. This checklist is the procurement memo a managing partner can hand to IT and finance — every line item references current pricing, current vendor posture, and the operational decisions that sit underneath the contract.
Step 1: Pick the deployment surface before picking the tier
Anthropic ships Claude through five surfaces: claude.ai (Pro, Max, Team, Enterprise), the Claude API, AWS Bedrock, Vertex AI on Google Cloud, and Microsoft Foundry. Model behavior is identical across surfaces. Deployment posture (data residency, SLA, audit trail handling, procurement velocity) differs materially.
For mid-market firms, the usual choice collapses to two options:
- claude.ai Team or Enterprise if the firm doesn't have a deep AWS, Azure, or GCP relationship. Fastest to deploy. Anthropic handles infra. Admin controls and data-protection commitments come with Team tier and above.
- Microsoft Foundry if the firm runs Microsoft 365 across the practice. Per Anthropic's deployment options, Foundry inherits the existing Azure compliance posture, which means the IT security review is shorter because most of the procurement work was done when the firm signed the M365 enterprise agreement.
The second-order read: 90%+ of law firms run Microsoft 365. For firms in that majority, Foundry is the structurally cheapest procurement path because it skips the new-vendor security review. AWS Bedrock is the right choice for AWS-native firms (rare in legal but real). Vertex AI is the right choice for the small minority running Google Workspace as their primary suite.
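The surface selection above reduces to a short decision rule. A minimal sketch, assuming the firm can self-report its primary productivity suite and cloud relationship; the function name and input strings are illustrative, not part of any vendor API:

```python
def deployment_surface(primary_suite, cloud_relationship=None):
    """Rough decision rule from Step 1; inputs are self-reported IT posture."""
    if primary_suite == "microsoft_365":
        return "Microsoft Foundry"        # inherits the Azure compliance posture
    if cloud_relationship == "aws":
        return "AWS Bedrock"              # rare in legal but real
    if primary_suite == "google_workspace":
        return "Vertex AI"
    return "claude.ai Team/Enterprise"    # Anthropic-hosted, fastest to deploy

print(deployment_surface("microsoft_365"))  # Microsoft Foundry
```

The ordering encodes the section's second-order read: the M365 check comes first because that majority path skips the new-vendor security review entirely.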
Step 2: Match the tier to actual workload, not headcount
The default mistake at this firm size: provisioning Team Standard ($20-25/seat/month) seats for every lawyer based on headcount math, then watching usage concentrate in a small subset of practitioners.
The correct math runs the other direction. Per Anthropic's pricing, tier breakdown:
- Claude Pro at $17/user/month annual or $20/user/month monthly — consumer tier, no Team admin controls. Not appropriate for firm work where matter confidentiality requires explicit data-protection commitments.
- Claude Team Standard at $20/seat/month annual or $25/seat/month monthly for 5-150 seats — admin controls plus explicit no-training commitment on team inputs. Floor for firm deployments.
- Claude Team Premium at $100/seat/month annual or $125/seat/month monthly — premium seat tier with higher usage caps. Right for power users (e.g., M&A diligence partners running multi-day multi-session memory work).
- Enterprise at $20/seat/month + usage at API rates with custom terms — for firms negotiating advanced security, compliance, and data residency requirements.
For a 200-lawyer firm: provision Team Standard for the full headcount as the floor, then upgrade specific power users to Premium based on observed usage. A reasonable starting split: 80% Team Standard ($20-25/seat), 15% Team Premium ($100-125/seat), 5% on dedicated API access for internal-tool builds. Total monthly run-rate at this split lands around $7,500-9,000/month for a 200-lawyer firm — $90,000-108,000 annual. The Anthropic Cowork pricing breakdown walks through the per-seat math by deployment scenario.
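The seat-split arithmetic above can be checked in a few lines. A sketch using the per-seat rates quoted in this section; the flat API allowance for the 5% builder cohort is an illustrative assumption, not a quoted price:

```python
# Sketch of the 200-lawyer seat-split math from Step 2.
# Per-seat rates are the figures quoted above; the API budget
# is an assumed placeholder, not an Anthropic quote.

HEADCOUNT = 200

def monthly_run_rate(standard_rate, premium_rate, api_budget):
    standard_seats = round(HEADCOUNT * 0.80)  # 160 seats
    premium_seats = round(HEADCOUNT * 0.15)   # 30 seats (remaining 5% on API)
    return (standard_seats * standard_rate
            + premium_seats * premium_rate
            + api_budget)

# Annual-billing rates with an assumed ~$1,250/month API allowance
low = monthly_run_rate(20, 100, 1250)    # 160*20 + 30*100 + 1250 = 7450
# Monthly-billing rates, same API allowance
high = monthly_run_rate(25, 125, 1250)   # 160*25 + 30*125 + 1250 = 9000

print(f"${low:,}-${high:,}/month (~${low * 12:,}-${high * 12:,} annual)")
```

Run against the quoted rates, this lands on the roughly $7,500-9,000/month range the section cites; the annual figure depends almost entirely on whether the firm commits to annual billing.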
Step 3: Lock down the AI use policy before activating seats
Procurement that ships seats before policy invites the worst version of every AI risk simultaneously: associates pasting privileged context into consumer chat, hallucinated citations landing in court filings, scratchpad files containing matter-specific reasoning sitting on personal devices.
The baseline policy stack (each item gets a clause, not just a mention):
- Approved deployment surfaces. Name them. Claude Team or Foundry, not consumer Pro. Per US v. Heppner (SDNY, Feb 17, 2026), consumer Claude doesn't carry attorney-client privilege protection. The policy makes the deployment surface explicit so associates can't default to the consumer product they used at home.
- Effort levels named, not just model names. Claude Code defaults to xhigh on paid plans. AI policies that name "Claude Opus 4.7" without naming effort levels are stale on day one of 4.7's release.
- Citation verification step. Every AI-generated case citation gets verified against Westlaw, Lexis, or Google Scholar before filing. Per the Charlotin AI Hallucination Cases Database, 1,227 documented sanctions cases globally as of early 2026, accelerating at roughly 5-6 new cases per day.
- Scratchpad and notes file storage. The Opus 4.7 multi-session memory feature lets Claude hold context across sessions via a scratchpad file. Those files contain matter-specific reasoning. They're firm data assets and need explicit storage, retention, and access controls in the policy.
- Disclosure in court filings. Per the federal court AI disclosure landscape, 300+ federal judges have AI-related standing orders or local rules. The firm policy needs to name which courts require disclosure and the firm's default disclosure language.
- Engagement letter language. Clients need to know what AI use looks like on their matters. The policy specifies the default engagement letter clause.
The firm AI policy template spoke covers the framework with the necessary clauses.
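The citation-verification clause can be operationalized as a simple pre-filing gate. A minimal sketch, assuming citations follow the standard volume/reporter/page pattern and that verified citations are tracked in a simple log; the regex is deliberately simplified and the `verified_log` structure is hypothetical, not any citator's API:

```python
import re

# Rough pattern for US reporter citations, e.g. "595 U.S. 30" or "123 F.3d 456".
# Deliberately simplified; a production gate would use a proper citator.
CITATION_RE = re.compile(
    r"\b\d{1,4}\s+(?:U\.S\.|F\.\d?d|F\. Supp\. \d?d|S\. Ct\.)\s+\d{1,4}\b"
)

def unverified_citations(draft_text, verified_log):
    """Return citations in a draft that have no verification record."""
    found = set(CITATION_RE.findall(draft_text))
    return sorted(found - set(verified_log))

draft = "As held in 595 U.S. 30 and 123 F.3d 456, the standard applies."
log = {"595 U.S. 30"}  # citations already checked against Westlaw/Lexis
print(unverified_citations(draft, log))  # ['123 F.3d 456']
```

The point of the gate is procedural, not technical: a filing with a non-empty unverified list doesn't go out the door.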
Step 4: Run a 60-day pilot with measurable outcomes before firm-wide rollout
Procurement that ships firm-wide on day one without a pilot stage learns its lessons in production. The cleaner pattern: 60-day pilot with 15-25 lawyers across two practice groups, with explicit measurable outcomes defined before the pilot starts.
Pilot scope basics:
- Two practice groups, not one. Mixing transactional and litigation surfaces different model behaviors and uncovers integration issues a single-practice pilot misses.
- 15-25 lawyers, mixed seniority. Partners, senior associates, mid-level associates. Senior associates produce the highest signal because they have enough institutional context to catch model errors and enough individual workflow to actually use the tool day-to-day.
- Three to five named workflows. Examples: NDA triage on incoming client work, first-pass discovery document review, due diligence memo drafting, contract clause comparison, deposition prep summary. Don't pilot "general AI use" — pilot specific workflows with specific success criteria.
- Measurable outcomes defined before the pilot starts. Hours saved per matter, error rate caught at QC, billable rate impact, partner-reported quality assessment. One integrity rule: don't fabricate ROI. Measure what actually moved.
- Weekly debrief, written. The pilot's output isn't the rollout decision — it's the documented playbook the rest of the firm uses when seats provision firm-wide.
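The "measurable outcomes defined before the pilot starts" item reduces to a baseline-vs-pilot comparison per named workflow. A sketch assuming cycle times are logged in hours per matter; the workflow names and figures are illustrative, not pilot data:

```python
from statistics import mean

# Hypothetical per-workflow cycle-time logs (hours per matter),
# captured before and during the 60-day pilot.
baseline = {
    "nda_triage":     [6.0, 5.5, 7.0, 6.5],
    "diligence_memo": [14.0, 16.0, 15.0],
}
pilot = {
    "nda_triage":     [2.5, 3.0, 2.0, 2.5],
    "diligence_memo": [9.0, 10.0, 8.5],
}

def hours_saved(workflow):
    """Mean hours saved per matter on one named workflow."""
    return round(mean(baseline[workflow]) - mean(pilot[workflow]), 2)

for wf in baseline:
    print(f"{wf}: {hours_saved(wf)} hours saved per matter")
```

The discipline the section demands lives in the `baseline` dict: if pre-pilot cycle times weren't captured, the hours-saved number six months later is a fabrication, not a measurement.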
The second-order read: a documented pilot becomes the artifact partners use to defend the procurement decision to their finance committee. That artifact has more value than the pilot's actual workflow output.
Step 5: Layer Cowork legal plugin and TR CoCounsel where the practice mix demands it
Anthropic's Claude alone covers the model layer. Mid-market practice mixes usually need at least one of two adjacent layers depending on workload:
- Cowork legal plugin for contract review and NDA triage if the firm does meaningful transactional volume. Per the Cowork legal plugin overview, the plugin is open-source and free — the cost is the underlying Claude subscription. For firms with 20%+ transactional practice, this layer pays for itself in the first month of deployment.
- Thomson Reuters CoCounsel post-rebuild for litigation practices that need Westlaw or Practical Law content embedded in the AI workflow. The rebuilt CoCounsel runs on Anthropic models with proprietary research content embedded. Industry observers report tier prices of $75 (On Demand) up to $500 (All Access) per user/month per Costbench March 2026 and Above the Law August 2025 coverage, not vendor-confirmed. Verify directly with TR sales before quoting.
For a 200-lawyer mid-market firm with mixed practice (50% transactional, 30% litigation, 20% regulatory), a reasonable stack: Claude Team across the firm, Cowork plugin configured for the transactional group, CoCounsel Core or Westlaw Precision + CoCounsel for the litigation partners. Estimated annual: $90,000-108,000 (Claude) + $0 (Cowork plugin) + $50,000-100,000 (CoCounsel for 20-40 litigation partners) = roughly $140,000-208,000 total annual run-rate.
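The stack arithmetic can be checked in a few lines. The figures below are the ranges quoted in this section; the CoCounsel range in particular is industry-reported, not vendor-confirmed:

```python
# Annual run-rate ranges (USD) for the 200-lawyer mixed-practice stack.
# CoCounsel figures are industry-reported, not vendor-confirmed.
stack = {
    "claude_team":   (90_000, 108_000),
    "cowork_plugin": (0, 0),             # open-source; cost is the Claude sub
    "cocounsel":     (50_000, 100_000),  # 20-40 litigation partners
}

low = sum(lo for lo, _ in stack.values())
high = sum(hi for _, hi in stack.values())
print(f"Total annual run-rate: ${low:,}-${high:,}")  # $140,000-$208,000
```

Keeping the components as named line items rather than one blended number matters at renewal time: each layer gets renegotiated or dropped independently.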
That number is the answer to the partner question "what does AI deployment cost us?" — for a firm that wants the model layer plus content depth on top, in 2026 dollars. The Cowork vs Microsoft Copilot vs Spellbook vendor war analysis walks through which adjacent layers fit which practice mixes.
Step 6: Build the metrics dashboard before activation, not after
Most procurement decisions get justified retroactively because nobody instrumented the baseline. The cleaner pattern: instrument the dashboard before activation so the numbers are defensible six months in.
Metrics to capture day one:
- Per-matter AI consumption. Token spend per matter via API logs or claude.ai admin export. Required for matter-cost recovery if the firm bills AI usage back to clients.
- Workflow time-to-completion. Pre-AI vs post-AI cycle time on the named pilot workflows. The number that goes in the partner-defense memo six months later.
- Citation verification catch-rate. How often did the verification step catch a hallucinated citation? Per GPT-5.5's calibration improvements, models are getting better, but the verification step is non-negotiable.
- Bing AI Performance grounding queries. Free via Bing Webmaster Tools. Shows which queries surface the firm's content via Microsoft Copilot citations. The dashboard is invisible to firms that haven't enabled it.
- Engagement letter compliance. Track which matters have AI-disclosure language in the engagement letter vs which don't. Required for the audit trail.
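The per-matter consumption line is the one that most needs day-one instrumentation. A minimal sketch, assuming usage can be exported as (matter_id, tokens, cost) rows; the export format and matter IDs are assumptions, not any admin console's actual schema:

```python
from collections import defaultdict

# Hypothetical rows exported from API logs or the claude.ai admin console:
# (matter_id, tokens_used, usd_cost)
usage_rows = [
    ("M-2041", 120_000, 1.80),
    ("M-2041",  80_000, 1.20),
    ("M-2077", 450_000, 6.75),
]

def per_matter_spend(rows):
    """Aggregate token and dollar spend by matter for cost recovery."""
    totals = defaultdict(lambda: [0, 0.0])
    for matter, tokens, cost in rows:
        totals[matter][0] += tokens
        totals[matter][1] = round(totals[matter][1] + cost, 2)
    return dict(totals)

spend = per_matter_spend(usage_rows)
print(spend["M-2041"])  # [200000, 3.0]
```

Whatever the real export format turns out to be, the requirement is the same: spend must be attributable to a matter number from the first activated seat, or cost recovery is reconstruction rather than billing.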
The second-order read: dashboards aren't optional infrastructure — they're the artifact that decides whether the firm renews the procurement contract at the end of year one. Instrumenting before activation costs about a week of legal-ops engineering time. Instrumenting after activation costs the entire defensibility of the original decision.
The Bottom Line: Mid-market firms have the cleanest procurement runway right now if they sequence the steps. Pick the deployment surface first (Foundry if Microsoft 365 native, claude.ai Team if not). Match tier to workload (80% Team Standard, 15% Premium, 5% API). Lock down the policy before activating seats. Run the 60-day two-group pilot with measurable outcomes. Layer the Cowork plugin and TR CoCounsel where the practice mix demands them. Instrument the dashboard before activation, not after. The total annual run-rate for a 200-lawyer firm with the full stack lands around $140,000-208,000 — defensible, predictable, and structurally cheaper than the co-build deals the firm can't access at this size.
AI-Assisted Research. This piece was researched and written with AI assistance, reviewed and edited by Manu Ayala. For deeper takes and the perspective behind the research, follow me on LinkedIn or email me directly.
