Microsoft shipped contract-comparison-with-track-changes inside Word on April 15, 2026 as part of the lawyer-targeted Copilot capabilities release. The capability does three things in sequence: compares two agreements, lists differences and missing provisions, and lands every Copilot edit as an audit-trail tracked change a partner can review and accept or reject. That last piece — the track-changes integration — is the unlock most law firms have been asking AI vendors for since 2023. It's also why Copilot's $30/user/month enterprise add-on is now the procurement floor for AI in contract review at firms running Microsoft 365 (more than 90% of US law firms). Here's how the workflow runs in practice, what it replaces, what it doesn't replace, and how to ship it inside a firm without breaking partner-supervision audit trails.


What the contract-comparison capability actually does inside Word

Open two Word documents — typically a counterparty's draft and your firm's standard form. Click the Copilot pane. Ask Copilot to compare the agreements and flag differences. Copilot does four things in roughly 30-90 seconds depending on document length:

1. Lists structural differences. Sections present in one agreement but not the other, sections renamed, sections reordered. Output lands as a structured summary in the Copilot pane.
2. Lists clause-level differences. Within sections that exist in both, Copilot identifies word-level and concept-level changes: different liability caps, different governing law, different termination triggers, different IP ownership language.
3. Flags missing provisions against a configurable firm playbook. If your firm's standard NDA includes a non-solicitation clause and the counterparty's draft drops it, Copilot surfaces the missing provision with a suggested redline.
4. Drops every edit as a tracked change when you ask Copilot to apply changes. Partners review in the standard Word track-changes pane and accept, reject, or comment per change. The full edit history persists in the document metadata for matter file retention.

The operational improvement over manual redlining is roughly a 60-75% time reduction on standard agreements (NDAs, engagement letters, basic services agreements) and 30-50% on complex agreements (M&A purchase agreements, complex licensing, multi-party indemnification structures), according to IT-director surveys at firms running early Copilot pilots. Accuracy is high enough on standard work to ship without partner verification of every individual flag, but not high enough on complex work to ship without senior-associate review.

Why track-changes integration is the procurement unlock

Every legal AI tool that's tried to compete in contract review since 2023 has fought the same workflow problem: how do you preserve partner-supervision audit trails when AI is doing the drafting? Harvey, Spellbook, CoCounsel, Luminance, and Kira all have versions of contract comparison. Most output to a separate review pane or a separate redline document. The partner reviewing the output has to manually port accepted changes into the working draft.

Copilot's track-changes integration eliminates that step. Every Copilot edit lands directly in the working Word document as a tracked change, same as if a partner or associate had made the edit manually. The partner's review workflow doesn't change. The matter file retention doesn't change. The ethics-committee audit trail doesn't change. Copilot fits inside the workflow law firms already run, instead of asking the workflow to fit inside Copilot.

The second-order effect: the procurement question shifts from "do we adopt a new AI workflow?" to "do we add an AI capability to the workflow we already run?" That's a materially easier sell to risk-and-ethics committees, partner-track approval committees, and IT security review. Most firms can move from contract decision to broad rollout in 90-120 days, whereas Harvey or Spellbook deployments at the same firm typically take 6-9 months.

The third-order effect: federal court AI disclosure standing orders (300+ federal judges have AI-related orders as of 2026) increasingly require firms to identify which sections of court filings were drafted with AI assistance. The track-changes audit trail makes that disclosure mechanical: query the document metadata for Copilot-attributed edits and generate the disclosure statement automatically. Firms running Harvey or Spellbook in parallel with non-track-changes outputs are doing this disclosure manually, with materially higher error risk.
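Querying the document metadata for attributed edits is possible with standard tooling because a .docx file is a zip archive, and tracked insertions and deletions live in word/document.xml as `w:ins` and `w:del` elements carrying a `w:author` attribute (per the OOXML spec). A minimal sketch follows; the function name is ours, and the exact author string Microsoft stamps on Copilot edits is an assumption you should verify in your own tenant before building a disclosure workflow on it.

```python
import zipfile
from xml.etree import ElementTree as ET

# WordprocessingML namespace used by w:ins / w:del tracked-change elements.
W = "{http://schemas.openxmlformats.org/wordprocessingml/2006/main}"

def tracked_change_authors(docx):
    """Count tracked insertions and deletions per author in a .docx.

    `docx` is a path or file-like object. Tracked changes appear in
    word/document.xml as <w:ins> and <w:del> elements with a w:author
    attribute; the value Copilot uses may differ per tenant (assumption).
    """
    counts = {}
    with zipfile.ZipFile(docx) as z:
        root = ET.fromstring(z.read("word/document.xml"))
    for tag in ("ins", "del"):
        for el in root.iter(W + tag):
            author = el.get(W + "author", "unknown")
            counts[author] = counts.get(author, 0) + 1
    return counts
```

A disclosure statement generator would sit on top of this: filter the counts for the Copilot-attributed author string and map the edits back to filing sections.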

Configurable playbook — how to make Copilot match your firm's standards

Out of the box, Copilot's missing-provision detection runs against general industry-standard contract structures. For a firm with specific standards (a particular NDA structure, specific indemnification language, custom IP carve-outs), the comparison is only as useful as the playbook the firm configures.

Microsoft's approach runs through Microsoft Graph and SharePoint: store your firm's standard form documents in a designated SharePoint location, tag them with metadata identifying document type and use case, and Copilot grounds its comparison against those forms. Setup involves three layers:

- Document repository. Most firms use a designated SharePoint site collection with subfolders by agreement type (NDAs, engagement letters, services agreements, M&A precedents). Each agreement type has 3-7 representative standard forms tagged with metadata.
- Metadata schema. Per Microsoft's Copilot setup documentation, the recommended metadata fields are: agreement type, jurisdiction, last partner review date, approved-by partner, version. This grounds Copilot's missing-provision detection against the right standard for the right context.
- Prompt templates. Internal firm prompts ("compare against our 2026 NDA standard," "flag missing carve-outs from our IP policy template") get saved as reusable Copilot prompts inside the firm tenant.
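The metadata schema above lends itself to a small validation check run whenever a standard form is added to the repository. This is a sketch, not part of Copilot or SharePoint: the field names follow the recommended schema, while the staleness cutoff and the helper name are hypothetical choices a firm would set for itself.

```python
from datetime import date

# Field names follow the recommended metadata schema; values are examples.
REQUIRED_FIELDS = {
    "agreement_type",       # e.g. "NDA", "engagement_letter", "MSA"
    "jurisdiction",         # e.g. "NY", "DE", "England & Wales"
    "last_partner_review",  # ISO date of last partner sign-off
    "approved_by",          # reviewing partner
    "version",              # e.g. "2026.1"
}

# Hypothetical staleness cutoff: forms reviewed before this date get flagged.
STALE_BEFORE = date(2025, 1, 1)

def validate_form_metadata(meta: dict) -> list:
    """Return a list of problems with a standard form's metadata tags."""
    problems = [f"missing field: {f}" for f in sorted(REQUIRED_FIELDS - meta.keys())]
    review = meta.get("last_partner_review")
    if review:
        try:
            if date.fromisoformat(review) < STALE_BEFORE:
                problems.append("standard form is stale; re-review before grounding")
        except ValueError:
            problems.append("last_partner_review is not an ISO date")
    return problems
```

Running a check like this before tagging keeps Copilot from grounding missing-provision detection against an unapproved or outdated form.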

The operational reality: setup is roughly 40-80 hours for a 50-attorney firm, depending on how many agreement types are in scope and how clean the existing form repository is. Most firms front-load the work in a 60-day pilot, focused on the 3-5 highest-volume agreement types (NDAs, engagement letters, master services agreements, software licensing if relevant, real-estate leases if relevant). The Microsoft Graph firm knowledge management spoke covers the SharePoint architecture in depth.

What Copilot doesn't do — and where Harvey/Spellbook still win

Copilot's contract comparison is broad-and-fast, not deep. The capability gaps that matter for serious firm contract review work:

- No specialized clause-library training. Spellbook ships with a precedent learning feature (Spellbook Library) trained on contract precedent. Copilot grounds in your firm's documents but doesn't carry domain-specialized clause intelligence beyond what your repository contains. For specialized practice areas (complex M&A, IP licensing, structured finance), Spellbook or a vertical-specialized tool will catch nuances Copilot misses.
- No deep negotiation playbook. Harvey ships with sophisticated negotiation-position playbooks tailored to AmLaw 100 firm practice patterns. Copilot can be configured to your firm's positions, but the configuration depth Copilot supports is narrower than Harvey's purpose-built layer.
- No standalone clause extraction database. Kira and Luminance build firm-wide clause libraries that get richer with each matter. Copilot operates per-document; the institutional knowledge captured across matters is shallower unless you actively maintain the SharePoint repository.
- Limited multi-document comparison. Copilot handles two-document comparison cleanly. For diligence work requiring 50+ document comparison (M&A data rooms, regulatory filings review), Copilot's interface struggles. Harvey's Vault feature and Spellbook's bulk processing are stronger here.

The honest read: for the 70-80% of firm contract review that's standard agreements (NDAs, engagement letters, vendor contracts, basic licensing), Copilot is more than sufficient and the procurement economics are an order of magnitude better than vertical legal AI. For the 20-30% that's specialized or high-stakes (complex M&A, structured finance, multi-party transactions), the vertical tools earn their per-seat premium. Most BigLaw firms will run both. The Copilot ROI vs Cowork vs Harvey comparison covers the per-firm-size math.

Deployment recommendations by firm size and practice mix

Solo practitioners and 1-5 attorney firms: Copilot Business standalone add-on at $18/user/month annually is sufficient. Set up the SharePoint repository with 3-5 standard forms covering 80% of typical engagements. Use the Word contract comparison primarily for NDA review, engagement letter intake, and basic services agreement review. Skip the complex playbook configuration: at this scale the partner is doing the final review anyway, and Copilot's role is to surface obvious differences faster.

Mid-market firms (10-50 attorneys): Copilot for M365 enterprise add-on at $30/user/month annually. Allocate a 60-day pilot to configure the firm playbook for the top 3 agreement types. Roll out to the contract review team first (associates, paralegals, transactional partners). Expand to the litigation team for discovery document review use cases (different workflow, same Word integration). Set a quarterly review of which agreement types are getting the most Copilot use and where partner-rejection rates are highest; those are the configurations that need refinement.

BigLaw and AmLaw 100 firms: Copilot is the floor, not the only AI surface. Most firms at this scale will run Copilot for the breadth (every associate, paralegal, and staff attorney across all practice groups) and Harvey or CoCounsel for the depth (partner-supervised matter work, complex M&A, structured finance, regulatory practice). The configuration question is which agreement types route to which tool: typically, standard agreements stay in Copilot, and complex agreements over a defined complexity threshold route to the vertical tool. The Copilot procurement process for law firm IT covers the deployment timeline and committee structure.
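A routing rule like the one just described can be made explicit so intake staff and matter-management tooling apply it consistently. Everything here is illustrative: the thresholds, the agreement-type labels, and the function itself are hypothetical stand-ins for a complexity definition each firm sets for itself.

```python
def route_review(agreement_type: str, deal_value_usd: float, party_count: int) -> str:
    """Illustrative routing rule: standard work stays in Copilot,
    complex work routes to the vertical tool (e.g. Harvey or CoCounsel,
    partner-supervised). All thresholds are hypothetical."""
    STANDARD = {"NDA", "engagement_letter", "vendor_contract", "basic_license"}
    if (agreement_type in STANDARD
            and deal_value_usd < 5_000_000   # hypothetical value threshold
            and party_count <= 2):           # multi-party deals are complex
        return "copilot"
    return "vertical_tool"
```

Encoding the threshold this way also gives the risk-and-ethics committee a single artifact to review when the routing policy changes.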

By practice area: Transactional M&A practices get the most leverage from Copilot's contract comparison + missing-provision detection. Litigation practices benefit more from Teams deposition summaries and Outlook drafting (different Copilot capabilities, same procurement vehicle). In-house counsel, who often have lighter Microsoft 365 deployments than law firms, should evaluate whether their general-counsel-office stack is on Copilot-compatible licensing before adding the SKU.

The Bottom Line: Copilot's contract-comparison-with-track-changes is the single capability that moved Microsoft from "interesting AI optionality" to "procurement floor" for law firm contract review in 2026. The track-changes integration eliminates the audit-trail friction that's blocked vertical legal AI adoption at most mid-market firms. For the 70-80% of firm contract work that is standard agreements, engagement letters, and vendor contracts, Copilot is sufficient at $30/user/month. For specialized M&A, IP licensing, and structured finance, the vertical tools still win. Most firms will run both. The first move is a 60-day pilot with the firm's top 3 agreement types loaded into the SharePoint playbook.

AI-Assisted Research. This piece was researched and written with AI assistance, reviewed and edited by Manu Ayala. For deeper takes and the perspective behind the research, follow me on LinkedIn or email me directly.