Does Harvey AI train on law firm data?

No. Harvey explicitly commits to not training its models on client data. Your firm's documents, prompts, and outputs are processed but not used for model improvement. This is a critical distinction from free-tier AI tools (ChatGPT Free, Claude Free) where conversations may be used for training. Get this commitment in writing in your DPA.

Is Harvey AI SOC 2 certified?

Yes. Harvey maintains SOC 2 Type II certification, meaning its security controls have been independently audited over a sustained period. This covers data handling, access controls, encryption, availability, and incident response. Request the latest SOC 2 report from Harvey during your evaluation.

Can using Harvey AI waive attorney-client privilege?

Not if Harvey's data handling terms are properly structured. Harvey's no-training commitment and data isolation are designed to prevent privilege waiver. However, using free-tier AI tools that train on conversation data could constitute privilege waiver under evolving case law. Your DPA should explicitly address privilege protection.

Is it safe to upload client documents to Harvey AI?

With Harvey's enterprise plan — yes, with appropriate precautions. Harvey's data isolation, no-training commitment, and SOC 2 certification provide adequate security for most client documents. However, always verify your DPA terms, implement firm-wide AI use policies, and assess whether specific documents (trade secrets, national security) require additional handling.

What data privacy certifications does Harvey AI have?

Harvey holds SOC 2 Type II certification and runs on Microsoft Azure infrastructure (which carries FedRAMP, ISO 27001, and SOC 1/2/3 certifications). Harvey offers data residency options, encryption in transit and at rest, and role-based access controls. Request their full security documentation and most recent audit reports during evaluation.

Harvey AI Security & Data Privacy: What the Contract Says

When a law firm evaluates Harvey AI, security and data privacy aren't features — they're deal-breakers or deal-makers. Harvey handles the most sensitive data in any profession: attorney-client privileged communications, trade secrets, M&A intel, and litigation strategy. A single data breach or unauthorized model training on client data could trigger malpractice claims, bar complaints, and client exodus.

Here's what Harvey's enterprise security actually includes, how it compares to consumer AI tools your attorneys might already be using, and what your DPA should cover. The Heppner ruling context — where courts are increasingly scrutinizing AI data handling — makes this non-negotiable.

Harvey AI security infrastructure and certifications

Harvey's enterprise security framework includes the controls that large firm CISOs require before approving any SaaS platform:

SOC 2 Type II compliance: Harvey maintains SOC 2 Type II certification, meaning their security controls have been independently audited over a sustained period — not just a point-in-time snapshot. This covers data handling, access controls, encryption, availability, and incident response.

Azure infrastructure: Harvey runs on Microsoft Azure's enterprise cloud infrastructure, which provides the underlying security layers — physical data center security, network isolation, DDoS protection, and geographic data residency options. Azure's compliance portfolio (FedRAMP, ISO 27001, SOC 1/2/3) provides additional assurance layers.

Data isolation: This is the critical one for law firms. Harvey maintains logical data isolation between client organizations. Your firm's data is separated from other Harvey clients' data at the application layer. Custom agents, uploaded documents, and conversation history are siloed to your firm's environment.

Encryption: Data encrypted in transit (TLS 1.2+) and at rest (AES-256). This is table stakes for enterprise SaaS but still worth confirming in your DPA.

Access controls: Role-based access within your firm's Harvey instance, SSO integration with your firm's identity provider, and audit logging of all user activity. Managing partners and IT administrators can see who accessed what and when.

Incident response: Harvey maintains a documented incident response plan with notification timelines. Your DPA should specify notification windows (ideally 24-72 hours) for any breach affecting your firm's data.

Does Harvey AI train on your client data?

No — Harvey does not train its models on client data. This is Harvey's explicit commitment and the single most important security feature for law firms.

Here's how it works technically: when your attorneys use Harvey, the inputs (documents, queries, prompts) and outputs (analysis, drafts, extracted data) are processed by the model but not retained for model training purposes. Your firm's data improves your firm's experience (through conversation context and agent customization) but doesn't leak into Harvey's base model or other firms' agents.

This is fundamentally different from how consumer AI tools operate:

ChatGPT Free: OpenAI's terms explicitly allow using conversations for model training unless you opt out. Any client data entered into ChatGPT Free may be used to train future models — a clear attorney-client privilege risk.

Claude Free: Anthropic may use free-tier conversations for model improvement. The Pro and Team tiers have stronger data commitments, but they're still consumer-grade compared to Harvey's enterprise framework.

Harvey Enterprise: Your data is your data. Period. The DPA should explicitly state: no model training on client data, no data sharing across organizations, no retention beyond operational necessity, and client-directed deletion upon contract termination.

Why this matters legally: Under the evolving case law around AI and privilege (including the Heppner ruling context), inputting privileged information into an AI system that trains on that data could constitute waiver of attorney-client privilege. Harvey's no-training commitment is designed specifically to avoid this risk.

Harvey AI vs consumer AI tools: privacy comparison

The privacy gap between Harvey and the AI tools your attorneys might already be using is significant:

Harvey AI (Enterprise): - No model training on client data - SOC 2 Type II certified - Data isolation between organizations - Custom DPA with law-firm-specific terms - Audit logging and access controls - Data residency options - Contractual deletion obligations

Claude Team ($25/month): - Anthropic commits to not training on Team/Pro data - No SOC 2 certification for the Team tier - Shared infrastructure (no dedicated data isolation) - Standard business terms (not law-firm-specific DPA) - Basic usage logging - No data residency options - Standard data retention policies

ChatGPT Team ($25/month): - OpenAI commits to not training on Team data - SOC 2 Type II certified - Shared infrastructure - Standard business terms with limited DPA customization - Admin console with usage logging - Limited data residency - Standard retention policies

The risk gradient: Harvey provides the strongest data protection. Claude Team and ChatGPT Team are acceptable for non-privileged legal work but lack the enterprise isolation and custom DPA terms that large firms require for sensitive client data. Free tiers of any AI tool are never appropriate for client data.

Firms using consumer AI tools for client work should at minimum: require Team/Enterprise tiers, implement firm-wide AI use policies, and restrict privileged or highly confidential information to Harvey or similar enterprise platforms.

What your Harvey AI DPA should include

Your Data Processing Agreement with Harvey should cover these specific provisions — and your firm should negotiate them, not just sign Harvey's standard form:

1. No model training clause: Explicit prohibition on using your firm's data (inputs, outputs, documents, prompts) for model training, fine-tuning, or improvement of Harvey's base models or any third party's models.

2. Data isolation specifics: Technical description of how your firm's data is isolated from other Harvey clients. Logical isolation vs physical isolation. Whether your firm's data resides on shared or dedicated infrastructure.

3. Subprocessor disclosure: Full list of subprocessors who may access your firm's data — including cloud providers (Azure), model providers, and any third-party services. Notification requirements for subprocessor changes.

4. Data residency: Where your data is stored and processed geographically. For firms with international clients, data residency may be dictated by GDPR, client requirements, or cross-border data transfer restrictions.

5. Breach notification: Specific timeline for breach notification (push for 24-48 hours). Definition of what constitutes a breach. Your firm's rights to conduct its own investigation.

6. Data retention and deletion: How long Harvey retains your firm's data after processing. Your right to request deletion. Technical method of deletion (logical vs cryptographic erasure). Certification of deletion upon contract termination.

7. Audit rights: Your firm's right to audit Harvey's security practices — either directly or through an independent third-party auditor. Access to SOC 2 reports and penetration test results.

8. Attorney-client privilege protection: Express acknowledgment that data processed through Harvey remains subject to attorney-client privilege and work product doctrine. Harvey's agreement not to disclose data in response to third-party requests without your firm's prior approval (except where legally compelled).

Don't accept a standard-form DPA without reviewing these provisions. If Harvey's standard terms are missing any of these, negotiate them in.

Law firm AI policy: what every firm needs before deploying Harvey

Deploying Harvey (or any AI tool) without a firm-wide AI use policy is malpractice waiting to happen. Here's what your policy should cover:

Approved tools list: Which AI tools are approved for which types of work. Harvey for client matters with sensitive data. Claude Team for internal drafting and research. ChatGPT for non-confidential productivity. Free-tier AI tools for nothing client-related.

Data classification: What data can go into which AI tool. Privileged communications — Harvey only. Confidential client data — Harvey or approved enterprise tools. Non-confidential legal research — any approved tool. Personal/marketing — any tool.

Verification requirements: All AI-generated citations must be independently verified. All AI-generated legal analysis must be reviewed by a licensed attorney before use in any client deliverable. No AI output goes directly to clients, courts, or opposing counsel without attorney review.

Disclosure obligations: When and how to disclose AI use to clients. Some clients require notification. Some jurisdictions have emerging disclosure requirements. Your firm's position should be documented.

Training requirements: Mandatory AI training for all attorneys before access is granted. Annual refresher training. Practice-group-specific training on Harvey agents and workflows.

Incident response: What to do if privileged data is accidentally entered into an unapproved AI tool. Who to notify. How to document the incident. Steps to assess privilege waiver risk.

Firms without these policies are running blind. The ethical obligations around AI use in legal practice are evolving rapidly, and a documented policy is your first line of defense against bar complaints and malpractice claims. The governing rules are Model Rules 1.6 (confidentiality of client information) and 1.9 (duties to former clients) — both apply when attorney-client communications or prior client data enters any AI system. A vendor's SOC 2 certification does not insulate an attorney from a Rule 1.6 violation if the data handling exceeds what the client consented to.

The Bottom Line: Harvey's enterprise security (SOC 2, no model training on client data, data isolation) is genuinely strong — but it only matters if your firm negotiates a proper DPA and implements a firm-wide AI use policy that restricts sensitive data to enterprise-grade tools.

AI-Assisted Research. This piece was researched and written with AI assistance, reviewed and edited by Manu Ayala. For deeper takes and the perspective behind the research, follow me on LinkedIn or email me directly.

Harvey AI Security & Data Privacy: What the Contract Says

Harvey AI security infrastructure and certifications

Does Harvey AI train on your client data?

Harvey AI vs consumer AI tools: privacy comparison

What your Harvey AI DPA should include

Law firm AI policy: what every firm needs before deploying Harvey

Frequently Asked Questions

Related Across AI Vortex

Need help with AI infrastructure?

Harvey AI security infrastructure and certifications

Does Harvey AI train on your client data?

Harvey AI vs consumer AI tools: privacy comparison

What your Harvey AI DPA should include

Law firm AI policy: what every firm needs before deploying Harvey

Frequently Asked Questions

More from AI Tools

Related Across AI Vortex

Need help with AI infrastructure?