What Data Should Never Be Entered Into AI Tools?

Employees should never enter protected health information, personally identifiable information, client confidential data, trade secrets, regulated financial data, legally privileged communications, or authentication credentials into any AI tool that has not been formally approved with appropriate data protection agreements in place.

Direct Answer

The following data categories should never be entered into AI tools without explicit organizational approval, a reviewed data processing agreement with the AI provider, and employee training on acceptable use: protected health information (PHI), personally identifiable information (PII), client confidential or proprietary data, trade secrets and intellectual property, financial data subject to regulatory requirements, attorney-client privileged communications, and authentication credentials or system access information.

Why This Matters More Than Most Employees Realize

AI tools are highly capable of processing and synthesizing information, but that processing happens on infrastructure controlled by a third party. When data enters an AI tool, it leaves your organization's security perimeter and enters a vendor's environment. Depending on the tool, the service tier, and the terms of service, that data may be:

  • Retained in conversation logs accessible to the vendor
  • Used to train or improve the AI model
  • Accessible to vendor support staff under certain circumstances
  • Subject to the vendor's security posture, not your organization's

For most categories of sensitive data, these conditions are incompatible with your legal obligations, contractual commitments, or organizational risk tolerance.

Data That Should Never Enter AI Tools (Without Proper Agreements)

Protected Health Information (PHI)

Any information that could identify a patient and relates to their health, treatment, or payment is PHI under HIPAA. Examples: patient names, diagnoses, medications, treatment records, insurance information, appointment data.

For a covered entity or business associate, entering PHI into an AI tool whose vendor has not signed a Business Associate Agreement (BAA) is likely a HIPAA violation, regardless of intent.

Personally Identifiable Information (PII)

PII includes: full names combined with other identifiers, Social Security numbers, driver's license numbers, financial account numbers, biometric data, precise geolocation, and similar data elements regulated under GDPR, CCPA, and other privacy laws.

Even partial PII — a name and employer, or an email and date of birth — can constitute regulated personal data under many frameworks.

Client Confidential and Proprietary Data

Client data shared under a confidentiality agreement generally cannot be passed to third-party AI tools without breaching that agreement, unless the client knows and consents (and often only with explicit contractual authorization). This includes client business strategies, financial projections, legal matters, personnel information, and any data marked confidential.

Trade Secrets and Intellectual Property

Proprietary processes, formulas, source code, unreleased product designs, pricing models, competitive intelligence, and business methodologies may qualify as trade secrets, but only for as long as the organization takes reasonable steps to keep them secret. Sharing them with AI tools, particularly tools used through personal accounts, potentially exposes them to third parties and may constitute a disclosure that weakens legal trade secret protection.

Financial Data Subject to Regulatory Requirements

Material non-public information (MNPI), client financial account data, audit work papers, and similar financial data are subject to SEC, FINRA, and banking regulatory requirements. Sharing them with unapproved AI tools may violate these obligations and fiduciary duties.

Attorney-Client Privileged Communications

Legal advice, privileged work product, and communications subject to attorney-client privilege should never enter AI tools. Disclosure to a third-party AI platform may constitute a waiver of privilege, with potentially severe legal consequences.

Authentication Credentials

Passwords, API keys, access tokens, private keys, and similar credentials should never be entered into AI tools under any circumstances. Even pasting a key "just to test a script" is a significant security risk if the AI platform logs or retains conversation history, and any credential that has been pasted should be treated as compromised and rotated.
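
One way to back this rule with tooling is to scan text for credential-like strings before it is sent anywhere. The following is a minimal Python sketch, not a complete secret scanner: the pattern names, the patterns themselves, and the helper functions are hypothetical illustrations, and a real deployment would rely on a much larger, maintained rule set.

```python
import re

# Hypothetical, deliberately small rule set for illustration only.
CREDENTIAL_PATTERNS = {
    # AWS access key IDs commonly start with "AKIA" followed by 16 characters.
    "aws_access_key_id": re.compile(r"\bAKIA[0-9A-Z]{16}\b"),
    # PEM-encoded private key header.
    "private_key_block": re.compile(r"-----BEGIN [A-Z ]*PRIVATE KEY-----"),
    # Assignments such as password=..., api_key: ..., secret = ...
    "secret_assignment": re.compile(
        r"\b(?:password|passwd|secret|api[_-]?key)\s*[:=]\s*\S+",
        re.IGNORECASE,
    ),
}


def find_credentials(text: str) -> list[str]:
    """Return the name of every pattern that matches somewhere in the text."""
    return [
        name
        for name, pattern in CREDENTIAL_PATTERNS.items()
        if pattern.search(text)
    ]


def safe_to_send(text: str) -> bool:
    """True only if no credential-like pattern was found."""
    return not find_credentials(text)
```

A check like this catches only obvious patterns. A clean result does not prove the text is free of secrets, so it supplements the rule above rather than replacing it.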

The Gray Zone: What Employees Often Get Wrong

Many employees assume that if they remove obvious identifiers (names, account numbers), the remaining data is safe to enter into AI. This is often incorrect:

  • De-identified data can be re-identified when combined with other data points.
  • Context alone can identify individuals (a specific job title, department, and medical condition may identify a single person).
  • Aggregated data from regulated datasets may still carry regulatory obligations.
  • Business strategy data is still confidential even when stripped of personal information.

When in doubt, employees should treat data as if it belongs in the "do not share" category.
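
The re-identification risk described above can be made concrete with a simple group-size check, in the spirit of k-anonymity. The Python sketch below uses hypothetical records and field names: it counts how many rows share each combination of indirect identifiers, and a smallest group of size 1 means at least one person is unique in the data even though names and account numbers have been removed.

```python
from collections import Counter


def smallest_group_size(records, quasi_identifiers):
    """Size of the smallest group of records sharing the same
    quasi-identifier values (0 if there are no records)."""
    if not records:
        return 0
    groups = Counter(
        tuple(record[field] for field in quasi_identifiers)
        for record in records
    )
    return min(groups.values())


# Hypothetical "de-identified" rows: no names, no account numbers.
records = [
    {"job_title": "Analyst", "department": "Finance", "condition": "asthma"},
    {"job_title": "Analyst", "department": "Finance", "condition": "diabetes"},
    {"job_title": "Analyst", "department": "Finance", "condition": "asthma"},
    {"job_title": "Director", "department": "Legal", "condition": "diabetes"},
]

k = smallest_group_size(records, ["job_title", "department"])
print(k)  # 1: the only Legal Director is unique, so the row points to one person
```

A result of 1 here shows why stripping names alone is not enough: anyone who knows the organization can tie the remaining row, including the medical condition, to a specific individual.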

Best Practices

  • Create a clear, specific written list of data categories employees may not share with AI tools — generic "sensitive data" guidance is insufficient.
  • Provide approved AI tools that have undergone data protection review for common use cases.
  • Train employees with realistic scenarios, not just policy documents.
  • Establish a process for employees to request AI tool evaluation before use.
  • Apply data classification labels so employees can easily identify restricted data categories.
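
Classification labels and an approved-tool list can also be combined into a simple automated gate. The Python sketch below is an illustration under assumed names: the four label levels, the tool identifiers, and the ceilings assigned to each tool are all hypothetical, and an organization would substitute its own classification scheme and review outcomes.

```python
from enum import IntEnum


class Classification(IntEnum):
    """Hypothetical label scheme, ordered from least to most sensitive."""
    PUBLIC = 0
    INTERNAL = 1
    CONFIDENTIAL = 2
    RESTRICTED = 3  # e.g. PHI, credentials, privileged communications


# Hypothetical registry: the highest label each reviewed tool may receive.
# Tools that are not listed have not been approved at all.
APPROVED_TOOLS = {
    "enterprise-assistant": Classification.CONFIDENTIAL,
    "public-chatbot": Classification.PUBLIC,
}


def may_submit(tool: str, label: Classification) -> bool:
    """Allow submission only to an approved tool cleared for this label."""
    ceiling = APPROVED_TOOLS.get(tool)
    return ceiling is not None and label <= ceiling
```

In this sketch, may_submit("public-chatbot", Classification.INTERNAL) returns False, and so does any call naming a tool that is missing from the registry. Denying by default keeps unreviewed tools out until someone requests an evaluation.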

Key Takeaways

  • PHI, PII, client confidential data, trade secrets, regulated financial data, privileged communications, and credentials should never enter unapproved AI tools.
  • Many employees underestimate what qualifies as sensitive data.
  • De-identification does not always make data safe for AI input.
  • The risk is not the AI itself — it is data leaving your organization's control.
  • Clear, specific policy guidance is more effective than general warnings.