The Health Insurance Portability and Accountability Act (HIPAA) establishes national standards for protecting sensitive patient health information. For healthcare organizations, health insurers, and their business associates, data loss prevention (DLP) plays a vital role in preventing unauthorized disclosure of Protected Health Information (PHI). This guide explains how to test DLP policies for HIPAA compliance and verify that PHI is properly protected across all communication channels.
Protected Health Information (PHI) is any individually identifiable health information that is created, received, maintained, or transmitted by a covered entity or business associate. PHI includes information that relates to an individual's past, present, or future physical or mental health condition, the provision of healthcare, or payment for healthcare — when combined with identifiers that can link the information to a specific individual.
HIPAA's Privacy Rule lists 18 types of identifiers in its Safe Harbor de-identification standard. When any of them is associated with health information, the result is PHI:
| # | Identifier | Examples |
|---|---|---|
| 1 | Names | Full name, maiden name |
| 2 | Geographic data | Street address, city, county, ZIP code (any subdivision smaller than a state) |
| 3 | Dates | Birth date, admission date, discharge date, date of death (all elements except the year) |
| 4 | Phone numbers | Home, mobile, work |
| 5 | Fax numbers | Any fax number |
| 6 | Email addresses | Personal or work email |
| 7 | Social Security Numbers | SSN in any format |
| 8 | Medical record numbers | MRN, chart number |
| 9 | Health plan beneficiary numbers | Insurance member ID |
| 10 | Account numbers | Patient account numbers |
| 11 | Certificate/license numbers | Professional license numbers |
| 12 | Vehicle identifiers | License plate, VIN |
| 13 | Device identifiers | Medical device serial numbers |
| 14 | Web URLs | Patient portal URLs with identifiers |
| 15 | IP addresses | Connected medical device IPs |
| 16 | Biometric identifiers | Fingerprints, voiceprints |
| 17 | Full-face photographs | Photos that identify the individual |
| 18 | Any other unique identifying number | Custom patient identifiers |
PHI detection is more complex than detecting structured data like credit card numbers. A credit card number has a predictable format; most PHI does not. The challenge is that individual data elements (a name, a date) are not PHI by themselves; they become PHI only when combined with health information. DLP systems use several approaches:
Advanced DLP policies look for combinations of identifiers appearing together in the same document or transmission. For example: a patient name + date of birth + medical record number appearing near words like "diagnosis," "treatment," or "prescription" strongly indicates PHI. The more identifiers present, the higher the confidence score.
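The scoring idea can be sketched in a few lines of Python. This is a minimal illustration, not how any particular DLP product works: the patterns, the keyword list, and the `phi_confidence` function are all assumptions made for the example, and the sketch checks the whole text rather than measuring how close the identifiers are to the health terms.

```python
import re

# Hypothetical identifier patterns, chosen only for illustration.
IDENTIFIER_PATTERNS = {
    "ssn": re.compile(r"\b\d{3}-\d{2}-\d{4}\b"),
    "dob": re.compile(r"\b(?:DOB|Date of Birth)[:\s]+\d{1,2}/\d{1,2}/\d{4}\b", re.IGNORECASE),
    "mrn": re.compile(r"\bMRN[:\s#]*\d{6,10}\b", re.IGNORECASE),
}
# Health-context keywords taken from the example in the text above.
HEALTH_KEYWORDS = ("diagnosis", "treatment", "prescription")


def phi_confidence(text: str) -> float:
    """Return a rough 0.0-1.0 score; identifiers only count when health context is present."""
    lowered = text.lower()
    if not any(word in lowered for word in HEALTH_KEYWORDS):
        return 0.0
    matched = sum(1 for pattern in IDENTIFIER_PATTERNS.values() if pattern.search(text))
    return matched / len(IDENTIFIER_PATTERNS)
```

An SSN on its own scores 0.0 here, while the same SSN next to an MRN and the word "treatment" scores higher, which mirrors the behavior you would expect to observe when testing a combination-based policy.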
DLP policies for HIPAA include dictionaries of medical terminology — ICD-10 diagnosis codes, CPT procedure codes, medication names, and clinical terms. When these terms appear alongside personal identifiers, the DLP system flags the content as potential PHI.
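A rough Python sketch of dictionary matching follows. The four-word term list, the approximate ICD-10-CM code shape, and the use of an email address as the only personal identifier are simplifying assumptions; a real policy would load complete code sets and drug lists, and the code regex shown here will also match unrelated tokens such as "A1C".

```python
import re

# Tiny illustrative dictionary; real deployments load full terminology sets.
CLINICAL_TERMS = {"hypertension", "metformin", "biopsy", "diabetes"}
# Approximate shape of an ICD-10-CM code such as E11.9:
# letter, digit, alphanumeric, then an optional dotted suffix.
ICD10_LIKE = re.compile(r"\b[A-Z]\d[0-9A-Z](?:\.[0-9A-Z]{1,4})?\b")
# Stand-in for "a personal identifier": any email address.
EMAIL = re.compile(r"\b[\w.+-]+@[\w-]+\.[\w.-]+\b")


def find_clinical_hits(text):
    """Return (sorted dictionary terms, code-like tokens) found in the text."""
    words = set(re.findall(r"[a-z]+", text.lower()))
    terms = sorted(words & CLINICAL_TERMS)
    codes = ICD10_LIKE.findall(text)
    return terms, codes


def is_potential_phi(text):
    """Flag text only when a clinical hit appears alongside a personal identifier."""
    terms, codes = find_clinical_hits(text)
    return bool(EMAIL.search(text)) and bool(terms or codes)
```

When you test a real policy, the same two negative cases are worth sending: clinical terms with no identifier, and an identifier with no clinical terms.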
Many DLP systems can classify documents based on their structure and content. Medical records, lab reports, insurance claims, and prescription forms follow recognizable patterns that DLP can be trained to identify.
Some PHI elements have structured formats that can be detected with pattern matching:
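As a sketch of what such patterns can look like, the Python below scans text with a handful of regular expressions. The patterns are deliberately simple, and the MRN convention ("MRN" followed by 6 to 10 digits) is a hypothetical format, since medical record numbers vary by organization.

```python
import re

PATTERNS = {
    # SSN written as NNN-NN-NNNN; real engines usually handle more formats.
    "ssn": re.compile(r"\b\d{3}-\d{2}-\d{4}\b"),
    # US-style phone number such as (555) 010-0199 or 555-010-0199.
    "phone": re.compile(r"(?<!\d)(?:\(\d{3}\)\s?|\d{3}[-.\s])\d{3}[-.\s]\d{4}(?!\d)"),
    "email": re.compile(r"\b[\w.+-]+@[\w-]+\.[\w.-]+\b"),
    # Hypothetical MRN convention: "MRN" followed by 6 to 10 digits.
    "mrn": re.compile(r"\bMRN[:\s#]*\d{6,10}\b", re.IGNORECASE),
}


def scan(text):
    """Return a dict mapping each matching pattern name to the list of matches."""
    return {name: p.findall(text) for name, p in PATTERNS.items() if p.search(text)}
```

For example, `scan("SSN 900-12-3456, call 555-010-0199")` reports one SSN match and one phone match and nothing else.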
Start by testing whether your DLP detects individual HIPAA identifiers. Send SSNs, medical record numbers, and health plan IDs independently to establish a baseline for pattern-based detection.
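Baseline values should be synthetic, never real patient data. The sketch below generates one value per identifier type in Python. The MRN and member ID formats are hypothetical, and the SSNs use area numbers 900 to 999, which the Social Security Administration does not assign; a DLP engine that validates SSN ranges may therefore ignore them, so adjust the generator to match what your policy is meant to catch.

```python
import random


def fake_ssn(rng):
    # Area numbers 900-999 are not assigned as SSNs, so these cannot belong to a real person.
    return f"{rng.randint(900, 999)}-{rng.randint(1, 99):02d}-{rng.randint(1, 9999):04d}"


def fake_mrn(rng):
    # Hypothetical format: "MRN" plus eight digits.
    return f"MRN{rng.randint(0, 99_999_999):08d}"


def fake_member_id(rng):
    # Hypothetical health plan ID: three letters plus nine digits.
    letters = "".join(rng.choice("ABCDEFGHJKLMNPQRSTUVWXYZ") for _ in range(3))
    return f"{letters}{rng.randint(0, 999_999_999):09d}"


def baseline_payloads(seed=0):
    """Return one synthetic value per identifier type; the seed makes runs repeatable."""
    rng = random.Random(seed)
    return {"ssn": fake_ssn(rng), "mrn": fake_mrn(rng), "member_id": fake_member_id(rng)}
```

Sending each value in its own request makes it clear which identifier type, if any, triggered a policy.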
Test with combinations of identifiers that constitute PHI. For example, send a payload containing: a patient name, date of birth, SSN, and a diagnosis description. This tests whether your DLP recognizes the combination as PHI even if individual elements wouldn't trigger a policy alone.
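One way to explore which combinations trigger a policy is to generate every subset of the fields and send each one separately. The Python sketch below does this for the four fields in the example; the field labels and values are made up for illustration.

```python
from itertools import combinations

# Synthetic field lines; names and values are illustrative only.
FIELDS = {
    "name": "Patient: Jane Testcase",
    "dob": "Date of Birth: 01/02/1980",
    "ssn": "SSN: 900-12-3456",
    "diagnosis": "Diagnosis: Type 2 diabetes mellitus",
}


def combination_payloads(fields, min_size=2):
    """Yield (field_names, payload_text) for every combination of at least min_size fields."""
    names = list(fields)
    for size in range(min_size, len(names) + 1):
        for combo in combinations(names, size):
            yield combo, "\n".join(fields[name] for name in combo)
```

With four fields and `min_size=2` this yields 11 payloads. Recording which of them are blocked shows where the policy's threshold actually sits.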
Create a test document formatted like a medical record — with patient demographics, visit dates, diagnoses, medications, and provider notes. Upload this as a file to test whether your DLP inspects document content and recognizes the clinical context.
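A test document of this kind can be produced with a short script. The following Python sketch writes a plain-text visit summary; the layout, file name, and every value in it are invented for the example, and you would likely want additional variants in the file types your organization actually uses.

```python
from pathlib import Path
from textwrap import dedent


def build_record_text() -> str:
    """Return a synthetic visit summary laid out like a simple medical record."""
    return dedent("""\
        PATIENT VISIT SUMMARY (SYNTHETIC TEST DATA)

        Patient Name:   Jane Testcase
        Date of Birth:  01/02/1980
        MRN:            00123456
        Visit Date:     03/15/2024

        Diagnoses:
          - E11.9 Type 2 diabetes mellitus without complications
        Medications:
          - Metformin 500 mg, twice daily
        Provider Notes:
          Patient reports improved glucose control. Follow up in 3 months.
        """)


def write_test_record(directory) -> Path:
    """Write the synthetic record into the given directory and return its path."""
    path = Path(directory) / "synthetic_visit_summary.txt"
    path.write_text(build_record_text(), encoding="utf-8")
    return path
```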
Test with large batches of patient records — 50, 100, 500+ records in a single transmission. This simulates a potential data breach scenario and tests whether your DLP handles bulk PHI detection efficiently without timeouts or performance issues.
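Bulk payloads are easy to generate programmatically. The Python sketch below builds a CSV with a configurable number of synthetic records; the column names, the MRN format, and the short list of diagnosis codes are assumptions made for the example.

```python
import csv
import io
import random

BATCH_SIZES = (50, 100, 500)
# A few ICD-10-CM codes used as sample diagnosis values.
DIAGNOSIS_CODES = ("E11.9", "I10", "J45.909")


def synthetic_row(index, rng):
    """Build one synthetic patient row; SSNs use the unassigned 900-999 area range."""
    dob = f"{rng.randint(1, 12):02d}/{rng.randint(1, 28):02d}/{rng.randint(1940, 2005)}"
    ssn = f"{rng.randint(900, 999)}-{rng.randint(1, 99):02d}-{rng.randint(1, 9999):04d}"
    return [f"Test Patient {index:04d}", dob, ssn, f"MRN{index:08d}", rng.choice(DIAGNOSIS_CODES)]


def bulk_csv(count, seed=0):
    """Return CSV text with a header row followed by `count` synthetic records."""
    rng = random.Random(seed)
    buffer = io.StringIO()
    writer = csv.writer(buffer)
    writer.writerow(["name", "dob", "ssn", "mrn", "diagnosis_code"])
    for index in range(1, count + 1):
        writer.writerow(synthetic_row(index, rng))
    return buffer.getvalue()
```

Generating one file per entry in `BATCH_SIZES` and timing each upload gives a simple way to spot timeouts or degraded detection as volume grows.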
Healthcare organizations exchange data in many formats. Test PHI detection in:
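Which formats matter depends on what your organization actually exchanges. As an illustration only, assuming JSON, CSV, and XML are among them, the Python sketch below serializes one synthetic record into each so the same content can be sent in different encodings. The record fields and the `patient` element name are hypothetical.

```python
import csv
import io
import json
import xml.etree.ElementTree as ET

# One synthetic record; every value is made up for testing.
RECORD = {
    "name": "Jane Testcase",
    "dob": "01/02/1980",
    "ssn": "900-12-3456",
    "diagnosis": "Type 2 diabetes mellitus",
}


def as_json(record):
    return json.dumps(record, indent=2)


def as_csv(record):
    buffer = io.StringIO()
    writer = csv.DictWriter(buffer, fieldnames=list(record))
    writer.writeheader()
    writer.writerow(record)
    return buffer.getvalue()


def as_xml(record):
    root = ET.Element("patient")
    for key, value in record.items():
        ET.SubElement(root, key).text = value
    return ET.tostring(root, encoding="unicode")
```

If the DLP catches the record in one format but not another, the gap is in content extraction for that format rather than in the PHI policy itself.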
Test PHI detection across both HTTP and HTTPS. A DLP system can only inspect HTTPS traffic that it decrypts, so healthcare organizations often have strict TLS (SSL) inspection requirements, but some traffic (especially to cloud electronic health record (EHR) systems) may bypass inspection. Verify that PHI detection works on both encrypted and unencrypted channels.
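A small script can send the same synthetic payload over both schemes. The Python sketch below uses only the standard library; the host name `dlp-test.example.com` and the `/upload` path are placeholders for whatever test endpoint you are authorized to use.

```python
from urllib.request import Request, urlopen

TEST_HOST = "dlp-test.example.com"  # hypothetical endpoint; replace with your own


def build_requests(payload, host=TEST_HOST, path="/upload"):
    """Build one POST request per scheme carrying the same synthetic payload."""
    body = payload.encode("utf-8")
    headers = {"Content-Type": "text/plain; charset=utf-8"}
    return [
        Request(f"{scheme}://{host}{path}", data=body, headers=headers, method="POST")
        for scheme in ("http", "https")
    ]


def send(request, timeout=10):
    """Send one request and return the HTTP status code.

    urlopen raises an exception for 4xx/5xx responses and for connection resets,
    so a DLP block may show up as an exception rather than a return value.
    """
    with urlopen(request, timeout=timeout) as response:
        return response.status
```

Comparing the outcome for the `http://` and `https://` requests shows whether encrypted traffic to that destination is actually being inspected.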
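Understanding HIPAA's breach notification requirements underscores why DLP testing is critical. Under the Breach Notification Rule, covered entities must notify affected individuals without unreasonable delay, and no later than 60 calendar days after discovering a breach of unsecured PHI. Breaches affecting 500 or more individuals must also be reported to the Department of Health and Human Services (HHS) within that same window, and a breach affecting more than 500 residents of a single state or jurisdiction requires notice to prominent media outlets serving that area. Smaller breaches must still be reported to HHS, but on an annual basis.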
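The 60-day window is simple date arithmetic. The Python sketch below is illustrative only and is not compliance guidance: the function names are made up, and it assumes the window is counted in calendar days starting from the discovery date.

```python
from datetime import date, timedelta

NOTICE_WINDOW_DAYS = 60        # outer limit for notifying affected individuals
LARGE_BREACH_THRESHOLD = 500   # number of individuals that triggers the HHS report


def latest_individual_notice(discovered: date) -> date:
    """Return the last calendar day of the 60-day window after discovery."""
    return discovered + timedelta(days=NOTICE_WINDOW_DAYS)


def is_large_breach(affected: int) -> bool:
    """True when the breach meets the 500-individual threshold."""
    return affected >= LARGE_BREACH_THRESHOLD
```

For a breach discovered on January 1, 2024, `latest_individual_notice(date(2024, 1, 1))` returns March 1, 2024.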
The cost of a HIPAA breach extends beyond civil monetary penalties, which were capped by statute at $1.5 million per calendar year for violations of an identical provision (a figure that is adjusted annually for inflation). Healthcare organizations also face reputational damage, loss of patient trust, class-action lawsuits, and corrective action plans imposed by the Office for Civil Rights (OCR). Effective DLP, continuously tested and validated, is one of the strongest controls for preventing these incidents.
HIPAA audits (whether conducted by OCR or internal compliance teams) require evidence that technical safeguards are in place and functioning. For DLP, maintain documentation of: