Michael Avdeev · Insights · 5 min read
How to Conduct Data Due Diligence for Healthcare M&A
Consider this composite case—a pattern I see regularly in healthcare M&A.
A PE firm acquires a 40-physician dermatology practice. Standard rollup play—buy, consolidate, optimize, repeat.
Three weeks post-close, the IT integration team finds a network drive nobody mentioned during due diligence. On it: 15 years of patient records from a practice acquired years earlier. Scanned intake forms. Insurance EOBs. Lab results. Hundreds of thousands of patients.
The practice had been “sold” multiple times. The data followed each time. Nobody ever cleaned it up.
The question every acquirer eventually asks: “How do we make sure this doesn’t happen again?”
The answer isn’t complicated. But almost nobody does it.
Healthcare M&A Is Booming—And So Is the Risk
PE healthcare deals hit $115 billion globally in 2024—the second-highest year on record. In specialties like dermatology, ophthalmology, and gastroenterology, PE involvement now exceeds 30%. Dental, behavioral health, urgent care—if it’s a fragmented specialty, someone’s rolling it up.
The problem: healthcare practices are data disasters.
- EMR migrations leave orphaned databases
- Scanned paper records live on random file shares
- Billing exports get saved “just in case”
- Staff turnover means nobody remembers where things are
- Fifteen years of HIPAA exposure sitting in a closet server
When you buy a practice, you buy all of it. Including the PHI nobody told you about.
What Traditional Due Diligence Misses
Standard M&A cyber due diligence looks at:
- Security policies and procedures
- Incident history
- Network architecture
- Vendor relationships
- Insurance coverage
What’s missing? A complete inventory of where PHI actually lives.
Not where the target thinks it lives. Where it actually lives.
I’ve seen practices pass due diligence checklists with flying colors—then turn up 2TB of unencrypted patient data on a shared drive labeled “OLD STUFF DO NOT DELETE.”
The checklist didn’t ask. Nobody looked.
The Healthcare-Specific Data Risks
Healthcare data has unique characteristics that make it particularly dangerous in M&A:
1. PHI Has Long Tails
Unlike credit card numbers (which expire), medical record numbers, diagnoses, and treatment histories stay sensitive indefinitely. That 2009 patient file? Still PHI in 2026, and still a HIPAA problem if it’s exposed.
2. Paper-to-Digital Transitions Created Chaos
When practices went digital, they often scanned everything and dumped it on file shares. The paper got shredded. The digital copies were never organized, classified, or governed.
3. Multiple EMR Migrations
A practice that’s been around 20 years might have used three or four different EMR systems. Each migration left data behind: exports, backups, “just in case” copies.
4. Departed Physicians Take (and Leave) Data
Physicians who left the practice might have copied patient panels to personal drives. Physicians who joined brought data from previous practices. It’s all intermingled.
5. HIPAA Liability Follows the Data
When you acquire a practice, you become responsible for all PHI, including data you didn’t know existed. The HHS Office for Civil Rights (OCR), which enforces HIPAA, doesn’t care that you just bought the place.
How to Actually Conduct Healthcare Data Due Diligence
Here’s the process I recommend for healthcare M&A:
Pre-Close: Discovery Scan
Before the deal closes, run a comprehensive data discovery scan across:
- File shares: Network drives, NAS devices, SharePoint
- Cloud storage: Box, Dropbox, Google Drive, OneDrive
- Legacy systems: Old servers, archived databases, backup tapes
- Endpoints: Workstations, especially long-tenured staff
- Email: PST archives, shared mailboxes
You’re looking for PHI: patient names paired with SSNs, medical record numbers, diagnoses (ICD-10 codes), insurance information, treatment notes.
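As a rough illustration of what "patient identifiers paired with clinical data" means in practice, here is a minimal Python sketch. The patterns are simplified assumptions: the MRN format (the label "MRN" followed by 6 to 10 digits) is hypothetical, since every EMR uses its own scheme, and the ICD-10 pattern only checks the general shape of a code. A real classifier would add validation, name detection, and context checks to cut false positives.

```python
import re

# Illustrative patterns only, not production-grade classifiers.
# SSN shape with the obviously invalid area/group/serial values excluded.
SSN_RE = re.compile(r"\b(?!000|666|9\d\d)\d{3}-(?!00)\d{2}-(?!0000)\d{4}\b")
# ICD-10-CM shape: letter, digit, alphanumeric, optional dot plus 1-4 alphanumerics.
ICD10_RE = re.compile(r"\b[A-Z]\d[0-9A-Z](?:\.[0-9A-Z]{1,4})?\b")
# Hypothetical MRN format: the label "MRN" followed by 6-10 digits.
MRN_RE = re.compile(r"\bMRN[:#\s]*\d{6,10}\b", re.IGNORECASE)


def classify_text(text):
    """Return the non-empty pattern hits found in a block of text."""
    hits = {
        "ssn": SSN_RE.findall(text),
        "icd10": ICD10_RE.findall(text),
        "mrn": MRN_RE.findall(text),
    }
    return {name: found for name, found in hits.items() if found}


def looks_like_phi(text):
    """Flag text that pairs a patient identifier with a clinical code.

    Requiring both is a design assumption: an identifier next to a
    diagnosis is a stronger signal than either one alone.
    """
    hits = classify_text(text)
    has_identifier = "ssn" in hits or "mrn" in hits
    return has_identifier and "icd10" in hits


sample = "Patient: Jane Doe, SSN 123-45-6789, MRN: 00123456, Dx L40.0 psoriasis"
print(classify_text(sample))
print(looks_like_phi(sample))
```

Pairing signals is the important design choice here. A bare three-character code or a nine-digit number shows up in plenty of non-clinical files, so co-occurrence keeps the result list short enough to review.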
What You’ll Find
In typical healthcare data discovery engagements, most PHI lives where you’d expect—the EMR, the billing system. But a significant portion surfaces in “known unknown” locations: old file shares everyone forgot about, archived databases from previous systems.
The real surprises come from completely unexpected places—personal folders, email attachments, temp directories. That category is where deals get repriced—or killed.
Red Flags to Watch For
- Large archive folders with names like “OLD”, “BACKUP”, “DO NOT DELETE”
- PST files over 1GB (email archives full of attachments)
- Scanned document folders from paper-to-digital conversion
- Exports from previous EMR systems
- Personal folders of departed physicians
- Shared drives with wide-open permissions
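Two of the red flags above, suspicious folder names and oversized PST files, can be checked with a simple walk of a mounted share. The sketch below is a minimal example under stated assumptions: the keyword list and the function name `find_red_flags` are illustrative, and permission checks are left out because they depend on the platform (NTFS ACLs, NFS exports, and so on).

```python
import os
import re

# Folder-name keywords taken from the red-flag list; matched on the lowercased
# name with letter boundaries so that "folder" does not match "old".
SUSPECT_NAME_RE = re.compile(
    r"(?<![a-z])(?:old|backups?|archive|do[ _-]?not[ _-]?delete)(?![a-z])"
)
PST_SIZE_LIMIT = 1_000_000_000  # 1 GB threshold from the red-flag list


def find_red_flags(root, pst_limit=PST_SIZE_LIMIT):
    """Walk a directory tree and return (kind, path) tuples for red flags."""
    flags = []
    for dirpath, dirnames, filenames in os.walk(root):
        for name in dirnames:
            if SUSPECT_NAME_RE.search(name.lower()):
                flags.append(("suspect_folder", os.path.join(dirpath, name)))
        for name in filenames:
            if not name.lower().endswith(".pst"):
                continue
            path = os.path.join(dirpath, name)
            try:
                size = os.path.getsize(path)
            except OSError:
                continue  # unreadable file; skip rather than abort the walk
            if size > pst_limit:
                flags.append(("large_pst", path))
    return flags
```

Calling `find_red_flags` on the root of a mounted share returns a list to hand to a reviewer. A name match is a prompt to look inside the folder, not proof that it holds PHI.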
Post-Discovery: Remediation Assessment
Once you know where PHI lives, you can assess:
- Scope: How many records? How many patients?
- Sensitivity: SSNs? Mental health? Substance abuse? HIV status?
- Exposure: Who has access? Is it encrypted?
- Remediation cost: What will it take to clean up?
This becomes a deal term. Either the seller remediates before close, the buyer gets a price reduction, or there’s an escrow holdback.
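One way to turn the four assessment questions into something a deal team can read is a small record per location plus a triage rule. The sketch below is illustrative only: the `Finding` fields, the access threshold of 25 users, and the high/medium/low rules are assumptions made for the example, not an industry standard.

```python
from dataclasses import dataclass, field

# Categories from the sensitivity bullet, treated as "heightened" for triage.
HEIGHTENED = {"ssn", "mental_health", "substance_abuse", "hiv_status"}


@dataclass
class Finding:
    location: str
    records: int
    patients: int
    categories: set = field(default_factory=set)
    encrypted: bool = False
    users_with_access: int = 0


def triage(finding, broad_access_threshold=25):
    """Return 'high', 'medium', or 'low' using illustrative, made-up rules."""
    heightened = bool(finding.categories & HEIGHTENED)
    exposed = (not finding.encrypted) or (
        finding.users_with_access > broad_access_threshold
    )
    if heightened and exposed:
        return "high"
    if heightened or exposed:
        return "medium"
    return "low"


def summarize(findings):
    """Roll findings up into totals for the deal team.

    Patient counts are not summed because the same patient can appear
    in several locations.
    """
    return {
        "locations": len(findings),
        "records": sum(f.records for f in findings),
        "high": [f.location for f in findings if triage(f) == "high"],
    }
```

The summary gives the negotiation a concrete basis: the "high" list is what the seller would be asked to remediate before close or what an escrow holdback would be sized against.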
Why Speed Matters
Healthcare M&A timelines are tight. PE firms move fast. You don’t have six months to deploy a platform and run discovery.
You need answers in days, not quarters.
This is exactly why we built Risk Finder as a containerized scanner. Pull the Docker image, point it at the target’s infrastructure, get results. No agents. No complex deployment. No data leaving the environment.
For healthcare specifically:
- 150+ classifiers including ICD-10 codes, MRNs, NPIs, DEA numbers
- PHI detection across structured and unstructured data
- Optical character recognition (OCR) for scanned intake forms and faxes
- DICOM support for medical imaging metadata
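To show why a classifier is more than a regular expression, consider NPIs. A 10-digit NPI carries a Luhn check digit computed over the number with the prefix 80840, so a detector can discard most random 10-digit strings. The sketch below is a generic illustration of that check and makes no claim about how Risk Finder implements its classifiers.

```python
def luhn_valid(digits: str) -> bool:
    """Standard Luhn check: double every second digit from the right."""
    total = 0
    for i, ch in enumerate(reversed(digits)):
        d = int(ch)
        if i % 2 == 1:
            d *= 2
            if d > 9:
                d -= 9
        total += d
    return total % 10 == 0


def npi_valid(npi: str) -> bool:
    """Validate a 10-digit NPI using the Luhn check with the 80840 prefix."""
    if len(npi) != 10 or not (npi.isascii() and npi.isdigit()):
        return False
    return luhn_valid("80840" + npi)


print(npi_valid("1234567893"))  # commonly cited sample NPI with a valid check digit
print(npi_valid("1234567890"))  # same digits with a wrong check digit
```

A checksum filter like this is cheap to run and removes a large share of false positives before anyone has to open a file.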
The Deal Math
Let’s say you’re acquiring a practice for $15M. Due diligence costs maybe $200K total—legal, financial, operational.
Adding a proper data discovery scan costs a fraction of that. And finding a hidden data liability before close could save you:
- Remediation costs: $500K-$2M for a significant PHI exposure
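- Regulatory fines: HIPAA civil penalties are tiered by culpability and assessed per violation, with statutory amounts from $100 to $50,000 per violation before annual inflation adjustments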
- Deal repricing: 5-10% purchase price reduction is common
- Deal failure: Walking away from a deal that looked good on paper
The ROI isn’t hard to calculate. The cost of not looking is almost always higher than the cost of looking.
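The arithmetic can be sketched in a few lines of Python. The purchase price, repricing range, and remediation range come from the figures above. The scan cost of $25,000 is a hypothetical placeholder for "a fraction" of the $200K diligence budget, not a quoted price.

```python
PURCHASE_PRICE = 15_000_000              # from the example above
REPRICING_RANGE = (0.05, 0.10)           # 5-10% purchase price reduction
REMEDIATION_RANGE = (500_000, 2_000_000)
ASSUMED_SCAN_COST = 25_000               # hypothetical; not a quoted price


def repricing_dollars(price, pct_range):
    """Convert a percentage repricing range into whole dollars."""
    low, high = pct_range
    return round(price * low), round(price * high)


def break_even_probability(scan_cost, avoided_loss):
    """Smallest chance of a hidden liability at which the scan pays for itself."""
    return scan_cost / avoided_loss


low, high = repricing_dollars(PURCHASE_PRICE, REPRICING_RANGE)
print(f"Repricing exposure: ${low:,} to ${high:,}")

p = break_even_probability(ASSUMED_SCAN_COST, REMEDIATION_RANGE[0])
print(f"Break-even probability against low-end remediation: {p:.0%}")
```

On a $15M deal, a 5-10% repricing is $750K to $1.5M. Under the assumed scan cost, the scan pays for itself if there is even a 5% chance of a liability at the low end of the remediation range.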
What Smart Acquirers Do Now
After a surprise like the one in the opening case, smart PE firms change their process. They now run a data discovery scan on every healthcare acquisition before close. It’s a standard line item in due diligence, right next to the financial audit.
What they’ve found since:
- A behavioral health practice with 10 years of unencrypted psychotherapy notes on a shared drive
- A dental group with patient SSNs in an Excel file emailed between 40 locations monthly
- An ophthalmology chain with DICOM files containing patient demographics in image metadata
None of these killed deals. But all of them changed deal terms.
Better to know before you sign than discover after you own it.
Start Before Someone Else Does
If you’re doing healthcare M&A—as a PE firm, health system, or practice acquirer—data due diligence isn’t optional anymore.
The question isn’t whether hidden PHI exists. It’s whether you find it before close, or after.
Scan before someone else does.