DaveWentzel.com            All Things Data

Data Quality

The New York Observer:
I was consulting with a healthcare IT vendor and one of their attorneys told me a story about a woman in labor who showed up at the hospital without her identity card (driver's license).  The hospital receptionist pulled up her data on their admissions system and turned her away because, according to their records, she had already had the baby six months earlier.  Clearly this was a miscarriage (pun intended?) of justice/insurance/ObamaCare.  Turns out someone with some common sense stepped in and admitted her anyway.  I assume baby and mom were none the worse for wear.  Turns out her identity was stolen and apparently some other pregnant woman had a baby using her insurance/identity.  
Bizarre?  Not as bizarre as you may think if you've ever worked in healthcare IT.  Healthcare data quality is notoriously horrendous and, surprisingly, there are often some very good reasons for that.  
My Story Is Almost as Funny/Disturbing
(names, locations, and events have been changed to protect the guilty.  Otherwise, this is a true story...you can't make this stuff up!!!)
I worked on a project a long time ago to remove duplicate patients/medical records in an FDA-regulated medical "device" used at hospitals in a large metropolitan area.  Each hospital/acute care facility/clinic/nursing home/etc had its own "device".  And each "device" had its own little database of patient information.  In the biz we call this the "medical record."  
This large metropolitan area has its own unique socialized medicine system that has evolved from what was once a bunch of independent, mostly religious-affiliated hospitals (Presbyterian and Jewish hospitals, for instance).  The oldest hospital in the system is close to 300 years old.  As you can imagine, after that many years of mergers, acquisitions, and consolidations there were a lot of duplicate patient records in each of these little device databases.  This causes problems, both from clinical and billing perspectives.  If PatientA gets a prescription at Hospital X and then goes to Hospital Y, which has no knowledge of the prescription, dangerous drug interactions may occur.  You can imagine, I'm sure, what billing issues would arise from this as well.  Wouldn't it be nice if we could have a "best-of-breed" billing system that alerted each facility/device database/medical record of updates?  
Anyway, I was tasked with merging patient records.  I'm a firm believer in agile so I wanted to deliver something quickly that met a subset of the requirements so that user-acceptance testing could begin.  There was no magic with this project.  Very simply, using a simple algorithm I was able to merge patient records for "John A Smith" and "John Smith Jr" if they had 11 data points that were similar other than name.   Data points might be SSN, address, DOB. The hard part was merging the records together which, unfortunately, wasn't merely a case of changing MedicalRecordID 1234 to 6789.  (Bad normalization at play).  Basically, pretty simple requirements, pretty simple code.  
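For the curious, here is a minimal sketch of that kind of matching rule. This is not the actual project code. The field names, the `normalize` helper, and the definition of "similar" (equal after stripping case and punctuation) are all assumptions made for illustration; only the idea of counting agreeing non-name data points against a threshold of 11 comes from the story.

```python
# Threshold from the story: 11 non-name data points must agree.
MATCH_THRESHOLD = 11


def normalize(value):
    """Lowercase and strip punctuation/extra whitespace (assumed notion of 'similar')."""
    if value is None:
        return None
    cleaned = "".join(ch for ch in str(value).lower() if ch.isalnum() or ch.isspace())
    return " ".join(cleaned.split()) or None


def matching_points(rec_a, rec_b, ignore=("name",)):
    """Count fields present in both records, other than name, whose normalized values agree."""
    count = 0
    for key in rec_a.keys() & rec_b.keys():
        if key in ignore:
            continue
        a, b = normalize(rec_a[key]), normalize(rec_b[key])
        if a is not None and a == b:
            count += 1
    return count


def is_probable_duplicate(rec_a, rec_b, threshold=MATCH_THRESHOLD):
    """Two records are merge candidates when enough non-name data points agree."""
    return matching_points(rec_a, rec_b) >= threshold


# Hypothetical records with only three non-name fields, so the demo lowers the threshold.
a = {"name": "John A Smith", "ssn": "123-45-6789", "dob": "1960-01-02", "address": "1357 W 98th St"}
b = {"name": "John Smith Jr", "ssn": "123456789", "dob": "1960-01-02", "address": "1357 W 98th Av"}
print(matching_points(a, b))                     # SSN and DOB agree, address does not
print(is_probable_duplicate(a, b, threshold=2))
```

Finding the candidates really is the easy part. As noted above, the actual merge was the painful bit.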
Within a couple of sprints I was done, QA signed off, and the "customer" was thrilled. The rollout went smoothly.  Since we had to merge MILLIONS of medical records the code took a "creepy/crawly" approach...it built some cursors (yes, there are good uses for cursors) and slowly started migrating data, ensuring we had no possibility of deadlocking or performance issues, migrating "active" data first, worrying about "archived" (read: dead people) data later.  We could never do this during downtime (no such thing in the clinical healthcare space).  With smaller data sets the "devices" could all now communicate with each other and share patient records in real time.  Performance was better, data quality was better, we even reclaimed disk space.  
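The "creepy/crawly" idea can be sketched roughly like this. The real thing used database cursors; this illustration swaps in plain Python batching instead, and every name in it (`run_merges`, `apply_merge`, the `archived` flag, the batch size) is hypothetical. It only shows the shape of the approach: small batches, active patients first, a pause between batches so the live system never notices.

```python
import time
from itertools import islice


def ordered_work(merge_pairs):
    """Put active-patient merges ahead of archived ones (stable within each group)."""
    return sorted(merge_pairs, key=lambda pair: pair["archived"])


def batches(items, size):
    """Yield successive lists of at most `size` items."""
    it = iter(items)
    while True:
        chunk = list(islice(it, size))
        if not chunk:
            return
        yield chunk


def run_merges(merge_pairs, apply_merge, batch_size=100, pause_seconds=0.0):
    """Apply merges in small batches, pausing between batches to limit contention."""
    done = 0
    for chunk in batches(ordered_work(merge_pairs), batch_size):
        for pair in chunk:
            # apply_merge is a caller-supplied function that does the real work.
            apply_merge(pair["duplicate_id"], pair["surviving_id"])
            done += 1
        if pause_seconds:
            time.sleep(pause_seconds)
    return done


# Hypothetical work list: one archived merge and two active ones.
pairs = [
    {"duplicate_id": 1, "surviving_id": 10, "archived": True},
    {"duplicate_id": 2, "surviving_id": 20, "archived": False},
    {"duplicate_id": 3, "surviving_id": 30, "archived": False},
]
log = []
total = run_merges(pairs, lambda dup, keep: log.append((dup, keep)), batch_size=2)
print(total, log)
```

Keeping each batch small is what lets this run against a system with no downtime window: no single step holds locks for long, and the job can be stopped between batches.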
So, where's the problem?
I arrived at work one day a month later and was told by a frantic VP that I needed to immediately initiate a rollback of the code.  He was out-of-breath and clearly frazzled, saying my code was faulty, etc etc, all the while gasping for air.  I tried to remain calm, "What's wrong? Is it something we can fix before we take drastic measures?"
"Heroin addicts can't get their methadone!"  
I kid you not, that was the exact statement.  "Come again?  Heroin addicts can't get their what?"  My mind was trying to figure out how my change could affect heroin addicts.  
The Problem
We did lots of testing so I felt confident in my code.  But at times like this I tend to panic too.  When you deal with FDA regulations you learn not to cut corners with QA.  Some of us developers pay off the DBAs with beer; in healthcare you pay off the QA folks (but not in beer).  You want QA to be coherent and you want them finding every bug.  You don't just pass off your code to QA; you make sure QA understands what you did and why.  Pairing and scrum are necessities in regulated industries.  
Pretty soon conference calls were arranged and I learned about the true nature of the problem.  In the good ol' days before my code was in place, "John A Smith" would register at Hospital A and receive his methadone.  His patient record at Hospital A would be updated to reflect that his next methadone treatment was scheduled in x days.  Along with that he was given a subway token to get back home. How thoughtful.  
Ah, but nothing inspires ingenuity like a drug addict trying to get his next fix.  Entire TV shows are based on this simple premise.  So, instead of going home Mr Smith would use his subway token to go to Hospital B where he was also registered, albeit under John Smith Jr and a slightly different address or DOB.  Our Mr Smith would get his allotted methadone, again, and another subway token, which he would use to go to Clinic C.  
Rinse and repeat, always repeat.  
After a month of my code being in place, with its "creepy/crawly" features, the formerly over-medicated patients were now experiencing rapid detox.  Perhaps I should take a step back.  Heroin is a dangerous, addictive opiate.  Methadone is a synthetic opioid that helps wean addicts off heroin if dosed carefully.  But when taken in excess it is just as addictive as heroin and can lead to overdose.  (I hope you learned something about drug culture in this blog post...you can now wow your friends with your new street cred).  
Now, I'm a very sheltered guy and it came as a complete surprise to me that rapidly detoxing drug addicts aren't a good thing to have running around a large metropolis.  Much better to have them over-medicated than under, I suppose.  
No one, from the customer (read government politicians who run this city's socialized medicine), to the BAs, to the QA people, to me, realized that de-duplicating data like this would be a problem...until it was a problem.  Only the hospital staff knew this would be a problem, having experienced these issues for years...but no one bothered to ask those users their opinion on some benign system change like purging duplicate medical records.  
In fact, months later we learned that it was a conscious decision by some of these clinics to ensure that patients were in their systems numerous times.  The system would limit the methadone dosing for patients that clearly needed more.  To override a dosing schedule would've put the prescriber on a "red list".  The easiest way to get a patient a bigger dose is to make a duplicate of the patient.  Once hospital staff had worked with our software for years they knew the idiosyncrasies that would allow them to enter a duplicate patient record by, say, altering an address from "1357 W 98th St" to "1357 W 98th Av".  
In Summary
A reasonable person might give me plaudits for my beautiful code that stopped this fraud...even if it was by accident.  But we're dealing with the irrational.  The politicians were livid because, even though they now had more accurate medical records, and properly-medicated patients, they had a whole bunch of wild heroin addicts that couldn't get their overdose of methadone.  I had the change backed out in 48 hours and we eventually coded a system that eliminated duplicate patients from the devices...unless they were on methadone.  Those dups stayed.  
So, basically, we turned a blind eye to data quality...not to mention proper health care.  
Again, this story is modified to protect the guilty.  I assure you the actual circumstances were just as ridiculous.
True story...

The New York Observer:
Lisa Schifferle, an attorney with the FTC, told me a story about a pregnant woman who after having her identity stolen 
showed up at the hospital in labor and was turned away because according to the hospital's records she had already had a 
baby six months earlier.
That's nothing.  My story is better.  I worked on a project to remove duplicate patient names in an FDA-regulated medical 
device used at many hospitals in NYC...which has its own unique socialized medicine system.  Using heuristics I was able 
to merge patient records for  John A Smith and John Smith Jr if they had 11 data points that were similar other than name. 
 The requirements and code were signed-off by all parties and everyone was happy.  
Until 2 weeks after it was "turned on" at all of the hospitals.  I arrived at work and was told that if I didn't roll back 
the change I would be fired on the spot.  I asked what was wrong.  
"Your system is working too well and our heroin addicts can't get their methadone!"  
Seriously, that was the statement.  Previously John A Smith would register at Hospital A and receive his methadone and a 
subway token to get back home and his patient record would be updated to reflect that his methadone was scheduled for 
refill in x weeks.  Instead of going home Mr Smith would use his subway token to go to Hospital B where he was registered 
as John Smith Jr and would receive his allotted methadone and a subway token.  He would take his token to Hospital C where 
he was registered as John A. Smith Jr ...you get the point.  
Now, one would think I should get plaudits for stopping this fraud...even if it was by accident.  Nope.  The politicians 
were livid because they now had more accurate hospital records but a whole bunch of wild heroin addicts that couldn't get 
their overdose of methadone.  I had the change backed out in 48 hours and we eventually coded a system that eliminated 
duplicate patients from the systems...unless they were on methadone.  
True story...

