FDA’s “Mini-Sentinel” project released a report on the massive amounts of personally-identifiable health data it is collecting about 160 million patients.

According to Alec Gaffney’s summary of the report:

“As of July 2012, the Mini-Sentinel System’s Distributed Database (MSDD) includes information on 160 million individuals, 3.5 billion medication dispensings—more than 45 million per month—and 3.8 billion unique medical encounters.”

“For reference,” noted Gaffney, “the entire population of the United States was just 314 million in 2012, meaning the Mini-Sentinel database contained data on more than half the United States.” (Read “At 160 Million Patients, FDA’s Mini-Sentinel Isn’t so ‘Mini’ Anymore”)

Recall that FDA is seeking even more data about U.S. citizens via analysis of social media (read “Is FDA Seeking NSA Capabilities?”), supposedly to monitor its communications with consumers and healthcare professionals.

As the agency casts an ever-wider data net, however, there’s no telling how big its databases will get or how secure the information will be against hackers. Not only that, but imagine if FDA and NSA combined their databases. The most intimate details of the lives of more than half of all U.S. citizens would be an open book to Big Government.

What kinds of information is the FDA collecting?

“Mini”-Sentinel’s summary report, says Gaffney, “notes that the longitudinal data they have on some patients allows them to track their health over a period of more than 10 years, and half the database for more than one year.”

It is suggested that this data helps FDA track adverse events that only become apparent over a long time. That’s like NSA saying the data it collects helps prevent terrorist attacks. Both agencies, however, have failed to prove that such massive data collection achieves the goals they say it does.

The FDA report (find it here) includes some interesting tables and charts showing outpatient pharmacy dispensings, unique health care visits in the ambulatory or inpatient setting, death records, selected outpatient laboratory test results, age, sex, height, weight, blood pressure, and smoking status data, and unique person identifiers.

Figure 13 shows the number of deaths captured by the system:

I’m not sure how to interpret that peak, which corresponds roughly to the period of the recent “great recession.”

Figure 19 shows the mean systolic blood pressure by sex and age group (N=over 7 million patients):

This is all very interesting and useful public health information. What worries me, however, is that FDA also has patient-identifiable information. I only hope these data are secure (which seems to be an impossible goal these days) and that the NSA doesn’t get its hands on them!