While my son was away at school this spring, I asked him how he was doing. “OK dad,” he said, “but I have this cough the last few days.” I didn’t have to search Google, call our physician, or tweet about it to learn what might be the underlying problem. I only had to use my nose to know it was allergy season. Given my son’s history, I surmised that was the root cause of his problem.
But public health officials cannot depend on their noses to make important decisions. They need actionable, real-time data. How do they get it?
The Centers for Disease Control and Prevention (CDC) offers the most dependable disease surveillance data. The system depends on reports from partners in state, local, and territorial health departments, public health and clinical laboratories, vital statistics offices, healthcare providers, clinics, and emergency departments. I imagine a lot of paperwork and time are involved.
Some time ago, Google proposed that search trends could be used to track diseases such as influenza. It published the results of a study of its data in a white paper: “Detecting influenza epidemics using search engine query data” (find it here). The study found a strong correlation between search data and CDC data, as shown in the chart below:
Now, researchers at the Johns Hopkins Center for Language and Speech Processing have analyzed 2 billion public tweets posted between May 2009 and October 2010 to learn whether it is possible to use Twitter to track important public health trends (see “Analyzing Twitter for Public Health”).
The researchers point out the differences between search and Twitter (or other social media) with regard to the intent of the user. “In web search,” says Mark Dredze (one of the researchers; see a video of his presentation of results here), “the user expresses a need for information. Whereas in social media, people actually say something about themselves.” In that sense, it’s easier to conclude that the Twitter poster actually has the flu, whereas the searcher may or may not.
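To make that distinction concrete, here is a minimal sketch that separates messages in which someone appears to report their own symptoms from messages that merely mention a health term. It assumes a crude keyword-and-pattern rule, which is far simpler than the statistical language models the Johns Hopkins researchers used. The term list, the patterns, and the `classify` function are all hypothetical illustrations, not anything taken from the study.

```python
import re

# Hypothetical health vocabulary; illustrative only, not taken from the study.
HEALTH_TERMS = r"(flu|fever|cough|allergy|allergies|headache)"

# Assumption: a first-person phrase followed later by a health term
# suggests the writer is talking about their own condition.
SELF_REPORT = re.compile(
    r"\b(i have|i've got|i got|i am|i'm|my)\b.*\b" + HEALTH_TERMS + r"\b",
    re.IGNORECASE,
)
ANY_HEALTH = re.compile(r"\b" + HEALTH_TERMS + r"\b", re.IGNORECASE)


def classify(text: str) -> str:
    """Return 'self_report', 'mention', or 'other' for one message."""
    # Normalize curly apostrophes so that "I’m" matches the "i'm" pattern.
    text = text.replace("\u2019", "'")
    if SELF_REPORT.search(text):
        return "self_report"
    if ANY_HEALTH.search(text):
        return "mention"
    return "other"


if __name__ == "__main__":
    samples = [
        "I have the flu and feel awful",
        "flu symptoms and treatment",
        "Nice weather today",
    ]
    for msg in samples:
        print(msg, "->", classify(msg))
```

A rule this simple will misfire often (for example, “my dog has a cough” would count as a self-report), which is one reason real surveillance work relies on more sophisticated models.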
Another advantage of Twitter is that people disclose a lot of information about themselves that can add value to the public health data. This includes information about the drugs they may be taking. That information, of course, is of interest to pharmaceutical companies.
Here are the results from the Johns Hopkins study, which analyzed 1.5 million messages (out of 2 billion total collected) that referred to health matters:
Pharmaceutical companies, and the FDA (see “FDA is Monitoring This Blog and Perhaps You Too!”), are already mining social media to learn what the public is saying about them, their products, and their competitors’ products (see “Are J&J Agents Trolling for Adverse Events on the Internet?”). But I suspect the technology they are using is relatively primitive compared to that used by the Johns Hopkins researchers.
Alex Butler posed a question during yesterday’s #hcsmeu chat: “Have we been concentrating too much on SM as pure communication and not enough on impact of ‘big data’ to revolutionise health care?” This led to a lively discussion on the value of “crowdsourcing” to somehow change healthcare. For more on that topic, see “Data Mining in the Deep, Dark Social Networks of Patients.”
I can see the value of using social media for surveillance, as was done in the studies mentioned above. Such surveillance certainly helps public health officials deal with certain diseases and other health issues (e.g., obesity). But it doesn’t change the fundamental problem of health care, which is the cost burden. To truly “revolutionise” healthcare, IMHO, you have to lower costs and make even rudimentary health care affordable for EVERYONE. But that’s a matter for another post!