Chong: Big Predictions for Big Data Impact on Public Health

Big data projects are transforming health care in ways that could improve patient care and lower costs, government leaders and medical experts told audiences at Stanford this week at the Big Data Roadshow, sponsored by TechAmerica and the Stanford Center for Professional Development.

In fact, 80 percent of public-sector leaders are betting on innovations in real-time data platforms, analytics and mobility to use big data to their advantage, according to SAP.

"We don’t just want big data per se," said Tom Kalil, the keynote speaker and director for policy for the White House Office of Science and Technology Policy. "The administration’s interest in big data is for the data to bring us knowledge that will bring us to action. There are demonstrated productivity increases for firms that are mastering big data and analytics."

Big data holds tremendous promise for critical public health applications, the most frequently cited example being the massive data crunching behind the Human Genome Project, an international effort to sequence and map all human genes.

The global healthcare industry generates about 30 percent of the world’s data, according to BridgeHead Software’s 2011 International Health Care Data Management Survey.

In the health arena, Kalil said big data could help improve the quality of care and lower costs by using new infrastructures for clinical research and drug discoveries that would integrate with major federal initiatives like electronic health records and mobile health applications (m-health).

On education, he pointed to online courses that improve as more students use them, much like the "recommended for you" feature on Amazon.com. Big data also is critical in energy initiatives like the Smart Grid and energy efficiency, and it provides real-time labor market information.

When it comes to fostering big data, Kalil said the federal government’s role would be to invest in R&D projects related to big data technologies and to support efforts to expand the big data workforce with data scientists. He described the key policy issues as privacy, transborder data flows, and global applications. He said the federal government has begun to make more federally controlled data sets available to the public through President Obama’s Open Government Initiative, an executive order that Kalil said should be a catalyst for private-sector innovation using the data.

In March 2012, more than $200 million in grants and solicitations went to R&D for big data projects, such as the DARPA XDATA program, National Science Foundation and National Institutes of Health (NIH) projects, and the Department of Energy’s Institute of Scientific Visualization. This past July, the NIH funded six to eight Big Data Centers of Excellence at $24 million a year. Kalil said the Centers of Excellence have been tasked with developing better tools for data sharing, integration and management, and with training students and researchers to use and develop data science methods. They intend to create a catalog of biomedical data that is findable and citable.

To advance big data, Kalil said he hoped big data software will be open source, and that data philanthropy by private companies will occur in cases where a nonprofit group might own data but lack the resources to contribute the data into data pools. The White House wants to help set the national agenda for big data along with groups representing the private sector and the affected industry, like TechAmerica and universities like Stanford and the University of California (particularly Berkeley and Santa Cruz).

On the state level, Dr. Linette Scott, chief medical officer of the California Department of Health Care Services, highlighted efforts to apply big data to health services. The crown jewel is the $5.4 billion that flowed to California to finance the critical switchover to electronic health records. Scott described the move as an important milestone because patient records previously were locked away in handwritten files, which prevented the data from being used to assess the quality of care and cost effectiveness of health care providers. On efficiency, she touted the fact that the state has built a single system to determine eligibility for both Medi-Cal and the Covered California health insurance exchange. Finally, she added that the launch of the state’s new Geoportal later this year would continue the push for transparency of its data sets.

About 72 percent of office-based doctors now produce electronic medical records, medical industry consultant Dr. Sujata Iyer told the audience. That data can now be aggregated to reveal trends that could not be detected when the data was locked in handwritten patient files. For example, Iyer said big data can enable personalized medicine because the cost of sequencing a person’s genome has plunged. By predicting disease susceptibility, a doctor can tailor a treatment program for an individual in a very granular way. Big data can also advance bioinformatics and systems biology, improve patient safety, and mine clinical documentation.
Iyer said big data also is critical in revealing relationships and dependencies that expand scientific knowledge at a fundamental level. She said the five V’s of big data are Volume (the sheer magnitude of the data), Variety (the diversity of data types), Velocity (the speed at which data is generated and processed), Veracity (the accuracy and trustworthiness of the data) and Value (what can be extracted from it).

Big data can also reveal how medications affect a person’s health. Dr. Russ Biagio Altman, a professor of bioengineering, genetics and medicine at Stanford, presented a case in which big data uncovered a link between two commonly prescribed drugs that, taken together, caused an increase in glucose production and resulted in diabetes. Microsoft search logs of Internet and social media activity helped link the taking of both drugs to the symptoms of increased glucose production. Other speakers echoed the theme, saying people are quick to tweet how they are feeling and to check symptoms on websites like WebMD. By crunching anonymized data from tweets, social media and Internet searches, experts can predict epidemics more quickly.
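The talks did not detail the statistical methods involved, but the core idea behind a search-log drug-interaction signal can be sketched simply: compare how often a symptom term shows up among users who searched for both drugs versus users who searched for only one. The drug names, symptom term and log data below are hypothetical, and real studies use far more rigorous statistics; this is only an illustration of the comparison.

```python
# Rough sketch (hypothetical data): compare how often a symptom term
# appears in the searches of users who looked up BOTH drugs versus
# users who looked up just one. An elevated rate for the combination
# is the kind of signal a drug-interaction study would follow up on.

def symptom_rate(logs, drugs, symptom):
    """Fraction of users whose search terms include all of `drugs`
    that also searched for `symptom`."""
    exposed = [terms for terms in logs.values() if drugs <= terms]
    if not exposed:
        return 0.0
    return sum(symptom in terms for terms in exposed) / len(exposed)

# Hypothetical per-user sets of search terms
logs = {
    "u1": {"drug_a", "drug_b", "high glucose"},
    "u2": {"drug_a", "drug_b", "high glucose"},
    "u3": {"drug_a", "drug_b"},
    "u4": {"drug_a"},
    "u5": {"drug_a"},
    "u6": {"drug_a", "high glucose"},
    "u7": {"drug_b"},
    "u8": {"drug_b"},
}

rate_both = symptom_rate(logs, {"drug_a", "drug_b"}, "high glucose")  # 2/3
rate_a = symptom_rate(logs, {"drug_a"}, "high glucose")               # 0.5
rate_b = symptom_rate(logs, {"drug_b"}, "high glucose")               # 0.4
```

In this toy data, users searching for both drugs mention the symptom more often than users searching for either drug alone, which is the shape of the signal described above.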

Private sector speakers from SAP and Xerox discussed how big data can help improve cancer treatment through tumor analysis that is up to 1,000 times faster. Big data can also bring users information on physician quality and cost effectiveness, information sorely missing in the current health care environment.

Dr. Jennifer Olsen, who works on pandemics for the Skoll Global Health Fund, said big data has enabled more rapid identification of pandemics, cutting detection time from 167 days in 1996 to 23 days in 2009. She said she believes the growing use of big data sources could bring pandemic detection down to a goal of seven days, saving countless lives. She advocated data mining of social media, FitBit activity, SMS messaging, microblogging, email, Internet search, social networking and chat to identify a pandemic early. Even a satellite image of a hospital’s parking lot in Latin America can, along with other data, reveal a growing pandemic, she said.
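Olsen did not describe a specific algorithm, but one baseline way to spot an outbreak signal in any of those message streams is a simple anomaly check: flag a day whose count of symptom-related messages spikes well above its recent trailing average. The data and thresholds below are hypothetical; this is a minimal sketch of the idea, not any organization's actual method.

```python
# Hypothetical sketch: flag an unusual spike in daily counts of
# symptom-related messages (tweets, SMS, searches) using a simple
# trailing-baseline z-score -- one basic way such data streams can
# surface an outbreak earlier than official case reports.

from statistics import mean, stdev

def spike_days(daily_counts, window=7, threshold=3.0):
    """Return indices of days whose count exceeds the trailing
    `window`-day mean by more than `threshold` standard deviations."""
    alerts = []
    for i in range(window, len(daily_counts)):
        baseline = daily_counts[i - window:i]
        mu, sigma = mean(baseline), stdev(baseline)
        if sigma > 0 and (daily_counts[i] - mu) / sigma > threshold:
            alerts.append(i)
    return alerts

# Hypothetical daily counts of symptom mentions: stable, then a surge
counts = [20, 22, 19, 21, 20, 23, 21, 22, 20, 60]
print(spike_days(counts))  # -> [9]: only the final surge is flagged
```

Real surveillance systems layer on seasonality corrections and cross-stream confirmation, but the trailing-baseline comparison is the common starting point.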

Despite the benefits of big data, speakers said barriers remain, such as overly blunt privacy and confidentiality protections on personal data in the aftermath of the leaks by former NSA contractor Edward Snowden. Data quality, pushing data sets into the public domain, and improved metadata standards are also issues.