
What's the Big Idea?

Paul Hartigan, CEO, PharmiWeb Solutions, discusses Big Data. You can’t escape the topic of Big Data at the moment. It’s everywhere – even the pharma sector has not escaped. Every conference and trade magazine seems to have Big Data speakers and articles. Like this one. But my purpose here is simply to explain what we mean by Big Data, and to provide a couple of real-world examples where it’s made a difference.

What’s the Big Idea?
Put simply, Big Data is the name given to huge amounts, or sets, of data that are difficult to collate and analyse using traditional database management techniques. Take the Large Hadron Collider (LHC) at CERN. Every second, 600 million particles collide in the LHC, and millions of sensors collect data from these collisions. Pretty much everything the LHC does is big, and generating data is no exception. In fact, it creates around 30 petabytes of data each year - a petabyte is 10^15 bytes, or a million GB. A petabyte is equivalent to 20 million four-drawer filing cabinets filled with text. It’s big.
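For readers who want to check the unit arithmetic, here is a minimal sketch. It assumes decimal (SI) units, where a gigabyte is 10^9 bytes and a petabyte is 10^15 bytes; the function name is purely illustrative.

```python
# Decimal (SI) storage units, expressed in bytes (an assumption of this sketch).
GIGABYTE = 10**9
PETABYTE = 10**15


def petabytes_to_gigabytes(petabytes):
    """Convert a whole number of petabytes to gigabytes using decimal units."""
    return petabytes * PETABYTE // GIGABYTE


# Roughly 30 PB per year is the LHC figure quoted in the text.
lhc_yearly_gb = petabytes_to_gigabytes(30)

print(f"1 PB = {petabytes_to_gigabytes(1):,} GB")
print(f"30 PB = {lhc_yearly_gb:,} GB")
```

On these assumptions, one petabyte works out to a million gigabytes, and a year of LHC data to about 30 million.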

The LHC team uses a sophisticated grid computing system to analyse this data, to see if any of these collisions have produced any interesting physics. That's another part of the broader definition of Big Data - all the tools needed to store, search, analyse, visualise and share the data. The thousands of physicists around the world working on the LHC data call upon such tools to structure and sift through this mass of data. The CERN team spent years before the LHC went live putting the organisational structure, teams, tools, techniques, culture - and mindset - in place to help achieve this. Making use of Big Data in itself is a big step.  

Now, in the pharma sector, we may not be looking to unlock the secrets of the universe, but curing disease and improving the wellbeing of patients are equally worthy ambitions, involving processes that create huge amounts of data. Think of areas such as R&D, clinical trials, marketing, pharmacoeconomics, and pharmacoepidemiology - and the floods of data that they produce. Some of this data will already be siphoned off and analysed conventionally. But a Big Data mindset means looking at the data we would normally discard, to see what secrets it might reveal.

Take a real example. Washington Hospital Center (DC) was concerned about the number of A&E patients who were being readmitted after treatment. They called in Dr. Eric Horvitz from Microsoft Research, who built a model to collect and interpret the data from the records of 300,000 patients, over a nine-year period, with 25,000 variables in the dataset. The analysis resulted in many useful findings, some perhaps not too surprising, such as a patient stay of over 14 hours being a red flag for readmission. Others were baffling - a mention of the word 'fluid' anywhere on a patient's chart indicated an above-average chance of readmission. These findings helped form the platform for a readmissions manager software application that helps doctors create a readmissions forecast and management plan for the benefit of the individual patient. And that is a Big Idea.

18th September 2013