What can I do with all this data?

Wed Apr 25th 2012

There are times when you have plenty of data available, but you don’t know how to get it to give you the answers that you need. This situation comes up often when you had the foresight to anticipate certain problems before they occurred. The data was gathered “just in case”, so it may or may not be directly relevant to the problem you now want to solve.

The first step is to determine the problem of interest. Identifying more specific questions of interest further refines the process. The next step is to figure out what data is needed to address the problem at hand. Is the existing data sufficient for those needs? If we are lucky, no additional data collection is needed, and we can proceed directly to analysis.
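As a rough illustration of the sufficiency check described above, the sketch below compares the fields a question needs against the fields already collected. The field names are hypothetical, made up purely for this example, and a real check would also look at data quality and coverage, not just whether a column exists.

```python
def missing_fields(required, available):
    """Return the required fields that the existing data does not cover."""
    available_set = set(available)
    return [field for field in required if field not in available_set]


# Hypothetical example: fields the question needs vs. columns already collected.
required_fields = ["customer_id", "purchase_date", "amount", "region"]
existing_columns = ["customer_id", "purchase_date", "amount", "channel"]

gaps = missing_fields(required_fields, existing_columns)
if gaps:
    print("Missing before analysis can start:", gaps)
else:
    print("Existing data covers the stated needs.")
```

An empty result means the analysis can proceed with what is on hand. A non-empty result is the cue to either collect more data or reformulate the question.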

However, if the data is missing some elements of what we would like to have, we may want to step back and try to reformulate the problem of interest before jumping into another data collection phase. Are there similar, but less stringent, questions that can be investigated instead? The data may give you weaker results, but they can still provide an initial assessment of the problem at hand. Imagine if the initial assessment gives you something completely unexpected! In that case, launching straight into a more detailed data collection procedure could have been a waste of time. It would be better to develop questions that ask why the data departs from expectations, and to direct the data collection efforts there.

The last step is data analysis, which, again, can vary in the complexity of the statistics involved. Just as in the scenario where you need to gather new data, assumptions still need to be tested so that an appropriate analysis procedure can be applied.
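One way to picture testing an assumption before choosing a procedure is the minimal sketch below. It assumes the question is simply how to summarize a single numeric variable, and it uses a crude skewness check with an arbitrary threshold of 1.0. Both the function names and the threshold are illustrative assumptions, not a recommended statistical rule.

```python
import statistics


def sample_skewness(values):
    """Moment-based skewness: the mean of the cubed standardized deviations."""
    mean = statistics.fmean(values)
    sd = statistics.pstdev(values)
    if sd == 0:
        return 0.0  # no spread, so treat the data as symmetric
    return sum(((v - mean) / sd) ** 3 for v in values) / len(values)


def choose_summary(values, threshold=1.0):
    """Pick a mean- or median-based summary; the threshold is an arbitrary assumption."""
    if abs(sample_skewness(values)) > threshold:
        return "median", statistics.median(values)
    return "mean", statistics.fmean(values)


print(choose_summary([1, 2, 3, 4, 5]))      # roughly symmetric data
print(choose_summary([1, 1, 1, 1, 100]))    # one extreme value dominates
```

The point is only the order of operations: check the assumption first, then let the result decide which procedure to apply.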


