::: nBlog :::

Predicting the future is hard. Some of the earliest supercomputers, and some of today's most powerful ones, have been used for tasks such as weather forecasting and financial market analysis. Accuracy has certainly improved, but nowhere near as fast as the available storage and processing capacity has grown.

As an experiment, New Scientist magazine recently tried to predict the sales numbers of its future issues from every measurable characteristic: content, timing, colors, correlation with other publications and numerous other parameters, all available and tractable.

Several commercial and academic groups analyzed gigabytes of social media keywords, content correlation with other media, website activity and, naturally, historical sales data, among a number of other factors. Statistical algorithms were tuned especially for this purpose, while some groups used crowdsourcing, that is, a large number of individuals across the Internet each analyzing a small chunk of the overall data set.

Although there are privacy concerns about accumulating credit card data, automatic traffic monitoring via license plates and accurate metering of energy consumption, the New Scientist results were in a way comforting: most forecasts were off by thousands of copies and were little better than random guesses. People's behavior is still a great unknown, and it may contain some chaotic (and perhaps quantum) elements that will keep providing us with new and exciting content in the near future.

So what was the best prediction method (for a while)? Colors. Too much purple is bad. A magazine title in black is good. When a massive amount of data is available, regression analysis, one of the oldest statistical methods, gives us valuable information about primal reactions. Utilizing this does not rob us of our inherent human unpredictability; it simply accumulates knowledge.
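
For the curious, here is roughly what the simplest form of regression analysis looks like in practice. This is only a minimal sketch: the numbers are made up purely for illustration and are not New Scientist data, and the names `fit_line`, `purple_share` and `sales` are my own assumptions. It fits a straight line through (share of purple on the cover, copies sold) using ordinary least squares.

```python
from statistics import mean


def fit_line(xs, ys):
    """Ordinary least squares fit of y = intercept + slope * x."""
    x_bar, y_bar = mean(xs), mean(ys)
    # Sum of squared deviations of x, and of the x*y co-deviations.
    sxx = sum((x - x_bar) ** 2 for x in xs)
    sxy = sum((x - x_bar) * (y - y_bar) for x, y in zip(xs, ys))
    slope = sxy / sxx
    intercept = y_bar - slope * x_bar
    return intercept, slope


# Hypothetical illustration data, NOT real magazine figures:
# fraction of the cover that is purple, and sales in thousands of copies.
purple_share = [0.0, 0.1, 0.2, 0.3, 0.4, 0.5]
sales = [100, 98, 95, 93, 90, 88]

intercept, slope = fit_line(purple_share, sales)
print(f"sales ~ {intercept:.1f} + ({slope:.1f}) * purple_share")
```

With this invented data the slope comes out negative, which is the "too much purple is bad" kind of signal. A real analysis would use many more issues and several predictors at once, and would still need checking against issues it has not yet seen.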

//Pasi
