Predictive analytics is the use of statistical techniques such as predictive modelling, machine learning and data mining on historical data to predict what will happen in the future. Applications can be found in financial services, insurance, healthcare, travel, retail and many other fields. To see how predictive analytics can be used in practice, we take a look at one of the most powerful predictive techniques: Neural Networks.

What is a Neural Network and how does it work?

A Neural Network is a predictive analytics technique used for classification or for prediction of a numerical variable. It is a form of machine learning that is loosely modelled on the way the human brain processes information.

A Neural Network is built[1] as a web of input nodes (the start of the network, where you insert the data), output nodes (which show the result or prediction once the data has passed through the network) and one or more hidden layers between them. The hidden layer is what sets the Neural Network apart from simpler techniques. Every time the Neural Network is ‘fed’ data during training, the algorithm adjusts the ‘weights’ on the connections between the nodes, which may alter the outcome in the output nodes.


Figure 1: A simplified Neural Network
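
As a rough illustration of the structure described above, the sketch below passes one record through a network with two input nodes, three hidden nodes and one output node. All weights, the input record and the choice of a sigmoid function in each node are assumptions made for this example; they do not come from the book cited above.

```python
import math

def sigmoid(x):
    """Squash any number into the range 0..1."""
    return 1.0 / (1.0 + math.exp(-x))

def forward(inputs, hidden_weights, hidden_biases, output_weights, output_bias):
    """Pass one record through a network with a single hidden layer."""
    # Each hidden node takes a weighted sum of all input nodes.
    hidden = [
        sigmoid(sum(w * x for w, x in zip(weights, inputs)) + bias)
        for weights, bias in zip(hidden_weights, hidden_biases)
    ]
    # The output node takes a weighted sum of all hidden nodes.
    output = sigmoid(sum(w * h for w, h in zip(output_weights, hidden)) + output_bias)
    return hidden, output

# Hypothetical weights for 2 input nodes, 3 hidden nodes and 1 output node.
hidden_weights = [[0.5, -0.2], [0.1, 0.8], [-0.7, 0.3]]
hidden_biases = [0.0, 0.1, -0.1]
output_weights = [1.2, -0.4, 0.6]
output_bias = 0.05

record = [1.0, 2.0]  # hypothetical input data
hidden, prediction = forward(record, hidden_weights, hidden_biases, output_weights, output_bias)
print(round(prediction, 3))

# Changing a single weight changes the prediction for the same record.
output_weights[0] = -1.2
_, changed = forward(record, hidden_weights, hidden_biases, output_weights, output_bias)
print(round(changed, 3))
```

Training a network comes down to repeating the last step in a controlled way: the weights are nudged, record after record, in the direction that brings the output closer to the known outcome.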

In what way does a Neural Network differ from regular predictive tools?

A Neural Network differs from, for example, linear regression (also a predictive tool) in that linear regression models are a lot simpler. Consider a neural network with input nodes, hidden layers and a single output node. If you take away the hidden layers, you are left with only input nodes and an output node, and the network predicts the output directly from a weighted sum of the inputs. This is essentially how linear and logistic regression models predict values. The hidden layer lets the network capture nonlinear relationships and interactions between inputs that such models miss, which often makes it more accurate than traditional predictive tools. It ‘learns’ from examples: patterns found in past data are stored in the weights and reused for new predictions.
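
The comparison with regression can be made concrete with a small sketch. Assuming the output node either passes its weighted sum through unchanged or applies a sigmoid function, a network without hidden layers computes the same formula as linear or logistic regression. The coefficients and the record below are hypothetical.

```python
import math

def output_node(inputs, weights, bias, activation):
    """A network without hidden layers: inputs feed the output node directly."""
    weighted_sum = bias + sum(w * x for w, x in zip(weights, inputs))
    return activation(weighted_sum)

def identity(z):
    return z

def sigmoid(z):
    return 1.0 / (1.0 + math.exp(-z))

# Hypothetical coefficients and one hypothetical record.
weights, bias = [0.4, -1.5], 0.2
record = [3.0, 1.0]

# Identity activation: the linear regression formula y = b + w1*x1 + w2*x2.
linear_prediction = output_node(record, weights, bias, identity)

# Sigmoid activation: the logistic regression formula for a probability.
logistic_prediction = output_node(record, weights, bias, sigmoid)

print(linear_prediction, logistic_prediction)
```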

Downsides of a Neural Network

The high accuracy of Neural Networks raises the question of why they aren’t used more often. As expected, Neural Networks have a few downsides. They require more computational power than regular predictive tools, which makes them more expensive. They also need a large amount of training data, which isn’t always available. Finally, Neural Networks have a ‘black box’ nature: you can see the data that goes in and the outcome that comes out, but you can’t easily grasp what happens in between. This makes it difficult for humans to tweak the algorithm and hard to predict what outcome the network will produce in a new scenario.
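
One way to get a feel for the extra computation and data a network needs is to count the numbers it has to estimate. The sketch below assumes a fully connected network in which every node also has a bias term; the layer sizes are arbitrary examples.

```python
def count_parameters(layer_sizes):
    """Number of weights and bias terms in a fully connected network.

    layer_sizes: nodes per layer, e.g. [10, 20, 1] for 10 input nodes,
    one hidden layer of 20 nodes and 1 output node.
    """
    return sum((n_in + 1) * n_out for n_in, n_out in zip(layer_sizes, layer_sizes[1:]))

# No hidden layer: comparable to a regression with 10 predictors.
regression_like = count_parameters([10, 1])

# The same 10 inputs with one or two hidden layers of 20 nodes each.
one_hidden_layer = count_parameters([10, 20, 1])
two_hidden_layers = count_parameters([10, 20, 20, 1])

print(regression_like, one_hidden_layer, two_hidden_layers)
```

Every one of these numbers has to be estimated from the training data, and none of them has a direct interpretation the way a regression coefficient does.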

Example of a practical application in healthcare

A recent study[2] by Stephen Weng, an epidemiologist at the University of Nottingham in the United Kingdom, compared four machine-learning algorithms with doctors following medical guidelines on how well they predict cardiovascular disease in patients. The algorithms were random forest, logistic regression, gradient boosting and neural networks.

The algorithms were trained on data from the electronic medical records of 378,256 patients in the United Kingdom. From this data they built their own internal ‘guidelines’, then used the 2005 records to predict which patients would have their first cardiovascular event over the next 10 years. These predictions were checked against the 2015 records. All of the algorithms scored better than the guidelines that human doctors would use. The best one, the neural network, correctly predicted 7.6% more events than the guideline-based method and raised 1.6% fewer false alarms. Such prediction models could potentially save many lives, as an early prediction can lead to prevention of cardiovascular disease through changes in diet or cholesterol-lowering medication.
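
The two measures used in this comparison, correctly predicted events and false alarms, can be sketched as follows. The ten patients and both forecasts are made up purely for illustration; they are not data from the study, and the function and variable names are hypothetical.

```python
def evaluate(predicted, actual):
    """Count correctly predicted events and false alarms.

    predicted, actual: lists of 0/1 flags per patient (1 = cardiovascular event).
    """
    hits = sum(1 for p, a in zip(predicted, actual) if p == 1 and a == 1)
    false_alarms = sum(1 for p, a in zip(predicted, actual) if p == 1 and a == 0)
    events = sum(actual)
    non_events = len(actual) - events
    return {
        "hits": hits,
        "false_alarms": false_alarms,
        "share_of_events_caught": hits / events,
        "false_alarm_rate": false_alarms / non_events,
    }

# Made-up outcomes for ten patients; NOT data from the study.
actual_2015 =        [1, 0, 0, 1, 0, 1, 0, 0, 1, 0]
guideline_forecast = [1, 1, 0, 0, 0, 1, 0, 1, 0, 0]
model_forecast =     [1, 0, 0, 1, 0, 1, 0, 1, 0, 0]

guideline_score = evaluate(guideline_forecast, actual_2015)
model_score = evaluate(model_forecast, actual_2015)
print(guideline_score)
print(model_score)
```

A method is better on both counts when it catches a larger share of the real events while raising fewer alarms for patients who stay healthy.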

Future possibilities

Predictive analytics has numerous possible applications, such as analytical customer relationship management, the detection of fraud in accounting, project risk management and cross-selling in retail. As computing power keeps getting cheaper, a trend often associated with Moore’s law, predictive analytics will become an increasingly viable tool. It will be interesting to see what its future applications will be.

[1] Shmueli, Patel, Bruce, Data Mining for Business Analytics, Third Edition, John Wiley, 2016




Article by Martijn Seij