• Susantini Behera

An Ensemble Method for Prediction (Bagging)

In this blog, I am going to discuss prediction models, in particular some basic but powerful tree-based prediction methods known as ensemble methods. The term "ensemble" means taking a group of things instead of an individual thing. There are 3 types of ensemble methods:


1. Bagging

2. Random Forests

3. Boosting

Ensemble methods grew out of a weakness of the ordinary decision tree: high variance. If the training data is split at random into 2 halves and a tree is fitted to each half, the two trained models often turn out very different. The example in Figure 1 shows this: the 2 datasets differ, and so do the regression trees trained on them.

Figure 1: Normal Decision Tree

So, a natural way to reduce the variance and increase prediction accuracy is to take many training sets from the population, build a separate prediction model on each training set, and then average the predictions of the individual models to get the final prediction.


If N independent observations each have variance $\sigma^2$, the variance of the mean of these observations is $\sigma^2/N$.
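The variance-of-the-mean rule above can be checked with a small simulation. This is only a sketch using the Python standard library: it assumes the observations are independent draws from a normal distribution with mean 0, and the function name `variance_of_sample_means`, the number of trials and the seed are illustrative choices that do not come from the original material.

```python
import random
import statistics

def variance_of_sample_means(n, sigma, trials=20000, seed=0):
    """Empirical variance of the mean of n independent N(0, sigma^2) draws."""
    rng = random.Random(seed)
    means = []
    for _ in range(trials):
        # One "experiment": draw n observations and record their mean.
        observations = [rng.gauss(0.0, sigma) for _ in range(n)]
        means.append(statistics.fmean(observations))
    # Variance of the recorded means across all experiments.
    return statistics.pvariance(means)

sigma = 2.0  # so sigma^2 = 4
for n in (1, 5, 25):
    empirical = variance_of_sample_means(n, sigma)
    print(f"N={n:2d}  empirical={empirical:.3f}  sigma^2/N={sigma**2 / n:.3f}")
```

The empirical column should land close to the theoretical one, and both shrink as N grows. That shrinkage is the reason averaging many models is attractive.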

Figure 2: Bagging Technique

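Because these methods combine models trained on multiple training sets, they are called ensemble methods. This particular technique, which uses all the variables but multiple training sets, is called Bagging (short for bootstrap aggregating). Averaging the outputs of the individual models reduces the variance of the final prediction.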

In practice, we do not have access to multiple training sets. Instead, we use bootstrapping to create multiple samples from a single training dataset.
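Putting the pieces together, bagging can be sketched as: resample the single training set with replacement, fit one model per resample, and average the predictions. The code below is a minimal illustration under stated assumptions, not a reference implementation. To stay dependency-free it uses a one-split regression tree (a "stump") on one-dimensional toy data instead of a full decision tree, and the helper names `fit_stump`, `predict_stump`, `fit_bagged` and `predict_bagged`, as well as the synthetic data, are hypothetical.

```python
import random

def fit_stump(points):
    """Fit a one-split regression tree to (x, y) pairs by minimising squared error."""
    points = sorted(points)
    best = None
    for i in range(1, len(points)):
        # Only split between two distinct x values (resamples contain duplicates).
        if points[i - 1][0] == points[i][0]:
            continue
        left = [y for _, y in points[:i]]
        right = [y for _, y in points[i:]]
        left_mean = sum(left) / len(left)
        right_mean = sum(right) / len(right)
        sse = (sum((y - left_mean) ** 2 for y in left)
               + sum((y - right_mean) ** 2 for y in right))
        if best is None or sse < best[0]:
            threshold = (points[i - 1][0] + points[i][0]) / 2
            best = (sse, threshold, left_mean, right_mean)
    if best is None:
        # All x values identical: fall back to predicting the overall mean.
        mean = sum(y for _, y in points) / len(points)
        return points[0][0], mean, mean
    return best[1], best[2], best[3]

def predict_stump(stump, x):
    threshold, left_mean, right_mean = stump
    return left_mean if x < threshold else right_mean

def fit_bagged(points, n_models=50, seed=0):
    """Fit one stump per bootstrap resample of the training set."""
    rng = random.Random(seed)
    return [fit_stump(rng.choices(points, k=len(points))) for _ in range(n_models)]

def predict_bagged(stumps, x):
    """Average the predictions of the individual stumps."""
    return sum(predict_stump(s, x) for s in stumps) / len(stumps)

# Toy data (assumed for illustration): y = x plus Gaussian noise.
rng = random.Random(1)
data = []
for _ in range(40):
    x = rng.uniform(0, 10)
    data.append((x, x + rng.gauss(0, 1)))

single = fit_stump(data)
bagged = fit_bagged(data)
print("single stump at x=5:", predict_stump(single, 5.0))
print("bagged stumps at x=5:", predict_bagged(bagged, 5.0))
```

Every model sees a slightly different version of the same data, so their individual quirks tend to cancel out in the average.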

Bootstrapping:

Suppose we have a single training dataset {7, 9, 5, 4, 3}. We draw 3 samples at random, each the same size as the original set, picking numbers from the given set with replacement, so a number can be repeated within a sample.

original: 7 9 5 4 3

sample 1: 9 5 4 3 4

sample 2: 7 9 5 4 7

sample 3: 7 9 9 4 3

By sampling with replacement, we are able to create multiple sample sets out of a single training set.
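The resampling step above can be sketched in a few lines of Python. The function name `bootstrap_samples` and the seed are illustrative assumptions; `random.Random.choices` from the standard library draws with replacement, which is what allows repeated numbers. The actual samples depend on the seed and will generally differ from the ones listed above.

```python
import random

def bootstrap_samples(data, n_samples, seed=None):
    """Return n_samples resamples of data, each drawn with replacement
    and each the same size as the original dataset."""
    rng = random.Random(seed)
    return [rng.choices(data, k=len(data)) for _ in range(n_samples)]

training_set = [7, 9, 5, 4, 3]
for i, sample in enumerate(bootstrap_samples(training_set, 3, seed=42), start=1):
    print(f"sample {i}:", sample)
```

Each printed sample contains only values from the original set, with some values appearing more than once and others left out.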

Brought to you by:

CoE-AI (CET-BBSR): an initiative by CET-BBSR, Tech Mahindra and BPUT to provide solutions to real-world problems through ML and IoT

Source credit: Udemy