Bootstrap Aggregating (Bagging) is a method to improve classification and regression models in terms of stability and
classification accuracy. Bagging also reduces variance and helps to avoid over-fitting. Although this method is usually
applied to decision tree models, it is not limited to any model type. Bagging is a special case of model averaging approach.
Given a standard training set D of size N, we generate L new training sets <math>D_i<math> also of size N by sampling examples
uniformly from D, and with replacement. By sampling with replacement it is likely that some examples will be repeated in
each <math>D_i<math>. This kind of sample is known as a bootstrap sample. The L models are fitted using the above L bootstrap samples
and combined by averaging the output (in case of regression) or voting (in case of classification)
References
Leo Breiman. Bagging predictors. Machine Learning, 24(2):123140, 1996.
See also