Artificial Intelligence and Machine Learning CS3491
Course: Information Technology (CS3491)
Institution: Loyola-ICAM College of Engineering and Technology
UNIT IV ENSEMBLE TECHNIQUES AND UNSUPERVISED LEARNING 9
Combining multiple learners: Model combination schemes, Voting, Ensemble Learning - bagging,
boosting, stacking, Unsupervised learning: K-means, Instance Based Learning: KNN, Gaussian mixture
models and Expectation maximization
Combining Multiple Learners
• When designing a learning machine, we make choices such as the model parameters, the training data, and the input representation. Each choice introduces some variance in performance. For example, in a classification setting we may choose a parametric classifier, and if we use a multilayer perceptron we must also decide on the number of hidden units.
• Each learning algorithm dictates a certain model that comes with a set of assumptions. This inductive
bias leads to error if the assumptions do not hold for the data.
• Different learning algorithms have different accuracies. The learning algorithms can be combined to
attain higher accuracy.
• Data fusion is the process of fusing multiple records representing the same real-world object into a single, consistent, and clean representation.
• Combining different models is done to improve the performance of deep learning models. Building a new model by combination requires less time, data, and computational resources.
The most common method of combining models is averaging the outputs of multiple models, where taking a weighted average can further improve accuracy.
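As a minimal sketch of the weighted-average idea, using made-up class probabilities from three hypothetical models (all numbers and weights below are illustrative, not taken from the course material):

```python
import numpy as np

# Hypothetical predicted class probabilities for one input from three models
# (rows: models, columns: classes). The numbers are made up for illustration.
preds = np.array([
    [0.6, 0.4],   # model 1
    [0.3, 0.7],   # model 2
    [0.5, 0.5],   # model 3
])

# Simple (equal-weight) average of the three models
simple_avg = preds.mean(axis=0)        # [0.467, 0.533]

# Weighted average, giving more weight to the model we trust most
weights = np.array([0.2, 0.5, 0.3])    # assumed weights, chosen to sum to 1
weighted_avg = weights @ preds         # [0.42, 0.58]

print("simple average  :", simple_avg, "-> class", simple_avg.argmax())
print("weighted average:", weighted_avg, "-> class", weighted_avg.argmax())
```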
1. Generating Diverse Learners:
• Different Algorithms: We can use different learning algorithms to train different base-learners.
Different algorithms make different assumptions about the data and lead to different classifiers.
• Different Hyper-parameters: We can use the same learning algorithm but use it with different hyper-
parameters.
• Different Input Representations: Different representations make different characteristics explicit
allowing better identification.
• Different Training Sets: Another possibility is to train different base-learners on different subsets of the training set. (A short code sketch of generating diverse learners follows this list.)
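A minimal sketch of the first two ideas, different algorithms and different hyper-parameters, assuming scikit-learn and its bundled iris dataset purely for illustration:

```python
from sklearn.datasets import load_iris
from sklearn.model_selection import train_test_split
from sklearn.linear_model import LogisticRegression
from sklearn.tree import DecisionTreeClassifier
from sklearn.neighbors import KNeighborsClassifier

X, y = load_iris(return_X_y=True)
X_tr, X_te, y_tr, y_te = train_test_split(X, y, test_size=0.3, random_state=0)

# Diversity through different algorithms and different hyper-parameters
base_learners = [
    LogisticRegression(max_iter=1000),        # linear decision boundaries
    DecisionTreeClassifier(max_depth=3),      # shallow, axis-aligned splits
    DecisionTreeClassifier(max_depth=None),   # same algorithm, different hyper-parameter
    KNeighborsClassifier(n_neighbors=5),      # instance-based learner
]

for clf in base_learners:
    clf.fit(X_tr, y_tr)
    print(type(clf).__name__, "test accuracy:", round(clf.score(X_te, y_te), 3))
```

Because each base-learner makes different assumptions, their errors tend to differ, which is what a combiner can exploit.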
Model Combination Schemes
• Different methods used for generating the final output from multiple base-learners are multiexpert combination and multistage combination.
1. Multiexpert combination
• Multiexpert combination methods have base-learners that work in parallel.
a) Global approach (learner fusion): given an input, all base-learners generate an output and all these outputs are used, as in voting and stacking.
b) Local approach (learner selection): in mixture of experts, there is a gating model, which looks at
the input and chooses one (or very few) of the learners as responsible for generating the output.
2. Multistage combination
• Multistage combination methods use a serial approach: the next base-learner is trained with or tested on only the instances where the previous base-learners are not accurate enough.
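A minimal sketch of a two-stage (cascade) combination, assuming scikit-learn; the 0.9 confidence threshold and the choice of models are illustrative assumptions, not part of the syllabus:

```python
from sklearn.datasets import load_breast_cancer
from sklearn.model_selection import train_test_split
from sklearn.linear_model import LogisticRegression
from sklearn.ensemble import RandomForestClassifier

X, y = load_breast_cancer(return_X_y=True)
X_tr, X_te, y_tr, y_te = train_test_split(X, y, random_state=0)

# Stage 1: a simple learner handles the instances it classifies confidently.
stage1 = LogisticRegression(max_iter=5000).fit(X_tr, y_tr)
hard = stage1.predict_proba(X_tr).max(axis=1) < 0.9   # instances stage 1 is unsure about

# Stage 2: a more complex learner is trained only on those "hard" instances.
stage2 = RandomForestClassifier(random_state=0).fit(X_tr[hard], y_tr[hard])

# Prediction: fall back to stage 2 only where stage 1 is not confident enough.
pred = stage1.predict(X_te)
use_stage2 = stage1.predict_proba(X_te).max(axis=1) < 0.9
if use_stage2.any():
    pred[use_stage2] = stage2.predict(X_te[use_stage2])
print("cascade test accuracy:", (pred == y_te).mean())
```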
• Let's assume that we want to construct a function that maps inputs to outputs from a set of N_train known input-output pairs,
D_train = {(x_i, y_i)}_{i=1}^{N_train},
where x_i ∈ X is a D-dimensional feature vector and y_i ∈ Y is the corresponding output.
Voting
• A voting classifier is a machine learning model that is trained on an ensemble of several models and predicts an output (class) based on the class with the highest likelihood across those models.
• Voting is an ensemble machine learning algorithm.
• For regression, a voting ensemble makes a prediction that is the average of the predictions of multiple other regression models.
Voting Strategies:
Hard Voting – The class that receives the majority of votes is selected as the final prediction. It is commonly used in classification problems. For regression, the analogue is simply predicting the average of the individual predictions.
Soft Voting – A weighted average of the predicted probabilities is used to make the final prediction. It is suitable when classifiers provide probability estimates: for each class, it sums the predicted probabilities and predicts the class with the highest sum. (A small sketch contrasting the two strategies follows this list.)
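A minimal sketch contrasting hard and soft voting on made-up predictions from three classifiers (all numbers are illustrative); note that the two strategies can disagree:

```python
import numpy as np

# Hard voting: predicted class labels from three classifiers for one instance
votes = np.array([1, 0, 1])
hard_vote = np.bincount(votes).argmax()      # class 1 wins, 2 votes to 1

# Soft voting: predicted class probabilities from the same three classifiers
probas = np.array([
    [0.45, 0.55],
    [0.80, 0.20],
    [0.40, 0.60],
])
soft_vote = probas.sum(axis=0).argmax()      # sums: [1.65, 1.35] -> class 0

print("hard voting picks class", hard_vote)  # 1
print("soft voting picks class", soft_vote)  # 0
```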
• In these methods, the first step is to create multiple classification/regression models using some training dataset.
Each base model can be created using different splits of the same training dataset with the same algorithm, using the same dataset with different algorithms, or by any other method.
Fig. 9.1.2 shows the general idea of base-learners combined by a model combiner.
• When combining multiple independent and diverse decisions, each of which is at least more accurate than random guessing, random errors cancel each other out and correct decisions are reinforced. Human ensembles (e.g., committees) are demonstrably better than individuals.
• We can also use a single, arbitrary learning algorithm but manipulate the training data to make it learn multiple models.
Base Models: Individual models that form the ensemble. For example, Support Vector Machines,
Logistic Regression, Decision Trees.
Classifier and Regressor Variants:
Voting Classifier – Combines multiple classifiers for classification tasks.
Voting Regressor – Combines multiple regressors for regression tasks. (A usage sketch follows.)
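A minimal usage sketch assuming scikit-learn's VotingClassifier; the choice of base models and of voting="soft" are illustrative:

```python
from sklearn.datasets import load_iris
from sklearn.model_selection import cross_val_score
from sklearn.ensemble import VotingClassifier
from sklearn.linear_model import LogisticRegression
from sklearn.tree import DecisionTreeClassifier
from sklearn.svm import SVC

X, y = load_iris(return_X_y=True)

ensemble = VotingClassifier(
    estimators=[
        ("lr", LogisticRegression(max_iter=1000)),
        ("dt", DecisionTreeClassifier(max_depth=4)),
        ("svc", SVC(probability=True)),   # probability=True is required for soft voting
    ],
    voting="soft",                        # use voting="hard" for majority voting
)

print("ensemble CV accuracy:", cross_val_score(ensemble, X, y, cv=5).mean())
```

scikit-learn's VotingRegressor is used the same way for regression, averaging the base regressors' predictions.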
Ensemble Learning
Ensemble modeling is the process of running two or more related but different analytical models and
then synthesizing the results into a single score or spread in order to improve the accuracy of predictive
analytics and data mining applications.
• An ensemble of classifiers is a set of classifiers whose individual decisions are combined in some way to classify new examples.
• Ensemble methods combine several decision tree classifiers to produce better predictive performance than a single decision tree classifier.
The main principle behind the ensemble model is that a group of weak learners come together to form a
strong learner, thus increasing the accuracy of the model.
• Why do ensemble methods work?
• They rely on one of two basic observations:
1. Variance reduction: If the training sets are completely independent, it always helps to average an ensemble, because this reduces variance without affecting bias (e.g., bagging) and reduces sensitivity to individual data points. (See the bagging sketch after this list.)
2. Bias reduction: For simple models, the average of models has much greater capacity than a single model. Averaging models can reduce bias substantially by increasing capacity, while variance is controlled by fitting one component at a time (e.g., boosting).
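A minimal sketch of the variance-reduction point, comparing a single decision tree with a bagged ensemble of trees; scikit-learn and its bundled breast-cancer dataset are assumed purely for illustration (BaggingClassifier bags decision trees by default):

```python
from sklearn.datasets import load_breast_cancer
from sklearn.model_selection import cross_val_score
from sklearn.tree import DecisionTreeClassifier
from sklearn.ensemble import BaggingClassifier

X, y = load_breast_cancer(return_X_y=True)

# A single, fully grown decision tree (low bias, high variance)
single_tree = DecisionTreeClassifier(random_state=0)

# Bagging: 50 trees, each trained on a bootstrap sample of the training data
bagged_trees = BaggingClassifier(n_estimators=50, random_state=0)

print("single tree  CV accuracy:", round(cross_val_score(single_tree, X, y, cv=5).mean(), 3))
print("bagged trees CV accuracy:", round(cross_val_score(bagged_trees, X, y, cv=5).mean(), 3))
```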