High flexibility correlates with - correct answer ✔✔high variance
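One way to see why flexibility and variance go together is the standard decomposition of expected test error at a point x_0. The notation below is an assumption, not taken from these notes: \hat{f} is the fitted model and \varepsilon is the irreducible error.

```latex
E\left[\left(y_0 - \hat{f}(x_0)\right)^2\right]
  = \operatorname{Var}\left(\hat{f}(x_0)\right)
  + \left[\operatorname{Bias}\left(\hat{f}(x_0)\right)\right]^2
  + \operatorname{Var}(\varepsilon)
```

More flexible methods generally raise the variance term while lowering the bias term.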
MLR:
When an outlier has been identified, the following approaches are accepted... - correct answer
✔✔remove the outlier only after confirming that it resulted from error
retain the outlier in the analysis and thoroughly document its influence
perform the regression twice: once with outlier and once without
include the observation but comment on its effects
delete the observation from the dataset
create a binary variable to indicate the presence of an outlier
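The "perform the regression twice" approach can be sketched as follows. This is a minimal illustration on made-up data, using a simple one-predictor least squares fit. The helper `fit_line` and all values are hypothetical.

```python
def fit_line(xs, ys):
    """Ordinary least squares for y = b0 + b1 * x; returns (b0, b1)."""
    n = len(xs)
    x_bar = sum(xs) / n
    y_bar = sum(ys) / n
    sxy = sum((x - x_bar) * (y - y_bar) for x, y in zip(xs, ys))
    sxx = sum((x - x_bar) ** 2 for x in xs)
    b1 = sxy / sxx
    b0 = y_bar - b1 * x_bar
    return b0, b1

# Made-up data: y = 2x exactly, except the last point (the suspected outlier).
xs = [1.0, 2.0, 3.0, 4.0, 5.0, 6.0]
ys = [2.0, 4.0, 6.0, 8.0, 10.0, 30.0]
outlier_index = 5

# Fit once with the outlier and once without it.
with_outlier = fit_line(xs, ys)
xs_clean = [x for i, x in enumerate(xs) if i != outlier_index]
ys_clean = [y for i, y in enumerate(ys) if i != outlier_index]
without_outlier = fit_line(xs_clean, ys_clean)

# The change in slope is one way to document the outlier's influence.
slope_change = with_outlier[1] - without_outlier[1]
print("with outlier:   ", with_outlier)
print("without outlier:", without_outlier)
print("slope change:   ", slope_change)
```

Reporting both fits side by side lets the reader judge how much the conclusions depend on the single observation.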
Prediction:
There is no guarantee that any two models will produce the same prediction - correct answer ✔✔T - it is
more likely for them to produce different predictions
Prediction:
It is assumed that the new observation follows the same model as the one used in the sample - correct
answer ✔✔T - if the new observation doesn't follow the same model, we shouldn't use that model to predict it
Prediction:
Is a point prediction more reliable than an interval prediction? - correct answer ✔✔F - neither is more
reliable than the other & there is no easy way to compare the reliability of the two
Prediction:
A wider prediction interval means that the standard error is lower / higher - correct answer ✔✔higher
Prediction:
Which type of interval is more informative? wide/narrow - correct answer ✔✔narrow - it gives us a
better idea of the true value of an observation
Should a prediction interval contain the single point prediction? - correct answer ✔✔Yes - the interval
contains the most likely values with the point prediction being the single most likely point
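The prediction cards above can be tied together with the usual prediction interval for a least squares linear model with normal errors. The symbols are assumptions, not taken from these notes: \mathbf{x}_* is the new observation (including the intercept term), \mathbf{X} the design matrix, s the residual standard error, n the sample size and p the number of predictors.

```latex
\hat{y}_* \;\pm\; t_{\alpha/2,\,n-p-1}\; s\,
\sqrt{1 + \mathbf{x}_*^{\top}\left(\mathbf{X}^{\top}\mathbf{X}\right)^{-1}\mathbf{x}_*},
\qquad \hat{y}_* = \mathbf{x}_*^{\top}\hat{\boldsymbol{\beta}}
```

The interval is centered on the point prediction \hat{y}_*, so it always contains it, and the interval widens as the standard error term grows.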
Recursive binary splitting process:
the predictor and cut point for each split are chosen to minimize the overall impurity of the tree - correct
answer ✔✔T - these impurities can be measured by RSS, gini index, entropy, classification error
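The classification impurity measures named above can be sketched for a single node as follows. This is a minimal illustration. The function names and the toy nodes are hypothetical, and entropy uses the natural log.

```python
import math

def class_proportions(labels):
    """Fraction of observations in each class within one node."""
    n = len(labels)
    return [labels.count(c) / n for c in set(labels)]

def gini(labels):
    """Gini index: sum of p * (1 - p) over classes."""
    return sum(p * (1 - p) for p in class_proportions(labels))

def entropy(labels):
    """Entropy: -sum of p * log(p) over classes."""
    return -sum(p * math.log(p) for p in class_proportions(labels) if p > 0)

def classification_error(labels):
    """Fraction of observations not in the most common class."""
    return 1 - max(class_proportions(labels))

pure_node = ["a", "a", "a", "a"]   # one class only: every measure is 0
mixed_node = ["a", "a", "b", "b"]  # 50/50 split: every measure is at its maximum

for name, node in [("pure", pure_node), ("mixed", mixed_node)]:
    print(name, gini(node), entropy(node), classification_error(node))
```

All three measures are zero for a pure node and largest when the classes are evenly mixed.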
Recursive binary splitting process:
Does each split have to use a different predictor to ensure diversity in that tree? - correct answer ✔✔No
- the same predictor can be used for multiple splits if it continues to provide the best reduction in
impurity
Recursive binary splitting process:
is a top-down approach, starting from the root and expanding downwards - correct answer ✔✔T
Recursive binary splitting process:
works with both quantitative and qualitative predictors - correct answer ✔✔T
Recursive binary splitting process:
Does it stop as soon as a single split has been made? - correct answer ✔✔No - it continues until a
predefined stopping criterion is met (normally a min. node size, max tree depth, or min reduction in
impurity)
Bagging:
For a sufficiently large number of bootstrap samples, out-of-bag error is virtually equivalent to ____ -
correct answer ✔✔leave-one-out cross-validation (LOOCV) error
Bagging:
Out-of-bag error estimation uses only the trees for which the specific observation was not in the
bootstrap sample - correct answer ✔✔T - it does not require each observation to be predicted by all
trees in the ensemble
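Out-of-bag prediction can be sketched as follows. This is a minimal illustration on made-up data, where each "model" is simply the mean of its bootstrap sample, standing in for a fitted tree. All names and values are hypothetical.

```python
import random

random.seed(0)  # fixed seed so the sketch is reproducible

ys = [3.0, 5.0, 4.0, 6.0, 8.0, 7.0, 5.5, 6.5]
n = len(ys)
n_models = 200

# Each bootstrap sample is a list of n indices drawn with replacement.
samples = [[random.randrange(n) for _ in range(n)] for _ in range(n_models)]
# Stand-in for a tree: predict the mean response of the bootstrap sample.
predictions = [sum(ys[i] for i in idx) / n for idx in samples]

def oob_prediction(i):
    """Average only the models whose bootstrap sample did NOT contain observation i."""
    oob = [pred for idx, pred in zip(samples, predictions) if i not in idx]
    return sum(oob) / len(oob) if oob else None

oob_preds = [oob_prediction(i) for i in range(n)]
# Number of models for which each observation was out-of-bag.
oob_counts = [sum(1 for idx in samples if i not in idx) for i in range(n)]

scored = [(y, p) for y, p in zip(ys, oob_preds) if p is not None]
oob_mse = sum((y - p) ** 2 for y, p in scored) / len(scored)
print("models used per observation:", oob_counts)
print("OOB mean squared error:", oob_mse)
```

Each observation is scored by only a subset of the ensemble, so no separate validation set is needed.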
Bagging:
Increasing the number of trees does not lead to overfitting due to the aggregation of predictions -
correct answer ✔✔T - a very high number of bootstrap samples will NOT lead to overfitting in bagged
models
Bagging:
Bagging is useful for improving prediction accuracy in classification settings - correct answer ✔✔T
Bagging:
Each bootstrapped dataset likely contains repeated observations due to sampling with replacement -
correct answer ✔✔T - some original observations appear more than once in a bootstrap sample and others are left out entirely
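The effect of sampling with replacement can be checked with a quick simulation. This is a minimal sketch. The dataset is just the integers 0 to n-1, and the variable names are hypothetical.

```python
import random

random.seed(1)  # fixed seed so the sketch is reproducible

n = 1000
original = list(range(n))

# One bootstrap sample: n draws with replacement from the original data.
bootstrap = [random.choice(original) for _ in range(n)]

n_unique = len(set(bootstrap))  # distinct original observations that were drawn
n_left_out = n - n_unique       # observations that never appear (out-of-bag)

# Probability that a given observation appears at least once in the sample.
expected_unique_fraction = 1 - (1 - 1 / n) ** n

print("unique fraction in this sample:", n_unique / n)
print("expected unique fraction:      ", expected_unique_fraction)
```

For large n the expected fraction approaches 1 - 1/e, roughly 63%, so about a third of the observations are left out of any one bootstrap sample.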
Regression trees:
A smaller tree with fewer splits might lead to lower variance and better interpretation at the cost of a
little bias - correct answer ✔✔T
A decision tree considers all predictors X1-Xp and all the possible values of the cutpoint for each of the
predictors, and then chooses the predictor and cutpoint such that the resulting tree has the _____ residual sum
of squares (RSS) - correct answer ✔✔lowest
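A single step of this search can be sketched as follows. This is a minimal illustration on made-up data. The helpers `rss` and `best_split` are hypothetical names, and the candidate cutpoints are taken as midpoints between consecutive distinct values, which is one common convention.

```python
def rss(values):
    """Residual sum of squares around the node mean (0 for an empty node)."""
    if not values:
        return 0.0
    mean = sum(values) / len(values)
    return sum((v - mean) ** 2 for v in values)

def best_split(rows, ys):
    """Search every predictor and cutpoint; return (rss, predictor_index, cutpoint)."""
    best = None
    n_predictors = len(rows[0])
    for j in range(n_predictors):
        values = sorted(set(row[j] for row in rows))
        # Candidate cutpoints: midpoints between consecutive distinct values.
        for lo, hi in zip(values, values[1:]):
            cut = (lo + hi) / 2
            left = [y for row, y in zip(rows, ys) if row[j] < cut]
            right = [y for row, y in zip(rows, ys) if row[j] >= cut]
            total = rss(left) + rss(right)
            if best is None or total < best[0]:
                best = (total, j, cut)
    return best

# Made-up data: the response jumps when the first predictor passes 3.5;
# the second predictor is unrelated noise.
rows = [(1.0, 5.0), (2.0, 3.0), (3.0, 8.0), (4.0, 1.0), (5.0, 7.0), (6.0, 2.0)]
ys = [10.0, 11.0, 10.5, 20.0, 21.0, 20.5]

split_rss, split_predictor, split_cutpoint = best_split(rows, ys)
print(split_rss, split_predictor, split_cutpoint)
```

Recursive binary splitting repeats this search inside each resulting region until a stopping criterion is met.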
Regression trees:
Estimating the cross-validation error for every possible subtree would be too cumbersome, so cost
complexity pruning instead considers... - correct answer ✔✔a sequence of trees indexed by a
non-negative tuning parameter, alpha
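The role of alpha can be written out as the penalized criterion that cost complexity pruning minimizes. The notation is an assumption, not taken from these notes: |T| is the number of terminal nodes of subtree T, R_m is the region for terminal node m, and \hat{y}_{R_m} is the mean response in that region.

```latex
\sum_{m=1}^{|T|} \; \sum_{i:\, x_i \in R_m} \left( y_i - \hat{y}_{R_m} \right)^2 \;+\; \alpha\,|T|
```

With alpha = 0 the full tree minimizes the criterion; as alpha grows, each extra terminal node costs more, so smaller subtrees are preferred.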