ISYE 6414 - Unit 5 || with 100% Accurate Solutions.
What are three problems that variable selection tries to address? correct answers high dimensionality, multicollinearity, and the prediction vs. explanation tradeoff
high dimensionality correct answers In linear regression, when the number of predicting variables p is large, we might get better predictions by omitting some of the predicting variables.
Models with many predictors have... correct answers low bias, high variance
Models with few predictors have... correct answers high bias but low variance
prediction risk correct answers a measure of the bias-variance tradeoff
How do we estimate prediction risk? correct answers We can use an approach called training risk.
training risk correct answers compute the prediction risk for the observed data: take the sum of squared differences between the fitted values of submodel S and the observed values.
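As a minimal sketch of this definition (not from the course notes), assuming numpy is available and the predictors for submodel S are already selected into X_sub; the helper name training_risk is ours:

```python
import numpy as np

def training_risk(X_sub, y):
    # Fit ordinary least squares on the columns of submodel S, then
    # sum the squared differences between fitted and observed values.
    beta, *_ = np.linalg.lstsq(X_sub, y, rcond=None)
    fitted = X_sub @ beta
    return np.sum((y - fitted) ** 2)
```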
Is training risk biased? why or why not? correct answers Yes, the training risk is a biased estimate of prediction risk because we use the data twice: once for fitting the submodel S and once for estimating the prediction risk. Thus, training risk is biased downward; it underestimates the prediction risk.
The larger the number of variables in the submodel.... correct answers the smaller the training risk, since adding predictors can only decrease the residual sum of squares on the data the model was fit to.
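A small illustrative simulation of this point (our own construction, not course material): the response depends only on the first two predictors, yet the training risk keeps shrinking as more columns are added.

```python
import numpy as np

rng = np.random.default_rng(0)
n, p = 100, 10
X = rng.normal(size=(n, p))
# True model uses only the first two predictors.
y = 1.5 * X[:, 0] - 2.0 * X[:, 1] + rng.normal(size=n)

for k in range(1, p + 1):
    Xk = X[:, :k]                       # nested submodels of growing size
    beta, *_ = np.linalg.lstsq(Xk, y, rcond=None)
    rss = np.sum((y - Xk @ beta) ** 2)
    print(k, round(float(rss), 1))      # RSS can only decrease as k grows
```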
What can we do since the training risk is biased? correct answers We correct for this bias by penalizing the training risk, i.e., by adding a complexity penalty.
Mallow's Cp correct answers The oldest approach to variable selection. It assumes we can estimate the error variance from the full model; however, that is not possible when p is larger than n.
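A sketch of one common "training risk + complexity penalty" form of Mallow's Cp, RSS(S) + 2|S| * sigma^2_hat; the function name and exact form are our assumptions:

```python
import numpy as np

def mallows_cp(X_sub, X_full, y):
    # sigma^2 is estimated from the FULL model, which is why Cp
    # breaks down when p > n (n - p_full would be nonpositive).
    n, p_full = X_full.shape
    beta_full, *_ = np.linalg.lstsq(X_full, y, rcond=None)
    rss_full = np.sum((y - X_full @ beta_full) ** 2)
    sigma2_hat = rss_full / (n - p_full)

    beta_sub, *_ = np.linalg.lstsq(X_sub, y, rcond=None)
    rss_sub = np.sum((y - X_sub @ beta_sub) ** 2)
    return rss_sub + 2 * X_sub.shape[1] * sigma2_hat
```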
Akaike Information Criterion (AIC) correct answers A more general approach. For linear regression under normality it becomes training risk plus a penalty that looks like Mallow's Cp, except the variance is the true variance, not the estimate.
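Since the true variance is unknown in practice, a widely used profile-likelihood form of AIC for a Gaussian linear model is n * log(RSS/n) + 2p (up to an additive constant). A sketch, with our own function name:

```python
import numpy as np

def aic_gaussian(X, y):
    # AIC for a Gaussian linear model, up to an additive constant:
    # n * log(RSS / n) + 2 * p.
    n, p = X.shape
    beta, *_ = np.linalg.lstsq(X, y, rcond=None)
    rss = np.sum((y - X @ beta) ** 2)
    return n * np.log(rss / n) + 2 * p
```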
Leave-One-Out Cross Validation (LOOCV) correct answers A direct measure of predictive power. It is just like Mallow's Cp, except the variance is for submodel S, not the full model. LOOCV penalizes complexity less than Mallow's Cp.
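For linear regression, LOOCV does not actually require refitting the model n times: the standard hat-matrix shortcut gives it in one pass. A sketch (assuming X has full column rank; the function name is ours):

```python
import numpy as np

def loocv(X, y):
    # Leave-one-out CV via the hat-matrix shortcut:
    # mean of (residual_i / (1 - leverage_i))^2, no refitting needed.
    H = X @ np.linalg.inv(X.T @ X) @ X.T   # hat matrix
    residuals = y - H @ y
    leverages = np.diag(H)
    return np.mean((residuals / (1 - leverages)) ** 2)
```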
To correct for complexity for GLM, what can we use? correct answers AIC and BIC
AIC vs BIC correct answers BIC is similar to AIC except that complexity is penalized by a factor of log(n)/2: the BIC penalty is log(n) * p, versus 2p for AIC.
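A sketch mirroring the aic_gaussian function above, with only the per-parameter penalty changed; for n > e^2 (about 7.4) this penalty is heavier, so BIC tends to pick smaller models:

```python
import numpy as np

def bic_gaussian(X, y):
    # Same Gaussian profile-likelihood form as aic_gaussian above,
    # but the per-parameter penalty is log(n) instead of 2.
    n, p = X.shape
    beta, *_ = np.linalg.lstsq(X, y, rcond=None)
    rss = np.sum((y - X @ beta) ** 2)
    return n * np.log(rss / n) + np.log(n) * p
```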
An important aspect in prediction is.... correct answers how it performs in new settings.
We'd like to have prediction with... correct answers low uncertainty for new settings.
If p (the number of predictors) is large, is it feasible to fit all possible submodels? correct answers No; the number of submodels, 2^p, grows exponentially in p.
If p is large, what can we do instead? correct answers We can perform a heuristic search, like stepwise regression.
If p is small,.... correct answers fit all submodels
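A brute-force sketch of fitting all submodels when p is small; the score argument can be any criterion from above (e.g. the aic_gaussian sketch), and the function name is ours:

```python
import numpy as np
from itertools import combinations

def best_subset(X, y, score):
    # Score all 2^p - 1 nonempty submodels and keep the lowest scorer.
    # Only feasible when p is small: the count grows exponentially.
    p = X.shape[1]
    best_score, best_cols = np.inf, None
    for k in range(1, p + 1):
        for cols in combinations(range(p), k):
            s = score(X[:, list(cols)], y)
            if s < best_score:
                best_score, best_cols = s, cols
    return best_cols, best_score
```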
Forward stepwise regression correct answers we start with no predictors (or with a minimum model) and add one predictor at a time (see the sketch after these three definitions).
Backward stepwise regression correct answers we start with all predictors (the full model) and drop one predictor at a time.
Forward-backward stepwise regression correct answers we add and discard one variable at a time, iteratively.
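A greedy forward-selection sketch under the same assumptions as above (score is any lower-is-better criterion such as aic_gaussian; the function name is ours):

```python
import numpy as np

def forward_stepwise(X, y, score):
    # Start from the empty model; at each step add the single predictor
    # that most improves the score, stopping when no addition helps.
    # Greedy: not guaranteed to find the best-scoring submodel.
    p = X.shape[1]
    selected, remaining = [], list(range(p))
    current = np.inf
    while remaining:
        trials = [(score(X[:, selected + [j]], y), j) for j in remaining]
        best_score, best_j = min(trials)
        if best_score >= current:
            break                      # no candidate improves the score
        current = best_score
        selected.append(best_j)
        remaining.remove(best_j)
    return selected, current
```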
Stepwise regression is a greedy algorithm; what does that mean? correct answers It makes the locally best choice at each step, so it is not guaranteed to find the model with the best overall score.
Forward stepwise tends to select... correct answers smaller models
Which is preferred, forward or backward? correct answers Forward, because it tends to select smaller models, whereas backward starts from the full model.
Do the three stepwise approaches select the same model? correct answers No, especially when p is large.
Which is more computationally expensive, forward or backward? correct answers Backward
When is Mallow's Cp useful? correct answers When there are no control variables
Penalized or regularized regression correct answers when we perform variable selection and estimation simultaneously.
If we add the bias squared and the variance, correct answers we get Mean Squared Error (MSE)
Introducing some bias yields a decrease in.... correct answers MSE
The bigger the lambda,..... correct answers the bigger the penalty for model complexity.
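To illustrate the role of lambda, here is a ridge regression sketch (our choice of example): ridge shrinks coefficients toward zero, trading a little bias for less variance, and larger lambda means a larger complexity penalty. Note that ridge only shrinks; the L1 (lasso) version of the penalty additionally sets coefficients exactly to zero, which is what performs selection and estimation simultaneously.

```python
import numpy as np

def ridge(X, y, lam):
    # Penalized least squares with penalty lam * ||beta||^2.
    # Closed form: beta = (X'X + lam * I)^(-1) X'y.
    # Bigger lam -> more shrinkage -> more bias, less variance.
    p = X.shape[1]
    return np.linalg.solve(X.T @ X + lam * np.eye(p), X.T @ y)
```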