ISYE 6501 Lecture Notes ISYE 6501 Midterm 2 with complete solution

  • April 30, 2022
  • 29
  • 2021/2022
  • Exam (elaborations)
  • Questions & answers
  • isye 6501
Yongsam
# Week 5 Notes Variable Selection
what do we do with a lot of factors in our models?
variable selection helps us choose the best factors for our models
variable selection can work for any factor-based model - regression / classification
why do we not want a lot of factors in our models?
-overfitting: when the number of factors is close to or larger than the number of data points, our model will overfit
-overfitting: the model captures the random effects in our data instead of the real effects
too many factors is the same idea - with few data points we will model too much of the random effects
overfitting can cause bad estimates
with too many factors our model will be influenced too much by the random effects in the data
with few data points our model can even fit unrelated variables!
-simplicity: simple models are easier to interpret
-collecting data can be expensive
-with fewer factors, less data is required to use the model in production
-fewer factors - less chance of including a factor that is meaningless
-easier to explain to others
-we want to know the why - hard to do with too many factors
-need to clearly communicate what your model is doing
fewer factors is very beneficial!
building simpler models with fewer factors helps avoid:
-overfitting
-difficulty of interpretation
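
The overfitting point above can be seen in a small experiment. This is a sketch on made-up data, assuming NumPy is available: one factor really drives the response, the other 18 are pure noise, and there are only 20 training points. All names (`make_data`, `r_squared`, `results`) are illustrative, not from the course.

```python
import numpy as np

rng = np.random.default_rng(0)

def r_squared(X, y, coef):
    # Share of the variance in y explained by the fitted model.
    resid = y - X @ coef
    return 1.0 - (resid @ resid) / np.sum((y - y.mean()) ** 2)

def make_data(n, n_noise):
    # One real factor drives y; the noise factors are unrelated to y.
    real = rng.normal(size=n)
    noise_factors = rng.normal(size=(n, n_noise))
    y = 2.0 * real + rng.normal(size=n)
    X = np.column_stack([np.ones(n), real, noise_factors])
    return X, y

# 1 intercept + 1 real factor + 18 noise factors = 20 columns for 20 points.
X_train, y_train = make_data(20, 18)
X_test, y_test = make_data(200, 18)

results = {}
for label, n_cols in [("real factor only", 2), ("all 19 factors", 20)]:
    coef, *_ = np.linalg.lstsq(X_train[:, :n_cols], y_train, rcond=None)
    results[label] = (
        r_squared(X_train[:, :n_cols], y_train, coef),
        r_squared(X_test[:, :n_cols], y_test, coef),
    )
    print(f"{label}: train R^2 = {results[label][0]:.3f}, "
          f"test R^2 = {results[label][1]:.3f}")
```

With as many coefficients as data points, the fit passes through every training point, so the training R^2 is essentially 1 even though 18 of the factors are noise. The test R^2 is the number to compare between the two models.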
# Week 5 Notes Variable Selection Models
models can automate the variable selection process
all of them can be applied to all types of models
two types:
-step-by-step building a model
forward selection: start with a model that has no factors
-step by step add variables and keep a variable if there is model improvement
-we can limit the model with thresholds (e.g. how good a factor must be to stay, or a maximum number of factors)
-after the full model is built up we can go back and remove any variables that turn out not to be important
-we can judge factors by p value (.15 for exploration or .05 for final model)
backward elimination: start with a model with all factors
-step by step remove variables that are 'bad' based on p value
-continue to do this until all variables included are 'good' variables or we have reached a factor number criterion
-factors can be judged by p value (.15 for exploration and .05 for final model)
stepwise regression: combination of forward selection and backward elimination
-start with all or no variables
-at each step add or remove a factor based on some p value criteria
-the model will re-evaluate older factors based on the new factors we add to the model
-we can use other metrics (AIC, BIC, R^2) to measure 'good' variables in any step by step method
step-by-step = greedy algorithm, does the one step that is the best without taking future options into account
-these are the 'classical' methods
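
The greedy idea behind forward selection can be sketched in a few lines. The notes judge factors by p value; this sketch instead uses AIC, one of the other metrics mentioned above, so that only NumPy is needed. The function names and the synthetic data are assumptions made for illustration.

```python
import numpy as np

def aic(X, y):
    # AIC of a least-squares fit (Gaussian errors, constant terms dropped).
    n = len(y)
    coef, *_ = np.linalg.lstsq(X, y, rcond=None)
    sse = np.sum((y - X @ coef) ** 2)
    return n * np.log(sse / n) + 2 * X.shape[1]

def forward_selection(factors, y):
    # Start with no factors; at each step add the single factor that lowers
    # AIC the most, and stop as soon as no remaining factor improves it.
    intercept = np.ones((len(y), 1))
    selected = []
    remaining = list(range(factors.shape[1]))
    best_score = aic(intercept, y)
    while remaining:
        scores = {
            j: aic(np.hstack([intercept, factors[:, selected + [j]]]), y)
            for j in remaining
        }
        best_j = min(scores, key=scores.get)
        if scores[best_j] >= best_score:
            break  # greedy stop: the best single addition does not help
        selected.append(best_j)
        remaining.remove(best_j)
        best_score = scores[best_j]
    return selected

# Made-up data: only factors 0 and 3 actually affect y.
rng = np.random.default_rng(0)
factors = rng.normal(size=(100, 6))
y = 3.0 * factors[:, 0] - 2.0 * factors[:, 3] + rng.normal(scale=0.1, size=100)

selected = forward_selection(factors, y)
print("selected factors, in the order they were added:", selected)
```

Each step only looks at the best single addition, which is what makes the method greedy: it never revisits earlier choices or considers pairs of factors together.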
newer methods based on optimization models that look at all possible options at the same time
-LASSO: add a constraint to the standard regression equation to keep the coefficients from getting large
-sum of the absolute values of the coefficients: sumof(|ai|) <= t
-regression has a budget t to use on coefficients
-factors that are not important will be dragged down to 0
-constraining any variables means we need to scale the data beforehand!
-how do we pick t?
-consider both the number of variables and the quality of the model
-try LASSO with different values of t and choose the best performance
-Elastic Net: combination of LASSO and RIDGE regression
-constrain a combination of the sum of the absolute values of the coefficients and the sum of the squared coefficients
-lambda*sumof(|ai|) + (1-lambda)*sumof(ai^2) <= t
-need to scale the data
-without the absolute value term we have RIDGE regression: sumof(ai^2) <= t
-these are global approaches to variable selection
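
Written out in full, the three constrained problems look roughly as follows. The notation is an assumption made here, not taken from the notes: n data points, m factors, x_ij the value of factor j for data point i, a_0 the intercept, and lambda between 0 and 1 as the weight that mixes the two terms in elastic net.

```latex
\begin{aligned}
\text{all three minimize} \quad
  & \sum_{i=1}^{n} \Big( y_i - \big( a_0 + \sum_{j=1}^{m} a_j x_{ij} \big) \Big)^2 \\
\text{LASSO, subject to} \quad
  & \sum_{j=1}^{m} |a_j| \le t \\
\text{ridge, subject to} \quad
  & \sum_{j=1}^{m} a_j^2 \le t \\
\text{elastic net, subject to} \quad
  & \lambda \sum_{j=1}^{m} |a_j| + (1 - \lambda) \sum_{j=1}^{m} a_j^2 \le t
\end{aligned}
```

Setting lambda to 1 gives the LASSO constraint and setting it to 0 gives the ridge constraint.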
what is the key difference between stepwise and LASSO regression?
-LASSO has a regularization term and requires the data to be scaled beforehand
-in regression contexts the data needs to be scaled before using LASSO
-without scaling, the size constraint will shrink the wrong coefficients because the magnitude of each factor affects the size of its coefficient estimate!
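
A small sketch of why scaling matters. It uses the penalty form of LASSO (a penalty `alpha` on sumof(|ai|) added to the squared error) rather than the budget form with t, because the penalty form is easy to solve with coordinate descent; the two forms describe the same family of solutions. The data, the factor units, and every name in the code are made up for illustration.

```python
import numpy as np

def soft_threshold(value, alpha):
    # Shrink value toward 0 by alpha; anything within alpha of 0 becomes 0.
    return np.sign(value) * max(abs(value) - alpha, 0.0)

def lasso(X, y, alpha, n_iter=200):
    # Coordinate descent for (1/(2n)) * squared error + alpha * sumof(|ai|).
    # X and y are centered first, so no intercept term is needed.
    n, p = X.shape
    X = X - X.mean(axis=0)
    y = y - y.mean()
    coef = np.zeros(p)
    for _ in range(n_iter):
        for j in range(p):
            # Residual with factor j's current contribution added back in.
            partial_resid = y - X @ coef + X[:, j] * coef[j]
            rho = X[:, j] @ partial_resid / n
            coef[j] = soft_threshold(rho, alpha) / (X[:, j] @ X[:, j] / n)
    return coef

rng = np.random.default_rng(1)
z = rng.normal(size=(200, 3))
# Factors 0 and 1 matter equally; factor 2 is unrelated to y.
y = 2.0 * z[:, 0] + 2.0 * z[:, 1] + rng.normal(size=200)

# Same information, different units: factor 0 has tiny values, factor 1 huge ones.
X_raw = z * np.array([0.001, 1000.0, 1.0])
X_scaled = (X_raw - X_raw.mean(axis=0)) / X_raw.std(axis=0)

coef_raw = lasso(X_raw, y, alpha=0.5)
coef_scaled = lasso(X_scaled, y, alpha=0.5)
print("unscaled:", coef_raw)
print("scaled:  ", coef_scaled)
```

In the unscaled run, factor 0 would need a coefficient near 2000 to have any effect, so the penalty pushes it to 0 even though it matters as much as factor 1. After scaling, both real factors should keep nonzero coefficients and the unrelated factor should be the one that is dropped.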
# Week 5 Notes Variable Selection
-greedy variable selection - stepwise
-global optimization - LASSO, ridge, Elastic Net
how do we choose between these methods?
stepwise methods: good for exploration and quick analysis
-stepwise is the most common
-can give set of variables that fit to random effects
-they might not generalize as well to new data
global optimization: slower but better for prediction
-LASSO, Elastic Net
