The variance of a statistical learning method increases as the method's flexibility increases - correct
answer ✔✔True
The bias of a statistical learning method increases as the method's flexibility increases - correct answer ✔✔False
When inference is the goal, there are clear advantages to using a lasso method vs. a bagging method -
correct answer ✔✔True
The accuracy of the prediction of y depends on both the reducible error and the irreducible error -
correct answer ✔✔True
For K nearest neighbors: as K increases, flexibility decreases, bias increases, variance decreases, and the
method produces a decision boundary close to linear - correct answer ✔✔True
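The K nearest neighbors card can be checked on a toy problem. The sketch below is a minimal, hand-rolled classifier on made-up 1-D data (the data and the names `knn_predict` and `training_accuracy` are assumptions for illustration, not from the notes). With K = 1 the fit is flexible enough to reproduce every training label; with K equal to the sample size it predicts the majority class everywhere, which is the least flexible case.

```python
from collections import Counter

# Hypothetical 1-D training set with two classes, made up for illustration.
train_x = [0.5, 1.0, 1.5, 2.0, 2.5, 3.0, 3.5, 4.0, 4.5]
train_y = ["a", "a", "b", "a", "a", "b", "b", "a", "b"]


def knn_predict(x0, k):
    """Classify x0 by majority vote among its k nearest training points."""
    nearest = sorted(zip(train_x, train_y), key=lambda pair: abs(pair[0] - x0))[:k]
    votes = Counter(label for _, label in nearest)
    return votes.most_common(1)[0][0]


def training_accuracy(k):
    """Share of training points whose own label is reproduced by the vote."""
    hits = sum(knn_predict(x, k) == y for x, y in zip(train_x, train_y))
    return hits / len(train_x)


# Odd values of K avoid tied votes with two classes.
accuracy_by_k = {k: training_accuracy(k) for k in (1, 3, 9)}
predictions_k9 = [knn_predict(x, 9) for x in train_x]

print(accuracy_by_k)
print(predictions_k9)
```

Perfect training accuracy at K = 1 is a sign of low bias and high variance, not of a better model; test error is what matters when choosing K.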
RSE - correct answer ✔✔Estimate of the standard deviation of the error; the average amount that the
response will deviate from the true regression line
Variance increases monotonically as flexibility increases - correct answer ✔✔True
Logistic Regression is parametric, but regression trees and KNN are not - correct answer ✔✔True
RSE info - correct answer ✔✔Considered a measure of the lack of fit of the model to the data; we want it
to be small; it is measured in the units of y, so it is not always clear what constitutes a good RSE
R-squared statistic - correct answer ✔✔Provides an alternative measure of fit, takes form of a
proportion, proportion of variance explained, always takes on values between 0 and 1, independent of
scale of y
R squared = - correct answer ✔✔(TSS - RSS)/TSS = 1 - RSS/TSS (written SSR/TSS when SSR denotes the regression sum of squares, TSS - RSS)
TSS measures - correct answer ✔✔The total variance in the response Y, the amount of variability
inherent in the response before regression is performed
RSS measures - correct answer ✔✔The amount of variability that is left unexplained after performing the
regression
R squared measures - correct answer ✔✔The proportion of variability in Y that can be explained by using
X, we want it to be close to 1
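The RSE, TSS, RSS and R squared cards above can be tied together in a few lines. This is a minimal sketch for simple linear regression using only the standard library; the data values are made up for illustration, and the RSE uses n - 2 degrees of freedom because there is one predictor plus an intercept.

```python
import math

# Hypothetical data, roughly y = 2x plus small noise.
x = [1, 2, 3, 4, 5, 6, 7, 8]
y = [2.1, 3.9, 6.2, 7.8, 10.1, 12.2, 13.8, 16.1]
n = len(x)

# Least squares estimates for the slope (b1) and intercept (b0).
x_bar = sum(x) / n
y_bar = sum(y) / n
s_xx = sum((xi - x_bar) ** 2 for xi in x)
s_xy = sum((xi - x_bar) * (yi - y_bar) for xi, yi in zip(x, y))
b1 = s_xy / s_xx
b0 = y_bar - b1 * x_bar

fitted = [b0 + b1 * xi for xi in x]

# TSS: variability in y before regression; RSS: variability left unexplained.
tss = sum((yi - y_bar) ** 2 for yi in y)
rss = sum((yi - fi) ** 2 for yi, fi in zip(y, fitted))

r_squared = 1 - rss / tss          # proportion of variance explained
rse = math.sqrt(rss / (n - 2))     # in the units of y

print(f"TSS={tss:.3f} RSS={rss:.3f} R^2={r_squared:.4f} RSE={rse:.3f}")
```

R squared is unit-free, while the RSE changes if y is rescaled, which is why the two are reported together.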
Residuals (e1, e2, ...) must sum to zero - correct answer ✔✔True (for a least squares fit that includes an intercept)
F stat = - correct answer ✔✔T stat squared (when testing a single coefficient, e.g. in simple linear regression)
To test relationship between response and predictors, we check if the betas equal zero by - correct
answer ✔✔Computing F stat = ((TSS - RSS)/p) / (RSS/(n - p - 1))
When there is no relationship between response and predictors, we expect F-stat to be - correct answer
✔✔Close to 1
If there is a relationship between response and predictors, we expect F stat to be - correct answer
✔✔Greater than 1, and the alternative hypothesis to be true
How large does the F stat need to be before we reject H0? - correct answer ✔✔It depends on n and p;
when n is large, an F stat that is just a little larger than 1 might still provide evidence against H0; a larger
F stat is needed to reject H0 if n is small
When does F stat work? - correct answer ✔✔When p is relatively small, and small compared to n; if p is
greater than n, then there are more betas to estimate than observations to estimate them from; in this case
we cannot use least squares to fit the model, so the F stat cannot be used
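The F stat cards can be checked numerically. The sketch below computes F = ((TSS - RSS)/p) / (RSS/(n - p - 1)) for a simple linear regression (p = 1) on made-up data and compares it with the squared t statistic of the slope; the two agree in this single-predictor case. Turning F into a p-value needs the F distribution with p and n - p - 1 degrees of freedom, which is left out here.

```python
import math

# Hypothetical noisy data with an upward trend.
x = [1, 2, 3, 4, 5, 6, 7, 8]
y = [1.2, 2.9, 2.7, 5.1, 4.4, 6.8, 6.1, 8.5]
n, p = len(x), 1

x_bar = sum(x) / n
y_bar = sum(y) / n
s_xx = sum((xi - x_bar) ** 2 for xi in x)
s_xy = sum((xi - x_bar) * (yi - y_bar) for xi, yi in zip(x, y))
b1 = s_xy / s_xx
b0 = y_bar - b1 * x_bar

tss = sum((yi - y_bar) ** 2 for yi in y)
rss = sum((yi - (b0 + b1 * xi)) ** 2 for xi, yi in zip(x, y))

# F statistic for H0: all slope coefficients are zero.
f_stat = ((tss - rss) / p) / (rss / (n - p - 1))

# t statistic for the slope: estimate divided by its standard error.
rse = math.sqrt(rss / (n - p - 1))
se_b1 = rse / math.sqrt(s_xx)
t_stat = b1 / se_b1

print(f"F={f_stat:.3f} t^2={t_stat ** 2:.3f}")
```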
R squared in MLR - correct answer ✔✔The square of the correlation between the response and the
fitted linear model; an r squared close to 1 indicates that the model explains a large portion of the
variance in the response variable
Prediction intervals are always wider than confidence intervals because they incorporate both the
irreducible error and the reducible error - correct answer ✔✔True
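To see why the prediction interval is wider, compare the two standard errors in simple linear regression. Both intervals use the same t quantile, so comparing standard errors is enough. This is a sketch on made-up data; the function names `se_mean_response` and `se_prediction` are hypothetical. The prediction standard error carries an extra RSE squared term under the square root, which is the estimate of the irreducible error variance.

```python
import math

# Hypothetical noisy data with an upward trend.
x = [1, 2, 3, 4, 5, 6, 7, 8]
y = [1.2, 2.9, 2.7, 5.1, 4.4, 6.8, 6.1, 8.5]
n = len(x)

x_bar = sum(x) / n
y_bar = sum(y) / n
s_xx = sum((xi - x_bar) ** 2 for xi in x)
s_xy = sum((xi - x_bar) * (yi - y_bar) for xi, yi in zip(x, y))
b1 = s_xy / s_xx
b0 = y_bar - b1 * x_bar
rss = sum((yi - (b0 + b1 * xi)) ** 2 for xi, yi in zip(x, y))
rse = math.sqrt(rss / (n - 2))


def se_mean_response(x0):
    """Standard error of the fitted mean at x0 (drives the confidence interval)."""
    return rse * math.sqrt(1 / n + (x0 - x_bar) ** 2 / s_xx)


def se_prediction(x0):
    """Standard error for a new observation at x0 (drives the prediction interval)."""
    return rse * math.sqrt(1 + 1 / n + (x0 - x_bar) ** 2 / s_xx)


for x0 in (0, 4.5, 10):
    print(x0, round(se_mean_response(x0), 3), round(se_prediction(x0), 3))
```

Both standard errors grow as x0 moves away from the mean of x, but the prediction one never drops below the RSE.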
The least squares line always passes through (xbar, ybar) - correct answer ✔✔True
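Two of the True/False cards (the fitted line passes through (xbar, ybar), and the residuals sum to zero) follow from the least squares intercept formula b0 = ybar - b1 * xbar. A quick numerical check on made-up data, assuming the model includes an intercept:

```python
# Hypothetical data for illustration.
x = [1, 2, 3, 4, 5, 6, 7, 8]
y = [1.2, 2.9, 2.7, 5.1, 4.4, 6.8, 6.1, 8.5]
n = len(x)

x_bar = sum(x) / n
y_bar = sum(y) / n
b1 = (sum((xi - x_bar) * (yi - y_bar) for xi, yi in zip(x, y))
      / sum((xi - x_bar) ** 2 for xi in x))
b0 = y_bar - b1 * x_bar

# Value of the fitted line at x_bar; should equal y_bar up to rounding.
fitted_at_mean = b0 + b1 * x_bar

# Residuals e_i = y_i - fitted value; their sum should be zero up to rounding.
residuals = [yi - (b0 + b1 * xi) for xi, yi in zip(x, y)]
residual_sum = sum(residuals)

print(fitted_at_mean, y_bar, residual_sum)
```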
Additive assumption - correct answer ✔✔The effect of changes in a predictor Xj on the response Y is
independent of the values of the other predictors
Linear Assumption - correct answer ✔✔The change in the response Y due to a one unit change in Xj is
constant regardless of the value of Xj
It is clear the true relationship isn't additive if - correct answer ✔✔The p-value for the interaction term is
low, giving strong evidence for the alternative hypothesis
Hierarchical Principle - correct answer ✔✔If we include an interaction in a model, we should also include
the main effects, even if the p-values associated with their coefficients are not significant
Problems that may occur when fitting a linear regression model - correct answer ✔✔Non linearity of
data
Correlation of error terms
Non constant variance of error terms
Outliers
High leverage points
Collinearity
Heteroscedasticity - correct answer ✔✔Presence of a funnel shape in the residual plot; the variance of the
error terms may increase with the value of the response; a solution is to transform the response Y using a
concave function such as log Y or sqrt Y
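The effect of a concave transformation can be sketched on made-up data. The example assumes a response with exponential growth in x and multiplicative noise (an assumption of this sketch, chosen so that log Y is linear in x with constant spread). It compares the average absolute residual in the upper half of the x range with the lower half: a ratio well above 1 is the funnel shape, a ratio near 1 means roughly constant variance. The helper names `residuals` and `spread_ratio` are hypothetical.

```python
import math

# Hypothetical data: exponential trend with noise that scales with the response.
x = list(range(1, 41))
y = [math.exp(0.05 * xi) * (1.6 if i % 2 == 0 else 0.4) for i, xi in enumerate(x)]


def residuals(u, v):
    """Residuals from a least squares line of v on u."""
    n = len(u)
    u_bar = sum(u) / n
    v_bar = sum(v) / n
    slope = (sum((ui - u_bar) * (vi - v_bar) for ui, vi in zip(u, v))
             / sum((ui - u_bar) ** 2 for ui in u))
    intercept = v_bar - slope * u_bar
    return [vi - (intercept + slope * ui) for ui, vi in zip(u, v)]


def spread_ratio(res):
    """Mean |residual| in the upper half of the x range over the lower half."""
    half = len(res) // 2
    lower = sum(abs(r) for r in res[:half]) / half
    upper = sum(abs(r) for r in res[half:]) / (len(res) - half)
    return upper / lower


raw_ratio = spread_ratio(residuals(x, y))                         # fit on Y
log_ratio = spread_ratio(residuals(x, [math.log(v) for v in y]))  # fit on log Y

print(f"raw: {raw_ratio:.2f}  log-transformed: {log_ratio:.2f}")
```

On real data the usual check is a plot of residuals against fitted values before and after the transformation, rather than a single ratio.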