ISYE 6501 WEEK 1 HOMEWORK – SAMPLE SOLUTIONS

  • March 15, 2022




IMPORTANT NOTE
These homework solutions show multiple approaches and some optional extensions for most of
the questions in the assignment. You don’t need to submit all this in your assignments; they’re
included here just to help you learn more – because remember, the main goal of the homework
assignments, and of the entire course, is to help you learn as much as you can, and develop
your analytics skills as much as possible!




Question 1

Describe a situation or problem from your job, everyday life, current events, etc., for which
a classification model would be appropriate. List some (up to 5) predictors that you might
use.

One possible answer:

Being students at Georgia Tech, the Teaching Assistants for the course suggested the following
example. A college admissions officer has a large pool of applicants and must decide who will make
up the next incoming class. The applicants must be put into different categories – admit,
waitlist, and deny – so a classification model is appropriate. Some common factors used in
college admissions classification are high school GPA, rank in high school class, SAT and/or ACT
score, number of advanced placement courses taken, quality of written essay(s), quality of
letters of recommendation, and quantity and depth of extracurricular activities.

If the goal of the model was to automate a process to make decisions that are similar to those
made in the past, then previous admit/waitlist/deny decisions could be used as the response.
Alternatively, if the goal of the model was to make better admissions decisions, then a different
measure could be used as the response – for example, if the goal is to maximize the academic
success of students, then each admitted student’s college GPA could be the response; if the
goal is to maximize the post-graduation success of admitted students, then some measure of
career success could be the response; etc.

Question 2

The file credit_card_data.txt contains a dataset with 654 data points, 6 continuous and 4 binary
predictor variables. It has anonymized credit card applications with a binary response variable
(last column) indicating if the application was positive or negative. The dataset is the “Credit
Approval Data Set” from the UCI Machine Learning Repository
(https://archive.ics.uci.edu/ml/datasets/Credit+Approval) without the categorical variables and
without data points that have missing values.
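
As a minimal sketch of getting the data into R, the helper below reads the file and checks it against the description above. The function name load_credit_data is hypothetical, and the code assumes the file is whitespace-delimited with no header row; adjust header and sep if your copy differs.

```r
# Hypothetical helper; assumes a whitespace-delimited file with no header row.
load_credit_data <- function(path = "credit_card_data.txt") {
  data <- read.table(path, header = FALSE)
  # 10 predictors plus the response in the last column
  stopifnot(ncol(data) == 11)
  # the response should take exactly two distinct values
  stopifnot(length(unique(data[, 11])) == 2)
  # return a numeric matrix, the form the ksvm example expects
  as.matrix(data)
}
```

Calling data <- load_credit_data() and then dim(data) should report 654 rows and 11 columns if the file matches the description.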

1. Using the support vector machine function ksvm contained in the R package kernlab, find a
good classifier for this data. Show the equation of your classifier, and how well it classifies
the data points in the full data set. (Don’t worry about test/validation data yet; we’ll cover
that topic soon.)

Notes on ksvm

• You can use scaled=TRUE to get ksvm to scale the data as part of calculating a classifier.

• The term λ we used in the SVM lesson to trade off the two components of correctness and
margin is called C in ksvm. One of the challenges of this homework is to find a value of C
that works well; for many values of C, almost all predictions will be “yes” or almost all
predictions will be “no”.

• ksvm does not directly return the coefficients a0 and a1...am. Instead, you need to do the last
step of the calculation yourself. Here’s an example of the steps to take (assuming your data
is stored in a matrix called data):

# Load kernlab, which provides ksvm.
library(kernlab)
# Call ksvm. vanilladot is a simple linear kernel.
model <- ksvm(as.matrix(data[,1:10]), as.factor(data[,11]), type="C-svc",
              kernel="vanilladot", C=100, scaled=TRUE)
# Calculate a1...am as the coefficient-weighted sum of the support vectors.
# a <- colSums(data[model@SVindex,1:10] * model@coef[[1]]) # for unscaled data
a <- colSums(model@xmatrix[[1]] * model@coef[[1]]) # for scaled data
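
The notes above can be pulled together into one sketch that fits the linear classifier, recovers a0 and a1...am, and reports how many points in the full data set are classified correctly. The helper names fit_linear_svm and compare_C are hypothetical, and the C values are chosen only to illustrate a search over orders of magnitude. Because scaled=TRUE is used, the coefficients apply to the scaled predictors, not the raw ones.

```r
library(kernlab)

# Hypothetical helper (not part of kernlab): fit a linear C-svc on an
# 11-column data set (10 predictors, response last) and summarize it.
fit_linear_svm <- function(data, C = 100) {
  x <- as.matrix(data[, 1:10])
  y <- as.factor(data[, 11])
  model <- ksvm(x, y, type = "C-svc", kernel = "vanilladot",
                C = C, scaled = TRUE)
  # a1...am: coefficient-weighted sum of the (scaled) support vectors
  a <- colSums(model@xmatrix[[1]] * model@coef[[1]])
  # a0: kernlab stores the negative of the intercept in slot b
  a0 <- -model@b
  # fraction of the training points the model classifies correctly
  pred <- predict(model, x)
  accuracy <- mean(as.character(pred) == as.character(y))
  list(a = a, a0 = a0, accuracy = accuracy)
}

# Hypothetical helper: compare training accuracy across several values of C.
compare_C <- function(data, C_values = 10^(-3:3)) {
  acc <- sapply(C_values, function(C) fit_linear_svm(data, C)$accuracy)
  data.frame(C = C_values, accuracy = acc)
}
```

With the data stored in a matrix called data, compare_C(data) returns one row per value of C, and fit_linear_svm(data, C = 100) returns the coefficients for a single fit. The accuracy here is measured on the same points used for fitting, as the question asks, so it is likely to overstate performance on new data.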
