SOLUTION MANUAL FOR MODERN BUSINESS ANALYTICS 1ST
EDITION BY MATT TADDY AND LESLIE HENDRIX AND
MATTHEW HARDING.
Chapter 1
Regression
Problem 1.1 For this problem set, we will use 13,103 observations of hourly counts from 2011
to 2012 for bike rides (rentals) from the Capital Bikeshare system in Washington DC. The data
are recorded for hours after 6am every day. (We omit earlier hours for convenience since they
often include zero ride counts.) This dataset is adapted from data originally compiled by Fanaee
and Gama in 'Event labeling combining ensemble detectors and background knowledge' (2013).
This data can be used for modeling system usage (ride counts). Such usage modeling is a key
input for operational planning.
bikeshare.csv contains:
dteday: date
mnth: month (1 to 12)
holiday: whether day is holiday or not
weekday: day of the week, counting from 0 (Sunday)
workingday: 0 if the day is a weekend or holiday, otherwise 1
weathersit: broad overall weather summary (clear, cloudy, wet)
temp: Temperature, measured in Celsius
hum: Humidity %
windspeed: Wind speed, measured in km per hour
cnt: count of total bike rentals that day
<bikeshare.csv>
<bikeshareReadme.txt>
Read the bikeshare.csv data into R. Plot the marginal distribution for the count of bike
rentals and the conditional count distribution given the broad weather situation
(weathersit).
a-1. Use a histogram to plot the marginal distribution for the count of bike rentals. What is the
shape of the distribution?
a. skewed left
b. fairly symmetric
c. skewed right
Explanation/Solution
The following code draws a histogram plotting the marginal distribution for the count of bike
rentals:
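A minimal sketch of such a histogram, assuming bikeshare.csv sits in the working directory. The object name bikes and the helper plot_cnt_hist are illustrative names, not taken from the manual:

```r
# Hypothetical helper: histogram of the ride counts in column `cnt`.
plot_cnt_hist <- function(bikes) {
  hist(bikes$cnt,
       xlab = "ride count (cnt)",
       main = "Marginal distribution of ride counts")
}

# Assumes bikeshare.csv is in the working directory; skipped otherwise.
if (file.exists("bikeshare.csv")) {
  # stringsAsFactors = TRUE is the full name of the strings=T argument
  bikes <- read.csv("bikeshare.csv", stringsAsFactors = TRUE)
  plot_cnt_hist(bikes)
}
```

The shape can then be read off the plot: a long tail toward high counts indicates right skew, a long tail toward low counts indicates left skew.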
a-2. If you haven't already, read the bikeshare.csv data into R and make sure to use the
strings=T argument (an abbreviation of stringsAsFactors=TRUE) in the read.csv function.
Create side-by-side boxplots to show the conditional count distribution, given the broad weather
situation (weathersit). What does the side-by-side boxplot look like?
(The top image is correct.)
a. [boxplot image]
b. [boxplot image]
c. [boxplot image]
Explanation/Solution
The following code draws side-by-side boxplots showing the conditional count distribution,
given the broad weather situation:
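A minimal sketch of such a plot, assuming bikeshare.csv sits in the working directory and weathersit is read as a factor. The object name bikes and the helper plot_cnt_by_weather are illustrative names, not taken from the manual:

```r
# Hypothetical helper: boxplots of `cnt` split by the levels of `weathersit`.
plot_cnt_by_weather <- function(bikes) {
  boxplot(cnt ~ weathersit, data = bikes,
          xlab = "weather situation (weathersit)",
          ylab = "ride count (cnt)")
}

# Assumes bikeshare.csv is in the working directory; skipped otherwise.
if (file.exists("bikeshare.csv")) {
  bikes <- read.csv("bikeshare.csv", stringsAsFactors = TRUE)
  plot_cnt_by_weather(bikes)
}
```

The formula cnt ~ weathersit draws one box per weather level, so the conditional distributions can be compared side by side.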
Problem 1.2 Read the bikeshare.csv data into R and make sure to use
the strings=T argument in the read.csv function. Fit a regression for ride count as a function of
the weather situation variable to answer the following questions.
a. On wet days, is the expected ride count higher or lower compared to clear days?
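One way to approach this, sketched under assumptions: fit a linear regression of cnt on weathersit with glm and read the sign of the wet coefficient relative to the reference level. The helper name fit_weather is illustrative, and the level labels (clear, cloudy, wet) are assumed from the variable description; the labels in bikeshare.csv may differ.

```r
# Hypothetical helper: regress ride count on the weather situation factor.
fit_weather <- function(bikes) {
  # glm defaults to the gaussian family, i.e. ordinary least squares
  glm(cnt ~ weathersit, data = bikes)
}

# Assumes bikeshare.csv is in the working directory; skipped otherwise.
if (file.exists("bikeshare.csv")) {
  bikes <- read.csv("bikeshare.csv", stringsAsFactors = TRUE)
  fit <- fit_weather(bikes)
  # Each weathersit coefficient is the difference in expected count
  # relative to the reference (first) factor level.
  print(levels(bikes$weathersit))
  print(coef(fit))
}
```

If the reference level is the clear category, a negative coefficient on the wet level means a lower expected ride count on wet days, and a positive one means a higher expected count.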