100% satisfaction guarantee Immediately available after payment Both online and in PDF No strings attached
logo-home
Test Bank for Introduction to Data Mining 2nd Edition (Global Edition) By Pang-Ning Tan, Michael Steinbach, Vipin Kumar (All Chapters, 100% Original Verified, A+ Grade) $28.49   Add to cart

Exam (elaborations)

Test Bank for Introduction to Data Mining 2nd Edition (Global Edition) By Pang-Ning Tan, Michael Steinbach, Vipin Kumar (All Chapters, 100% Original Verified, A+ Grade)

1 review
 119 views  1 purchase
  • Course
  • Introduction to Data Mining 2nd Edit
  • Institution
  • Introduction To Data Mining 2nd Edit

This Is Original 2nd Edition of Test Bank From Original Author. All Other Files in the market are fake/old Edition. Other Sellers Have changed old Edition Number to new But Test Bank is old Edition. Test Bank for Introduction to Data Mining 2nd Edition (Global Edition) By Pang-Ning Tan, Michael...

[Show more]

Preview 4 out of 233  pages

  • April 15, 2024
  • 233
  • 2023/2024
  • Exam (elaborations)
  • Questions & answers
  • Introduction to Data Mining 2nd Edit
  • Introduction to Data Mining 2nd Edit

1  review

review-writer-avatar

By: garenachannel • 2 months ago

avatar-seller
tutorsection
Introduction to Data Mining 2e (Global Edition) Pang-Ning Tan,
Michael Steinbach, Vipin Kumar (Test Bank All Chapters, 100%
Original Verified, A+ Grade)


1

Introduction
1. [Fall 2008]
For each data set given below, give specific examples of classification,
clustering, association rule mining, and anomaly detection tasks that
can be performed on the data. For each task, state how the data matrix
should be constructed (i.e., specify the rows and columns of the matrix).

(a) Ambulatory Medical Care data1 , which contains the demographic
and medical visit information for each patient (e.g., gender, age,
duration of visit, physician’s diagnosis, symptoms, medication, etc).
Answer:
Classification
Task: Diagnose whether a patient has a disease.
Row: Patient
Column: Patient’s demographic and hospital visit information (e.g., symptoms), along with
a class attribute that indicates whether the patient has the disease.
Clustering
Task: Find groups of patients with similar medical conditions
Row: A patient visit
Column: List of medical conditions of each patient
Association rule mining
Task: Identify the symptoms and medical conditions that co-occur together frequently
Row: A patient visit
Column: List of symptoms and diagnosed medical conditions of the patient
Anomaly detection
Task: Identify healthy looking patients with rare medical disorders
Row: A patient visit
Column: List of demographic attributes, symptoms, and medical test results of the patient
1
See for example, the National Hospital Ambulatory Medical Care Survey http://www.
cdc.gov/nchs/about/major/ahcd/ahcd1.htm

,2 Chapter 1 Introduction

(b) Stock market data, which include the prices and volumes of various
stocks on different trading days.
Answer:
Classification
Task: Predict whether the stock price will go up or down the next trading day
Row: A trading day
Column: Trading volume and closing price of the stock the previous 5 days and a class
attribute that indicates whether the stock went up or down
Clustering
Task: Identify groups of stocks with similar price fluctuations
Row: A company’s stock
Column: Changes in the daily closing price of the stock over the past ten years
Association rule mining
Task: Identify stocks with similar fluctuation patterns(e.g., {Google-Up, Yahoo-Up})
Row: A trading day
Column: List of all stock-up and stock-down events on the given day.
Anomaly detection
Task: Identify unusual trading days for a given stock (e.g., unusually high volume)
Row: A trading day
Column: Trading volume, change in daily stock price (daily high − low prices), and average
price change of its competitor stocks
(c) Database of Major League Baseball (MLB).

Classification
Task: Predict the winner of a game between two MLB teams.
Row: A game.
Column: Statistics of the home and visiting teams over their past 10 games they had played
(e.g., average winning percentage and hitting percentage of their players)
Clustering
Task: Identify groups of players with similar statistics
Row: A player
Column: Statistics of the player
Association rule mining
Task: Identify interesting player statistics (e.g., 40% of right-handed players have a batting
percentage below 20% when facing left-handed pitchers)
Row: A player
Column: Discretized statistics of the player
Anomaly detection
Task: Identify players who performed considerably better than expected in a given season
Row: A (player,season) pair e.g, (player1 in 2007)
Column: Ratio statistics of a player (e.g., ratio of average batting percentage in 2007 to
career average batting percentage)



2

, 2

Data
2.1 Types of Attributes
1. Classify the following attributes as binary, discrete, or continuous. Also
classify them as qualitative (nominal or ordinal) or quantitative (interval
or ratio). Some cases may have more than one interpretation, so briefly
indicate your reasoning if you think there may be some ambiguity.

(a) Number of courses registered by a student in a given semester.
Answer: Discrete, quantitative, ratio.
(b) Speed of a car (in miles per hour).
Answer: Discrete, quantitative, ratio.
(c) Decibel as a measure of sound intensity.
Answer: Continuous, quantitative, interval or ratio. It is actually
a logratio type (which is somewhere between interval and ratio).
(d) Hurricane intensity according to the Saffir-Simpson Hurricane Scale.
Answer: Discrete, qualitative, ordinal.
(e) Social security number.
Answer: Discrete, qualitative, nominal.

2. Classify the following attributes as:

• discrete or continuous.
• qualitative or quantitative
• nominal, ordinal, interval, or ratio

, 4 Chapter 2 Data

Some cases may have more than one interpretation, so briefly indicate
your reasoning if you think there may be some ambiguity.

(a) Julian Date, which is the number of days elapsed since 12 noon
Greenwich Mean Time of January 1, 4713 BC.
Answer: Continuous, quantitative, interval
(b) Movie ratings provided by users (1-star, 2-star, 3-star, or 4-star).
Answer: Discrete, qualitative, ordinal
(c) Mood level of a blogger (cheerful, calm, relaxed, bored, sad, angry
or frustrated).
Answer: Discrete, qualitative, nominal
(d) Average number of hours a user spent on the Internet in a week.
Answer: Continuous, quantitative, ratio
(e) IP address of a machine.
Answer: Discrete, qualitative, nominal
(f) Richter scale (in terms of energy release during an earthquake).
Answer: Continuous, qualitative, ordinal
In terms of energy release, the difference between 0.0 and 1.0 is not
the same as between 1.0 and 2.0. Ordinal attributes are qualitative;
yet, can be continuous.
(g) Salary above the median salary of all employees in an organization.
Answer: Continuous, quantitative, interval
(h) Undergraduate level (freshman, sophomore, junior, and senior) for
measuring years in college.
Answer: Discrete, qualitative, ordinal

3. For each attribute given, classify its type as:

• discrete or continuous AND
• qualitative or quantitative AND
• nominal, ordinal, interval, or ratio

Indicate your reasoning if you think there may be some ambiguity in
some cases.
Example: Age in years.
Answer: Discrete, quantitative, ratio.

4

The benefits of buying summaries with Stuvia:

Guaranteed quality through customer reviews

Guaranteed quality through customer reviews

Stuvia customers have reviewed more than 700,000 summaries. This how you know that you are buying the best documents.

Quick and easy check-out

Quick and easy check-out

You can quickly pay through credit card or Stuvia-credit for the summaries. There is no membership needed.

Focus on what matters

Focus on what matters

Your fellow students write the study notes themselves, which is why the documents are always reliable and up-to-date. This ensures you quickly get to the core!

Frequently asked questions

What do I get when I buy this document?

You get a PDF, available immediately after your purchase. The purchased document is accessible anytime, anywhere and indefinitely through your profile.

Satisfaction guarantee: how does it work?

Our satisfaction guarantee ensures that you always find a study document that suits you well. You fill out a form, and our customer service team takes care of the rest.

Who am I buying these notes from?

Stuvia is a marketplace, so you are not buying this document from us, but from seller tutorsection. Stuvia facilitates payment to the seller.

Will I be stuck with a subscription?

No, you only buy these notes for $28.49. You're not tied to anything after your purchase.

Can Stuvia be trusted?

4.6 stars on Google & Trustpilot (+1000 reviews)

67163 documents were sold in the last 30 days

Founded in 2010, the go-to place to buy study notes for 14 years now

Start selling
$28.49  1x  sold
  • (1)
  Add to cart