This document contains a summary of the material needed for the exam. It includes information from the book, lectures, knowledge clips, cases, and papers. Written in the academic year and based on the book "Data Science for Business", which has been used in all lectures.
Week 1: Chapters 1 and 2
The ubiquity of data opportunities in the digital era.
Over the last 25 years, many devices have become linked with each other through data. The costs
of storing these data have decreased.
Some observations in our daily life:
- Marketing
o Online advertising
o Recommendations for cross selling
o Customer relationship management
- Finance
o Credit scoring and trading
o Fraud detection
o Workforce management
- Retail
o Marketing
o Supply chain management
Data is used in many organizations daily. Different technologies → more data → use for
better decisions.
Fundamental concepts: data science and data-driven decision making (DDD)
Data-driven decision making (DDD): practice of basing decisions on the analysis of data,
rather than purely on intuition.
- Relying on data and analysis. DDD assumes there is a lot of data, which is not
always the case. For example, in a pandemic, we learn as we go: in the initial
stages, there is often little data available.
Data science: involves principles, processes and techniques for understanding phenomena
via the analysis of data.
- For example, what can we do to retain customers? Predict customer churn.
Data science supports DDD, but also overlaps with it. Business decisions are
increasingly being made automatically. Data engineering and processing support data
science, but are useful for much more.
The sort of decisions of interest:
1. Decisions which need discovery within data.
2. Repetitive decisions (especially at massive scale).
The decisions that are interesting for a company are those where the answer is not
obvious: they require discovery within the data rather than intuition. The other important
group is repetitive decisions. If there is a problem you face frequently, data science is
important, because even a small improvement per decision adds up at scale.
Fundamental concepts: Big data
Data vs. information? → Data can be converted into information, which gives it meaning. Data
in itself has no meaning.
Big data: simply put, very large datasets, but with three distinct characteristics (3Vs):
- Volume: quantity of generated and stored data.
- Variety: type and nature of data.
- Velocity: speed at which the data is generated and processed.
Big Data 1.0 is transformed into Big Data 2.0 (social networking components, rise of voice of
individual consumer).
Fundamental concepts: data science, data mining, and machine learning
Data science: involves principles, processes and techniques for understanding phenomena via the analysis of data.
- Involves collection → storage → analysis → implementation.
Data mining: extraction of knowledge from data, via technologies that incorporate these
principles. Data mining is one aspect of data science, extracting the knowledge.
We focus on the business understanding (how do you translate business problems into data
problems) and data analysis (what kind of models can you use, understanding data analysis).
Data analytics: process of examining datasets in order to draw conclusions about the useful
information they may contain. What value do these models have? What is the framework?
- How much value does it create for a manager and how can you use it for the future?
Types of data analysis:
- Descriptive analysis (BI): what has happened?
o Simple descriptive statistics, dashboard, charts, diagrams.
- Predictive analysis: what could happen?
o Segmentation, regression.
- Prescriptive analysis: what should we do?
o Complex models for product planning and stock optimization.
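The three types of analysis can be illustrated on one small dataset. The sketch below is only an illustration under stated assumptions: the weekly sales figures, the function names (`describe`, `forecast`, `recommend_order`) and the `safety_stock` parameter are all hypothetical and do not come from the book. A straight-line trend stands in for a predictive model, and a simple ordering rule stands in for the far more complex prescriptive models used in practice.

```python
# Hypothetical weekly unit sales for one product (invented for illustration).
weekly_sales = [100, 110, 120, 130, 140, 150]


def describe(sales):
    """Descriptive analysis: summarise what has happened."""
    return {"total": sum(sales), "mean": sum(sales) / len(sales)}


def fit_trend(sales):
    """Fit a least-squares line: sales = intercept + slope * week."""
    n = len(sales)
    weeks = range(n)
    mean_x = sum(weeks) / n
    mean_y = sum(sales) / n
    covariance = sum((x - mean_x) * (y - mean_y) for x, y in zip(weeks, sales))
    variance = sum((x - mean_x) ** 2 for x in weeks)
    slope = covariance / variance
    intercept = mean_y - slope * mean_x
    return intercept, slope


def forecast(sales, week):
    """Predictive analysis: estimate what could happen in a given week."""
    intercept, slope = fit_trend(sales)
    return intercept + slope * week


def recommend_order(sales, stock_on_hand, safety_stock=10):
    """Prescriptive analysis: decide what to do (units to order for next week)."""
    expected_demand = forecast(sales, len(sales))
    return max(0, round(expected_demand + safety_stock - stock_on_hand))


print(describe(weekly_sales))                            # what has happened?
print(forecast(weekly_sales, len(weekly_sales)))         # what could happen?
print(recommend_order(weekly_sales, stock_on_hand=50))   # what should we do?
```

Each step builds on the previous one: the description uses only past data, the forecast extrapolates from it, and the recommendation turns the forecast into an action.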
Big data analysis as a strategic asset.
Data comes from both internal and external sources, structured and unstructured. The idea
is to make meaning of the data. When you combine big data with analysis, this is key for
having competitive advantage.
Strategic asset: data and the capability to extract useful knowledge from data can be a
strategic asset. One has to think of data science as a strategic asset you invest in. You need
good people, good infrastructure and a process that runs over years. It is like R&D: a long-term
project that pays off in the long run.
Classification: predict, for each individual in a population, which of a set of classes this
individual belongs to. It predicts whether something will happen. Often a
binary target (categorical, not numerical as in a regression).
Scoring (class probability estimation): apply a score representing the probability
that an individual belongs to each class.
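The difference between classification and scoring can be sketched with a customer churn example. This is a minimal illustration, not the method used in the book or the lectures: the training data are invented, the single feature (number of complaints) is an assumption, and a one-feature logistic regression fitted with plain gradient descent stands in for whatever model would be used in practice. The names `train`, `score` and `classify` are hypothetical.

```python
import math

# Hypothetical training data: (complaints last year, churned? 1 = yes, 0 = no).
customers = [(0, 0), (1, 0), (2, 0), (3, 1), (4, 1), (5, 1), (1, 0), (4, 1)]


def sigmoid(z):
    """Map any real number to a value between 0 and 1."""
    return 1.0 / (1.0 + math.exp(-z))


def train(data, learning_rate=0.1, epochs=2000):
    """Fit a one-feature logistic regression with plain gradient descent."""
    weight, bias = 0.0, 0.0
    for _ in range(epochs):
        grad_w = grad_b = 0.0
        for x, y in data:
            error = sigmoid(weight * x + bias) - y
            grad_w += error * x
            grad_b += error
        weight -= learning_rate * grad_w / len(data)
        bias -= learning_rate * grad_b / len(data)
    return weight, bias


def score(model, complaints):
    """Scoring: estimated probability that the customer belongs to class 'churn'."""
    weight, bias = model
    return sigmoid(weight * complaints + bias)


def classify(model, complaints, threshold=0.5):
    """Classification: turn the score into a binary class label (1 = churn)."""
    return 1 if score(model, complaints) >= threshold else 0


model = train(customers)
for complaints in (0, 5):
    print(complaints, round(score(model, complaints), 3), classify(model, complaints))
```

The score keeps more information than the class label: two customers can both be classified as churners while one has a much higher estimated probability, which matters when deciding whom to target first.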