Summary of the subject Machine Learning which was given in year 3 of the education Applied Computer Science. This document contains following chapters: Designing BI solutions with Microsoft SQL Server, Analysis Services Tabular Modeling, Multidimensional Modeling, Data Mining, Integration Services ...
Chapter 2: Designing BI solutions with
Microsoft SQL Server
Components of a BI infrastructure
Data sources
OLTP Databases, Legacy system, back office applications, ERP, CRM, accounting apps, flat files or any
kind of database that users adopt in order to manage business.
ETL
• Extract from sources
• Transform schema & content
• Load into destination
The ETL process is a strictly technical domain. This is where tables are deleted, added and modified.
Temporary tables are stored in a staging area, these tables are useless at the end of the process and
cannot be seen by users. Example of an ETL tool: SSIS.
Data cleansing
• Data value validation
• Duplicate record matching
Master data management
• Business entity integrity
JDK 2021 1
,Data Warehouse
Is the database that will contain all the tables,
views, procedures and code that end-users will
use for their daily reporting, dashboarding and
analytical activities.
Querying is more important than inserts/
updates/deletes.
According to Kimball, DWH is the union of all the
data marts.
According to Inmon, DWH is a relational model
in 3rd normal form of the corporate data model.
Data marts source their info from the EDW
(Enterprise data warehouse)
There cannot be one clear definition of a data
warehouse, the content of the data mart highly
depends upon the complexities of the specific BI.
ODS = operational data store
SODA = staging/ODS/Archive
A data mart contains a subset of organization-wide data. This subset of data is valuable to a specific
community of knowledge workers.
For example, the marketing data mart may contain data related to products, customers and sales and will
be used by the marketing analysts.
The best way to model a data mart is a star schema or snowflake schema with a fact table surrounded by
dimension tables.
Kimball vs Inmon
- There is no right or wrong.
- There is no clear separation. The shared idea is dimensional modeling.
- Different data warehousing philosophies.
- DWH in most enterprises are closer to Ralph Kimball's idea. This is because most started out as a
departmental effort as a data mart
- Inmon’s solution takes time and money
JDK 2021 2
, Data models
Benefits of data models:
▪ Abstract data warehouse tables
▪ Simplify analysis for users
▪ Add business logic
▪ Pre-aggregate measures
▪ Provide a standard interface
2 types of models: Multidimensional & Tabular
Data Models are built with SQL Server Analysis Services. SSAS is an extra layer of metadata, or a
semantic model that sits on top of a data warehouse in a relational database.
The layer includes models containing the business logic of your data
A data model contains information about:
- how fact tables and dimension tables should be joined
- how measures should be aggregated
- how users should be able to explore the data through hierarchies
- the definitions of common calculations
End user applications query these models rather than the underlying database.
An LOB (Line of Business) application is one of the sets of critical computer applications that are vital to
running an enterprise. LOB applications are usually large programs that contain a number of integrated
capabilities and tie into databases and database management systems. For example: ERP – CRM
SSAS
• Tabular & multidimensional
• The concepts involved in designing the two types of model are very different, and you cannot
convert a Tabular database into a Multidimensional or vice versa, without rebuilding everything from
scratch
JDK 2021 3
Les avantages d'acheter des résumés chez Stuvia:
Qualité garantie par les avis des clients
Les clients de Stuvia ont évalués plus de 700 000 résumés. C'est comme ça que vous savez que vous achetez les meilleurs documents.
L’achat facile et rapide
Vous pouvez payer rapidement avec iDeal, carte de crédit ou Stuvia-crédit pour les résumés. Il n'y a pas d'adhésion nécessaire.
Focus sur l’essentiel
Vos camarades écrivent eux-mêmes les notes d’étude, c’est pourquoi les documents sont toujours fiables et à jour. Cela garantit que vous arrivez rapidement au coeur du matériel.
Foire aux questions
Qu'est-ce que j'obtiens en achetant ce document ?
Vous obtenez un PDF, disponible immédiatement après votre achat. Le document acheté est accessible à tout moment, n'importe où et indéfiniment via votre profil.
Garantie de remboursement : comment ça marche ?
Notre garantie de satisfaction garantit que vous trouverez toujours un document d'étude qui vous convient. Vous remplissez un formulaire et notre équipe du service client s'occupe du reste.
Auprès de qui est-ce que j'achète ce résumé ?
Stuvia est une place de marché. Alors, vous n'achetez donc pas ce document chez nous, mais auprès du vendeur GraduateITF. Stuvia facilite les paiements au vendeur.
Est-ce que j'aurai un abonnement?
Non, vous n'achetez ce résumé que pour €7,99. Vous n'êtes lié à rien après votre achat.