
COMPLETE Summary - Interactive Data Transformation (2023)


  • 8 June 2023
  • 23 pages
  • 2022/2023
  • Summary
Interactive Data Transformation
Unit 4 – Year 2022/2023


Contents
Lecture 1: Data Management Systems, Relational, and SQL
• Relational data model
• Single table queries using SQL
Lecture 2: Entity Relationships, and Translating from a Natural Language Perspective
• Basic concepts
Lecture 3: Translating ERD to DB Schema, and Database Normalization
• Transforming entity relationship diagram to relational schema
• Normalization
Lecture 4: Evolution of Data Management, Big Data, and Data Intensive Systems
• Evolution of Data Management
• Big data and its challenges
• Big data analytics
• Reasons for going beyond traditional RDBMS
Lecture 5: The Spark Ecosystem, RDDs, Programming Model, and PySpark
• Lambda expressions
• Apache Spark
• Programming models
Lecture 6: Data Transformations with SQL, Entity Recognition, Data Cleaning Tools
• Views
• Functions
• Creating and populating
• Data from websites, integration & cleaning, entity extraction & resolution

Lecture 1: Data Management Systems, Relational, and SQL
Reasons for Database Management Systems (DBMS):
In the early days, database applications were built directly on top of file systems.
Drawbacks of this approach:
• Data redundancy and inconsistency: multiple file formats, duplication in different files
• Difficulty in accessing data: need to write a new program to carry out each new task
• Data isolation: multiple files and formats
• Integrity problems: integrity constraints become buried in program code rather than being
stated explicitly
- Hard to add new constraints or change existing ones
• Atomicity of updates: failures leave data in an inconsistent state with partial updates carried
out
• Concurrent access by multiple users: needed for performance, but uncontrolled concurrent
accesses can lead to inconsistencies
• Security problems: hard to provide user access to some, but not all, data
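
The integrity and atomicity drawbacks above can be made concrete with a small sketch of what a DBMS offers instead. It uses Python's built-in sqlite3 module; the account table, its columns, and the transfer helper are hypothetical names chosen for illustration, not part of the lecture. The constraint is stated once in the schema, and a failed update leaves no partial changes behind.

```python
import sqlite3

# Hypothetical schema for illustration; table and column names are assumptions.
conn = sqlite3.connect(":memory:")
# The integrity constraint (no negative balance) is stated explicitly in the
# schema instead of being buried in application code.
conn.execute(
    """
    CREATE TABLE account (
        id      INTEGER PRIMARY KEY,
        owner   TEXT NOT NULL,
        balance INTEGER NOT NULL CHECK (balance >= 0)
    )
    """
)
conn.execute("INSERT INTO account VALUES (1, 'alice', 100), (2, 'bob', 50)")
conn.commit()


def transfer(conn, src, dst, amount):
    """Move `amount` from src to dst; both updates apply or neither does."""
    try:
        with conn:  # commits on success, rolls back if an exception is raised
            conn.execute(
                "UPDATE account SET balance = balance + ? WHERE id = ?",
                (amount, dst),
            )
            conn.execute(
                "UPDATE account SET balance = balance - ? WHERE id = ?",
                (amount, src),
            )
        return True
    except sqlite3.IntegrityError:
        return False


ok = transfer(conn, 1, 2, 30)         # both updates are committed
rejected = transfer(conn, 1, 2, 500)  # debit violates the CHECK, credit is undone
balances = dict(conn.execute("SELECT id, balance FROM account ORDER BY id"))
```

In the second call the credit to bob is executed first and then rolled back when the debit fails, so the data never stays in a partially updated state.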

A database (DB) is a collection of data with the same structure, including correlations and
relationships, that is defined for a particular use and used by several users
A database management system (DBMS) is a collection of programs operating on a DB that specify
the data types, structure, and constraints, store the data on disk, retrieve and update it, and
manage access rights
• A black box interacting between users/applications and the database
• The ultimate goal is to separate data from application
- Provide an interface that the application programmer must follow
- Allow system administrator to make modifications without having an impact on the user
- Users can change their view of the data without having to worry about how it is stored
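
A minimal sketch of this separation, assuming SQLite through Python's sqlite3 module and hypothetical table names (person_v1, person_v2) and a view called person: the application only queries the view, so the administrator can reorganize the stored tables without the application query changing.

```python
import sqlite3

conn = sqlite3.connect(":memory:")

# Initial storage layout (hypothetical): one table holding the full name.
conn.execute("CREATE TABLE person_v1 (id INTEGER PRIMARY KEY, full_name TEXT)")
conn.execute("INSERT INTO person_v1 VALUES (1, 'Ada Lovelace')")
conn.execute("CREATE VIEW person AS SELECT id, full_name FROM person_v1")

# The application only knows the view, not the underlying table.
APP_QUERY = "SELECT full_name FROM person WHERE id = 1"
before = conn.execute(APP_QUERY).fetchall()[0][0]

# The administrator reorganizes storage: first and last name in separate columns.
conn.execute(
    "CREATE TABLE person_v2 (id INTEGER PRIMARY KEY, first_name TEXT, last_name TEXT)"
)
conn.execute("INSERT INTO person_v2 VALUES (1, 'Ada', 'Lovelace')")
conn.execute("DROP VIEW person")
conn.execute(
    "CREATE VIEW person AS "
    "SELECT id, first_name || ' ' || last_name AS full_name FROM person_v2"
)
conn.execute("DROP TABLE person_v1")

# Same application query, same answer, different physical storage.
after = conn.execute(APP_QUERY).fetchall()[0][0]
```

The view plays the role of the interface the application programmer must follow; only its definition changes when the storage layout does.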

There are different layers within the DBMS:
• External layer: communication with users
- Analysis of user requests (queries)
- Access control
- Answer presentation
• Logical layer:
- Optimization of queries
- Resolving conflicting accesses, e.g., from multiple concurrent users
- Guarantees constant availability even in case of failures
• Internal layer:
- Storing the data
- Software for structuring the data
- Efficient access methods, e.g., keys and indices
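
As a rough illustration of the internal layer's access methods, the sketch below asks SQLite (through Python's sqlite3 module) how it would execute the same query before and after an index is created. The student table, the index name idx_student_city, and the plan helper are assumptions made for this example; the exact wording of the plan text depends on the SQLite version.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE student (id INTEGER PRIMARY KEY, name TEXT, city TEXT)")
conn.executemany(
    "INSERT INTO student (name, city) VALUES (?, ?)",
    [(f"student{i}", f"city{i % 100}") for i in range(10_000)],
)


def plan(sql):
    """Return SQLite's query-plan description for `sql` as one string."""
    rows = conn.execute("EXPLAIN QUERY PLAN " + sql).fetchall()
    return " | ".join(row[-1] for row in rows)  # last column holds the detail text


QUERY = "SELECT name FROM student WHERE city = 'city42'"

plan_without_index = plan(QUERY)  # expected to describe a scan of the whole table
conn.execute("CREATE INDEX idx_student_city ON student (city)")
plan_with_index = plan(QUERY)     # expected to mention idx_student_city

# The query result is the same either way; only the access path changes.
matches = conn.execute(
    "SELECT COUNT(*) FROM student WHERE city = 'city42'"
).fetchone()[0]
```

The query text itself never changes, which is the point: the user states what is wanted in the external layer, and the logical and internal layers decide how to fetch it.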

System development life cycle
1. Planning: develop a preliminary understanding of the business situation and how information
systems might help solve the problem
• Enterprise modeling: analyze current data processing, the general business functions and
their database needs, and justify need for new data and databases in support of business
• Conceptual data modeling: identify scope of database requirements for proposed
information system and analyze overall data requirements for business function(s) supported
by database
• Understand current data processing
• Understand general business functions and needs

2. Analysis: analyze the business situation thoroughly to determine requirements and to
structure those requirements
• Conceptual data modeling:
- Develop preliminary conceptual data model, including entities and relationships
- Compare to enterprise data model
- Develop detailed conceptual data model, including entities, relationships, attributes, and
business rules
- Make conceptual data model consistent with other models of information systems
- Populate repository with all conceptual database specifications
• Output: conceptual schema
• Corresponds to a detailed technology independent specification of the overall organizational
data structure

3. Design
• Logical: representation of the DB; transform the conceptual schema into a logical schema,
which describes the data in terms of the data management technology that will be used to
implement the database
• Physical: the set of specifications that describe how data from a logical schema are stored in
a computer’s secondary memory by a specific database management system

4. Implementation: a designer writes, tests, and installs the programs/scripts that access,
create, or modify the database
• Finalize all database documentation, train users, and put procedures into place for the
ongoing support of the information system users
• Populate with data
• Install application(s) and test
• Complete documentation and training materials

5. Maintenance: add, delete, or change characteristics of the structure of the database in order
to meet changing business conditions, to correct errors in database design, or to improve the
processing speed of database applications
• Monitor the operation and usefulness of the system
• Repair by fixing errors in database and applications
• Enhance by analyzing the database and applications to ensure that evolving information
requirements are met

Different types of DBMS:
• Traditional DBMS: text and numerical data
• Multimedia DBMS: multimedia data
• Spatial DBMS: geographic and geometric data
• Data warehouses

Relational data model
The relational model is an approach to managing data in which the data is grouped into relations (tables)

A relational DBMS (RDBMS) is a database management system that manages data as a collection
of tables in which all relationships are represented by common values in related tables
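
To illustrate "relationships represented by common values", here is a minimal sketch using Python's sqlite3 module. The department and employee tables and their contents are made up for this example. The only link between the two tables is the shared dept_id value, and a join follows that link.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript(
    """
    CREATE TABLE department (
        dept_id INTEGER PRIMARY KEY,
        name    TEXT NOT NULL
    );
    -- dept_id in employee holds a value that also appears in department.
    -- (SQLite only enforces REFERENCES after PRAGMA foreign_keys = ON.)
    CREATE TABLE employee (
        emp_id  INTEGER PRIMARY KEY,
        name    TEXT NOT NULL,
        dept_id INTEGER REFERENCES department(dept_id)
    );
    INSERT INTO department VALUES (10, 'Research'), (20, 'Sales');
    INSERT INTO employee VALUES (1, 'Ada', 10), (2, 'Grace', 10), (3, 'Linus', 20);
    """
)

# The relationship "employee works in department" is recovered by matching
# the common dept_id values in both tables.
rows = conn.execute(
    """
    SELECT e.name, d.name
    FROM employee AS e
    JOIN department AS d ON e.dept_id = d.dept_id
    ORDER BY e.emp_id
    """
).fetchall()
```

No pointers or file offsets connect the tables; the relationship exists purely because the same value appears in both.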
