This summary covers all the lectures of the course Foundations of Databases, part of the minor Data Science in Business. It includes the elaborated slides, the summary slides, and some coding examples.
Lecture 1
Data engineering
- Data engineers are the designers, builders, and managers of the information or ‘big data’ infrastructure
o They develop the architecture that helps analyze and process data in the way the organization needs it
o They make sure those systems perform smoothly
Hierarchy of needs
Stages in a Big Data pipeline
General pipeline components
Data engineering and processing:
- Underlies (is necessary for) data science and data-driven decision making
- Has other positive effects on data processing
Data mesh
Main purpose of a database: storing data and processing it into information
Terminology
- Data: given facts, denoted e.g. by sequences of characters or numbers
- Information: the interpretation of data within a certain context
- Database: a collection of permanently and digitally stored data
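The difference between data and information can be sketched in a few lines of Python. The value and both contexts below are made up for illustration: the same sequence of characters becomes different information depending on the context in which it is interpreted.

```python
from datetime import datetime

# Data: a given fact, here just a sequence of characters (hypothetical value).
data = "20240131"

# Context 1 (assumed): the field holds an order date in YYYYMMDD form.
as_date = datetime.strptime(data, "%Y%m%d").date()

# Context 2 (assumed): the field holds an invoice number.
as_number = int(data)

print(as_date.isoformat())
print(as_number)
```

Storing the string alone is not enough: the context (what the field means) has to be known before the data can be read as information.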
Relational database
- Relationships
- Rows and columns
o Rows, records, or tuples
o Columns, or attributes
- Common query language: SQL (Structured Query Language)
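The terms above can be made concrete with a minimal sketch using Python's built-in sqlite3 module. The tables customer and purchase, their columns, and the values are hypothetical and not taken from the lecture; the sketch only illustrates rows, columns, a relationship between two tables, and an SQL query.

```python
import sqlite3

# In-memory database; table and column names are hypothetical.
conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE customer (id INTEGER PRIMARY KEY, name TEXT)")
conn.execute(
    "CREATE TABLE purchase ("
    "id INTEGER PRIMARY KEY, "
    "customer_id INTEGER REFERENCES customer(id), "
    "amount REAL)"
)

# Each inserted tuple is one row (record); each field is one column (attribute).
conn.executemany(
    "INSERT INTO customer (id, name) VALUES (?, ?)",
    [(1, "Ada"), (2, "Bob")],
)
conn.executemany(
    "INSERT INTO purchase (id, customer_id, amount) VALUES (?, ?, ?)",
    [(1, 1, 10.0), (2, 1, 5.5), (3, 2, 7.0)],
)

# The relationship between the two tables is followed with a JOIN in SQL.
totals = conn.execute(
    "SELECT c.name, SUM(p.amount) "
    "FROM customer c JOIN purchase p ON p.customer_id = c.id "
    "GROUP BY c.id, c.name ORDER BY c.name"
).fetchall()

print(totals)
conn.close()
```

The column purchase.customer_id refers to customer.id; that reference is the relationship, and the query uses it to compute the total amount per customer.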
Database Management System
- Provides one logical structure of the data for all users
- Lets multiple applications access the data at the same time
Different models for organizing data
- A database model is a collection of rules with which it is possible to describe the structure, the consistency rules, and the behavior of a database
- The database model describes how data are to be structured in a database system and, thus, in a database management system
NoSQL databases: common classifications
- Column store or column-oriented database
o Data is structured in columns
o Name, value, timestamp
- Document store or document-oriented database
o Data is structured in documents
o Typically in some standard format or encoding
- Key-value store/database
o Data is structured into an associative array
o Like a dictionary or hash table
o A collection of objects, which in turn have many different fields within them, each containing data
- Graph database
o Data is structured in nodes, edges and properties describing the nodes
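The four classifications above differ mainly in the shape of the stored data. The sketch below mimics each shape with plain Python structures; it does not use any real NoSQL product, and all keys, names, and values are invented for illustration.

```python
import json

# Key-value store: an associative array from key to value (like a dict or hash table).
key_value = {"user:1": "Ada", "user:2": "Bob"}

# Document store: each record is a document in a standard encoding, here JSON.
document = json.loads('{"_id": 1, "name": "Ada", "skills": ["SQL", "Python"]}')

# Column store: each column is a (name, value, timestamp) triple, grouped per row key.
column_rows = {
    "row1": [("name", "Ada", 1700000000), ("city", "Delft", 1700000000)],
    "row2": [("name", "Bob", 1700000050)],
}

# Graph database: nodes with properties, plus edges that connect the nodes.
nodes = {"ada": {"kind": "person"}, "bob": {"kind": "person"}}
edges = [("ada", "knows", "bob")]


def neighbors(node_id):
    """Return the ids of all nodes reachable over one outgoing edge."""
    return [target for source, _label, target in edges if source == node_id]


print(key_value["user:1"])
print(document["skills"])
print([name for name, _value, _ts in column_rows["row1"]])
print(neighbors("ada"))
```

Note that rows in the column-store sketch do not need the same set of columns, and that the graph sketch answers "who is connected to whom" directly from the edges rather than through a join.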
Structured vs. unstructured data
- Unstructured data
o Text files
- Structured data
o XML, database
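A small sketch of why the distinction matters, using only the Python standard library. Both snippets of data are invented: the XML version exposes named fields that can be queried directly, while the text version only offers raw characters.

```python
import xml.etree.ElementTree as ET

# Unstructured: free text with no declared fields (hypothetical sentence).
unstructured = "Ada bought 3 books on Monday and Bob bought 1 book."

# Structured: the same facts as XML with explicit elements and attributes.
structured = """<orders>
  <order customer="Ada" quantity="3"/>
  <order customer="Bob" quantity="1"/>
</orders>"""

# Structured data can be queried by field name.
root = ET.fromstring(structured)
total_quantity = sum(int(order.get("quantity")) for order in root.findall("order"))
customers = [order.get("customer") for order in root.findall("order")]

# Unstructured data only supports generic text operations such as search.
mentions_books = "books" in unstructured

print(total_quantity)
print(customers)
print(mentions_books)
```

Getting the total quantity out of the sentence would require text processing that guesses at its meaning, whereas the XML states the structure explicitly.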