DATABRICKS CERTIFIED DATA ENGINEER ASSOCIATE
STUDY GUIDE WITH COMPLETE SOLUTION!!
Which of the following describes a benefit of a data lakehouse that is unavailable
in a traditional data warehouse?
A. A data lakehouse provides a relational system of data management.
B. A data lakehouse captures snapshots of data for version control purposes.
C. A data lakehouse couples storage and compute for complete control.
D. A data lakehouse utilizes proprietary storage formats for data.
E. A data lakehouse enables both batch and streaming analytics.
answers: E. A data lakehouse enables both batch and streaming analytics.
Which of the following locations hosts the driver and worker nodes of a
Databricks-managed cluster?
A. Data plane
B. Control plane
C. Databricks Filesystem
D. JDBC data source
E. Databricks web application
answers: A. Data plane
A data architect is designing a data model that works for both video-based machine
learning workloads and highly audited batch ETL/ELT workloads. Which of the
following describes how using a data lakehouse can help the data architect meet
the needs of both workloads?
A. A data lakehouse requires very little data modeling.
B. A data lakehouse combines compute and storage for simple governance.
C. A data lakehouse provides autoscaling for compute clusters.
D. A data lakehouse stores unstructured data and is ACID-compliant.
E. A data lakehouse fully exists in the cloud.
answers: D. A data lakehouse stores unstructured data and is ACID-compliant.
Which of the following describes a scenario in which a data engineer will want to
use a Job cluster instead of an all-purpose cluster?
A. An ad-hoc analytics report needs to be developed while minimizing compute
costs.
B. A data team needs to collaborate on the development of a machine learning
model.
C. An automated workflow needs to be run every 30 minutes.
D. A Databricks SQL query needs to be scheduled for upward reporting.
E. A data engineer needs to manually investigate a production error.
answers: C. An automated workflow needs to be run every 30 minutes.
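To make the job-cluster answer concrete, here is a minimal sketch of the settings such a scheduled job might carry, written as a Python dictionary shaped like a Databricks Jobs API 2.1 request body. The job name, task key, notebook path, runtime version, and node type are hypothetical placeholders, not values from this guide. The sketch only builds and prints the payload; it does not call any API.

```python
import json


def build_job_settings(notebook_path, spark_version, node_type_id):
    """Build job settings that run a notebook on a fresh job cluster every 30 minutes."""
    return {
        "name": "every-30-minutes-workflow",  # hypothetical job name
        "schedule": {
            # Quartz cron fields: seconds, minutes, hours, day-of-month, month, day-of-week.
            # "0 0/30 * * * ?" fires at minute 0 and minute 30 of every hour.
            "quartz_cron_expression": "0 0/30 * * * ?",
            "timezone_id": "UTC",
        },
        "tasks": [
            {
                "task_key": "run_pipeline",  # hypothetical task key
                "notebook_task": {"notebook_path": notebook_path},
                # "new_cluster" requests a job cluster that is created for the run
                # and terminated afterwards, instead of reusing an all-purpose
                # cluster through "existing_cluster_id".
                "new_cluster": {
                    "spark_version": spark_version,
                    "node_type_id": node_type_id,
                    "num_workers": 2,
                },
            }
        ],
    }


# All three arguments below are placeholder values chosen for illustration.
settings = build_job_settings(
    notebook_path="/Repos/team/pipeline/main",
    spark_version="13.3.x-scala2.12",
    node_type_id="i3.xlarge",
)
print(json.dumps(settings, indent=2))
```

Because the cluster exists only for the duration of each run, no compute sits idle between the 30-minute triggers, which is the cost argument behind answer C.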
A data engineer has created a Delta table as part of a data pipeline. Downstream
data analysts now need SELECT permission on the Delta table. Assuming the data
engineer is the Delta table owner, which part of the Databricks Lakehouse Platform
can the data engineer use to grant the data analysts the appropriate access?
A. Repos
B. Jobs
C. Data Explorer
D. Databricks Filesystem
E. Dashboards
answers: C. Data Explorer
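Data Explorer grants this permission through the UI. For comparison, the same grant can be written as a SQL statement. The sketch below only builds that statement as a string; the table name and group name are hypothetical, and running it would require a Databricks notebook where `spark` is already defined.

```python
def build_grant_select(table_name: str, principal: str) -> str:
    """Return a GRANT statement giving `principal` SELECT on `table_name`."""
    # Backticks quote the principal, which may contain characters such as '@'.
    return f"GRANT SELECT ON TABLE {table_name} TO `{principal}`"


# Hypothetical table and group names, used only for illustration.
statement = build_grant_select("sales.orders", "data_analysts")
print(statement)

# Inside a Databricks notebook the table owner could then execute it with:
# spark.sql(statement)
```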
Two junior data engineers are authoring separate parts of a single data pipeline
notebook. They are working on separate Git branches so they can pair program on
the same notebook simultaneously. A senior data engineer experienced in
Databricks suggests there is a better alternative for this type of collaboration.
Which of the following supports the senior data engineer's claim?
A. Databricks Notebooks support automatic change-tracking and versioning
B. Databricks Notebooks support real-time coauthoring on a single notebook
C. Databricks Notebooks support commenting and notification comments
D. Databricks Notebooks support the use of multiple languages in the same
notebook
E. Databricks Notebooks support the creation of interactive data visualizations
answers: B. Databricks Notebooks support real-time coauthoring on a single
notebook
Which of the following describes how Databricks Repos can help facilitate CI/CD
workflows on the Databricks Lakehouse Platform?
A. Databricks Repos can facilitate the pull request, review, and approval process
before merging branches
B. Databricks Repos can merge changes from a secondary Git branch into a main
Git branch
C. Databricks Repos can be used to design, develop, and trigger Git automation
pipelines
D. Databricks Repos can store the single-source-of-truth Git repository
E. Databricks Repos can commit or push code changes to trigger a CI/CD process
answers: E. Databricks Repos can commit or push code changes to trigger a
CI/CD process
Which of the following statements describes Delta Lake?
A. Delta Lake is an open source analytics engine used for big data workloads.
B. Delta Lake is an open format storage layer that delivers reliability, security, and
performance.