
Exam (elaborations)

Nosql And Big Data Exam Questions With Verified And Updated Answers

  • Course
  • Nosql
  • Institution
  • Nosql


  • November 24, 2024
  • 8
  • 2024/2025
  • Exam (elaborations)
  • Questions & answers
TheStar
©THESTAR EXAM SOLUTIONS 2024/2025

ALL RIGHTS RESERVED.



Nosql And Big Data Exam Questions With
Verified And Updated Answers


T/F Spark is a database - answer✔F, it is a query engine.

What does RDD stand for? - answer✔Resilient Distributed Dataset

T/F A transformation changes an RDD. - answer✔F, it defines a NEW RDD based on the current
one. RDDs are immutable.

T/F The line mydata.upper() will trigger an execution for an RDD. - answer✔F, RDDs are not
processed until an action is performed. Upper-casing is applied through a transformation such as
mydata.map(lambda x: x.upper()), which is lazy; upper() itself is a Python string method, not
an RDD method.

T/F Resilience in RDD means that if we lose data in memory, we can redo the transformations
based on the RDD lineage. - answer✔T, you can view the lineage of an RDD using
toDebugString().
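The three answers above (immutability, lazy evaluation, and recovery from lineage) can be illustrated with a small sketch. This is a toy model in plain Python, not Spark's API: the class `ToyRDD`, its method `to_debug_string`, and the helper `load` are hypothetical names chosen for illustration.

```python
class ToyRDD:
    """Toy stand-in for an RDD: it stores a recipe (lineage), not data."""

    def __init__(self, source, lineage=()):
        self._source = source            # callable that produces the base records
        self._lineage = tuple(lineage)   # immutable chain of (kind, function) steps

    def map(self, fn):
        # A transformation returns a NEW ToyRDD; self is left unchanged.
        return ToyRDD(self._source, self._lineage + (("map", fn),))

    def filter(self, pred):
        return ToyRDD(self._source, self._lineage + (("filter", pred),))

    def collect(self):
        # An action: only now is the lineage replayed over the source data.
        records = list(self._source())
        for kind, fn in self._lineage:
            if kind == "map":
                records = [fn(r) for r in records]
            else:
                records = [r for r in records if fn(r)]
        return records

    def to_debug_string(self):
        # Hypothetical analogue of inspecting the lineage.
        return " -> ".join(["source"] + [kind for kind, _ in self._lineage])


calls = []  # records every time the source is actually read


def load():
    calls.append("read")
    return ["the cow eats grass", "the dog barks"]


base = ToyRDD(load)
upper = base.map(lambda line: line.upper())      # lazy: nothing is read yet
cows = upper.filter(lambda line: "COW" in line)  # still lazy
```

Calling `cows.collect()` is the first point at which `load` runs. Calling it a second time reads the source again, which mirrors how lost in-memory results can be rebuilt by replaying the lineage.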

What is pipelining in Spark? - answer✔When possible, Spark performs sequences of
transformations row by row, so no intermediate data is stored.

T/F If we have the line "the cow eats grass" as input in our map function, then the transformation
.map(lambda x: x.upper()) will create a new RDD that transforms the line to uppercase letters. The
line in the new RDD would read "THE COW EATS GRASS". - answer✔T
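Pipelining can be sketched with Python generators, using the same uppercase example. This is only an analogy for how rows flow through a chain of narrow transformations, not Spark code; the function names and the `log` list are assumptions made for the illustration.

```python
def read_lines(lines, log):
    # Source step: hands out one row at a time.
    for line in lines:
        log.append(("read", line))
        yield line


def to_upper(rows, log):
    # Same effect as .map(lambda x: x.upper()) on each row.
    for row in rows:
        log.append(("upper", row))
        yield row.upper()


def split_words(rows, log):
    # A second map-like step chained onto the first.
    for row in rows:
        log.append(("split", row))
        yield row.split()


log = []
lines = ["the cow eats grass", "the dog barks"]

# Each row passes through every step before the next row is read,
# so no intermediate list of uppercased lines is ever built.
pipeline = split_words(to_upper(read_lines(lines, log), log), log)
result = list(pipeline)
```

The `log` shows the interleaving: read, upper, split for the first line, then the same three steps for the second line.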

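T/F Each RDD stores data in memory. - answer✔F, RDDs do NOT store data by default. An RDD
describes how to compute data; results are only kept in memory if the RDD is explicitly cached
or persisted.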

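T/F Spark can work with all types of input file formats. - answer✔T, Spark has built-in readers
for common formats (e.g., text, CSV, JSON, Parquet, ORC) and can read other formats through
Hadoop input formats or external connectors.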

T/F RDDs are partitioned - answer✔T, an RDD is a collection of partitioned data. Tasks
are performed in parallel, one task per partition.

T/F Spark can only run on a cluster with YARN as Resource Manager software. - answer✔F,
Spark can run standalone or with a cluster manager. YARN is one option, but other managers
such as Mesos can also be used.

T/F In Spark with RDDs, a groupBy is a wide transformation - answer✔T, the data for one group
may reside in multiple partitions, so a re-partitioning (shuffle) is required.

What is a narrow transformation in the context of Spark? - answer✔The records required to
compute a record reside in a single partition of the parent RDD (e.g., map, flatMap, filter)

What is a wide transformation in the context of Spark - answer✔Data required to compute
records in a partition may reside in multiple partitions of the parent RDD (e.g., groupBy,
reduceByKey, distinct, join)
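The difference between the two definitions above can be sketched with partitions modeled as plain Python lists. This is a toy model, not Spark's implementation: `narrow_map` and `wide_reduce_by_key` are hypothetical helpers, and hashing the key is just one assumed way to choose a target partition.

```python
from collections import defaultdict
from functools import reduce


def narrow_map(partitions, fn):
    # Narrow: each output partition is computed from exactly one input partition.
    return [[fn(record) for record in part] for part in partitions]


def wide_reduce_by_key(partitions, fn, num_partitions):
    # Wide: records with the same key may sit in different input partitions,
    # so they are first moved (shuffled) to a partition chosen by the key.
    shuffled = [defaultdict(list) for _ in range(num_partitions)]
    for part in partitions:
        for key, value in part:
            shuffled[hash(key) % num_partitions][key].append(value)
    # After the shuffle, each key lives in one partition and can be reduced there.
    return [
        [(key, reduce(fn, values)) for key, values in bucket.items()]
        for bucket in shuffled
    ]


partitions = [["cow", "dog"], ["cow", "cat", "cow"]]
pairs = narrow_map(partitions, lambda word: (word, 1))
counts = wide_reduce_by_key(pairs, lambda a, b: a + b, num_partitions=2)
```

Here "cow" appears in both input partitions, so counting it correctly requires moving data between partitions, which is what makes the second step wide.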

T/F If Spark reads a file from HDFS, then an RDD partition is created for each HDFS block. -
answer✔T, by default an RDD partition is created for each HDFS block of the file.

Where does Spark process data? - answer✔In the main memory of the executors

T/F An execution plan consists of stages. Each stage has a collection of tasks. Each stage only
includes transformations that are narrow. As soon as a wide transformation is applied, a new
stage starts. - answer✔T, operations that can run on the same partition are executed in one
stage. Tasks within a stage are pipelined together. Every time a re-partition is needed, a new
stage starts.
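The stage rule above can be sketched as a small function that cuts a list of transformation names at every wide transformation. This is a simplification under stated assumptions: `split_into_stages` and `WIDE_TRANSFORMATIONS` are hypothetical names, the set only contains the wide examples listed earlier, and the wide transformation is simply placed at the start of the new stage, following the wording of the answer.

```python
# Wide transformations named in the notes; anything else is treated as narrow.
WIDE_TRANSFORMATIONS = {"groupBy", "reduceByKey", "distinct", "join"}


def split_into_stages(transformations):
    """Group transformation names into stages, cutting at each wide one."""
    stages = [[]]
    for name in transformations:
        # A wide transformation needs a re-partition, so it opens a new stage
        # (unless the current stage is still empty).
        if name in WIDE_TRANSFORMATIONS and stages[-1]:
            stages.append([])
        stages[-1].append(name)
    return stages


plan = ["map", "filter", "reduceByKey", "map", "distinct", "filter"]
stages = split_into_stages(plan)
for number, stage in enumerate(stages):
    print(f"stage {number}: {' -> '.join(stage)}")
```

For this plan the function returns three stages: the narrow `map` and `filter` share the first stage, and each wide transformation starts another.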
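
T/F In Spark 3, a reshuffle triggered by a wide transformation will always result in 200 partitions
for the new DataFrame. - answer✔F, 200 is only the default value of `spark.sql.shuffle.partitions`.
Spark 3 has a setting for adaptive optimization: with `spark.sql.adaptive.enabled` set to true,
the number of shuffle partitions can be adjusted at runtime.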
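A minimal sketch of applying these settings, assuming an existing SparkSession object is passed in. The helper `apply_settings` and the dictionary `AQE_SETTINGS` are hypothetical names; the configuration keys are Spark SQL settings, and the values shown are examples rather than recommendations.

```python
# Example values only; "200" is the documented default for shuffle partitions.
AQE_SETTINGS = {
    "spark.sql.adaptive.enabled": "true",
    "spark.sql.adaptive.coalescePartitions.enabled": "true",
    "spark.sql.shuffle.partitions": "200",
}


def apply_settings(spark, settings=AQE_SETTINGS):
    """Apply each setting through spark.conf.set on an existing SparkSession."""
    for key, value in settings.items():
        spark.conf.set(key, value)
    return spark


# Usage, assuming `spark` is an active SparkSession:
# apply_settings(spark)
```

With adaptive execution on, the shuffle partition count becomes a starting point that Spark may coalesce at runtime rather than a fixed result.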

T/F Hive is a database. - answer✔F, Hive is a data warehouse system on top of Hadoop that
allows access to data using HiveQL.

T/F Impala does not rely on MapReduce or Tez, but relies on its own query engine. - answer✔T,
Impala has its own execution engine and does not use MapReduce. It was developed because
MapReduce was too slow for real-time queries.

T/F Impala is fault tolerant - answer✔F, Impala is not fault tolerant. If a node fails while a
query is running, the query also fails.

T/F spark.sql('SELECT * FROM rating') returns a DataFrame - answer✔T, after a temporary view
(here, rating) has been created, spark.sql runs the query against it and returns a DataFrame.
This lets you build DataFrames with a query language you may be more familiar with.

A wide column database is a - answer✔Row store, i.e., data in one row is stored together

What are the strengths of a wide column database? - answer✔Fast reads and writes of data
Scalability to large amounts of data
High availability of the system


