20 questions
________________ refers to the biases, noise, and abnormality in data, that is, the trustworthiness of the data.
Value
Veracity
Velocity
Volume
_______________ refers to the connectedness of big data.
Value
Veracity
Velocity
Valence
Consider the following statements:
Statement 1: Volatility refers to the data velocity relative to the timescale of the event being studied.
Statement 2: Viscosity refers to the rate of data loss and the stable lifetime of data.
Only statement 1 is true
Only statement 2 is true
Both statements are true
Both statements are false
What are the main components of the Hadoop ecosystem?
MapReduce, HDFS, YARN
MLlib, GraphX
Gelly, Table, CEP
None of the mentioned
True or False?
NoSQL databases store unstructured data with no particular schema.
True
False
Which of the following is not a NoSQL database?
HBase
SQL Server
Cassandra
None of the mentioned
________________ is a resource management platform responsible for managing compute resources in the cluster and using them to schedule users and applications.
Hadoop Common
Hadoop Distributed File System (HDFS)
Hadoop YARN
Hadoop MapReduce
Which of the following tools is designed for efficiently transferring bulk data between Apache Hadoop and structured datastores such as relational databases?
Apache Sqoop
Pig
Mahout
Flume
Consider the following statements:
Statement 1: The Job Tracker is hosted inside the master and it receives the job execution request from the client.
Statement 2: The Task Tracker is the MapReduce component on the slave machine; because there are multiple slave machines, there are multiple Task Trackers.
Only statement 1 is true
Only statement 2 is true
Both statements are true
Both statements are false
_____________ is the slave/worker node and holds the user data in the form of data blocks.
NameNode
DataNode
Data block
Replication
The number of maps in MapReduce is usually driven by the total size of ________________
Inputs
Outputs
Tasks
None of the mentioned
_______________ function processes a key/value pair to generate a set of intermediate key/value pairs.
Map
Reduce
Both Map and Reduce
None of the mentioned
True or False?
The main duties of the Task Tracker are to break down the received job (a big computation) into small parts, allocate the partial computations (tasks) to the slave nodes, and monitor the progress and reports of task execution from the slaves.
True
False
Point out the correct statement in the context of YARN:
YARN extends the power of Hadoop to incumbent and new technologies found within the data center
YARN is highly scalable
YARN enhances a Hadoop compute cluster in many ways
All of the mentioned
Apache Hadoop YARN stands for:
Yet Another Reserve Negotiator
Yet Another Resource Network
Yet Another Resource Negotiator
Yet Another Resource Manager
Consider the pseudo-code for MapReduce's WordCount example (not shown here). Let's now assume that you want to determine the frequency of phrases consisting of 3 words each instead of determining the frequency of single words. Which part of the pseudo-code do you need to adapt?
Only map()
Only reduce()
Both map() and reduce()
The code does not have to be changed
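The WordCount pseudo-code that this question refers to is not reproduced in the quiz. For orientation, a minimal single-word WordCount can be sketched as follows. This is an in-memory Python simulation, not Hadoop's actual Java API: the names `map_fn`, `reduce_fn`, and `run_wordcount` are hypothetical, and the shuffle step is approximated with a dictionary that groups values by key.

```python
from collections import defaultdict


def map_fn(_key, line):
    """Emit an intermediate (word, 1) pair for every word in one input line."""
    for word in line.lower().split():
        yield word, 1


def reduce_fn(word, counts):
    """Sum all counts that were grouped under a single word."""
    return word, sum(counts)


def run_wordcount(lines):
    """Simulate the map, shuffle (group by key), and reduce phases in memory."""
    grouped = defaultdict(list)
    # Map phase: the key is the line offset, which this mapper ignores.
    for offset, line in enumerate(lines):
        for word, count in map_fn(offset, line):
            # Shuffle phase (assumed simplification): collect values per key.
            grouped[word].append(count)
    # Reduce phase: one call per distinct key.
    return dict(reduce_fn(word, counts) for word, counts in grouped.items())


if __name__ == "__main__":
    print(run_wordcount(["the quick fox", "the lazy dog the end"]))
```

When answering, consider which of these functions decides what the key is and which one only aggregates the values it receives for a key.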
For which of the following operations is NO communication with the NameNode required?
A client writing a file to HDFS.
A client requesting the filename of a given block of data.
A client reading a block of data from the cluster.
A client reading a file from the cluster.
Which of the following components reside on a NameNode?
Filenames, blocks and checksums
Blocks and heartbeat messages
Filenames, block locations
Blocks and block locations
Consider the pseudo-code for MapReduce's WordCount example (not shown here). Let's now assume that you want to determine the average number of words per sentence. Which part of the (pseudo-)code do you need to adapt?
Only map()
Only reduce()
map() and reduce()
The code does not have to be changed.
Which of the following statements are true about key/value pairs in Hadoop?
A map() function can emit up to a maximum number of key/value pairs (depending on the Hadoop environment).
A map() function can emit anything between zero and an unlimited number of key/value pairs.
A reduce() function can iterate over key/value pairs multiple times.
A call to reduce() is guaranteed to receive key/value pairs from only one key.