1. Columns in HBase are organized into
A. Column group
B. Column list
C. Column base
D. Column families
Answer: D) Column families
Explanation: An HBase table consists of column families, which are logical and physical groupings of columns. Columns of one family are stored separately from columns of other families.
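For illustration, a minimal sketch using the HBase Java client (the table name "users" and the families "profile" and "activity" are made-up examples) that declares two column families when creating a table:

    import org.apache.hadoop.conf.Configuration;
    import org.apache.hadoop.hbase.HBaseConfiguration;
    import org.apache.hadoop.hbase.TableName;
    import org.apache.hadoop.hbase.client.Admin;
    import org.apache.hadoop.hbase.client.ColumnFamilyDescriptorBuilder;
    import org.apache.hadoop.hbase.client.Connection;
    import org.apache.hadoop.hbase.client.ConnectionFactory;
    import org.apache.hadoop.hbase.client.TableDescriptorBuilder;

    public class CreateTableExample {
        public static void main(String[] args) throws Exception {
            Configuration conf = HBaseConfiguration.create();
            try (Connection connection = ConnectionFactory.createConnection(conf);
                 Admin admin = connection.getAdmin()) {
                // Column families are declared up front; individual columns
                // inside a family can be added freely at write time.
                admin.createTable(
                    TableDescriptorBuilder.newBuilder(TableName.valueOf("users"))
                        .setColumnFamily(ColumnFamilyDescriptorBuilder.of("profile"))
                        .setColumnFamily(ColumnFamilyDescriptorBuilder.of("activity"))
                        .build());
            }
        }
    }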
2. HBase is a _____________ distributed database built on top of the Hadoop file system.
A. Row-oriented
B. Tuple-oriented
C. Column-oriented
D. None of the mentioned
Answer: C) Column-oriented
Explanation: HBase is a column-oriented distributed data store capable of scaling horizontally to 1,000 standard servers and petabytes of indexed storage.
3. A small chunk of data residing on one machine, which is part of a cluster of machines holding one HBase table, is known as a
A. Region
B. Split
C. Rowarea
D. Tablearea
Answer: A) Region
Explanation: In HBase, a table is split into regions, and each region is served by a region server.
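As a hedged sketch of the same idea, the HBase Java client lets you create a table pre-split into several regions by supplying split keys (the table name, family, and keys below are illustrative only):

    import org.apache.hadoop.hbase.HBaseConfiguration;
    import org.apache.hadoop.hbase.TableName;
    import org.apache.hadoop.hbase.client.Admin;
    import org.apache.hadoop.hbase.client.ColumnFamilyDescriptorBuilder;
    import org.apache.hadoop.hbase.client.Connection;
    import org.apache.hadoop.hbase.client.ConnectionFactory;
    import org.apache.hadoop.hbase.client.TableDescriptorBuilder;
    import org.apache.hadoop.hbase.util.Bytes;

    public class PreSplitTableExample {
        public static void main(String[] args) throws Exception {
            try (Connection connection = ConnectionFactory.createConnection(HBaseConfiguration.create());
                 Admin admin = connection.getAdmin()) {
                // Three split keys -> the table starts with four regions, each of
                // which the master assigns to some region server in the cluster.
                byte[][] splitKeys = {Bytes.toBytes("g"), Bytes.toBytes("n"), Bytes.toBytes("t")};
                admin.createTable(
                    TableDescriptorBuilder.newBuilder(TableName.valueOf("users"))
                        .setColumnFamily(ColumnFamilyDescriptorBuilder.of("profile"))
                        .build(),
                    splitKeys);
            }
        }
    }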
4. In HBase, a _________ is a combination of row, column family, and column qualifier, and contains a value and a timestamp.
A. Cell
B. Stores
C. HMaster
D. Region Server
Answer: A) Cell
Explanation: Data is stored in the cells of an HBase table. A cell is addressed by the combination of row, column family, and column qualifier, and it holds a value and a timestamp.
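A minimal sketch with the HBase Java client, assuming a table "users" with a family "profile" already exists, showing that a read returns a Cell carrying both the value and its timestamp:

    import org.apache.hadoop.hbase.Cell;
    import org.apache.hadoop.hbase.CellUtil;
    import org.apache.hadoop.hbase.HBaseConfiguration;
    import org.apache.hadoop.hbase.TableName;
    import org.apache.hadoop.hbase.client.Connection;
    import org.apache.hadoop.hbase.client.ConnectionFactory;
    import org.apache.hadoop.hbase.client.Get;
    import org.apache.hadoop.hbase.client.Put;
    import org.apache.hadoop.hbase.client.Result;
    import org.apache.hadoop.hbase.client.Table;
    import org.apache.hadoop.hbase.util.Bytes;

    public class CellExample {
        public static void main(String[] args) throws Exception {
            try (Connection connection = ConnectionFactory.createConnection(HBaseConfiguration.create());
                 Table table = connection.getTable(TableName.valueOf("users"))) {
                byte[] row = Bytes.toBytes("user1");
                byte[] family = Bytes.toBytes("profile");
                byte[] qualifier = Bytes.toBytes("email");
                // Write one cell: row + family + qualifier -> value (timestamp assigned by the server).
                table.put(new Put(row).addColumn(family, qualifier, Bytes.toBytes("a@example.com")));
                // Read it back; the Cell carries the value and the timestamp.
                Result result = table.get(new Get(row).addColumn(family, qualifier));
                Cell cell = result.getColumnLatestCell(family, qualifier);
                System.out.println(Bytes.toString(CellUtil.cloneValue(cell)) + " @ " + cell.getTimestamp());
            }
        }
    }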
5. HBase architecture has 3 main components:
A. Client, Column family, Region Server
B. Cell, Rowkey, Stores
C. HMaster, Region Server, Zookeeper
D. HMaster, Stores, Region Server
Answer: C) HMaster, Region Server, Zookeeper
Explanation: HBase architecture has 3 main components: HMaster, Region Server, Zookeeper.
1. HMaster: HMaster is the Master Server implementation in HBase. It is the process that assigns regions to region servers and handles DDL operations (creating and dropping tables), and it monitors all region server instances in the cluster.
2. Region Servers: An HBase table is divided horizontally by row-key range into regions. Regions are the basic building blocks of an HBase cluster; they hold the distributed portions of tables, organized into column families. Region servers run on the HDFS DataNodes of the Hadoop cluster.
3. ZooKeeper: ZooKeeper acts as the coordinator in HBase. It provides services such as maintaining configuration information, naming, distributed synchronization, and server-failure notification. Clients locate region servers via ZooKeeper.
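To illustrate the last point, a minimal client sketch (the ZooKeeper hostnames are placeholders): the client is configured only with the ZooKeeper quorum and discovers the HMaster and region server locations through it:

    import org.apache.hadoop.conf.Configuration;
    import org.apache.hadoop.hbase.HBaseConfiguration;
    import org.apache.hadoop.hbase.client.Connection;
    import org.apache.hadoop.hbase.client.ConnectionFactory;

    public class ZkConnectionExample {
        public static void main(String[] args) throws Exception {
            Configuration conf = HBaseConfiguration.create();
            // The client is only told where ZooKeeper is; region server
            // addresses are looked up through ZooKeeper, not configured here.
            conf.set("hbase.zookeeper.quorum", "zk1.example.com,zk2.example.com,zk3.example.com");
            conf.set("hbase.zookeeper.property.clientPort", "2181");
            try (Connection connection = ConnectionFactory.createConnection(conf)) {
                System.out.println("Connected: " + !connection.isClosed());
            }
        }
    }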
6. HBase stores data in
A. As many filesystems as the number of region servers
B. One filesystem per column family
C. A single filesystem available to all region servers
D. One filesystem per table
Answer: C) A single filesystem available to all region servers
Explanation: HBase stores its data files (HFiles) and write-ahead logs in HDFS, a single distributed filesystem that is shared by and available to all region servers.
7. Kafka runs as a cluster of one or more servers, each of which is called a
A. cTakes
B. Chunks
C. Broker
D. None of the mentioned
Answer: C) Broker
Explanation: A Kafka broker allows consumers to fetch messages by topic, partition, and offset. Kafka brokers form a Kafka cluster by sharing information among themselves directly or indirectly using ZooKeeper. A Kafka cluster has exactly one broker acting as the controller.
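A minimal consumer sketch illustrating retrieval by topic, partition, and offset (the broker address, the topic "events", and offset 42 are placeholders):

    import java.time.Duration;
    import java.util.Collections;
    import java.util.Properties;
    import org.apache.kafka.clients.consumer.ConsumerRecord;
    import org.apache.kafka.clients.consumer.ConsumerRecords;
    import org.apache.kafka.clients.consumer.KafkaConsumer;
    import org.apache.kafka.common.TopicPartition;
    import org.apache.kafka.common.serialization.StringDeserializer;

    public class FetchByOffsetExample {
        public static void main(String[] args) {
            Properties props = new Properties();
            props.put("bootstrap.servers", "broker1.example.com:9092");
            props.put("key.deserializer", StringDeserializer.class.getName());
            props.put("value.deserializer", StringDeserializer.class.getName());
            try (KafkaConsumer<String, String> consumer = new KafkaConsumer<>(props)) {
                // Address messages by topic + partition, then seek to an offset.
                TopicPartition tp = new TopicPartition("events", 0);
                consumer.assign(Collections.singletonList(tp));
                consumer.seek(tp, 42L);
                ConsumerRecords<String, String> records = consumer.poll(Duration.ofSeconds(1));
                for (ConsumerRecord<String, String> record : records) {
                    System.out.printf("partition=%d offset=%d value=%s%n",
                            record.partition(), record.offset(), record.value());
                }
            }
        }
    }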
8. True or False?
Statement 1: Batch Processing provides the ability to process and analyze data at rest (stored data).
Statement 2: Stream Processing provides the ability to ingest, process, and analyze data in motion in real or near-real time.
A. Only statement 1 is true
B. Only statement 2 is true
C. Both statements are true
D. Both statements are false
Answer: C) Both statements are true
9. _________________ is a central hub to transport and store event streams in real time.
A. Kafka Core
B. Kafka Connect
C. Kafka Streams
D. None of the mentioned
Answer: A) Kafka Core
Explanation: Kafka Core is a central hub to transport and store event streams in real time.
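A minimal producer sketch showing Kafka acting as such a hub, publishing one event to a topic (the broker address and topic name are placeholders):

    import java.util.Properties;
    import org.apache.kafka.clients.producer.KafkaProducer;
    import org.apache.kafka.clients.producer.ProducerRecord;
    import org.apache.kafka.common.serialization.StringSerializer;

    public class PublishEventExample {
        public static void main(String[] args) {
            Properties props = new Properties();
            props.put("bootstrap.servers", "broker1.example.com:9092");
            props.put("key.serializer", StringSerializer.class.getName());
            props.put("value.serializer", StringSerializer.class.getName());
            try (KafkaProducer<String, String> producer = new KafkaProducer<>(props)) {
                // Publish an event to a topic; the cluster stores it durably so any
                // number of consumers can read it in real time or replay it later.
                producer.send(new ProducerRecord<>("page-views", "user1", "viewed /home"));
                producer.flush();
            }
        }
    }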
10. Which parameters are defined to specify a window operation?
A. State size, window length
B. State size, sliding interval
C. Window length, sliding interval
D. None of the mentioned
Answer: C) Window length, sliding interval
Explanation: The following parameters are used to specify a window operation:
(i) Window length: the duration of the window
(ii) Sliding interval: the interval at which the window operation is performed
Both parameters must be a multiple of the batch interval.
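The phrase "batch interval" matches Spark Streaming's DStream API, so here is a minimal Java sketch under that assumption (the host, port, and durations are illustrative):

    import org.apache.spark.SparkConf;
    import org.apache.spark.streaming.Durations;
    import org.apache.spark.streaming.api.java.JavaDStream;
    import org.apache.spark.streaming.api.java.JavaStreamingContext;

    public class WindowExample {
        public static void main(String[] args) throws Exception {
            SparkConf conf = new SparkConf().setAppName("WindowExample").setMaster("local[2]");
            // Batch interval: 10 seconds.
            JavaStreamingContext jssc = new JavaStreamingContext(conf, Durations.seconds(10));
            JavaDStream<String> lines = jssc.socketTextStream("localhost", 9999);
            // Window length 30 s, sliding interval 10 s -- both multiples of the batch interval.
            JavaDStream<String> windowed = lines.window(Durations.seconds(30), Durations.seconds(10));
            windowed.count().print();
            jssc.start();
            jssc.awaitTermination();
        }
    }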
11. _________________ is a Java library to process event streams live as they occur.
A. Kafka Core
B. Kafka Connect
C. Kafka Streams
D. None of the mentioned
Answer: C) Kafka Streams
Explanation: Kafka Streams is a Java library to process event streams live as they occur.
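A minimal Kafka Streams sketch in Java, reading from one topic, transforming each record as it arrives, and writing to another (the application id, broker address, and topic names are placeholders):

    import java.util.Properties;
    import org.apache.kafka.common.serialization.Serdes;
    import org.apache.kafka.streams.KafkaStreams;
    import org.apache.kafka.streams.StreamsBuilder;
    import org.apache.kafka.streams.StreamsConfig;
    import org.apache.kafka.streams.kstream.KStream;

    public class UppercaseStreamExample {
        public static void main(String[] args) {
            Properties props = new Properties();
            props.put(StreamsConfig.APPLICATION_ID_CONFIG, "uppercase-example");
            props.put(StreamsConfig.BOOTSTRAP_SERVERS_CONFIG, "broker1.example.com:9092");
            props.put(StreamsConfig.DEFAULT_KEY_SERDE_CLASS_CONFIG, Serdes.String().getClass());
            props.put(StreamsConfig.DEFAULT_VALUE_SERDE_CLASS_CONFIG, Serdes.String().getClass());

            // Read from an input topic, transform each event as it occurs,
            // and write the result to an output topic.
            StreamsBuilder builder = new StreamsBuilder();
            KStream<String, String> source = builder.stream("input-topic");
            source.mapValues(value -> value.toUpperCase()).to("output-topic");

            KafkaStreams streams = new KafkaStreams(builder.build(), props);
            streams.start();
            Runtime.getRuntime().addShutdownHook(new Thread(streams::close));
        }
    }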