Friday, September 24, 2021

Big Data Computing: Quiz Assignment-IV Solutions (Week-4)

1. Identify the correct choices for the given scenarios:
P: The system allows operations all the time, and operations return quickly
Q: All nodes see same data at any time, or reads return latest written value by any client
R: The system continues to work in spite of network partitions
A. P: Consistency, Q: Availability, R: Partition tolerance
B. P: Availability, Q: Consistency, R: Partition tolerance
C. P: Partition tolerance, Q: Consistency, R: Availability
D. P: Consistency, Q: Partition tolerance, R: Availability
 

Answer: B) P: Availability, Q: Consistency, R: Partition tolerance 

Explanation:
The CAP theorem concerns the following three properties, of which a distributed system can guarantee at most two at the same time:
Consistency: All nodes see same data at any time, or reads return latest written value by any client.
Availability: The system allows operations all the time, and operations return quickly.
Partition-tolerance: The system continues to work in spite of network partitions.

2. Cassandra uses a protocol called ________ to discover location and state information about the other nodes participating in a Cassandra cluster.
A. Key-value
B. Memtable
C. Heartbeat
D. Gossip

Answer: D) Gossip

Explanation: Cassandra uses a protocol called Gossip to obtain information about the location and status of the other nodes participating in a Cassandra cluster. Gossip is a peer-to-peer communication protocol in which nodes regularly exchange status information about themselves and about other nodes they know.
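The exchange of status information can be illustrated with a minimal Python sketch. It is not Cassandra's implementation: the Node class, the heartbeat counter, and the merge rule (keep the highest heartbeat seen per node) are assumptions chosen only to show how knowledge about one node spreads through peers.

```python
class Node:
    """Toy cluster member (hypothetical, for illustration only)."""

    def __init__(self, name):
        self.name = name
        # What this node believes about the cluster: node name -> heartbeat.
        self.view = {name: 0}

    def tick(self):
        # A node bumps its own heartbeat to signal that it is alive.
        self.view[self.name] += 1

    def gossip_with(self, peer):
        # Both sides end up with the freshest heartbeat known for each node.
        merged = dict(self.view)
        for name, beat in peer.view.items():
            merged[name] = max(merged.get(name, -1), beat)
        self.view = dict(merged)
        peer.view = dict(merged)


a, b, c = Node("a"), Node("b"), Node("c")
a.tick()
a.tick()
a.gossip_with(b)   # b learns about a
b.gossip_with(c)   # c learns about a through b, without talking to a
print(c.view)
```

Note that c never contacts a directly; it learns a's state second-hand, which is the point of a gossip protocol.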


3. In Cassandra, ________ is used to specify data centers and the number of replicas to place within each data center. It attempts to place replicas on distinct racks to guard against node failure and to ensure data availability.
A. Simple strategy
B. Quorum strategy
C. Network topology strategy
D. None of the mentioned 

Answer: C) Network topology strategy

Explanation: The network topology strategy is used to specify the data centers and the number of replicas to be placed in each data center. It tries to place replicas on different racks to guard against node failure and ensure data availability. With the network topology strategy, the two most common ways to configure multiple data center clusters are two replicas in each data center and three replicas in each data center.
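The rack-aware placement idea can be sketched in Python as follows. This is a simplified illustration, not Cassandra's actual algorithm: the function name place_replicas, the ring list, and the topology dictionary are all hypothetical. The sketch walks the ring, prefers nodes on racks not yet used in a data center, and falls back to already-used racks only when it runs out of distinct ones.

```python
def place_replicas(ring, topology, replicas_per_dc):
    """ring: node names in ring order, starting at the key's position.
    topology: node -> (data_center, rack), as a snitch would report it.
    replicas_per_dc: data_center -> number of replicas wanted there."""
    chosen = {dc: [] for dc in replicas_per_dc}
    used_racks = {dc: set() for dc in replicas_per_dc}
    skipped = {dc: [] for dc in replicas_per_dc}

    for node in ring:
        dc, rack = topology[node]
        if dc not in replicas_per_dc or len(chosen[dc]) >= replicas_per_dc[dc]:
            continue
        if rack in used_racks[dc]:
            # Remember the node in case there are not enough distinct racks.
            skipped[dc].append(node)
        else:
            chosen[dc].append(node)
            used_racks[dc].add(rack)

    # Fall back to nodes on already-used racks if a data center is short.
    for dc, wanted in replicas_per_dc.items():
        missing = wanted - len(chosen[dc])
        chosen[dc].extend(skipped[dc][:missing])
    return chosen


ring = ["n1", "n2", "n3", "n4", "n5", "n6"]
topology = {
    "n1": ("DC1", "r1"), "n2": ("DC1", "r1"), "n3": ("DC1", "r2"),
    "n4": ("DC2", "r1"), "n5": ("DC2", "r1"), "n6": ("DC1", "r3"),
}
placement = place_replicas(ring, topology, {"DC1": 2, "DC2": 2})
print(placement)
```

In DC1 the node n2 is skipped because it shares rack r1 with n1; in DC2 only one rack exists, so both replicas land on it.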

 

4. True or False?
A snitch determines which data centers and racks nodes belong to. Snitches inform Cassandra about the network topology so that requests are routed efficiently, and they allow Cassandra to distribute replicas by grouping machines into data centers and racks.
A. True
B. False

Answer: True

Explanation: A snitch determines which data centers and racks nodes belong to. Snitches inform Cassandra about the network topology so that requests can be routed efficiently and so that Cassandra can distribute replicas by grouping machines into data centers and racks. In particular, the replication strategy places the replicas based on the information provided by the snitch. All nodes must keep reporting the same rack and data center. Cassandra does its best not to place more than one replica on the same rack (which is not necessarily a physical location).


5. Consider the following statements:
Statement 1: In Cassandra, during a write operation with hinted handoff enabled, if any replica is down, the coordinator writes to all other replicas and keeps the write locally until the down replica comes back up.
Statement 2: In Cassandra, Ec2Snitch is an important snitch for cloud deployments. It is a simple snitch for Amazon EC2 deployments where all nodes are in a single region. With Ec2Snitch, the region name refers to the data center and the availability zone refers to the rack in a cluster.
A. Only Statement 1 is true
B. Only Statement 2 is true
C. Both Statements are true
D. Both Statements are false

Answer: C) Both Statements are true

Explanation: Statement 1 describes hinted handoff: if a replica is down during a write, the coordinator writes to all other replicas and keeps the write locally until the down replica comes back up. Statement 2 describes Ec2Snitch, a simple snitch for Amazon EC2 deployments in a single region, where the region name is treated as the data center and the availability zone as the rack. Both statements are true.
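The hinted handoff behaviour from Statement 1 can be sketched as a toy Python model. The Replica and Coordinator classes and the replay_hints method are hypothetical names for illustration; real Cassandra stores hints durably and expires them after a configured window, which this sketch ignores.

```python
class Replica:
    def __init__(self, name):
        self.name = name
        self.up = True
        self.data = {}


class Coordinator:
    def __init__(self, replicas):
        self.replicas = replicas
        # Writes that could not be delivered: (replica name, key, value).
        self.hints = []

    def write(self, key, value):
        for replica in self.replicas:
            if replica.up:
                replica.data[key] = value
            else:
                # Keep the write locally until the replica is back.
                self.hints.append((replica.name, key, value))

    def replay_hints(self):
        by_name = {r.name: r for r in self.replicas}
        remaining = []
        for name, key, value in self.hints:
            target = by_name[name]
            if target.up:
                target.data[key] = value
            else:
                remaining.append((name, key, value))
        self.hints = remaining


r1, r2, r3 = Replica("r1"), Replica("r2"), Replica("r3")
coordinator = Coordinator([r1, r2, r3])
r3.up = False
coordinator.write("k", "v")     # r1 and r2 get the write, r3 gets a hint
pending = len(coordinator.hints)
r3.up = True
coordinator.replay_hints()      # the hint is delivered to r3
```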

 

6. What is eventual consistency?
A. At any time, the system is linearizable
B. If writes stop, all reads will return the same value after a while
C. At any time, concurrent reads from any node return the same values
D. If writes stop, a distributed system will become consistent

Answer: B) If writes stop, all reads will return the same value after a while

Explanation: Cassandra offers eventual consistency. It says that if writes to a key stop, all replicas of that key will converge automatically.
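Convergence after writes stop can be illustrated with a small Python sketch. It assumes each replica stores a (timestamp, value) pair per key and that replicas reconcile pairwise by keeping the newest timestamp. The merge function and the data layout are hypothetical and much simpler than a real repair mechanism.

```python
def merge(replica_a, replica_b):
    """Pairwise reconciliation: both replicas keep, for every key,
    the (timestamp, value) pair with the newest timestamp."""
    for key in set(replica_a) | set(replica_b):
        newest = max(
            replica_a.get(key, (0, None)),
            replica_b.get(key, (0, None)),
            key=lambda pair: pair[0],
        )
        replica_a[key] = newest
        replica_b[key] = newest


# Three replicas that disagree after the last write (timestamps start at 1).
r1 = {"x": (1, "old")}
r2 = {"x": (2, "new")}
r3 = {}

# No further writes arrive; the replicas only exchange state.
merge(r1, r2)
merge(r2, r3)
merge(r1, r3)
print(r1, r2, r3)
```

Once the exchanges have covered every replica, all reads of x return the same value regardless of which replica serves them.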

 

7. Consider the following statements:
Statement 1: When two processes are competing with each other causing data corruption, it is called deadlock
Statement 2: When two processes are waiting for each other directly or indirectly, it is called race condition
A. Only Statement 1 is true
B. Only Statement 2 is true
C. Both Statements are false
D. Both Statements are true 

Answer: C) Both Statements are false 

Explanation: The two definitions are swapped. The correct statements are:
Statement 1: When two processes are competing with each other causing data corruption, it is called a race condition.
Statement 2: When two processes are waiting for each other directly or indirectly, it is called a deadlock.
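Both problems can be shown with Python threads. In the sketch below, SafeCounter guards a shared counter with a lock so that concurrent increments are not lost (avoiding a race condition), and with_both always acquires two locks in one fixed order so that two threads cannot each hold one lock while waiting for the other (avoiding a deadlock). All names are hypothetical and chosen for illustration.

```python
import threading


class SafeCounter:
    def __init__(self):
        self.value = 0
        self._lock = threading.Lock()

    def increment(self):
        # Without the lock, the read-modify-write below could interleave
        # between threads and lose updates (a race condition).
        with self._lock:
            self.value += 1


def run_counter(threads=4, increments=10_000):
    counter = SafeCounter()

    def work():
        for _ in range(increments):
            counter.increment()

    workers = [threading.Thread(target=work) for _ in range(threads)]
    for worker in workers:
        worker.start()
    for worker in workers:
        worker.join()
    return counter.value


def with_both(first, second, action=lambda: None):
    # Acquire the two locks in one global order (here: by id), no matter
    # in which order the caller names them. This prevents the circular
    # wait that defines a deadlock.
    low, high = sorted((first, second), key=id)
    with low:
        with high:
            action()
```

Calling with_both(a, b) in one thread and with_both(b, a) in another is safe, because both threads end up taking the locks in the same order.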


8. ZooKeeper allows distributed processes to coordinate with each other through registers, known as ________.
A. znodes
B. hnodes
C. vnodes
D. rnodes

Answer: A) znodes

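Explanation: ZooKeeper's registers are called znodes and are organized in a hierarchical namespace, much like a file system. Every znode is identified by a path, with path elements separated by a slash.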



9. In ZooKeeper, when a ________ is triggered, the client receives a packet saying that the znode has changed.
A. Event
B. Row
C. Watch
D. Value

Answer: C) Watch

Explanation: ZooKeeper supports the concept of watches. Clients can set a watch on a znode.
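Znode paths and watches can be modelled with a toy in-memory class in Python. This is not a ZooKeeper client: ToyZooKeeper and its methods are hypothetical, and the sketch only captures two ideas, namely that znodes live under slash-separated paths with existing parents, and that a standard watch fires once and must then be set again.

```python
class ToyZooKeeper:
    """In-memory stand-in for a znode tree (illustration only)."""

    def __init__(self):
        self.znodes = {"/": b""}
        self.watches = {}  # path -> callbacks waiting for the next change

    def create(self, path, data=b""):
        parent = path.rsplit("/", 1)[0] or "/"
        if parent not in self.znodes:
            raise KeyError(f"parent znode {parent} does not exist")
        self.znodes[path] = data

    def get(self, path, watch=None):
        if watch is not None:
            self.watches.setdefault(path, []).append(watch)
        return self.znodes[path]

    def set(self, path, data):
        if path not in self.znodes:
            raise KeyError(f"znode {path} does not exist")
        self.znodes[path] = data
        # pop() removes the callbacks, so each watch fires only once.
        for callback in self.watches.pop(path, []):
            callback(path)


zk = ToyZooKeeper()
zk.create("/app")
zk.create("/app/config", b"v1")

events = []
zk.get("/app/config", watch=events.append)  # set a watch while reading
zk.set("/app/config", b"v2")                # triggers the watch
zk.set("/app/config", b"v3")                # no watch left, no notification
```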


10. Consider the Table temperature_details in Keyspace “day3” with schema as follows:
temperature_details(daynum, year,month,date,max_temp)
with primary key(daynum,year,month,date) 

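 DayNum | Year | Month | Date | MaxTemp (°C)
--------+------+-------+------+--------------
      1 | 1943 |    10 |    1 |         14.1
      2 | 1943 |    10 |    2 |         16.4
    541 | 1945 |     3 |   24 |         21.1
   9970 | 1971 |     1 |   16 |         21.4
  20174 | 1998 |    12 |   24 |         36.7
  21223 | 2001 |    11 |    7 |           16
   4317 | 1955 |     7 |   26 |         16.7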

The same maximum temperature can occur at different hours of the same day. Choose the correct sequence of CQL statements to:

Alter table temperature_details to add a new column called “seasons” using a map of type <varint, text>, representing <month, season>. A season can take one of the values {spring, summer, autumn, winter}.
Update table temperature_details where the columns daynum, year, month, date contain the values 4317, 1955, 7, 26 respectively.
Use a select statement to output the row after the update.
Note: A map relates one item to another with a key-value pair. For each key, only one value may exist, and duplicates cannot be stored. Both the key and the value are designated with a data type.

A)
cqlsh:day3> alter table temperature_details add hours1 set<varint>;
cqlsh:day3> update temperature_details set hours1={1,5,9,13,5,9} where daynum=4317;
cqlsh:day3> select * from temperature_details where daynum=4317;


B)
cqlsh:day3> alter table temperature_details add seasons map<varint,text>;
cqlsh:day3> update temperature_details set seasons = seasons + {7:'spring'} where daynum=4317 and year =1955 and month = 7 and date=26;
cqlsh:day3> select * from temperature_details where daynum=4317 and year=1955 and month=7 and date=26;


C)
cqlsh:day3> alter table temperature_details add hours1 list<varint>;
cqlsh:day3> update temperature_details set hours1=[1,5,9,13,5,9] where daynum=4317 and year = 1955 and month = 7 and date=26;
cqlsh:day3> select * from temperature_details where daynum=4317 and year=1955 and month=7 and date=26;


D)
cqlsh:day3> alter table temperature_details add seasons map<month, season>;
cqlsh:day3> update temperature_details set seasons = seasons + {7:'spring'} where daynum=4317;
cqlsh:day3> select * from temperature_details where daynum=4317;

Answer: B)
cqlsh:day3> alter table temperature_details add seasons map<varint,text>;
cqlsh:day3> update temperature_details set seasons = seasons + {7:'spring'} where daynum=4317 and year =1955 and month = 7 and date=26;
cqlsh:day3> select * from temperature_details where daynum=4317 and year=1955 and month=7 and date=26;


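Explanation:
Options A and C add a set or list column named hours1 instead of the requested map column. Option D declares the map with column names (month, season) rather than data types, and its WHERE clause omits part of the primary key. The correct steps are: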
a) Add column “seasons”
cqlsh:day3> alter table temperature_details add seasons map<varint,text>;

b) Update table
cqlsh:day3> update temperature_details set seasons = seasons + {7:'spring'} where daynum=4317 and year =1955 and month = 7 and date=26;

c) Select query
cqlsh:day3> select * from temperature_details where daynum=4317 and year=1955 and month=7 and date=26;

 

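 daynum | year | month | date | hours      | hours1         | max_temp | seasons
--------+------+-------+------+------------+----------------+----------+--------------
   4317 | 1955 |     7 |   26 | {1,5,9,13} | [1,5,9,13,5,9] |     16.7 | {7:'spring'}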
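The collection values in this row follow from how CQL maps, sets, and lists treat duplicates. The Python sketch below only mimics those semantics with built-in types; add_to_map is a hypothetical helper, not part of any Cassandra driver, and no database is contacted.

```python
def add_to_map(current, new_entries):
    """Mimics `seasons = seasons + {...}`: entries are merged into the
    map, and an existing key has its value replaced."""
    merged = dict(current or {})  # a column that was never set holds no entries
    merged.update(new_entries)
    return merged


# map<varint, text>: one value per key.
seasons = add_to_map(None, {7: "spring"})

# set<varint>: duplicates collapse, as in the hours column.
hours = set([1, 5, 9, 13, 5, 9])

# list<varint>: order and duplicates are kept, as in the hours1 column.
hours1 = list([1, 5, 9, 13, 5, 9])

print(seasons, sorted(hours), hours1)
```

This is why the same six values appear as four elements in hours but as six elements in hours1.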
