
Sunday, October 24, 2021

Big Data Computing: Quiz Assignment-VIII Solutions (Week-8)

1. Which of the following are provided by the Spark API for graph-parallel computations?
i. joinVertices
ii. subgraph
iii. aggregateMessages
A. Only (i)
B. Only (i) and (ii)
C. Only (ii) and (iii)
D. All of the mentioned
Answer: D) All of the mentioned


2. Which of the following statement(s) is/are true in the context of Apache Spark GraphX operators?
S1: Property operators modify the vertex or edge properties using a user-defined map function and produce a new graph.
S2: Structural operators operate on the structure of an input graph and produce a new graph.
S3: Join operators add data to graphs and produce new graphs.
A. Only S1 is true
B. Only S2 is true
C. Only S3 is true
D. All of the mentioned
Answer: D) All of the mentioned
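
To make the operator families in this question concrete, here is a minimal sketch in plain Python rather than the GraphX API itself. `ToyGraph`, `map_vertices`, and `subgraph` are hypothetical names chosen for illustration, and the structural operator is simplified to a vertex predicate only. The point it shows is that each operator returns a new graph and leaves the input untouched.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class ToyGraph:
    """Hypothetical in-memory stand-in for a property graph (not GraphX)."""
    vertices: dict  # vertex id -> vertex property
    edges: dict     # (src id, dst id) -> edge property


def map_vertices(graph, f):
    """Property operator: apply f(id, prop) to every vertex; structure is unchanged."""
    new_vertices = {vid: f(vid, prop) for vid, prop in graph.vertices.items()}
    return ToyGraph(new_vertices, dict(graph.edges))


def subgraph(graph, vpred):
    """Structural operator: keep vertices that satisfy vpred and the edges between them."""
    kept = {vid: prop for vid, prop in graph.vertices.items() if vpred(vid, prop)}
    edges = {(s, d): p for (s, d), p in graph.edges.items() if s in kept and d in kept}
    return ToyGraph(kept, edges)


g = ToyGraph(
    vertices={1: "alice", 2: "bob", 3: "carol"},
    edges={(1, 2): "follows", (2, 3): "follows"},
)
upper = map_vertices(g, lambda vid, name: name.upper())
small = subgraph(g, lambda vid, name: vid != 3)
```

Here `upper` has the same edges as `g` with rewritten vertex properties, while `small` drops vertex 3 together with the edge that touched it.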


3. True or False?
The outerJoinVertices() operator joins the input RDD data with the vertices and returns a new graph. The vertex properties are obtained by applying the user-defined map() function to all vertices, including those that are not present in the input RDD.
A. True
B. False
Answer: A) True
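
The outer-join behaviour described in this statement can be modelled with ordinary Python dictionaries. This is only a sketch of the semantics, not Spark code: `outer_join_vertices` is a hypothetical helper, and `None` stands in for the missing value that a vertex receives when it has no entry in the joined table.

```python
def outer_join_vertices(vertices, table, map_func):
    """Apply map_func(id, prop, value_or_None) to every vertex.

    Vertices with no entry in `table` are still visited; they receive None
    as the joined value.
    """
    return {vid: map_func(vid, prop, table.get(vid)) for vid, prop in vertices.items()}


vertices = {1: "alice", 2: "bob", 3: "carol"}
out_degrees = {1: 2, 2: 1}  # vertex 3 is deliberately absent

joined = outer_join_vertices(
    vertices,
    out_degrees,
    lambda vid, name, deg: (name, deg if deg is not None else 0),
)
```

Vertex 3 has no entry in `out_degrees`, yet it still appears in `joined`, with the default value 0 supplied by the map function.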


4. Which of the following statements are true?
S1: Apache Spark GraphX provides the following property operators: mapVertices(), mapEdges(), mapTriplets()
S2: RDDs in Spark depend on one or more other RDDs. The representation of the dependencies between RDDs is known as the lineage graph. Lineage graph information is used to compute each RDD on demand, so that whenever part of a persistent RDD is lost, the lost data can be recovered using the lineage graph information.
A. Only S1 is true
B. Only S2 is true
C. Both S1 and S2 are true
D. None of the mentioned
Answer: C) Both S1 and S2 are true
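
The lineage idea in S2 can be illustrated with a toy class in plain Python. `ToyRDD` is a hypothetical name; unlike real Spark, it caches every result and tracks whole datasets rather than partitions. It is a sketch of the recovery principle only: each dataset remembers its parents and the function that derives it, so a lost result can be recomputed.

```python
class ToyRDD:
    """Hypothetical dataset that records how it was derived (its lineage)."""

    def __init__(self, compute, parents=()):
        self.compute = compute          # function: list of parent results -> list
        self.parents = tuple(parents)   # upstream datasets this one depends on
        self.cache = None               # materialized result, if any

    def map(self, f):
        # The child keeps a reference to its parent, forming the lineage graph.
        return ToyRDD(lambda inputs: [f(x) for x in inputs[0]], parents=(self,))

    def collect(self):
        # Compute on demand by walking the lineage back to the sources.
        if self.cache is None:
            inputs = [parent.collect() for parent in self.parents]
            self.cache = self.compute(inputs)
        return self.cache

    def lose_cache(self):
        # Simulate losing the materialized data.
        self.cache = None


source = ToyRDD(lambda inputs: [1, 2, 3])
squares = source.map(lambda x: x * x)

first = squares.collect()
squares.lose_cache()
recovered = squares.collect()  # rebuilt from the lineage, not from a backup copy
```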


5. GraphX provides an API for expressing graph computation that can model the ________ abstraction.
A. GaAdt
B. Pregel
C. Spark Core
D. None of the mentioned
Answer: B) Pregel


6. Match the following:
A. Dataflow Systems          i. Vertex Programs
B. Graph Systems             ii. Parameter Servers
C. Shared Memory Systems     iii. Guinea Pig
A. A: ii, B: i, C: iii
B. A: iii, B: i, C: ii
C. A: ii, B: iii, C: i
D. A: iii, B: ii, C: i
Answer: B) A: iii, B: i, C: ii


7. Which of the following statement(s) is/are true in the context of Parameter Servers?
S1: A machine learning framework
S2: Distributes a model over multiple machines
S3: It offers two operations: (i) Pull, to query parts of the model, and (ii) Push, to update parts of the model.
A. Only S1 is true
B. Only S2 is true
C. Only S3 is true
D. All of the mentioned
Answer: D) All of the mentioned
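
The pull/push interface from S3 can be sketched as a toy class in plain Python. `ToyParameterServer` is a hypothetical name, the shards are in-process dictionaries rather than separate machines, and treating a push as an additive update is an assumption made for the example.

```python
class ToyParameterServer:
    """Hypothetical single-process model of a sharded parameter store."""

    def __init__(self, num_shards):
        # Each dict stands in for the slice of the model held by one machine.
        self.shards = [dict() for _ in range(num_shards)]

    def _shard(self, key):
        # Route a parameter key to the shard that owns it.
        return self.shards[hash(key) % len(self.shards)]

    def pull(self, keys):
        """Query parts of the model; unknown parameters read as 0.0."""
        return {k: self._shard(k).get(k, 0.0) for k in keys}

    def push(self, updates):
        """Update parts of the model by adding each delta (assumed semantics)."""
        for key, delta in updates.items():
            shard = self._shard(key)
            shard[key] = shard.get(key, 0.0) + delta


server = ToyParameterServer(num_shards=2)
server.push({"w1": 0.5, "w2": -1.0})
server.push({"w1": 0.25})
pulled = server.pull(["w1", "w3"])
```

A worker only ever touches the keys it pulls or pushes, which is what lets the model be spread over several shards.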


8. [Figure: directed graph on vertices A, B, C, and D]

What is the PageRank score of vertex B after the second iteration? (Without damping factor)
Hint: The basic PageRank formula is:

$$PR_{t+1}(u) = \sum_{v \to u} \frac{PR_t(v)}{C(v)}$$

where
$PR_{t+1}(u)$: PageRank of the node $u$ under consideration
$PR_t(v)$: previous PageRank of a node $v$ pointing to node $u$
$C(v)$: out-degree of vertex $v$
A. 1/6
B. 1.5/12
C. 2.5/12
D. 1/3 

Answer: A) 1/6
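
As a cross-check of this answer, the update rule from the hint can be run in a few lines of Python. The question's figure is not reproduced here, so the edge list below is an assumption: it is a graph chosen to be consistent with the iteration values listed in the explanation, not a copy of the original drawing. Exact fractions are used to avoid rounding.

```python
from fractions import Fraction

# Assumed edge list (src, dst), inferred from the iteration values in the
# explanation; the original figure is not available.
edges = [
    ("A", "B"), ("A", "C"),
    ("B", "D"),
    ("C", "A"), ("C", "B"), ("C", "D"),
    ("D", "C"),
]


def pagerank_step(ranks, edges):
    """One iteration of basic PageRank (no damping factor)."""
    out_degree = {}
    for src, _ in edges:
        out_degree[src] = out_degree.get(src, 0) + 1

    new_ranks = {v: Fraction(0) for v in ranks}
    for src, dst in edges:
        # Each vertex splits its previous rank evenly over its outgoing edges.
        new_ranks[dst] += ranks[src] / out_degree[src]
    return new_ranks


ranks = {v: Fraction(1, 4) for v in "ABCD"}  # iteration 0: uniform start
history = [ranks]
for _ in range(2):
    ranks = pagerank_step(ranks, edges)
    history.append(ranks)

score_b = history[2]["B"]
```

Under the assumed edges, `score_b` is the exact fraction for vertex B after the second iteration, and `history[1]` holds the first-iteration values for comparison.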

Explanation: The PageRank scores of all vertices are calculated as follows:

Vertex   Iteration 0   Iteration 1   Iteration 2   Page Rank
A        1/4           1/12          1.5/12        1
B        1/4           2.5/12        2/12          2
C        1/4           4.5/12        4.5/12        4
D        1/4           4/12          4/12          3

After the second iteration, vertex B has a score of 2/12 = 1/6.
