Monday, September 27, 2021

Promote your blog or website

1. Submit to Search Engines - Submit your blog / website to the major and regional search engines so that people can find and visit it easily. Use meta tags to help search engines index the website. Use keywords that fit your website and a title that describes it. You can also use search engine submitters: many websites offer free submission to search engines.

For example: http://www.coltdurl.com/

2. Submit RSS Feed - When you submit your website URL, submit your website's RSS feed (if available) in the same way. This improves the revenue generated by AdSense for feeds, a feature of Google AdSense. Many websites and tools submit RSS feeds for free. They send your feed to online news channels and readers such as Google Reader.

For example: http://www.pingomatic.com/

3. Submit URL to Directory - Many search engines rely on online directories to find content / websites relevant to a search. So submit your blog / website URL to a directory under the matching category. For example, Google uses the directory www.dmoz.org to crawl relevant websites, so submit your blog to the right category there. If you have an educational website / blog, submit it in the Education / Academic category, and so on.

For example: http://www.freedirectorysubmission.com/

4. Email signature - Set your email signature to include your name, the name of your website and its URL. When you send an email to someone, that person will see your signature with a clickable URL and, if interested, will click on it. However, do not send mass emails promoting your website, as this violates the policies of email providers. Email is also limited to 1000 messages. The email signature is the better channel.

For example: http://www.gmail.com/

5. Join a forum - Join a website promotion forum to promote your website; you can also discuss your website there and get feedback from other members. Many experienced bloggers and website marketers will gladly help you promote your website / blog. They can also help you design the blog / website to suit advertising.

For example: http://www.linkreferral.com/

6. Join social networking sites - Join as many social networking sites as possible and post the link to your website with a correct title and description. Use the promotional features these social media sites provide. Connect your feed to www.twitter.com through the publishing service www.feedburner.com to promote the website.

For example: http://www.facebook.com/

7. Use reciprocal links - These are also known as backlinks or link exchanges. Many websites offer backlink and link exchange services. You can also exchange links with a friend: put a clickable link to their website on your website / blog, with the agreement that they put your link on theirs.

For example: http://www.stufenraffic.net/

8. Online Advertising - Many websites offer free online advertising. You provide all the information about your website and your contact details. If you have an approved Google AdSense account, you will receive a letter from Google AdWords offering you Rs. 1500/- of free advertising as a Google AdSense publisher. You can advertise on the huge Google network with the redemption code provided by Google AdWords.

For example: http://www.quikr.com/

9. Set as homepage - If you depend on an Internet cafe for your work, set the browser's homepage to the URL of your website. When someone else opens the web browser, they will see your website and, if interested, note its name / URL. Feel free to do this. I came up with this idea and it works for me: a lot of people like my blog.

10. Tell a friend - Talk to your friends about your website and blog and try to convince them to use it. It doesn't matter what they think of your website. If possible, tell other people about your website and the features it provides. Always speak positively about your website.

11. Email Subscription - Many online readers and websites offer email subscriptions. With this feature, every subscribed visitor gets a notification by email whenever you update / publish content on your blog / website.

For example: http://www.feedburner.com/

12. SMS Alert - Google provides an SMS notification feature: each member who subscribes to your website / blog with their mobile phone number receives an SMS when the blog / website is updated.

For example: http://www.googlelabs.com/

 

SEO techniques:

SEO: Search engine optimization is a set of techniques for improving a website's ranking, such as choosing suitable keywords. Here are a few:

1. MetaTag: Generate meta tags with a meta tag generator, or write them yourself using the HTML meta tag, and insert them into the HEAD section of the HTML document. They should contain keywords about your website, its description and title, and information about you.

2. Keywords: Keywords are the most important feature of any website that you want search engines to crawl. Include the most searched keywords in your meta tags. You can also use a keyword builder, but these are not good enough, so use your SEO knowledge and choose keywords manually.

3. Website Title: Make the website title easy for your website visitors to remember and suitable for search engine searches. Your website title should be perfect and it is best to use common English words and phrases as they are easy to remember.

4. Description: Describe your website as accurately as possible. Before writing the description, analyze your website. Use the right keywords in the description so that the site can be found.

5. HTML Tags: Use fewer HTML tags if possible, as excess HTML tags interfere with the search engine when it crawls for a search term. Use h1, h2, h3 tags for titles and subtitles, the b tag for bold text, etc.

6. Minimize: Minimize the use of CSS, Flash, etc. They slow down the loading of the website in the visitor's browser. Use them if you need to, but avoid them in the content area.
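The MetaTag, Website Title and Description tips above can be tried out in a few lines of Python. This is only a minimal sketch: the helper name build_head and all sample values are made up for illustration, and a real site would usually produce this markup from its template system.

```python
from html import escape


def build_head(title, description, keywords, author):
    """Return an HTML HEAD section with a title and basic meta tags.

    build_head is a hypothetical helper used only for this illustration.
    """
    keyword_text = ", ".join(keywords)
    lines = [
        "<head>",
        f"  <title>{escape(title)}</title>",
        f'  <meta name="description" content="{escape(description)}">',
        f'  <meta name="keywords" content="{escape(keyword_text)}">',
        f'  <meta name="author" content="{escape(author)}">',
        "</head>",
    ]
    return "\n".join(lines)


# Sample values are placeholders, not taken from any real website.
head = build_head(
    title="Aptitude & Big Data Notes",
    description="Quiz solutions and notes on big data computing.",
    keywords=["big data", "spark", "cassandra"],
    author="Example Author",
)
print(head)
```

The call to escape keeps characters such as & from breaking the markup, which matters when titles and descriptions are typed by hand.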

Friday, September 24, 2021

Big Data Computing: Quiz Assignment-IV Solutions (Week-4)

1. Identify the correct choices for the given scenarios:
P: The system allows operations all the time, and operations return quickly
Q: All nodes see same data at any time, or reads return latest written value by any client
R: The system continues to work in spite of network partitions
A. P: Consistency, Q: Availability, R: Partition tolerance
B. P: Availability, Q: Consistency, R: Partition tolerance
C. P: Partition tolerance, Q: Consistency, R: Availability
D. P: Consistency, Q: Partition tolerance, R: Availability
 

Answer: B) P: Availability, Q: Consistency, R: Partition tolerance 

Explanation:
The CAP theorem concerns the following properties:
Consistency: All nodes see same data at any time, or reads return latest written value by any client.
Availability: The system allows operations all the time, and operations return quickly.
Partition-tolerance: The system continues to work in spite of network partitions.

2. Cassandra uses a protocol called ________ to discover location and state information about the other nodes participating in a Cassandra cluster.
A. Key-value
B. Memtable
C. Heartbeat
D. Gossip

Answer: D) Gossip

Explanation: Cassandra uses a protocol called Gossip to obtain information about the location and status of the other nodes participating in a Cassandra cluster. Gossip is a peer-to-peer communication protocol in which nodes regularly exchange status information about themselves and about other nodes they know.
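The exchange of status information described above can be imitated with a toy simulation. This is a minimal sketch, not Cassandra's implementation: each node keeps a (heartbeat version, status) entry per known node, and peers are visited in a fixed ring order so the run is repeatable, whereas real gossip picks peers at random. All names here are hypothetical.

```python
def merge(local, remote):
    """Copy into local every entry that is missing or has an older version."""
    for node, (version, status) in remote.items():
        if node not in local or local[node][0] < version:
            local[node] = (version, status)


def gossip_round(views, order):
    """Each node exchanges its view with the next node in the ring order."""
    for i, name in enumerate(order):
        peer = order[(i + 1) % len(order)]
        merge(views[peer], views[name])   # peer learns what name knows
        merge(views[name], views[peer])   # name learns what peer knows


nodes = ["n1", "n2", "n3", "n4"]
# At the start every node only knows about itself.
views = {n: {n: (1, "UP")} for n in nodes}

rounds = 0
while any(len(view) < len(nodes) for view in views.values()):
    gossip_round(views, nodes)
    rounds += 1

print(rounds, sorted(views["n2"]))
```

After a couple of rounds every node holds an entry for every other node, although no node ever contacted all of the others directly.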


3. In Cassandra, ________ is used to specify data centers and the number of replicas to place within each data center. It attempts to place replicas on distinct racks to guard against node failure and to ensure data availability.
A. Simple strategy
B. Quorum strategy
C. Network topology strategy
D. None of the mentioned 

Answer: C) Network topology strategy

Explanation: The network topology strategy is used to specify the data centers and the number of replicas to be placed in each data center. It tries to place replicas on different racks to guard against node failure and ensure data availability. With the network topology strategy, the two most common ways to configure multiple data center clusters are two replicas in each data center and three replicas in each data center.
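The idea of a per-data-center replica count with a preference for distinct racks can be sketched as follows. This is a simplified illustration under stated assumptions, not Cassandra's placement algorithm: it ignores tokens and just walks a made-up node list in order. The function name place_replicas and the node, data center and rack names are hypothetical.

```python
def place_replicas(ring, replication):
    """Pick replica nodes per data center, preferring distinct racks.

    ring: list of (node, data_center, rack) tuples in ring order.
    replication: dict mapping data center name to replica count.
    """
    chosen = []
    for dc, count in replication.items():
        dc_nodes = [n for n in ring if n[1] == dc]
        picked, racks = [], set()
        # First pass: take at most one node from each rack.
        for node in dc_nodes:
            if len(picked) < count and node[2] not in racks:
                picked.append(node)
                racks.add(node[2])
        # Second pass: if racks ran out, fill up from any remaining node.
        for node in dc_nodes:
            if len(picked) < count and node not in picked:
                picked.append(node)
        chosen.extend(picked)
    return [node[0] for node in chosen]


ring = [
    ("a1", "dc1", "r1"), ("a2", "dc1", "r1"), ("a3", "dc1", "r2"),
    ("b1", "dc2", "r1"), ("b2", "dc2", "r2"), ("b3", "dc2", "r2"),
]
replicas = place_replicas(ring, {"dc1": 2, "dc2": 3})
print(replicas)
```

In dc1 the second node on rack r1 is skipped in favour of a node on rack r2; in dc2 there are only two racks for three replicas, so one rack has to hold two.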

 

4. True or False ?
A Snitch determines which data centers and racks nodes belong to. Snitches inform Cassandra about the network topology so that requests are routed efficiently and allow Cassandra to distribute replicas by grouping machines into data centers and racks.
A. True
B. False

Answer: True

Explanation: A snitch determines which data centers and racks nodes belong to. Snitches inform Cassandra about the network topology so that requests can be routed efficiently and Cassandra can distribute replicas by grouping machines into data centers and racks. In particular, the replication strategy places the replicas based on the information provided by the snitch. All nodes must return to the same rack and data center. Cassandra does its best not to have more than one replica on the same rack (which is not necessarily a physical location).


5. Consider the following statements:
Statement 1: In Cassandra, during a write operation, when hinted handoff is enabled and any replica is down, the coordinator writes to all other replicas and keeps the write locally until the down replica comes back up.
Statement 2: In Cassandra, Ec2Snitch is an important snitch for deployments; it is a simple snitch for Amazon EC2 deployments where all nodes are in a single region. In Ec2Snitch, the region name refers to the data center and the availability zone refers to the rack in a cluster.
A. Only Statement 1 is true
B. Only Statement 2 is true
C. Both Statements are true
D. Both Statements are false

Answer: C) Both Statements are true

Explanation: Both statements are correct. With hinted handoff enabled, the coordinator writes to the replicas that are up and keeps the write locally until the down replica comes back. Ec2Snitch is a simple snitch for Amazon EC2 deployments in a single region; it treats the region as the data center and the availability zone as the rack.

 

6. What is Eventual Consistency ?
A. At any time, the system is linearizable
B. If writes stop, all reads will return the same value after a while
C. At any time, concurrent reads from any node return the same values
D. If writes stop, a distributed system will become consistent

Answer: B) If writes stop, all reads will return the same value after a while

Explanation: Cassandra offers eventual consistency. It says that if writes to a key stop, all replicas of the key will converge automatically.
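Convergence after writes stop can be shown with a small simulation. This is a minimal sketch under stated assumptions: three in-memory replicas, conflict resolution by highest timestamp (last write wins), and a single sync step standing in for background repair. The function names write and sync are hypothetical.

```python
def write(replica, timestamp, value):
    """Apply a write only if it is newer than what the replica holds."""
    if timestamp > replica["ts"]:
        replica["ts"] = timestamp
        replica["value"] = value


def sync(replicas):
    """Propagate the newest write to every replica (stand-in for repair)."""
    newest = max(replicas, key=lambda r: r["ts"])
    for replica in replicas:
        write(replica, newest["ts"], newest["value"])


replicas = [{"ts": 0, "value": None} for _ in range(3)]

# Two writes reach different replicas; the third replica misses both.
write(replicas[0], 1, "a")
write(replicas[1], 2, "b")
before = [r["value"] for r in replicas]

# Writes have stopped; after synchronisation all reads agree.
sync(replicas)
after = [r["value"] for r in replicas]

print(before, after)
```

Before the sync a read could return different values depending on the replica it hits; afterwards every replica returns the value of the latest write.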

 

7. Consider the following statements:
Statement 1: When two processes are competing with each other causing data corruption, it is called deadlock
Statement 2: When two processes are waiting for each other directly or indirectly, it is called race condition
A. Only Statement 1 is true
B. Only Statement 2 is true
C. Both Statements are false
D. Both Statements are true 

Answer: C) Both Statements are false 

Explanation: The correct statements are:
Statement 1: When two processes are competing with each other causing data corruption, it is called Race Condition
Statement 2: When two processes are waiting for each other directly or indirectly, it is called deadlock.
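The difference between the two terms can be made concrete. The sketch below is only an illustration: the race condition is replayed as a fixed interleaving of two read-modify-write sequences instead of real threads, so the outcome is repeatable, and the deadlock is detected as a cycle in a wait-for mapping. The name has_deadlock and the process names are hypothetical.

```python
# Race condition: two processes update the same balance without coordination.
balance = 100
read_a = balance          # process A reads 100
read_b = balance          # process B also reads 100, before A has written
balance = read_a + 10     # A writes 110
balance = read_b + 20     # B writes 120 and silently overwrites A's update
lost_update_result = balance   # a correct result would have been 130


def has_deadlock(wait_for):
    """Return True if some chain of waiting processes loops back on itself.

    wait_for maps each process to the process it waits on, or None.
    """
    for start in wait_for:
        seen = set()
        current = start
        while current is not None and current not in seen:
            seen.add(current)
            current = wait_for.get(current)
        if current is not None:   # we came back to a process already visited
            return True
    return False


direct = has_deadlock({"P1": "P2", "P2": "P1"})
indirect = has_deadlock({"P1": "P2", "P2": "P3", "P3": "P1"})
no_cycle = has_deadlock({"P1": "P2", "P2": None})
print(lost_update_result, direct, indirect, no_cycle)
```

The first half corrupts data because both processes act on a stale read; the second half corrupts nothing, but the processes in the cycle can never proceed.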


8. ZooKeeper allows distributed processes to coordinate with each other through registers, known as ________.
A. znodes
B. hnodes
C. vnodes
D. rnodes

Answer: A) znodes

Explanation: Every znode is identified by a path, with path elements separated by a slash.



9. In ZooKeeper, when a ________ is triggered the client receives a packet saying that the znode has changed.
A. Event
B. Row
C. Watch
D. Value

Answer: C) Watch

Explanation: ZooKeeper supports the concept of watches. Clients can set a watch on a znode.
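Znodes identified by slash-separated paths, and watches that notify a client when a znode changes, can be imitated in memory. This is a toy sketch and not the ZooKeeper client API: the class TinyZnodeStore and its methods are hypothetical, and the watch is modelled as a one-time callback that is removed once it fires.

```python
class TinyZnodeStore:
    """In-memory imitation of znodes with one-time watches."""

    def __init__(self):
        self.data = {"/": b""}   # path -> stored bytes
        self.watches = {}        # path -> callbacks waiting for a change

    def create(self, path, value=b""):
        parent = path.rsplit("/", 1)[0] or "/"
        if parent not in self.data:
            raise KeyError(f"parent znode {parent} does not exist")
        self.data[path] = value

    def get(self, path, watch=None):
        if watch is not None:
            self.watches.setdefault(path, []).append(watch)
        return self.data[path]

    def set(self, path, value):
        if path not in self.data:
            raise KeyError(f"znode {path} does not exist")
        self.data[path] = value
        # Fire and remove the watches: each one is triggered only once.
        for callback in self.watches.pop(path, []):
            callback(path)


events = []
store = TinyZnodeStore()
store.create("/app")
store.create("/app/config", b"v1")
value = store.get("/app/config", watch=events.append)
store.set("/app/config", b"v2")   # triggers the watch
store.set("/app/config", b"v3")   # no watch left, nothing is reported
print(value, events)
```

A client that wants to keep following the znode would set a new watch when it reads the changed value.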


10. Consider the Table temperature_details in Keyspace “day3” with schema as follows:
temperature_details(daynum, year,month,date,max_temp)
with primary key(daynum,year,month,date) 

DayNum | Year | Month | Date | MaxTemp (°C)
1 | 1943 | 10 | 1 | 14.1
2 | 1943 | 10 | 2 | 16.4
541 | 1945 | 3 | 24 | 21.1
9970 | 1971 | 1 | 16 | 21.4
20174 | 1998 | 12 | 24 | 36.7
21223 | 2001 | 11 | 7 | 16
4317 | 1955 | 7 | 26 | 16.7

The same maximum temperature occurs at different hours of the same day. Choose the correct CQL query to:

a) Alter table temperature_details to add a new column called “seasons” using a map of type <varint, text>, represented as <month, season>. Season can take the values {spring, summer, autumn, winter}.
b) Update table temperature_details where the columns daynum, year, month, date contain the values 4317, 1955, 7, 26 respectively.
c) Use the select statement to output the row after the update.

Note: A map relates one item to another with a key-value pair. For each key, only one value may exist, and duplicates cannot be stored. Both the key and the value are designated with a data type.

A)
cqlsh:day3> alter table temperature_details add hours1 set<varint>;
cqlsh:day3> update temperature_details set hours1={1,5,9,13,5,9} where daynum=4317;
cqlsh:day3> select * from temperature_details where daynum=4317;


B)
cqlsh:day3> alter table temperature_details add seasons map<varint,text>;
cqlsh:day3> update temperature_details set seasons = seasons + {7:'spring'} where daynum=4317 and year =1955 and month = 7 and date=26;
cqlsh:day3> select * from temperature_details where daynum=4317 and year=1955 and month=7 and date=26;


C)
cqlsh:day3> alter table temperature_details add hours1 list<varint>;
cqlsh:day3> update temperature_details set hours1=[1,5,9,13,5,9] where daynum=4317 and year = 1955 and month = 7 and date=26;
cqlsh:day3> select * from temperature_details where daynum=4317 and year=1955 and month=7 and date=26;


D)
cqlsh:day3> alter table temperature_details add seasons map<month, season>;
cqlsh:day3> update temperature_details set seasons = seasons + {7:'spring'} where daynum=4317;
cqlsh:day3> select * from temperature_details where daynum=4317;

Answer: B)
cqlsh:day3> alter table temperature_details add seasons map<varint,text>;
cqlsh:day3> update temperature_details set seasons = seasons + {7:'spring'} where daynum=4317 and year =1955 and month = 7 and date=26;
cqlsh:day3> select * from temperature_details where daynum=4317 and year=1955 and month=7 and date=26;


Explanation:
The correct steps are:
a) Add column “seasons”
cqlsh:day3> alter table temperature_details add seasons map<varint,text>;

b) Update table
cqlsh:day3> update temperature_details set seasons = seasons + {7:'spring'} where daynum=4317 and year =1955 and month = 7 and date=26;

c) Select query
cqlsh:day3> select * from temperature_details where daynum=4317 and year=1955 and month=7 and date=26;

 

daynum | year | month | date | hours | hours1 | max_temp | seasons
4317 | 1955 | 7 | 26 | {1,5,9,13} | [1,5,9,13,5,9] | 16.7 | {7:'spring'}
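The map behaviour from the note (one value per key) and the need to name the full primary key in the update can be mirrored with plain Python dictionaries. This is only an analogy, not CQL: the table is a dict keyed by the tuple (daynum, year, month, date), and the second update with 'summer' is an extra step added here to show replacement, not part of the quiz.

```python
# Rows keyed by the full primary key (daynum, year, month, date).
table = {
    (4317, 1955, 7, 26): {"max_temp": 16.7, "seasons": {}},
}

key = (4317, 1955, 7, 26)
row = table[key]

# Counterpart of: set seasons = seasons + {7:'spring'}
row["seasons"] = {**row["seasons"], 7: "spring"}
after_first_update = dict(row["seasons"])

# Writing the same key again replaces the value; the map keeps one value per key.
row["seasons"] = {**row["seasons"], 7: "summer"}
after_second_update = dict(row["seasons"])

print(after_first_update)
print(after_second_update)
```

Looking the row up with only part of the key, for example (4317,), would fail here, which loosely matches why the correct option lists all four primary key columns in the where clause.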

Big Data Computing: Quiz Assignment-III Solutions (Week-3)

1. In Spark, a ________ is a read-only collection of objects partitioned across a set of machines that can be rebuilt if a partition is lost.

A. Spark Streaming

B. FlatMap

C. Driver

D. Resilient Distributed Dataset (RDD)

Answer: D) Resilient Distributed Dataset (RDD)

Explanation: Resilient Distributed Datasets (RDDs) are the basic Spark data structure. An RDD is a distributed, immutable collection of objects. Each RDD is divided into logical partitions that can be computed on different nodes in the cluster. RDDs can contain any type of Python, Java, or Scala object, including custom classes. Formally, an RDD is a read-only, partitioned collection of records. RDDs can be created by deterministic operations on data in stable storage or on other RDDs. An RDD is a fault-tolerant collection of elements that can be operated on in parallel.


2. Given the following definition about the join transformation in Apache Spark:

def join[W](other: RDD[(K, W)]): RDD[(K, (V, W))]

Where join operation is used for joining two datasets. When it is called on datasets of type (K, V) and (K, W), it returns a dataset of (K, (V, W)) pairs with all pairs of elements for each key.

Output the result of joinrdd, when the following code is run.

val rdd1 = sc.parallelize(Seq(("m",55),("m",56),("e",57),("e",58),("s",59),("s",54)))

val rdd2 = sc.parallelize(Seq(("m",60),("m",65),("s",61),("s",62),("h",63),("h",64)))

val joinrdd = rdd1.join(rdd2)

joinrdd.collect


A. Array[(String, (Int, Int))] = Array((m,(55,60)), (m,(55,65)), (m,(56,60)),

(m,(56,65)), (s,(59,61)), (s,(59,62)), (h,(63,64)), (s,(54,61)), (s,(54,62)))

B. Array[(String, (Int, Int))] = Array((m,(55,60)), (m,(55,65)), (m,(56,60)),

(m,(56,65)), (s,(59,61)), (s,(59,62)), (e,(57,58)), (s,(54,61)), (s,(54,62)))

C. Array[(String, (Int, Int))] = Array((m,(55,60)), (m,(55,65)), (m,(56,60)),

(m,(56,65)), (s,(59,61)), (s,(59,62)), (s,(54,61)), (s,(54,62)))

D. None of the mentioned

Answer: C) Array[(String, (Int, Int))] = Array((m,(55,60)), (m,(55,65)), (m,(56,60)),

(m,(56,65)), (s,(59,61)), (s,(59,62)), (s,(54,61)), (s,(54,62)))

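Explanation: join() is a transformation which returns an RDD containing all pairs of elements with matching keys in this and other. Each pair of elements will be returned as a (k, (v1, v2)) tuple, where (k, v1) is in this and (k, v2) is in other. The keys e and h each appear in only one of the two RDDs, so they produce no pairs; m and s each contribute 2 x 2 = 4 pairs.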
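The join semantics can be checked without a Spark cluster. Below is a minimal pure-Python sketch of an inner join on (key, value) pairs; it is not Spark's implementation, the function name inner_join is made up, and the order of the result follows the input lists, whereas Spark does not promise any particular order.

```python
from collections import defaultdict


def inner_join(left, right):
    """Return (k, (v, w)) for every (k, v) in left and (k, w) in right."""
    right_by_key = defaultdict(list)
    for key, w in right:
        right_by_key[key].append(w)
    # Keys missing on either side produce no output at all.
    return [(key, (v, w)) for key, v in left for w in right_by_key.get(key, [])]


rdd1 = [("m", 55), ("m", 56), ("e", 57), ("e", 58), ("s", 59), ("s", 54)]
rdd2 = [("m", 60), ("m", 65), ("s", 61), ("s", 62), ("h", 63), ("h", 64)]

joined = inner_join(rdd1, rdd2)
joined_keys = {key for key, _ in joined}
print(joined)
```

The result holds eight pairs and only the keys m and s.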

 

3. Consider the following statements in the context of Spark:

Statement 1: Spark improves efficiency through in-memory computing primitives and general computation graphs.

Statement 2: Spark improves usability through high-level APIs in Java, Scala, Python and also provides an interactive shell.

A. Only statement 1 is true

B. Only statement 2 is true

C. Both statements are true

D. Both statements are false

Answer: C) Both statements are true

Explanation: Apache Spark is a fast, general-purpose cluster computing system. It offers high-level APIs in Java, Scala, and Python, as well as an optimized engine that supports general execution graphs. It also supports a variety of higher-level tools, including Spark SQL for SQL and structured data processing, MLlib for machine learning, GraphX for graph processing, and Spark Streaming. Spark comes with several sample programs. Spark offers an interactive shell, a powerful tool for interactive data analysis, available in Scala or Python. Spark improves efficiency through in-memory computing primitives. With in-memory computing, data is kept in random access memory (RAM) instead of on slower disk drives and is processed in parallel. This makes it possible to detect patterns and analyze large amounts of data quickly. The approach has become popular as memory has become cheaper, so in-memory processing is economical for applications.


4. True or False ?

Resilient Distributed Datasets (RDDs) are fault-tolerant and immutable.

A. True

B. False

Answer: True

Explanation: Resilient Distributed Datasets (RDDs) are:

1. Immutable collections of objects spread across a cluster

2. Built through parallel transformations (map, filter, etc.)

3. Automatically rebuilt on failure

4. Controllable persistence (e.g. caching in RAM)
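The four properties listed above can be imitated in a single process. This is a toy sketch, not Spark's API: the class TinyRDD is hypothetical, partitions are plain lists, "lineage" is just a reference to the parent plus the function that derives a partition from it, and losing a partition means dropping it from a cache.

```python
class TinyRDD:
    """Single-process imitation of an RDD with lineage-based recovery."""

    def __init__(self, partitions=None, parent=None, fn=None):
        self._source = partitions   # only set on the root dataset
        self._parent = parent       # lineage: the dataset this one came from
        self._fn = fn               # lineage: how to derive one partition
        self._cache = {}            # computed partitions kept in memory

    def num_partitions(self):
        if self._parent is None:
            return len(self._source)
        return self._parent.num_partitions()

    def map(self, f):
        # Transformations never modify self; they return a new dataset.
        return TinyRDD(parent=self, fn=lambda part: [f(x) for x in part])

    def filter(self, pred):
        return TinyRDD(parent=self, fn=lambda part: [x for x in part if pred(x)])

    def compute(self, index):
        if index in self._cache:
            return self._cache[index]
        if self._parent is None:
            result = list(self._source[index])
        else:
            result = self._fn(self._parent.compute(index))
        self._cache[index] = result
        return result

    def lose_partition(self, index):
        self._cache.pop(index, None)   # simulate a lost partition

    def collect(self):
        out = []
        for i in range(self.num_partitions()):
            out.extend(self.compute(i))
        return out


base = TinyRDD(partitions=[[1, 2, 3], [4, 5, 6]])
even_squares = base.map(lambda x: x * x).filter(lambda x: x % 2 == 0)

first = even_squares.collect()
even_squares.lose_partition(1)      # drop one cached partition
second = even_squares.collect()     # rebuilt from the parent via lineage
print(first, second)
```

Because each dataset remembers how it was derived, the lost partition is recomputed instead of being restored from a copy.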


5. Which of the following is not a NoSQL database?

A. HBase

B. Cassandra

C. SQL Server

D. None of the mentioned

Answer: C) SQL Server

Explanation: NoSQL, which stands for "not only SQL", is an alternative to traditional relational databases, in which data is stored in tables and the schema is carefully designed before the database is created. NoSQL databases are particularly useful for working with large amounts of distributed data. SQL Server is a relational database, so it is the option that is not NoSQL.

 

6. True or False ?

Apache Spark can potentially run batch-processing programs up to 100 times faster than Hadoop MapReduce in memory, or 10 times faster on disk.

A. True

B. False

Answer: True

Explanation: Spark's biggest claim about speed is that "it can run programs up to 100 times faster than Hadoop MapReduce in memory or 10 times faster on disk." Spark can make this claim because it processes data in the main memory of the worker nodes and avoids unnecessary I/O operations on disk. The other benefit Spark offers is the ability to chain tasks at the application programming level without writing to disk at all, or while minimizing the number of writes to disk.


7. _____________ leverages Spark Core's fast scheduling capability to perform streaming analytics.

A. MLlib

B. Spark Streaming

C. GraphX

D. RDDs

Answer: B) Spark Streaming

Explanation: Spark Streaming ingests data in mini-batches and performs RDD transformations on those mini-batches of data.
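The mini-batch idea can be sketched in plain Python. This is an illustration only and does not use Spark: the stream is a short made-up list, and batches are cut by element count to keep the example repeatable, whereas Spark Streaming cuts them by time interval. The name micro_batches is hypothetical.

```python
def micro_batches(events, batch_size):
    """Yield consecutive lists of at most batch_size events."""
    batch = []
    for event in events:
        batch.append(event)
        if len(batch) == batch_size:
            yield batch
            batch = []
    if batch:            # emit the final, possibly shorter batch
        yield batch


stream = ["ok", "error", "ok", "error", "error", "ok", "ok"]

# The same transformation (count the errors) is applied to every mini-batch.
error_counts = [
    sum(1 for event in batch if event == "error")
    for batch in micro_batches(stream, 3)
]
print(error_counts)
```

Each mini-batch is processed like a small static dataset, which is why the same batch-style transformations can be reused for streaming.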


8. _________ is a distributed graph processing framework on top of Spark.

A. MLlib

B. Spark streaming

C. GraphX

D. All of the mentioned

Answer: C) GraphX

Explanation: GraphX is Apache Spark's API for graphs and graph-parallel computation. It is a distributed graph processing framework on top of Spark.


9. Point out the incorrect statement in the context of Cassandra:

A. It is a centralized key-value store

B. It is originally designed at Facebook

C. It is designed to handle large amounts of data across many commodity servers, providing high availability with no single point of failure

D. It uses a ring-based DHT (Distributed Hash Table) but without finger tables or routing

Answer: A) It is a centralized key-value store

Explanation: Cassandra is a distributed key-value store.


10. Consider the following statements:

Statement 1: Scale out means grow your cluster capacity by replacing with more powerful machines.

Statement 2: Scale up means incrementally grow your cluster capacity by adding more COTS machines (Components Off the Shelf).

A. Only statement 1 is true

B. Only statement 2 is true

C. Both statements are false

D. Both statements are true

Answer: C) Both statements are false

Explanation: The correct statements are:

Scale up = grow your cluster capacity by replacing with more powerful machines

Scale out = incrementally grow your cluster capacity by adding more COTS machines (Components Off the Shelf)
