tag:blogger.com,1999:blog-52767766973642956622024-03-13T01:10:22.696-07:00geekRai a programmer's diary !siddheshwarhttp://www.blogger.com/profile/06138213414248415451noreply@blogger.comBlogger161125tag:blogger.com,1999:blog-5276776697364295662.post-86058872261304394162019-12-12T10:43:00.001-08:002019-12-12T10:43:24.615-08:00Important Configuration Parameters of Kafka Producers <div dir="ltr" style="text-align: left;" trbidi="on">
<div style="text-align: justify;">
The list of Kafka producer configurations is quite long. The good news is that you don't have to set all of them; Kafka provides defaults for most, and these work for many cases. But if you care about performance, reliability, throughput, or latency, it's worth revisiting them and tuning them for your specific needs. </div>
<div style="text-align: justify;">
<br /></div>
<div style="text-align: justify;">
In this post, I will cover some of the important configurations. </div>
<div style="text-align: justify;">
Kafka Reference - <a href="https://kafka.apache.org/090/documentation.html#configuration">here</a>.</div>
<h3 style="text-align: left;">
<br />compression.type</h3>
<div>
Default value = none. (i.e. No compression).</div>
<div>
Available values = none, gzip, snappy, lz4</div>
<div style="text-align: justify;">
<br /></div>
<div style="text-align: justify;">
This is the algorithm the producer (running inside your application) uses to compress data before sending it to the brokers. Compression is applied to whole batches, so the more messages are batched together before sending, the more effective it is. Enabling compression reduces network utilization and storage at the cost of some CPU. Snappy (developed by Google) provides decent compression ratios with low CPU overhead. Gzip typically provides a better compression ratio but uses more CPU. So if network bandwidth is the bottleneck, choose Gzip; otherwise go for Snappy. </div>
<div style="text-align: justify;">
<br /></div>
<h3 style="text-align: justify;">
batch.size</h3>
<div>
Default value = <span style="font-family: "roboto" , sans-serif; font-size: 15px;">16384 (i.e. </span>16K bytes)</div>
<div>
<br /></div>
<div style="text-align: justify;">
The Kafka producer batches messages per partition before sending them to the broker. This parameter controls the amount of memory (in bytes) used for each batch. The producer uses the batch size and the timeout (linger.ms) to decide when to send: it tries to accumulate as many messages as possible (up to batch.size bytes) and then sends them all in one go. If the batch size is very small, the producer sends messages more frequently (a value of 0 disables batching). A larger batch size may waste some memory, as the allocated memory might not get fully utilized. </div>
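<div style="text-align: justify;">
To make the send decision more concrete, here is a toy model of it. This is only a sketch and not Kafka's actual implementation: the class <i>ToyBatcher</i>, its methods, and the explicit clock values are hypothetical names I made up for illustration.</div>

```java
import java.util.ArrayList;
import java.util.List;

// Toy model only: NOT Kafka's real record accumulator. All names are hypothetical.
class ToyBatcher {
    private final int batchSizeBytes;  // plays the role of batch.size
    private final long lingerMs;       // plays the role of linger.ms
    private final List<byte[]> current = new ArrayList<>();
    private final List<Integer> sentBatchSizes = new ArrayList<>(); // record count of each sent batch
    private int currentBytes = 0;
    private long batchStartMs = 0;

    ToyBatcher(int batchSizeBytes, long lingerMs) {
        this.batchSizeBytes = batchSizeBytes;
        this.lingerMs = lingerMs;
    }

    // Adds a record at time nowMs; the open batch is sent first if the record would not fit.
    void append(byte[] record, long nowMs) {
        if (!current.isEmpty() && currentBytes + record.length > batchSizeBytes) {
            flush();
        }
        if (current.isEmpty()) {
            batchStartMs = nowMs; // the linger timer starts with the first record of a batch
        }
        current.add(record);
        currentBytes += record.length;
        if (currentBytes >= batchSizeBytes) {
            flush(); // batch is full, send immediately
        }
    }

    // Called periodically: sends the open batch once it has waited at least lingerMs.
    void tick(long nowMs) {
        if (!current.isEmpty() && nowMs - batchStartMs >= lingerMs) {
            flush();
        }
    }

    private void flush() {
        sentBatchSizes.add(current.size());
        current.clear();
        currentBytes = 0;
    }

    List<Integer> sentBatchSizes() {
        return sentBatchSizes;
    }

    public static void main(String[] args) {
        ToyBatcher batcher = new ToyBatcher(100, 5);
        for (int i = 0; i < 5; i++) {
            batcher.append(new byte[40], 0); // five 40-byte records arrive at t=0
        }
        batcher.tick(10); // linger time has passed, so the partial batch goes out
        System.out.println(batcher.sentBatchSizes()); // prints [2, 2, 1]
    }
}
```

<div style="text-align: justify;">
Two batches go out because they are full, and the last one goes out only because the timeout expired.</div>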
<div style="text-align: justify;">
<br /></div>
<div style="text-align: justify;">
<br /></div>
<h3 style="text-align: justify;">
linger.ms</h3>
<div>
Default value = 0</div>
<div>
<br /></div>
<div style="text-align: justify;">
This value allows the producer to group records/messages together before they are sent to the broker. It is the time, in milliseconds, that the producer waits for more messages to accumulate in a batch. With the default value of 0, the producer sends messages as soon as they arrive, so latency is lowest. Setting it to, say, 5 increases latency but also increases throughput (more messages are sent in one go, so there is less overhead per message). Under light load, a setting of 5 adds up to 5 ms of latency. </div>
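<div style="text-align: justify;">
As a rough sketch, the three settings covered so far could be collected in a java.util.Properties object using their configuration key names. The values are illustrative assumptions, not recommendations, and the snippet stops short of creating a producer so that it runs without the Kafka client library on the classpath. The class and method names are made up for this example.</div>

```java
import java.util.Properties;

// Sketch only: builds the configuration, but does not create a producer.
class ProducerConfigSketch {

    // Returns producer settings tuned for throughput; the values are assumptions.
    static Properties tunedProducerProps() {
        Properties props = new Properties();
        props.put("compression.type", "snappy"); // low CPU overhead, decent ratio
        props.put("batch.size", "32768");        // 32 KB per-partition batches (default is 16384)
        props.put("linger.ms", "5");             // wait up to 5 ms to fill a batch
        return props;
    }

    public static void main(String[] args) {
        Properties props = tunedProducerProps();
        for (String key : props.stringPropertyNames()) {
            System.out.println(key + " = " + props.getProperty(key));
        }
    }
}
```

<div style="text-align: justify;">
With the Kafka client on the classpath, these properties (together with bootstrap.servers and the key/value serializers) would be passed to the KafkaProducer constructor.</div>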
<div style="text-align: justify;">
<br /></div>
<div style="text-align: justify;">
<br /></div>
<h3 style="text-align: justify;">
acks</h3>
<div>
Default value = 1</div>
<div>
<br /></div>
<div>
This controls the number of acknowledgments the producer requires the leader to have received before considering a request complete. This affects the durability of the message.<br />
<br />
acks=0<br />
<br />
The message is considered to be written successfully to Kafka if the producer managed to send it over the network. </div>
<div>
<br /></div>
<div>
<br /></div>
</div>
siddheshwarhttp://www.blogger.com/profile/06138213414248415451noreply@blogger.com2tag:blogger.com,1999:blog-5276776697364295662.post-69837870632742696422019-12-12T10:39:00.000-08:002019-12-12T10:39:06.015-08:00Resolving ClassNotFoundException in Java<div dir="ltr" style="text-align: left;" trbidi="on">
<div style="text-align: justify;">
ClassNotFoundException is a checked exception and a subclass of java.lang.Exception. Recall that checked exceptions must be handled, either with a try/catch block or by declaring them with throws so that they propagate to the caller. You get this exception when the JVM is unable to load a class from the classpath. So, troubleshooting this exception requires an understanding of how classes get loaded and the significance of the classpath.</div>
<div style="text-align: justify;">
<br /></div>
<h3 style="text-align: justify;">
Root Cause of java.lang.ClassNotFoundException</h3>
<div style="text-align: justify;">
The Javadoc of <a href="https://docs.oracle.com/javase/7/docs/api/java/lang/ClassNotFoundException.html">java.lang.ClassNotFoundException</a> puts it quite clearly. It is thrown when the class definition is not found in the following cases:</div>
<ul style="text-align: left;">
<li>Load class using <i>forName </i>method of class <i>Class</i></li>
<li>Load class using <i>findSystemClass </i>method of <i>ClassLoader</i></li>
<li>Load using <i>loadClass </i>method of <i>ClassLoader</i></li>
</ul>
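<div style="text-align: justify;">
The first case in the list above is easy to reproduce. Here is a minimal sketch; the class name <i>ClassLoadingSketch</i>, the helper <i>isLoadable</i>, and the missing class <i>com.example.DoesNotExist</i> are hypothetical names used only for illustration.</div>

```java
// Sketch: triggers and handles ClassNotFoundException via Class.forName.
class ClassLoadingSketch {

    // Returns true if the named class can be found and loaded, false otherwise.
    static boolean isLoadable(String className) {
        try {
            Class.forName(className);
            return true;
        } catch (ClassNotFoundException e) {
            // Checked exception: the compiler forces us to catch or declare it.
            return false;
        }
    }

    public static void main(String[] args) {
        System.out.println(isLoadable("java.lang.String"));         // prints true
        System.out.println(isLoadable("com.example.DoesNotExist")); // prints false
    }
}
```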
<b>References:</b><br />
<a href="http://javaeesupportpatterns.blogspot.in/2012/11/javalangclassnotfoundexception-how-to.html">http://javaeesupportpatterns.blogspot.in/2012/11/javalangclassnotfoundexception-how-to.html</a><b> </b> <br />
<br /></div>
siddheshwarhttp://www.blogger.com/profile/06138213414248415451noreply@blogger.com0tag:blogger.com,1999:blog-5276776697364295662.post-44739404981308453322019-12-12T10:26:00.002-08:002020-12-22T01:32:55.159-08:00Distributed Data System Patterns<div dir="ltr" style="text-align: left;" trbidi="on">
<div class="separator" style="clear: both; text-align: left;">I have published the post on Medium, <a href="https://rai-skumar.medium.com/distributed-data-systems-patterns-e0cae6ffe40a" target="_blank">here</a>. </div><div class="separator" style="clear: both; text-align: center;"><br /></div><div class="separator" style="clear: both; text-align: center;"><br /></div><div class="separator" style="clear: both; text-align: center;">
<a href="https://blogger.googleusercontent.com/img/b/R29vZ2xl/AVvXsEgit2YXels5pq-lnKfNQBcx8QGtNllczUisDFNxm8_1rUyWSU3hLlXOPC5F7GUK-RDr3FzXDXCTupsGiaEGmaEayHwXAbHqMAW7OqCzE00maHq-K42j4vnVxDS5HxqH_Um4oDiucEb4eT4/s1600/dds12dec.png" style="margin-left: 1em; margin-right: 1em;"><img border="0" data-original-height="1551" data-original-width="1378" height="640" src="https://blogger.googleusercontent.com/img/b/R29vZ2xl/AVvXsEgit2YXels5pq-lnKfNQBcx8QGtNllczUisDFNxm8_1rUyWSU3hLlXOPC5F7GUK-RDr3FzXDXCTupsGiaEGmaEayHwXAbHqMAW7OqCzE00maHq-K42j4vnVxDS5HxqH_Um4oDiucEb4eT4/s640/dds12dec.png" width="568" /></a></div>
<br /></div>
siddheshwarhttp://www.blogger.com/profile/06138213414248415451noreply@blogger.com1tag:blogger.com,1999:blog-5276776697364295662.post-52291199864344882222019-12-02T08:27:00.001-08:002019-12-02T08:31:00.439-08:00Performance Parameters for a System<div dir="ltr" style="text-align: left;" trbidi="on">
<h3 style="text-align: left;">
<dt style="background-color: #fefefe; font-family: Georgia, Palatino, "Palatino Linotype", Times, "Times New Roman", serif; font-size: 16px; font-weight: 400;"><a href="http://en.wikipedia.org/wiki/Computer_performance" style="color: #0b0080; text-decoration-line: none;">Performance</a> is characterized by the amount of useful work accomplished by a computer system compared to the time and resources used.</dt>
<dt style="background-color: #fefefe; font-family: Georgia, Palatino, "Palatino Linotype", Times, "Times New Roman", serif; font-size: 16px; font-weight: 400;"><br /></dt>
<dt style="background-color: #fefefe; font-family: Georgia, Palatino, "Palatino Linotype", Times, "Times New Roman", serif; font-size: 16px; font-weight: 400;"><div style="margin-bottom: 1em; margin-top: 1em;">
Depending on the context, this may involve achieving one or more of the following:</div>
<ul class="list" style="margin: 1em 0px; padding: 0px 0px 0px 2em;">
<li>Short response time/low latency for a given piece of work</li>
<li>High throughput (rate of processing work)</li>
<li>Low utilization of computing resource(s)</li>
</ul>
</dt>
</h3>
<h3 style="text-align: left;">
Response Time / Latency</h3>
<div>
The time between a client sending a request and receiving the response. The response time is what the client sees: it includes the service time of the request plus network and queuing delays. </div>
<div>
<br /></div>
<div style="text-align: justify;">
Even if you make the same request again and again, you will see a slightly different response time on every try. In practice, for a service or application handling a variety of requests, the response time can vary a lot. One obvious reason is that a request for a user with a lot of data will be slower than one for a user with little data. Other reasons include random additional latency, loss of a network packet during TCP transmission, a GC pause, a page fault forcing a read from disk, or other mechanical or network faults. That's why we need to think of response time not as a single number but as a distribution of values. </div>
<div style="text-align: justify;">
<i><span style="color: #4c1130;"><br /></span></i></div>
<div style="text-align: justify;">
<i><span style="color: #4c1130;">If 95th percentile (p95) response time is 1.5 seconds, that means 95 out of 100 requests take less than 1.5 seconds, and 5 out of 100 requests take 1.5 seconds or more. </span></i><br />
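<div style="text-align: justify;">
A percentile like p95 can be computed from a list of measured response times. The sketch below uses the nearest-rank method, which is just one of several common definitions; the class and method names are hypothetical.</div>

```java
import java.util.Arrays;

// Sketch: nearest-rank percentile over a set of response-time samples.
class PercentileSketch {

    // Returns the smallest sample such that at least p percent of samples are <= it.
    static double percentile(double[] samples, double p) {
        if (samples.length == 0 || p <= 0 || p > 100) {
            throw new IllegalArgumentException("need samples and 0 < p <= 100");
        }
        double[] sorted = samples.clone(); // do not reorder the caller's array
        Arrays.sort(sorted);
        int rank = (int) Math.ceil(p * sorted.length / 100.0); // 1-based rank
        return sorted[rank - 1];
    }

    public static void main(String[] args) {
        double[] responseTimesMs = {120, 95, 110, 1500, 130, 105, 98, 140, 102, 115};
        System.out.println("p50 = " + percentile(responseTimesMs, 50) + " ms");
        System.out.println("p95 = " + percentile(responseTimesMs, 95) + " ms");
    }
}
```

<div style="text-align: justify;">
In this made-up sample a single slow request (1500 ms) barely moves the median but dominates the p95, which is why tail percentiles are worth tracking separately.</div>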
<i><span style="color: #4c1130;"><br /></span></i>
<span style="background-color: #fefefe; font-family: "georgia" , "palatino" , "palatino linotype" , "times" , "times new roman" , serif; font-size: 16px;">low latency - achieving a short response time - is the most interesting aspect of performance, because it has a strong connection with physical (rather than financial) limitations.</span><br />
<span style="background-color: #fefefe; font-family: "georgia" , "palatino" , "palatino linotype" , "times" , "times new roman" , serif; font-size: 16px;"><br /></span>
<span style="background-color: #fefefe; color: #4c1130; font-family: "georgia" , "palatino" , "palatino linotype" , "times" , "times new roman" , serif; font-size: 16px;"><b>In a distributed system, there is a minimum latency that cannot be overcome: the speed of light limits how fast information can travel, and hardware components have a minimum latency cost incurred per operation (think RAM and hard drives but also CPUs).</b></span><br />
<h3 style="text-align: left;">
</h3>
<h3 style="text-align: left;">
<br /></h3>
<h3 style="text-align: left;">
Throughput</h3>
<div>
<div style="text-align: left;">
The number of requests or records which can be processed per second, or the total time it takes to run a job on a dataset of a certain size.</div>
<div style="text-align: left;">
<br /></div>
<div style="text-align: left;">
<span style="background-color: #fefefe; font-family: "georgia" , "palatino" , "palatino linotype" , "times" , "times new roman" , serif; font-size: 16px;">There are tradeoffs involved in optimizing for any of these outcomes. For example, a system may achieve higher throughput by processing larger batches of work thereby reducing operational overhead. The tradeoff would be longer response times for individual pieces of work due to batching.</span></div>
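<div style="text-align: left;">
The batching tradeoff can be put into numbers with a very simple cost model: every batch pays a fixed overhead plus some work per item. The model and the figures below are assumptions chosen for illustration, not measurements.</div>

```java
// Sketch: simple cost model for batching (fixed overhead + per-item work).
class BatchingTradeoffSketch {

    // Time to process one batch, which is also how long the last item in it waits.
    static double batchTimeMs(int batchSize, double overheadMs, double perItemMs) {
        return overheadMs + batchSize * perItemMs;
    }

    // Items processed per second when batches are processed back to back.
    static double throughputPerSec(int batchSize, double overheadMs, double perItemMs) {
        return batchSize * 1000.0 / batchTimeMs(batchSize, overheadMs, perItemMs);
    }

    public static void main(String[] args) {
        double overheadMs = 10.0; // assumed fixed cost per batch
        double perItemMs = 1.0;   // assumed cost per item
        for (int size : new int[] {1, 10, 100}) {
            System.out.printf("batch=%d time=%.1f ms throughput=%.1f items/s%n",
                    size,
                    batchTimeMs(size, overheadMs, perItemMs),
                    throughputPerSec(size, overheadMs, perItemMs));
        }
    }
}
```

<div style="text-align: left;">
Under these assumptions, going from a batch of 1 to a batch of 100 raises throughput roughly tenfold, but each batch now takes 110 ms instead of 11 ms.</div>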
<div style="text-align: left;">
<span style="background-color: #fefefe; font-family: "georgia" , "palatino" , "palatino linotype" , "times" , "times new roman" , serif; font-size: 16px;"><br /></span></div>
<h3 style="text-align: left;">
Resource Utilization</h3>
<div>
We want the optimal usage of the hardware resources which includes CPU, RAM, Network bandwidth. Or, in other words, do more with fewer resources. This will help in the scaling of the system. </div>
<div style="text-align: left;">
<br /></div>
</div>
</div>
</div>
siddheshwarhttp://www.blogger.com/profile/06138213414248415451noreply@blogger.com0tag:blogger.com,1999:blog-5276776697364295662.post-9086503517248189502019-07-20T07:38:00.000-07:002019-07-20T07:38:00.291-07:00Good practices for Accessing Couchbase Programmatically<div dir="ltr" style="text-align: left;" trbidi="on">
<div style="text-align: justify;">
Couchbase is a popular distributed, fault-tolerant, and highly performant database that works both as a document-oriented store and as a key-value store. Like any other database (relational or NoSQL), Couchbase provides language-specific client SDKs to access the DB and perform CRUD operations. You can also access Couchbase through the command-line tool, <b>cbc</b>, or from the Couchbase web console. </div>
<br />
This post will focus on some of the good practices for accessing Couchbase. I will use the Java SDK in this post, though the concepts apply to any language.<br />
<br />
<h3 style="text-align: left;">
Initialize the Connection</h3>
<div>
All operations against a Couchbase cluster are performed through a <i>Bucket</i> instance. From a relational-world perspective, a <i>Bucket</i> is the database instance. To create a Bucket instance, we first need an instance of the <i>Cluster</i>. </div>
<div>
<br /></div>
<div style="text-align: justify;">
<span style="color: #741b47;"><i>The important point to be noted is that we should only create <b>ONE</b> connection to the Couchbase cluster and <b>ONE</b> connection to <b>each bucket</b>, and then statically reference those connections for use across the application. Reusing the connections will ensure that underlying resources are utilized to the fullest. </i></span></div>
<div style="text-align: justify;">
<span style="color: #741b47;"><i><br /></i></span></div>
<div style="text-align: justify;">
<div style="font-family: Arial, Tahoma, Helvetica, FreeSans, sans-serif; font-size: 13.199999809265137px; text-align: left;">
<pre style="background-color: #eff0f1; border: 0px; font-size: 13px; margin-bottom: 1em; max-height: 600px; overflow: auto; padding: 5px; width: auto; word-wrap: normal;"><div style="font-size: medium; white-space: normal;">
<span style="color: #444444; font-family: -webkit-standard;">// connects to cluster running on localhost</span></div>
<div style="font-size: medium; white-space: normal;">
<span style="font-family: -webkit-standard;"><span style="color: #4c1130;">Cluster cluster = CouchbaseCluster.create(); </span></span></div>
<div style="color: #222222; font-size: medium; white-space: normal;">
<span style="color: black; font-family: -webkit-standard;">
</span></div>
<div style="font-size: medium; white-space: normal;">
<span style="font-family: -webkit-standard;"><span style="color: #444444;">// Connects cluster on 10.0.0.1 and if it fails then tries 10.0.0.2</span></span></div>
<span style="font-size: small; white-space: normal;"><span style="color: #4c1130;">Cluster cluster = CouchbaseCluster.create("10.0.0.1", "10.0.0.2");</span></span><span style="color: #222222;">
</span></pre>
</div>
</div>
<div>
<br /></div>
Now that the Cluster instance is created, we can create a Bucket instance to complete the initialization.<br />
<br />
<div style="text-align: justify;">
<div style="text-align: left;">
<pre style="background-color: #eff0f1; border: 0px; font-family: Arial, Tahoma, Helvetica, FreeSans, sans-serif; font-size: 13px; margin-bottom: 1em; max-height: 600px; overflow: auto; padding: 5px; width: auto; word-wrap: normal;"><div style="font-size: medium; white-space: normal;">
<span style="color: #444444; font-family: -webkit-standard;">// Opens the default bucket</span></div>
<div style="font-size: medium; white-space: normal;">
<span style="font-family: -webkit-standard;"><span style="color: #4c1130;">Bucket bucket = cluster.openBucket(); </span></span></div>
<div style="font-size: medium; white-space: normal;">
<span style="font-family: -webkit-standard;"><span style="color: #4c1130;">
</span></span></div>
<div style="font-size: medium; white-space: normal;">
<pre style="border: 0px; font-size: 13px; margin-bottom: 1em; max-height: 600px; overflow: auto; padding: 5px; width: auto; word-wrap: normal;"><div style="font-size: medium; white-space: normal;">
<span style="color: #444444; font-family: "arial" , "tahoma" , "helvetica" , "freesans" , sans-serif;">
</span>
<span style="color: #444444; font-family: "arial" , "tahoma" , "helvetica" , "freesans" , sans-serif;">// Opens connections to demo bucket</span></div>
<span style="color: #4c1130; font-size: small; white-space: normal;">Bucket bucket = cluster.openBucket("demo"); </span></pre>
</div>
<div style="font-size: medium; white-space: normal;">
<span style="font-family: -webkit-standard;"><span style="color: #444444;">// Opens connections to SECURED demo bucket</span></span></div>
<span style="color: #4c1130; font-size: small; white-space: normal;">Bucket bucket = cluster.openBucket("demo", "p@ssword"); </span><span style="color: #222222;">
</span></pre>
<div style="font-family: Arial, Tahoma, Helvetica, FreeSans, sans-serif; font-size: 13.199999809265137px;">
<span style="font-size: small; white-space: normal;"><span style="color: #4c1130;"><br /></span></span></div>
<div style="font-family: Arial, Tahoma, Helvetica, FreeSans, sans-serif; font-size: 13.199999809265137px;">
<span style="font-size: small; white-space: normal;"><span style="color: #4c1130;"><b>Tips#</b></span></span></div>
<div>
<span style="color: #4c1130;"><i>It's good practice to pass at least two IP addresses to the create method. If the first host is down at the time of this call, the second host will be tried. These IPs are used only during initialization. If you pass only one host and it's down, you are out of luck and the bucket instance will not be created!</i></span><br />
<span style="color: #4c1130;"><i><br /></i></span></div>
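<div>
The "create once, reference statically" advice from earlier in this post can be sketched with the initialization-on-demand holder idiom. To keep the example runnable without the Couchbase SDK, <i>FakeConnection</i> is a hypothetical stand-in for an expensive handle such as a Cluster or Bucket; none of the names below come from the SDK.</div>

```java
import java.util.concurrent.atomic.AtomicInteger;

// Sketch: one shared, lazily created connection for the whole application.
class SharedConnectionSketch {

    // Hypothetical stand-in for an expensive-to-create handle (e.g. a Cluster or Bucket).
    static final class FakeConnection {
        static final AtomicInteger CREATED = new AtomicInteger(); // counts constructions
        final String[] seedHosts;

        FakeConnection(String... seedHosts) {
            this.seedHosts = seedHosts;
            CREATED.incrementAndGet();
        }
    }

    // The JVM initializes Holder exactly once, on first access, in a thread-safe way.
    private static final class Holder {
        static final FakeConnection INSTANCE = new FakeConnection("10.0.0.1", "10.0.0.2");
    }

    // Every caller gets the same instance.
    static FakeConnection connection() {
        return Holder.INSTANCE;
    }

    public static void main(String[] args) {
        FakeConnection first = connection();
        FakeConnection second = connection();
        System.out.println(first == second);              // prints true
        System.out.println(FakeConnection.CREATED.get()); // prints 1
    }
}
```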
<div style="font-family: arial, tahoma, helvetica, freesans, sans-serif;">
<div style="font-size: 13.199999809265137px;">
<span style="font-size: small; white-space: normal;"><span style="color: #4c1130;"><b><br /></b></span></span></div>
<div style="font-size: 13.199999809265137px;">
<span style="font-size: small; white-space: normal;"><span style="color: #4c1130;"><b><br /></b></span></span></div>
<span style="color: #4c1130;"><b>References:</b></span><br />
<a href="https://blog.couchbase.com/10-things-developers-should-know-about-couchbase/">https://blog.couchbase.com/10-things-developers-should-know-about-couchbase/</a></div>
</div>
</div>
</div>
siddheshwarhttp://www.blogger.com/profile/06138213414248415451noreply@blogger.com0tag:blogger.com,1999:blog-5276776697364295662.post-78482572090935753832019-07-20T07:29:00.002-07:002019-07-20T07:29:39.376-07:00Thoughts on GC friendly programming<div dir="ltr" style="text-align: left;" trbidi="on">
<div style="text-align: justify;">
Garbage collection in Java is triggered automatically to reclaim memory occupied by objects that are no longer in use. The HotSpot VM divides the heap into different memory segments to optimize the garbage collection cycle. It mainly separates objects into two segments: the young generation and the old generation.<br />
<br />
Objects are initially created in the young gen. The young gen is quite small, and minor garbage collections run on it. Objects that survive enough minor GCs get moved (promoted) to the old gen. So it's better to use short-lived, immutable objects than long-lived mutable ones.<br />
<br />
Minor GC is quite fast (as it runs on a smaller memory segment) and hence less disruptive. The ideal scenario is that the GC never needs to compact the old gen. So if full GCs can be avoided, you will achieve the best performance.<br />
<br />
So a lot depends on how you have configured your heap memory; another important factor is whether you write code with these aspects in mind.<br />
<br />
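To see how often collections actually run in your own application, the JVM exposes per-collector counters through the standard java.lang.management API. Here is a small sketch; the class and method names are mine, and the collector names you see depend on the JVM and the GC algorithm in use.<br />

```java
import java.lang.management.GarbageCollectorMXBean;
import java.lang.management.ManagementFactory;
import java.util.ArrayList;
import java.util.List;

// Sketch: reports how many collections each garbage collector has performed so far.
class GcStatsSketch {

    // One line per collector; names and counts vary by JVM and GC settings.
    static List<String> gcSummary() {
        List<String> lines = new ArrayList<>();
        for (GarbageCollectorMXBean gc : ManagementFactory.getGarbageCollectorMXBeans()) {
            lines.add(gc.getName()
                    + ": count=" + gc.getCollectionCount()
                    + ", timeMs=" + gc.getCollectionTime());
        }
        return lines;
    }

    public static void main(String[] args) {
        long allocatedBytes = 0;
        for (int i = 0; i < 200_000; i++) {
            byte[] shortLived = new byte[1024]; // becomes garbage right after this iteration
            allocatedBytes += shortLived.length;
        }
        System.out.println("allocated bytes: " + allocatedBytes);
        for (String line : gcSummary()) {
            System.out.println(line);
        }
    }
}
```

If the counter of the old-generation collector keeps climbing under normal load, that is usually a hint that objects live longer than they should.<br />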
<a href="http://www.ibm.com/developerworks/library/j-leaks/">http://www.ibm.com/developerworks/library/j-leaks/</a><br />
<a href="http://stackoverflow.com/questions/6470651/creating-a-memory-leak-with-java/6471947#6471947">http://stackoverflow.com/questions/6470651/creating-a-memory-leak-with-java/6471947#6471947</a></div>
</div>
siddheshwarhttp://www.blogger.com/profile/06138213414248415451noreply@blogger.com0tag:blogger.com,1999:blog-5276776697364295662.post-531632574012136412019-07-20T07:24:00.000-07:002019-07-20T07:25:55.071-07:00Graph DFS traversal<div dir="ltr" style="text-align: left;" trbidi="on">
<div style="background-color: white; color: #222222; font-family: Arial, Tahoma, Helvetica, FreeSans, sans-serif; font-size: 13px; line-height: 18.479999542236328px;">
In this post, I will focus on depth-first traversal. </div>
<br style="background-color: white; color: #222222; font-family: Arial, Tahoma, Helvetica, FreeSans, sans-serif; font-size: 13px; line-height: 18.479999542236328px; text-align: justify;" />
<span style="background-color: white; color: #222222; font-family: "arial" , "tahoma" , "helvetica" , "freesans" , sans-serif; font-size: 13px; line-height: 18.479999542236328px; text-align: justify;">The difference between BFS and DFS traversal lies in the order in which vertices are explored, and this mainly comes from the data structure used for the traversal: BFS uses a queue whereas DFS uses a stack. The diagram below illustrates the order of traversal. </span><br />
<span style="background-color: white; color: #222222; font-family: "arial" , "tahoma" , "helvetica" , "freesans" , sans-serif; font-size: 13px; line-height: 18.479999542236328px; text-align: justify;"><br /></span>
<br />
<table align="center" cellpadding="0" cellspacing="0" class="tr-caption-container" style="margin-left: auto; margin-right: auto; text-align: center;"><tbody>
<tr><td style="text-align: center;"><a href="https://blogger.googleusercontent.com/img/b/R29vZ2xl/AVvXsEj43-nOrC9pUAIstJLQPGwSo2sCh9oEs5WmNYWjLp6u4J-PFQIA0hJhgZjXUQK8e4Izk359eRcoIxKwt7Nl59Xaj3v5AGozuafxhs5UfO8uI5yY3_5CvVPcJYNOsACo32pTe4mbZY9ePpQ/s1600/DFSProgress.png" imageanchor="1" style="margin-left: auto; margin-right: auto;"><img border="0" height="396" src="https://blogger.googleusercontent.com/img/b/R29vZ2xl/AVvXsEj43-nOrC9pUAIstJLQPGwSo2sCh9oEs5WmNYWjLp6u4J-PFQIA0hJhgZjXUQK8e4Izk359eRcoIxKwt7Nl59Xaj3v5AGozuafxhs5UfO8uI5yY3_5CvVPcJYNOsACo32pTe4mbZY9ePpQ/s1600/DFSProgress.png" width="640" /></a></td></tr>
<tr><td class="tr-caption" style="text-align: center;">DFS Progress</td></tr>
</tbody></table>
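<div style="text-align: justify;">
To show that the data structure really is the main difference, here is a small standalone sketch in which the same loop performs BFS or DFS depending only on which end of a Deque the next vertex is taken from. It is independent of the implementation later in this post; the class name, the adjacency-map representation, and the sample graph are assumptions made for illustration.</div>

```java
import java.util.ArrayDeque;
import java.util.ArrayList;
import java.util.Arrays;
import java.util.Collections;
import java.util.Deque;
import java.util.HashMap;
import java.util.LinkedHashSet;
import java.util.List;
import java.util.Map;
import java.util.Set;

// Sketch: one traversal loop; the frontier is used as a queue (BFS) or as a stack (DFS).
class FrontierSketch {

    static List<Integer> traverse(Map<Integer, List<Integer>> graph, int source, boolean useStack) {
        Deque<Integer> frontier = new ArrayDeque<>();
        Set<Integer> visited = new LinkedHashSet<>(); // keeps the visiting order
        frontier.addLast(source);
        while (!frontier.isEmpty()) {
            // Stack: take the newest vertex. Queue: take the oldest vertex.
            int vertex = useStack ? frontier.pollLast() : frontier.pollFirst();
            if (!visited.add(vertex)) {
                continue; // already visited through another path
            }
            for (int neighbor : graph.getOrDefault(vertex, Collections.emptyList())) {
                if (!visited.contains(neighbor)) {
                    frontier.addLast(neighbor);
                }
            }
        }
        return new ArrayList<>(visited);
    }

    // Undirected sample graph with edges 1-2, 1-3, 2-4, 3-4.
    static Map<Integer, List<Integer>> sampleGraph() {
        Map<Integer, List<Integer>> graph = new HashMap<>();
        graph.put(1, Arrays.asList(2, 3));
        graph.put(2, Arrays.asList(1, 4));
        graph.put(3, Arrays.asList(1, 4));
        graph.put(4, Arrays.asList(2, 3));
        return graph;
    }

    public static void main(String[] args) {
        System.out.println("BFS: " + traverse(sampleGraph(), 1, false)); // prints BFS: [1, 2, 3, 4]
        System.out.println("DFS: " + traverse(sampleGraph(), 1, true));  // prints DFS: [1, 3, 4, 2]
    }
}
```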
<!-- HTML generated using hilite.me -->
<br />
<h3 style="text-align: justify;">
DFS Implementation</h3>
<span style="background-color: white; color: #222222; font-family: "arial" , "tahoma" , "helvetica" , "freesans" , sans-serif; font-size: 13px; line-height: 18.479999542236328px; text-align: justify;">The DFS implementation uses UndirectedGraph.java from the BFS </span><a href="http://geekrai.blogspot.in/2014/08/graph-breadth-first-traversal.html" style="background-color: white; color: #888888; font-family: Arial, Tahoma, Helvetica, FreeSans, sans-serif; font-size: 13px; line-height: 18.479999542236328px; text-align: justify; text-decoration: none;">post</a><span style="background-color: white; color: #222222; font-family: "arial" , "tahoma" , "helvetica" , "freesans" , sans-serif; font-size: 13px; line-height: 18.479999542236328px; text-align: justify;">. The class below has two implementations of DFS traversal (recursive and iterative). The method </span><i style="background-color: white; color: #222222; font-family: Arial, Tahoma, Helvetica, FreeSans, sans-serif; font-size: 13px; line-height: 18.479999542236328px; text-align: justify;">traverse(..)</i><span style="background-color: white; color: #222222; font-family: "arial" , "tahoma" , "helvetica" , "freesans" , sans-serif; font-size: 13px; line-height: 18.479999542236328px; text-align: justify;"> is the stack-based implementation, whereas </span><i style="background-color: white; color: #222222; font-family: Arial, Tahoma, Helvetica, FreeSans, sans-serif; font-size: 13px; line-height: 18.479999542236328px; text-align: justify;">traverseRecur(..)</i><span style="background-color: white; color: #222222; font-family: "arial" , "tahoma" , "helvetica" , "freesans" , sans-serif; font-size: 13px; line-height: 18.479999542236328px; text-align: justify;"> is the recursive one.</span><br />
<br />
<div style="background: #f0f0f0; border-width: .1em .1em .1em .3em; border: solid gray; overflow: auto; padding: .2em .3em; width: auto;">
<pre style="line-height: 125%; margin: 0;"><span style="color: #007020; font-weight: bold;">package</span> graph<span style="color: #666666;">.</span><span style="color: #4070a0;">algo</span><span style="color: #666666;">;</span>
<span style="color: #007020; font-weight: bold;">import</span> <span style="color: #0e84b5; font-weight: bold;">java.util.ArrayDeque</span><span style="color: #666666;">;</span>
<span style="color: #007020; font-weight: bold;">import</span> <span style="color: #0e84b5; font-weight: bold;">java.util.Deque</span><span style="color: #666666;">;</span>
<span style="color: #007020; font-weight: bold;">import</span> <span style="color: #0e84b5; font-weight: bold;">java.util.LinkedHashSet</span><span style="color: #666666;">;</span>
<span style="color: #007020; font-weight: bold;">import</span> <span style="color: #0e84b5; font-weight: bold;">java.util.List</span><span style="color: #666666;">;</span>
<span style="color: #007020; font-weight: bold;">import</span> <span style="color: #0e84b5; font-weight: bold;">java.util.Objects</span><span style="color: #666666;">;</span>
<span style="color: #007020; font-weight: bold;">import</span> <span style="color: #0e84b5; font-weight: bold;">java.util.Set</span><span style="color: #666666;">;</span>
<span style="color: #60a0b0; font-style: italic;">/**</span>
<span style="color: #60a0b0; font-style: italic;"> * Depth-First Traversal of a graph represented as adjacency list</span>
<span style="color: #60a0b0; font-style: italic;"> * </span>
<span style="color: #60a0b0; font-style: italic;"> * @author Siddheshwar</span>
<span style="color: #60a0b0; font-style: italic;"> * </span>
<span style="color: #60a0b0; font-style: italic;"> */</span>
<span style="color: #007020; font-weight: bold;">public</span> <span style="color: #007020; font-weight: bold;">class</span> <span style="color: #0e84b5; font-weight: bold;">DFS</span><span style="color: #666666;"><</span>V<span style="color: #666666;">></span> <span style="color: #666666;">{</span>
<span style="color: #007020; font-weight: bold;">private</span> Graph<span style="color: #666666;"><</span>V<span style="color: #666666;">></span> graph<span style="color: #666666;">;</span> <span style="color: #60a0b0; font-style: italic;">// graph on which DFS will be performed</span>
<span style="color: #007020; font-weight: bold;">private</span> Deque<span style="color: #666666;"><</span>V<span style="color: #666666;">></span> stack<span style="color: #666666;">;</span> <span style="color: #60a0b0; font-style: italic;">// Deque used to implement stack</span>
<span style="color: #007020; font-weight: bold;">private</span> Set<span style="color: #666666;"><</span>V<span style="color: #666666;">></span> visited<span style="color: #666666;">;</span> <span style="color: #60a0b0; font-style: italic;">// stores set of visited nodes</span>
<span style="color: #007020; font-weight: bold;">public</span> <span style="color: #06287e;">DFS</span><span style="color: #666666;">(</span>Graph<span style="color: #666666;"><</span>V<span style="color: #666666;">></span> graph<span style="color: #666666;">)</span> <span style="color: #666666;">{</span>
<span style="color: #007020; font-weight: bold;">this</span><span style="color: #666666;">.</span><span style="color: #4070a0;">graph</span> <span style="color: #666666;">=</span> graph<span style="color: #666666;">;</span>
stack <span style="color: #666666;">=</span> <span style="color: #007020; font-weight: bold;">new</span> ArrayDeque<span style="color: #666666;"><>();</span>
visited <span style="color: #666666;">=</span> <span style="color: #007020; font-weight: bold;">new</span> LinkedHashSet<span style="color: #666666;"><>();</span> <span style="color: #60a0b0; font-style: italic;">// to maintain the insertion order</span>
<span style="color: #666666;">}</span>
<span style="color: #60a0b0; font-style: italic;">/**</span>
<span style="color: #60a0b0; font-style: italic;"> * Iterative/stack based DFS implementation</span>
<span style="color: #60a0b0; font-style: italic;"> * </span>
<span style="color: #60a0b0; font-style: italic;"> * @param source</span>
<span style="color: #60a0b0; font-style: italic;"> */</span>
<span style="color: #007020; font-weight: bold;">public</span> <span style="color: #902000;">void</span> <span style="color: #06287e;">traverse</span><span style="color: #666666;">(</span>V source<span style="color: #666666;">)</span> <span style="color: #666666;">{</span>
  Objects<span style="color: #666666;">.</span><span style="color: #4070a0;">requireNonNull</span><span style="color: #666666;">(</span>source<span style="color: #666666;">,</span> <span style="color: #4070a0;">"source is mandatory!"</span><span style="color: #666666;">);</span>
<span style="color: #007020; font-weight: bold;">if</span> <span style="color: #666666;">(</span><span style="color: #007020; font-weight: bold;">this</span><span style="color: #666666;">.</span><span style="color: #4070a0;">graph</span> <span style="color: #666666;">==</span> <span style="color: #007020; font-weight: bold;">null</span> <span style="color: #666666;">||</span> <span style="color: #007020; font-weight: bold;">this</span><span style="color: #666666;">.</span><span style="color: #4070a0;">graph</span><span style="color: #666666;">.</span><span style="color: #4070a0;">isEmpty</span><span style="color: #666666;">())</span> <span style="color: #666666;">{</span>
<span style="color: #007020; font-weight: bold;">throw</span> <span style="color: #007020; font-weight: bold;">new</span> <span style="color: #06287e;">IllegalStateException</span><span style="color: #666666;">(</span>
<span style="color: #4070a0;">"Valid graph object is required !!!"</span><span style="color: #666666;">);</span>
<span style="color: #666666;">}</span>
stack<span style="color: #666666;">.</span><span style="color: #4070a0;">push</span><span style="color: #666666;">(</span>source<span style="color: #666666;">);</span>
<span style="color: #007020; font-weight: bold;">this</span><span style="color: #666666;">.</span><span style="color: #4070a0;">markAsVisited</span><span style="color: #666666;">(</span>source<span style="color: #666666;">);</span>
<span style="color: #902000;">boolean</span> pop <span style="color: #666666;">=</span> <span style="color: #007020; font-weight: bold;">false</span><span style="color: #666666;">;</span>
V stackTopVertex<span style="color: #666666;">;</span>
System<span style="color: #666666;">.</span><span style="color: #4070a0;">out</span><span style="color: #666666;">.</span><span style="color: #4070a0;">print</span><span style="color: #666666;">(</span><span style="color: #4070a0;">" "</span> <span style="color: #666666;">+</span> source<span style="color: #666666;">);</span>
<span style="color: #007020; font-weight: bold;">while</span> <span style="color: #666666;">(!</span>stack<span style="color: #666666;">.</span><span style="color: #4070a0;">isEmpty</span><span style="color: #666666;">())</span> <span style="color: #666666;">{</span>
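<span style="color: #60a0b0; font-style: italic;">// the previous top had no unvisited neighbor: discard it and backtrack to its parent</span>
<span style="color: #007020; font-weight: bold;">if</span> <span style="color: #666666;">(</span>pop<span style="color: #666666;">)</span> <span style="color: #666666;">{</span>
stack<span style="color: #666666;">.</span><span style="color: #4070a0;">pop</span><span style="color: #666666;">();</span>
pop <span style="color: #666666;">=</span> <span style="color: #007020; font-weight: bold;">false</span><span style="color: #666666;">;</span>
<span style="color: #007020; font-weight: bold;">if</span> <span style="color: #666666;">(</span>stack<span style="color: #666666;">.</span><span style="color: #4070a0;">isEmpty</span><span style="color: #666666;">())</span> <span style="color: #007020; font-weight: bold;">break</span><span style="color: #666666;">;</span>
<span style="color: #666666;">}</span>
stackTopVertex <span style="color: #666666;">=</span> stack<span style="color: #666666;">.</span><span style="color: #4070a0;">peek</span><span style="color: #666666;">();</span>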
List<span style="color: #666666;"><</span>V<span style="color: #666666;">></span> neighbors <span style="color: #666666;">=</span> graph<span style="color: #666666;">.</span><span style="color: #4070a0;">getAdjacentVertices</span><span style="color: #666666;">(</span>stackTopVertex<span style="color: #666666;">);</span>
<span style="color: #007020; font-weight: bold;">if</span> <span style="color: #666666;">(!</span>neighbors<span style="color: #666666;">.</span><span style="color: #4070a0;">isEmpty</span><span style="color: #666666;">()</span>
<span style="color: #666666;">&&</span> hasUnvisitedNeighbor<span style="color: #666666;">(</span>neighbors<span style="color: #666666;">,</span> visited<span style="color: #666666;">))</span> <span style="color: #666666;">{</span>
<span style="color: #007020; font-weight: bold;">for</span> <span style="color: #666666;">(</span>V a <span style="color: #666666;">:</span> neighbors<span style="color: #666666;">)</span> <span style="color: #666666;">{</span>
<span style="color: #007020; font-weight: bold;">if</span> <span style="color: #666666;">(!</span><span style="color: #007020; font-weight: bold;">this</span><span style="color: #666666;">.</span><span style="color: #4070a0;">isVertexVisited</span><span style="color: #666666;">(</span>a<span style="color: #666666;">))</span> <span style="color: #666666;">{</span>
System<span style="color: #666666;">.</span><span style="color: #4070a0;">out</span><span style="color: #666666;">.</span><span style="color: #4070a0;">print</span><span style="color: #666666;">(</span><span style="color: #4070a0;">" "</span> <span style="color: #666666;">+</span> a<span style="color: #666666;">);</span>
visited<span style="color: #666666;">.</span><span style="color: #4070a0;">add</span><span style="color: #666666;">(</span>a<span style="color: #666666;">);</span>
stack<span style="color: #666666;">.</span><span style="color: #4070a0;">push</span><span style="color: #666666;">(</span>a<span style="color: #666666;">);</span>
<span style="color: #60a0b0; font-style: italic;">// break from loop if an unvisited neighbor is found</span>
<span style="color: #007020; font-weight: bold;">break</span><span style="color: #666666;">;</span>
<span style="color: #666666;">}</span>
<span style="color: #666666;">}</span>
<span style="color: #666666;">}</span> <span style="color: #007020; font-weight: bold;">else</span> <span style="color: #666666;">{</span>
<span style="color: #60a0b0; font-style: italic;">// if all neighbors are visited</span>
pop <span style="color: #666666;">=</span> <span style="color: #007020; font-weight: bold;">true</span><span style="color: #666666;">;</span>
<span style="color: #666666;">}</span>
<span style="color: #666666;">}</span>
<span style="color: #666666;">}</span>
<span style="color: #60a0b0; font-style: italic;">/**</span>
<span style="color: #60a0b0; font-style: italic;"> * Recursive implementation of DFS</span>
<span style="color: #60a0b0; font-style: italic;"> * </span>
<span style="color: #60a0b0; font-style: italic;"> * @param source</span>
<span style="color: #60a0b0; font-style: italic;"> */</span>
<span style="color: #007020; font-weight: bold;">public</span> <span style="color: #902000;">void</span> <span style="color: #06287e;">traverseRecur</span><span style="color: #666666;">(</span>V source<span style="color: #666666;">)</span> <span style="color: #666666;">{</span>
Objects<span style="color: #666666;">.</span><span style="color: #4070a0;">requireNonNull</span><span style="color: #666666;">(</span>source<span style="color: #666666;">,</span> <span style="color: #4070a0;">"source is mandatory!"</span><span style="color: #666666;">);</span>
<span style="color: #007020; font-weight: bold;">this</span><span style="color: #666666;">.</span><span style="color: #4070a0;">markAsVisited</span><span style="color: #666666;">(</span>source<span style="color: #666666;">);</span>
System<span style="color: #666666;">.</span><span style="color: #4070a0;">out</span><span style="color: #666666;">.</span><span style="color: #4070a0;">print</span><span style="color: #666666;">(</span><span style="color: #4070a0;">" "</span> <span style="color: #666666;">+</span> source<span style="color: #666666;">);</span>
<span style="color: #60a0b0; font-style: italic;">// get neighbors in sorted manner</span>
List<span style="color: #666666;"><</span>V<span style="color: #666666;">></span> neighbors <span style="color: #666666;">=</span> <span style="color: #007020; font-weight: bold;">this</span><span style="color: #666666;">.</span><span style="color: #4070a0;">graph</span><span style="color: #666666;">.</span><span style="color: #4070a0;">getAdjacentVertices</span><span style="color: #666666;">(</span>source<span style="color: #666666;">);</span>
<span style="color: #007020; font-weight: bold;">for</span> <span style="color: #666666;">(</span>V n <span style="color: #666666;">:</span> neighbors<span style="color: #666666;">)</span> <span style="color: #666666;">{</span>
<span style="color: #007020; font-weight: bold;">if</span> <span style="color: #666666;">(!</span><span style="color: #007020; font-weight: bold;">this</span><span style="color: #666666;">.</span><span style="color: #4070a0;">isVertexVisited</span><span style="color: #666666;">(</span>n<span style="color: #666666;">))</span> <span style="color: #666666;">{</span>
traverseRecur<span style="color: #666666;">(</span>n<span style="color: #666666;">);</span>
<span style="color: #666666;">}</span>
<span style="color: #666666;">}</span>
<span style="color: #666666;">}</span>
<span style="color: #60a0b0; font-style: italic;">/**</span>
<span style="color: #60a0b0; font-style: italic;"> * Checks whether any of the neighbors is still unvisited</span>
<span style="color: #60a0b0; font-style: italic;"> * </span>
<span style="color: #60a0b0; font-style: italic;"> * @param neighbors</span>
<span style="color: #60a0b0; font-style: italic;"> * @param visited</span>
<span style="color: #60a0b0; font-style: italic;"> * @return</span>
<span style="color: #60a0b0; font-style: italic;"> */</span>
<span style="color: #007020; font-weight: bold;">private</span> <span style="color: #902000;">boolean</span> <span style="color: #06287e;">hasUnvisitedNeighbor</span><span style="color: #666666;">(</span>List<span style="color: #666666;"><</span>V<span style="color: #666666;">></span> neighbors<span style="color: #666666;">,</span> Set<span style="color: #666666;"><</span>V<span style="color: #666666;">></span> visited<span style="color: #666666;">)</span> <span style="color: #666666;">{</span>
<span style="color: #007020; font-weight: bold;">for</span> <span style="color: #666666;">(</span>V i <span style="color: #666666;">:</span> neighbors<span style="color: #666666;">)</span> <span style="color: #666666;">{</span>
<span style="color: #007020; font-weight: bold;">if</span> <span style="color: #666666;">(!</span>visited<span style="color: #666666;">.</span><span style="color: #4070a0;">contains</span><span style="color: #666666;">(</span>i<span style="color: #666666;">))</span>
<span style="color: #007020; font-weight: bold;">return</span> <span style="color: #007020; font-weight: bold;">true</span><span style="color: #666666;">;</span>
<span style="color: #666666;">}</span>
<span style="color: #007020; font-weight: bold;">return</span> <span style="color: #007020; font-weight: bold;">false</span><span style="color: #666666;">;</span>
<span style="color: #666666;">}</span>
<span style="color: #60a0b0; font-style: italic;">/**</span>
<span style="color: #60a0b0; font-style: italic;"> * Returns true if vertex is already visited</span>
<span style="color: #60a0b0; font-style: italic;"> * </span>
<span style="color: #60a0b0; font-style: italic;"> * @param i</span>
<span style="color: #60a0b0; font-style: italic;"> * @return</span>
<span style="color: #60a0b0; font-style: italic;"> */</span>
<span style="color: #007020; font-weight: bold;">private</span> <span style="color: #902000;">boolean</span> <span style="color: #06287e;">isVertexVisited</span><span style="color: #666666;">(</span>V i<span style="color: #666666;">)</span> <span style="color: #666666;">{</span>
<span style="color: #007020; font-weight: bold;">return</span> <span style="color: #007020; font-weight: bold;">this</span><span style="color: #666666;">.</span><span style="color: #4070a0;">visited</span><span style="color: #666666;">.</span><span style="color: #4070a0;">contains</span><span style="color: #666666;">(</span>i<span style="color: #666666;">);</span>
<span style="color: #666666;">}</span>
<span style="color: #60a0b0; font-style: italic;">/**</span>
<span style="color: #60a0b0; font-style: italic;"> * Mark a vertex visited</span>
<span style="color: #60a0b0; font-style: italic;"> * </span>
<span style="color: #60a0b0; font-style: italic;"> * @param i</span>
<span style="color: #60a0b0; font-style: italic;"> */</span>
<span style="color: #007020; font-weight: bold;">private</span> <span style="color: #902000;">void</span> <span style="color: #06287e;">markAsVisited</span><span style="color: #666666;">(</span>V i<span style="color: #666666;">)</span> <span style="color: #666666;">{</span>
<span style="color: #007020; font-weight: bold;">this</span><span style="color: #666666;">.</span><span style="color: #4070a0;">visited</span><span style="color: #666666;">.</span><span style="color: #4070a0;">add</span><span style="color: #666666;">(</span>i<span style="color: #666666;">);</span>
<span style="color: #666666;">}</span>
<span style="color: #60a0b0; font-style: italic;">// test method</span>
<span style="color: #007020; font-weight: bold;">public</span> <span style="color: #007020; font-weight: bold;">static</span> <span style="color: #902000;">void</span> <span style="color: #06287e;">main</span><span style="color: #666666;">(</span>String<span style="color: #666666;">[]</span> args<span style="color: #666666;">)</span> <span style="color: #666666;">{</span>
Graph<span style="color: #666666;"><</span>Integer<span style="color: #666666;">></span> graph <span style="color: #666666;">=</span> <span style="color: #007020; font-weight: bold;">new</span> Graph<span style="color: #666666;"><>();</span>
graph<span style="color: #666666;">.</span><span style="color: #4070a0;">addEdge</span><span style="color: #666666;">(</span><span style="color: #40a070;">1</span><span style="color: #666666;">,</span> <span style="color: #40a070;">2</span><span style="color: #666666;">);</span>
graph<span style="color: #666666;">.</span><span style="color: #4070a0;">addEdge</span><span style="color: #666666;">(</span><span style="color: #40a070;">1</span><span style="color: #666666;">,</span> <span style="color: #40a070;">5</span><span style="color: #666666;">);</span>
graph<span style="color: #666666;">.</span><span style="color: #4070a0;">addEdge</span><span style="color: #666666;">(</span><span style="color: #40a070;">1</span><span style="color: #666666;">,</span> <span style="color: #40a070;">6</span><span style="color: #666666;">);</span>
graph<span style="color: #666666;">.</span><span style="color: #4070a0;">addEdge</span><span style="color: #666666;">(</span><span style="color: #40a070;">2</span><span style="color: #666666;">,</span> <span style="color: #40a070;">3</span><span style="color: #666666;">);</span>
graph<span style="color: #666666;">.</span><span style="color: #4070a0;">addEdge</span><span style="color: #666666;">(</span><span style="color: #40a070;">2</span><span style="color: #666666;">,</span> <span style="color: #40a070;">5</span><span style="color: #666666;">);</span>
graph<span style="color: #666666;">.</span><span style="color: #4070a0;">addEdge</span><span style="color: #666666;">(</span><span style="color: #40a070;">3</span><span style="color: #666666;">,</span> <span style="color: #40a070;">4</span><span style="color: #666666;">);</span>
graph<span style="color: #666666;">.</span><span style="color: #4070a0;">addEdge</span><span style="color: #666666;">(</span><span style="color: #40a070;">5</span><span style="color: #666666;">,</span> <span style="color: #40a070;">4</span><span style="color: #666666;">);</span>
<span style="color: #60a0b0; font-style: italic;">// for undirected graph</span>
graph<span style="color: #666666;">.</span><span style="color: #4070a0;">addEdge</span><span style="color: #666666;">(</span><span style="color: #40a070;">2</span><span style="color: #666666;">,</span> <span style="color: #40a070;">1</span><span style="color: #666666;">);</span>
graph<span style="color: #666666;">.</span><span style="color: #4070a0;">addEdge</span><span style="color: #666666;">(</span><span style="color: #40a070;">5</span><span style="color: #666666;">,</span> <span style="color: #40a070;">1</span><span style="color: #666666;">);</span>
graph<span style="color: #666666;">.</span><span style="color: #4070a0;">addEdge</span><span style="color: #666666;">(</span><span style="color: #40a070;">6</span><span style="color: #666666;">,</span> <span style="color: #40a070;">1</span><span style="color: #666666;">);</span>
graph<span style="color: #666666;">.</span><span style="color: #4070a0;">addEdge</span><span style="color: #666666;">(</span><span style="color: #40a070;">3</span><span style="color: #666666;">,</span> <span style="color: #40a070;">2</span><span style="color: #666666;">);</span>
graph<span style="color: #666666;">.</span><span style="color: #4070a0;">addEdge</span><span style="color: #666666;">(</span><span style="color: #40a070;">5</span><span style="color: #666666;">,</span> <span style="color: #40a070;">2</span><span style="color: #666666;">);</span>
graph<span style="color: #666666;">.</span><span style="color: #4070a0;">addEdge</span><span style="color: #666666;">(</span><span style="color: #40a070;">4</span><span style="color: #666666;">,</span> <span style="color: #40a070;">3</span><span style="color: #666666;">);</span>
graph<span style="color: #666666;">.</span><span style="color: #4070a0;">addEdge</span><span style="color: #666666;">(</span><span style="color: #40a070;">4</span><span style="color: #666666;">,</span> <span style="color: #40a070;">5</span><span style="color: #666666;">);</span>
System<span style="color: #666666;">.</span><span style="color: #4070a0;">out</span><span style="color: #666666;">.</span><span style="color: #4070a0;">print</span><span style="color: #666666;">(</span><span style="color: #4070a0;">"DFS -->"</span><span style="color: #666666;">);</span>
DFS<span style="color: #666666;"><</span>Integer<span style="color: #666666;">></span> dfs <span style="color: #666666;">=</span> <span style="color: #007020; font-weight: bold;">new</span> DFS<span style="color: #666666;"><>(</span>graph<span style="color: #666666;">);</span>
<span style="color: #60a0b0; font-style: italic;">/**</span>
<span style="color: #60a0b0; font-style: italic;"> * stack based DFS traversal</span>
<span style="color: #60a0b0; font-style: italic;"> */</span>
dfs<span style="color: #666666;">.</span><span style="color: #4070a0;">traverse</span><span style="color: #666666;">(</span><span style="color: #40a070;">1</span><span style="color: #666666;">);</span>
<span style="color: #60a0b0; font-style: italic;">/**</span>
<span style="color: #60a0b0; font-style: italic;"> * Recursive DFS traversal</span>
<span style="color: #60a0b0; font-style: italic;"> */</span>
<span style="color: #60a0b0; font-style: italic;">// dfs.traverseRecur(1);</span>
<span style="color: #60a0b0; font-style: italic;">// validation</span>
<span style="color: #60a0b0; font-style: italic;">/**</span>
<span style="color: #60a0b0; font-style: italic;"> * after traversal; stack should be empty. And visited should have</span>
<span style="color: #60a0b0; font-style: italic;"> * vertices in DFS order</span>
<span style="color: #60a0b0; font-style: italic;"> */</span>
System<span style="color: #666666;">.</span><span style="color: #4070a0;">out</span><span style="color: #666666;">.</span><span style="color: #4070a0;">println</span><span style="color: #666666;">(</span><span style="color: #4070a0;">"\nIs Stack is empty :"</span>
<span style="color: #666666;">+</span> <span style="color: #666666;">(</span>dfs<span style="color: #666666;">.</span><span style="color: #4070a0;">stack</span><span style="color: #666666;">.</span><span style="color: #4070a0;">isEmpty</span><span style="color: #666666;">()</span> <span style="color: #666666;">?</span> <span style="color: #4070a0;">"yes"</span> <span style="color: #666666;">:</span> <span style="color: #4070a0;">"no"</span><span style="color: #666666;">));</span>
System<span style="color: #666666;">.</span><span style="color: #4070a0;">out</span><span style="color: #666666;">.</span><span style="color: #4070a0;">println</span><span style="color: #666666;">(</span><span style="color: #4070a0;">"visited :"</span> <span style="color: #666666;">+</span> dfs<span style="color: #666666;">.</span><span style="color: #4070a0;">visited</span><span style="color: #666666;">);</span>
<span style="color: #666666;">}</span>
<span style="color: #666666;">}</span>
</pre>
</div>
<br />
<b style="background-color: white; color: #222222; font-family: Arial, Tahoma, Helvetica, FreeSans, sans-serif; font-size: 13px; line-height: 18.479999542236328px; text-align: justify;">Output:</b><br />
<span style="background-color: white; color: #741b47; font-family: "arial" , "tahoma" , "helvetica" , "freesans" , sans-serif; font-size: 13px; line-height: 18.479999542236328px; text-align: justify;"><i>DFS --> 1 2 3 4 5 6</i></span><br />
<span style="background-color: white; color: #741b47; font-family: "arial" , "tahoma" , "helvetica" , "freesans" , sans-serif; font-size: 13px; line-height: 18.479999542236328px; text-align: justify;"><i>Is Stack is empty :yes</i></span><br />
<span style="background-color: white; color: #741b47; font-family: "arial" , "tahoma" , "helvetica" , "freesans" , sans-serif; font-size: 13px; line-height: 18.479999542236328px; text-align: justify;"><i>visited :[1, 2, 3, 4, 5, 6]</i></span><br />
<span style="background-color: white; color: #741b47; font-family: "arial" , "tahoma" , "helvetica" , "freesans" , sans-serif; font-size: 13px; line-height: 18.479999542236328px; text-align: justify;"><i><br /></i></span>
<br />
<div style="background-color: white; color: #222222; font-family: Arial, Tahoma, Helvetica, FreeSans, sans-serif; font-size: 13px; line-height: 18.479999542236328px; text-align: justify;">
<br /></div>
<h3 style="background-color: white; color: #222222; font-family: Arial, Tahoma, Helvetica, FreeSans, sans-serif; margin: 0px; position: relative; text-align: justify;">
Time Complexity</h3>
<div style="background-color: white; color: #222222; font-family: Arial, Tahoma, Helvetica, FreeSans, sans-serif; font-size: 13px; line-height: 18.479999542236328px; text-align: justify;">
<div>
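Assuming the adjacency-list implementation of Graph above, let |V| be the number of vertices and |E| the number of edges. </div>
<div>
DFS marks each vertex once and walks each adjacency list once, so the time complexity is O(|V| + |E|). This holds for the recursive version; the stack-based version above re-scans a vertex's neighbor list every time that vertex returns to the top of the stack, so it can do extra work on dense graphs.</div>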
</div>
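<div>
A stack-based traversal can stay within O(|V| + |E|) by keeping one neighbor iterator per vertex on the stack, so that no adjacency list is scanned twice. The sketch below is only an illustration under assumptions: the class name DfsSketch, the plain Map used as adjacency list and the helper addEdge are hypothetical and are not part of the Graph and DFS classes above.</div>

```java
import java.util.ArrayDeque;
import java.util.ArrayList;
import java.util.Collections;
import java.util.Deque;
import java.util.HashMap;
import java.util.HashSet;
import java.util.Iterator;
import java.util.List;
import java.util.Map;
import java.util.Set;

public class DfsSketch {

    /** Returns the vertices reachable from source in DFS order. */
    public static List<Integer> dfsOrder(Map<Integer, List<Integer>> adjacency, int source) {
        List<Integer> order = new ArrayList<>();
        Set<Integer> visited = new HashSet<>();
        // each stack entry is the neighbor iterator of a vertex on the current path
        Deque<Iterator<Integer>> stack = new ArrayDeque<>();

        visited.add(source);
        order.add(source);
        stack.push(adjacency.getOrDefault(source, Collections.emptyList()).iterator());

        while (!stack.isEmpty()) {
            Iterator<Integer> neighbors = stack.peek();
            if (!neighbors.hasNext()) {
                stack.pop(); // neighbors exhausted: backtrack
                continue;
            }
            int next = neighbors.next();
            if (visited.add(next)) { // add() returns true only for an unvisited vertex
                order.add(next);
                stack.push(adjacency.getOrDefault(next, Collections.emptyList()).iterator());
            }
        }
        return order;
    }

    /** Hypothetical helper: adds an undirected edge to the adjacency map. */
    public static void addEdge(Map<Integer, List<Integer>> adjacency, int a, int b) {
        adjacency.computeIfAbsent(a, k -> new ArrayList<>()).add(b);
        adjacency.computeIfAbsent(b, k -> new ArrayList<>()).add(a);
    }

    public static void main(String[] args) {
        Map<Integer, List<Integer>> g = new HashMap<>();
        addEdge(g, 1, 2);
        addEdge(g, 1, 5);
        addEdge(g, 1, 6);
        addEdge(g, 2, 3);
        addEdge(g, 2, 5);
        addEdge(g, 3, 4);
        addEdge(g, 5, 4);
        System.out.println(dfsOrder(g, 1)); // prints [1, 2, 3, 4, 5, 6]
    }
}
```

<div>
Because every iterator only moves forward, each edge is looked at a constant number of times, which is where the O(|V| + |E|) bound comes from.</div>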
<div style="background-color: white; color: #222222; font-family: Arial, Tahoma, Helvetica, FreeSans, sans-serif; font-size: 13px; line-height: 18.479999542236328px; text-align: justify;">
<br /></div>
<div style="background-color: white; color: #222222; font-family: Arial, Tahoma, Helvetica, FreeSans, sans-serif; font-size: 13px; line-height: 18.479999542236328px; text-align: justify;">
--<br />
happy learning !!!</div>
</div>
siddheshwarhttp://www.blogger.com/profile/06138213414248415451noreply@blogger.com0tag:blogger.com,1999:blog-5276776697364295662.post-50507691275211752862018-08-20T22:57:00.000-07:002018-08-21T18:25:27.602-07:00Scalability - Getting hang of Seconds<div dir="ltr" style="text-align: left;" trbidi="on">
<div style="text-align: justify;">
<span style="font-size: large;">This post lists some of the numbers that engineers most often need for back-of-the-envelope calculations. Here I have focused on <b>seconds</b>. If someone tells you that their service gets 1 million hits a day, don't get bogged down by the number; it just means roughly 10 requests per second.</span></div>
<span style="font-size: large;"><br /></span>
<br />
<h3 style="text-align: left;">
<span style="font-size: large;">Seconds</span></h3>
<div>
<span style="font-size: large;"><b># Seconds in a day</b> = 86400 (=24*60*60)</span></div>
<div>
<span style="font-size: large;">                                   = 0.864*10^5</span></div>
<div>
<span style="font-size: large;">                                   ≈ 10^5</span></div>
<div>
<span style="font-size: large;"> = <b><i>0.1 million</i></b></span></div>
<div>
<span style="font-size: large;"><br /></span></div>
<div>
<span style="font-size: large;"><b># Seconds in a month </b>= 2592000 (=30*86400)</span></div>
<div>
<span style="font-size: large;">                                        ≈ 2.6*10^6</span></div>
<div>
<span style="font-size: large;">                                        ≈ <b><i>2.6 million</i></b></span></div>
<div>
<span style="font-size: large;"><br /></span></div>
<div>
<i><span style="color: #741b47; font-size: large;">If an online site gets <b>10 million hits per day</b>, then on average it gets about <b>100 requests/sec.</b></span></i></div>
<div>
<i><span style="color: #741b47; font-size: large;">If an online site gets <b>10 million hits per month</b>, then on average it gets about <b>4 requests/sec.</b></span></i></div>
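<div>
<span style="font-size: large;">The same conversion can be written as a tiny helper. This is only a sketch of the arithmetic above; the class name RequestRate and the method perSecond are made up for illustration, and a month is taken as 30 days as in this post.</span></div>

```java
import java.util.Locale;

public class RequestRate {

    public static final long SECONDS_PER_DAY = 24L * 60 * 60;          // 86,400
    public static final long SECONDS_PER_MONTH = 30 * SECONDS_PER_DAY; // 2,592,000 (30-day month)

    /** Average requests per second for a number of hits spread evenly over a period. */
    public static double perSecond(long hits, long periodSeconds) {
        return (double) hits / periodSeconds;
    }

    public static void main(String[] args) {
        // exact values; the post rounds them to 10, 100 and 4
        System.out.printf(Locale.ROOT, "1M per day    = %.1f req/s%n", perSecond(1_000_000L, SECONDS_PER_DAY));    // 11.6
        System.out.printf(Locale.ROOT, "10M per day   = %.1f req/s%n", perSecond(10_000_000L, SECONDS_PER_DAY));   // 115.7
        System.out.printf(Locale.ROOT, "10M per month = %.1f req/s%n", perSecond(10_000_000L, SECONDS_PER_MONTH)); // 3.9
    }
}
```

<div>
<span style="font-size: large;">These are averages; real traffic has peaks, so capacity planning usually starts from such a figure and then applies a peak factor.</span></div>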
<div>
<i><span style="color: #741b47; font-size: large;"><b><br /></b></span></i></div>
<div>
<span style="font-size: large;"><b># Seconds in a year</b> = 31104000 (=12*2592000)</span></div>
<div>
<b><span style="font-size: large;">                                    ≈ <i>31 million</i></span></b></div>
<div>
<b><span style="font-size: large;">                                    ≈ π * 10^7</span></b></div>
<div>
<b><span style="font-size: large;"><br /></span></b></div>
<div>
<span style="font-size: large;"><b> </b>Even if we treat a year as 365.25 days, the number of seconds in a year is 31,557,600, which still rounds to about 31 million. </span></div>
<div>
<span style="font-size: large;"><br /></span></div>
<div>
<span style="font-size: large;"><b># Seconds in a century </b>= 3,155,760,000 seconds (considering 1 year = 365.25 days)</span></div>
<div>
<span style="font-size: large;"><b>                                         ≈ <i>3.16 billion</i></b><i> </i></span></div>
<div>
<span style="font-size: large;">                                      </span></div>
<div>
<span style="color: #741b47; font-size: large;"><b>A nanocentury is one billionth of a century.</b> So, a nanocentury ≈ 3.16 seconds.</span></div>
<div>
<span style="font-size: large;"><span style="color: #741b47;">In other words, there are roughly π seconds in a nanocentury. This is also known as <b>Duff's Rule.</b></span> </span></div>
<div>
<span style="font-size: large;"> </span></div>
<div>
<br /></div>
</div>
siddheshwarhttp://www.blogger.com/profile/06138213414248415451noreply@blogger.com0tag:blogger.com,1999:blog-5276776697364295662.post-29411044999686720072018-03-25T11:57:00.000-07:002018-03-29T23:11:30.510-07:00Designing REST URI for supporting multiple content type<div dir="ltr" style="text-align: left;" trbidi="on">
<div style="text-align: justify;">
<span style="font-size: large;">Resources can be represented in multiple formats: JSON, XML, Atom, binary formats like PNG, plain text and even proprietary formats. When a client requests a resource, the REST service transfers the state of the resource (and not the resource itself) in the appropriate format.</span></div>
<div style="text-align: justify;">
<span style="font-size: large;"><br /></span></div>
<div style="text-align: justify;">
<span style="font-size: large;">Assume that you are designing a RESTful interface that provides metadata for cars, and that your service is consumed by many clients, some traditional companies as well as a few startups. Each of them has its own requirements for the format of the response. Let's see what the available options are:</span></div>
<br />
<br />
<h3 style="text-align: left;">
<b><span style="font-size: large;">Approach 1: One URI per representation</span></b></h3>
<span style="font-size: large;">http://www.myservice.com/cars</span><br />
<span style="font-size: large;">http://www.myservice.com/cars/xml</span><br />
<span style="font-size: large;"><br /></span>
<span style="font-size: large;">The first URI returns the default representation of the resource and the second one returns the response in XML format. Since the two URIs are different, they are served by different handlers (endpoints), so each response can easily be returned in the appropriate format.</span><br />
<br />
<h3 style="text-align: left;">
<b><span style="font-size: large;">Approach 2: Use Parameter of URI</span></b></h3>
<span style="font-size: large;">http://www.myservice.com/cars?format=xml</span><br />
<span style="font-size: large;"><br /></span>
<span style="font-size: large;">This approach is easy to read and understand.</span><br />
<span style="font-size: large;"><br /></span>
<br />
<h3 style="text-align: left;">
<b><span style="font-size: large;">Approach 3: Single URI for all representation</span></b></h3>
<div style="text-align: justify;">
<span style="font-size: large;">This approach comes from the observation that if the client is essentially asking for the same resource, there is no need for different URIs. Remember, REST uses HTTP, so we can leverage the HTTP <i><b>Accept header</b></i> to get different representations of the same resource. This process of selecting the best representation for a given resource is termed <i>Content Negotiation</i>. </span></div>
<div style="text-align: left;">
<span style="font-size: large;"><b><br /></b></span></div>
<div style="text-align: left;">
<b><span style="font-size: large;">Content Types</span></b></div>
<span style="font-size: large;">HTTP uses Internet media types (originally known as MIME types) in the Content-Type and Accept header fields. The most commonly seen top-level types are text, image, audio, video and application. These types are further divided into subtypes:</span><br />
<ul style="text-align: left;">
<li><span style="font-size: large;"><i>text/plain </i>: default content type for plain text messages</span></li>
<li><span style="font-size: large;"><i>text/html</i> : the type commonly used by browsers </span></li>
<li><span style="font-size: large;"><i>text/xml, application/xml</i>: formats used for XML exchange </span></li>
<li><span style="font-size: large;"><i>image/gif, image/jpeg, image/png: </i>image types</span></li>
<li><span style="font-size: large;"><i>application/json</i>: language-independent, lightweight text format for data interchange </span></li>
</ul>
<div>
<span style="font-size: large;">GET /cars/</span></div>
<div>
<span style="font-size: large;">Accept: application/json</span></div>
<div>
<span style="font-size: large;"><br /></span></div>
<div>
<span style="font-size: large;">So respect the HTTP headers and everything works out. </span></div>
<div>
<span style="font-size: large;">This approach can be a bit code-intensive in some frameworks like <span style="background-color: white; font-family: &quot;lucida grande&quot; , &quot;arial&quot; , &quot;helvetica&quot; , sans-serif;">Django </span>as you need to dig into the headers and decode what the client wants. But most Java frameworks handle it through annotations. </span><br />
<span style="font-size: large;"><br /></span></div>
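<div>
<span style="font-size: large;">Where the framework does not do it for you, the header handling can be sketched by hand. The code below is a simplified illustration, not a complete implementation of HTTP content negotiation: the class ContentNegotiation, the method chooseMediaType and the list of supported types are assumptions for a hypothetical /cars service, and it only honours q-values and simple wildcards.</span></div>

```java
import java.util.Arrays;
import java.util.List;
import java.util.Locale;

public class ContentNegotiation {

    // Representations the hypothetical /cars service can produce, in order of server preference
    static final List<String> SUPPORTED = Arrays.asList("application/json", "application/xml");

    /**
     * Picks the supported media type with the highest q-value in the Accept header.
     * Returns null when nothing matches (a service would typically answer 406 Not Acceptable).
     */
    public static String chooseMediaType(String acceptHeader) {
        if (acceptHeader == null || acceptHeader.trim().isEmpty()) {
            return SUPPORTED.get(0); // no preference stated: use the default representation
        }
        String best = null;
        double bestQ = 0.0;
        for (String range : acceptHeader.split(",")) {
            String[] parts = range.trim().split(";");
            String type = parts[0].trim().toLowerCase(Locale.ROOT);
            double q = 1.0; // q defaults to 1 when omitted
            for (int i = 1; i < parts.length; i++) {
                String param = parts[i].trim();
                if (param.startsWith("q=")) {
                    try {
                        q = Double.parseDouble(param.substring(2));
                    } catch (NumberFormatException e) {
                        q = 0.0; // unreadable weight: treat the range as not acceptable
                    }
                }
            }
            for (String candidate : SUPPORTED) {
                if (matches(type, candidate) && q > bestQ) {
                    best = candidate;
                    bestQ = q;
                }
            }
        }
        return best;
    }

    // Handles exact matches, "*/*" and "type/*" wildcards
    static boolean matches(String range, String candidate) {
        if (range.equals("*/*") || range.equals(candidate)) {
            return true;
        }
        return range.endsWith("/*") && candidate.startsWith(range.substring(0, range.length() - 1));
    }

    public static void main(String[] args) {
        System.out.println(chooseMediaType("application/json"));                           // application/json
        System.out.println(chooseMediaType("text/html, application/xml;q=0.9, */*;q=0.8")); // application/xml
        System.out.println(chooseMediaType("image/png"));                                  // null
    }
}
```

<div>
<span style="font-size: large;">The handler for GET /cars/ would then serialize the same car data as JSON or XML depending on the returned media type, and echo that type in the Content-Type response header.</span></div>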
<h3 style="text-align: left;">
<span style="font-size: large;">
Final Note</span></h3>
<div>
<span style="font-size: large;">No matter which approach you use, it would be great if you stick to one across all your services. Prefer being consistent, even if that means not being perfectly right!</span></div>
<div>
<span style="font-size: large;"><br /></span></div>
<div>
<span style="font-size: large;"><br /></span></div>
<div>
<b><span style="font-size: large;">References</span></b></div>
<div>
<a href="https://www.ics.uci.edu/~fielding/pubs/dissertation/rest_arch_style.htm#sec_5_2_1_2"><span style="font-size: large;">https://www.ics.uci.edu/~fielding/pubs/dissertation/rest_arch_style.htm#sec_5_2_1_2</span></a></div>
<div>
<a href="http://www.restapitutorial.com/lessons/restquicktips.html"><span style="font-size: large;">http://www.restapitutorial.com/lessons/restquicktips.html</span></a></div>
<div>
<a href="http://www.informit.com/articles/article.aspx?p=1566460"><span style="font-size: large;">http://www.informit.com/articles/article.aspx?p=1566460</span></a></div>
<span style="font-size: large;"><br /></span>
</div>
siddheshwarhttp://www.blogger.com/profile/06138213414248415451noreply@blogger.com0tag:blogger.com,1999:blog-5276776697364295662.post-36944937212885069012017-12-21T21:15:00.001-08:002019-06-13T06:46:11.485-07:00PUT vs POST for Modifying a Resource<div dir="ltr" style="text-align: left;" trbidi="on">
<div style="text-align: justify;">
<span style="font-size: large;">The debate around PUT vs POST for updating a resource is quite common; I have had my share as well. The debate is not unnecessary, as the difference is very subtle. One simple line of defence used by many people is that if the update is IDEMPOTENT then we should use PUT, else we can use POST. This explanation is correct to a good extent, provided we clearly understand whether a request is truly idempotent or not. </span></div>
<div style="text-align: justify;">
<span style="font-size: large;"><br /></span></div>
<div style="text-align: justify;">
<span style="font-size: large;">Also, a lot of the content available online causes confusion. So, I tried to see what the originators of the REST architectural style themselves say. This post might again be opinionated, or may have missed a few important aspects. I have tried to be as objective as possible. Feel free to post your comments/opinions :)</span></div>
<div style="text-align: justify;">
<span style="font-size: large;"><br /></span></div>
<h3 style="text-align: left;">
<span style="font-size: large;">
Updating a Resource</span></h3>
<div>
<span style="font-size: large;">For our understanding, let's take a case that we are dealing with an account resource which has three attributes: <i><b>firstName, lastName and status. </b></i></span></div>
<div>
<i><span style="font-size: large;"><br /></span></i></div>
<div>
<b><span style="font-size: large;">Updating Status field:</span></b></div>
<div>
<span style="font-size: large;">Advocates of PUT consider the request below to be IDEMPOTENT. </span></div>
<div>
<span style="font-size: large;"><br /></span></div>
<div>
<div>
<i><span style="color: #4c1130; font-size: large;">PUT /account/a-123 HTTP/1.1</span></i></div>
<div>
<i><span style="color: #4c1130; font-size: large;">{</span></i></div>
<div>
<i><span style="color: #4c1130; font-size: large;"> "status":"disabled"</span></i></div>
<div>
<i><span style="color: #4c1130; font-size: large;">}</span></i></div>
</div>
<div>
<span style="font-size: large;"><br /></span></div>
<div style="text-align: justify;">
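<span style="font-size: large;">The reality is that idempotency alone does not make the request above a valid PUT: it sends only part of the document, whereas PUT is defined as replacing the whole resource with the enclosed representation. To use PUT correctly you need to send all the attributes. So that line of defence is <b>NOT</b> perfect. </span></div>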
<div style="text-align: justify;">
<span style="font-size: large;"><br /></span></div>
<div>
<div>
<span style="color: #4c1130; font-size: large;">PUT /account/a-123 HTTP/1.1</span></div>
<div>
<span style="color: #4c1130; font-size: large;">{</span></div>
<div>
<span style="color: #4c1130; font-size: large;"> "firstName":"abc",</span></div>
<div>
<span style="color: #4c1130; font-size: large;"> "lastName":"rai",</span></div>
<div>
<span style="color: #4c1130; font-size: large;"> "status":"disabled"</span></div>
<div>
<span style="color: #4c1130; font-size: large;">}</span></div>
</div>
<div>
<span style="font-size: large;"><br /></span></div>
<div>
<span style="font-size: large;"><br /></span></div>
<div style="text-align: justify;">
<span style="color: #4c1130; font-size: large;">The article below states very clearly that if you want to use PUT to update a resource, you must send all attributes of the resource, whereas you can use POST for either a partial or a full update. </span></div>
<div>
<a href="https://stormpath.com/blog/put-or-post"><span style="font-size: large;">https://stormpath.com/blog/put-or-post</span></a></div>
<div>
<span style="font-size: large;"><br /></span></div>
<div>
<span style="font-size: large;">So, you can use POST for either full or partial update (until PATCH support becomes universal). </span></div>
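<div style="text-align: justify;">
<span style="font-size: large;">The difference can be sketched with a small in-memory model. This is only an illustration under stated assumptions: the dictionary store and the function names <i>put</i> and <i>post_partial</i> are hypothetical and are not taken from any framework.</span></div>

```python
import copy


def put(store, resource_id, body):
    """PUT semantics: replace the whole stored resource with the request body."""
    store[resource_id] = copy.deepcopy(body)
    return store[resource_id]


def post_partial(store, resource_id, body):
    """Partial update, as a POST handler might do it: merge only the fields sent."""
    store[resource_id].update(body)
    return store[resource_id]


if __name__ == "__main__":
    # Hypothetical account resource with the three attributes used in this post.
    accounts = {"a-123": {"firstName": "abc", "lastName": "rai", "status": "active"}}

    # Full PUT: sending the same complete document twice leaves the same state.
    full = {"firstName": "abc", "lastName": "rai", "status": "disabled"}
    put(accounts, "a-123", full)
    put(accounts, "a-123", full)
    print(accounts["a-123"])

    # A PUT carrying only "status" replaces the resource, so the names are lost.
    put(accounts, "a-123", {"status": "disabled"})
    print(accounts["a-123"])
```

<div style="text-align: justify;">
<span style="font-size: large;">In this model, a PUT body that carries only the status field wipes out firstName and lastName, which is why a partial update is routed through POST here.</span></div>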
<div>
<b><span style="font-size: large;"><br /></span></b></div>
<h3 style="text-align: left;">
<b><span style="font-size: large;">What the originator of REST style says</span></b></h3>
<div>
<div>
<span style="font-size: large;">The master himself (Roy Fielding) suggests that you can use POST if you are modifying part of the resource. </span></div>
<div>
<a href="http://roy.gbiv.com/untangled/2009/it-is-okay-to-use-post"><span style="font-size: large;">http://roy.gbiv.com/untangled/2009/it-is-okay-to-use-post</span></a></div>
</div>
<div>
<span style="font-size: large;"><br /></span></div>
<div>
<span style="font-size: large;"><br /></span></div>
<h3 style="text-align: left;">
<span style="font-size: large;">
My Final Recommendation</span></h3>
<div>
<span style="color: #4c1130; font-size: large;"><b><i>Prefer POST if you are doing a partial update of the resource. </i></b></span><br />
<span style="color: #4c1130; font-size: large;"><b><i>If you are doing a full update of the resource and it's IDEMPOTENT, use PUT; else use POST. </i></b></span></div>
</div>
siddheshwarhttp://www.blogger.com/profile/06138213414248415451noreply@blogger.com0tag:blogger.com,1999:blog-5276776697364295662.post-13371849791305750452017-12-14T21:58:00.001-08:002017-12-14T21:58:55.473-08:00Build Tools in Java World<div dir="ltr" style="text-align: left;" trbidi="on">
<span style="font-size: large;">Build tools in Java (or the JVM ecosystem) have evolved over time. Each successive build tool has tried to solve some of the pains of the previous one. But before going deeper into the tools, let’s start with the basic features of standard build tools. </span><br />
<span style="font-size: large;"><br /></span>
<span style="font-size: large;"><b>Dependency Management</b></span><br />
<div style="text-align: left;">
<span style="text-align: justify; text-indent: -24px;"><span style="font-size: large;">Each project requires some external libraries for the build to succeed. These incoming files/jars/libraries are called the dependencies of the project. Managing dependencies in a centralized manner is a de facto feature of modern build tools. The output artifact of the project also gets published and is then managed by dependency management. Apache IVY and Maven are the two most popular tools that support dependency management.</span></span></div>
<div style="text-align: left;">
<span style="text-align: justify; text-indent: -24px;"><span style="font-size: large;"><br /></span></span></div>
<div style="text-align: left;">
<span style="text-align: justify; text-indent: -24px;"><span style="font-size: large;"><b>Build By Convention</b></span></span></div>
<div class="MsoListParagraphCxSpLast" style="mso-list: l0 level1 lfo1; text-align: justify; text-indent: -.25in;">
<span style="font-size: large;">A build script needs to be as simple and compact as possible. Imagine specifying each and every action that needs to be performed during a build (compile all files from the src directory, copy them to a directory, create the jar file and so on); this would make the script huge, and managing and evolving it would become a daunting task. So, modern build tools rely on conventions: for example, by default (this can be configured as well) the tool knows that source files are in the src directory. This minimizes the number of lines in the build file, so build scripts become easier to write and manage. </span></div>
<div class="MsoNormal" style="text-align: justify;">
<span style="font-size: large;">So any standard build tool should have the above two as de facto features. Below is a list of tools which have these features.</span><br />
<span style="font-size: large;"><br /></span>
<br />
<div class="MsoListParagraphCxSpFirst" style="text-indent: -0.25in;">
<span style="font-size: large;">ANT + IVY (Apache IVY for dependency management)<o:p></o:p></span></div>
<div class="MsoListParagraphCxSpMiddle" style="text-indent: -0.25in;">
<span style="font-size: large;"><span style="font-family: "symbol"; mso-bidi-font-family: Symbol; mso-fareast-font-family: Symbol;">·<span style="font-family: "times new roman"; font-stretch: normal;"> </span></span>MAVEN<o:p></o:p></span></div>
<div class="MsoListParagraphCxSpLast" style="text-indent: -0.25in;">
<span style="font-size: large;"><span style="font-family: "symbol"; mso-bidi-font-family: Symbol; mso-fareast-font-family: Symbol;">·<span style="font-family: "times new roman"; font-stretch: normal;"> </span></span>GRADLE<o:p></o:p></span></div>
<div class="MsoListParagraphCxSpLast" style="text-indent: -0.25in;">
<span style="font-size: large;"><br /></span></div>
<div class="MsoNormal" style="text-align: justify;">
<span style="font-size: large;"><u>I have listed only the most popular build tools above. </u>ANT by default doesn’t have dependency management, but the other two have native support for it. The Java world is basically divided between MAVEN and GRADLE, so I have focused below on these two tools.</span><br />
<span style="font-size: large;"><br /></span>
<br />
<h3>
<span style="font-size: large;">Maven vs Gradle</span></h3>
<div>
<ul>
<li><span style="font-size: large;">
<style type="text/css">
p.p1 {margin: 0.0px 0.0px 0.0px 0.0px; text-align: justify; text-indent: -24.0px; line-height: 22.0px; font: 18.0px Times; color: #000000; -webkit-text-stroke: #000000}
span.s1 {font-kerning: none}
</style>
<div class="p1">
<span class="s1">MAVEN uses XML to write the build script, whereas GRADLE uses a DSL based on Groovy (one of the JVM languages). A GRADLE build script tends to be shorter and cleaner than a MAVEN build script.</span></div>
</span></li>
<li><div class="p1">
<span class="s1"><span style="font-size: large;">
<style type="text/css">
p.p1 {margin: 0.0px 0.0px 0.0px 0.0px; text-align: justify; text-indent: -24.0px; line-height: 18.0px; font: 18.0px Times; color: #000000; -webkit-text-stroke: #000000}
span.s1 {font-kerning: none}
</style>
</span></span></div>
<div class="p1">
<span class="s1"><span style="font-size: large;">A GRADLE build script is written in Groovy (and can also be extended using Java). This definitely gives more flexibility to customize the build process, since Groovy is a real programming language (unlike XML). Also, GRADLE doesn’t force you to always follow its conventions; they can be overridden. </span></span></div>
</li>
<li><div class="p1">
<span class="s1"><span style="font-size: large;">
<style type="text/css">
p.p1 {margin: 0.0px 0.0px 0.0px 0.0px; text-align: justify; text-indent: -24.0px; line-height: 18.0px; font: 18.0px Times; color: #000000; -webkit-text-stroke: #000000}
span.s1 {font-kerning: none}
</style>
</span></span></div>
<div class="p1">
<span class="s1"><span style="font-size: large;">GRADLE has first-class support for multi-project builds, whereas the multi-project build of MAVEN is broken. GRADLE supports dependency management natively using the Apache open source project IVY (an excellent dependency management tool). Dependency management of GRADLE is better than that of MAVEN.</span></span></div>
</li>
<li><div class="p1">
<span class="s1"><span style="font-size: large;">
<style type="text/css">
p.p1 {margin: 0.0px 0.0px 0.0px 0.0px; text-align: justify; text-indent: -24.0px; line-height: 18.0px; font: 18.0px Times; color: #000000; -webkit-text-stroke: #000000}
span.s1 {font-kerning: none}
</style>
</span></span></div>
<div class="p1">
<span class="s1"><span style="font-size: large;">MAVEN is a quite popular tool with a wide community, and the Java community has been using it for a while; GRADLE, on the other hand, is quite new, so there will be a learning curve for developers.</span></span></div>
</li>
<li><div class="p1">
<span class="s1"><span style="font-size: large;">
<style type="text/css">
p.p1 {margin: 0.0px 0.0px 0.0px 0.0px; text-align: justify; text-indent: -24.0px; line-height: 18.0px; font: 18.0px Times; color: #000000; -webkit-text-stroke: #000000}
span.s1 {font-kerning: none}
</style>
</span></span></div>
<div class="p1">
<span class="s1"><span style="font-size: large;">Both are plugin based, but since GRADLE is newer, finding a plugin might be more difficult for GRADLE. Still, adoption of GRADLE is growing at a good pace, and Google supports GRADLE for Android. Integration of GRADLE with servers, IDEs and CI tools is not as extensive as that of MAVEN (as of now).</span></span></div>
</li>
</ul>
<div>
<span style="font-size: large;"><br /></span></div>
<ul>
<li><br /></li>
<li><div class="MsoNormal" style="text-align: justify;">
<h3>
<b>CONCLUSION</b></h3>
</div>
<div class="MsoNormal" style="text-align: justify;">
<span style="font-size: large;">Most of the cons of GRADLE exist mainly because it’s the new kid on the block. Other than this, everything else about GRADLE looks quite impressive. <u>It scores better on both core features, i.e. Dependency Management and Build by Convention.</u> IMO, configuring the build through a programming language is going to be more seamless once we overcome the initial learning curve.</span></div>
<div class="MsoNormal" style="text-align: justify;">
<span style="font-size: large;">Also, considering that we are going down the microservices path, we will have the option and flexibility to experiment with the build tool as well (along with language/framework).</span></div>
<div class="MsoNormal" style="text-align: justify;">
<br /></div>
<div class="MsoNormal" style="text-align: justify;">
<b><u>References<o:p></o:p></u></b></div>
<div class="MsoNormal" style="text-align: justify;">
<a href="https://technologyconversations.com/2014/06/18/build-tools/"><b>https://technologyconversations.com/2014/06/18/build-tools/</b></a><b><u><o:p></o:p></u></b></div>
<div style="text-align: left;">
<br /></div>
<div class="p1">
<span class="s1"></span></div>
<div class="MsoNormal" style="text-align: justify;">
<b><u>https://github.com/tkruse/build-bench</u></b></div>
</li>
<li><div class="p1">
<span class="s1"><br /></span></div>
</li>
</ul>
</div>
<div>
<div class="MsoListParagraphCxSpFirst" style="text-indent: -0.25in;">
<br /></div>
</div>
</div>
<br />
<div class="MsoNormal" style="text-align: justify;">
</div>
<br />
<div class="MsoNormal" style="-webkit-text-stroke-width: 0px; color: black; font-family: Times; font-style: normal; font-variant-caps: normal; font-variant-ligatures: normal; font-weight: 400; letter-spacing: normal; orphans: 2; text-align: justify; text-decoration-color: initial; text-decoration-style: initial; text-indent: 0px; text-transform: none; white-space: normal; widows: 2; word-spacing: 0px;">
<div style="margin: 0px;">
<br /></div>
</div>
</div>
</div>
siddheshwarhttp://www.blogger.com/profile/06138213414248415451noreply@blogger.com0tag:blogger.com,1999:blog-5276776697364295662.post-58450613674838263122017-12-05T09:19:00.002-08:002017-12-15T23:04:42.547-08:00How AWS structures its Infrastructure<div dir="ltr" style="text-align: left;" trbidi="on">
<span style="font-size: large;">This post talks about how AWS structures its global infrastructure. </span><br />
<span style="font-size: large;"><br /></span>
<br />
<div style="text-align: justify;">
<span style="font-size: large;">The most basic unit of AWS infrastructure is the <b>Data Center</b>. A single Data Center houses several thousand servers. AWS core applications are deployed in an N+1 configuration to ensure smooth functioning in the event of a data center failure. </span></div>
<div style="text-align: justify;">
<span style="font-size: large;"><br /></span></div>
<div style="text-align: justify;">
<span style="font-size: large;">AWS data centers are organized into <b>Availability Zones</b>. A DC can be part of only one AZ. Each AZ is designed as an independent failure zone for fault isolation. AZs are interconnected with high-speed private links. </span></div>
<div style="text-align: justify;">
<span style="font-size: large;"><br /></span></div>
<div style="text-align: justify;">
<span style="font-size: large;">Two or more AZs form a <b>Region</b>. As of now (Dec '17), AWS has 16 regions across the globe. <span style="color: #4c1130;">Communication among regions uses public infrastructure (i.e. the internet); therefore, use appropriate encryption methods to protect sensitive data.</span> Data stored in a specific region is not replicated across other regions automatically. </span></div>
<div style="text-align: justify;">
<span style="font-size: large;"><br /></span></div>
<div style="text-align: justify;">
<span style="font-size: large;">AWS also has 60+ global <b>Edge Locations</b>. Edge locations help lower latency and improve performance for end users. They are used by services like Route 53 and CloudFront. </span></div>
<div style="text-align: justify;">
<br /></div>
<br />
<div class="separator" style="clear: both; text-align: center;">
<a href="https://blogger.googleusercontent.com/img/b/R29vZ2xl/AVvXsEgbuVvwzorqQbX9xQWcLgf7nl6r3BcOU8MdshXQgeDKDy0yslcTZWNEqaJEhISz1Mt3Zq5HogTtFBe4BEyQMmixqb1014sa1mUtUYi91VKcIThaUIpH2XKDAObcmcDikV8l0YggSgZIHW8/s1600/aws-infra.png" imageanchor="1" style="margin-left: 1em; margin-right: 1em;"><img border="0" data-original-height="748" data-original-width="782" height="611" src="https://blogger.googleusercontent.com/img/b/R29vZ2xl/AVvXsEgbuVvwzorqQbX9xQWcLgf7nl6r3BcOU8MdshXQgeDKDy0yslcTZWNEqaJEhISz1Mt3Zq5HogTtFBe4BEyQMmixqb1014sa1mUtUYi91VKcIThaUIpH2XKDAObcmcDikV8l0YggSgZIHW8/s640/aws-infra.png" width="640" /></a></div>
<br />
<h3 style="text-align: left;">
Guidelines for designing </h3>
<div>
<ul style="text-align: left;">
<li><span style="font-size: large;">Design your system to survive temporary or prolonged failure of an Availability Zone. This brings resiliency to your system in case of natural disasters or system failures. </span></li>
<li><span style="font-size: large;">AWS recommends replicating across AZs for resiliency. </span></li>
<li><span style="font-size: large;">When you put data in a specific region, it's your job to move it to other regions if required. </span></li>
<li><span style="font-size: large;">AWS products and services are available by region so you may not see a service available in your region. </span></li>
<li><span style="font-size: large;">Choose your region appropriately to reduce latency for your end-users. </span></li>
</ul>
</div>
</div>
siddheshwarhttp://www.blogger.com/profile/06138213414248415451noreply@blogger.com0tag:blogger.com,1999:blog-5276776697364295662.post-35192805998709237092017-10-21T10:32:00.000-07:002017-11-11T08:22:25.683-08:00How Kafka Achieves Durability<div dir="ltr" style="text-align: left;" trbidi="on">
<div style="text-align: justify;">
<span style="font-size: large;">Durability is a guarantee that, once the Kafka broker confirms that the data is written, it will be permanent. Databases implement it by storing it in non-volatile storage. Kafka doesn't follow the DB approach!</span></div>
<div>
<br /></div>
<div>
<span style="font-size: large;"><b>Short Answer</b></span></div>
<div style="text-align: justify;">
<span style="font-size: large;">The short answer is that Kafka doesn't rely on physical storage (i.e. the file system) as the criterion for a message write being complete. <span style="color: #4c1130;"><i><u>It relies on the replicas.</u></i></span></span></div>
<div>
<span style="font-size: large;"><br /></span></div>
<div>
<span style="font-size: large;"><b>Long Answer</b></span></div>
<div style="text-align: justify;">
<span style="font-size: large;">When a message arrives at the broker, the broker first writes it to the in-memory copy of the leader replica. It then has the following things to do before considering the write successful. </span><br />
<span style="font-size: large;">Assume that, replication factor > 1. </span></div>
<div>
<span style="font-size: large;"><br /></span></div>
<div>
<span style="font-size: large;">1. Persist the message in the file system of the partition leader.</span></div>
<div>
<span style="font-size: large;">2. Replicate the message to all ISRs (in-sync replicas).</span></div>
<div>
<br /></div>
<div style="text-align: justify;">
<span style="font-size: large;">In an ideal scenario, both of the above are important and should be done, irrespective of order. But the real question is: when does Kafka consider the message write complete? To answer this, let's try to answer the question below:</span><br />
<span style="font-size: large;"><br /></span></div>
<div>
<span style="font-size: large; text-align: justify;">If a consumer asks for message 4, which just got persisted on the leader, will the leader return the data? </span><span style="font-size: large; text-align: justify;">And the answer is </span><b><span style="font-size: large;"><i style="text-align: justify;"><span style="color: #4c1130;">NO!</span></i></span></b></div>
<div style="text-align: justify;">
<br />
<div class="separator" style="clear: both; text-align: center;">
<a href="https://blogger.googleusercontent.com/img/b/R29vZ2xl/AVvXsEgvuxHIFhgVvtrCDcafd-2L_enepbuvl4D_9_woDLYMeuKWjBojbNgSbGunXahLARfKuQpiUFv0rjWkLw4MrpWfP20zpwhVBvTdfu6MnIDnq14bCvlxWvpWCKM_b2nRV6apinGDRc-YUGc/s1600/Screen+Shot+2017-10-21+at+11.03.50+PM.png" imageanchor="1" style="margin-left: 1em; margin-right: 1em;"><img border="0" data-original-height="372" data-original-width="1278" height="185" src="https://blogger.googleusercontent.com/img/b/R29vZ2xl/AVvXsEgvuxHIFhgVvtrCDcafd-2L_enepbuvl4D_9_woDLYMeuKWjBojbNgSbGunXahLARfKuQpiUFv0rjWkLw4MrpWfP20zpwhVBvTdfu6MnIDnq14bCvlxWvpWCKM_b2nRV6apinGDRc-YUGc/s640/Screen+Shot+2017-10-21+at+11.03.50+PM.png" width="640" /></a></div>
<span style="color: #4c1130; font-size: large;"><i><br /></i></span>
<span style="color: #4c1130; font-size: large;"><i>It's interesting to note that not all data that exists on the leader is available for clients to read. Clients can read only those messages that have been written to all in-sync replicas. The partition leader knows which messages have been replicated to which replica, so a message is not returned to the client until it has been replicated. An attempt to read such messages results in an empty response.</i></span></div>
<div style="text-align: justify;">
<br /></div>
<div style="text-align: justify;">
<span style="font-size: large;">So, now it's obvious that just writing the message to the leader (including persisting it to the file system) is hardly of any use. Kafka considers a message written only when it has been replicated to all in-sync replicas.</span></div>
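<div style="text-align: justify;">
<span style="font-size: large;">The read rule described above can be sketched as a toy model. This is not Kafka code: the <i>Partition</i> class and its method names are hypothetical, and the "high watermark" here is simply the number of messages that the leader and every in-sync follower all have.</span></div>

```python
class Partition:
    """Toy model of one partition: a leader log plus follower progress."""

    def __init__(self, follower_ids):
        self.log = []  # messages appended on the leader
        # Number of messages each in-sync follower has copied so far.
        self.replicated = {f: 0 for f in follower_ids}

    def append(self, message):
        """Write a message to the leader and return its offset."""
        self.log.append(message)
        return len(self.log) - 1

    def follower_fetched(self, follower_id, up_to_offset):
        """A follower reports it has copied everything up to this offset (inclusive)."""
        self.replicated[follower_id] = max(self.replicated[follower_id], up_to_offset + 1)

    def high_watermark(self):
        """Count of messages present on the leader and on every in-sync follower."""
        return min([len(self.log)] + list(self.replicated.values()))

    def read(self, offset):
        """Consumers only see offsets below the high watermark; otherwise an empty response."""
        if offset in range(self.high_watermark()):
            return [self.log[offset]]
        return []


if __name__ == "__main__":
    p = Partition(["follower-1", "follower-2"])
    for i in range(5):
        p.append("m%d" % i)
    p.follower_fetched("follower-1", 4)
    p.follower_fetched("follower-2", 3)  # follower-2 has not copied message 4 yet
    print(p.read(4))  # empty until follower-2 catches up
    p.follower_fetched("follower-2", 4)
    print(p.read(4))
```

<div style="text-align: justify;">
<span style="font-size: large;">In this sketch, message 4 exists on the leader the whole time, but it becomes readable only after the slowest in-sync follower has copied it.</span></div>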
<div style="text-align: justify;">
<br /></div>
<div style="text-align: justify;">
<br /></div>
<div style="text-align: justify;">
<span style="font-size: large;">~Happy replication!</span></div>
<div>
<br /></div>
<div>
<br />
<div>
<br /></div>
<div>
<br /></div>
</div>
</div>
siddheshwarhttp://www.blogger.com/profile/06138213414248415451noreply@blogger.com0tag:blogger.com,1999:blog-5276776697364295662.post-66236497452234838232017-10-07T06:36:00.002-07:002017-10-10T20:37:32.854-07:00My favourite fiz-buzz problem for Senior Programmers<div dir="ltr" style="text-align: left;" trbidi="on">
<div style="text-align: justify;">
<span style="font-size: large;">In this post, I will discuss one of my favourite fizz-buzz problems for senior programmers/engineers. </span></div>
<br />
<h3 style="text-align: left;">
Find the Kth largest element in a list of 1 million integers. </h3>
<h3 style="text-align: left;">
Or</h3>
<h3 style="text-align: left;">
Find the Kth largest element at any given point of time from a stream of integers whose count is not known. </h3>
<div>
<br /></div>
<div style="text-align: left;">
<div style="text-align: justify;">
<span style="font-size: large;">This problem is interesting because it can be solved in multiple ways and it checks the fundamentals of algorithms and data structures. Quite often, candidates start by asking questions like: Is the list sorted? Can I sort the list? In that case, I check which sorting algorithm the candidate proposes. This gives me an opportunity to start a conversation around the complexity of the approach (particularly, time complexity). Most candidates are quick to point out algorithms (like Quick Sort or Merge Sort) which take O(NlogN) to sort a list. This is the right time to ask why you need to sort the complete array/list if you just need to find the 100th or kth largest/smallest element. From here, the conversation usually goes in one of two directions: </span></div>
<ol style="text-align: left;">
<li style="text-align: justify;"><span style="font-size: large;">Candidates sometimes suggest that sorting is the quickest way to solve this problem, missing the complexity aspect altogether. If someone doesn't even realize that sorting is not the right way to handle this problem, then it is kind of a <b>red</b> signal for me. </span></li>
<li style="text-align: justify;"><span style="font-size: large;">At times, candidates acknowledge the inefficiency of the sorting approach and start looking for a better one. I ask candidates to think out loud, which gives me insight into their thought process and how they are approaching the problem. When I see them not moving ahead, I nudge them towards optimizing the Quick Sort approach: Is there any way to cut the problem size in half in every iteration? Can you use divide and conquer to improve on your O(NlogN) complexity? </span></li>
</ol>
<div>
<div style="text-align: justify;">
<span style="font-size: large;">This problem can be solved by <a href="http://www.cs.yale.edu/homes/aspnes/pinewiki/QuickSelect.html" target="_blank">Quick Select</a> as well as by using the <a href="http://geekrai.blogspot.in/2013/05/heap-data-structure.html" target="_blank">Heap</a> data structure. It also has a brute force approach (i.e. run a loop k times; in each iteration, find the maximum number lower than the last one). </span></div>
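<div style="text-align: justify;">
<span style="font-size: large;">Both approaches can be sketched in a few lines of Python. The names below (<i>KthLargestStream</i>, <i>kth_largest_heap</i>, <i>kth_largest_quickselect</i>) are my own and only illustrate the idea: the heap version keeps a min-heap of the k largest values seen so far, so it also works for a stream of unknown length, while the quickselect version partitions around a random pivot and keeps only the side that contains the answer.</span></div>

```python
import heapq
import random


class KthLargestStream:
    """Tracks the kth largest value of a stream using a min-heap of size k."""

    def __init__(self, k):
        self.k = k
        self.heap = []

    def add(self, n):
        """Consume one value; return the current kth largest, or None if fewer than k seen."""
        if len(self.heap) == self.k:
            # Push n, then drop the smallest, so the heap keeps the k largest values.
            heapq.heappushpop(self.heap, n)
        else:
            heapq.heappush(self.heap, n)
        return self.heap[0] if len(self.heap) == self.k else None


def kth_largest_heap(numbers, k):
    """O(N log k): feed every value through the size-k heap."""
    stream = KthLargestStream(k)
    result = None
    for n in numbers:
        result = stream.add(n)
    if result is None:
        raise ValueError("k must be between 1 and the number of elements")
    return result


def kth_largest_quickselect(numbers, k):
    """Average O(N): keep only the partition that contains the kth largest value."""
    items = list(numbers)
    if k not in range(1, len(items) + 1):
        raise ValueError("k must be between 1 and the number of elements")
    while True:
        pivot = random.choice(items)
        larger = [x for x in items if x > pivot]
        equal = [x for x in items if x == pivot]
        if len(larger) >= k:
            items = larger  # the answer is among values above the pivot
        elif k > len(larger) + len(equal):
            k -= len(larger) + len(equal)  # skip everything at or above the pivot
            items = [x for x in items if pivot > x]
        else:
            return pivot


if __name__ == "__main__":
    data = [7, 2, 9, 4, 9, 1, 5]
    print(kth_largest_heap(data, 3), kth_largest_quickselect(data, 3))
```

<div style="text-align: justify;">
<span style="font-size: large;">The heap variant needs only O(k) extra memory, which is what makes it a natural fit for the streaming form of the question.</span></div>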
<span style="font-size: large;"><br /></span>
<br />
<div style="text-align: justify;">
<span style="font-size: large;">If the candidate doesn't make much progress, I try to simplify the problem by asking for the 3rd or 2nd largest element. I have seen some senior programmers fail to solve even this trivial version. That is a clear Reject sign for me.</span><br />
<span style="font-size: large;"><br /></span>
<span style="font-size: large;">Also, sometimes I don't even ask the candidate to code. I use this problem just to get an idea, and I skip the coding part if I see a programmer sitting right across from me :)</span><br />
<br />
<span style="font-size: large;">-Happy problem solving !</span></div>
</div>
<div style="text-align: justify;">
<br /></div>
<br />
<br /></div>
</div>
siddheshwarhttp://www.blogger.com/profile/06138213414248415451noreply@blogger.com0tag:blogger.com,1999:blog-5276776697364295662.post-84562761820744532232017-10-07T06:22:00.000-07:002017-10-10T09:49:51.376-07:00Identifying Right Node in Couchbase<div dir="ltr" style="text-align: left;" trbidi="on">
<div class="separator" style="clear: both; text-align: justify;">
<span style="font-size: large;">This post covers how Couchbase achieves data partitioning and replication. </span></div>
<div class="separator" style="clear: both; text-align: justify;">
<span style="font-size: large;"><br /></span></div>
<div class="separator" style="clear: both; text-align: justify;">
<span style="font-size: large;">doc = "{"key1":"value1".....}" ; doc-id = id</span></div>
<div class="separator" style="clear: both; text-align: justify;">
<span style="font-size: large;"><br /></span></div>
<h3 style="text-align: left;">
<span style="font-size: large;">
Steps:</span></h3>
<div>
<ul style="text-align: left;">
<li><span style="font-size: large;">Based on key (or document id) the hash gets calculated.</span></li>
<li><span style="font-size: large;">Hash returns a value in the range [0,1023] both inclusive. This is known as partition id.</span></li>
<ul>
<li><span style="font-size: large;">This number maps the document to one of the vBuckets. </span></li>
</ul>
<li><span style="font-size: large;">Next task is to map the vBucket to a physical node. This gets decided by vBucket Map.</span></li>
<ul>
<li><span style="font-size: large;">This map tells which node is the primary node for the document and which ones are the backup nodes. The vBucket Map has 1024 entries, one for each vBucket, and each entry is itself an array: the first value is the primary node and the rest are replica nodes.</span></li>
</ul>
<li><span style="font-size: large;">The server list stores the list of live nodes. So, using the index taken from the vBucket Map, we get to know the IP address of the physical node.</span></li>
</ul>
</div>
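<div class="separator" style="clear: both; text-align: justify;">
<span style="font-size: large;">The steps above can be sketched as follows. This is a simplified illustration, not the actual client code: the hash is assumed to be a plain CRC32 modulo 1024 (the real hashing details may differ), and the server list, vBucket map layout and function names are hypothetical.</span></div>

```python
import zlib

NUM_VBUCKETS = 1024


def vbucket_id(doc_id):
    """Map a document id to a partition id in the range [0, 1023].

    Assumption: plain CRC32 modulo 1024, used here only for illustration.
    """
    return zlib.crc32(doc_id.encode("utf-8")) % NUM_VBUCKETS


def locate(doc_id, vbucket_map, server_list):
    """Return (primary node, replica nodes) for a document id."""
    entry = vbucket_map[vbucket_id(doc_id)]  # e.g. [0, 2]: primary index, then replica indexes
    primary = server_list[entry[0]]
    replicas = [server_list[i] for i in entry[1:]]
    return primary, replicas


# Hypothetical 3-node cluster with one replica per vBucket.
server_list = ["10.0.0.1", "10.0.0.2", "10.0.0.3"]
vbucket_map = [[v % 3, (v + 1) % 3] for v in range(NUM_VBUCKETS)]

if __name__ == "__main__":
    print(vbucket_id("user::42"), locate("user::42", vbucket_map, server_list))
```

<div class="separator" style="clear: both; text-align: justify;">
<span style="font-size: large;">Because the map has a fixed number of entries, moving data to a new node only means rewriting entries of the vBucket map; the hash of a document id never changes.</span></div>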
<div class="separator" style="clear: both; text-align: center;">
<br /></div>
<div class="separator" style="clear: both; text-align: center;">
<br /></div>
<div class="separator" style="clear: both; text-align: center;">
<a href="https://blogger.googleusercontent.com/img/b/R29vZ2xl/AVvXsEi5rq0cXewsMVzE3TPKL0DrYNTD79uk2-RPZ7fwZ5jeGPb2aZaYwnLH4t2_3LwtcgEN2mlTZeEymJ6jDVn9Ps7FC2wnxtOR6I5vb2zJE3XF5ipU3lD7C7Tse7fEwqaHxOJXy5O9Q-tKpjs/s1600/Screen+Shot+2017-09-03+at+12.22.04+PM.png" imageanchor="1" style="margin-left: 1em; margin-right: 1em;"><img border="0" data-original-height="1426" data-original-width="1592" height="573" src="https://blogger.googleusercontent.com/img/b/R29vZ2xl/AVvXsEi5rq0cXewsMVzE3TPKL0DrYNTD79uk2-RPZ7fwZ5jeGPb2aZaYwnLH4t2_3LwtcgEN2mlTZeEymJ6jDVn9Ps7FC2wnxtOR6I5vb2zJE3XF5ipU3lD7C7Tse7fEwqaHxOJXy5O9Q-tKpjs/s640/Screen+Shot+2017-09-03+at+12.22.04+PM.png" width="640" /></a></div>
<br /></div>
siddheshwarhttp://www.blogger.com/profile/06138213414248415451noreply@blogger.com0tag:blogger.com,1999:blog-5276776697364295662.post-57151695491933330002017-08-15T03:27:00.000-07:002017-12-16T06:15:29.991-08:00Couchbase Primary vs Secondary Indexes<div dir="ltr" style="text-align: left;" trbidi="on">
<div style="text-align: justify;">
<span style="font-family: "helvetica neue" , "arial" , "helvetica" , sans-serif; font-size: large;">Couchbase supports <span style="color: #741b47;">key-value as well as JSON</span> based data models. In the key-value model, you don't care about the type of the value. In the JSON model, you have the ability to query individual attributes using N1QL queries.</span><br />
<span style="font-family: "helvetica neue" , "arial" , "helvetica" , sans-serif; font-size: large;"><br /></span>
<br />
<h3>
<span style="font-family: "helvetica neue" , "arial" , "helvetica" , sans-serif; font-size: large;">Key-Value Model </span></h3>
<span style="font-family: "helvetica neue" , "arial" , "helvetica" , sans-serif; font-size: large;"><b>Without Index</b></span><br />
<span style="font-family: "helvetica neue" , "arial" , "helvetica" , sans-serif; font-size: large;">Key-value store is schema less where the object gets mapped to a given key (Just like a HashMap or Dictionary). Couchbase is more like a distributed HashMap. The value could be any supported data type (JSON, CSV, or BLOB). You perform any operation using the key or Document Id. In this case, Couchbase looks up the value corresponding to a given <b><span style="color: #4c1130;">document id</span></b>. In simple terms, it's just like a key lookup in a HashMap. Index doesn't play any role here.</span></div>
<span style="font-family: "helvetica neue" , "arial" , "helvetica" , sans-serif; font-size: large;"><br /></span>
<br />
<span style="color: #4c1130; font-size: large;"><span style="font-family: "helvetica neue" , "arial" , "helvetica" , sans-serif; font-size: small;"><b>Query</b>: </span><span style="font-family: "menlo";">bucket</span><span style="background-color: white; font-family: "menlo";">.get(docId);</span></span><br />
<span style="font-size: large;"><span style="background-color: white; font-family: "menlo";"><br /></span></span>
<br />
<div style="text-align: left;">
<b style="font-family: "helvetica neue", arial, helvetica, sans-serif; font-size: x-large; text-align: justify;">With Index</b></div>
<div style="text-align: justify;">
<span style="font-family: "helvetica neue" , "arial" , "helvetica" , sans-serif; font-size: large;">Now, what if you want the number of documents in your bucket?</span></div>
<div style="text-align: justify;">
<span style="font-family: "helvetica neue" , "arial" , "helvetica" , sans-serif; font-size: large;"><br /></span></div>
<div style="text-align: justify;">
<span style="color: #741b47;"><span style="font-family: "helvetica neue" , "arial" , "helvetica" , sans-serif; font-size: large;"><b>Query</b>: </span><span style="text-align: left;"><span style="font-family: "helvetica neue" , "arial" , "helvetica" , sans-serif; font-size: large;">SELECT COUNT(*) FROM `bucket-name`</span></span></span></div>
<div style="text-align: justify;">
<span style="color: #741b47;"><span style="text-align: left;"><span style="font-family: "helvetica neue" , "arial" , "helvetica" , sans-serif; font-size: large;"><br /></span></span></span></div>
<div style="text-align: justify;">
<span style="font-family: "helvetica neue" , "arial" , "helvetica" , sans-serif; font-size: large;">The above query is going to do a full bucket scan (similar to a full table scan in the SQL world). In the SQL world, an index on the primary key gets created by default, so you can easily perform the above operation. But in Couchbase, that's not the case. You have to create an index explicitly to perform the above query or any other ad-hoc query. So, if you want an index on the key or document id, you can create a primary index. </span></div>
<div style="text-align: justify;">
<span style="font-family: "helvetica neue" , "arial" , "helvetica" , sans-serif; font-size: large;"><br /></span></div>
<div style="text-align: justify;">
<span style="color: #741b47; font-family: "helvetica neue" , "arial" , "helvetica" , sans-serif; font-size: large;"><b>Query</b>: </span><span style="color: #741b47; text-align: left;"><span style="font-family: "helvetica neue" , "arial" , "helvetica" , sans-serif; font-size: large;">CREATE PRIMARY INDEX index_name ON `bucket-name`</span></span></div>
<div style="text-align: justify;">
<br /></div>
<div style="text-align: justify;">
<span style="font-family: "helvetica neue" , "arial" , "helvetica" , sans-serif; font-size: large;"><br /></span></div>
<h3 style="text-align: justify;">
<span style="font-family: "helvetica neue" , "arial" , "helvetica" , sans-serif; font-size: large;">JSON Model (Secondary Indexes)</span></h3>
<div>
<div style="text-align: justify;">
<span style="font-family: "helvetica neue" , "arial" , "helvetica" , sans-serif; font-size: large;">If you want complete control over your data and queries, the JSON model is going to be your choice. With the above approaches, you can't ask for all the objects which have a certain attribute value. </span></div>
</div>
<div>
<span style="font-family: "helvetica neue" , "arial" , "helvetica" , sans-serif; font-size: large;"><br /></span></div>
<div style="text-align: justify;">
<span style="font-family: "helvetica neue" , "arial" , "helvetica" , sans-serif; font-size: large;">In the JSON based model, we can query through an expressive SQL-like language named N1QL (pronounced "nickel"). This is a much more flexible model: we can look up documents through the keys contained inside the JSON. Obviously, to optimise lookup/search, we can create indexes on those attributes. These indexes are named </span><b style="font-family: "helvetica neue", arial, helvetica, sans-serif; font-size: x-large;"><span style="color: #4c1130;">secondary indexes</span></b><span style="font-family: "helvetica neue" , "arial" , "helvetica" , sans-serif; font-size: large;"> or, more precisely, </span><b style="font-family: "helvetica neue", arial, helvetica, sans-serif; font-size: x-large;"><span style="color: #4c1130;">Global Secondary Indexes</span></b><span style="font-family: "helvetica neue" , "arial" , "helvetica" , sans-serif; font-size: large;">.</span></div>
<div style="text-align: justify;">
<span style="font-family: "helvetica neue" , "arial" , "helvetica" , sans-serif; font-size: large;"><br /></span></div>
<div style="text-align: justify;">
<span style="color: #4c1130; font-size: large;"><span style="font-family: "helvetica neue" , "arial" , "helvetica" , sans-serif;"><b>Query</b>: </span><span style="font-family: "helvetica neue"; text-align: left;">CREATE INDEX type_index ON `bucket-name`(type</span><span style="font-family: "helvetica neue"; text-align: left;">) USING GSI</span></span></div>
<style type="text/css">
p.p1 {margin: 0.0px 0.0px 0.0px 0.0px; font: 12.0px 'Helvetica Neue'; color: #454545}
</style>
<br />
<div style="text-align: justify;">
<span style="font-family: "helvetica neue" , "arial" , "helvetica" , sans-serif; font-size: large;"><br /></span>
<br />
<h3>
<span style="font-family: "helvetica neue" , "arial" , "helvetica" , sans-serif; font-size: large;">Primary vs Secondary Indexes</span></h3>
<br />
<ul class="ul bullets" style="box-sizing: inherit; color: #252525; font-family: "kievit ot", sans-serif; line-height: 1.2em; margin: 0px 0px 25px; padding: 0px 0px 0px 10px;">
<li class="li" style="box-sizing: inherit; line-height: 1.2em; list-style-type: disc !important; margin: 0px 0px 0.5em 1.1em; padding: 0px;"><span style="font-size: large;"><dfn class="term" style="box-sizing: inherit; line-height: 1.2em;">Primary indexes</dfn> index all the keys in a given bucket and are used when a secondary index cannot be used to satisfy a query and a full bucket scan is required. </span></li>
<li class="li" style="box-sizing: inherit; line-height: 1.2em; list-style-type: disc !important; margin: 0px 0px 0.5em 1.1em; padding: 0px;"><span style="font-size: large;"><dfn class="term" style="box-sizing: inherit; line-height: 1.2em;">Secondary indexes</dfn> can index a subset of the items in a given bucket and are used to make queries that target a specific subset of fields more efficient. </span></li>
</ul>
</div>
<br />
<br />
<span style="font-size: large;">--- happy learning !</span></div>
siddheshwarhttp://www.blogger.com/profile/06138213414248415451noreply@blogger.com0tag:blogger.com,1999:blog-5276776697364295662.post-89208235332432933152017-08-11T22:16:00.000-07:002017-10-11T05:53:04.498-07:00RAM sizing Data Node of Couchbase Cluster<div dir="ltr" style="text-align: left;" trbidi="on">
<div dir="ltr" style="font-family: opensans, sans-serif; font-size: 16px;">
<div style="color: #444444; font-family: opensans, sans-serif; font-size: 16px;">
This post explains how to estimate how much RAM your Couchbase cluster needs to hold your data (in RAM)! </div>
<div style="color: #444444;">
<br /></div>
<div style="color: #444444;">
<br /></div>
<h3 style="color: #444444; text-align: left;">
RAM Calculator </h3>
<div style="color: #444444;">
RAM is one of the most crucial areas to size correctly. Cached documents allow reads to be served at low latency and high throughput. <u>Please note that this estimate does not include the RAM required by the host/VM OS and by other applications running alongside Couchbase.</u></div>
<div style="color: #444444;">
<br /></div>
<div style="color: #444444; font-family: opensans, sans-serif; font-size: 16px;">
<b>Fill in the fields below to estimate RAM -</b><br />
<br /></div>
<div style="color: #444444;">
<b>Sample Document</b> (key)<input id="id" name="id" type="text" /> (Value) <input id="document" name="document" type="text" />
</div>
<div style="color: #444444;">
<span style="color: #741b47; font-family: "opensans" , sans-serif; font-size: 16px;"><i>This is required because both the document content length and the ID length affect RAM usage. Keep key size in mind when deciding your key generation strategy. </i></span></div>
<div style="color: #444444;">
<br /></div>
<div style="color: #444444;">
<br /></div>
<div style="color: #444444;">
<b># Replicas </b> <input id="replicas" name="replicas" type="text" />
</div>
<div style="color: #444444;">
<span style="color: #741b47; font-family: "opensans" , sans-serif; font-size: 16px;"><i>Couchbase supports up to 3 replicas, so enter 1, 2 or 3.</i></span></div>
<div style="color: #444444;">
<br /></div>
<div style="color: #444444;">
<br /></div>
<div style="color: #444444;">
<b>% Of Data you want to be in RAM </b><input id="workingSet" name="workingSet" type="text" /> %</div>
<div style="color: #741b47; font-family: opensans, sans-serif; font-size: 16px;">
<span style="color: #741b47; font-family: "opensans" , sans-serif; font-size: 16px;"><i>For best throughput you need to have all your documents in RAM, i.e. 100%. This way every request is served from RAM and there is no disk IO. Enter only the number in the field, e.g. 80 or 100. </i></span></div>
<div style="color: #444444;">
<br /></div>
<div style="color: #444444;">
<br /></div>
<div style="color: #444444;">
<b># Documents</b> <input id="sizeDoc" name="sizeDoc" type="text" />
</div>
<div style="color: #444444;">
<span style="color: #741b47; font-family: "opensans" , sans-serif; font-size: 16px;">Number of documents in the cluster. If your application is starting from scratch, begin with a number based on the expected load of the application, then re-evaluate it regularly and adjust your RAM quota if required. For example, you can start with 10000 or 1000000 documents. </span></div>
<div style="color: #444444;">
<br /></div>
<div style="color: #444444;">
<br /></div>
<div style="color: #444444;">
<b>Type of Storage</b> SSD
<input id="ssd" name="memoryType" type="radio" value="other" /> HDD <input id="hdd" name="memoryType" type="radio" value="other" /></div>
<div style="color: #444444;">
<span style="color: #741b47; font-family: "opensans" , sans-serif; font-size: 16px;"><i>If the storage is SSD the overhead is 25%, otherwise it's 30%. SSD gives better disk throughput and latency, which improves performance when not all of the data is in RAM. </i></span></div>
<div style="color: #444444;">
<br /></div>
<div style="color: #444444;">
<br /></div>
<div style="color: #444444;">
<b>Couchbase Version</b>  &lt; 2.1 <input checked="" id="lowerVer" name="version" type="radio" value="less than 2.1" /> 2.1 or higher <input id="higherVer" name="version" type="radio" value="2.1+" /> </div>
<div style="color: #444444;">
<span style="color: #741b47; font-family: "opensans" , sans-serif; font-size: 16px;"><i>The metadata size per document is 56 bytes for versions 2.1 and higher, and 64 bytes for lower versions. </i></span></div>
<div style="color: #444444;">
<br /></div>
<div style="color: #444444;">
<br /></div>
<div style="color: #444444;">
<b>High Water Mark </b> <input id="waterMark" name="waterMark" type="text" />%</div>
<span style="color: #999999;">To use the default value, enter 85. </span><br />
<div style="color: #444444;">
<i style="color: #741b47; font-family: opensans, sans-serif; font-size: 16px;">If the amount of RAM used by documents reaches the high water mark (upper threshold), both primary and replica documents are <b>ejected</b> until memory usage drops to the <b>low water mark </b>(lower threshold). </i></div>
<div style="color: #444444;">
<i style="color: #741b47; text-align: justify;"><br /></i>
</div>
<div style="color: #444444;">
<input onclick="calculate()" type="button" value="Estimate RAM for the cluster" />
</div>
<div style="color: #444444; text-align: justify;">
<br />
Based on the RAM requirement for the cluster, you can plan how many nodes are required. Another important aspect in deciding the number of data nodes is how you expect your system to behave if 1, 2 or more nodes go down at the same time. In this <a href="https://geekrai.blogspot.in/2017/08/replication-factor-in-distributed.html">post</a>, I discuss the replication factor and how it affects your system's performance. So, make your call wisely!<br />
<span style="color: #741b47;"><i><br /></i></span>
<span style="color: #741b47;"><i>The value is calculated as explained in the Couchbase documentation, <a href="https://developer.couchbase.com/documentation/server/4.5/install/sizing-general.html#topic_axp_glg_xs">here</a>.</i></span><br />
<span style="color: #741b47;"><i>The reference for calculating document size is <a href="https://blog.couchbase.com/calculating-average-document-size-documents-stored-couchbase/">here</a>. </i></span>
<span style="color: #741b47;"><i><br /></i></span></div>
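<div style="color: #444444;">
For readers who want to see the arithmetic behind the button, here is a minimal Java sketch of the same estimate, using the constants quoted on this page (25%/30% headroom, 56/64 bytes of metadata). The class name, method name and sample numbers are illustrative assumptions of mine, not part of any Couchbase API.</div>

```java
// A sketch of the cluster RAM estimate described above; all names are illustrative.
final class RamSizingSketch {

    /** Returns the estimated cluster RAM quota in bytes. */
    static double clusterRamQuotaBytes(long documents, int idSizeBytes, int valueSizeBytes,
                                       int replicas, double workingSetFraction,
                                       boolean ssd, boolean version21OrHigher,
                                       double highWaterMarkFraction) {
        int copies = 1 + replicas;                           // active copy + replicas
        double headroom = ssd ? 0.25 : 0.30;                 // overhead: 25% SSD, 30% HDD
        int metadataPerDocument = version21OrHigher ? 56 : 64;

        double totalMetadata = (double) documents * (metadataPerDocument + idSizeBytes) * copies;
        double totalDataset = (double) documents * valueSizeBytes * copies;
        double workingSet = totalDataset * workingSetFraction;

        return (totalMetadata + workingSet) * (1 + headroom) / highWaterMarkFraction;
    }

    public static void main(String[] args) {
        // Assumed sample input: 1M documents, 20-byte keys, 1000-byte values, 1 replica,
        // 20% of the data in RAM, SSD storage, version 2.1+, 85% high water mark.
        double bytes = clusterRamQuotaBytes(1_000_000, 20, 1_000, 1, 0.20, true, true, 0.85);
        System.out.printf("Cluster RAM = %.2f GB%n", bytes / 1_000_000_000.0);
    }
}
```

<div style="color: #444444;">
With the sample input the estimate comes to roughly 0.81 GB for the whole cluster; remember that this excludes the OS and any other software on the nodes.</div>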
<div style="color: #444444;">
<br /></div>
<div style="color: #444444;">
--- happy sizing :)</div>
</div>
<script type="text/javascript">
function calculate(){
    // UTF-8 byte length of a string. The raw text is measured directly:
    // JSON.stringify would add surrounding quotes and escapes and inflate the size.
    function byteLength(text){
        return new TextEncoder().encode(text).length;
    }
    // Number of copies = active copy + replicas
    var num_of_replica = parseInt(document.getElementById('replicas').value, 10);
    var no_of_copies = 1 + num_of_replica;
    console.log('no_of_copies = ' + no_of_copies);
    // Headroom (overhead percentage): 25% for SSD, 30% for HDD
    var headroom = 0.25;
    if (document.getElementById('hdd').checked){
        console.log('hdd is checked');
        headroom = 0.30;
    }
    // Metadata per document: 64 bytes before 2.1, 56 bytes for 2.1 and higher
    var metadata_per_document = 64;
    if (document.getElementById('higherVer').checked){
        metadata_per_document = 56;
        console.log('higher version checked --- metadata_per_document = ' + metadata_per_document);
    }
    // Calculate total_metadata
    var documents_num = parseInt(document.getElementById('sizeDoc').value, 10);
    var ID_size = byteLength(document.getElementById('id').value);
    var total_metadata = documents_num * (metadata_per_document + ID_size) * no_of_copies;
    console.log(ID_size + ' <- id size; total_metadata = ' + total_metadata);
    // Calculate total data set
    var value_size = byteLength(document.getElementById('document').value);
    var total_dataset = documents_num * value_size * no_of_copies;
    console.log('total_dataset = ' + total_dataset);
    // Calculate working set
    var working_set_percentage = parseFloat(document.getElementById('workingSet').value) / 100;
    console.log('working_set_percentage = ' + working_set_percentage);
    var working_set = total_dataset * working_set_percentage;
    console.log('working_set = ' + working_set);
    var high_water_mark = parseFloat(document.getElementById('waterMark').value) / 100;
    console.log('high_water_mark = ' + high_water_mark);
    var cluster_ram_quota = (total_metadata + working_set) * (1 + headroom) / high_water_mark; // in bytes
    // Stop if a field was left empty, is not a number, or the high water mark is 0
    if (!isFinite(cluster_ram_quota)){
        alert('Please fill in all the fields with valid numbers.');
        return;
    }
    var quota_in_gb = cluster_ram_quota / 1000000000;
    console.log('cluster_ram_quota (in GB) = ' + quota_in_gb);
    alert(' Cluster RAM in GB = ' + quota_in_gb);
}
</script>
</div>
siddheshwarhttp://www.blogger.com/profile/06138213414248415451noreply@blogger.com0tag:blogger.com,1999:blog-5276776697364295662.post-68845143100024094252017-08-11T22:04:00.002-07:002017-10-10T09:50:53.064-07:00What's so special about Java 8 stream API<div dir="ltr" style="text-align: left;" trbidi="on">
<div style="text-align: justify;">
<span style="font-size: large;">Java 8 added functional programming features, and one of the major API additions is the <b>stream</b>.</span></div>
<div style="text-align: justify;">
<span style="font-size: large;"><br /></span></div>
<div style="text-align: justify;">
<span style="color: #444444; font-size: large;"><i>A mechanical analogy is a car-manufacturing line, where a stream of cars is queued between processing stations. Each station takes a car, performs some modification/operation and then passes it to the next station for further processing.</i></span></div>
<span style="font-size: large;"><br /></span>
<span style="font-size: large;"><br /></span>
<div style="text-align: justify;">
<span style="font-size: large;">The main benefit of the stream API is that in Java 8 you can program at a higher level of abstraction: you transform a stream of one type into a stream of another type rather than processing one item at a time (using a for loop or an iterator). As a result, Java 8 can run a pipeline of stream operations on several CPU cores, each working on a different part of the input. This way you get parallelism almost for free, instead of doing the hard work with threads and locks.</span></div>
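<div style="text-align: justify;">
<span style="font-size: large;">A small sketch of both ideas, using only the standard <b>java.util.stream</b> API: the first method transforms a stream of words step by step, and the second runs the same kind of pipeline in parallel simply by asking for a parallel stream. The class name, method names and sample data are made up for illustration.</span></div>

```java
import java.util.Arrays;
import java.util.List;
import java.util.stream.Collectors;

// Illustrative sketch of stream pipelines; names and data are hypothetical.
final class StreamSketch {

    /** Keeps the words longer than three characters and upper-cases them. */
    static List<String> longWordsUpperCased(List<String> words) {
        return words.stream()
                .filter(w -> w.length() > 3)      // station 1: drop short words
                .map(String::toUpperCase)         // station 2: transform each word
                .collect(Collectors.toList());    // end of the line: gather results
    }

    /** Sums the squares of the numbers; parallelStream() lets the runtime split the work across cores. */
    static long sumOfSquares(List<Integer> numbers) {
        return numbers.parallelStream()
                .mapToLong(Integer::longValue)
                .map(n -> n * n)
                .sum();
    }

    public static void main(String[] args) {
        System.out.println(longWordsUpperCased(Arrays.asList("java", "is", "fun", "stream"))); // [JAVA, STREAM]
        System.out.println(sumOfSquares(Arrays.asList(1, 2, 3, 4)));                           // 30
    }
}
```

<div style="text-align: justify;">
<span style="font-size: large;">Note that no thread or lock appears in the code: the pipeline only describes what to compute, and the library decides how to partition the data.</span></div>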
<span style="font-size: large;"><br /></span>
<span style="color: #4c1130; font-size: large;"><b>Stream </b>focuses on partitioning the data rather than coordinating access to it. </span><br />
<span style="font-size: large;"><span style="color: #4c1130;"><b><br /></b></span>
<span style="color: #4c1130;"><b> Vs</b></span></span><br />
<span style="color: #4c1130; font-size: large;"><b><br /></b>
<b>Collection </b>is mostly about storing and accessing the data, whereas <b>stream</b> is mostly about describing computation on data. </span><br />
<b><br /></b>
<br />
<br /></div>
siddheshwarhttp://www.blogger.com/profile/06138213414248415451noreply@blogger.com0tag:blogger.com,1999:blog-5276776697364295662.post-50619178636331528632017-08-09T22:21:00.003-07:002017-08-13T03:16:41.702-07:00Replication Factor in Couchbase<div dir="ltr" style="text-align: left;" trbidi="on">
<div style="color: #444444; font-family: opensans, sans-serif; font-size: 16px;">
<div style="text-align: justify;">
One of the core requirements for distributed DBs is to be as highly available as possible. What this means is that even if one or more nodes go down, the DB should keep functioning (on its own or with minimum intervention). This is possible only if there are backup copies of the data. </div>
</div>
<br />
<div style="font-family: opensans, sans-serif; font-size: 16px;">
<span style="color: #4c1130; font-family: "opensans" , sans-serif; font-size: 16px;">The replication factor controls the number of replicas (backup copies) of an item/data/document stored in a DB. The general rule is to have one replica for each node that can fail in the cluster.</span></div>
<br />
<div style="color: #444444; font-family: opensans, sans-serif; font-size: 16px;">
Let's check how one of the famous distributed NoSQL DBs handles the replication factor. </div>
<div style="text-align: left;">
<br /></div>
<h3 style="text-align: left;">
Couchbase</h3>
<div>
<br /></div>
<div style="color: #444444; font-family: opensans, sans-serif; font-size: 16px;">
<div style="text-align: justify;">
The default replication factor in Couchbase is 1 (if replicas are enabled). The drop-down field (as shown below) offers only 3 values, i.e. 1, 2 and 3. Practically, it doesn't make sense to have a replication factor of more than 3, no matter how large your cluster is.</div>
</div>
<div>
<br /></div>
<div style="color: #444444; font-family: opensans, sans-serif; font-size: 16px;">
<div style="text-align: justify;">
If you have only one node and enable replicas, no replica copy can actually be created yet, because Couchbase never places a replica on the same node as the active copy. Once you add more nodes to the cluster and rebalance, the active and replica copies get distributed across the nodes automatically. </div>
</div>
<div>
<br /></div>
<div class="separator" style="color: #444444; font-family: opensans, sans-serif; font-size: 16px;">
<a href="https://blogger.googleusercontent.com/img/b/R29vZ2xl/AVvXsEhmDKJXhR8E2yyCqSA8THaBKxHSDj77U7CpwKSUzEhSWddWW9_QBoWPF4zkFiFvEVY_FTweTK7MpyBW10gz6Ypc-6lATvMdPsv8rNTLXRnH7TJt_sfST94tMaEK3c_dZLnzD1q9kQ5CuZo/s1600/create-replicas.png" imageanchor="1" style="margin-left: 1em; margin-right: 1em;"><img border="0" data-original-height="129" data-original-width="743" height="108" src="https://blogger.googleusercontent.com/img/b/R29vZ2xl/AVvXsEhmDKJXhR8E2yyCqSA8THaBKxHSDj77U7CpwKSUzEhSWddWW9_QBoWPF4zkFiFvEVY_FTweTK7MpyBW10gz6Ypc-6lATvMdPsv8rNTLXRnH7TJt_sfST94tMaEK3c_dZLnzD1q9kQ5CuZo/s640/create-replicas.png" width="640" /></a></div>
<div>
<br /></div>
<div style="text-align: left;">
<b>Recommendation:</b></div>
<div style="text-align: left;">
<b><br /></b></div>
<div style="font-family: opensans, sans-serif; font-size: 16px;">
<span style="color: #444444;">Number of Nodes &lt;= 5 - </span><span style="color: #4c1130;"><u>RF = 1</u></span></div>
<div style="font-family: opensans, sans-serif; font-size: 16px;">
<span style="color: #444444;">5 &lt; Number of Nodes &lt;= 10 - </span><span style="color: #4c1130;"><u>RF = 2</u></span></div>
<div style="font-family: opensans, sans-serif; font-size: 16px;">
<span style="color: #444444;">Number of Nodes &gt; 10 - </span><span style="color: #4c1130;"><u>RF = 3</u></span></div>
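<div style="color: #444444; font-family: opensans, sans-serif; font-size: 16px;">
The rule of thumb above can be written as a tiny helper. This is only a sketch of the recommendation in this post, not a Couchbase API; the class and method names are hypothetical, and I am assuming that a cluster of exactly 5 data nodes falls in the RF = 1 bracket.</div>

```java
// Illustrative helper for the recommendation above; not part of any Couchbase SDK.
final class ReplicationFactorSketch {

    /** Maps the number of data nodes to the recommended replication factor. */
    static int recommendedReplicas(int dataNodes) {
        if (dataNodes < 1) {
            throw new IllegalArgumentException("dataNodes must be at least 1");
        }
        if (dataNodes <= 5) {
            return 1;   // small cluster (assumption: exactly 5 nodes counts as small)
        }
        if (dataNodes <= 10) {
            return 2;   // medium cluster
        }
        return 3;       // Couchbase offers at most 3 replicas
    }

    public static void main(String[] args) {
        for (int nodes : new int[] {3, 5, 8, 12}) {
            System.out.println(nodes + " data nodes -> RF = " + recommendedReplicas(nodes));
        }
    }
}
```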
<div>
<br /></div>
<div>
<div style="color: #444444; font-family: opensans, sans-serif; font-size: 16px;">
The number of nodes mentioned above counts only data nodes if you are using <a href="http://geekrai.blogspot.co.uk/2016/11/why-multi-dimensional-scaling-in.html">Multi Dimensional Scaling</a> (MDS). If you are not using MDS, the same rule still holds. </div>
<div style="color: #444444; font-family: opensans, sans-serif; font-size: 16px;">
<br /></div>
<div style="color: #444444; font-family: opensans, sans-serif; font-size: 16px;">
<span style="color: #4c1130;">In the event of failure we can </span><b style="color: #4c1130;">fail over</b><span style="color: #4c1130;"> (manually or automatically) to replicas. </span></div>
<ul style="text-align: left;">
<li><span style="color: #444444; font-family: "opensans" , sans-serif; font-size: 16px;">Consider a 5-node cluster with 1 replica. If one node goes down, the cluster can fail it over. But what if another node goes down before the failed node is back up? You are out of luck. You will have to add another node to the cluster. </span></li>
<li><span style="color: #444444; font-family: "opensans" , sans-serif; font-size: 16px;">After a node goes down and has been failed over, try to replace that node ASAP and perform a rebalance. Rebalance recreates the replica copies if there are enough nodes available. </span></li>
</ul>
</div>
<div>
<br /></div>
<div>
<br /></div>
<div>
<b>References</b></div>
<div>
<a href="https://developer.couchbase.com/documentation/server/4.5/clustersetup/create-bucket.html">https://developer.couchbase.com/documentation/server/4.5/clustersetup/create-bucket.html</a><br />
<a href="https://developer.couchbase.com/documentation/server/3.x/admin/Concepts/bp-sizingGuidelines.html">https://developer.couchbase.com/documentation/server/3.x/admin/Concepts/bp-sizingGuidelines.html</a></div>
</div>
siddheshwarhttp://www.blogger.com/profile/06138213414248415451noreply@blogger.com2tag:blogger.com,1999:blog-5276776697364295662.post-89518056894817374292017-07-29T06:44:00.000-07:002017-08-09T22:23:22.548-07:00Understanding AWS' IAM Service<div dir="ltr" style="text-align: left;" trbidi="on">
<div dir="ltr" style="color: #444444; font-family: opensans, sans-serif; font-size: 16px;">
<div style="color: #444444; font-family: opensans, sans-serif; font-size: 16px; text-align: justify;">
In the AWS world, everything is a service; more technically, a web service. Even for security there is one: <b>IAM (Identity and Access Management)</b>. </div>
<div style="color: #444444; font-family: opensans, sans-serif; font-size: 16px; text-align: justify;">
<br />
<br />
<h3 style="text-align: left;">
IAM background </h3>
Let's assume you manage a team (in a big company or a startup) and you decide to embrace your favourite cloud platform, AWS. To start with, you create an account on AWS. <span style="color: #4c1130;">This is the root account or root user (as it is called in the Linux world).</span> You want your team members to get access to the AWS console and its different services.</div>
<div style="color: #444444; font-family: opensans, sans-serif; font-size: 16px;">
<br />
Do you want to give them the same access as you have? Definitely NOT!<br />
<br />
<div style="text-align: justify;">
Giving full administrative access to all users puts the security of your systems and critical data at risk. It might also affect your monthly bill: what if a user starts a bunch of powerful EC2 instances, although you only wanted to use the S3 service? That's where the concepts of users, groups, roles and policies come into the picture in AWS, and all of this is achieved using the IAM service. </div>
</div>
<div style="color: #444444; font-family: opensans, sans-serif; font-size: 16px;">
<br /></div>
<div style="font-family: opensans, sans-serif; font-size: 16px;">
<span style="color: #4c1130; font-family: "opensans" , sans-serif; font-size: 16px;"><i>Amazon follows a shared security model - this means it's responsible for securing the platform, but as a customer you need to secure your data and access to your services. </i></span><br />
<div style="color: #444444;">
<br /></div>
<div style="color: #444444;">
<span style="color: #444444; font-family: "opensans" , sans-serif; font-size: 16px;"><b>Root Account</b> </span><span style="color: #4c1130;">gets created when you first set up your AWS account. It has complete admin access. </span></div>
</div>
<div style="color: #444444; font-family: opensans, sans-serif; font-size: 16px;">
<br /></div>
<h3>
What is IAM ?</h3>
<div style="color: #444444; font-family: opensans, sans-serif; font-size: 16px;">
<div style="text-align: justify;">
<span style="color: #444444; font-family: "opensans" , sans-serif; font-size: 16px;">IAM is the authentication and authorisation service of AWS. </span><b style="color: #4c1130; text-align: justify;">IAM allows you to control who can access your AWS resources, which resources they can access and in what ways</b><span style="color: #4c1130; text-align: justify;">. As an administrator, it gives you centralised control of your AWS account and enables you to manage users and their level of access to the AWS console. </span></div>
</div>
<div style="color: #444444; font-family: opensans, sans-serif; font-size: 16px;">
<span style="color: #444444; font-family: "opensans" , sans-serif; font-size: 16px;"><br /></span></div>
<div style="color: #444444; font-family: opensans, sans-serif; font-size: 16px;">
<span style="text-align: justify;">IAM, being a core service, has global scope (it is not specific to a region). This means your user accounts and roles are available across all regions. </span></div>
<div style="color: #444444; font-family: opensans, sans-serif; font-size: 16px;">
<br />
<span style="color: #444444; font-family: "opensans" , sans-serif; font-size: 16px;"></span>
<br />
<h3>
IAM Page on AWS Console</h3>
</div>
<div style="color: #444444; font-family: opensans, sans-serif; font-size: 16px;">
<div style="text-align: justify;">
Sign in to the AWS console with your <b>root account</b>. At the top left of the UI, click Services and then, from the list of services, click IAM (under Security, Identity and Compliance). This takes you to the IAM page of the console. </div>
</div>
<div style="color: #444444; font-family: opensans, sans-serif; font-size: 16px;">
<br /></div>
<div style="color: #444444; font-family: opensans, sans-serif; font-size: 16px;">
<div style="text-align: justify;">
At the very top it shows the sign-in link, which has the numeric account number in the URL. I customised the URL for easy readability by replacing the account number with geekrai (this blog's name). Below is a screenshot of my IAM page.</div>
</div>
<div style="color: #444444; font-family: opensans, sans-serif; font-size: 16px;">
<br /></div>
<div style="color: #444444; font-family: opensans, sans-serif; font-size: 16px;">
<br /></div>
<div class="separator" style="color: #444444; font-family: opensans, sans-serif; font-size: 16px;">
<a href="https://blogger.googleusercontent.com/img/b/R29vZ2xl/AVvXsEgqoIPuXZo2euX-SrQ9IiwxW_PZeD8tQ8VQpXiMmq9XfMAcJVxwIb-Sr8TZFG8fQQv9RUY-4I5J4Wg1kYIxAWVwrrNaLqyWu1Hco_2RsKvNX-mht4NaAnNiUrYl5H45G4Tq83uASDijExw/s1600/Screen+Shot+2017-07-29+at+10.45.21+AM.png" imageanchor="1" style="margin-left: 1em; margin-right: 1em;"><img border="0" data-original-height="1022" data-original-width="1438" height="452" src="https://blogger.googleusercontent.com/img/b/R29vZ2xl/AVvXsEgqoIPuXZo2euX-SrQ9IiwxW_PZeD8tQ8VQpXiMmq9XfMAcJVxwIb-Sr8TZFG8fQQv9RUY-4I5J4Wg1kYIxAWVwrrNaLqyWu1Hco_2RsKvNX-mht4NaAnNiUrYl5H45G4Tq83uASDijExw/s640/Screen+Shot+2017-07-29+at+10.45.21+AM.png" width="640" /></a></div>
<div style="color: #444444; font-family: opensans, sans-serif; font-size: 16px; text-align: justify;">
<br /></div>
<h3 style="color: #444444; font-family: opensans, sans-serif; font-size: 16px;">
IAM Components</h3>
<div style="color: #444444; font-family: opensans, sans-serif; font-size: 16px;">
<div style="text-align: justify;">
Above image shows the IAM page in the AWS console. It shows that there are 0 users (root user is not counted as a user), 0 groups and 0 roles. We will explore in detail all IAM components -</div>
</div>
<div style="color: #444444; font-family: opensans, sans-serif; font-size: 16px;">
<br /></div>
<div style="color: #444444; font-family: opensans, sans-serif; font-size: 16px;">
<span style="color: #444444; font-family: "opensans" , sans-serif; font-size: 16px;"><b>Users </b>- End users of the services. </span></div>
<div style="font-family: opensans, sans-serif; font-size: 16px;">
<span style="color: #444444; font-family: "opensans" , sans-serif; font-size: 16px;">Click on <b>Create </b></span><span style="color: #4c1130;"><b>individual IAM users </b>to configure users for this account. </span><span style="color: #4c1130;">Through this </span><span style="color: #4c1130;">you can add as many users as you want to this account. </span><span style="color: #444444; text-align: justify;">By default, new users have no permissions when they are created. </span><span style="color: #444444;"> There are two types of access for new users. </span></div>
<div style="font-family: opensans, sans-serif; font-size: 16px;">
<ol>
<li><span style="color: #4c1130; font-family: "opensans" , sans-serif; font-size: 16px;">Programmatic Access: AWS provides an <i>access key ID</i> and a <i>secret access key</i> for accessing AWS programmatically (AWS API, CLI, SDK etc). </span></li>
<li style="color: #444444;"><span style="color: #4c1130;">AWS Management Console Access: This allows your users to sign in to the AWS console. Users need a password to sign in. </span></li>
</ol>
</div>
<div style="color: #444444; font-family: opensans, sans-serif; font-size: 16px;">
<span style="color: #444444; font-family: "opensans" , sans-serif; font-size: 16px;">You can choose either or both of above access types for a user. </span></div>
<div style="color: #444444; font-family: opensans, sans-serif; font-size: 16px;">
<span style="color: #444444; font-family: "opensans" , sans-serif; font-size: 16px;"><br /></span></div>
<div style="font-family: opensans, sans-serif; font-size: 16px;">
<span style="font-family: "opensans" , sans-serif; font-size: 16px;"><b style="color: #444444;">Groups - </b><span style="color: #4c1130;">A collection of users under one set of permissions.</span></span></div>
<div style="color: #444444; font-family: opensans, sans-serif; font-size: 16px; text-align: justify;">
Once a user is created, (ideally) it should be part of a group like developer, administrator etc. I created a group named <i>developer</i> and added the user <i>siddheshwar</i> to it. Policies can then be attached to the group, and they apply to every user in it. </div>
<div style="color: #444444; font-family: opensans, sans-serif; font-size: 16px; text-align: justify;">
<br /></div>
<div style="color: #444444; font-family: opensans, sans-serif; font-size: 16px;">
<span style="color: #444444; font-family: "opensans" , sans-serif; font-size: 16px;"><span style="color: #4c1130;"><br /></span></span></div>
<div>
<div style="color: #444444; font-family: opensans, sans-serif; font-size: 16px;">
<b style="color: #444444; font-family: opensans, sans-serif; font-size: 16px;">Policies (Policy Document) - </b><span style="color: #4c1130; text-align: justify;">A policy is a document that defines one or more permissions and gets attached to a group or a user. </span></div>
<div style="color: #444444; font-family: opensans, sans-serif; font-size: 16px;">
<span style="color: #444444; font-family: "opensans" , sans-serif; font-size: 16px;">A policy document is a set of key-value pairs in JSON format. The AWS console provides a list of ready-made policies; you just need to select the ones that fit your case. Two examples follow. </span></div>
<div style="color: #444444; font-family: opensans, sans-serif; font-size: 16px;">
<span style="color: #4c1130; text-align: justify;"><br /></span>
</div>
<h3 class="ng-binding" style="background-color: #f6f6f6; border: none; box-sizing: border-box; color: #444444; font-family: 'helvetica neue', roboto, arial, sans-serif; font-size: 18px; font-weight: 400; line-height: 24px; margin: 0px; padding: 0px; white-space: nowrap;">
AmazonEC2FullAccess</h3>
<h3 class="ng-binding" style="background-color: #f6f6f6; border: none; box-sizing: border-box; color: #444444; font-family: 'helvetica neue', roboto, arial, sans-serif; font-size: 18px; font-weight: 400; line-height: 24px; margin: 0px; padding: 0px; white-space: nowrap;">
<div class="description ng-binding" style="box-sizing: border-box; font-size: 14px; line-height: 21px; margin-bottom: 10px; padding: 0px; white-space: normal;">
Provides full access to Amazon EC2 via the AWS Management Console.</div>
</h3>
<div style="font-family: opensans, sans-serif; font-size: 16px;">
<div>
<span style="color: #444444;">{</span></div>
<div>
<span style="color: #444444;"> "Version": "2012-10-17",</span></div>
<div>
<span style="color: #444444;"> "Statement": [</span></div>
<div>
<span style="color: #444444;"> {</span></div>
<div>
<span style="color: #444444;"> "Action": "ec2:*",</span></div>
<div>
<span style="color: #444444;"> "Effect": "Allow",</span></div>
<div>
<span style="color: #444444;"> "Resource": "*"</span></div>
<div>
<span style="color: #444444;"> },</span></div>
<div>
<span style="color: #444444;"> {</span></div>
<div>
<span style="color: #444444;"> "Effect": "Allow",</span></div>
<div>
<span style="color: #444444;"> "Action": "elasticloadbalancing:*",</span></div>
<div>
<span style="color: #444444;"> "Resource": "*"</span></div>
<div>
<span style="color: #444444;"> },</span></div>
<div>
<span style="color: #444444;"> {</span></div>
<div>
<span style="color: #444444;"> "Effect": "Allow",</span></div>
<div>
<span style="color: #444444;"> "Action": "cloudwatch:*",</span></div>
<div>
<span style="color: #444444;"> "Resource": "*"</span></div>
<div>
<span style="color: #444444;"> },</span></div>
<div>
<span style="color: #444444;"> {</span></div>
<div>
<span style="color: #444444;"> "Effect": "Allow",</span></div>
<div>
<span style="color: #444444;"> "Action": "autoscaling:*",</span></div>
<div>
<span style="color: #444444;"> "Resource": "*"</span></div>
<div>
<span style="color: #444444;"> }</span></div>
<div>
<span style="color: #444444;"> ]</span></div>
<div>
<span style="color: #444444;">}</span></div>
</div>
<div style="color: #444444; font-family: opensans, sans-serif; font-size: 16px;">
<span style="color: #741b47;"><br /></span></div>
<div style="color: #444444; font-family: opensans, sans-serif; font-size: 16px;">
<h3 class="ng-binding" style="background-color: #f6f6f6; border: none; box-sizing: border-box; color: #444444; font-family: 'Helvetica Neue', Roboto, Arial, sans-serif; font-size: 18px; font-weight: 400; line-height: 24px; margin: 0px; padding: 0px; white-space: nowrap;">
AdministratorAccess</h3>
<div class="description ng-binding" style="background-color: #f6f6f6; box-sizing: border-box; color: #444444; font-family: 'Helvetica Neue', Roboto, Arial, sans-serif; font-size: 14px; line-height: 21px; margin-bottom: 10px; padding: 0px;">
Provides full access to AWS services and resources</div>
</div>
<div style="font-family: opensans, sans-serif; font-size: 16px;">
<div>
<span style="color: #444444;">{</span></div>
<div>
<span style="color: #444444;"> "Version": "2012-10-17",</span></div>
<div>
<span style="color: #444444;"> "Statement": [</span></div>
<div>
<span style="color: #444444;"> {</span></div>
<div>
<span style="color: #444444;"> "Effect": "Allow",</span></div>
<div>
<span style="color: #444444;"> "Action": "*",</span></div>
<div>
<span style="color: #444444;"> "Resource": "*"</span></div>
<div>
<span style="color: #444444;"> }</span></div>
<div>
<span style="color: #444444;"> ]</span></div>
<div>
<span style="color: #444444;">}</span></div>
</div>
<div style="font-family: opensans, sans-serif; font-size: 16px;">
<br /></div>
<div style="color: #444444; font-family: opensans, sans-serif; font-size: 16px;">
<span style="color: #4c1130;"><u>Administrator access is nearly equivalent to root access. Please note that these policies can also be attached directly to a user; they don't always have to go through a group.</u></span></div>
<div style="color: #444444; font-family: opensans, sans-serif; font-size: 16px;">
<span style="color: #4c1130;"><u><br /></u></span></div>
<div style="color: #444444; font-family: opensans, sans-serif; font-size: 16px;">
<span style="color: #4c1130; text-align: justify;"><br /></span>
<b style="color: #444444; font-family: opensans, sans-serif; font-size: 16px;">Roles </b><span style="color: #4c1130; text-align: justify;">- Roles define a set of permissions which can be assigned to AWS resources. </span></div>
<span style="text-align: justify;"><span style="color: #444444; font-family: "opensans" , sans-serif;">An IAM role is similar to a user, but instead of being associated with a person it can also be assigned to an <b>application</b> or a <b>service</b>. Remember, in AWS everything is a service. </span></span><br />
<span style="text-align: justify;"><span style="color: #444444; font-family: "opensans" , sans-serif;"><br /></span></span>
<br />
<div style="text-align: justify;">
<span style="color: #444444; font-family: "opensans" , sans-serif;"><b>How roles can help - </b></span></div>
<ul style="text-align: left;">
<li><span style="color: #444444; font-family: "opensans" , sans-serif; font-size: 16px;">Enable identity federation: allow users to log in to the AWS console through Google, Amazon, OpenID, etc. </span></li>
<li><span style="color: #444444; font-family: "opensans" , sans-serif; font-size: 16px;">Enable access between your AWS account and a third-party AWS account. </span></li>
<li><span style="color: #444444; font-family: "opensans" , sans-serif; font-size: 16px;">Allow an EC2 instance to call AWS services on your behalf. </span></li>
</ul>
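<div>
For the last point, a role carries a trust policy that names who is allowed to assume it. Below is a minimal sketch of such a policy for EC2, built in Python so it can be printed as JSON. The helper name <code>ec2_trust_policy</code> is made up for illustration; the permissions the role grants would be attached separately, using policies like the ones shown earlier.</div>

```python
import json


def ec2_trust_policy():
    """Build a trust policy that lets the EC2 service assume a role.

    Hypothetical helper for illustration; it only builds the document,
    it does not call AWS.
    """
    return {
        "Version": "2012-10-17",
        "Statement": [
            {
                "Effect": "Allow",
                # The principal is the EC2 service, not a person.
                "Principal": {"Service": "ec2.amazonaws.com"},
                "Action": "sts:AssumeRole",
            }
        ],
    }


print(json.dumps(ec2_trust_policy(), indent=2))
```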
<br />
Below is the screenshot of a role page.<br />
<br />
<div class="separator" style="clear: both; text-align: center;">
<a href="https://blogger.googleusercontent.com/img/b/R29vZ2xl/AVvXsEhgnu9PbJh5IsgjueKFFZExCHrbc-ZxO6vrkRYpC6mklbIHiDXlpqt6SHDTOA5xIBtSNkxSHV3igpekzahTkhxRFCNGkc4OT4Kq-88T29d61IA-Cwh9IHz2D0XJelAVrT_LSzvZgnYspdw/s1600/Screen+Shot+2017-07-31+at+10.25.06+AM.png" imageanchor="1" style="margin-left: 1em; margin-right: 1em;"><img border="0" data-original-height="1040" data-original-width="1600" height="416" src="https://blogger.googleusercontent.com/img/b/R29vZ2xl/AVvXsEhgnu9PbJh5IsgjueKFFZExCHrbc-ZxO6vrkRYpC6mklbIHiDXlpqt6SHDTOA5xIBtSNkxSHV3igpekzahTkhxRFCNGkc4OT4Kq-88T29d61IA-Cwh9IHz2D0XJelAVrT_LSzvZgnYspdw/s640/Screen+Shot+2017-07-31+at+10.25.06+AM.png" width="640" /></a></div>
<br />
<div style="text-align: justify;">
<span style="color: #444444; font-family: "opensans" , sans-serif;"><br /></span>
<span style="color: #444444; font-family: "opensans" , sans-serif;">More details about Roles, <a href="http://docs.aws.amazon.com/IAM/latest/UserGuide/id_roles.html">here</a>. </span></div>
<div style="color: #444444; font-family: opensans, sans-serif; font-size: 16px;">
<span style="text-align: justify;"><br /></span>
<span style="text-align: justify;"><br /></span></div>
</div>
<h3 style="color: #444444; font-family: opensans, sans-serif; font-size: 16px;">
</h3>
<h3 style="color: #444444; font-family: opensans, sans-serif; font-size: 16px; text-align: left;">
IAM Best Practices</h3>
<div style="color: #444444; font-family: opensans, sans-serif; font-size: 16px;">
<b>Multi Factor Authentication (MFA) For Root Account: </b>The root account is the e-mail ID and password which you used to sign up for AWS. The root account gives you unlimited access to AWS, which is why securing it is so important and why AWS recommends setting up MFA. Once MFA is set up, you will also have to provide an MFA code while signing in.<br />
<br />
Reference- <a href="https://aws.amazon.com/iam/details/mfa/">https://aws.amazon.com/iam/details/mfa/</a><br />
<br />
<b>Set Password Policy: </b>It's a good practice to set password policies, such as which character types are mandatory in a password, and its expiry time or rotation policy.<br />
<br />
<b>Set Billing Alarm: </b>You can set a threshold on your monthly bill; if the bill crosses that threshold, AWS will send you an e-mail. This feature is not directly related to IAM; Amazon's CloudWatch service handles the billing monitoring. </div>
<div style="color: #444444; font-family: opensans, sans-serif; font-size: 16px;">
<b><br /></b></div>
<div style="color: #444444; font-family: opensans, sans-serif; font-size: 16px;">
<br />
---<br />
happy learning !</div>
<div style="color: #444444; font-family: opensans, sans-serif; font-size: 16px; text-align: justify;">
<br /></div>
<div style="color: #444444; font-family: opensans, sans-serif; font-size: 16px; text-align: justify;">
<br /></div>
</div>
</div>
siddheshwarhttp://www.blogger.com/profile/06138213414248415451noreply@blogger.com3tag:blogger.com,1999:blog-5276776697364295662.post-82240119563914793762017-07-23T02:57:00.000-07:002017-07-27T22:34:00.767-07:00Kafka - All that's Important<div dir="ltr" style="text-align: left;" trbidi="on">
<div dir="ltr" style="font-family: opensans, sans-serif;">
<div style="color: #444444; font-family: opensans, sans-serif; font-size: 16px;">
This post is all about Kafka! By the end of this post you should have an idea of its design philosophy, components, and architecture.</div>
<div style="color: #444444; font-family: opensans, sans-serif; font-size: 16px;">
<br /></div>
<div style="color: #444444; font-family: opensans, sans-serif; font-size: 16px;">
Kafka is written in Scala and Java and doesn't follow the JMS standard.</div>
<div style="color: #444444; font-family: opensans, sans-serif; font-size: 16px;">
<br />
<h3 style="text-align: left;">
What is Apache Kafka ?</h3>
<div>
<div style="text-align: justify;">
Kafka is a <span style="color: #741b47;"><b>distributed streaming platform</b></span> which is highly scalable, fault-tolerant and efficient (it provides high throughput and low latency). Just like other messaging platforms, it allows you to publish and subscribe to streams of records/messages, but as a platform it offers much more. An important distinction worth mentioning: <i>it's NOT a queue implementation, but it can be used as a queue</i>. It can also be used to pipe data between two systems, something which has traditionally been done through ETL systems.</div>
</div>
<br />
<br />
<h3 style="text-align: left;">
What is Stream Processing ?</h3>
</div>
<div style="font-family: opensans, sans-serif; font-size: 16px;">
<div style="color: #444444;">
Let's understand what stream processing is before we delve deeper. Jay Kreps, who built Kafka along with other team members while working at LinkedIn, explains stream processing <a href="https://www.youtube.com/watch?v=9RMOc0SwRro">here</a>. Let's cover the programming paradigms -</div>
<div style="color: #444444;">
<ol style="text-align: left;">
<li style="text-align: justify;"><b>Request/Response-</b> Send ONE request and wait for ONE response (a typical http or REST call).</li>
<li style="text-align: justify;"><b>Batch-</b> Send ALL inputs, batch job does data crunching/processing and then returns ALL output in one go. </li>
<li style="text-align: justify;"><b>Stream Processing- </b>The program has control in this model: it takes (a bunch of) inputs and produces SOME output. SOME here depends on the program - it can return ALL output, or ONE, or anything in between. So, this is basically <span style="color: #741b47;"><u>a generalisation of the above two extreme models</u></span>.</li>
</ol>
</div>
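<div>
The three paradigms can be sketched as plain Python functions. This is only an illustration: the function names are made up, and the stream example (emit the sum of every two inputs) is an arbitrary choice meant to show that the program itself decides when to produce output.</div>

```python
from typing import Iterable, Iterator, List


def handle_request(x: int) -> int:
    """Request/response: ONE input in, ONE output back."""
    return x * x


def run_batch(inputs: List[int]) -> List[int]:
    """Batch: ALL inputs in, ALL outputs returned in one go."""
    return [x * x for x in inputs]


def pair_sums(events: Iterable[int]) -> Iterator[int]:
    """Stream: consume inputs as they arrive, emit SOME output.

    Here an output is produced after every second input; a trailing
    unpaired input stays pending and produces nothing.
    """
    pending = []
    for value in events:
        pending.append(value)
        if len(pending) == 2:
            yield sum(pending)
            pending.clear()


print(handle_request(3))               # 9
print(run_batch([1, 2, 3]))            # [1, 4, 9]
print(list(pair_sums([1, 2, 3, 4, 5])))  # [3, 7]
```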
<div style="color: #444444;">
<span style="color: #444444;">Stream processing is generally </span><b style="color: #444444;">async</b><span style="color: #444444;">. Stream processing has also been popularised by Lambdas and frameworks like Rx - where stream processing is confined to a process. But, in case of Kafka- </span><span style="color: #741b47;">stream processing is distributed and really large!</span></div>
<div style="color: #444444;">
<br /></div>
<h3 style="color: #444444; text-align: left;">
<b>What is Event Stream ?</b></h3>
<div style="color: #444444;">
<span style="color: #741b47;"><i><b>Events are actions which generate data!</b></i></span></div>
<div style="color: #444444;">
<br /></div>
<div style="color: #444444; text-align: justify;">
Databases store the current state of data, which has been reached through a <b>sequence or</b> <b><i>stream of events</i></b>. Visualising data as a stream of events might not be very obvious, especially if you have grown up seeing data stored as rows in databases. </div>
<div style="color: #444444; text-align: justify;">
<br /></div>
<div style="text-align: justify;">
<span style="color: #444444;">Let's take the example of your bank balance: your current balance is the result of all the credit and debit events which have occurred in the past. </span><span style="color: #741b47;">Events are the business story which resulted in the current state of data.</span><span style="color: #444444;"> Similarly, the current price of a stock is due to all the buy and sell events which have happened since the day it got listed on an exchange; in the retail domain, data can be viewed as a stream of orders, sales and price adjustments. Google earns billions of dollars by capturing click events and impression events.</span><br />
<div style="color: #444444;">
<br /></div>
<span style="color: #444444;">Now companies want to record all user events on their sites; this helps in profiling customers better and offering more customised services. This has led to tons of data being generated. </span><span style="color: #741b47;"><i>So, when we hear the term big data, it basically means capturing all these events, which most companies were ignoring earlier. </i></span></div>
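<div>
The bank balance example can be written down directly: the current state is just a fold over the event history. The sketch below is illustrative only; the <code>Event</code> type and its field names are assumptions, not anything Kafka defines.</div>

```python
from dataclasses import dataclass
from typing import Iterable


@dataclass(frozen=True)
class Event:
    kind: str    # "credit" or "debit" (hypothetical event kinds)
    amount: int  # amount in minor units, e.g. cents


def balance(events: Iterable[Event]) -> int:
    """Derive the current balance by replaying every event in order."""
    total = 0
    for event in events:
        if event.kind == "credit":
            total += event.amount
        elif event.kind == "debit":
            total -= event.amount
        else:
            raise ValueError(f"unknown event kind: {event.kind}")
    return total


history = [Event("credit", 1000), Event("debit", 250), Event("credit", 75)]
print(balance(history))      # 825, the current state
print(balance(history[:2]))  # 750, the state as of the second event
```

<div>
Because the events are kept, any earlier state can be rebuilt by replaying a prefix of the history, which a table holding only the latest balance cannot do.</div>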
<div style="color: #444444;">
<br /></div>
<div style="color: #444444;">
<br /></div>
<h3 style="color: #444444; text-align: left;">
Kafka's Core Data Structure</h3>
<div style="color: #444444; text-align: justify;">
There are quite a few similarities between a stream of records and application logs. Both order their entries by time. At its core, Kafka uses something similar to record the stream of events - <span style="color: #741b47;"><b>Commit Log</b></span>. I have discussed commit logs separately, <a href="http://geekrai.blogspot.com/2017/07/role-of-commit-log-in-databases-and.html">here</a>.</div>
<div style="color: #444444;">
<br /></div>
<div style="font-family: opensans, sans-serif; font-size: 16px;">
<div style="text-align: justify;">
<span style="color: #4c1130;">Kafka provides a commit log of updates, as shown in the image below. Data sources (or producers) publish a stream of events or records which gets stored in the commit log, and subscribers or consumers (like a DB, cache, or HTTP service) can then read those events. These consumers are independent of each other and have their own reference points for reading records. </span></div>
</div>
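<div>
A minimal in-memory sketch of this idea: one append-only log, and readers that each keep their own position. The class names <code>CommitLog</code> and <code>Reader</code> are invented for illustration and ignore persistence, partitioning and replication.</div>

```python
class CommitLog:
    """Append-only list of records; records are never modified in place."""

    def __init__(self):
        self._records = []

    def append(self, record):
        """Store a record at the end and return its position (offset)."""
        self._records.append(record)
        return len(self._records) - 1

    def read_from(self, offset, max_records=10):
        """Return up to max_records records starting at the given offset."""
        return self._records[offset:offset + max_records]


class Reader:
    """A consumer with its own reference point into the shared log."""

    def __init__(self, log):
        self.log = log
        self.position = 0

    def poll(self, max_records=10):
        batch = self.log.read_from(self.position, max_records)
        self.position += len(batch)
        return batch


log = CommitLog()
for record in ["order-created", "order-paid", "order-shipped"]:
    log.append(record)

cache, db = Reader(log), Reader(log)
print(cache.poll(2))  # ['order-created', 'order-paid']
print(db.poll())      # all three records; db is unaffected by cache
print(cache.poll())   # ['order-shipped']
```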
<div style="color: #444444;">
<br /></div>
<table align="center" cellpadding="0" cellspacing="0" class="tr-caption-container" style="color: #444444; margin-left: auto; margin-right: auto; text-align: center;"><tbody>
<tr><td style="text-align: center;"><a href="https://cdn2.hubspot.net/hub/540072/file-3062870538-png/blog-files/commit_log-copy.png" imageanchor="1" style="margin-left: auto; margin-right: auto;"><img border="0" data-original-height="293" data-original-width="660" height="177" src="https://cdn2.hubspot.net/hub/540072/file-3062870538-png/blog-files/commit_log-copy.png" width="400" /></a></td></tr>
<tr><td class="tr-caption" style="text-align: center;"><a href="https://www.confluent.io/blog/stream-data-platform-1/">https://www.confluent.io/blog/stream-data-platform-1/</a></td></tr>
</tbody></table>
<div style="color: #444444;">
<span style="color: #444444; font-family: "opensans" , sans-serif; font-size: 16px;"><u><i>Commit logs can be partitioned (sharded) across a cluster of nodes, and they are also replicated to achieve fault-tolerance.</i></u></span></div>
<div class="separator" style="clear: both; color: #444444; text-align: center;">
</div>
<div style="color: #444444;">
<br /></div>
<span style="color: #666666;">Read about Zero Copy here: <a href="https://www.ibm.com/developerworks/library/j-zerocopy/">https://www.ibm.com/developerworks/library/j-zerocopy/</a></span><br />
<span style="color: #666666;">How Kafka storage internally works: <a href="https://thehoard.blog/how-kafkas-storage-internals-work-3a29b02e026">https://thehoard.blog/how-kafkas-storage-internals-work-3a29b02e026</a></span><br />
<div style="color: #444444;">
<div style="color: #444444;">
<br /></div>
<div style="color: #444444;">
<br /></div>
<h3 style="color: #444444; text-align: left;">
Key Concepts and Terminologies</h3>
<h3 style="color: #444444; font-size: 16px;">
<span style="font-weight: normal;">Let's cover important components and aspects of Kafka:</span></h3>
<div style="color: #444444;">
<span style="font-weight: normal;"><br /></span></div>
<div style="color: #444444; font-family: opensans, sans-serif; font-size: 16px;">
<div style="color: #444444;">
<b>Message</b></div>
<div style="color: #444444; text-align: justify;">
A message is a record or piece of information which gets persisted in Kafka for processing. It's the fundamental unit of data in the Kafka world. Kafka stores messages in binary format. </div>
<div style="color: #444444; text-align: justify;">
<br /></div>
<div style="color: #444444;">
<br /></div>
<div style="color: #444444;">
<b>Topics</b></div>
<div style="text-align: justify;">
<span style="color: #444444;">Messages in Kafka are categorised under topics. A topic is like a database table. Messages are always published to and consumed from a given topic (by name). For each topic, </span><span style="color: #4c1130;"><i>Kafka maintains a structured commit log with one or more partitions.</i></span></div>
<div style="color: #444444;">
<br /></div>
</div>
<div class="separator" style="clear: both; color: #444444; text-align: center;">
<a href="https://kafka.apache.org/0110/images/log_anatomy.png" imageanchor="1" style="margin-left: 1em; margin-right: 1em;"><img border="0" data-original-height="267" data-original-width="416" height="204" src="https://kafka.apache.org/0110/images/log_anatomy.png" width="320" /></a></div>
<div style="font-family: opensans, sans-serif; font-size: 16px;">
<div style="color: #444444;">
<br /></div>
<div style="color: #444444;">
<b>Partition</b></div>
<div style="color: #444444; font-family: opensans, sans-serif; font-size: 16px;">
<div style="text-align: justify;">
A topic can have multiple partitions, as shown in the figure above. <span style="text-align: left;">Kafka appends each new message to the end of a partition. Each message in a partition is assigned a sequential identifier known as its</span><span style="text-align: left;"> </span><i style="text-align: left;"><span style="color: #741b47;">offset</span></i><span style="text-align: left;">. </span>Writes to a partition are sequential (from left to right), but writes across different partitions can be done in parallel, as each partition may be on a different box/node. An offset uniquely identifies a message within a given partition. In the picture above, the next message written to partition 0 will get offset 12.</div>
</div>
<div style="color: #444444; text-align: justify;">
<br /></div>
<div style="color: #444444; font-family: opensans, sans-serif; font-size: 16px;">
<span style="color: #741b47;"><i>Ordering of records is guaranteed only within a partition of a given topic. Partitions allow Kafka to go beyond the limitations of a single server. This means a single topic can be scaled horizontally across multiple servers. <u>But at the same time each individual partition must fit on a single host. </u></i></span></div>
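<div>
To make the offset and ordering rules concrete, here is a small in-memory sketch of a topic with several partitions. The <code>Topic</code> class is hypothetical; the point is only that offsets are counted per partition, so the same offset value can exist in more than one partition.</div>

```python
class Topic:
    """A topic modelled as a fixed number of independent append-only logs."""

    def __init__(self, name, num_partitions):
        self.name = name
        self.partitions = [[] for _ in range(num_partitions)]

    def append(self, partition, message):
        """Append to one partition and return the message's offset there."""
        log = self.partitions[partition]
        log.append(message)
        return len(log) - 1

    def read(self, partition, offset):
        """A message is addressed by (partition, offset), not offset alone."""
        return self.partitions[partition][offset]


orders = Topic("orders", num_partitions=3)
print(orders.append(0, "a"))  # 0
print(orders.append(0, "b"))  # 1, ordered after "a" within partition 0
print(orders.append(1, "c"))  # 0, partition 1 counts its own offsets
print(orders.read(0, 1))      # b
```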
<div style="color: #444444; font-family: opensans, sans-serif; font-size: 16px;">
<span style="color: #741b47;"><i><br /></i></span></div>
<div style="color: #444444; text-align: justify;">
<span style="font-family: "opensans" , sans-serif; font-size: 16px;"><i><span style="color: #4c1130;">Each partition has one server which acts as the <b>leader</b> and zero or more servers which act as <b>followers</b>. The leader handles all read and write requests for that partition, and the followers replicate it. If the leader fails, one of the followers is chosen as the new leader. Each server acts as a leader for some partitions and a follower for others, so that load is balanced. </span></i></span></div>
<div style="color: #444444;">
<br /></div>
<div style="color: #444444;">
<br /></div>
<div style="color: #444444;">
<b>Producers</b></div>
<div style="color: #444444; font-family: opensans, sans-serif; font-size: 16px;">
<div style="text-align: justify;">
As the name suggests, producers post messages to topics. The producer is responsible for assigning each message to a partition in the given topic. <span style="color: #741b47;"><i>The producer connects to any of the live nodes and requests metadata about the leaders of the topic's partitions. This allows the producer to send a message directly to the lead broker of its partition. </i></span></div>
</div>
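<div>
How a producer might pick a partition can be sketched as follows. This is an assumption-laden illustration, not Kafka's implementation: <code>zlib.crc32</code> stands in for the hash Kafka's client uses, and messages without a key are simply spread round-robin. What matters is that the same key always lands in the same partition, which keeps that key's messages in order.</div>

```python
import itertools
import zlib


class Partitioner:
    """Hypothetical producer-side partitioner (illustrative only)."""

    def __init__(self, num_partitions):
        self.num_partitions = num_partitions
        self._round_robin = itertools.cycle(range(num_partitions))

    def partition_for(self, key=None):
        if key is None:
            # No key: spread messages evenly across partitions.
            return next(self._round_robin)
        # Keyed message: a stable hash maps the key to a fixed partition.
        # crc32 is a stand-in here, not the hash Kafka itself uses.
        return zlib.crc32(key.encode("utf-8")) % self.num_partitions


partitioner = Partitioner(num_partitions=3)
first = partitioner.partition_for("user-42")
second = partitioner.partition_for("user-42")
print(first == second)  # True: same key, same partition
print([partitioner.partition_for() for _ in range(4)])  # [0, 1, 2, 0]
```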
<div style="color: #444444;">
<br /></div>
<div style="color: #444444;">
<br /></div>
<div style="color: #444444;">
<b>Consumers</b></div>
<div style="color: #444444; font-family: opensans, sans-serif; font-size: 16px;">
<div style="text-align: justify;">
Consumers subscribe to one or more topics and read messages for further processing. It's the consumer's job to keep track of which messages have been read, using the offset. Consumers can re-read past messages or skip ahead to later messages as well. This is possible because Kafka retains all messages for a given (pre-configured) time.</div>
<br /></div>
<div style="color: #444444; text-align: justify;">
<br /></div>
<div style="color: #444444; font-family: opensans, sans-serif; font-size: 16px;">
<b>Consumer Group</b></div>
<div style="color: #444444; text-align: justify;">
Kafka scales the consumption by grouping consumers and distributing partitions among them. </div>
<table align="center" cellpadding="0" cellspacing="0" class="tr-caption-container" style="color: #444444; margin-left: auto; margin-right: auto; text-align: center;"><tbody>
<tr><td style="color: #444444; font-size: 16px;"><a href="https://cdn2.hubspot.net/hubfs/540072/New_Consumer_figure_1.png" imageanchor="1" style="margin-left: auto; margin-right: auto;"><img border="0" data-original-height="250" data-original-width="519" height="192" src="https://cdn2.hubspot.net/hubfs/540072/New_Consumer_figure_1.png" width="400" /></a></td></tr>
<tr><td class="tr-caption" style="font-size: 12.800000190734863px;"><a href="https://cdn2.hubspot.net/hubfs/540072/New_Consumer_figure_1.png">https://cdn2.hubspot.net/hubfs/540072/New_Consumer_figure_1.png</a></td></tr>
</tbody></table>
<div class="separator" style="clear: both; color: #444444; text-align: center;">
<br /></div>
<div style="color: #444444; font-family: opensans, sans-serif; font-size: 16px;">
<div style="text-align: justify;">
The diagram above shows a topic with 3 partitions and a consumer group with 2 members. Each partition of the topic is assigned to only one member of the group. The group coordination protocol is built into Kafka itself (earlier it was managed through Zookeeper). For each group, one broker is selected as the group coordinator. Its main job is to control partition assignment whenever there is a change (addition/removal) in the membership of the group. </div>
</div>
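<div>
The assignment rule (each partition goes to exactly one member of the group) and the effect of a membership change can be sketched like this. The function <code>assign_partitions</code> is hypothetical and uses a simple round-robin spread; Kafka ships several assignment strategies, and this sketch is not claimed to match any of them exactly.</div>

```python
def assign_partitions(partitions, members):
    """Give every partition to exactly one member of the group.

    Returns a dict mapping member name to its list of partitions.
    """
    if not members:
        return {}
    ordered = sorted(members)
    assignment = {member: [] for member in ordered}
    for index, partition in enumerate(sorted(partitions)):
        owner = ordered[index % len(ordered)]
        assignment[owner].append(partition)
    return assignment


# 3 partitions, 2 members: one member owns two partitions.
print(assign_partitions([0, 1, 2], ["c1", "c2"]))
# {'c1': [0, 2], 'c2': [1]}

# A third member joins, so the assignment is recomputed (a rebalance).
print(assign_partitions([0, 1, 2], ["c1", "c2", "c3"]))
# {'c1': [0], 'c2': [1], 'c3': [2]}
```

<div>
With more members than partitions, the extra members get an empty list and sit idle, which is why the partition count caps the useful size of a group.</div>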
<div style="color: #444444; text-align: justify;">
<br /></div>
<table align="center" cellpadding="0" cellspacing="0" class="tr-caption-container" style="color: #444444; font-size: 16px;"><tbody>
<tr><td style="text-align: center;"><a href="https://kafka.apache.org/0110/images/consumer-groups.png" imageanchor="1" style="margin-left: auto; margin-right: auto;"><img border="0" data-original-height="252" data-original-width="474" height="170" src="https://kafka.apache.org/0110/images/consumer-groups.png" width="320" /></a></td></tr>
<tr><td class="tr-caption" style="font-size: 12.800000190734863px; text-align: center;"><a href="https://kafka.apache.org/0110/images/consumer-groups.png">https://kafka.apache.org/0110/images/consumer-groups.png</a></td></tr>
</tbody></table>
<span style="color: #444444;">How Consumer Group helps:</span><br /><div style="color: #444444; font-family: opensans, sans-serif; font-size: 16px;">
</div>
<ul style="color: #444444;">
<li>If all consumer instances belong to the same consumer group, then records will get load balanced over the consumer instances. </li>
<li>If all the consumer instances belong to different consumer groups, then each record will be broadcast to all consumer processes. </li>
</ul>
<div style="color: #444444;">
<br /></div>
<div style="color: #444444;">
<b>Brokers</b></div>
<div style="color: #444444; font-family: opensans, sans-serif; font-size: 16px;">
<div style="text-align: justify;">
Kafka is a distributed system, so topics can be spread across different nodes in a cluster. These individual nodes or servers are known as <b>brokers</b>. <span style="color: #741b47;">A broker's job is to manage persistence and replication of messages. Brokers scale well as they are not responsible for tracking offsets or messages for individual consumers. </span>The rate at which consumers read messages is also not a concern for the broker. A single broker can easily handle thousands of partitions and millions of messages. </div>
</div>
<div style="color: #444444;">
<br /></div>
<div style="color: #444444; text-align: justify;">
<u>Within a cluster of brokers, one will be elected as cluster controller. This happens automatically among live brokers.</u></div>
<div style="color: #444444;">
<br /></div>
<div style="font-family: opensans, sans-serif; font-size: 16px;">
<div style="text-align: justify;">
<div style="color: #444444;">
<span style="color: #4c1130;">A partition is owned by a single broker of the cluster and that broker acts as the partition leader. To achieve replication, a partition is also assigned to additional brokers. This provides redundancy for the partition, and one of these replicas takes over as leader if the primary one fails. </span></div>
<div style="color: #444444;">
<span style="color: #4c1130;"><br /></span>
<b style="text-align: left;">Leader</b></div>
<div style="text-align: left;">
<div style="color: #444444;">
The node (broker) responsible for all reads and writes of a given partition. </div>
<div style="color: #444444;">
<br /></div>
<div style="color: #444444;">
<b>Replication Factor</b></div>
<div style="color: #444444;">
<span style="color: #444444;">Replication factor controls the number of replica copies. </span><span style="color: #4c1130;"><i>If a topic is un-replicated then replication factor will be 1. </i></span></div>
<div style="color: #444444;">
<br /></div>
<div style="color: #444444;">
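<span style="color: #741b47;">Kafka's in-sync replica (ISR) model needs <b>f+1</b> replicas to tolerate <b>f</b> failures without losing committed messages; the <b>2f+1 </b>rule applies to majority-quorum systems such as Zookeeper. </span></div>
<div style="color: #444444;">
<span style="color: #741b47;"><b><u>So, to survive 1 node failure the replication factor must be at least 2; 3 is a common production choice because it leaves headroom. </u></b></span></div>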
</div>
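<div>
How many replicas are needed depends on the replication model, so it helps to see the arithmetic side by side. As a hedged sketch: majority-quorum systems need 2f+1 nodes to survive f failures, whereas Kafka's documentation describes its in-sync replica (ISR) approach as tolerating f failures with f+1 replicas. The function names below are made up for illustration.</div>

```python
def replicas_for_majority_quorum(failures):
    """Nodes needed so that a majority survives `failures` failures."""
    return 2 * failures + 1


def replicas_for_isr(failures):
    """Replicas needed under an ISR-style model to survive `failures`."""
    return failures + 1


def tolerated_failures_isr(replication_factor):
    """Failures a topic can absorb without losing committed messages."""
    if replication_factor < 1:
        raise ValueError("replication factor must be at least 1")
    return replication_factor - 1


for f in (1, 2):
    print(f, replicas_for_majority_quorum(f), replicas_for_isr(f))
# 1 3 2
# 2 5 3

print(tolerated_failures_isr(3))  # 2
print(tolerated_failures_isr(1))  # 0, an un-replicated topic
```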
<div style="color: #444444; text-align: left;">
<br /></div>
</div>
</div>
</div>
<h3 style="color: #444444; text-align: left;">
Architecture</h3>
</div>
</div>
<div style="color: #444444;">
<h3 style="color: #444444; font-family: opensans, sans-serif;">
<span style="font-size: 16px; font-weight: normal;">Below diagram shows all important components of Kafka and their relationship-</span></h3>
<h3 style="color: #444444; font-family: opensans, sans-serif;">
<span style="font-size: 16px; font-weight: normal;"><br /></span></h3>
<div class="separator" style="clear: both; color: #444444; text-align: center;">
<a href="https://blogger.googleusercontent.com/img/b/R29vZ2xl/AVvXsEhXPyfT2ViCEd5Qejet_rtoFaIGJrkWRJwXvl7LJDup21MtAO1t9yyjeLA-H54ormtoOgHVNwC5lQsBpRjFm52Z4MJfBzrQ_Q-R1-xpMscWjCkRr3ShogHy_T9tfO9OMyYKNp2Ab1Z_YTM/s1600/Screen+Shot+2017-07-23+at+3.36.58+PM.png" imageanchor="1" style="margin-left: 1em; margin-right: 1em;"><img border="0" data-original-height="1013" data-original-width="1600" height="404" src="https://blogger.googleusercontent.com/img/b/R29vZ2xl/AVvXsEhXPyfT2ViCEd5Qejet_rtoFaIGJrkWRJwXvl7LJDup21MtAO1t9yyjeLA-H54ormtoOgHVNwC5lQsBpRjFm52Z4MJfBzrQ_Q-R1-xpMscWjCkRr3ShogHy_T9tfO9OMyYKNp2Ab1Z_YTM/s640/Screen+Shot+2017-07-23+at+3.36.58+PM.png" width="640" /></a></div>
<h3 style="font-family: opensans, sans-serif; font-size: 16px; text-align: left;">
<div style="font-weight: normal;">
<div style="color: #444444;">
<b><br /></b>
<b>Role of Zookeeper</b></div>
<div style="font-family: opensans, sans-serif; font-size: 16px;">
<div style="color: #444444;">
<br /></div>
<div style="text-align: justify;">
<span style="color: #444444;">Zookeeper is an open source, high performance coordination service for distributed applications. In distributed systems Zookeeper helps with configuration management, consensus building, coordination and locks (Hadoop also uses Zookeeper). </span><span style="color: #4c1130;"><i>It acts as a middleman among all nodes and helps in different coordination activities - it's the source of truth. </i></span></div>
</div>
<div style="color: #444444; text-align: justify;">
<br /></div>
<div style="color: #444444; font-family: opensans, sans-serif; font-size: 16px;">
<div style="text-align: justify;">
In Kafka, it's mainly used to track the status of cluster nodes and to keep track of metadata such as topics and partitions. So, before starting a Kafka server, you should start Zookeeper. </div>
</div>
</div>
<div style="color: #444444; font-weight: normal;">
<br /></div>
<div style="color: #444444; font-weight: normal;">
<br />
<h3 style="text-align: left;">
Kafka Vs Messaging Systems</h3>
</div>
</h3>
<h3 style="color: #444444; text-align: left;">
<div style="font-size: 16px;">
<div style="text-align: justify;">
<span style="font-weight: normal;">Kafka can be used like traditional messaging systems (or brokers) such as ActiveMQ, RabbitMQ, and Tibco. Traditional messaging systems have two models - </span>queuing<span style="font-weight: normal;"> and </span>publish-subscribe<span style="font-weight: normal;">. Each of these models has its own strengths and weaknesses -</span></div>
</div>
<span style="font-weight: normal;"><div style="text-align: justify;">
<span style="color: #444444; font-family: "opensans" , sans-serif; font-size: 16px;"><br /></span></div>
</span><span style="font-family: "opensans" , sans-serif; font-size: 16px;"><div style="text-align: justify;">
<span style="color: #444444;">Queuing- </span><span style="font-weight: normal;"><span style="color: #444444;">In this model a pool of consumers reads from a server and each record goes to one of them. This divides the processing over multiple consumers and helps scale it. </span><span style="color: #741b47;">But queues are not multi-subscriber: once one subscriber reads the data, it's gone. </span></span></div>
</span><div style="text-align: justify;">
<br /></div>
<span style="font-family: "opensans" , sans-serif; font-size: 16px;"><div style="text-align: justify;">
<span style="color: #444444;">Publish-Subscribe- </span><span style="font-weight: normal;"><span style="color: #444444;">This model lets you broadcast data to multiple subscribers. </span><span style="color: #741b47;">But this approach doesn't scale processing, as every message goes to every consumer/subscriber. </span></span></div>
</span><span style="font-weight: normal;"><div style="text-align: justify;">
<span style="color: #444444; font-family: "opensans" , sans-serif; font-size: 16px;"><br /></span></div>
</span><div style="text-align: justify;">
<span style="color: #741b47;"><span style="font-family: "opensans" , sans-serif; font-size: 16px; font-weight: normal;">Kafka uses the </span><span style="font-family: "opensans" , sans-serif; font-size: 16px;">Consumer Group</span><span style="font-family: "opensans" , sans-serif; font-size: 16px; font-weight: normal;"> model, so every topic has both of these properties (queue and publish-subscribe). <i>It can scale processing and it's also multi-subscriber.</i> A consumer group lets you divide up processing over the different consumers of the group (just like the queue model). And just like the traditional pub-sub model, it lets you broadcast messages to multiple consumer groups. </span></span></div>
<div style="font-size: 16px;">
<div style="text-align: justify;">
<br /></div>
</div>
<div style="font-size: 16px; font-weight: normal;">
<div style="text-align: justify;">
Order is not guaranteed in traditional systems: if records need to be delivered to consumers asynchronously, they may reach consumers out of order. Kafka does better by creating multiple partitions for the same topic, with each partition consumed by exactly one consumer in the group. </div>
</div>
</h3>
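<div>
The combination of queue and publish-subscribe behaviour can be shown with a toy delivery function. It is a sketch under simplifying assumptions: <code>deliver</code> is a made-up name, records are given as (partition, value) pairs, and a partition's owner within a group is picked by a plain modulo rather than by a real assignment protocol.</div>

```python
from collections import defaultdict


def deliver(records, groups):
    """Simulate delivery of records to consumer groups.

    records: list of (partition, value) pairs, in log order.
    groups:  dict mapping group name to its non-empty list of members.
    Every group sees every record (pub-sub), but inside a group each
    record goes to exactly one member (queue).
    """
    received = defaultdict(list)
    for partition, value in records:
        for group, members in groups.items():
            owner = members[partition % len(members)]
            received[(group, owner)].append(value)
    return dict(received)


records = [(0, "a"), (1, "b"), (0, "c")]
groups = {"billing": ["b1", "b2"], "audit": ["a1"]}
result = deliver(records, groups)

print(result[("billing", "b1")])  # ['a', 'c'], partition 0 in order
print(result[("billing", "b2")])  # ['b']
print(result[("audit", "a1")])    # ['a', 'b', 'c'], the whole stream
```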
<h3 style="color: #444444; font-family: opensans, sans-serif; font-size: 16px; text-align: left;">
<div style="font-weight: normal;">
<br />
<h3 style="text-align: left;">
CLI Commands</h3>
</div>
</h3>
<div style="color: #444444; font-family: opensans, sans-serif; font-size: 16px;">
Here is a link which covers Kafka CLI commands.</div>
<div style="color: #444444; font-family: opensans, sans-serif; font-size: 16px;">
<a href="https://howtoprogram.xyz/2016/07/08/apache-kafka-command-line-interface/">https://howtoprogram.xyz/2016/07/08/apache-kafka-command-line-interface/</a></div>
<div style="color: #444444; font-family: opensans, sans-serif; font-size: 16px;">
<br /></div>
<div style="color: #444444; font-family: opensans, sans-serif; font-size: 16px;">
<br />
<br />
-------</div>
<div style="color: #444444; font-family: opensans, sans-serif; font-size: 16px;">
<b>References:</b></div>
<div style="color: #444444; font-family: opensans, sans-serif; font-size: 16px;">
<a href="https://kafka.apache.org/documentation/">https://kafka.apache.org/documentation/</a></div>
<div style="color: #444444; font-family: opensans, sans-serif; font-size: 16px;">
<a href="https://www.confluent.io/blog/stream-data-platform-1/">https://www.confluent.io/blog/stream-data-platform-1/</a></div>
<div style="color: #444444; font-family: opensans, sans-serif; font-size: 16px;">
<a href="https://engineering.linkedin.com/distributed-systems/log-what-every-software-engineer-should-know-about-real-time-datas-unifying">https://engineering.linkedin.com/distributed-systems/log-what-every-software-engineer-should-know-about-real-time-datas-unifying</a></div>
<div style="color: #444444; font-family: opensans, sans-serif; font-size: 16px;">
<a href="https://thenewstack.io/apache-kafka-primer/">https://thenewstack.io/apache-kafka-primer/</a><br />
<a href="http://thesecretlivesofdata.com/">http://thesecretlivesofdata.com</a></div>
</div>
</div>
</div>
siddheshwarhttp://www.blogger.com/profile/06138213414248415451noreply@blogger.com2tag:blogger.com,1999:blog-5276776697364295662.post-56302824080377217202017-07-12T01:52:00.001-07:002017-07-18T05:33:36.300-07:00Role of Commit Log in Databases and Distributed Systems<div dir="ltr" style="text-align: left;" trbidi="on">
<div dir="ltr" style="color: #444444; font-family: "opensans" , sans-serif; font-size: 16px;">
Before jumping to the topic, let's start with a few questions -<br />
<ul style="color: #444444; font-family: "opensans" , sans-serif; font-size: 16px;">
<li style="text-align: justify;">What if a debit from one account succeeded, but the DB crashed before the credit to the second account? Quite possible, right? How do relational DBs handle such crashes and ensure that the data is in a consistent state when they come back up? (In other words, how do DBs achieve atomicity and durability?)</li>
<li style="text-align: justify;">How do databases like Oracle, MongoDB, and PostgreSQL keep replicas in sync with the master?</li>
<li style="text-align: justify;">In a distributed database, how do two components of the system agree on a given update order? Assume that two changes arrive in a different order than they were sent, due to network issues, latency, asynchrony or some other problem; how can the system know the correct update order? In Kafka this is known as the ordering guarantee. </li>
<li style="text-align: justify;">How does a process remain consistent across different nodes? Or, how does a particular update get consistently applied to different replicas?</li>
<li style="text-align: justify;">How do you force multiple nodes in a distributed system to do the same thing? Or, how do you ensure that a <i><u>deterministic process stays deterministic</u>?</i></li>
</ul>
<div>
<br /></div>
<div style="color: #444444; font-family: "opensans" , sans-serif; font-size: 16px;">
The answer to all of the above questions is the <b>Log</b>!<br />
In the database and systems world it is called a <b>write-ahead log</b>, <i><b>commit log</b> or <b>journal</b></i> (similar to application logs, but meant only for programmatic access). Jay Kreps, in his book <i><a href="https://www.amazon.com/Heart-Logs-Stream-Processing-Integration/dp/1491909382/ref=sr_1_1?ie=UTF8&qid=1499836980&sr=8-1&keywords=i+love+logs">I Love Logs</a></i>, defines it as <span style="color: #741b47;">an append-only sequence of records ordered by time.</span> </div>
<div>
<br />
<div class="separator" style="clear: both; text-align: center;">
<a href="https://blogger.googleusercontent.com/img/b/R29vZ2xl/AVvXsEhvcql8iu8YmxkmM9Y2H3DJ3i7igI3zBx2bzAsCjQHtcUUx-e3VzU_DDzP69GMWgFnveW2rVUp0xcVIeawrMbTzRhLMk48FBbicPEH-0hjqBZusZJMBU7zYJHh5SYhY2Lo7J8FvucHdb58/s1600/log.png" imageanchor="1" style="margin-left: 1em; margin-right: 1em;"><img border="0" data-original-height="187" data-original-width="396" height="151" src="https://blogger.googleusercontent.com/img/b/R29vZ2xl/AVvXsEhvcql8iu8YmxkmM9Y2H3DJ3i7igI3zBx2bzAsCjQHtcUUx-e3VzU_DDzP69GMWgFnveW2rVUp0xcVIeawrMbTzRhLMk48FBbicPEH-0hjqBZusZJMBU7zYJHh5SYhY2Lo7J8FvucHdb58/s320/log.png" width="320" /></a></div>
<br /></div>
<div>
<br /></div>
<div>
<div style="color: #444444; font-family: "opensans" , sans-serif; font-size: 16px;">
Each record (a rectangle in the image above) gets appended to the log from left to right. Each entry is assigned a unique, sequential log entry number which acts as its unique key. The records are ordered by time, with the leftmost being the oldest. <span style="color: #741b47;">So, the log records what happened and when, which is quite handy in distributed data systems.</span> </div>
</div>
<div style="color: #444444; font-family: "opensans" , sans-serif; font-size: 16px;">
<br />
<b>Common uses of a commit log:</b> <i>data integration, enterprise architecture, real-time data processing, and data system design. </i></div>
<div>
<br />
<h3 style="text-align: left;">
How Does a Commit Log Help?</h3>
</div>
<div>
<div style="color: #444444; font-family: "opensans" , sans-serif; font-size: 16px;">
<div style="text-align: justify;">
<span style="color: #741b47;"><i>Relational databases, NoSQL databases, and distributed systems first write the event/update information to a commit log. All relevant details are logged before the changes are actually applied to the system's data. Appending to the log is a local (non-distributed), sequential write that can be flushed to disk before the change is acknowledged, so the log becomes the single source of truth for applying those changes. Even if the system crashes or faults, once it recovers it checks the commit log and applies the pending changes.</i></span></div>
</div>
<br /></div>
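<div>
The log-first, apply-later idea can be sketched in a few lines of Java. This is only a toy illustration, not how any particular database implements its write-ahead log: the class name <i>TinyWalStore</i>, the <i>key=value</i> line format and the account names are all assumptions made up for this example.</div>

```java
import java.io.IOException;
import java.nio.charset.StandardCharsets;
import java.nio.file.Files;
import java.nio.file.Path;
import java.nio.file.StandardOpenOption;
import java.util.HashMap;
import java.util.Map;

/** Hypothetical toy store: every update is appended to a log file before it is applied. */
final class TinyWalStore {
    private final Path logFile;
    private final Map<String, Long> balances = new HashMap<>();

    TinyWalStore(Path logFile) throws IOException {
        this.logFile = logFile;
        replay(); // recovery: rebuild the in-memory state from the log
    }

    /** Step 1: append the change to the log (synchronously). Step 2: apply it in memory. */
    void set(String account, long balance) throws IOException {
        String record = account + "=" + balance + "\n";
        Files.write(logFile, record.getBytes(StandardCharsets.UTF_8),
                StandardOpenOption.CREATE, StandardOpenOption.APPEND, StandardOpenOption.SYNC);
        balances.put(account, balance);
    }

    /** Returns the last known balance, or null if the account was never written. */
    Long get(String account) {
        return balances.get(account);
    }

    /** Re-applies every logged record in order; later records overwrite earlier ones. */
    private void replay() throws IOException {
        if (!Files.exists(logFile)) {
            return;
        }
        for (String line : Files.readAllLines(logFile, StandardCharsets.UTF_8)) {
            int idx = line.indexOf('=');
            if (idx < 0) {
                continue; // skip a malformed record
            }
            try {
                balances.put(line.substring(0, idx), Long.parseLong(line.substring(idx + 1)));
            } catch (NumberFormatException e) {
                // skip a half-written record; a real log would rely on checksums instead
            }
        }
    }
}
```

<div>
Creating a second <i>TinyWalStore</i> on the same file simulates a restart after a crash: the constructor replays the log, so the new instance sees every update that was logged before the crash.</div>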
<div>
<div style="color: #444444; font-family: "opensans" , sans-serif; font-size: 16px;">
<span style="color: #741b47;">Similarly, the sequence of events/changes that happens on the primary data node is exactly what is needed to keep a remote replica database in sync. The replica node/DB can apply the changes recorded in the commit log to its own local data store to stay in sync with the master. </span><br />
<span style="color: #741b47;"><br /></span>
<span style="color: #741b47;">The problem of data integration can also be solved through commit logs: take all of the organisation's data and put it in a centralised log for processing. That's what specialised systems like Kafka do. </span></div>
</div>
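<div>
Replication from a log can be sketched the same way. The classes below (<i>Primary</i>, <i>Replica</i>, <i>Entry</i>) are hypothetical names for an in-memory illustration; real systems ship the log over the network and persist it, which this sketch leaves out. The point is only that a replica which remembers the next sequence number it needs can always catch up by applying the missing entries in log order.</div>

```java
import java.util.ArrayList;
import java.util.HashMap;
import java.util.List;
import java.util.Map;

final class ReplicationSketch {
    /** One log record: a sequence number plus the change it carries. */
    static final class Entry {
        final long seq;
        final String key;
        final String value;

        Entry(long seq, String key, String value) {
            this.seq = seq;
            this.key = key;
            this.value = value;
        }
    }

    /** Primary: appends every change to its log; the list index is the sequence number. */
    static final class Primary {
        private final List<Entry> log = new ArrayList<>();

        void write(String key, String value) {
            log.add(new Entry(log.size(), key, value));
        }

        /** Returns a copy of all entries with sequence number >= fromSeq, in log order. */
        List<Entry> entriesFrom(long fromSeq) {
            return new ArrayList<>(log.subList((int) fromSeq, log.size()));
        }
    }

    /** Replica: applies entries strictly in log order and remembers where it stopped. */
    static final class Replica {
        final Map<String, String> data = new HashMap<>();
        long nextSeq = 0;

        void catchUp(Primary primary) {
            for (Entry e : primary.entriesFrom(nextSeq)) {
                data.put(e.key, e.value);
                nextSeq = e.seq + 1;
            }
        }
    }
}
```

<div>
Because every replica applies the same entries in the same order, two replicas that have caught up to the same sequence number hold the same data.</div>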
<div>
<br /></div>
<div>
<br />
Happy logging!<br />
<br /></div>
<div>
- - -</div>
<div>
References:</div>
<div>
<br />
<a href="https://www.confluent.io/blog/stream-data-platform-1/">https://www.confluent.io/blog/stream-data-platform-1/</a><br />
<a href="https://www.amazon.com/Heart-Logs-Stream-Processing-Integration/dp/1491909382/ref=sr_1_1?ie=UTF8&qid=1499836980&sr=8-1&keywords=i+love+logs">https://www.amazon.com/Heart-Logs-Stream-Processing-Integration/dp/1491909382/ref=sr_1_1?ie=UTF8&qid=1499836980&sr=8-1&keywords=i+love+logs</a><br />
<a href="https://engineering.linkedin.com/distributed-systems/log-what-every-software-engineer-should-know-about-real-time-datas-unifying">https://engineering.linkedin.com/distributed-systems/log-what-every-software-engineer-should-know-about-real-time-datas-unifying</a></div>
</div>
</div>
siddheshwarhttp://www.blogger.com/profile/06138213414248415451noreply@blogger.com0tag:blogger.com,1999:blog-5276776697364295662.post-27099887292759604712017-06-11T22:44:00.002-07:002017-07-25T21:53:59.739-07:00Securing Communication between Data Centre and Cloud<div dir="ltr" style="text-align: left;" trbidi="on">
<div dir="ltr" style="font-family: opensans, sans-serif; font-size: 16px;">
<div dir="ltr" style="font-family: opensans, sans-serif; font-size: 16px;">
<div style="color: #444444; text-align: justify;">
In the early days of my career, I used to wonder why we connected to the office network using a crypto card. If you have little or no idea what a VPN is (just like me back then :D), I would recommend this <a href="http://www.techhive.com/article/3158192/privacy/howand-whyyou-should-use-a-vpn-any-time-you-hop-on-the-internet.html">link</a>, which covers the fundamentals of VPN. </div>
<div style="color: #444444;">
<br /></div>
<div style="color: #444444;">
Let's start with the definition of a VPN gateway:</div>
<div style="color: #444444;">
<br /></div>
<h3 style="color: #444444; font-family: opensans, sans-serif; font-size: 16px;">
VPN Gateway</h3>
<div style="font-family: opensans, sans-serif; font-size: 16px;">
<div style="text-align: justify;">
<span style="color: #4c1130;">A VPN gateway is a type of networking device that connects two or more devices or networks together in a VPN infrastructure. It is designed to create connection or communication between two or more remote sites, networks or devices and/or connect multiple VPNs together.</span> <a href="https://www.techopedia.com/definition/30755/vpn-gateway" style="color: #444444;">Ref</a></div>
</div>
<div style="color: #444444;">
<br /></div>
<div style="color: #444444;">
<br /></div>
<h3 style="color: #444444; font-family: opensans, sans-serif; font-size: 16px;">
From the Perspective of Cloud</h3>
<div style="font-family: opensans, sans-serif; font-size: 16px;">
<div style="text-align: justify;">
<div style="color: #444444;">
Companies are gradually moving their systems to the cloud, so there is a need for secure connectivity between the data centre and cloud-hosted applications. The cloud has become a logical extension of the corporate datacenter (this is referred to as a hybrid datacenter). </div>
<div style="color: #444444;">
<br /></div>
<span style="color: #4c1130;"><i>The cloud-hosted application should be able to talk securely to on-premises data or applications. This is where the VPN gateway comes into play, securing the link from one site to another. A VPN builds a secure tunnel between two remote sites.</i></span><br />
<div style="color: #444444;">
<br /></div>
<div style="color: #444444;">
The diagram below shows a VPC inside AWS and GCP. You can think of a VPC (Virtual Private Cloud) as a cloud inside a cloud, or a logical datacenter inside AWS (or GCP, Google Cloud Platform). </div>
</div>
</div>
<div style="color: #444444;">
<br /></div>
<div style="color: #444444;">
<br /></div>
<div class="separator" style="color: #444444; font-family: opensans, sans-serif; font-size: 16px;">
<a href="https://blogger.googleusercontent.com/img/b/R29vZ2xl/AVvXsEhGjgv2baNw2QcO8satIOFOSMFTZMbJlu_sTVeptPwFKEpCI7_lsKMHFIZ2mzpVrr8e96eDWoTm58PjYTiKl2xCN3HPNjDuULGI3AzhK26jTFqM3nQyBKp-elC3kUjFZ6WFMqPtHM0WEYw/s1600/Screen+Shot+2017-06-04+at+1.16.55+PM.png" imageanchor="1" style="margin-left: 1em; margin-right: 1em;"><img border="0" data-original-height="966" data-original-width="1588" height="387" src="https://blogger.googleusercontent.com/img/b/R29vZ2xl/AVvXsEhGjgv2baNw2QcO8satIOFOSMFTZMbJlu_sTVeptPwFKEpCI7_lsKMHFIZ2mzpVrr8e96eDWoTm58PjYTiKl2xCN3HPNjDuULGI3AzhK26jTFqM3nQyBKp-elC3kUjFZ6WFMqPtHM0WEYw/s640/Screen+Shot+2017-06-04+at+1.16.55+PM.png" width="640" /></a></div>
<div style="color: #444444;">
<br /></div>
<div style="color: #444444;">
<span style="color: #444444; font-family: "opensans" , sans-serif; font-size: 16px;"><br /></span>
<span style="color: #444444; font-family: "opensans" , sans-serif; font-size: 16px;">Traffic travelling between the two networks is encrypted by the originator's VPN gateway and then decrypted by the receiver's VPN gateway. </span><br />
<span style="color: #444444; font-family: "opensans" , sans-serif; font-size: 16px;"><br /></span></div>
</div>
</div>
</div>
siddheshwarhttp://www.blogger.com/profile/06138213414248415451noreply@blogger.com0tag:blogger.com,1999:blog-5276776697364295662.post-61869829524297746972017-06-11T04:08:00.000-07:002017-06-12T23:09:10.244-07:00Count number of different bits in two Numbers<div dir="ltr" style="text-align: left;" trbidi="on">
<span style="font-family: "times" , "times new roman" , serif; font-size: large;"><b>Problem</b>:</span><br />
<span style="font-family: "times" , "times new roman" , serif; font-size: large;">Given two numbers, find how many bits differ between them.</span><br />
<span style="font-family: "times" , "times new roman" , serif; font-size: large;">Or, another way to look at the problem: determine the number of bits that must be flipped to convert <i>num_1</i> to <i>num_2</i>.</span><br />
<span style="font-family: "times" , "times new roman" , serif; font-size: large;"><br /></span>
<span style="color: #666666; font-family: "times" , "times new roman" , serif; font-size: large;"><i>num_1</i> = 1</span><br />
<span style="color: #666666; font-family: "times" , "times new roman" , serif; font-size: large;"><i>num_2</i> = 0</span><br />
<span style="font-family: "times" , "times new roman" , serif; font-size: large;">Number of different bits = 1</span><br />
<span style="font-family: "times" , "times new roman" , serif; font-size: large;"><br /></span>
<span style="color: #999999; font-family: "times" , "times new roman" , serif; font-size: large;"><i>num_1 = 11111</i></span><br />
<span style="color: #999999; font-family: "times" , "times new roman" , serif; font-size: large;"><i>num_2 = 01110</i></span><br />
<i><span style="font-family: "times" , "times new roman" , serif; font-size: large;">Number of different bits = 2</span></i><br />
<span style="font-family: "times" , "times new roman" , serif; font-size: large;"><i><br /></i>
<b>Solution:</b></span><br />
<span style="font-family: "times" , "times new roman" , serif; font-size: large;">Basically, we need to check at each position whether the bits of the two numbers are the same or different. If they are different, increment the counter; repeat for all subsequent bits.</span><br />
<span style="font-family: "times" , "times new roman" , serif; font-size: large;"><br /></span>
<span style="font-family: "times" , "times new roman" , serif; font-size: large;">It might not be obvious from the problem, but there is a bitwise operator that finds exactly where two inputs differ. Let's apply the XOR operator and see how it behaves:</span><br />
<span style="font-family: "times" , "times new roman" , serif; font-size: large;"><br /></span>
<span style="font-family: "times" , "times new roman" , serif; font-size: large;"><b>1 ^ 0 = 1</b></span><br />
<span style="font-family: "times" , "times new roman" , serif; font-size: large;"><b>0 ^ 0 = 0</b></span><br />
<span style="font-family: "times" , "times new roman" , serif; font-size: large;"><b>0 ^ 1 = 1</b></span><br />
<span style="font-family: "times" , "times new roman" , serif; font-size: large;"><b>1 ^ 1 = 0</b></span><br />
<span style="font-family: "times" , "times new roman" , serif; font-size: large;"><br /></span>
<span style="color: #741b47; font-family: "times" , "times new roman" , serif; font-size: large;">Notice that when both bits are the same the output is always 0, and when the bits are different the output is 1.</span><br />
<span style="font-family: "times" , "times new roman" , serif; font-size: large;"><br /></span>
<span style="font-family: "times" , "times new roman" , serif; font-size: large;"> 11111</span><br />
<span style="font-family: "times" , "times new roman" , serif; font-size: large;">^ 01110</span><br />
<span style="font-family: "times" , "times new roman" , serif; font-size: large;">-------------</span><br />
<span style="font-family: "times" , "times new roman" , serif; font-size: large;"> 10001</span><br />
<span style="font-family: "times" , "times new roman" , serif; font-size: large;"><br /></span>
<span style="font-family: "times" , "times new roman" , serif; font-size: large;">So after taking XOR, we just need to count the number of 1's in the result.</span><br />
<span style="font-family: "times" , "times new roman" , serif; font-size: large;"><br /></span>
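<div>
As a quick sanity check of the XOR idea, here is a small sketch that leans on the JDK's built-in <i>Integer.bitCount</i> to count the 1's. The class and method names (<i>BitDifference</i>, <i>differingBits</i>) are made up for this example; the inputs are the ones used above.</div>

```java
final class BitDifference {
    /** Number of bit positions in which a and b differ: XOR, then count the 1 bits. */
    static int differingBits(int a, int b) {
        return Integer.bitCount(a ^ b);
    }

    public static void main(String[] args) {
        int a = 0b11111;
        int b = 0b01110;
        System.out.println(Integer.toBinaryString(a ^ b)); // prints 10001
        System.out.println(differingBits(a, b));           // prints 2
        System.out.println(differingBits(1, 0));           // prints 1
    }
}
```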
<span style="font-family: "times" , "times new roman" , serif; font-size: large;"><br /></span>
<b><span style="font-family: "times" , "times new roman" , serif; font-size: large;">Java Implementation</span></b><br />
<span style="font-family: "times" , "times new roman" , serif; font-size: large;"><b><br /></b>
<span style="color: #741b47;"><i>public static int countNumberOfDifferentBits(int a, int b){</i></span></span><br />
<span style="color: #741b47; font-family: "times" , "times new roman" , serif; font-size: large;"><i> int xor = a ^ b;</i></span><br />
<span style="color: #741b47; font-family: "times" , "times new roman" , serif; font-size: large;"><i> int count = 0;</i></span><br />
<span style="color: #741b47; font-family: "times" , "times new roman" , serif; font-size: large;"><i> for(int i= xor; i!=0;){</i></span><br />
<span style="color: #741b47; font-family: "times" , "times new roman" , serif; font-size: large;"><i> count += i &amp; 1;</i></span><br />
<span style="color: #741b47; font-family: "times" , "times new roman" , serif; font-size: large;"><i> i = i &gt;&gt;&gt; 1; // unsigned shift, so the loop also ends for negative values</i></span><br />
<span style="color: #741b47; font-family: "times" , "times new roman" , serif; font-size: large;"><i> }</i></span><br />
<span style="color: #741b47; font-family: "times" , "times new roman" , serif; font-size: large;"><i> return count;</i></span><br />
<span style="color: #741b47; font-family: "times" , "times new roman" , serif; font-size: large;"><i>}</i></span></div>
siddheshwarhttp://www.blogger.com/profile/06138213414248415451noreply@blogger.com1tag:blogger.com,1999:blog-5276776697364295662.post-57774633286165282252017-06-11T04:05:00.000-07:002017-06-12T23:09:19.710-07:00Measuring Execution Time of a Method in Java<div dir="ltr" style="text-align: left;" trbidi="on">
<b><span style="font-family: "times" , "times new roman" , serif; font-size: large;">Old fashioned Way</span></b><br />
<div>
<i><span style="font-family: "times" , "times new roman" , serif; font-size: large;">System.currentTimeMillis()</span></i><br />
<div>
<span style="font-family: "times" , "times new roman" , serif; font-size: large;">Accuracy is only in milliseconds, so if you are timing a very short method you might not get meaningful results.</span></div>
<div>
<span style="font-family: "times" , "times new roman" , serif; font-size: large;"><br /></span></div>
<div>
<div style="font-family: Monaco; font-size: 11px;">
<span style="font-family: "times" , "times new roman" , serif; font-size: large;">List&lt;Integer&gt; <span style="color: #7e504f;">input</span> = getInputList();</span></div>
<div style="font-family: Monaco; font-size: 11px;">
<span style="font-family: "times" , "times new roman" , serif; font-size: large;"><span style="color: #931a68;">long</span> <span style="color: #7e504f;">t1</span> = System.currentTimeMillis();</span></div>
<div style="font-family: Monaco; font-size: 11px;">
<span style="font-family: "times" , "times new roman" , serif; font-size: large;">Collections.sort(<span style="color: #7e504f;">input</span>);</span></div>
<div style="font-family: Monaco; font-size: 11px;">
<span style="font-family: "times" , "times new roman" , serif; font-size: large;"><span style="color: #931a68;">long</span> <span style="color: #7e504f;">t2</span> = System.currentTimeMillis();</span></div>
<div style="color: #3933ff;">
<span style="font-family: "times" , "times new roman" , serif; font-size: large;"><span style="color: black;">System.</span><span style="color: #0326cc;">out</span><span style="color: black;">.println(</span>"Time Taken ="<span style="color: black;">+ (</span><span style="color: #7e504f;">t2</span><span style="color: black;">-</span><span style="color: #7e504f;">t1</span><span style="color: black;">) + </span>" in milli seconds"<span style="color: black;">);</span></span></div>
</div>
<div>
<span style="font-family: "times" , "times new roman" , serif; font-size: large;"><br /></span></div>
<div>
<b><span style="font-family: "times" , "times new roman" , serif; font-size: large;">Using Nano seconds</span></b></div>
</div>
<div>
<span style="font-family: "times" , "times new roman" , serif; font-size: large;">System.nanoTime()</span></div>
<div>
<span style="font-family: "times" , "times new roman" , serif; font-size: large;">Preferred approach (compared to the first one). But keep in mind that not all systems provide nanosecond accuracy.</span></div>
<div>
<span style="font-family: "times" , "times new roman" , serif; font-size: large;"><br /></span></div>
<div>
<div style="font-family: Monaco; font-size: 11px;">
<span style="font-family: "times" , "times new roman" , serif; font-size: large;">List&lt;Integer&gt; <span style="color: #7e504f;">input</span> = getInputList();</span></div>
<div style="font-family: Monaco; font-size: 11px;">
<span style="font-family: "times" , "times new roman" , serif; font-size: large;"><span style="color: #931a68;">long</span> <span style="color: #7e504f;">t1</span> = System.nanoTime();</span></div>
<div style="font-family: Monaco; font-size: 11px;">
<span style="font-family: "times" , "times new roman" , serif; font-size: large;">Collections.sort(<span style="color: #7e504f;">input</span>);</span></div>
<div style="font-family: Monaco; font-size: 11px;">
<span style="font-family: "times" , "times new roman" , serif; font-size: large;"><span style="color: #931a68;">long</span> <span style="color: #7e504f;">t2</span> = System.nanoTime();</span></div>
<div style="color: #3933ff;">
<span style="font-family: "times" , "times new roman" , serif; font-size: large;"><span style="color: black;">System.</span><span style="color: #0326cc;">out</span><span style="color: black;">.println(</span>"Time Taken ="<span style="color: black;">+ (</span><span style="color: #7e504f;">t2</span><span style="color: black;">-</span><span style="color: #7e504f;">t1</span><span style="color: black;">) + </span>" in nano seconds"<span style="color: black;">);</span></span></div>
</div>
<div>
<span style="font-family: "times" , "times new roman" , serif; font-size: large;"><br /></span></div>
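<div>
The <i>System.nanoTime()</i> pattern can be wrapped in a small helper so it is not repeated around every call. This is just a sketch: <i>TimingSketch</i> and <i>timeNanos</i> are names I made up, and the random list stands in for <i>getInputList()</i>, which is not shown in this post.</div>

```java
import java.util.ArrayList;
import java.util.Collections;
import java.util.List;
import java.util.Random;
import java.util.concurrent.TimeUnit;

final class TimingSketch {
    /** Runs the task once and returns the elapsed time in nanoseconds. */
    static long timeNanos(Runnable task) {
        long start = System.nanoTime();
        task.run();
        return System.nanoTime() - start;
    }

    public static void main(String[] args) {
        // Stand-in input: 100,000 pseudo-random integers (fixed seed for repeatability).
        List<Integer> input = new ArrayList<>();
        Random random = new Random(42);
        for (int i = 0; i < 100_000; i++) {
            input.add(random.nextInt());
        }

        long nanos = timeNanos(() -> Collections.sort(input));
        System.out.println("Time Taken = " + TimeUnit.NANOSECONDS.toMillis(nanos)
                + " ms (" + nanos + " ns)");
    }
}
```

<div>
A single run like this is only a rough number; JIT warm-up and garbage collection can easily distort it.</div>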
<div>
<b><span style="font-family: "times" , "times new roman" , serif; font-size: large;">Java 8</span></b></div>
<div>
<div style="font-family: Monaco; font-size: 11px;">
<span style="font-family: "times" , "times new roman" , serif; font-size: large;">List&lt;Integer&gt; <span style="color: #7e504f;">input</span> = getInputList();</span></div>
<div style="font-family: Monaco; font-size: 11px;">
<span style="font-family: "times" , "times new roman" , serif; font-size: large;">Instant <span style="color: #7e504f;">start</span> = Instant.now();</span></div>
<div style="font-family: Monaco; font-size: 11px;">
<span style="font-family: "times" , "times new roman" , serif; font-size: large;">Collections.sort(<span style="color: #7e504f;">input</span>);</span></div>
<div style="font-family: Monaco; font-size: 11px;">
<span style="font-family: "times" , "times new roman" , serif; font-size: large;">Instant <span style="color: #7e504f;">end</span> = Instant.now();</span></div>
<div style="font-family: Monaco; font-size: 11px;">
<span style="font-family: "times" , "times new roman" , serif; font-size: large;">System.<span style="color: #0326cc;">out</span>.println(<span style="color: #3933ff;">"Time Taken ="</span>+ Duration.between(<span style="color: #7e504f;">start</span>, <span style="color: #7e504f;">end</span>).toNanos() + <span style="color: #3933ff;">" in nano seconds"</span>);</span></div>
<div style="font-family: Monaco; font-size: 11px;">
<span style="font-family: "times" , "times new roman" , serif; font-size: large;"><br /></span></div>
<h3 style="text-align: left;">
<span style="font-family: "times" , "times new roman" , serif; font-size: large; font-weight: normal;">Guava</span></h3>
</div>
<div>
<span style="font-family: "times" , "times new roman" , serif; font-size: large; font-weight: normal;">Stopwatch stopwatch = Stopwatch.createStarted();</span></div>
<div>
<span style="font-family: "times" , "times new roman" , serif; font-size: large; font-weight: normal;">Collections.sort(input);</span></div>
<div>
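<span style="font-family: "times" , "times new roman" , serif; font-size: large;">stopwatch.stop();</span></div>
<div>
<span style="font-family: "times" , "times new roman" , serif; font-size: large;">System.out.println("Time Taken =" + stopwatch.elapsed(TimeUnit.MILLISECONDS) + " in milli seconds");</span></div>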
<div>
<span style="font-family: "times" , "times new roman" , serif; font-size: large;"><br /></span></div>
</div>
siddheshwarhttp://www.blogger.com/profile/06138213414248415451noreply@blogger.com0