Sunday, January 24, 2016

The Cost of Concurrency

Concurrency is not free!

Modern libraries provide a wonderful abstraction for programmers, so doing certain task concurrently or asynchronously is quite trivial. It is as simple as instantiating an object and calling few methods on it, and you are done! These libraries are abstracted in such a way that they don't even remind to programmers that you are going to deal with threads. And this is where the lazy programmer can take things for granted.

You need to process 100 task, create 50 threads.
     Collection<Task> task = fetchTasks();   //from somewhere
     int numberOfThreads = 50;
    obj.executeConcurrently(tasks, numberOfThreads);
In object oriented world, all it takes is a method call. 

To understand the cost of concurrency, let's take a step back and ask yourself how is it implemented? It is implemented through locks. Locks provide mutual exclusion and ensure that the visibility of change occurs in an ordered manner. 

Locks are expensive because they require arbitration when contended. This arbitration is achieved by a context switch at the OS level which will suspend threads waiting for lock until it is released. Context switch might cause performance penalty as OS might decide to do some other housekeeping job and so will lose the cached instruction and data. This is even more evident in multicore CPUs where each core has its own cache.  In the worst case, this might cause latency equivalent to that of an I/O operation. 

Another aspect of concurrency is managing the lifecycle of threads. OS does the dirty job of creating threads and managing them on behalf of your platform (or runtime environment). There are certain limits on the number of threads which can be created at the system level. So definitely, proper thoughts should be given on how many threads are required to accomplish a job.


Don't blindly decide to execute task concurrently!

Friday, January 22, 2016

Concurrency or Thread Model of Java

Thread Model in Java is built around shared memory and Locking!

Concurrency basically means two or more tasks happen in parallel and they compete to access a resource.  In the object-oriented world, the resource would be an object which could abstract a database, file, socket, network connection etc. In the concurrent environment, multiple threads try to get hold of the same resource. Locks are used to ensuring consistency in case there is a possibility of concurrent execution of a piece of code. Let's cover these aspects briefly:

Concurrent Execution is about

  1. Mutual Exclusion or Mutex
  2. Memory Consistency or Visibility of change

Mutual Exclusion
Mutual exclusion ensures that at a given point of time only one thread can modify a resource.  If you write an algorithm which guarantees that a given resource can be modified by (only) one thread then mutual exclusion is not required.  


Visibility of Change
This is about propagating changes to all threads. If a thread modifies the value of a resource and then (right after that) another thread wants to read the same then the thread model should ensure that read thread/task gets the updated value. 

The most costly operation in a concurrent environment contends write access. Write access to a resource, by multiple threads, requires expensive and complex coordination. Both read as well as write requires that all changes are made visible to other threads. 


Locks

Locks provide mutual exclusion and ensure that visibility of change is guaranteed (Java implements locks using the synchronized keyword which can be applied on a code block or method).

Read about the cost of locks, here

Wednesday, January 20, 2016

Preventing logging overhead in Log4J

Have you seen something like below in your application, and ever wondered about the overhead (due to the if condition). This post covers ways to get rid of the overhead with Log4j 2.x and Java 8.

if(log.isDebugEnabled()){
   log.debug("Person="+ person);
}

Above style is quite common in log4j-1.x; though it adds few extra lines it improves the performance.


Below log calls toString method on the person even if it's not going to get logged.
log.debug(person);  //big NO; avoid this !

So how do we remove the overhead of if check

The if condition is an overhead and it's going to appear multiple places in method/class. Also, if you don't do logging judiciously then it can be spread all over.

log4j 2.x
log4j 2.x is out there after a long gap and this particular issue has been addressed. Inspired by SLF4J it has added parameterized logging.

log.debug("{}"+person); //will not call .toString method on person
log.debug("{}"+person.toString());   //this is still going to be an issue

So log4j 2.x parameterized logging mechanism will not call implicit (toString) method on the person, but if you call it explicitly then it will make the call. So log4j 2 has partially addressed the problem.

log4j 2.4 + Java 8 lambda
log4j 2.4 supports lambda based mechanism which solves the problem completely. So it doesn't call the method (implicit or explicit) at all if the statement being logged is at the level less specific than the current level.

log.debug(() -> person.toString());   //all good 
log.debug("{}"+ this::expensiveOperation());   //all good

Final Note:
Be mindful of logging overhead when you do code review!


References
https://logging.apache.org/log4j/2.0/manual/api.html
http://marxsoftware.blogspot.pt/2015/10/log4j2-non-logging-performance.html
http://www.jroller.com/bantunes/entry/java_lazy_logging_features_ways
http://www.infoq.com/news/2014/07/apache-log4j2
http://www.grobmeier.de/log4j-2-performance-close-to-insane-20072013.html

Saturday, January 16, 2016

Extracting root cause from Stack Trace

Don't tell me the problem; show me the logs first!

Whether you are a fresh graduate, an experienced programmer, QA engineer, production engineer or even a product manager - a good understanding of stack trace is vital to crack critical issues. Your ability to find the real culprit from a lengthy stack trace will be instrumental in resolving a problem. This is even more important if you work on a distributed system where you use many libraries so stack trace is not well structured. Let's start with a simple scenario-

Scenario 1: Simple

This is the most trivial case, where the exception gets thrown by a method of your project and during the call duration, it doesn't go out of your code base. This is the most trivial scenario which you will encounter but very important to understand how the stack trace gets printed.

Shown below is Eclipse screenshot of MyController.java which two classes. Right click and run the program. 

Let's Decode Above stack trace:
  • RuntimeException is shown in line number 29, in method MyService.four()
  • Method MyService.four() gets called by MyService.three() in line number 25
  • Method MyService.three() gets called by MyController.two() in line 11
  • Method MyController.two() gets called by MyController.one() in line 6
  • Method MyController.one() gets called by MyController.main() in line 17


First frame of stack trace holds all important information required to know the root cause. 
Be mindful of the very important line number 

Scenario 2: Chain Of Exception

Let's modify above code a bit by catching exception at the origin and then throwing a brand new Exception. 


Let's Decode Above stack trace:

This stack trace has a caused by section. It has only one caused by but in real applications you can have multiple caused by sections. The last caused by will have the root cause of the exception.


Caused by: java.lang.RuntimeException: here, comes the exception!
at MyService.four(MyController.java:30)


... 4 more


But, if you are using external jars or libraries, finding the root cause could be bit tricky as the real reason might be nested deep inside. In such case you should look for Class.method name which belongs to your application. Also, you should look the complete stack trace carefully as the real root cause could lie in any part of stack trace. 


References:
http://www.nurkiewicz.com/2011/09/logging-exceptions-root-cause-first.html
http://stackoverflow.com/questions/12688068/how-to-read-and-understand-the-java-stack-trace
http://stackoverflow.com/questions/3988788/what-is-a-stack-trace-and-how-can-i-use-it-to-debug-my-application-errors


Friday, January 15, 2016

Logging Hibernate Query

To log Hibernate SQL queries, we need to set below properties to true in the configuration of the session factory (only the first one is mandatory).

<property name="hibernate.show.sql" value="true" />
<property name="hibernate.format.sql" value="true" />
<property name="hibernate.use.sql.comments" value="true" />

show.sql : logs SQL queries
format.sql: pretty prints the SQL
use.sql.comments: adds explanatory comment

This is how a sample query will look like, after enabling above properties:

/* insert com.co.orm.AuditLogRecord  */
insert into 
audit_log
(creationTime, createdBy, deleted, name, operationType, rowId, details)
values
(?,?,?,?,?,?,?,?)

Decoding Above Query

Hibernate logs above query which gets sent to the JDBC driver. Hibernate logs prepared statement and that's why you are seeing? instead of the actual values.

Hibernate only knows about the prepared statement that it sends to the JDBC driver.  It's JDBC driver that builds the actual query and sends them to the database for execution.

This means that, if you want Hibernate to log the actual SQL query with the value embedded, it would have to generate them for you just for logging! 

If you want to see actual query:

Thursday, January 14, 2016

Binary Tree Level Order Traversal

Pre-order, In-order and Post-order traversal of tree use Depth First Search (DFS). These are called DFS since these techniques visit the tree deeper and deeper until it reaches the leaf node. Level-order traversal (as the name suggest) visits the nodes level by level. Level Order Traversal is Breadth First Search(BFS). Below diagram shows Level Order traversal of a binary tree. 



Let's see implementation techniques:

Without Using External Storage

Applying recursion in BFS traversal is non-trivial. So in this case, the most obvious approach could be to print the nodes at each level iteratively. Above tree has three level, which equals the height of the tree. Now, how do we print all nodes at a given level?

To print nodes at level 2(i.e. 2 and 3), we need to start at the root and keep traversing down the tree recursively. Each time we go one level down the level value needs to be reduced by 1. This way both left and right subtree will keep going down the tree independently until the level becomes 1.

   public void levelOrderIteratively(){
    int height = this.getHeight(root);
    for(int i=1; i<= height; i++){
    printNodesAtGivenLevel(root, i);
    }
    }

//other methods are given in full implementation

The below implementation prints all nodes in the same line. In above method, we are calling a separate method to print for each level so we can format it the way we want (something like below can be easily done by tweaking above for loop. 


Nodes at level 1 = 1 

Nodes at level 2 = 2 3 
Nodes at level 3 = 4 5 6 

Using external storage / Using Queue

One of the de-facto approaches to implementing any breadth-first traversal is using Queue (First in First out data structure). BFS enables to visit nodes of the tree level by level. 

Java Implementation

Below java implementation provides level order traversal using an iterative technique which doesn't require extra storage as well as using Queue. 

package algo;

import java.util.LinkedList;
import java.util.Queue;

/**
 * Generic Binary Tree Implementation
 * @author Siddheshwar
 *
 * @param <E>
 */
public class BinaryTree<E> {
 /**
  * Root node of the tree
  */
 Node<E> root;

 /**
  * Node class to represent each node of the tree
  */
 static class Node<E> {
  E data;
  Node<E> left;
  Node<E> right;

  Node(E d) {
   this.data = d;
   this.left = null;
   this.right = null;
  }
 }

 /**
  * Get height of the tree
  * 
  * @param node
  *            root node
  * @return height of the tree which is max of left subtree height or right
  *         subtree height + 1 (for root)
  */
 public int getHeight(Node<E> node) {
  if (node == null) {
   return 0;
  }

  return Math.max(getHeight(node.left), getHeight(node.right)) + 1;
 }

 /**
  * Print binary tree in level-order
  */
 public void levelOrderIteratively() {
  int height = this.getHeight(root);
  for (int i = 1; i <= height; i++) {
   printNodesAtGivenLevel(root, i);
  }
 }

 /**
  * Print nodes at the given level
  * 
  * @param node
  *            first call with root node
  * @param level
  *            first call with the level for which we need to print
  */
 private void printNodesAtGivenLevel(Node<E> node, int level) {
  if (node == null)
   return;

  if (level == 1) {
   System.out.print(node.data + " ");
   return;
  }

  printNodesAtGivenLevel(node.left, level - 1);
  printNodesAtGivenLevel(node.right, level - 1);

 }

 /**
  * Prints level order traversal using Queue
  */
 public void levelOrderQueue() {
  if (root == null) {
   return;
  }

  Queue<Node<E>> queue = new LinkedList<>();
  queue.add(root);

  while (!queue.isEmpty()) {
   Node<E> node = queue.poll();
   System.out.print(node.data + " ");

   if (node.left != null) {
    queue.add(node.left);
   }
   if (node.right != null) {
    queue.add(node.right);
   }
  }
 }

 /**
  * Method to test the tree construction
  */
 public static void main(String[] args) {
  BinaryTree<Integer> bt = new BinaryTree<Integer>();
  bt.root = new Node<>(1);
  bt.root.left = new Node<>(2);
  bt.root.right = new Node<>(3);
  bt.root.left.left = new Node<>(4);
  bt.root.right.left = new Node<>(5);
  bt.root.right.right = new Node<>(6);

  System.out.println(" height =" + bt.getHeight(bt.root));
  bt.levelOrderIteratively();
  System.out.println();
  bt.levelOrderQueue();
 }
}

--
keep coding !!!