Thursday, December 31, 2015

Hibernate write-behind technique

Hibernate implements a technique known as write-behind to minimize the impact of network latency and the duration of database locks. It lets Hibernate issue DML statements as late as possible.

What it actually means is: when objects associated with a persistence context are modified (by update/delete), the changes are not propagated to the database immediately.

Benefits of this technique

  1. If you make two updates to an entity in a session, Hibernate doesn't have to issue two SQL updates. It can handle both in the same UPDATE statement.
  2. Hibernate can make use of JDBC batch updates when it executes multiple INSERT, UPDATE or DELETE statements.
  3. Fewer flushes are needed. Flushing (synchronizing the persistence context with the database) is costly in itself: all dirty objects in the persistence context have to be detected at flush time, so when the persistence context is large, repeated dirty checking carries a real performance penalty.
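The first benefit can be pictured with a toy model. This is only a sketch of the idea, not how Hibernate is implemented: the class WriteBehindSession, its methods and the student table are hypothetical names made up for illustration. Changes are queued per row in memory, and SQL is only produced when flush() is called, so two updates to the same row collapse into one statement.

```java
import java.util.ArrayList;
import java.util.LinkedHashMap;
import java.util.List;
import java.util.Map;

/** Toy write-behind buffer; every name here is hypothetical, not Hibernate API. */
public class WriteBehindSession {
    // Pending column changes per row id, kept in first-touched order
    private final Map<Long, Map<String, Object>> pending = new LinkedHashMap<>();
    // Stands in for the database connection: records every statement "sent"
    private final List<String> executedSql = new ArrayList<>();

    /** Records a change in memory only; nothing is sent to the database yet. */
    public void update(long id, String column, Object value) {
        pending.computeIfAbsent(id, k -> new LinkedHashMap<>()).put(column, value);
    }

    /** Sends one UPDATE per modified row, then clears the queue. */
    public void flush() {
        for (Map.Entry<Long, Map<String, Object>> row : pending.entrySet()) {
            StringBuilder set = new StringBuilder();
            for (Map.Entry<String, Object> col : row.getValue().entrySet()) {
                if (set.length() > 0) {
                    set.append(", ");
                }
                set.append(col.getKey()).append(" = '").append(col.getValue()).append("'");
            }
            executedSql.add("UPDATE student SET " + set + " WHERE id = " + row.getKey());
        }
        pending.clear();
    }

    public List<String> executedSql() {
        return executedSql;
    }

    public static void main(String[] args) {
        WriteBehindSession session = new WriteBehindSession();
        session.update(1L, "name", "Ann");   // queued
        session.update(1L, "grade", "A");    // queued, same row
        System.out.println("Statements before flush: " + session.executedSql().size());
        session.flush();                     // both changes go out in one UPDATE
        session.executedSql().forEach(System.out::println);
    }
}
```

A real provider also has to order inserts, updates and deletes and bind parameters properly; the string building above is only there to make the coalescing visible.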

Wednesday, December 30, 2015

Entity States

Java applications contain a mix of transient and persistent objects. A transient (or normal) object has a limited lifetime, bounded by the life of the process that instantiated it. A persistent object (or Entity) can be stored in a database or on disk and re-created in the future.

The Java Persistence API (JPA) was introduced as part of the EJB 3.0 specification (EJB 3.0 itself is part of Java Enterprise Edition). JPA refers to persisted or persistable classes as Entities. Technically, an entity is just a POJO class which maps to a database table/view, and an instance of an entity represents a single row of that table. This post covers the different states an instance of an Entity can be in. In the context of JPA, object and entity refer to the same thing.

Entity States 

  1. New/Transient: An object instantiated using the new operator is in the transient state, as it's not yet associated with any persistence context. It's not yet mapped to a record/row in the database. It's just another object; in fact, the JPA specification doesn't give any name to this state.
  2. Managed/Persistent: The object has a database identity. This means the instance has a valid primary key value to uniquely identify it in the database (or more specifically, in a table). These instances are associated with a persistence context.
  3. Detached: As long as the persistence context is active, the entity is in the managed state. Once the transaction (or unit of work) completes, the persistence context is closed, but the application still holds a reference to the entity. Such entities are in the detached state.
  4. Removed: During the transaction we can delete or remove an entity. It's still associated with the persistence context, but it gets scheduled for deletion (at the end of the transaction). A removed object shouldn't be reused, and any reference holding it should be discarded.
Image Reference : http://openjpa.apache.org

Let's Code

Let's code to explain the above states:

New

Student student = new EngineeringStudent();
student.setName(..);
student.setGrade(..);
student.setXX(..);

Managed

entityManager.persist(student);    //JPA
session.save(student);                  //Hibernate 

Removed
session.delete(student);                //Hibernate
entityManager.remove(student);     //JPA

Detached
Session session = sessionFactory.openSession();
Transaction tx = session.beginTransaction();

Student student = fetchStudentWithIdFromDb(id);
student.setGrade(..);
tx.commit();
session.close();

student.setXX(..);   //student is detached here

Detaching Inside a transaction
session.evict(student);  //detaches one object
session.clear(); //detaches all objects

Monday, December 28, 2015

Persistence Context

Java Persistence API uses javax.persistence.EntityManager to manage Entity instances and their life cycle (Hibernate does it using org.hibernate.Session). Each EntityManager instance is associated with a Persistence Context.

With a transaction-scoped Persistence Context, EntityManager begins a new Persistence Context with each transaction, and once the transaction ends (with commit or rollback) the Context also ends. Within the transaction, entities retrieved from the DB are managed entities (instances which get saved become managed as well). When the transaction completes, all entities lose their association with the Context and become detached.

So, the Persistence Context is a cache of managed entity instances attached to your unit of work (or transaction). We don't need to do anything to enable it; it's always there (and we can't turn it off)! This Context has the scope of a unit of work and is processed in a single thread (so it doesn't have issues like lock management or concurrent access).




How does it help

  • If we ask to load an entity using a primary key, EntityManager/Session checks the Context first. If the entity is found there, there is no DB hit. For the application this is a repeatable read, and we get it absolutely free!
  • At most one object (entity) represents any database row, so there is no conflict, and all changes made to that row can be safely written back.
  • Changes made to a managed entity are immediately visible to other (managed) entities in the Context.
  • At the end of the transaction, JPA providers like Hibernate scan the persistence context to find out which entities were modified, and only entities with a modification (a dirty attribute) get propagated to the database. This is known as automatic dirty checking.

Final Note

The Persistence Context holds a copy/snapshot of each persistent object. The snapshot is used by the JPA provider to do dirty checking (detect any modification made to the object). So if you carelessly load a large number of objects, you might run out of memory (OutOfMemoryError). Be mindful of the impact on memory when you load records from the DB (put proper conditions in queries to avoid unnecessary loads).
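To make the identity map and the snapshot idea concrete, here is a toy model. It is a sketch under stated assumptions, not the JPA or Hibernate API: ToyPersistenceContext, Student, find() and dirtyIds() are hypothetical names, the "database" is just a loader function, and the snapshot only tracks a single field.

```java
import java.util.ArrayList;
import java.util.HashMap;
import java.util.List;
import java.util.Map;
import java.util.Objects;
import java.util.function.LongFunction;

/** Toy persistence context; hypothetical names, not the JPA or Hibernate API. */
public class ToyPersistenceContext {

    /** Minimal mutable entity with a single tracked field. */
    public static class Student {
        public final long id;
        public String grade;

        public Student(long id, String grade) {
            this.id = id;
            this.grade = grade;
        }
    }

    private final Map<Long, Student> managed = new HashMap<>();   // identity map
    private final Map<Long, String> snapshots = new HashMap<>();  // state as loaded
    private int dbHits = 0;

    /** Returns the managed instance if present; otherwise loads it and takes a snapshot. */
    public Student find(long id, LongFunction<Student> loader) {
        Student cached = managed.get(id);
        if (cached != null) {
            return cached;           // no DB hit: repeatable read within the context
        }
        dbHits++;
        Student loaded = loader.apply(id);
        managed.put(id, loaded);
        snapshots.put(id, loaded.grade);
        return loaded;
    }

    /** Dirty check: ids of managed entities whose state differs from their snapshot. */
    public List<Long> dirtyIds() {
        List<Long> dirty = new ArrayList<>();
        for (Student s : managed.values()) {
            if (!Objects.equals(s.grade, snapshots.get(s.id))) {
                dirty.add(s.id);
            }
        }
        return dirty;
    }

    public int dbHits() {
        return dbHits;
    }

    public static void main(String[] args) {
        ToyPersistenceContext ctx = new ToyPersistenceContext();
        LongFunction<Student> db = id -> new Student(id, "B"); // stands in for a SELECT
        Student first = ctx.find(7L, db);
        Student second = ctx.find(7L, db);
        System.out.println("Same instance: " + (first == second));
        System.out.println("DB hits: " + ctx.dbHits());
        first.grade = "A";                                      // modify a managed entity
        System.out.println("Dirty ids: " + ctx.dirtyIds());
    }
}
```

The snapshots map is also why memory grows with every loaded entity: each managed instance carries a second copy of its loaded state.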


References
https://docs.jboss.org/hibernate/orm/4.0/devguide/en-US/html/ch03.html

Sunday, December 20, 2015

Find if a word exists in a 2D grid

This problem is also known as the Word Search problem; I found it on LeetCode.

Problem:

Given a 2D board and a word, find if the word exists in the grid. The word can be constructed from letters of sequentially adjacent cells, where adjacent cells are those horizontally or vertically neighboring. The same letter cell may not be used more than once.

[
  ['A','B','C','E'],
  ['S','F','C','S'],
  ['A','D','E','E']
]

word : ABCCED -> true
word : ABCB   -> false


Approach:

A glance at the 2D array shows that at a given position there can be more than one option for the next character. If a chosen path fails to find the word, the search needs to backtrack and try the other options. For example, at S (row 1, col 3, counting from zero) we can continue with either E (up or down). But if we are looking for SEE, choosing the upper E will fail, because it has no further E next to it.



We can treat the board as a graph and apply depth-first traversal to check whether the input word exists in the 2D array. We start with the first character of the word and keep checking whether the next character of the word is one of the neighbors of the current character. Also note that a neighbor which is already part of the current path must be skipped.

Note: Characters can repeat in the 2D array, so each entry is abstracted by a Node class which keeps the row and column along with the character.


Implementation

Java implementation:


package backtracking;

import java.util.ArrayList;
import java.util.HashSet;
import java.util.List;
import java.util.Objects;
import java.util.Set;

/**
 * Searches for a word in a 2-D character array. The same character can be present at two different locations in the array.
 * Uses DFS to search for the input word in the board.
 */
public class WordSearch {
 private char[][] board;
 private int ROW, COL;
 
 public WordSearch(char[][] board) {
  super();
  this.board = board;
  this.ROW = board.length;
  this.COL = board[0].length;
 }

 /**
  * DFS search.
  * 
  * @param yetToBeSearchedInputStr
  *            string which is yet to get searched in the 2-D array
  * @param currPos
  *            abstracts the character and its row, col in the 2-D board
  * @param alreadyTravelled
  *            stores nodes which are already covered/travelled
  * @return true if the word was found in the board; false otherwise
  */
 private boolean search(String yetToBeSearchedInputStr, Node currPos,
   Set<Node> alreadyTravelled) {
  if (Objects.isNull(yetToBeSearchedInputStr)
    || yetToBeSearchedInputStr.length() == 0) {
   return true;
  }

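  alreadyTravelled.add(currPos);
  List<Node> neighbors = getNeighbors(currPos);
  for (Node node : neighbors) {
   // Keep trying the remaining neighbors if the path through this one fails
   if (!alreadyTravelled.contains(node)
     && node.ch == yetToBeSearchedInputStr.charAt(0)
     && search(yetToBeSearchedInputStr.substring(1), node,
       alreadyTravelled)) {
    return true;
   }
  }
  // Backtrack: release this cell so that other paths can use it
  alreadyTravelled.remove(currPos);
  return false;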
 }

 /**
  * Returns all valid neighbors (left, right, up, down) of the given node
  * @param currPos node for which all neighbors needs to be found
  * @return list of neighbors
  */
 private List<Node> getNeighbors(Node currPos) {
  int row = currPos.row;
  int col = currPos.col;

  List<Node> neighbors = new ArrayList<>();
  if (col - 1 >= 0) {
   neighbors.add(new Node(board[row][col - 1], row, col - 1));
  }
  if (col + 1 < COL) {
   neighbors.add(new Node(board[row][col + 1], row, col + 1));
  }
  if (row - 1 >= 0) {
   neighbors.add(new Node(board[row - 1][col], row - 1, col));
  }
  if (row + 1 < ROW) {
   neighbors.add(new Node(board[row + 1][col], row + 1, col));
  }
  return neighbors;
 }

 @Override
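 public String toString() {
  // Build the text instead of printing from inside toString()
  StringBuilder sb = new StringBuilder();
  for (int i = 0; i < ROW; i++) {
   sb.append('\n');
   for (int j = 0; j < COL; j++) {
    sb.append(board[i][j]).append(' ');
   }
  }
  return sb.append("\n ROW=").append(ROW).append(", COL=").append(COL).toString();
 }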

 /**
  * Abstracts the character and its position (row and col) in the board.
  * This is required as same character can be present at two different
  * locations.
  */
 private class Node {
  char ch;
  int row;
  int col;

  public Node(int r, int c) {
   this.row = r;
   this.col = c;
  }

  public Node(char ch, int r, int c) {
   this(r, c);
   this.ch = ch;
  }
  
  @Override
  public int hashCode() {
   final int prime = 31;
   int result = 1;
   result = prime * result + getOuterType().hashCode();
   result = prime * result + ch;
   result = prime * result + col;
   result = prime * result + row;
   return result;
  }

  @Override
  public boolean equals(Object obj) {
   if (this == obj)
    return true;
   if (obj == null)
    return false;
   if (getClass() != obj.getClass())
    return false;
   Node other = (Node) obj;
   if (!getOuterType().equals(other.getOuterType()))
    return false;
   if (ch != other.ch)
    return false;
   if (col != other.col)
    return false;
   if (row != other.row)
    return false;
   return true;
  }

  @Override
  public String toString() {
   return "Node [ch=" + ch + ", row=" + row + ", col=" + col + "]";
  }

  private WordSearch getOuterType() {
   return WordSearch.this;
  }
 }

 /**
  * Test method
  */
 public static void main(String[] args) {
  char[][] board = { { 'A', 'B', 'C', 'E' }, { 'S', 'F', 'C', 'S' },
    { 'A', 'D', 'E', 'E' } };
  WordSearch wordSearch = new WordSearch(board);
  System.out.println(wordSearch);
  Set<Node> alreadyTravelled = new HashSet<>();
  String inputString = "ABCCED";

  // A complete solution would start the search from every position that holds the first character.
  // The logic to find the first character in the board is not given here, so this test starts from the 'A' at (0, 0).
  boolean f = wordSearch.search(inputString.substring(1),
    wordSearch.new Node('A', 0, 0), alreadyTravelled);
  System.out.println("Input String, "+ inputString +" exists in the 2-D character array? " + f);
 }
}



Note:
  • search method performs the DFS search until the word becomes empty or the search fails. It takes the substring of the word which is yet to be verified.
  • If you understand DFS properly, then this approach is quite straightforward. 
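A compact variant of the same idea, shown as a sketch rather than as a replacement for the class above: it tries every cell as a starting point and uses a boolean grid instead of a set of nodes to track the current path. The class name GridWordSearch and the method exists are names chosen for this example. A cell is un-marked when the path through it fails, which is what lets the search backtrack and try another branch.

```java
/** Backtracking word search over a char grid; the names are chosen for this sketch. */
public class GridWordSearch {

    /** Returns true if word can be traced through adjacent cells, using each cell at most once. */
    public static boolean exists(char[][] board, String word) {
        if (word == null || word.isEmpty()) {
            return true;
        }
        if (board == null || board.length == 0 || board[0].length == 0) {
            return false;
        }
        boolean[][] visited = new boolean[board.length][board[0].length];
        // Any cell may hold the first character, so try them all
        for (int r = 0; r < board.length; r++) {
            for (int c = 0; c < board[0].length; c++) {
                if (dfs(board, word, 0, r, c, visited)) {
                    return true;
                }
            }
        }
        return false;
    }

    /** Checks whether word[idx..] can be matched starting at cell (r, c). */
    private static boolean dfs(char[][] board, String word, int idx,
                               int r, int c, boolean[][] visited) {
        if (r < 0 || r >= board.length || c < 0 || c >= board[0].length) {
            return false;                       // off the board
        }
        if (visited[r][c] || board[r][c] != word.charAt(idx)) {
            return false;                       // cell already on the path, or wrong letter
        }
        if (idx == word.length() - 1) {
            return true;                        // last character matched
        }
        visited[r][c] = true;
        boolean found = dfs(board, word, idx + 1, r - 1, c, visited)
                || dfs(board, word, idx + 1, r + 1, c, visited)
                || dfs(board, word, idx + 1, r, c - 1, visited)
                || dfs(board, word, idx + 1, r, c + 1, visited);
        visited[r][c] = false;                  // backtrack so other paths can use this cell
        return found;
    }

    public static void main(String[] args) {
        char[][] board = { { 'A', 'B', 'C', 'E' }, { 'S', 'F', 'C', 'S' },
                { 'A', 'D', 'E', 'E' } };
        System.out.println(exists(board, "ABCCED")); // true
        System.out.println(exists(board, "SEE"));    // true
        System.out.println(exists(board, "ABCB"));   // false
    }
}
```

Passing an index into the word avoids creating a new substring at every step, which is the main difference in design from the substring-based search method.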

---
keep coding !!!

Wednesday, December 9, 2015

Transform collection of one object to another using Java 8 Streams

A few days back, I wanted to convert a list of value objects to another list of ORM/entity objects. It's a very commonly occurring scenario; oh BOY, the Streams API works like a charm!

If you are working on an application with a layered or distributed architecture, this scenario is quite common. When data moves from the client/UI to be persisted, it gets transformed multiple times. Let's see how it can be done effectively in Java 8 using the Streams API.



public class City {
 private String name;
 private String zipCode;
 
 //other attribute, setter/getters
}

public class CityOrm {
 private String name;
 private String zipCode;
 private long id;
 
 public CityOrm(City city){
  //construct this using City instance
 }
 //other attribute, setter/getters
}


Please note that CityOrm has a constructor. So transforming or converting a City instance to CityOrm instance is not an issue.  

Problem is:

You have a collection of City; you need to convert it into a collection of CityOrm. 


Before Java 8

List<City> cities = getItFromSomeWhere();
List<CityOrm> citiesOrm = new ArrayList<>();
for (City city : cities) {
 citiesOrm.add(new CityOrm(city));
}


Java 8 / Streams API

List<City> cities = getItFromSomeWhere();
List<CityOrm> citiesOrm = cities.stream()
        .map(city -> new CityOrm(city))
        .collect(Collectors.toList());


The stream() method converts the list of City instances into a stream; map() then goes through each city and converts it to another type (i.e. CityOrm) by calling the appropriate constructor. Finally, collect() converts the stream back to a list.
So the Streams API definitely saved a couple of lines (excuse the spread due to formatting).
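For anyone who wants to run it, here is a self-contained sketch of the same transformation. The wrapper class CityMappingDemo, the helper toOrm and the sample city values are made up for this example, and the City and CityOrm classes are trimmed down to two fields. It uses the constructor reference CityOrm::new, which is equivalent to the lambda city -> new CityOrm(city).

```java
import java.util.Arrays;
import java.util.List;
import java.util.stream.Collectors;

/** Runnable sketch of the City to CityOrm conversion; sample data is made up. */
public class CityMappingDemo {

    /** Value object, trimmed to the two fields shown in the post. */
    public static class City {
        public final String name;
        public final String zipCode;

        public City(String name, String zipCode) {
            this.name = name;
            this.zipCode = zipCode;
        }
    }

    /** Entity-side object built from a City. */
    public static class CityOrm {
        public final String name;
        public final String zipCode;

        public CityOrm(City city) {
            this.name = city.name;
            this.zipCode = city.zipCode;
        }
    }

    /** Maps every City to a CityOrm and collects the results into a new list. */
    public static List<CityOrm> toOrm(List<City> cities) {
        return cities.stream()
                .map(CityOrm::new)              // same as city -> new CityOrm(city)
                .collect(Collectors.toList());
    }

    public static void main(String[] args) {
        List<City> cities = Arrays.asList(
                new City("Springfield", "00001"),   // made-up sample values
                new City("Shelbyville", "00002"));
        for (CityOrm orm : toOrm(cities)) {
            System.out.println(orm.name + " " + orm.zipCode);
        }
    }
}
```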