Saturday, February 28, 2015

Understanding JAVA EE Interceptors

Interceptors are used to implement orthogonal or cross-cutting concerns such as logging, auditing, profiling etc in Java EE applications. In this aspect, it is similar to aspect-oriented programming (AOP).

In Java EE 5, Interceptors were part of EBJs, but in later versions, Interceptors have evolved into a new specification of its own. Interceptors were split into an individual spec in Java EE 7 (Interceptors 1.2) and are part of Context and Dependency Injection(CDI) specification.  They can be applied to any managed classes/beans like EJBs, Servlets, SOAP and RESTful web services. Managed Beans are container-managed objects (unlike Java Beans or POJOs which run inside JVM).  In Java EE 7, Dependency Injection (JSR 330) and  CDI (JSR 299) are merged. You can refer to managed beans as CDI bean as well. These managed beans are injectable and interceptable.

In this post, I will be using terms CDI bean and managed bean interchangeably.

Interceptors

Interceptor is a class whose methods are invoked when methods on a target class are invoked. And this invocation of interceptors methods is performed by the container. Obviously, this would be possible if the target class and the interceptors both are managed by the container.  This is possible because beans managed by containers (servlet or EJB) provides the ability to intercept method invocation through interceptors. 


Interceptors are powerful means to decouple technical concerns from business logic. And it uses strongly typed annotations to achieve it instead of String-based identifiers. This way usage of XML descriptors is minimized. Interceptors use a mandatory deployment descriptor, bean.xml. This is required so that CDI is able to discover the bean from the classpath.

Below are interceptor metadata annotations (from javax.interceptor ):
  • @AroundConstruct associates with the constructor of the target class
  • @AroundInvoke associates with a business method and gets called while entering and exiting
  • @AroundTimeout associates with timeout methods
  • @PostConstruct & @PreDestroy on corresponding lifecycle events of the target class

Using Interceptors

Interceptors, intercept a particular target class so they have the same life cycle as that of the target class. Target class can have any number of associated interceptors.

Intercepting a REST Service

Interceptors can intercept any managed bean; so it can intercept servlet, REST entry point, EJB etc. I will be showing an example of intercepting a REST service.

import java.lang.annotation.ElementType;
import java.lang.annotation.Inherited;
import java.lang.annotation.Retention;
import java.lang.annotation.RetentionPolicy;
import java.lang.annotation.Target;
import javax.interceptor.InterceptorBinding;

/**
 * Interface for logging/auditing JAX-RS requests
 *
 */
@Inherited
@InterceptorBinding
@Retention(RetentionPolicy.RUNTIME)
@Target({ElementType.METHOD, ElementType.TYPE})
public @interface Auditor {
}



import javax.inject.Inject;
import javax.interceptor.AroundInvoke;
import javax.interceptor.Interceptor;
import javax.interceptor.InvocationContext;
/**
 * CDI/Managed bean which intecepts REST API
 * Add @Interceptors(RestAuditor.class) to the target class/method
 *
 */
@Interceptor
@Auditor
public class RestAuditor {
 @AroundInvoke
 public Object intercept(InvocationContext context) throws Exception {
  //context.getMethod().getName())
               // audit/log the request
  return context.proceed();
  //control comes back after target method execution 
 }
}


<beans xmlns="http://java.sun.com/xml/ns/javaee"
   xmlns:xsi="http://www.w3.org/2001/XMLSchema-instance"
   xsi:schemaLocation="
      http://java.sun.com/xml/ns/javaee 
      http://java.sun.com/xml/ns/javaee/beans_1_0.xsd">
   <interceptors>
        <class>RestAuditor</class>  <!--fully qualified class name -->
    </interceptors>
</beans>

So interceptor definition and the declaration of the interceptor in bean.xml completes the interceptor part. Now next step is to use above defined interceptor on a REST service. Please note that the intercept method will be called before actually calling the REST method and then again once REST method is complete, the control comes back. So the request as well response both can be logged in the intercept(..) method. Using the interceptor is quite trivial, we just need to annotate the REST class as shown below :

@Path("/sample")
@ApplicationScoped
@Interceptors(RestAuditor.class)
public class SampleRestService implements Serializable {  
 @GET
 @Produces(MediaType.APPLICATION_JSON)
 public boolean getLogs() {
  boolean resp = true;
  return resp;
 }
}

That's it!

Important Points

  • If the application contains several jar files and you want to enable CDI across the application then you need to have beans.xml in each jar. Only then CDI will trigger bean discovery for each jar. 
  • Interceptors can be applied to method as well as class level. If you want to intercept a specific method then put the annotation at method level only (NOT at class level as shown above).

Monday, February 23, 2015

TRIE data structure

TRIE, another Search Tree!

Trie is an infix of the word reTRIEval, and pronounced as TREE but to distinguish it from the tree some people also pronounce it as TRY. As the name suggests it is an efficient retrieval or searching data structure. 

The root node represents an empty string (""), if you go down one level it represents all single character words. Similarly, at level 10, trie represents words or prefixes of 10 letters/characters. It is a search tree where each vertex or node represents a single word or a prefix (but each node actually stores a single letter or character). 


Let's take a sample dictionary of 6 words.
Dictionary, D = {a, an, out, the, them, their}

You can use Hashing or Binary search tree to implement a dictionary,  but here we will use trie to implement it.

Below are trie representations of the above dictionary, D. Both left as well as right trie represents the same dictionary (two different representations). If you want to understand the trie the right one is more suitable and left one represents trie from the implementation point of view. Leaf nodes (colored in blue) represents the last node of a word so 'e' of the word 'the' will be the leaf node. The leaf node signals end of a word, but it can get extended further (like 'a' and 'an'). So every node which is the last character of the word will be the leaf node. The implementation can be achieved by a boolean flag.

Two TRIE representation of the same dictionary

Possible operations:
addWord(String): adds a single word to the trie
search(String): search if the given word or prefix exists in the trie
countWords(String): number of words that match in trie with the given word
countPrefixes(String): number of words that have given the prefix

Implementation

To make it simple, let's consider only English alphabets (a-z); and also make case insensitive. This means each node can have maximum of 26 children. So it's just a generalization of Binary Search Tree.

Java implementation this post, here

Trie implementation provides operations like add a word and find a word. If you understand the fundamentals, implementing other methods is not going to be difficult. 

Analyzing Performance

Cost of looking up a word or prefix in a vocabulary/dictionary implemented using trie depends only on the number of characters in the word/prefix. It doesn't depend on the size of the dictionary. This is an important distinction from the HashMap/HashTable based dictionary implementations.  As shown in the above diagram the cost of finding the word 'the' requires just 3 steps. 

Runtime complexity of lookup = O(length of word)
String comparison is considered as a single operation, so above can be generalized as O(1).

Trie implementation requires more memory than an array or tree. This is because at each level, the node takes space for 26 characters but not all of them will be used. But the space overhead gets compensated with the constant time search performance. 


References:
keep coding !!!

TRIE implementation in Java

I have covered the fundamentals of trie in this post.

/**
 * TRIE implementation in Java 
 * Tries are general search tree where every node
 * can have some pre-defined number of children. Length of children is 26. Max
 * depth of trie is equal to max length of the word in the dictionary
 *
 * Constraints: 1. assume that all words are lowercase (uppercase letters are
 * converted to lowercase) 2. accept only a-z letters—no punctuation, no
 * hyphens, no accents, etc
 * 
 * Main Operations supported: 1. addword(..) : add a word to the dictionary 2.
 * search(..) : check if the given word exists in dictionary
 *
 * @author Siddheshwar
 */
public class Trie {
 private Node root;

 public Trie() {
  root = new Node();
 }

 /**
  * adds a dictionary word to the vocabolary or dictionary
  * 
  * @param word
  */
 public void addWord(String word) {
  if (word == null || !word.matches("[a-zA-Z]+")) { // or matches ("\\w+")
   return;
  }

  Node r = root;
  int index;
  for (char c : word.toLowerCase().toCharArray()) {
   index = this.getIndex(c);

   if (!r.hasChildAt(index)) {
    r = r.addChild(c);
   } else {
    r = r.getChildAt(index);
   }
  }
  r.setAsLeaf();
 }

 /**
  * Get index where the character will get mapped in the array 'a'/'A' will
  * be mapped to 0th, 'z'/'Z' will get mapped to 25th index
  * 
  * @param ch
  * @return index
  */
 private int getIndex(char ch) {
  // return (int)(ch - 'a');
  return String.valueOf(ch).codePointAt(0)
    - String.valueOf('a').codePointAt(0);
 }

 /**
  * Is word present in the trie
  * 
  * @param word
  * @return
  */
 public boolean contains(String word) {
  if (word == null)
   return true;
  Node r = root;
  for (char c : word.toCharArray()) {
   r = r.getChildAt(this.getIndex(c));
   if (r == null) {
    return false;
   }
  }
  return true;
 }

 /**
  * Returns the node (or null if word is not found)
  * 
  * @param word
  * @return Node
  */
 public Node getNodeEndingWithWord(String word) {
  if (word == null)
   return null;
  Node r = root;
  for (char c : word.toCharArray()) {
   r = r.getChildAt(this.getIndex(c));
   if (r == null) {
    return null;
   }
  }
  return r;
 }
}

import java.util.Arrays;
import java.util.List;

/**
 * Vertex or Node of a TRIE tree. Supports only alphabets a-z (ignores case) 'a'
 * occupies 0th position and 'z', 25th in Node array(i.e. children)
 */
public class Node {
 private Node[] childen;
 private boolean leaf;
 private char value;
 static final char NULL_CHAR = '\0';

 public Node() {
  childen = new Node[26];
  leaf = false;
  value = NULL_CHAR;
 }

 public Node(char c) {
  childen = new Node[26];
  leaf = false;
  value = c;
 }

 public List<Node> getChilden() {
  return Arrays.asList(childen);
 }

 public boolean hasChildAt(int index) {
  return this.childen[index] == null ? false : true;
 }

 public boolean isLeaf() {
  return leaf;
 }

 public char getValue() {
  return value;
 }

 public Node getChildAt(int index) {
  return this.childen[index];
 }

 public void setAsLeaf() {
  this.leaf = true;
 }

 public boolean isCharValid() {
  return this.value != NULL_CHAR ? true : false;
 }

 /**
  * add a child node to the current node
  * 
  * @param ch
  * @return Node
  */
 public Node addChild(char ch) {
  Node childNode = new Node(ch);
  // int index = (int)(ch - 'a'); prefer below approach
  int index = String.valueOf(ch).codePointAt(0)
    - String.valueOf('a').codePointAt(0);
  this.childen[index] = childNode;
  return childNode;
 }

 /**
  * Get count of valid children; upper limit is 26.
  * 
  * @return count
  */
 public int getChildrenCount() {
  int count = 0;
  for (int i = 0; i < 26; i++) {
   if (this.hasChildAt(i)) {
    count++;
   }
  }
  return count;
 }

 /**
  * Returns comma separated list of children for the node
  * 
  * @return
  */
 public String getChildren() {
  StringBuilder output = new StringBuilder();
  for (int i = 0; i < 26; i++) {
   if (this.hasChildAt(i)) {
    output.append((char) (i + 65)).append(",");
   }
  }
  return output.substring(0, output.length() - 1);
 }
}

Friday, February 20, 2015

Convert character to ASCII value in Java

This post, I will be discussing one of the incredibly popular wrong practice of converting characters into their ASCII value in Java. Can you spot any problem with below?


public class CharToAscii {
 public static void main(String[] args) {
         char aAsChar = 'A';
         System.out.println((int)aAsChar); //prints ASCII value
 }
}
The output on console:
65   

The program runs properly and prints the output as expected. So what's the issue? Above program will not mind as long as you can guarantee that input character is an english alphabet or some well known special character.

The issue is with the message it conveys, it conveys that each character is mapped to a unique integer representation (which is ASCII number for English letters and some common characters like @, - etc). It gives impression that a number can be typecast to character and vice-versa. There is no one-to-one mapping between the two; the above code works as expected but it's WRONG. Char in Java takes 2 bytes (or 16 bits), so it maps 65535 characters (i.e. math.pow(2,16)-1), whereas ASCII is restricted to 128. There is huge list of characters which don't have ASCII representation at all. So definitely above approach is misleading. 

So what should be the proper way to convert character 'A' to its ASCII value. The proper way would be to use Unicode code point which is numerically equivalent to ASCII value. Unicode is the superset of ASCII.  English alphabet 'A' in Unicode is U+0041 or 65 in decimal. So below approach should be preferred to convert characters into ASCII value or their encoded integer value.

      int ascii = String.valueOf('A').codePointAt(0);
      //return 65

If you interested in knowing the evolution of character encoding systems and knowing it in more details, I would strongly recommend this article, from Joelonsoftware.

--
keep coding !!!