Tuesday, December 31, 2013

Java Bytecode or class file

In this post, I will focus on the content of a Java class file, which holds the bytecode of the compiled class. The Java Virtual Machine executes a program by reading the stream of bytecode from the class file. As a Java programmer you don't need to worry about the internal structure and format of bytecode, but it's worth knowing how it is organized under the hood.

Reading Bytecodes

Before reading bytecodes, let's generate one first. I am taking a simple HelloWorld example for this. 

public class HelloWorld {
 public static void main(String[] args) {
  System.out.println("Hello World!");
 }
}

Save the class above and compile it, either from your editor or manually with javac from the command prompt. Then locate the generated HelloWorld.class file and open it in a text editor.


Some of the text looks familiar, but overall it doesn't make sense. In fact, it doesn't look like the bytecode of HelloWorld.java at all. It would make more sense as numbers (binary or hex). Reading the class file with Java's FileReader API gives the same result: you see neither the stream of bytes nor the instruction mnemonics.

Are we missing something? Yes.
A class file is a binary stream of bytes, not text. So we need to read the file as an array of bytes, as shown in the class below.


import java.io.File;
import java.io.IOException;
import java.io.RandomAccessFile;
// Note: javax.xml.bind is bundled with the JDK only up to Java 10
import javax.xml.bind.DatatypeConverter;

/**
 * Utility to read the contents of class file as byte stream
 * 
 * @author Siddheshwar
 * 
 */
public class ReadClassAsByteStream {

 public static byte[] readFileAsByteArray(String fle) throws IOException {
  RandomAccessFile f = new RandomAccessFile(new File(fle), "r");

  try {
   int length = (int) f.length();
   byte[] data = new byte[length];
   f.readFully(data);
   return data;
  } finally {
   f.close();
  }
 }

 // test method
 public static void main(String[] args) {
  // adjust this path to wherever your compiler wrote HelloWorld.class
  String file = "D://workspace/JavaSample/src/HelloWorld.class";

  try {
   byte[] b = readFileAsByteArray(file);
   // convert byte array to Hex
   System.out.println(DatatypeConverter.printHexBinary(b));
  } catch (IOException e) {
   e.printStackTrace();
  }
 }
}

Output:
CAFEBABE00000033001D0A0006000F09001000110800120A001300140700150700160100063C696E69743E010003282956010004436F646501000F4C696E654E756D6265725461626C650100046D61696E010016285B4C6A6176612F6C616E672F537472696E673B295601000A536F7572636546696C6501000F48656C6C6F576F726C642E6A6176610C000700080700170C0018001901000C48656C6C6F20576F726C642107001A0C001B001C01000A48656C6C6F576F726C640100106A6176612F6C616E672F4F626A6563740100106A6176612F6C616E672F53797374656D0100036F75740100154C6A6176612F696F2F5072696E7453747265616D3B0100136A6176612F696F2F5072696E7453747265616D0100077072696E746C6E010015284C6A6176612F6C616E672F537472696E673B2956002100050006000000000002000100070008000100090000001D00010001000000052AB70001B100000001000A000000060001000000010009000B000C00010009000000250002000100000009B200021203B60004B100000001000A0000000A000200000003000800040001000D00000002000E

Bingo! Now this looks like what we were looking for.
In the main method, I used the DatatypeConverter API to convert the byte array to hex, which prints the content in a more compact format.
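
On Java 11 and later, javax.xml.bind.DatatypeConverter is no longer bundled with the JDK, so the listing above will not compile there without adding JAXB as a dependency. Here is a minimal sketch of a dependency-free alternative. The class name ClassFileHexDump and the helper toHex are hypothetical names chosen for this illustration, and Files.readAllBytes (Java 7+) is swapped in for RandomAccessFile.

```java
import java.io.IOException;
import java.nio.file.Files;
import java.nio.file.Paths;

// Hypothetical helper for illustration; not part of the original listing
class ClassFileHexDump {
    private static final char[] HEX_DIGITS = "0123456789ABCDEF".toCharArray();

    // Converts each byte into two upper-case hex digits
    static String toHex(byte[] data) {
        StringBuilder sb = new StringBuilder(data.length * 2);
        for (byte b : data) {
            sb.append(HEX_DIGITS[(b >> 4) & 0x0F]); // high nibble
            sb.append(HEX_DIGITS[b & 0x0F]);        // low nibble
        }
        return sb.toString();
    }

    // Usage: java ClassFileHexDump path/to/HelloWorld.class
    public static void main(String[] args) throws IOException {
        byte[] classBytes = Files.readAllBytes(Paths.get(args[0]));
        System.out.println(toHex(classBytes));
    }
}
```

The masking with 0x0F matters because Java bytes are signed: without it, a byte such as 0xCA would produce a negative array index.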

What is Bytecode

Bytecode is a series of instructions for the Java Virtual Machine. Once a class is loaded, the bytecode of its methods is stored in the method area of the JVM. Each instruction consists of a one-byte opcode followed by zero or more operands. The opcode indicates the action to be taken by the JVM. Because an opcode is a single byte, there can be at most 256 of them. This helps keep the class file compact.
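
To make the opcode/operand layout concrete, here is a small sketch that decodes the nine code bytes of main() as they appear in the hex output above (B2 0002 12 03 B6 0004 B1). The class name MainMethodDecoder is hypothetical, and the decoder handles only the six opcodes that occur in HelloWorld; it is an illustration, not a general disassembler.

```java
import java.util.ArrayList;
import java.util.List;

// Hypothetical mini-decoder covering only the opcodes used by HelloWorld
class MainMethodDecoder {
    // Code bytes of main(), copied from the hex dump of HelloWorld.class
    static final int[] MAIN_CODE = {0xB2, 0x00, 0x02, 0x12, 0x03, 0xB6, 0x00, 0x04, 0xB1};

    static List<String> decode(int[] code) {
        List<String> lines = new ArrayList<>();
        int pc = 0; // offset of the current instruction
        while (pc < code.length) {
            int opcode = code[pc];
            switch (opcode) {
                case 0x2A: lines.add(pc + ": aload_0"); pc += 1; break; // no operand
                case 0xB1: lines.add(pc + ": return");  pc += 1; break; // no operand
                case 0x12: // ldc takes a one-byte constant pool index
                    lines.add(pc + ": ldc #" + code[pc + 1]);
                    pc += 2;
                    break;
                case 0xB2: lines.add(pc + ": getstatic #" + index(code, pc));     pc += 3; break;
                case 0xB6: lines.add(pc + ": invokevirtual #" + index(code, pc)); pc += 3; break;
                case 0xB7: lines.add(pc + ": invokespecial #" + index(code, pc)); pc += 3; break;
                default:
                    throw new IllegalArgumentException("opcode not handled in this sketch: " + opcode);
            }
        }
        return lines;
    }

    // Two-byte big-endian constant pool index following the opcode
    private static int index(int[] code, int pc) {
        return (code[pc + 1] << 8) | code[pc + 2];
    }

    public static void main(String[] args) {
        for (String line : decode(MAIN_CODE)) {
            System.out.println(line);
        }
    }
}
```

The offsets it prints (0, 3, 5, 8) follow directly from the operand sizes: getstatic and invokevirtual occupy three bytes each, ldc two, and return one.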

Important Observations on Generated Bytecode

  1. A Java class file is a binary stream of bytes. These bytes are stored sequentially in the class file, without any padding between adjacent items.
  2. The absence of padding keeps the class file compact, so it can be transferred quickly over the network.
  3. Items that occupy more than one byte are split into multiple consecutive bytes in big-endian order (higher bytes first).
  4. Notice that the first four bytes are "CAFEBABE", known as the magic number. The magic number makes it easy to spot files that are not Java class files: if a file doesn't start with this magic number, it's definitely not a Java class file.
  5. The next four bytes of the class file contain the minor and major version numbers, two bytes each, minor first. In the output above they are 0000 and 0033, i.e. minor version 0 and major version 51 (hex 33), which corresponds to Java SE 7.
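
The header layout described in the observations above can be checked with a few lines of code. This is a minimal sketch, assuming the first eight bytes of the class file are available as a byte array; the class name ClassHeaderSketch and the method describe are hypothetical. ByteBuffer reads in big-endian order by default, which matches the class file format.

```java
import java.nio.ByteBuffer;

// Hypothetical helper that reads only the first eight bytes of a class file
class ClassHeaderSketch {
    static String describe(byte[] classBytes) {
        ByteBuffer buf = ByteBuffer.wrap(classBytes); // big-endian by default
        int magic = buf.getInt();                     // bytes 0-3
        int minor = buf.getShort() & 0xFFFF;          // bytes 4-5, read as unsigned
        int major = buf.getShort() & 0xFFFF;          // bytes 6-7, read as unsigned
        if (magic != 0xCAFEBABE) {
            throw new IllegalArgumentException("not a Java class file");
        }
        return String.format("magic=%08X minor=%d major=%d", magic, minor, major);
    }

    public static void main(String[] args) {
        // First eight bytes of the HelloWorld.class dump shown earlier
        byte[] header = {(byte) 0xCA, (byte) 0xFE, (byte) 0xBA, (byte) 0xBE, 0x00, 0x00, 0x00, 0x33};
        System.out.println(describe(header));
    }
}
```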

Mnemonics Representation of Bytecode

The bytecode of the HelloWorld program can also be represented as mnemonics, in a typical assembly-language style. The JDK provides a class file disassembler named javap for this. The javap utility offers multiple options for printing the content of a class file. You can also use the ASM Eclipse plugin to see the disassembled bytecode of a class in Eclipse.

D:\workspace\JavaSample\src>javap -c HelloWorld.class
Compiled from "HelloWorld.java"
public class HelloWorld {
  public HelloWorld();
    Code:
       0: aload_0
       1: invokespecial #1                  // Method java/lang/Object."<init>":()V
       4: return

  public static void main(java.lang.String[]);
    Code:
       0: getstatic     #2                  // Field java/lang/System.out:Ljava/io/PrintStream;
       3: ldc           #3                  // String Hello World!
       5: invokevirtual #4                  // Method java/io/PrintStream.println:(Ljava/lang/String;)V
       8: return
}

Friday, December 20, 2013

Annotation parser in Java

Java SE 5 introduced annotations together with APIs for processing them; this support was further extended in Java 6 and Java 7. Through these APIs, you can find annotations in your code base and then work with them.

I discussed the basics of annotations in the previous post. In this post, I will focus on processing the annotations placed in a class.



Annotation Processor

This post dives deeper into how to process the annotations in a class. Without some logic or tool to process them, annotations embedded in source code are little more than comments. Java provides an API for building such tools through reflection.

Before looking at the annotation processor itself, let's see a sample annotation and its use. Below is the definition of an annotation. Note the meta-annotations applied to it (meta-annotations were discussed in the previous post). InvocationAnnotation will be used to specify how many times a method needs to run.

import java.lang.annotation.ElementType;
import java.lang.annotation.Retention;
import java.lang.annotation.RetentionPolicy;
import java.lang.annotation.Target;

@Retention(RetentionPolicy.RUNTIME)
@Target(ElementType.METHOD)
public @interface InvocationAnnotation {
 int numberOfTimesToInvoke() default 1;
}


Now let's take a class that uses the annotation above, InvocationAnnotation.


public class AnnotationUsage {
 @InvocationAnnotation(numberOfTimesToInvoke = 3)
 public void method1() {
  System.out.println(" method1 invocation ....3 times..");
 }

 @InvocationAnnotation
 public void method2() {
  System.out.println(" method2 invocation...1 time");
 }
}


Now let's turn to the real business of extracting this information from the class above and then performing the task as expected.

import java.lang.reflect.InvocationTargetException;
import java.lang.reflect.Method;

public class ProcessAnnotationUsage {
 public static void main(String[] args) {
  AnnotationUsage at = new AnnotationUsage();
  Method[] methods = at.getClass().getMethods();

  for (Method m : methods) {

   InvocationAnnotation ia = m
     .getAnnotation(InvocationAnnotation.class);

   if (ia != null) {
    int count = ia.numberOfTimesToInvoke();

    for (int i = 0; i < count; i++) {
     try {
      m.invoke(at); // no-argument method, so no argument array is needed
     } catch (IllegalAccessException 
      | IllegalArgumentException
      | InvocationTargetException e) {
      e.printStackTrace();
     }
    }
   }
  }
 }
}

Output:

 method1 invocation ....3 times..
 method1 invocation ....3 times..
 method1 invocation ....3 times..
 method2 invocation...1 time


So the annotation processor above reads the annotation attached to each method through reflection. It then finds out how many times the method needs to be invoked, and finally invokes it that many times in a for loop. The output confirms this.
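
One caveat: the array returned by getMethods() (and getDeclaredMethods()) is not in any particular order, so method2 may be invoked before method1 on some JVMs. Here is a self-contained sketch of the same reflection technique that sorts the methods by name first. All names in it (OrderedInvocationSketch, Repeat, Sample) are hypothetical stand-ins for the classes in this post, and the sample methods record their calls in a list instead of printing, so that the result is easy to inspect.

```java
import java.lang.annotation.ElementType;
import java.lang.annotation.Retention;
import java.lang.annotation.RetentionPolicy;
import java.lang.annotation.Target;
import java.lang.reflect.Method;
import java.util.ArrayList;
import java.util.Arrays;
import java.util.Comparator;
import java.util.List;

class OrderedInvocationSketch {

    // Hypothetical stand-in for InvocationAnnotation
    @Retention(RetentionPolicy.RUNTIME)
    @Target(ElementType.METHOD)
    @interface Repeat {
        int times() default 1;
    }

    // Hypothetical stand-in for AnnotationUsage; records each call
    static class Sample {
        final List<String> calls = new ArrayList<>();

        @Repeat(times = 3)
        public void method1() { calls.add("method1"); }

        @Repeat
        public void method2() { calls.add("method2"); }

        public void notAnnotated() { calls.add("notAnnotated"); }
    }

    static List<String> run() {
        Sample target = new Sample();
        Method[] methods = Sample.class.getDeclaredMethods();
        // Reflection gives no ordering guarantee, so sort by name explicitly
        Arrays.sort(methods, Comparator.comparing(Method::getName));
        try {
            for (Method m : methods) {
                Repeat repeat = m.getAnnotation(Repeat.class);
                if (repeat == null) {
                    continue; // method is not annotated, skip it
                }
                for (int i = 0; i < repeat.times(); i++) {
                    m.invoke(target);
                }
            }
        } catch (ReflectiveOperationException e) {
            throw new IllegalStateException(e);
        }
        return target.calls;
    }

    public static void main(String[] args) {
        System.out.println(run());
    }
}
```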

Java 5 also shipped a standalone Annotation Processing Tool called apt. In JDK 6, annotation processing became part of the compiler (javac) itself.
