Friday, February 20, 2015

Convert character to ASCII value in Java

In this post, I will discuss an incredibly popular but wrong practice: converting characters into their ASCII values in Java with a cast. Can you spot any problem with the code below?


public class CharToAscii {
    public static void main(String[] args) {
        char aAsChar = 'A';
        System.out.println((int) aAsChar); // prints ASCII value
    }
}
The output on console:
65   

The program runs properly and prints the expected output. So what's the issue? The program works fine as long as you can guarantee that the input character is an English letter or some well-known special character.

The issue is with the message it conveys: that each character is mapped to a unique integer representation (which is the ASCII number for English letters and some common characters like @, - etc.). It gives the impression that any character can be typecast to its ASCII value and vice versa. There is no such one-to-one mapping between the two; the above code works as expected, but it's WRONG. A char in Java takes 2 bytes (16 bits), so it can hold 65,536 distinct values (0 to 65535, i.e. 2^16), whereas ASCII is restricted to 128. There is a huge list of characters which don't have an ASCII representation at all, and some Unicode characters don't even fit in a single char and are stored as a pair of chars. So the above approach is definitely misleading.
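To make the point concrete, here is a minimal sketch of what the cast returns once the input leaves the ASCII range. The class name CharRangeDemo and the helper isAscii are made up for illustration; they are not part of the original program.

```java
class CharRangeDemo {
    // Hypothetical helper: true only when the char is in the 7-bit ASCII range (0..127).
    public static boolean isAscii(char c) {
        return c <= 127;
    }

    public static void main(String[] args) {
        char latinA = 'A';      // U+0041, inside ASCII
        char euro = '\u20AC';   // U+20AC EURO SIGN, outside ASCII

        System.out.println((int) latinA + " ascii=" + isAscii(latinA)); // 65 ascii=true
        System.out.println((int) euro + " ascii=" + isAscii(euro));     // 8364 ascii=false

        // U+1F600 does not fit in one char; Java stores it as two chars (a surrogate pair).
        String emoji = new String(Character.toChars(0x1F600));
        System.out.println(emoji.length());                             // 2
        System.out.println(emoji.codePointCount(0, emoji.length()));    // 1
    }
}
```

The cast still compiles and prints a number for the euro sign, but that number (8364) is not an ASCII value, which is exactly why the cast is misleading.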

So what is the proper way to convert the character 'A' to its ASCII value? The proper way is to use its Unicode code point, which for ASCII characters is numerically equal to the ASCII value, because Unicode is a superset of ASCII. The English letter 'A' in Unicode is U+0041, or 65 in decimal. So the approach below should be preferred to convert characters into their ASCII value or their encoded integer value.

      int ascii = String.valueOf('A').codePointAt(0);
      // ascii is 65
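Building on the line above, a code point is only an ASCII value when it falls in the range 0 to 127, so it can help to make that check explicit. The following is a rough sketch under that assumption; the class AsciiValue and the method asciiValueOf are hypothetical names, and returning an OptionalInt (Java 8+) is just one possible design.

```java
import java.util.OptionalInt;

class AsciiValue {
    // Hypothetical helper: returns the ASCII value of the first code point,
    // or an empty result when the string is null/empty or the code point is above 127.
    public static OptionalInt asciiValueOf(String s) {
        if (s == null || s.isEmpty()) {
            return OptionalInt.empty();
        }
        int codePoint = s.codePointAt(0);
        return codePoint <= 127 ? OptionalInt.of(codePoint) : OptionalInt.empty();
    }

    public static void main(String[] args) {
        System.out.println(asciiValueOf("A"));      // OptionalInt[65]
        System.out.println(asciiValueOf("\u20AC")); // OptionalInt.empty
    }
}
```

Returning an empty result instead of a number forces the caller to decide what to do with characters that have no ASCII representation, rather than silently treating any integer as ASCII.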

If you are interested in the evolution of character encoding systems and want to know it in more detail, I would strongly recommend this article from Joel on Software.

--
keep coding !!!
