Monday, September 8, 2014

Bit Vector

The bit vector is also referred to as bit-array or bit-set. As the name suggests, it's array of bits. Every modern programming language supports primitive or basic data types like byte, integer, short etc which are inherently array of bits only. So, is bit-array different? 

A byte is composed of 8-bits. Literally, byte data type is also a bit-array of 8 bits or 1 byte.  Integer data type of 4 bytes is a bit-array of 32 bits and so on. This is where the similarity stops and life of bit-array begins. 

To understand it further let's take an example. Assume that you want to represent a set of integers in the range of 0 and 999 (both are inclusive). An option which you have as a programmer is; store these integers in an array of length 1000 or a list which can grow up to 1000. Remember, it's a set, so you are going to store only unique numbers in the range. 

//Java snippet
int[] set = new int[1000];
set[i] = num

What is the memory requirement of above?
Each integer takes 4 bytes. So total memory requirement for 1000 integers is 4KB (=4*1000 byte).  

can we improve? 
If the requirement says that not all numbers in the range might be present in the set, we can use ArrayList instead of an array. So this will improve a bit, as the numbers which are not in the set will not occupy any memory. This helps to an extent but worst-case memory requirement is still 4KB. 

can we improve?
As long as you going to store each element of the set in an integer, space overhead is going to remain the same.

Bit-Array

A bit can only take two values i.e. 0 or 1.  It can be used to denote the presence or absence of an item. So as discussed above, a byte is an array of 8 bits. This means these 8 bits can be used to represent presence/absence of an item in the set. Recall that in an integer array, the index is used to get element stored at that location. Similarly, in bit-set, bit at a position can be used to represent the presence of a number.

So as shown in the above figure, we can represent a set of integers in the range of 0 and 7, can be represented by just 1 byte ( or 8 bits). If the number is present make bit at the corresponding bit as 1. 

So below two operations can be performed on bit-set.
  1. Setting an integer in the array
  2. Testing if an integer exists in the array/set

Implementing Bit-vector

Now, let's return to our initial problem of storing integers in the range of 0 and 999. 

Number of bits required = 1000
Number of bytes required = 1000/8 = 125
So 1000 integers can be represented in 125 bytes.

Below is the Java implementation:

/**
 * Bit-vector to store integers
 * 
 * @author Siddheshwar
 * 
 */
public class BitArray {
 private byte[] set;

 public BitArray(int length) {
  int arraysize = length >> 3;

  // if length is not multiple of 8
  if (length % 8 != 0) {
   arraysize++;
  }
  this.set = new byte[arraysize];
 }

 /**
  * Set given number in the bit-array i.e. make corresponding bit as 1
  * 
  * @param number
  */
 public void storeNumber(int number) {
  if (number > (set.length << 3) || number < 0) {
   throw new IllegalArgumentException("number out of range");
  }

  int arrayIndex = this.getArrayIndex(number);
  int bitIndex = this.getBitIndex(number);

  // shift 1 to the bit index
  int pos = 1 << bitIndex;

  // so bit at position needs to be made 1
  this.set[arrayIndex] = (byte) (this.set[arrayIndex] | pos);
 }

 /**
  * Check if the number exists in the array if the corresponding bit is 1
  * then number exists
  * 
  * @param number
  */
 public boolean numberExists(int number) {
  if (number > (set.length << 3) || number < 0) {
   throw new IllegalArgumentException("number out of range");
  }

  int arrayIndex = this.getArrayIndex(number);
  int bitIndex = this.getBitIndex(number);

  // shift 1 to the bit index
  int pos = 1 << bitIndex;

  // check bit at position is 0 or 1
  return (this.set[arrayIndex] & pos) > 0 ? true : false;
 }

 private int getArrayIndex(int number) {
  return number >> 3; // divide by 8
 }

 private int getBitIndex(int number) {
  return number % (1 << 3); // % 8
 }

 private void printSet() {
  System.out.println("\n array content :");
  for (int i = 0; i < set.length; i++)
   System.out.print(Integer.toBinaryString(this.set[i]));
 }

 //test method
 public static void main(String[] args) {
  BitArray array = new BitArray(1000);
  System.out.println("length of array :" + array.set.length);

  int num1 = 17;
  array.storeNumber(num1);

  int num2 = 171;
  array.storeNumber(num2);

  int num3 = 400;
  array.storeNumber(num3);

  System.out.println("val " + num1 + " exists ? "
    + array.numberExists(num1));
  System.out.println("val " + num2 + " exists ? "
    + array.numberExists(num2));
  System.out.println("val " + num3 + " exists ? "
    + array.numberExists(num3));
  System.out.println("val " + 500 + " exists ? "
    + array.numberExists(500));

  for (int i = 0; i < 1000; i = i + 2)
   array.storeNumber(i);

  array.printSet();

  /*
   * for(int i = 0; i< 1000; i++) System.out.println(array.get(i));
   */
 }
}

Output:
length of array :125
val 17 exists ? true
val 171 exists ? true
val 400 exists ? true
val 500 exists ? false

 array content :
10101011010101101011110101011010101101010110101011010101101010110101011010101101010110101011010101101010110101011010101101010110101011010101101010110111011010101101010110101011010101101010110101011010101101010110101011010101101010110101011010101101010110101011010101101010110101011010101101010110101011010101101010110101011010101101010110101011010101101010110101011010101101010110101011010101101010110101011010101101010110101011010101101010110101011010101101010110101011010101101010110101011010101101010110101011010101101010110101011010101101010110101011010101101010110101011010101101010110101011010101101010110101011010101101010110101011010101101010110101011010101101010110101011010101101010110101011010101101010110101011010101101010110101011010101101010110101011010101101010110101011010101101010110101011010101101010110101011010101101010110101011010101101010110101011010101


---
keep coding !!!

No comments:

Post a Comment