Saturday, August 22, 2015

Longest Substring Without Repeating Characters

Problem:
Given a string, find the longest substring without repeating characters.
"abcabcd"  --> "abcd"
"aaaaaa"    --> "a"
"abcad"     --> "bcad"

Solution

This problem might be trivial if there is no restriction on time complexity. But if the expected time complexity is linear then it becomes quite interesting. Let's see how we going to approach to solve it. 

Once you start moving from left to right in the given input string (one character each time), there will be a need to find if the character already exists before? 

if the character doesn't exist earlier - continue further.
if the character exists earlier - this substring (starting from a given index) could be a potential answer. 

So we need a start index which points to the beginning of the current potential answer. And another index which keeps on incrementing to find that maximum substring. 

how to check if character is already found ?
str = "abac"
Assume, startIndex = 0, currentIndex = 2

The character at currentIndex(i.e. 'a') will have to be checked if it's already there in the substring before it (i.e. "ab"). Brute force approach to check if the character exists in the substring will result in complexity O(substring.length()).

We can optimize it by using hashing mechanism, but keep in mind that hashing technique will increase the space overhead. 

Java Implementation


 public static String longestSubstringWithoutRepeatingCharars(String str) {
  if (str == null || str.length() < 2) {
   return str;
  }

  Set<Character> unique = new HashSet<>();
  int startIndex = 0, endIndex = 0;
  char ch;
  String max = " ";

  while (endIndex < str.length()) {
   ch = str.charAt(endIndex);
   if (unique.contains(ch)) {

    // check if max can be optimized
    if (str.substring(startIndex, endIndex).length() >= max
      .length()) {
     max = str.substring(startIndex, endIndex);
    }

    // reset both indexes to find the next substring
    startIndex++;
    endIndex = startIndex;

    // reset set so that it can again store all unique chars
    unique.clear();
   } else {
    endIndex++;
    unique.add(ch);
   }
  }

  // check if max can be optimized
  if (str.substring(startIndex, endIndex).length() >= max.length()) {
   max = str.substring(startIndex, endIndex);
  }
  return max;
 }

longestSubstringWithoutRepeatingCharars("abcad");

No comments:

Post a Comment