LeetCode: Longest Substring Without Repeating Characters

LeetCode: Longest Substring Without Repeating Characters 1

Last Updated on

\(\)

Question

https://leetcode.com/problems/longest-substring-without-repeating-characters/

Given a string, find the length of the longest substring without repeating characters.

Input: "pwwkew"
Output: 3
Explanation: The answer is "wke", with the length of 3. 

Note that the answer must be a substring, "pwke" is a subsequence and not a substring.

Explanation

Naive solutions

According to problem description, the first solution should be enumerating all the ranges of the string, check whether characters in range[i, j] are unique, if true, we compute the distance and compare it with the current distance, choose the max one as answer.

But how to check whether a sub-string is constructed with all unique characters? HashSet is need here, so the pseudo code is:

ans = -1
for(i=0, i<str.size, i++)
  for(j=i+1, j<str.size, j++) {
    if range_is_ok(str, i, j)
      ans = max(ans, j-i)
}

range_is_ok(str, i, j) {
  hash = new hashset
  for(k=i; k<j; k++) {
    if hash.contains(str[k]) 
      return false
    else 
      hash.add(str[k])
  }
}

It’s easy to figure out, there are three loop range in pseudo code.

Time complexity: \(O(N^3)\) Space complexity: We most need \(O(K)\) space, where K is the number of different characters, 26 is enough for letters.

Optimization: reduce the double check with hashmap

Notice in the naive solution, we have a gap double checked in range. For example, we checked range[1, 4) in first loop, we also checked range [2, 4) in later loop, that’s what we can optimize on time complexity.

Actually we can maintain a window slice, there a all unique characters in the windows, if next character can be found in the previous window, which means we need to update the window to current position. How can we quickly found out whether a character occurs in window? A hashset or hashmap can be used here.

Here we come out the C++ version:

#define max(a, b) ((a) > (b) ? (a) : (b))
class Solution {
public:
  int lengthOfLongestSubstring(string s) {
    int n = s.size();
    set<char> set;
    int ans = 0, i = 0, j = 0;
    while(i < n && j < n) {
      std::cout << "i: " << i << " j: " << j << " => " << s[j] << std::endl;
      if(set.find(s[j]) == set.end()) {
        set.insert(s[j++]);
        ans = max(ans, j - i);
      } else {
        set.erase(s[i++]);
      }
    }
    return ans;
  }
};

Take a example input ‘abccde’, it’s output should be:

i: 0 j: 0 => a
i: 0 j: 1 => b
i: 0 j: 2 => c
i: 0 j: 3 => c
i: 1 j: 3 => c
i: 2 j: 3 => c
i: 3 j: 3 => c
i: 3 j: 4 => d
i: 3 j: 5 => e

The time complexity is: \(O(N)\) The space complexity is same as naive solution.

An alternative implementation is use hashmap to store the index of characters, which can be fetched to check the distance of range:

class Solution {
public:
  int lengthOfLongestSubstring(string s) {
    map<char, int> m;
    int ans = 0, pos = -1;
    for(int i = 0; i < s.size(); i++){
      if (m.find(s[i]) != m.end() && pos < m[s[i]]) {
        pos = m[s[i]];
      }
      if ( i - pos > ans ){
        ans = i - pos;
      }
      m[s[i]] = i;
    }
    return ans;
  }
};

Other optimization

If characters have fixed range, we just use a table replace the hashmap, using table will also perform better, and more suitable for C programming languages which don’t have hashmap in standard library.

Java

Java is same with C++ version:

public class Solution {
    public int lengthOfLongestSubstring(String s) {
        int n = s.length();
        Set<Character> set = new HashSet<>();
        int ans = 0, i = 0, j = 0;
        while (i < n && j < n) {
            // try to extend the range [i, j]
            if (!set.contains(s.charAt(j))){
                set.add(s.charAt(j++));
                ans = Math.max(ans, j - i);
            }
            else {
                set.remove(s.charAt(i++));
            }
        }
        return ans;
    }
}

Python

class Solution(object):
    def lengthOfLongestSubstring(self, s):
        start, ans, hash = 0, 0, {}
        for i, num in enumerate(s):
            if num in hash and start <= hash[num]:
                start = hash[num] + 1
            ans = max(ans, i - start + 1)
            hash[num] = i

        return ans

Preparing for an interview? Checkout this!