Rabin-Karp Algorithm

Using hashing to quickly find potential matches in a string.

Key Notes

The Rabin-Karp algorithm is another string-searching algorithm that aims to find occurrences of a pattern in a text. Its average and best-case running time is O(n + m), but its worst-case time is O(n*m). However, with a good hashing function, the worst case is highly unlikely. The core idea is to use hashing to compare the pattern with substrings in the text. Instead of comparing characters one by one, we compute a hash value for the pattern and for each substring of the text with the same length. If the hash values match, it's a potential match, and only then do we perform a character-by-character comparison to confirm (as hash collisions are possible, though rare). The key to Rabin-Karp's efficiency is the use of a 'rolling hash'. A rolling hash function allows us to calculate the hash value of the next substring in the text in O(1) time from the hash value of the current substring. For example, to slide the window one position to the right, we subtract the term corresponding to the character leaving the window and add the term for the new character entering it. This avoids re-calculating the hash from scratch for every substring, which would be O(m) each time. Polynomial rolling hash is a common choice for this technique.

Back to Strings

Rabin-Karp Algorithm

Using hashing to quickly find potential matches in a string.

Key Notes