🔎 KMP String Matching
Failure function search
Step 0
Matches: 0
Inputs
Controls
Stats
Comparisons
0
Matches
0
Phase
Failure
Status
Ready
Trace
Info & Theory

Knuth-Morris-Pratt (KMP) searches for a pattern inside a text in O(n+m) time, where n is the text length and m the pattern length.

The failure function

For each position q of the pattern, fail[q] is the length of the longest proper prefix of P[0..q] that is also a suffix. It is computed by matching the pattern against itself in O(m) time.

The scan

Walk one cursor i through the text and a cursor q through the pattern. On a match both advance. On a mismatch with q > 0, set q = fail[q−1] — the pattern slides forward while i stays put. The text cursor never moves back.

Why it is linear

Each step either advances i or decreases q. Since q can only rise as many times as i advances, the total work is O(n+m).

Where it is used

Editor search, grep, intrusion-detection signature scanning and bioinformatics all use KMP or related linear-time matchers.

Frequently asked questions

What is the KMP algorithm?

The Knuth-Morris-Pratt algorithm finds all occurrences of a pattern within a text in linear O(n+m) time by precomputing a failure function so the text cursor never moves backwards.

What is the failure function?

The failure (prefix) function records, for each position in the pattern, the length of the longest proper prefix that is also a suffix. It tells the algorithm how far to slide the pattern after a mismatch.

Why does the text cursor never move back?

On a mismatch the failure function tells KMP how much of the pattern already matched as a prefix, so it shifts the pattern rather than rewinding the text. Each text character is examined at most a constant number of times.

How is KMP faster than naive search?

Naive search can re-compare characters after every shift, giving O(n·m) in the worst case. KMP reuses information from previous comparisons, achieving O(n+m).

How is the failure function computed?

It is built by matching the pattern against itself: a pointer tracks the current prefix length, extending on a match and falling back through earlier failure values on a mismatch, all in O(m) time.

What is the time and space complexity?

KMP runs in O(n+m) time, where n is the text length and m the pattern length, and uses O(m) extra space for the failure table.

Can KMP find overlapping matches?

Yes. After a full match it uses the failure value of the last pattern position to continue, so overlapping occurrences such as AA in AAAA are all reported.

Where is KMP used?

Text editors, grep-style search, intrusion detection, bioinformatics and network packet inspection use KMP or related linear-time matching algorithms.

How does KMP compare with Boyer-Moore?

Boyer-Moore scans the pattern from right to left and can skip large chunks, often faster in practice, while KMP guarantees a strict linear bound and never moves the text cursor backwards.

What happens if the pattern is empty?

An empty pattern matches at every position by convention. This simulation expects a non-empty pattern shorter than or equal to the text for a meaningful search.