Question 1

What is edit distance?

Accepted Answer

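Edit distance, most commonly measured as Levenshtein distance, is the minimum number of single-character edits (insertions, deletions or substitutions) needed to change one string into another. For example, turning "kitten" into "sitting" takes three edits: substitute k with s, substitute e with i, and insert g at the end.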

Question 2

How does the dynamic-programming table work?

Accepted Answer

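A grid of size (m+1) x (n+1) stores the edit distance between every pair of prefixes of the two strings, where m and n are the string lengths. The first row and first column hold the distances to the empty prefix (0, 1, 2, and so on). Taking rows to index the first string, every other cell is the minimum of three candidates: the top neighbour plus one (a deletion), the left neighbour plus one (an insertion), and the diagonal neighbour plus zero or one (a copy or a substitution). The bottom-right cell therefore holds the full distance.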
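
The table fill can be sketched in a few lines of Python. This is a minimal illustration with unit costs, not a reference implementation; the function name edit_distance_table and the choice to put the first string along the rows are assumptions made for the example.

```python
def edit_distance_table(a: str, b: str) -> list[list[int]]:
    """Return the full (len(a)+1) x (len(b)+1) table of prefix distances."""
    m, n = len(a), len(b)
    table = [[0] * (n + 1) for _ in range(m + 1)]

    # Base cases: turning a prefix into the empty string (or the reverse)
    # costs one edit per character.
    for i in range(m + 1):
        table[i][0] = i
    for j in range(n + 1):
        table[0][j] = j

    for i in range(1, m + 1):
        for j in range(1, n + 1):
            # Diagonal move: free if the characters match, else a substitution.
            diagonal_cost = 0 if a[i - 1] == b[j - 1] else 1
            table[i][j] = min(
                table[i - 1][j] + 1,                  # top: delete a[i-1]
                table[i][j - 1] + 1,                  # left: insert b[j-1]
                table[i - 1][j - 1] + diagonal_cost,  # diagonal: copy or substitute
            )
    return table


table = edit_distance_table("kitten", "sitting")
print(table[-1][-1])  # 3
```

The last line reads the bottom-right cell, which is the distance between the two complete strings.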

Question 3

Why is the diagonal move special?

Accepted Answer

A diagonal move aligns one character of each string. If the characters match, the cost is zero (a copy); if they differ, it is a substitution costing one.

Question 4

What is backtracking in this algorithm?

Accepted Answer

After filling the table, you start at the bottom-right cell and walk back to the top-left, at each step moving to a neighbour (top, left or diagonal) whose value plus the cost of that move equals the current value. Reading the moves in reverse order reconstructs an actual sequence of edits.
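
A sketch of that walk in Python follows. The function name edit_script, the operation labels and the tie-breaking order (diagonal first, then top, then left) are assumptions for illustration; a different order can yield a different but equally short script.

```python
def edit_script(a: str, b: str) -> list[tuple[str, str, str]]:
    """Return one optimal list of (operation, char_from_a, char_from_b)."""
    m, n = len(a), len(b)
    table = [[0] * (n + 1) for _ in range(m + 1)]
    for i in range(m + 1):
        table[i][0] = i
    for j in range(n + 1):
        table[0][j] = j
    for i in range(1, m + 1):
        for j in range(1, n + 1):
            cost = 0 if a[i - 1] == b[j - 1] else 1
            table[i][j] = min(table[i - 1][j] + 1,
                              table[i][j - 1] + 1,
                              table[i - 1][j - 1] + cost)

    # Walk from the bottom-right cell back to the top-left cell.
    ops = []
    i, j = m, n
    while i > 0 or j > 0:
        if i > 0 and j > 0:
            cost = 0 if a[i - 1] == b[j - 1] else 1
            if table[i][j] == table[i - 1][j - 1] + cost:
                ops.append(("copy" if cost == 0 else "substitute", a[i - 1], b[j - 1]))
                i, j = i - 1, j - 1
                continue
        if i > 0 and table[i][j] == table[i - 1][j] + 1:
            ops.append(("delete", a[i - 1], ""))
            i -= 1
        else:
            ops.append(("insert", "", b[j - 1]))
            j -= 1

    ops.reverse()  # the walk collected the steps from last to first
    return ops


for op in edit_script("kitten", "sitting"):
    print(op)
```

Joining the second field of every step gives back the first string, and joining the third field gives the second string, which is a handy sanity check on any backtracking routine.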

Question 5

What is the time complexity?

Accepted Answer

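Filling the table takes O(m*n) time and O(m*n) space, where m and n are the string lengths. If only the distance is needed, space can be reduced to O(min(m,n)): each row depends only on the row above it, so it is enough to keep two rows and to lay the shorter string along them. The full table is still required for backtracking.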
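
The reduced-space variant might look like the following sketch. The function name edit_distance is an assumption; the swap at the top ensures the rows are as long as the shorter string.

```python
def edit_distance(a: str, b: str) -> int:
    """Levenshtein distance using two rows instead of the full table."""
    if len(b) > len(a):
        a, b = b, a  # keep the shorter string along the row

    previous = list(range(len(b) + 1))  # row for the empty prefix of a
    for i, char_a in enumerate(a, start=1):
        current = [i]  # first column: delete all i characters
        for j, char_b in enumerate(b, start=1):
            cost = 0 if char_a == char_b else 1
            current.append(min(previous[j] + 1,          # deletion
                               current[j - 1] + 1,       # insertion
                               previous[j - 1] + cost))  # copy or substitution
        previous = current
    return previous[-1]


print(edit_distance("kitten", "sitting"))  # 3
```

Swapping the arguments is safe here because the unit-cost distance is symmetric.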

Question 6

Where is edit distance used?

Accepted Answer

Spell checkers, fuzzy search, DNA and protein alignment, plagiarism detection, OCR correction and diff tools all rely on edit distance or close variants of it.

Question 7

Is the answer always unique?

Accepted Answer

The distance value is unique, but there can be several optimal edit paths of the same total cost. The backtracking here picks one consistent path.
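
To see how many optimal paths exist, the same table can carry a second number per cell: the count of minimum-cost paths reaching it. The sketch below is an illustration under unit costs; the function name count_optimal_paths is an assumption, and it counts distinct paths through the table, treating a copy as a step.

```python
def count_optimal_paths(a: str, b: str) -> tuple[int, int]:
    """Return (distance, number of minimum-cost paths through the table)."""
    m, n = len(a), len(b)
    dist = [[0] * (n + 1) for _ in range(m + 1)]
    ways = [[0] * (n + 1) for _ in range(m + 1)]

    # Along the first row and column there is exactly one way to arrive.
    for i in range(m + 1):
        dist[i][0], ways[i][0] = i, 1
    for j in range(n + 1):
        dist[0][j], ways[0][j] = j, 1

    for i in range(1, m + 1):
        for j in range(1, n + 1):
            cost = 0 if a[i - 1] == b[j - 1] else 1
            candidates = [
                (dist[i - 1][j] + 1, ways[i - 1][j]),             # top
                (dist[i][j - 1] + 1, ways[i][j - 1]),             # left
                (dist[i - 1][j - 1] + cost, ways[i - 1][j - 1]),  # diagonal
            ]
            best = min(value for value, _ in candidates)
            dist[i][j] = best
            # Every neighbour that achieves the minimum contributes its paths.
            ways[i][j] = sum(count for value, count in candidates if value == best)
    return dist[m][n], ways[m][n]


print(count_optimal_paths("ab", "ba"))  # (2, 3)
```

For "ab" and "ba" the three paths are: substitute both characters; delete a, copy b, insert a; or insert b, copy a, delete b.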

Question 8

What is the difference between Levenshtein and Hamming distance?

Accepted Answer

Hamming distance only counts substitutions and requires equal-length strings. Levenshtein distance also allows insertions and deletions, so it handles strings of different lengths.
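
A small comparison makes the difference concrete. Both functions below are illustrative sketches with assumed names (hamming and levenshtein); the example pair is chosen so that a one-position shift makes every aligned character differ.

```python
def hamming(a: str, b: str) -> int:
    """Count positions where two equal-length strings differ."""
    if len(a) != len(b):
        raise ValueError("Hamming distance needs strings of equal length")
    return sum(x != y for x, y in zip(a, b))


def levenshtein(a: str, b: str) -> int:
    """Unit-cost edit distance, computed row by row."""
    previous = list(range(len(b) + 1))
    for i, char_a in enumerate(a, start=1):
        current = [i]
        for j, char_b in enumerate(b, start=1):
            cost = 0 if char_a == char_b else 1
            current.append(min(previous[j] + 1,
                               current[j - 1] + 1,
                               previous[j - 1] + cost))
        previous = current
    return previous[-1]


# "bcdefa" is "abcdef" with the leading "a" moved to the end.
print(hamming("abcdef", "bcdefa"))      # 6
print(levenshtein("abcdef", "bcdefa"))  # 2
```

Hamming distance sees six mismatches, while Levenshtein distance needs only one deletion and one insertion.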

Question 9

Can the operations have different costs?

Accepted Answer

Yes. Weighted edit distance assigns different costs to insertion, deletion and substitution, which is common in spell-checking where some mistakes are more likely than others. This simulation uses unit costs.
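
A weighted version only changes the constants in the recurrence. The sketch below uses one fixed cost per operation type; the function name and the parameters ins_cost, del_cost and sub_cost are assumptions for illustration, and real spell checkers often go further and make the cost depend on the specific characters involved.

```python
def weighted_edit_distance(a: str, b: str,
                           ins_cost: float = 1.0,
                           del_cost: float = 1.0,
                           sub_cost: float = 1.0) -> float:
    """Edit distance with a separate cost for each operation type."""
    # Row for the empty prefix of a: build b[:j] using j insertions.
    previous = [j * ins_cost for j in range(len(b) + 1)]
    for i, char_a in enumerate(a, start=1):
        current = [i * del_cost]  # delete the first i characters of a
        for j, char_b in enumerate(b, start=1):
            diagonal = 0.0 if char_a == char_b else sub_cost
            current.append(min(previous[j] + del_cost,
                               current[j - 1] + ins_cost,
                               previous[j - 1] + diagonal))
        previous = current
    return previous[-1]


print(weighted_edit_distance("kitten", "sitting"))                # 3.0
print(weighted_edit_distance("kitten", "sitting", sub_cost=2.0))  # 5.0
```

With sub_cost set to 2, a substitution is no cheaper than a deletion followed by an insertion, so the distance for this pair rises from 3 to 5.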

Question 10

Why does the path stay near the diagonal for similar words?

Accepted Answer

When two strings are alike, most characters align directly, so the optimal path runs close to the main diagonal with only a few off-diagonal insertion or deletion steps.
