Talk:Longest common substring

This article is within the scope of WikiProject Computer science, a collaborative effort to improve the coverage of Computer science related articles on Wikipedia. If you would like to participate, please visit the project page, where you can join the discussion and see a list of open tasks.Computer scienceWikipedia:WikiProject Computer scienceTemplate:WikiProject Computer scienceComputer science articles

Mid

This article has been rated as Mid-importance on the project's importance scale.

Things you can help WikiProject Computer science with:

Here are some tasks awaiting attention:

Article requests :
- Requested articles/Applied arts and sciences/Computer science, computing, and Internet
Cleanup :
- Computer science articles needing attention
- Computer science articles needing expert attention
Copyedit :
- Computing
Expand :
- Computer science
Infobox :
- Computer science articles without infoboxes
Maintain :
- Timeline of computing 2020–present
Photo :
- Find pictures for the biographies of computer scientists (see List of computer scientists)
- Computing articles needing images
Stubs :
- Computer science stubs
Unreferenced :
- WikiProject Computer science/Unreferenced BLPs
Project-related :
- Tag all relevant articles in Category:Computer science and sub-categories with {{WikiProject Computer science}}

Algorithm Problem[edit]

The final else clause of the algorithm was S[i][j]=0, which made no sense at all. I changed it on the page to be L[i,j]=0, as it should be. Maybe a "vandal" did this? 170.223.207.25 (talk) 00:22, 30 November 2011 (UTC)[reply]

It makes sense in several languages to access a two dim aray through [i][j]. But the example should of course be consequent. 92.62.38.2 (talk) 07:32, 11 July 2013 (UTC)[reply]

Weird algorithm?[edit]

Why is the matrix indexed starting at 0 if the index 0 entries are never used? Shouldn't it start at 1 like the strings? Also, the following:

               if L[i,j] > z
                   z := L[i,j]
                   ret := {}
               if L[i,j] = z
                   ret := ret ∪ {S[i-z+1..i]}

Would be clearer as:

               if L[i,j] > z
                   z := L[i,j]
                   ret := S[i-z+1..i]

I believe the two are equivalent. —Preceding unsigned comment added by Schmmd (talk • contribs) 18:23, 26 August 2008 (UTC)[reply]

You're correct about the 0,0 array start, and I changed that. However, your code is not equivalent to the original. The original will accumulate all equal length substrings, while yours will only return one such substring. Daggerbox (talk) 12:02, 2 November 2010 (UTC)[reply]

The current page shows

               if L[i,j] > z
                   z := L[i,j]
                   ret := {S[i-z+1..i]}
               if L[i,j] = z
                   ret := ret ∪ {S[i-z+1..i]}

But that clearly duplicates S[i-z+1..i] whenever z gets incremented. It should be changed so the line inside "if L[i,j] > z" resets ret to nothing, i.e. ret :={} . If I'm missing something in the interpretation of pseudocode if/else flow, please let me know. Cellocgw (talk) 14:17, 25 April 2013 (UTC)[reply]

Error in figure?[edit]

Is the figure correct? It seems like AB should be labeled 1,3.

I think i made that figure. Now that I had some trouble understanding it myself, it is clearly not good. I think I chose to used three strings to show how it works well for more than two. When the tree got a bit big, I merged the leaf nodes into the internal nodes. If there is a number in a node in the current figure, the path from the root to the node spells a suffix of the corresponding string. Only the first string ends with "AB". I can try to make a better figure after Easter. Now I also see that the strings are numbered from 1, while in the figure in generalised suffix tree are numbered from 0. -- Nils Grimsmo 15:49, 10 April 2006 (UTC)[reply]

I am sorry it took forever before I fixed this. Now it has the same syntax as the figure in generalised suffix tree. Is it understandable? -- Nils Grimsmo 06:33, 24 July 2006 (UTC)[reply]

The diagram appears to contain "ABAB$2" when it should have "ABAB$0". Daggerbox (talk) 14:22, 29 October 2010 (UTC)[reply]

I'm pretty sure that change is correct, so I made it myself. Daggerbox (talk) 18:30, 29 October 2010 (UTC)[reply]

I think the entry 0:0 should be 1:0 and the entry 1:0 should be 0:0. Lagbhag (talk) 21:30, 26 April 2020 (UTC)Lagbhag[reply]

Myers' algorithm[edit]

I think we should also include the algorithm from Myers, Eugene W. (1986). 'An O(ND) Difference Algorithm and Its Variations', Algorithmica, 1: 251 – 256.

It has O(ND) complexity where D is the size of the minimum edit script between the two strings. Therefore it is more useful than the dynamic programming algorithm where the two strings differ only a little. Jyotirmoyb (talk) 04:32, 20 November 2009 (UTC)[reply]

Naive algorithm should be removed[edit]

The dynamic programming solution is absolutely ridiculous, it's not performant even for just two arrays. It should be removed as other entries about other computational problems only list optimal or otherwise notable algorithms, not the brute force solutions. — Preceding unsigned comment added by 69.181.124.151 (talk) 04:29, 29 March 2013 (UTC)[reply]

Probable Bug[edit]

This edit https://en.wikipedia.org/w/index.php?title=Longest_common_substring_problem&oldid=913812336 changed i to z which is wrong IMHO, the .. doesn't mean "start..length" it means "start..end" and in that case the end should be i. — Preceding unsigned comment added by 81.110.231.83 (talk) 12:47, 11 December 2019 (UTC)[reply]