Diff Calculating wrong results for certain cases #146
Comments
@AjithAj2125 I'm having trouble making sense of your expected outputs. It may help to review what the output of In the case of
now we're done and the transformed string is in other words the output is an edit script - a list of operations to apply to the operations should always result in does this help clarify what the library is doing?
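To make the "edit script" idea concrete, here is a minimal sketch in plain Python that applies a list of operations to an input string. It does not use the library: the function name `apply_edit_script`, the `(operation, text)` tuple format, and the sample strings are all assumptions made for illustration, and the strings are not the ones from this issue.

```python
def apply_edit_script(text1, script):
    """Apply a list of (operation, text) pairs to text1 and return the result."""
    pos = 0   # read position in text1
    out = []  # pieces of the transformed string
    for op, text in script:
        if op == "insert":
            # Inserted text appears only in the output.
            out.append(text)
        elif op in ("delete", "equal"):
            # Deleted and equal text must match the input at the current position.
            if text1[pos:pos + len(text)] != text:
                raise ValueError(f"{op} {text!r} does not match input at {pos}")
            pos += len(text)
            if op == "equal":
                out.append(text)
        else:
            raise ValueError(f"unknown operation {op!r}")
    if pos != len(text1):
        raise ValueError("script did not consume the whole input")
    return "".join(out)


# Hypothetical inputs and scripts, chosen only for illustration.
positional = [("insert", "o"), ("delete", "n"), ("insert", "n"), ("delete", "o")]
shorter = [("delete", "n"), ("equal", "o"), ("insert", "n")]

print(apply_edit_script("no", positional))  # on
print(apply_edit_script("no", shorter))     # on
```

Both scripts turn the same input into the same output, so more than one script can be valid for a given pair of strings. A diff that differs from the one you expected is not necessarily wrong, as long as applying it produces the second string.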
@dmsnell thank you for the swift response. And that clears my doubt. I had gotten the whole thing wrong. I had understood it as the algorithm giving an output for each edit that takes place for each of the strings positionally, which should explain what I was expecting as an output. Just to make sense out of my "expected output". Considering
Hence the output for first edit would be
Hence the output for second edit would be Combining both outputs at each step would result in my final output. As you mentioned earlier, I know that what I'm expecting can be achieved using for loops, traversing both strings simultaneously and finding the edits. But I wasn't sure how efficient that would turn out to be, given that I'm parsing huge chunks of text. That's when I came across this algorithm, which was readily available.
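The position-by-position comparison described above can be sketched in a few lines of plain Python. This is only an illustration of that approach, not what the library does: the function name `positional_diff` and the sample inputs are assumptions (the original strings from the issue are not reproduced here), and runs of matching characters are merged into a single "Equal" entry.

```python
from itertools import zip_longest


def positional_diff(text1, text2):
    """Compare two strings index by index and list the per-position edits."""
    ops = []
    for a, b in zip_longest(text1, text2):
        if a == b:
            # Same character at this index: extend the current Equal run.
            if ops and ops[-1][0] == "Equal":
                ops[-1] = ("Equal", ops[-1][1] + a)
            else:
                ops.append(("Equal", a))
        else:
            # Different characters (or one string ended): record the new
            # character as an insert and the old one as a delete.
            if b is not None:
                ops.append(("Insert", b))
            if a is not None:
                ops.append(("Delete", a))
    return ops


# Hypothetical inputs, chosen only for illustration.
for op, text in positional_diff("03.00", "31.00"):
    print(op, repr(text))
```

This runs in a single pass over the longer string, so cost is not the concern. The limitation is that it never realigns: one character inserted at the front of a long string makes every later position show up as changed, which is the case a real diff algorithm handles.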
There's a complicated algorithm to find the minimum edit script and it has some catastrophic edge cases. Libraries like
I've been working with and testing this algorithm (C#). It works as expected most of the time, but there are a few cases where the algorithm doesn't return correct results.
Example 1:
The expected output for the above case would be
Insert "o", Delete "n", Insert "n", Delete "o"
but the output that the algorithm gives is wrong.
Example 2:
The same issue arises here too, even though numbers are treated as strings here. The expected output would be
Insert "3", Delete "0", Insert "1", Delete "3", Equal ".00"
While these are just some random test cases, I think the common pattern is that the issue arises when the same characters appear in both strings but have been inserted and deleted, or occur in a different order.
Has anyone observed this, and has it been addressed before? Can I get some guidance on how to handle these cases, as I'm trying to use this algorithm?
Or am I not using it the right way? Any help would be appreciated.