View Single Post
  #1  
Old 01-22-2018, 07:09
chants chants is offline
VIP
 
Join Date: Jul 2016
Posts: 826
Rept. Given: 47
Rept. Rcvd 50 Times in 31 Posts
Thanks Given: 737
Thanks Rcvd at 1,140 Times in 529 Posts
chants Reputation: 51
The problem with this task - e.g. common substrings problem, is that is a high complexity so that it requires a lot of difficult heuristic tricks to get it below O(n^2) otherwise it is too slow or uses too much memory. I have not seen any tools to do this. It would work with pictures or videos or audio as well - to find matching image sections, video subclips, etc. But really, it would be quite useful. I am quite certain we are talking an NP-hard problem please see:
Quote:
https://en.wikipedia.org/wiki/Longest_common_substring_problem
Although this problem is less complex, you did not specify only the longest common substring but all common substrings. Yes the suffix tree and other tricks can solve this one faster.

And there are proofs I believe that shortest common substring is NP-hard.
See for example
Quote:
https://docs.lib.purdue.edu/cgi/viewcontent.cgi?httpsredir=1&article=2225&context=cstech
Reply With Quote
The Following 2 Users Say Thank You to chants For This Useful Post:
dila (01-23-2018), Stingered (01-23-2018)