1

So I have a long string of chars for example - "wdllwdwwwlldd" The string just contains the same chars -wld (try and guess what I'm doing ;))

The string will be quite long, approx 420 chars long.

I want to find, if they exist, any patterns in the string. For example if the string was - "wllddwllddwlldd" then it "wlldd" would be the pattern that was found.

So i kind of want to find Any repeated sequences in the string.

Having done a bit of research, suffix trees and suffix arrays seem to get mentioned a lot on these problems.

Is thst correct or is there another way to do this?

I can tell that this is quite a large task and could potentially take a long time.

Thanks in advance.

lbalazscs
  • 17,474
  • 7
  • 42
  • 50
  • So are you then looking for the largest pattern that, when together, make up the entire original string? – ChiefTwoPencils Aug 29 '15 at 21:48
  • Ideally yes, but that's not a constraint. Cause that would mean thst the string is following a pattern. But I doubt that the whole string will be, so I'd also like to know if any smaller patterns occur in the string, I.e more than 3 times. – Cayongrayoo Aug 29 '15 at 22:06

1 Answers1

0

So what you want is to extract all occurrences of some pattern from some string, did I get your point? If so, something very similar was discussed in this thread. It should at least send you in the right direction.

In your case, using regular expression such as w+l+d+ should do the trick.

EDIT

Question clarified a bit more ... so the algorithm you're looking for is in a detail explained in this post

Community
  • 1
  • 1
Sva.Mu
  • 1,141
  • 1
  • 14
  • 24
  • Yeah I think we are on the same lines, the only thing is, I don't know the pattern. So it needs to see if one exists.from looking at that thread it seems that the guy knows the pattern and has a Regex for it? – Cayongrayoo Aug 29 '15 at 22:12
  • @Cayongrayoo Misunderstood your question a bit - you want to know IF there are any patterns in the string, you do not use a specific one. In that case very detailed explanation on the algorithm is available in this post: http://stackoverflow.com/questions/9452701/ukkonens-suffix-tree-algorithm-in-plain-english/9513423#9513423 – Sva.Mu Aug 29 '15 at 22:28
  • Yeah that's correct. So suffix tree is the correct thing to do for this? That looks Luke s very helpful thread so thank you. – Cayongrayoo Aug 29 '15 at 22:31
  • Okay that's great thanks. One thing I never could find when doing research is what to do once the suffix tree is created, because from what I can see, a suffix tree itself does not actually tell you if there are any patterns. – Cayongrayoo Aug 29 '15 at 22:39