
General: I need a Regular Expression (RE) that will find characters from a given ordered set that appear in the target string out of order. Here's the layout:

I have the following ordered multiset of characters or character pairs (a multiset because the first and last elements are identical). Each element can appear at most once in a target string, in the order given in the set, with two exceptions: the ^ can appear twice in a row, and the single v (not paired with a .) can appear at either or both ends. (Note that the | character is not in the set but will often appear in the target string.)

# Ordered multiset of characters that should appear in target string

{v, v., <, .\, \, \., ^, ./, /, /., >, .v, v}  

Here is a typical target string that doesn't have an error:

# Typical target string

v.A|<B2|\T|E^|D|E^|A/.|B2>

Notice that the characters from the ordered multiset that do appear in the string occur in the order given in the set. Now, a user would be entering these strings (indirectly through a GUI), and the user could make a mistake. For instance, s/he could leave out a | character (I already have an RE for that), or s/he could enter character(s) from the ordered set in the incorrect order. This is the case that I am asking about here. Here is a string similar to the first but with an order mistake:

# Target string with an error

v.A|<B2|\T|E^|D|E^|A>|B2/.

Notice that the > and /. characters near the right end of the string appear out of order. (Notice also that the ^ validly appears twice.) I need to build an RE that tests for items from the ordered set being out of order in a given string. Here are a few more examples I have encountered in my work:

# Three "real life" target strings

1.    <A|vE2|B|D/|A>  
2.    <E|C^|\T|C|E>|T2/  
3.    vC2|\G|<L|B|L>|Gv 

The first string has one order mistake, the second has two, and the third has one. A few things to note: 1) for strings with multiple mistakes, I technically only need to know that the string has at least one mistake, but knowing all of the mistakes would be a bonus; 2) the single v (not v. or .v) can only appear at either or both ends of the string, as in the third example above; 3) simple target strings, such as <A or |B|A/ or |B|, with only one or no characters from the set, often occur, and the RE must handle these as valid cases, i.e. not match them.
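To make the rules above concrete, here is a minimal Python sketch that tokenizes a target string and compares each set item's rank with the previous one. It is a tokenize-and-compare check rather than the single RE I am after, and the names `TOKEN_RE`, `RANK`, and `order_mistakes` are made up for illustration. It assumes a bare v is valid only as the very first or very last character of the string, and that "twice in a row" means two ^ items with no other set item between them.

```python
import re

# Two-character items come first so "v." is not read as a bare "v".
TOKEN_RE = re.compile(r"v\.|\.v|\.\\|\\\.|\./|/\.|[v<\\^/>]")

# Rank of each item in the ordered multiset; the bare "v" is handled
# separately (rank 0 at the start of the string, rank 12 at the end).
RANK = {"v.": 1, "<": 2, ".\\": 3, "\\": 4, "\\.": 5, "^": 6,
        "./": 7, "/": 8, "/.": 9, ">": 10, ".v": 11}

def order_mistakes(target):
    """Return a list of (position, item, reason) tuples, one per mistake."""
    mistakes = []
    prev_rank = -1
    prev_tok = None
    carets = 0
    for m in TOKEN_RE.finditer(target):
        tok, pos = m.group(), m.start()
        if tok == "v":
            if pos == 0:
                rank = 0
            elif pos == len(target) - 1:
                rank = 12
            else:
                # Assumption: a bare v anywhere else is one mistake.
                mistakes.append((pos, tok, "bare v not at an end"))
                continue
        else:
            rank = RANK[tok]
        if tok == "^":
            carets += 1
        # A second ^ directly after the first one is allowed.
        repeated_caret = tok == "^" and prev_tok == "^" and carets <= 2
        if rank < prev_rank or (rank == prev_rank and not repeated_caret):
            mistakes.append((pos, tok, "out of order or repeated"))
        prev_rank = rank
        prev_tok = tok
    return mistakes
```

With these assumptions, `order_mistakes(r"v.A|<B2|\T|E^|D|E^|A/.|B2>")` returns an empty list, and the three "real life" strings give one, two, and one entries respectively, which matches the counts stated above.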

I already have an RE with three alternatives that test for other kinds of possible mistakes, so the RE I seek here will become another alternative. I suspect I will have to handle the single v case in its own alternative and the other set items in yet another. (I realize the order of the alternatives matters, too.)
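For the "another alternative" idea, one possible shape is a negative lookahead wrapped around a pattern that describes a correctly ordered string, so the alternative matches only when the order is wrong. The Python sketch below builds such a pattern. It assumes the RE flavor supports lookahead, that a . occurs only as part of a set item, and that a bare v is valid only as the first or last character. The names `F`, `MIDDLE`, `VALID`, `OUT_OF_ORDER_RE`, and `has_order_mistake` are hypothetical. It only reports that at least one mistake exists, not how many.

```python
import re

# Filler: any run of characters that are not part of a set item.
# Assumption: "." never appears outside items such as "v." or "/.".
F = r"[^v<\\^/>.]*"

# Every item between the two bare v's, in set order. The ^ entry
# optionally allows a second ^ after some filler.
MIDDLE = [r"v\.", r"<", r"\.\\", r"\\", r"\\\.",
          r"\^(?:" + F + r"\^)?",
          r"\./", r"/", r"/\.", r">", r"\.v"]

# A valid string: optional leading v, filler, then each item at most
# once in order (each followed by filler), then an optional trailing v.
VALID = "v?" + F + "".join("(?:" + item + F + ")?" for item in MIDDLE) + "v?"

# Succeeds (with an empty match at position 0) only when the whole
# string does NOT fit the valid layout.
OUT_OF_ORDER_RE = re.compile(r"^(?!" + VALID + r"$)")

def has_order_mistake(target):
    """True if the set items in target are out of order."""
    return OUT_OF_ORDER_RE.search(target) is not None
```

Filler is attached after each matched item instead of between every pair of optional items, so each run of filler can be consumed in only one way, which keeps backtracking modest. Under the stated assumptions, `has_order_mistake` is false for the typical target string and for <A, |B|A/, and |B|, and true for the error string and all three "real life" strings.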

Any help with this would be greatly appreciated. I have tried lots of REs, from simple to complex, but have only gone further down a rabbit hole, and now I am quite confused.

celticflute
  • Possible duplicate of [Learning Regular Expressions](http://stackoverflow.com/questions/4736/learning-regular-expressions) – Biffen Nov 16 '16 at 06:17
  • check this instead: http://stackoverflow.com/questions/3389574/check-if-multiple-strings-exist-in-another-string – Mohammad Yusuf Nov 16 '16 at 06:28
