-1

Given set of n characters, what regex do we need to match a sequence of 0-x permutations of these characters?

We want permutations. Given set of 3 characters A,B,C we want to match ABC, ACB, BAC, BCA, CAB, CBA.

However, we want to match a sequence of these permutations. Sequence may contain 0 or more permutations, meaning that we want to match empty string, ABC, ABCBCA, BACCAB, BCAABCCBAABC, etc.

I was able to find solutions to match a permutation, but was unable to modify it to match a sequence of permutations.

I understand that sometimes the used regex engine might make difference. I would like to use this regex in C#'s Microsoft.VisualStudio.TestTools.UnitTesting.StringAssert.Matches method, should that make a difference. We simply want to check whether output string of tested method matches this regex, i.e. is a sequence of permutations of given set of characters.

nortander
  • 29
  • 5
  • This is doable for a fixed n (n = 3 for your ABC example), but overly ambitious for an arbitrary n if a single regex is to be used. You may wish to sharpen the focus of your question. "0-x permutations" is not clear. If you are not wed to using a single regular expression, it certainly would be easier to use ordinary c# code to develop a solution for an arbitrary collection of characters. – Cary Swoveland Mar 11 '20 at 02:35

2 Answers2

3

I cannot recommend too highly1 the use of a regular expression here!

You can use the following regex to test strings for conformance when for n = 3 and the characters are 'A', 'B' and 'C':

/^(?:([ABC])(?!\1)([ABC])(?!\1|\2)[ABC])*$/

Demo

The regex can be made self-documenting by writing it in free-spacing mode:

/
^           # match beginning of line
(?:         # begin non-capture group
  ([ABC])   # match 'A', 'B' or 'C' in capture group 1
  (?!\1)    # next character cannot be the content of capture group 1
  ([ABC])   # match 'A', 'B' or 'C' in capture group 2
  (?!\1|\2) # next character cannot be the content of capture group 1 or 2
  [ABC]     # match 'A', 'B' or 'C'
)           # end non-capture group
*           # execute non-capture group 0+ times
$           # match end of line
/x          # free-spacing mode

(?!\1|\2) is a negative lookaheads.

I've used beginning- and end-of-line anchors to facilitate testing at the link, but beginning- and end-of-string anchors would be more appropriate (\A and \z).

1 Literal interpretation intended.

Cary Swoveland
  • 106,649
  • 6
  • 63
  • 100
  • Thank you, this seems to work perfectly! I tested with up to 1,000,000 samples and it worked (I know, in regex amount should not matter). Seeing this and the other answer, I understand that regex probably isn't the best choice in my use case. Still this answered my question, so I think this deserves be marked as correct answer – nortander Mar 11 '20 at 14:07
  • I simplified the regex ever so slightly. – Cary Swoveland Mar 11 '20 at 16:58
0

You could just use a little helper i guess

static bool Check(string source, string target)
{
   int? sLen = source?.Length, tLen = target?.Length;

   if (!(sLen > 0) || !(tLen > 0) || tLen % sLen != 0)
      return false;

   IEnumerable<string> Chunks(string str, int chunkSize)
      => Enumerable.Range(0, str.Length / chunkSize)
                   .Select(i => str.Substring(i * chunkSize, chunkSize));

   return Chunks(target, source.Length).All(x => source.All(x.Contains));

}

This is likely to be a lot faster than regex, and can deal with nulls

TheGeneral
  • 79,002
  • 9
  • 103
  • 141
  • I originally discarded idea of verifying the result via function because I was afraid I wouldn't see the result in unit test overview and thus have difficulty debugging. Based on your suggestion I looked up a few things and it turn out I can have results of tested method printed and actually see them [link](https://stackoverflow.com/questions/4786884/how-to-write-output-from-a-unit-test). Your answer fits the best my use case, but the way I formulated the question, I marked the other one as correct answer. – nortander Mar 11 '20 at 14:15