I'm trying to extract the CN of an LDAP DN string.

Here's an example string that illustrates the problem:

var dn = @"CN=Firstname Lastname\, Organization,OU=some ou,DC=company,DC=com";

What I want is the position of the first unescaped ',' character, which is at zero-based index 36.

var pos = dn.IndexOf(',');

returns the position of the first comma (22), whether it is escaped or not. How can I get IndexOf to skip the escaped comma in the string?

Stephan Steiner
  • The '@' is making the backslash look like a regular character instead of an escape sequence. Remove the '@'. – jdweng May 12 '21 at 09:54
  • The string `"\,"` is not an [escape sequence](https://learn.microsoft.com/cpp/c-language/escape-sequences); it is just a backslash followed by a comma, and it is printed as-is in the console or any text box. –  May 12 '21 at 09:54
  • Also, it's better to use Regex for this sort of thing –  May 12 '21 at 09:57
  • [What is an escape sequence (M Docs)](https://learn.microsoft.com/cpp/c-language/escape-sequences) –  May 12 '21 at 10:01
  • The `@` means treat everything inside the string as [literal](https://stackoverflow.com/a/4879175/585968). Thus there are no _"escape sequences"_ in your string –  May 12 '21 at 10:04

3 Answers

Assuming that a backslash is itself escaped by doubling it (`\\` stands for a single literal `\`), you can implement a simple finite state machine:

private static int IndexOfUnescaped(string source, 
                                    char toFind, 
                                    char escapement = '\\') {
  if (string.IsNullOrEmpty(source))
    return -1;

  for (int i = 0; i < source.Length; ++i) 
    if (source[i] == escapement)
      i += 1; // <- skip the next (escaped) character
    else if (source[i] == toFind)
      return i;

  return -1;
}

...

var dn = @"CN=Firstname Lastname\, Organization,OU=some ou,DC=company,DC=com";

var pos = IndexOfUnescaped(dn, ','); // 36: the escaped comma at index 22 is skipped

Dmitry Bychenko

You can use Regex:

string s = @"CN=Firstname Lastname\, Organization,OU=some ou,DC=company,DC=com";
Regex regex = new Regex("(?<!\\\\),", RegexOptions.Compiled);
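// Match returns only the first hit, and also works where MatchCollection has no LINQ support
Match match = regex.Match(s);
int firstMatch = match.Success ? match.Index : -1;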

Demo: https://regex101.com/r/Jxco8K/1

It uses a negative lookbehind, so it matches only a comma that is not immediately preceded by a backslash.
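One caveat with the lookbehind: it only inspects the single character before the comma, so a comma that follows an escaped backslash (`\\,`) is skipped as well. If that case matters, a possible variant is to consume the string pair-wise instead. The following is only a sketch; the class name `UnescapedCommaSketch` and the method `IndexOfUnescapedComma` are made up for illustration, and it assumes that every backslash escapes exactly the one character after it.

```csharp
using System;
using System.Text.RegularExpressions;

static class UnescapedCommaSketch
{
    // \G anchors the match at the start of the search. Each repetition consumes either
    // one ordinary character or a backslash plus the character it escapes, and the
    // match ends at the first comma that was not consumed as part of such a pair.
    private static readonly Regex FirstBareComma =
        new Regex(@"\G(?:[^\\,]|\\.)*,", RegexOptions.Singleline);

    // Hypothetical helper: index of the first unescaped comma, or -1 if there is none.
    public static int IndexOfUnescapedComma(string source)
    {
        if (source == null)
            return -1;

        Match match = FirstBareComma.Match(source);

        // The comma is the last character of the match.
        return match.Success ? match.Index + match.Length - 1 : -1;
    }
}
```

For the string above this returns 36, the same as the lookbehind. For `@"a\\,b"` it returns 3, whereas the lookbehind pattern finds no match there.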

Tim Schmelter

A colleague of mine whipped up this regex. It doesn't answer the question exactly, but since I only wanted the position in order to call Substring, it also does the trick.

var CnRegex = new Regex(@"([a-zA-Z_]*)=((?:[^\\,}]|\\.)*)");
var match = CnRegex.Match(input);
if (match.Success)
    return match.Value;
return null;
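Since `match.Value` still contains the `CN=` prefix and the backslash, the capture groups can be used to get at the bare value. This is a rough sketch rather than a complete DN parser: the names `DnSketch` and `FirstRdnValue` are invented here, and it assumes only single-character escapes such as `\,` occur. Hex escapes such as `\2C`, which LDAP DNs also allow, are not decoded.

```csharp
using System;
using System.Text.RegularExpressions;

static class DnSketch
{
    // Same pattern as above: group 1 is the attribute name,
    // group 2 is the raw (still escaped) value.
    private static readonly Regex RdnRegex =
        new Regex(@"([a-zA-Z_]*)=((?:[^\\,}]|\\.)*)");

    // Hypothetical helper: value of the first "name=value" pair with
    // single-character escapes removed, or null if nothing matches.
    public static string FirstRdnValue(string dn)
    {
        if (string.IsNullOrEmpty(dn))
            return null;

        Match match = RdnRegex.Match(dn);
        if (!match.Success)
            return null;

        // Drop the backslash of every "\x" pair and keep the escaped character.
        return Regex.Replace(match.Groups[2].Value, @"\\(.)", "$1");
    }
}
```

With the DN from the question, `DnSketch.FirstRdnValue(dn)` yields `Firstname Lastname, Organization`.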

I feared it would come down to a Regex, as in Tim's solution, or 'brute force' as with Dmitry's solution.

Stephan Steiner