Say I have a string in a form similar to this:

"First/Second//Third/Fourth" (notice the double slash between Second and Third)

I want to split this string into the substrings "First", "Second//Third", and "Fourth". In other words, I want to split the string on a character (here /), but not where that character is doubled (here //). I thought of a number of approaches, but couldn't get any of them working.

I can use a solution in C# and/or JavaScript.

Thanks!

Edit: I would like a simple solution. I have already thought of parsing the string character by character, but that is too complicated for my real-life usage.

Adrian Marinica

4 Answers

Try this C# solution; it uses positive lookbehind and positive lookahead:

        // Requires: using System.Text.RegularExpressions;
        string s = @"First/Second//Third/Fourth";
        var values = Regex.Split(s, @"(?<=[^/])/(?=[^/])", RegexOptions.None);

It says: delimiter is / which is preceded by any character except / and followed by any character except /.

Here is another, shorter, version that uses negative lookbehind and lookahead:

        var values = Regex.Split(s, @"(?<!/)/(?!/)", RegexOptions.None);

This says: the delimiter is a / that is not preceded by / and not followed by /.

You can find out more about 'lookarounds' here.
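
Since the question also allows JavaScript, the same negative-lookaround pattern can be sketched there as well. This is only a minimal sketch: it assumes an engine with lookbehind support (ES2018 or later, such as current Node.js and most modern browsers), and the variable names are illustrative rather than part of the original answer.

```javascript
// Minimal sketch: split on "/" only when it is not adjacent to another "/".
// Assumes a JavaScript engine that supports lookbehind assertions (ES2018+).
const input = "First/Second//Third/Fourth";
const singleSlash = /(?<!\/)\/(?!\/)/;
const values = input.split(singleSlash);
console.log(values); // [ 'First', 'Second//Third', 'Fourth' ]
```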

Ivan Golović
  • Thanks! It works great! I will accept your answer once the option becomes available. – Adrian Marinica Feb 27 '13 at 09:54
  • I'm not sure how you want to handle it, but the use of _positive_ lookaround here means that this _won't_ split on a slash at the _start or end_ of the input. (Since it's looking for "next to two non-slashes" rather than "not next to a slash".) – Rawling Feb 27 '13 at 09:59
  • @Rawling I added a version with negative lookaround. Besides handling the case you mentioned, the negative lookaround version is shorter and, in my opinion, more readable. – Ivan Golović Feb 27 '13 at 10:04
  • Could you explain why < is needed for the lookbehind but not for the lookahead? – zer0ne Jun 26 '13 at 20:04
  • @zer0ne `<` denotes lookbehind. It is the syntax symbol that marks this pattern as a negative lookbehind; without it, the syntax would be that of a negative lookahead. – Ivan Golović Jul 15 '13 at 12:46
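
The difference Rawling points out between the positive and negative lookaround versions can be illustrated with a small JavaScript sketch. The input string with a leading and trailing slash is a made-up example, the variable names are hypothetical, and the code assumes an engine with lookbehind support (ES2018 or later).

```javascript
// Illustrative comparison of the two patterns on input that starts and ends with "/".
const positive = /(?<=[^\/])\/(?=[^\/])/; // needs a non-slash character on both sides
const negative = /(?<!\/)\/(?!\/)/;       // only needs "no slash" on both sides
const edgeInput = "/First/Second//Third/";
const positiveParts = edgeInput.split(positive);
const negativeParts = edgeInput.split(negative);
console.log(positiveParts); // [ '/First', 'Second//Third/' ]
console.log(negativeParts); // [ '', 'First', 'Second//Third', '' ]
```

The positive version leaves the outer slashes attached to their neighbours, while the negative version splits there and yields empty strings at both ends.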

In .NET Regex you can do it with negative lookaround assertions: (?<!/)/(?!/) will work. Pass that pattern to the Regex.Split method.

RoadBump

One thing you can do is split the string on /. The array you get back will contain an empty element at every place where // was used. Loop through the array and, wherever element i is empty, concatenate elements i-1 and i+1 (with // between them) into a single entry.
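
The split-and-merge idea above could be sketched in JavaScript roughly as follows. The function name `splitOnSingleSlash` is hypothetical, and the sketch assumes slashes only ever appear singly or in pairs; a run of three or more slashes would not be rejoined correctly.

```javascript
// Minimal sketch: split on every "/", then rejoin the pieces around each
// empty element, since an empty element marks a "//" in the original string.
// Assumption: slashes appear only singly or doubled, never three or more in a row.
function splitOnSingleSlash(input) {
  const parts = input.split("/");
  const result = [];
  for (let i = 0; i < parts.length; i++) {
    const isInteriorEmpty = parts[i] === "" && i > 0 && i < parts.length - 1;
    if (isInteriorEmpty) {
      // Merge the previous piece and the next piece back together with "//".
      result[result.length - 1] += "//" + parts[i + 1];
      i++; // skip the piece that was just merged
    } else {
      result.push(parts[i]);
    }
  }
  return result;
}

console.log(splitOnSingleSlash("First/Second//Third/Fourth"));
// [ 'First', 'Second//Third', 'Fourth' ]
```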

Aashray

How about this:

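// Assumes "%%" never appears in the input; it temporarily stands in for "//".
// Regex literals with the g flag are used because replace() with a string
// pattern only replaces the first occurrence.
var array = "First/Second//Third/Fourth".replace(/\/\//g, "%%").split("/");

array.forEach(function(element, index) {
    array[index] = element.replace(/%%/g, "//");
});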
Amberlamps