Extracting substring based on the same identifier in two locations

Question

I found this question, which achieves what I am looking for. However, I have one problem: the "start" and "end" of the substring are the same character.

My string is:

.0.label unicode "Area - 110"

and I want to extract the text between the inverted commas ("Area - 110").

In the linked question, the answers are all using specific identifiers, and IndexOf solutions. The problem is that if I do the same, IndexOf will likely return the same value.

Additionally, if I use Split methods, the text I want to keep is not a fixed length - it could be one word, it could be seven; so I am also having issues specifying the indexes of the first and last word in that collection as well.

What's wrong with the linked question's Regex solution? `Regex.Match(input, @"(?<="")(.+?)(?="")");` will match your string and you'll be able to extract `Area - 110` from its value. - Jonathon Chase, May 24 '18 at 00:55
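For readers who want to try the regex from this comment, here is a minimal sketch. The class and method names (`QuotedTextRegexSketch`, `ExtractQuoted`) are hypothetical and chosen only for illustration; the pattern itself is copied from the comment. Returning `null` when nothing matches is an assumption about the desired behaviour, not something the comment specifies.

```csharp
using System;
using System.Text.RegularExpressions;

public static class QuotedTextRegexSketch
{
    // Hypothetical helper; the pattern is the one quoted in the comment.
    public static string ExtractQuoted(string input)
    {
        // (?<=") requires a double quote immediately before the match,
        // (.+?)  lazily captures one or more characters,
        // (?=")  requires a double quote immediately after the match.
        Match match = Regex.Match(input, @"(?<="")(.+?)(?="")");

        // Assumption: return null instead of throwing when no quoted text exists.
        return match.Success ? match.Value : null;
    }
}
```

With the string from the question, `QuotedTextRegexSketch.ExtractQuoted(".0.label unicode \"Area - 110\"")` returns `Area - 110`. Because `.+?` needs at least one character, an empty pair of quotes produces no match.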

score 2 · Accepted Answer · answered May 24 '18 at 00:55

The problem is that if I do the same, IndexOf will likely return the same value.

A common trick in this situation is to use LastIndexOf to find the location of the closing double-quote:

string str = ".0.label unicode \"Area - 110\"";
int start = str.IndexOf('"');
int end = str.LastIndexOf('"');
if (start >= 0 && end > start) {
    // Two distinct quote positions were found; print the text between them
    Console.WriteLine(str.Substring(start+1, end-start-1));
}

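The `LastIndexOf` approach works when the string contains exactly one quoted segment. If it might contain several, one alternative (not part of the answer above) is the `IndexOf` overload that takes a start index, so the second search begins just past the opening quote and cannot return the same position. The sketch below uses hypothetical names (`QuotedTextIndexSketch`, `FirstQuoted`) and assumes `null` is an acceptable result when no complete pair of quotes is found.

```csharp
using System;

public static class QuotedTextIndexSketch
{
    // Hypothetical helper: returns the text between the first double quote
    // and the next double quote after it, or null if there is no such pair.
    public static string FirstQuoted(string str)
    {
        int start = str.IndexOf('"');
        if (start < 0) return null;

        // Begin the second search one character past the opening quote,
        // so it cannot report the opening quote's index again.
        int end = str.IndexOf('"', start + 1);
        if (end < 0) return null;

        return str.Substring(start + 1, end - start - 1);
    }
}
```

For `a "one" b "two"` this returns `one`, whereas the `IndexOf`/`LastIndexOf` pair would span from the first quote to the last and return `one" b "two`. For the string in the question, both approaches give `Area - 110`.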

score 0 · Answer 2 · answered May 26 '18 at 18:15

I would do it like this:

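string str = ".0.label unicode \"Area - 110\"";
// Drop everything up to and including the first double quote
str = str.Substring(str.IndexOf("\"") + 1);
// Keep everything before the next double quote
str = str.Substring(0, str.IndexOf("\""));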

In fact, this is one of my most used helper methods/extensions, because it is quite versatile:

/// <summary>
/// Isolates the text in between the parameters, exclusively, using invariant, case-sensitive comparison.
/// Both parameters may be null to skip either step. If specified but not found, a FormatException is thrown.
/// </summary>
// Note: Truncate is another custom extension method (it is not part of the .NET base class library).
public static string Isolate(this string str, string entryString, string exitString)
{
    if (!string.IsNullOrEmpty(entryString))
    {
        // Cut away everything up to and including the entry string
        int entry = str.IndexOf(entryString, StringComparison.InvariantCulture);
        if (entry == -1) throw new FormatException($"String.Isolate failed: \"{entryString}\" not found in string \"{str.Truncate(80)}\".");
        str = str.Substring(entry + entryString.Length);
    }

    if (!string.IsNullOrEmpty(exitString))
    {
        // Cut away the exit string and everything after it
        int exit = str.IndexOf(exitString, StringComparison.InvariantCulture);
        if (exit == -1) throw new FormatException($"String.Isolate failed: \"{exitString}\" not found in string \"{str.Truncate(80)}\".");
        str = str.Substring(0, exit);
    }

    return str;
}

You'd use that like this:

string str = ".0.label unicode \"Area - 110\"";
string output = str.Isolate("\"", "\"");
