string s = "Gewerbegebiet Waldstraße"; //other possible input "Waldstrasse"

int iFoundStart = s.IndexOf("strasse", StringComparison.CurrentCulture);
if (iFoundStart > -1)
    s = s.Remove(iFoundStart, 7);

I'm running under CultureInfo 1031 (German, de-DE).

With the search value 'strasse', IndexOf matches both 'straße' and 'strasse' and returns 18 as the position.

Neither Remove nor Replace has an overload that takes a culture.

If I remove 6 characters with Remove, an input containing 'straße' is handled correctly, but for 'strasse' one character is left over. If I remove 7 characters and the input contains 'straße', I get an ArgumentOutOfRangeException.
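The length mismatch can be reproduced without any culture-sensitive call. A minimal sketch, assuming the start index 18 reported above; the class name `RemoveLengthDemo` is made up for illustration:

```csharp
using System;

static class RemoveLengthDemo
{
    static void Main()
    {
        const int start = 18; // index reported by IndexOf for both spellings

        // 'straße' is 6 chars: removing 6 works, removing 7 runs past the end.
        string eszett = "Gewerbegebiet Waldstraße";
        Console.WriteLine(eszett.Remove(start, 6));
        try
        {
            eszett.Remove(start, 7);
        }
        catch (ArgumentOutOfRangeException)
        {
            Console.WriteLine("Remove(18, 7) is out of range for 'straße'");
        }

        // 'strasse' is 7 chars: removing 6 leaves the trailing 'e' behind.
        string doubleS = "Gewerbegebiet Waldstrasse";
        Console.WriteLine(doubleS.Remove(start, 6));
        Console.WriteLine(doubleS.Remove(start, 7));
    }
}
```

So no single hard-coded count is right for both inputs; the length of the actual match is needed.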

Is there a way to safely remove the found string? Is there any method that provides the end index (or length) of the match found by IndexOf? I stepped into IndexOf and, as expected, it is native code under the hood, so there is no way to roll my own...

Hamid Pourjam
isHuman
  • How about replacing it with empty string? `s = s.Replace("strasse","");` – Hamid Pourjam Nov 05 '15 at 18:22
  • @dotctor I believe the OP is saying that `string.Replace` doesn't take the culture into account, so "ss" doesn't match "ß". – juharr Nov 05 '15 at 18:32
  • I'm running on `en-US` and have this problem too. The thing is, IndexOf behaves differently. – M.kazem Akhgary Nov 05 '15 at 18:34
  • @M.kazemAkhgary Because in English "ß" and "ss" are not the same. – Jakub Lortz Nov 05 '15 at 18:36
  • What if you first did `s.Replace("ß", "ss");`? – juharr Nov 05 '15 at 18:36
  • @JakubLortz They are not the same, but I still get the correct index when I use IndexOf. If I use `StringComparison.Ordinal` it gives me `-1`, of course. The .NET team should think about adding these overloads to the `Replace` and `Remove` methods too, and they should behave like `IndexOf` does. – M.kazem Akhgary Nov 05 '15 at 19:06
  • life is too short to learn German! – Hamid Pourjam Nov 05 '15 at 19:23
  • @juharr: Well, in my case there may be more than one 'ß' in my input string that is not related to 'straße'. I can replace 'straße' with 'strasse', but there may be a 'sTraße' or a 'stRaße' ... you just never know what was entered. What I wanted to achieve was to replace any '*strasse' with '*str.' and any 'Straße' with 'Str.' – isHuman Nov 06 '15 at 10:36

1 Answer


The native Win32 API does expose the length of the string that was found. You can use P/Invoke to call FindNLSStringEx directly:

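using System;
using System.Globalization;
using System.Runtime.InteropServices;

static class CompareInfoExtensions
{
    // sortHandle is an LPARAM in the Win32 signature, so it is declared as IntPtr to stay pointer-sized on x64.
    [DllImport("kernel32.dll", CharSet = CharSet.Unicode, ExactSpelling = true)]
    private static extern int FindNLSStringEx(string lpLocaleName, uint dwFindNLSStringFlags, string lpStringSource, int cchSource, string lpStringValue, int cchValue, out int pcchFound, IntPtr lpVersionInformation, IntPtr lpReserved, IntPtr sortHandle);

    const uint FIND_FROMSTART = 0x00400000;

    // Returns the index of the match (or -1 if none) and reports the matched length in 'length'.
    public static int IndexOfEx(this CompareInfo compareInfo, string source, string value, int startIndex, int count, CompareOptions options, out int length)
    {
        // Argument validation omitted for brevity.
        // Note: startIndex, count and options are not forwarded to the native call here.
        return FindNLSStringEx(compareInfo.Name, FIND_FROMSTART, source, source.Length, value, value.Length, out length, IntPtr.Zero, IntPtr.Zero, IntPtr.Zero);
    }
}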

static class Program
{
    static void Main()
    {
        var s = "<<Gewerbegebiet Waldstraße>>";
        //var s = "<<Gewerbegebiet Waldstrasse>>";
        int length;
        int start = new CultureInfo("de-DE").CompareInfo.IndexOfEx(s, "strasse", 0, s.Length, CompareOptions.None, out length);
        Console.WriteLine(s.Substring(0, start) + s.Substring(start + length));
    }
}

I'm not seeing a way to do this using purely the BCL.
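Once both the start and the length of the match are known, the replacement the asker describes in the comments ('*strasse' to '*str.') is plain string surgery. A minimal sketch: `ReplaceRange` is a hypothetical helper, not a BCL method, and the start/length values below are assumed stand-ins for what a culture-aware search would report on the question's input.

```csharp
using System;

static class StringRangeExtensions
{
    // Hypothetical helper: replaces source[start .. start + length) with 'replacement'.
    public static string ReplaceRange(this string source, int start, int length, string replacement)
    {
        if (source == null) throw new ArgumentNullException(nameof(source));
        if (start < 0 || length < 0 || start > source.Length - length)
            throw new ArgumentOutOfRangeException(nameof(start));
        return source.Substring(0, start) + replacement + source.Substring(start + length);
    }
}

static class ReplaceRangeDemo
{
    static void Main()
    {
        // Assumed (start, length) pairs: 6 chars for 'straße', 7 for 'strasse'.
        Console.WriteLine("Gewerbegebiet Waldstraße".ReplaceRange(18, 6, "str."));
        Console.WriteLine("Gewerbegebiet Waldstrasse".ReplaceRange(18, 7, "str."));
    }
}
```

The bounds check also covers the not-found case: a start of -1 raises an exception instead of silently producing a mangled string.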

  • If I want to match 'Berliner Straße' and use CompareOptions.IgnoreCase this fails - any idea why? – isHuman Nov 06 '15 at 10:53
  • @isHuman I left out the conversion from `CompareOptions` to `FindNLSStringEx` option values: you can see that the `options` parameter does not get used. You will need to add a conversion from `CompareOptions.IgnoreCase` to either `LINGUISTIC_IGNORECASE` or `NORM_IGNORECASE` (to be determined by you). –  Nov 06 '15 at 10:58
  • This seems to work. I defined another flag `LINGUISTIC_IGNORECASE` and combined it with `FIND_FROMSTART` using `|`. Thank you, too bad that there is no high level approach doing this. – isHuman Nov 06 '15 at 11:05
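The conversion discussed in these comments could look roughly like the following. This is only a sketch: `NlsFlags` and `ToFindFlags` are hypothetical names, the constants are the values from winnls.h, and only `CompareOptions.IgnoreCase` is mapped (to `LINGUISTIC_IGNORECASE`, the choice reported to work above).

```csharp
using System;
using System.Globalization;

static class NlsFlags
{
    // Flag values as defined in winnls.h.
    public const uint FIND_FROMSTART = 0x00400000;
    public const uint NORM_IGNORECASE = 0x00000001;
    public const uint LINGUISTIC_IGNORECASE = 0x00000010;

    // Hypothetical helper: builds the dwFindNLSStringFlags argument from CompareOptions.
    public static uint ToFindFlags(CompareOptions options)
    {
        // Anything other than None or IgnoreCase is not handled by this sketch.
        if ((options & ~CompareOptions.IgnoreCase) != 0)
            throw new NotSupportedException("Only CompareOptions.IgnoreCase is mapped in this sketch.");

        uint flags = FIND_FROMSTART;
        if ((options & CompareOptions.IgnoreCase) != 0)
            flags |= LINGUISTIC_IGNORECASE;
        return flags;
    }
}
```

The result would take the place of the hard-coded `FIND_FROMSTART` argument in `IndexOfEx`. Whether `LINGUISTIC_IGNORECASE` or `NORM_IGNORECASE` is the better counterpart to `CompareOptions.IgnoreCase` is left open here, as it is in the comment above.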