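string x = "hello\u200B";
Console.WriteLine("\"" + x.Trim() + "\"");
Output: "hello" followed by an invisible trailing character. I want the output: "hello"
How do I deal with characters like this? The character is U+200B (zero-width space), which Trim() does not remove.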
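using System.Text.RegularExpressions;

public static class StringExtension
{
    // Runs of whitespace or U+200B (zero-width space) anchored at the start or end of the string.
    private static readonly string regExp = @"^[\s\u200b]+|[\s\u200b]+$";

    public static string TrimZSC(this string s)
        => Regex.Replace(s, regExp, "");
}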
var x = " he llo ";
Console.WriteLine(x);
Console.WriteLine(x.TrimZSC());
This trims whitespace and the zero-width space character (U+200B) from the beginning and the end of the given string. Characters in between, including any zero-width spaces, are left in place, as you would expect from a trim function.
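Because U+200B is invisible, it is hard to see in a pasted snippet what actually gets removed. Here is a minimal sketch that writes the character as a \u200B escape and checks lengths instead of relying on the console. The class name ZeroWidthTrimDemo and the helper TrimEdges are made up for this illustration; the pattern is the same idea as TrimZSC above.

```csharp
using System;
using System.Text.RegularExpressions;

public static class ZeroWidthTrimDemo
{
    // Hypothetical helper: strips whitespace and U+200B from both ends only.
    public static string TrimEdges(string s)
        => Regex.Replace(s, @"^[\s\u200B]+|[\s\u200B]+$", "");

    public static void Main()
    {
        // space, U+200B, "he", U+200B, "llo", U+200B, space
        var x = " \u200Bhe\u200Bllo\u200B ";
        var trimmed = TrimEdges(x);

        Console.WriteLine(x.Length);                 // 10
        Console.WriteLine(trimmed.Length);           // 6
        Console.WriteLine(trimmed == "he\u200Bllo"); // True: the inner U+200B is kept

        // The built-in Trim() only removes the outer spaces here,
        // so U+200B is still the first and last character.
        Console.WriteLine(x.Trim().Length);          // 8
    }
}
```

If you only ever need to drop this one character, x.Trim().Trim('\u200B') may be enough, but it will not handle spaces and zero-width spaces that alternate at the edges.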
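You can use Regex to write your own trim extension. For example, this trims all characters in the Unicode mark (M), separator (Z) and other (C) categories at the beginning and end. RegexOptions.Singleline only changes how . matches, so it is not strictly needed here; what should make this behave like the standard Trim is that RegexOptions.Multiline is not set, so ^ and $ anchor to the whole string rather than to each line.
I've split it into two expressions because a single one would have O(N^2) worst-case runtime. With GeneratedRegex (available from .NET 7), the performance shouldn't be terrible either.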
using System.Text.RegularExpressions;

internal static partial class MyTrimExtensions
{
    public static string TrimMarksSeparatorsOthers(this string toTrim)
    {
        // Leading run of mark/separator/other characters (may be empty).
        var toTrimFront = TrimMarksSeparatorsOthersFrontRegex().Match(toTrim);
        if (toTrimFront.Length == toTrim.Length)
            return "";
        // Trailing run, matched right-to-left so the scan starts at the end of the string.
        var toTrimBack = TrimMarksSeparatorsOthersBackRegex().Match(toTrim);
        return toTrim[toTrimFront.Length..toTrimBack.Index];
    }

    [GeneratedRegex("^[\\p{M}\\p{Z}\\p{C}]*", RegexOptions.Singleline)]
    private static partial Regex TrimMarksSeparatorsOthersFrontRegex();

    [GeneratedRegex("[\\p{M}\\p{Z}\\p{C}]*$", RegexOptions.Singleline | RegexOptions.RightToLeft)]
    private static partial Regex TrimMarksSeparatorsOthersBackRegex();
}
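If GeneratedRegex is not available in your project, the same category-based trim can be sketched without Regex by scanning from both ends with char.GetUnicodeCategory. This is a different technique from the one above, not a drop-in copy: the names CategoryTrim, IsTrimmable and TrimByCategory are made up for this illustration, and the sketch deliberately leaves surrogates (category Cs, which is part of \p{C}) alone so that characters outside the BMP, such as emoji, are not cut in half at the edges.

```csharp
using System;
using System.Globalization;

public static class CategoryTrim
{
    // Hypothetical predicate: marks (M*), separators (Z*) and most of "other" (C*).
    // Assumption: surrogates (Cs) are excluded on purpose, see the note above.
    private static bool IsTrimmable(char c)
    {
        switch (char.GetUnicodeCategory(c))
        {
            case UnicodeCategory.NonSpacingMark:
            case UnicodeCategory.SpacingCombiningMark:
            case UnicodeCategory.EnclosingMark:
            case UnicodeCategory.SpaceSeparator:
            case UnicodeCategory.LineSeparator:
            case UnicodeCategory.ParagraphSeparator:
            case UnicodeCategory.Control:
            case UnicodeCategory.Format:          // U+200B is in this category
            case UnicodeCategory.PrivateUse:
            case UnicodeCategory.OtherNotAssigned:
                return true;
            default:
                return false;
        }
    }

    // Single pass from each end, so the cost is linear in the string length.
    public static string TrimByCategory(string s)
    {
        int start = 0;
        int end = s.Length;
        while (start < end && IsTrimmable(s[start])) start++;
        while (end > start && IsTrimmable(s[end - 1])) end--;
        return s.Substring(start, end - start);
    }

    public static void Main()
    {
        Console.WriteLine("\"" + TrimByCategory("\u200B hello\u200B ") + "\""); // "hello"
        Console.WriteLine(TrimByCategory("\u200B \u200B").Length);              // 0
    }
}
```

Like the Regex version, this only touches the ends of the string, so a zero-width space in the middle of the text is kept.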