I have written this method to reverse a string:
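// Requires: using System.Collections.Generic; using System.Globalization; using System.Linq;
public string Reverse(string s)
{
    if (string.IsNullOrEmpty(s))
        return s;

    TextElementEnumerator enumerator =
        StringInfo.GetTextElementEnumerator(s);
    var elements = new List<char>();
    while (enumerator.MoveNext())
    {
        var cs = enumerator.GetTextElement().ToCharArray();
        if (cs.Length > 1)
        {
            // Pre-reverse multi-char elements so the final reversal restores their internal order.
            elements.AddRange(cs.Reverse());
        }
        else
        {
            elements.AddRange(cs);
        }
    }

    // Reverse the whole buffer: elements end up in reverse order, each one intact.
    elements.Reverse();
    return string.Concat(elements);
}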
Now, I don't want to start a discussion about how this code could be made more efficient or how there are one-liners that I could use instead. I'm aware that you can perform XORs and all sorts of other things to potentially improve this code. If I want to refactor the code later, I can do that easily as I have unit tests.
Currently, this correctly reverses BMP strings (including strings with accents like "Les Misérables") and strings that contain combining characters such as "Les Mise\u0301rables".
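For context, here is a minimal sketch of how a string with a combining character splits into text elements, which is what the method relies on. The class and method names (`TextElementSketch`, `TextElements`) are hypothetical and used only for illustration; `StringInfo.GetTextElementEnumerator` is the same call used in the method.

```csharp
using System;
using System.Collections.Generic;
using System.Globalization;

// Hypothetical helper, for illustration only.
public static class TextElementSketch
{
    // Splits a string into text elements as .NET defines them
    // (a base character plus any combining marks stays together).
    public static List<string> TextElements(string s)
    {
        var result = new List<string>();
        TextElementEnumerator enumerator = StringInfo.GetTextElementEnumerator(s);
        while (enumerator.MoveNext())
        {
            result.Add(enumerator.GetTextElement());
        }
        return result;
    }
}
```

With "Les Mise\u0301rables" the string holds 15 chars but should split into 14 text elements, because "e" and the combining acute accent U+0301 form a single two-char element.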
My test that contains surrogate pairs works if they are expressed like this:
Assert.AreEqual("", _stringOperations.Reverse(""));
But if I express surrogate pairs like this:
Assert.AreEqual("\u10000", _stringOperations.Reverse("\u10000"));
then the test fails. Is there an air-tight implementation that supports surrogate pairs as well?
If I have made any mistake above then please do point this out as I'm no Unicode expert.
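One thing that may be worth checking is what the escaped literal actually contains. The sketch below assumes the usual C# escape rules, where `\u` consumes exactly four hex digits and `\U` consumes eight. The names `EscapeSketch`, `FourDigitEscape`, `EightDigitEscape` and `Describe` are hypothetical and exist only for this illustration.

```csharp
using System;

// Hypothetical helper, for illustration only.
public static class EscapeSketch
{
    // "\u" takes exactly four hex digits, so this is U+1000 followed by the digit '0'.
    public const string FourDigitEscape = "\u10000";

    // "\U" takes eight hex digits, so this is the single code point U+10000,
    // stored in UTF-16 as a surrogate pair.
    public const string EightDigitEscape = "\U00010000";

    // Formats each UTF-16 code unit of the string as four hex digits.
    public static string Describe(string s)
    {
        var parts = new string[s.Length];
        for (int i = 0; i < s.Length; i++)
        {
            parts[i] = ((int)s[i]).ToString("X4");
        }
        return string.Join(" ", parts);
    }
}
```

`EscapeSketch.Describe(EscapeSketch.FourDigitEscape)` gives "1000 0030", while `EscapeSketch.Describe(EscapeSketch.EightDigitEscape)` gives "D800 DC00". If that reading is right, "\u10000" is not a surrogate pair at all but two separate BMP characters, so its reverse would be "0\u1000" and the second assertion would fail regardless of how the method handles surrogates.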