
Is there a built-in .NET function or an easy way to convert from:

"01234"

to:

"\u2070\u00B9\u00B2\u00B3\u2074"

Note that superscript 1, 2 and 3 are not in the range \u2070-\u209F but in \u0080-\u00FF.

dtb
Dave Hillier

1 Answer


EDIT: I hadn't noticed that the superscript characters weren't as simple as \u2070-\u2079. You probably want to set up a mapping between characters. If you only need digits, you could just index into a string fairly easily:

const string SuperscriptDigits = 
    "\u2070\u00b9\u00b2\u00b3\u2074\u2075\u2076\u2077\u2078\u2079";

Then using LINQ:

string superscript = new string(text.Select(x => SuperscriptDigits[x - '0'])
                                    .ToArray());

Or without:

char[] chars = text.ToCharArray(); // ToCharArray() is a string method; ToArray() would still need LINQ
for (int i = 0; i < chars.Length; i++)
{
    chars[i] = SuperscriptDigits[chars[i] - '0'];
}
string superscript = new string(chars);
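
Both snippets above assume every character in `text` is an ASCII digit; anything else either throws an `IndexOutOfRangeException` or picks the wrong character. A minimal sketch of the more general "mapping between characters" idea follows. The class name `SuperscriptConverter`, the method name `ToSuperscript`, and the choice to pass unmapped characters through unchanged are all assumptions for illustration, not part of the original answer, and the `+`/`-` entries are optional extras.

```csharp
using System;
using System.Collections.Generic;
using System.Text;

// Hypothetical helper, not a built-in .NET type.
public static class SuperscriptConverter
{
    // Lookup table from ordinary characters to their Unicode superscript forms.
    // Note that 1, 2 and 3 live in the Latin-1 Supplement block (U+00B9, U+00B2, U+00B3).
    private static readonly Dictionary<char, char> Map = new Dictionary<char, char>
    {
        { '0', '\u2070' }, { '1', '\u00B9' }, { '2', '\u00B2' }, { '3', '\u00B3' },
        { '4', '\u2074' }, { '5', '\u2075' }, { '6', '\u2076' }, { '7', '\u2077' },
        { '8', '\u2078' }, { '9', '\u2079' },
        // Optional extras (an assumption about what a caller might want):
        { '+', '\u207A' }, { '-', '\u207B' }
    };

    public static string ToSuperscript(string text)
    {
        if (text == null) throw new ArgumentNullException("text");

        var builder = new StringBuilder(text.Length);
        foreach (char c in text)
        {
            char mapped;
            // Characters without a mapping are copied through unchanged.
            builder.Append(Map.TryGetValue(c, out mapped) ? mapped : c);
        }
        return builder.ToString();
    }
}
```

With this sketch, `SuperscriptConverter.ToSuperscript("01234")` would give `"\u2070\u00B9\u00B2\u00B3\u2074"`, and a letter such as `x` would be left as it is rather than causing an exception. Whether to pass unknown characters through or reject them is a design choice that depends on the input you expect.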
Jon Skeet
  • The codepoints for superscript 1-3 are somewhere else: http://unicode.org/charts/PDF/U2070.pdf – dtb Jun 21 '11 at 20:33
  • I think it would be easier using `String.Concat()` instead of the string constructor with the LINQ approach. That is unless there's a significant performance difference between the two. – Jeff Mercado Jun 21 '11 at 20:46
  • @Jeff: I don't see how that would be easier. – Jon Skeet Jun 21 '11 at 20:50
  • @Jon: The constructor overload requires an array of characters, while a generic overload of `Concat()` accepts an `IEnumerable<T>`, so the `ToArray()` call wouldn't be necessary. That's my reasoning anyhow. – Jeff Mercado Jun 21 '11 at 20:52
  • @Jeff: I can see a `Concat<T>(IEnumerable<T>)` and a `Concat(IEnumerable<string>)` call, but not `Concat(IEnumerable<char>)`. I suspect the former would call `ToString` on each element, which would create a bunch of strings unnecessarily. Also, these calls are only available in .NET 4, whereas my approach would also work in .NET 3.5. I think I'd rather stick with ToArray and the constructor :) – Jon Skeet Jun 21 '11 at 20:54
  • @Jon: Fair enough. And it does have a slight performance difference as you pointed out. I'll have to try to remember to avoid this with character conversions. – Jeff Mercado Jun 21 '11 at 20:57
  • There is a typo with the digits \u20b9 should be \u00b9 – Dave Hillier Jul 21 '11 at 16:16