I have strings with mathematical expressions like 2⁻¹² + 3³ / 4⁽³⁻¹⁾.

I want to convert these strings to the form of 2^-12 + 3^3 / 4^(3-1).

What I have so far extracts each run of superscript characters and prepends a ^ to it, but the characters themselves are not converted.

Fiddle of code below: https://dotnetfiddle.net/1G9ewP

using System;
using System.Text.RegularExpressions;
                    
public class Program
{
    private static string ConvertSuperscriptToText(Match m){
        string res = m.Groups[1].Value;
            
        res = "^" + res;
        return res;
    }
    public static void Main()
    {
        string expression = "2⁻¹² + 3³ / 4⁽³⁻¹⁾";
        string desiredResult = "2^-12 + 3^3 / 4^(3-1)";
        
        string supChars = "([¹²³⁴⁵⁶⁷⁸⁹⁰⁺⁻⁽⁾]+)";
        string result = Regex.Replace(expression, supChars, ConvertSuperscriptToText);

        Console.WriteLine(result); // Currently prints 2^⁻¹² + 3^³ / 4^⁽³⁻¹⁾
        Console.WriteLine(result == desiredResult); // Currently prints false
    }
}

How would I replace the superscript characters without replacing each one of them one by one?

If I have to replace them one by one, how can I do it with a collection, similar to PHP's str_replace, which accepts arrays as its search and replace arguments?

Bonus question: how can I replace all kinds of superscript characters with normal text, and convert them back to superscript?

Useme Alehosaini
Joshua Beckers
  • I think `res = "^" + res.Normalize(NormalizationForm.FormKD);` should do the trick. See: [How to convert super- or subscript to normal text in C#](https://stackoverflow.com/q/2673513/8967612) – 41686d6564 stands w. Palestine Apr 16 '20 at 22:11
  • @AhmedAbdelhameed The minus signs are a bit different but otherwise that works well. – juharr Apr 16 '20 at 22:19
  • It should be noted that `\p{No}` will match any superscript, subscript, or non 0-9 digit. Unfortunately there is nothing for just superscripts, but if you know you don't have any of those other characters you could use it instead of listing the superscript digits. You'd still need to list the superscript plus, minus, and parentheses: `@"([\p{No}⁽⁾⁻⁺]+)"`. – juharr Apr 16 '20 at 22:30
  • @AhmedAbdelhameed this looks very elegant. And works quite well except for the minus sign as juharr mentioned. Unfortunately, the minus is quite important for the whole shebang ;-). – Joshua Beckers Apr 16 '20 at 22:36
  • @juharr I tried the `@"([\p{No}⁽⁾⁻⁺]+)"` pattern already but could not work out how to do the replacement, so I went back to just listing the characters. I like AhmedAbdelhameed's solution; I might go with that, paired with cleaning up the minus sign at the end during further sanitization. – Joshua Beckers Apr 16 '20 at 22:39
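
The approach discussed in the comments above (compatibility normalization, followed by a cleanup of the minus sign) could look roughly like the following. This is a minimal sketch rather than a tested solution: it assumes the runtime's Unicode normalization is available, and it relies on the superscript minus (U+207B) decomposing to the Unicode MINUS SIGN (U+2212) instead of the ASCII hyphen-minus, which is why an extra `Replace` is added. The constant name `SupChars` is only illustrative.

```csharp
using System;
using System.Text;
using System.Text.RegularExpressions;

public class Program
{
    // Superscript digits, plus, minus, and parentheses, as listed in the question.
    private const string SupChars = "([¹²³⁴⁵⁶⁷⁸⁹⁰⁺⁻⁽⁾]+)";

    private static string ConvertSuperscriptToText(Match m)
    {
        // FormKD applies compatibility decomposition, which maps
        // superscript characters to their plain counterparts.
        string res = m.Groups[1].Value.Normalize(NormalizationForm.FormKD);

        // Assumption: the superscript minus becomes U+2212 (MINUS SIGN),
        // so it is mapped to the ASCII '-' explicitly.
        res = res.Replace('\u2212', '-');

        return "^" + res;
    }

    public static void Main()
    {
        string expression = "2⁻¹² + 3³ / 4⁽³⁻¹⁾";
        string desiredResult = "2^-12 + 3^3 / 4^(3-1)";

        string result = Regex.Replace(expression, SupChars, ConvertSuperscriptToText);

        Console.WriteLine(result);
        Console.WriteLine(result == desiredResult);
    }
}
```

The final comparison against `desiredResult` is a quick way to check whether the minus cleanup is sufficient on a given platform.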

1 Answer

You just need a dictionary that maps each superscript character to its plain equivalent. Then you can use LINQ to translate each matched character and build a new string from the result.

// Requires: using System.Collections.Generic; using System.Linq;
private static Dictionary<char, char> scriptMapping = new Dictionary<char, char>()
{
    ['¹'] = '1',
    ['²'] = '2',
    ['³'] = '3',
    ['⁴'] = '4',
    ['⁵'] = '5',
    ['⁶'] = '6',
    ['⁷'] = '7',
    ['⁸'] = '8',
    ['⁹'] = '9',
    ['⁰'] = '0',
    ['⁺'] = '+',
    ['⁻'] = '-',
    ['⁽'] = '(',
    ['⁾'] = ')',
};

private static string ConvertSuperscriptToText(Match m){
    string res = m.Groups[1].Value;

    res = "^" + new string(res.Select(c => scriptMapping[c]).ToArray());
    return res;
}

You could also build your regex from the dictionary, so there's only one place to add new superscripts.

string supChars = "([" + new string(scriptMapping.Keys.ToArray()) + "]+)";
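
Putting the pieces above together with the `Main` method from the question, a complete program might look like the sketch below. The field names `ScriptMapping` and `SupChars` are renamed here only for illustration. The pattern is built from the dictionary keys, so it must be initialized after the dictionary; static field initializers run in textual order, which is why the dictionary is declared first.

```csharp
using System;
using System.Collections.Generic;
using System.Linq;
using System.Text.RegularExpressions;

public class Program
{
    // Maps each superscript character to its plain-text equivalent.
    private static readonly Dictionary<char, char> ScriptMapping = new Dictionary<char, char>
    {
        ['¹'] = '1', ['²'] = '2', ['³'] = '3', ['⁴'] = '4', ['⁵'] = '5',
        ['⁶'] = '6', ['⁷'] = '7', ['⁸'] = '8', ['⁹'] = '9', ['⁰'] = '0',
        ['⁺'] = '+', ['⁻'] = '-', ['⁽'] = '(', ['⁾'] = ')',
    };

    // Character class built from the dictionary keys, so new mappings
    // only need to be added in one place.
    private static readonly string SupChars =
        "([" + new string(ScriptMapping.Keys.ToArray()) + "]+)";

    private static string ConvertSuperscriptToText(Match m)
    {
        // Translate every matched superscript character and prefix the run with '^'.
        return "^" + new string(m.Groups[1].Value.Select(c => ScriptMapping[c]).ToArray());
    }

    public static void Main()
    {
        string expression = "2⁻¹² + 3³ / 4⁽³⁻¹⁾";
        string desiredResult = "2^-12 + 3^3 / 4^(3-1)";

        string result = Regex.Replace(expression, SupChars, ConvertSuperscriptToText);

        Console.WriteLine(result);                  // prints 2^-12 + 3^3 / 4^(3-1)
        Console.WriteLine(result == desiredResult); // prints True
    }
}
```

Because every character the regex can match comes from the dictionary keys, the `ScriptMapping[c]` lookup cannot hit a missing key.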
juharr