I have to write a C# equivalent of this C++ code:

string val_in;
float val;
char unit[100];

val_in = NoSpace(val_in);

int nscan = sscanf(val_in.c_str(), "%f%s", &val, &unit);

if (nscan < 2) {
    return val_in; //do nothing if scan fail
}

where the NoSpace() method trims and removes all spaces in val_in.

I have looked around here on SO, and most of the similar questions involve strings that contain delimiters such as spaces or commas, so they don't apply to this case. So I turned to RegEx.

So far, I have this:

string val_in;
float val;
char[] unit = new char[100];

string[] val_arr;

val_in = NoSpace(val_in);

val_arr = Regex.Split(val_in, @"([-]?\d*\.?\d+)([a-zA-Z]+)");
val = Single.Parse(val_arr[1]);

if (val_arr.Length < 2) {
    return val_in; //do nothing if scan fail
}

It works so far, but I was wondering if there is another way to do this. I am a bit wary of RegEx because, according to the accepted answer on this question, having ([-]?\d*\.?\d+) instead of ([-]?(\d*\.)?\d+) is potentially dangerous because of evil RegEx. But if I include those extra parentheses, then I have an extra group. This causes Split() to split something like 123.456miles into an array with the elements,

{emptystr, 123.456, 123., miles}

This way, I can't be sure that the unit, miles in this case, will be in val_arr[2], which is a problem.
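
One possible way around the extra group is a non-capturing group, written (?:...), together with Regex.Match and named groups in place of Split(). The following is only a minimal sketch under assumptions: the class name ValueUnitParser, the method name TryParse, the group names val and unit, and the choice of invariant-culture parsing are all illustrative and not part of the original code.

```csharp
using System;
using System.Globalization;
using System.Text.RegularExpressions;

static class ValueUnitParser
{
    // (?:...) groups without capturing, so the optional "digits plus dot"
    // part adds no extra element. Named groups make the lookup independent
    // of how many groups the pattern contains.
    private static readonly Regex Pattern = new Regex(
        @"^(?<val>-?(?:\d*\.)?\d+)(?<unit>[a-zA-Z]+)$");

    // Hypothetical helper: returns false when the input is not "number + unit".
    public static bool TryParse(string input, out float val, out string unit)
    {
        val = 0f;
        unit = string.Empty;

        Match m = Pattern.Match(input);
        if (!m.Success)
            return false; // plays the role of the nscan < 2 check

        unit = m.Groups["unit"].Value;
        // Invariant culture so "." is always read as the decimal separator.
        return float.TryParse(m.Groups["val"].Value, NumberStyles.Float,
                              CultureInfo.InvariantCulture, out val);
    }
}
```

With this shape, the caller can return val_in unchanged whenever TryParse returns false, and the unit is always read by name instead of by array position.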

I tested this on this .NET RegEx tester. I also tried to break my RegEx pattern, ([-]?\d*\.?\d+), but it seems to be fine and "evil RegEx safe". So I'm not sure whether I should stick with what I've done so far or look for a more elegant solution, if one exists.
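
Whatever the pattern turns out to be, .NET also lets a Regex be constructed with a match timeout, which bounds the damage if a pattern does backtrack badly on some input. A minimal sketch, assuming a 100 ms limit and the hypothetical names GuardedMatcher and IsValueWithUnit (neither comes from the original code):

```csharp
using System;
using System.Text.RegularExpressions;

static class GuardedMatcher
{
    // Assumed limit for illustration; tune it to the inputs you expect.
    private static readonly TimeSpan Limit = TimeSpan.FromMilliseconds(100);

    // Same number pattern as in the question, anchored and followed by the unit.
    private static readonly Regex Pattern = new Regex(
        @"^-?\d*\.?\d+[a-zA-Z]+$", RegexOptions.None, Limit);

    // Returns false for non-matching input and for input that hits the time limit.
    public static bool IsValueWithUnit(string input)
    {
        try
        {
            return Pattern.IsMatch(input);
        }
        catch (RegexMatchTimeoutException)
        {
            return false; // treat a timeout like a failed scan
        }
    }
}
```

The timeout does not make a pattern safe by itself; it only turns a potential hang into an exception that can be handled like any other failed scan.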

skwear
  • What's this got to do with C#? – Izzy Oct 03 '16 at 16:42
  • Oh my goodness sorry. I forgot to say that I have to rewrite that C++ snippet in C#. Edited. – skwear Oct 03 '16 at 16:46
  • The C++ code is supposed to take a string made up of a number and a unit, I believe just "miles" or "Km" (though there may be variations, and I can't be sure at this point), and separate it into `float val` and `string unit`. I am trying to do something similar in my C# code, where the RegEx pattern will split `val_in` into a string array, then I can take `val_arr[1]` as "val" and `val_arr[2]` as "unit". – skwear Oct 03 '16 at 16:56
  • This question seems to be pretty badly received so far. I apologize if I haven't been clear enough. The RegEx pattern is supposed to take things like `123`, `123.456` and `.456` but will not match things like `123.` and `12.34.56`. – skwear Oct 03 '16 at 17:03
  • An example input would be `string val_in = "123.456miles"`. In the C++ code, this string would be split into `float val = 123.456` and `string unit = "miles"`. – skwear Oct 03 '16 at 17:05
  • Is your question about splitting a string into two parts, or about a regex pattern that validates a string prior to splitting it?? – BackDoorNoBaby Oct 03 '16 at 17:39
  • Both. I want to make sure I'm not taking the long way round, and if I'm not, I want to make sure that the pattern is safe. – skwear Oct 03 '16 at 18:13

1 Answer

Not very elegant, but can't you just look for the first letter in the string to know where your unit starts?

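  // Requires: using System; using System.Globalization;
  static bool SplitValAndUnit(string unsplitData, out float value, out string unit)
  {
     value = 0f;
     unit = string.Empty;
     for (int x = 0; x < unsplitData.Length; x++)
     {
        if (Char.IsLetter(unsplitData[x]))
        {
           // Everything before the first letter is the number; the rest is the unit
           unit = unsplitData.Substring(x);
           // Stop at the first letter and report whether the number parsed
           return Single.TryParse(unsplitData.Substring(0, x), NumberStyles.Float,
                                  CultureInfo.InvariantCulture, out value);
        }
     }
     return false; // no letter found, so there is no unit
  }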
BackDoorNoBaby
  • Note: The `Char.IsLetter` check can be replaced with a `RegEx` check to account for strange characters still considered letters – BackDoorNoBaby Oct 03 '16 at 17:31