0

I have a text file. One of the columns contains a field which contains text along with numbers.

I'm trying to figure out the best way to split the numbers and text.

Below is an example of the typical values in the field.

.2700 Aqr sh./Tgt sh.

USD 2.4700/Tgt sh.

Currently I'm using the Split function (code below), but I feel there is probably a smarter way of doing this.

My assumption is that there will only ever be one number in the text (I'm 99% sure this is the case). However, I have only seen a few examples, so it's possible my code below will not work.

I have read a little on regex, but I'm not sure I tested it properly, as it didn't quite produce the output I wanted. For example:

string input = "USD 2.4700/Tgt sh.";

string[] numbers = Regex.Split(input, @"\D+");
foreach (string value in numbers)
{
    if (!string.IsNullOrEmpty(value))
    {
        int i = int.Parse(value);
        Console.WriteLine("Number: {0}", i);
    }
}

But the output is,

Number: 2
Number: 4700

Whereas I was expecting 2.47, and I also don't want to lose the text. My desired result is:

myText = "USD Tgt sh." myNum = 2.47

For the other example

myText = "Aqr sh./Tgt sh." myNum = 0.27

My Code

string[] sData = sTerms.Split(' ');

double num;
bool isNum = double.TryParse(sData[0], out num);

if(isNum)
{
    ma.StockTermsNum = num;

    StringBuilder sb = new StringBuilder();
    for (int i = 1; i < sData.Length; i++)
        sb = sb.Append(sData[i] + " ");

    ma.StockTerms = sb.ToString();
}
else
{
    string[] sNSplit = sData[1].Split('/');

    ma.StockTermsNum = Convert.ToDouble(sNSplit[0]);

    StringBuilder sb = new StringBuilder();
    for (int i = 0; i < sData.Length; i++)
    {
        if (i == 1)                            
            sb = sb.Append(sNSplit[i] + " ");                            
        else
            sb = sb.Append(sData[i] + " ");
    }                            
    ma.StockTerms = sb.ToString();
}
mHelpMe
  • Easy. Build a pattern to match the numbers of the format you need. Then use `Regex.Split(s, @"(YOUR_NUMBER_REGEX)").Where(x => !string.IsNullOrEmpty(x))`. Matching numbers is a solved task - see https://www.regular-expressions.info/floatingpoint.html – Wiktor Stribiżew Apr 06 '18 at 10:42
  • @WiktorStribiżew thanks for your reply. Will see if I can get that to work & hopefully it is easy :-) – mHelpMe Apr 06 '18 at 10:47
  • The problem is now with what exactly you call "numbers" :) The pattern requirements are missing in the question. – Wiktor Stribiżew Apr 06 '18 at 10:48
  • Your output looks like that because the first code snippet parses with int, not double. It seems to me that both snippets work: the values of ma.StockTerms and ma.StockTermsNum come out as you expected – raichiks Apr 06 '18 at 10:55
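
The suggestion in the first comment can be sketched roughly as follows. This is only an illustration: the class and method names are made up here, and the number pattern is an assumption (optional integer part, optional dot, at least one digit), since the exact number format is not specified in the question.

```csharp
using System;
using System.Linq;
using System.Text.RegularExpressions;

public static class SplitSketch
{
    // Assumed number format, wrapped in (...) so that Regex.Split keeps the match.
    private const string NumberPattern = @"([0-9]*\.?[0-9]+)";

    // Hypothetical helper: splits the input into text and number pieces,
    // dropping the empty strings Split produces at the edges.
    public static string[] SplitKeepingNumbers(string s)
    {
        return Regex.Split(s, NumberPattern)
                    .Where(x => !string.IsNullOrEmpty(x))
                    .ToArray();
    }

    public static void Main()
    {
        foreach (string piece in SplitKeepingNumbers("USD 2.4700/Tgt sh."))
            Console.WriteLine("[{0}]", piece);
    }
}
```

For "USD 2.4700/Tgt sh." this yields the three pieces "USD ", "2.4700" and "/Tgt sh."; the caller still has to decide which piece is the number.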

3 Answers

2

I suggest splitting on a capturing group, (...), so that the delimiter (the number) is preserved in the result:

  string source = @".2700 Aqr sh./Tgt sh.";
  //string source = "USD 2.4700/Tgt sh.";

  // please, notice "(...)" in the pattern - group
  string[] parts = Regex.Split(source, @"([0-9]*\.?[0-9]+)");

  // combining all texts
  string myText   = string.Concat(parts.Where((v, i) => i % 2 == 0));
  // combining all numbers
  string myNumber = string.Concat(parts.Where((v, i) => i % 2 != 0));

Tests:

  string[] tests = new string[] {
     @".2700 Aqr sh./Tgt sh.",
     @"USD 2.4700/Tgt sh.",
  };

  var result = tests
    .Select(test => new {
      text = test,
      parts = Regex.Split(test, @"([0-9]*\.?[0-9]+)"),
    })
    .Select(item => new {
      text = item.text,
      myText = string.Concat(item.parts.Where((v, i) => i % 2 == 0)),
      myNumber = string.Concat(item.parts.Where((v, i) => i % 2 != 0)),
    })
    .Select(item => $"{item.text,-25} : {item.myNumber,-15} : {item.myText}");

  Console.WriteLine(string.Join(Environment.NewLine, result));

Outcome:

 .2700 Aqr sh./Tgt sh.     : .2700           :  Aqr sh./Tgt sh.
 USD 2.4700/Tgt sh.        : 2.4700          : USD /Tgt sh.
Dmitry Bychenko
  • Thanks for your answer! One question though how would I get item.myNumber & item.myText into a variable? – mHelpMe Apr 06 '18 at 12:55
  • @mHelpMe: Sorry, I don't follow you. Do you mean `.Select(item => new {myNum = double.Parse(item.myNumber), myText = item.myText})` instead of the last `Select`? Or even something like `.Select(item => new StockTerm(double.Parse(item.myNumber), item.myText));` – Dmitry Bychenko Apr 06 '18 at 13:04
  • No sorry I confused myself, not hard to do! The answer you supplied is great – mHelpMe Apr 06 '18 at 13:12
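
For reference, the idea from the comment above (getting the number and the text into typed values) might look like the sketch below. `StockTerm` and `StockTermParser` are hypothetical names, not types from the question, and the code assumes exactly one number per field: with no number, or with several, `double.Parse` throws a `FormatException`. Parsing with `CultureInfo.InvariantCulture` is also an assumption, based on the dot-separated sample values.

```csharp
using System;
using System.Globalization;
using System.Linq;
using System.Text.RegularExpressions;

// Hypothetical container for the two halves of the field.
public sealed class StockTerm
{
    public StockTerm(double number, string text)
    {
        Number = number;
        Text = text;
    }

    public double Number { get; }
    public string Text { get; }
}

public static class StockTermParser
{
    public static StockTerm Parse(string source)
    {
        // Even indexes hold text, odd indexes hold the captured numbers.
        string[] parts = Regex.Split(source, @"([0-9]*\.?[0-9]+)");

        string text = string.Concat(parts.Where((v, i) => i % 2 == 0)).Trim();
        string number = string.Concat(parts.Where((v, i) => i % 2 != 0));

        // Throws FormatException unless the field held exactly one number.
        return new StockTerm(double.Parse(number, CultureInfo.InvariantCulture), text);
    }

    public static void Main()
    {
        StockTerm term = Parse("USD 2.4700/Tgt sh.");
        Console.WriteLine("myText = \"{0}\" myNum = {1}", term.Text,
            term.Number.ToString(CultureInfo.InvariantCulture));
    }
}
```

Note that the text keeps the slash ("USD /Tgt sh."), so a further `Replace("/", "")` would be needed to get exactly the "USD Tgt sh." form from the question.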
0

It could be something like this regex:

string input = "USD 2.4700/Tgt sh.";

// Digits before the dot are optional, so a value like ".2700" is matched whole
var numbers = Regex.Matches(input, @"\d*\.?\d+");
foreach (Match res in numbers)
{
    if (!string.IsNullOrEmpty(res.Value))
    {
        decimal i = decimal.Parse(res.Value);
        Console.WriteLine("Number: {0}", i);
    }
}
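
The loop above only extracts the numbers. If the remaining text is needed as well, one possible sketch (assuming at most one number per field, as stated in the question) is to cut the matched span out of the input using `Match.Index` and `Match.Length`. The names `SingleNumberSplit` and `TrySplit` are made up for this illustration.

```csharp
using System;
using System.Globalization;
using System.Text.RegularExpressions;

public static class SingleNumberSplit
{
    // Returns false, leaving the text unchanged, when no number is found.
    public static bool TrySplit(string input, out decimal number, out string text)
    {
        Match m = Regex.Match(input, @"[0-9]*\.?[0-9]+");

        if (m.Success &&
            decimal.TryParse(m.Value, NumberStyles.AllowDecimalPoint,
                             CultureInfo.InvariantCulture, out number))
        {
            // Remove only the matched span, not every occurrence of the same digits.
            text = input.Remove(m.Index, m.Length).Trim();
            return true;
        }

        number = 0m;
        text = input;
        return false;
    }

    public static void Main()
    {
        if (TrySplit(".2700 Aqr sh./Tgt sh.", out decimal num, out string txt))
            Console.WriteLine("myText = \"{0}\" myNum = {1}", txt, num);
    }
}
```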
Alexey Klipilin
0

I suggest using System.Text.RegularExpressions.Regex. Here is an example of how you can do it:

static void Main(string[] args)
{
    string a1 = ".2700 Aqr sh./Tgt sh.";
    string a2 = "USD 2.4700/Tgt sh.";
    var firstStringNums = GetNumbersFromString(ref a1);
    Console.Write("My Text: {0}",a1);
    Console.Write("myNums: ");
    foreach(double a in firstStringNums)
    {
        Console.Write(a +"\t");
    }
    var secondStringNums = GetNumbersFromString(ref a2);
    Console.Write("My Text: {0}", a2);
    Console.Write("myNums: ");
    foreach (double a in secondStringNums)
    {
        Console.Write(a + "\t");
    }
}

public static List<double> GetNumbersFromString(ref string input)
{
    List<double> result = new List<double>();
    Regex r = new Regex("[0-9.,]+");
    var numsFromString = r.Matches(input);
    foreach(Match a in numsFromString)
    {
        if(double.TryParse(a.Value,out double val))
        {
            result.Add(val);
            input =input.Replace(a.Value, "");
        }
    }
    return result;
}

The pattern is just an example and, of course, will not cover every case you can imagine.
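
One case worth keeping in mind is the machine's culture: `double.TryParse(a.Value, out double val)` uses the current culture, so the same token can parse differently on a machine where the comma is the decimal separator. A hedged sketch of a culture-independent check follows, assuming the file always uses a dot as the decimal separator; `InvariantNumber` is a made-up name.

```csharp
using System;
using System.Globalization;

public static class InvariantNumber
{
    // Accepts tokens such as "2.4700" or ".2700". NumberStyles.Float does not
    // allow group separators, so "2,4700" and a lone "." are rejected.
    public static bool TryParse(string token, out double value)
    {
        return double.TryParse(token, NumberStyles.Float,
                               CultureInfo.InvariantCulture, out value);
    }

    public static void Main()
    {
        foreach (string token in new[] { "2.4700", ".2700", "2,4700", "." })
        {
            bool ok = TryParse(token, out double value);
            Console.WriteLine("{0,-8} parsed: {1}", token, ok);
        }
    }
}
```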

Samvel Petrosov