
I am trying to read strings from files in C#. A typical line looks like this:

17401 "71690090" "HSG" "3384656" 0 "condensate pipe for boiler leaking" "avoid school run 2.30 -4 call route 07777766777" "" "07777766777" "07777766777 " 0 "0" "24 Hour" "YYYYYN" "YYYYYN" ? "00:00" 0 "H1" "Domestic-Repair" "HR" 0 "" ? "00:00" "Tom Timmy" 22/03/2017 "08:16" 22/03/2017 "08:18" 23/03/2017 "08:18" "2010" "Some Company" 2010 "Some Company Lot4" "" "Miss L Burton||90||Mount Pleasant" "Mount Pleasant|Pleasantville||XX1 1XX" "" "" "" 

Each line contains a mixture of:

1 - unquoted tokens with no spaces (alphanumeric, possibly with special characters)

2 - strings enclosed in double quotes, which may contain spaces and special characters; some are empty (`""`).

I am trying to split the line above into its fields (42 of them, by my count) and put them into a string array. I have come up with:

("[a-zA-Z0-9 .:+*,#/'~@;=+_)(&^%$£!`¬|-]+")|("")|([?])|((\d{2}\/\d{2}\/\d{4})|(\d+))

which I built on Regex101.com. However, when I put it into C# as:

string[] test1 = Regex.Split(line, @"(""[a-zA-Z0-9 .:+*,#/'~@;=+_)(&^%$£!`¬|-]+"")|("""")|([?])|((\d{2}\/\d{2}\/\d{4})|(\d+))");

I get 94 items in test1. I am trying to reproduce what Regex101.com shows, so that the line splits into 42 items. Can someone please point me in the right direction? Also, is there a more efficient approach than mine?

Solved:

var pattern = @"(""(?<value>[^""]+)""|(?<value>[^\s]+))";
var regex = new Regex(pattern, RegexOptions.Compiled);
string[] test1 = regex.Matches(line).Cast<Match>().Select(m => m.Value).ToArray();
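The snippet above returns `m.Value`, so quoted fields keep their surrounding quotes. A minimal sketch of a variant that reads the named group `value` instead, so the quotes are dropped, might look like this. The class name `FieldSplitter`, the method `SplitFields`, and the shortened sample line are assumptions made for illustration only.

```csharp
using System;
using System.Linq;
using System.Text.RegularExpressions;

public static class FieldSplitter
{
    // Same pattern as above: either a quoted run of non-quote characters,
    // or a run of non-whitespace characters. Both alternatives capture
    // into the named group "value".
    private static readonly Regex FieldPattern =
        new Regex(@"(""(?<value>[^""]+)""|(?<value>[^\s]+))", RegexOptions.Compiled);

    // Hypothetical helper: returns one array entry per field in the line.
    public static string[] SplitFields(string line)
    {
        return FieldPattern.Matches(line)
            .Cast<Match>()
            .Select(m => m.Groups["value"].Value) // captured text, without the outer quotes
            .ToArray();
    }

    public static void Main()
    {
        // Shortened sample in the same shape as the data in the question.
        string sample = "17401 \"HSG\" \"condensate pipe\" ? 22/03/2017 \"\"";
        foreach (string field in SplitFields(sample))
        {
            Console.WriteLine("[" + field + "]");
        }
    }
}
```

One thing to be aware of: because the quoted alternative uses `+`, an empty field (`""`) does not match it and falls through to the non-whitespace alternative, so it comes back as the two quote characters rather than as an empty string. Changing `[^""]+` to `[^""]*` should return an empty string for those fields instead.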

I didn't think I needed to overcomplicate this by using a CSV parser.

yigames
  • I think you should parse it as a CSV string. See http://stackoverflow.com/questions/6542996/how-to-split-csv-whose-columns-may-contain/6543418#6543418 You just need to set `parser.SetDelimiters(" ");` – Wiktor Stribiżew Jun 20 '17 at 11:19
  • As @WiktorStribiżew said: use a CSV parser. If you still want to do that using regex have a look at [this](http://ideone.com/Hor5lH) implementation. – Sebastian Schumann Jun 20 '17 at 11:27
  • Yes, perfect candidate for CSV parser with [space] delimiter. – spender Jun 20 '17 at 11:36
  • @yigames You can use the same pattern as you wrote in your question and use it in the code you used in the comment. The problem is not in the pattern, but in the usage. – Artholl Jun 20 '17 at 11:57
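To illustrate the last comment: `Regex.Split` treats the pattern as a delimiter, returns the text between matches, and also adds the text of any capturing groups to the result, which is why the array grows well beyond the number of fields. `Regex.Matches` returns one item per match. The sketch below reuses the pattern from the question on a shortened sample line; the class and method names are hypothetical.

```csharp
using System;
using System.Linq;
using System.Text.RegularExpressions;

public static class SplitVersusMatches
{
    // The pattern from the question, unchanged.
    private const string Pattern =
        @"(""[a-zA-Z0-9 .:+*,#/'~@;=+_)(&^%$£!`¬|-]+"")|("""")|([?])|((\d{2}\/\d{2}\/\d{4})|(\d+))";

    // Split: the pattern acts as a delimiter. The result holds the text
    // between matches plus the text of every capturing group that matched.
    public static string[] ViaSplit(string line)
    {
        return Regex.Split(line, Pattern);
    }

    // Matches: one entry per match, which is what Regex101 lists.
    public static string[] ViaMatches(string line)
    {
        return Regex.Matches(line, Pattern)
            .Cast<Match>()
            .Select(m => m.Value)
            .ToArray();
    }

    public static void Main()
    {
        // Shortened sample: a number, a quoted string, "?", and an empty quoted string.
        string line = "17401 \"HSG\" ? \"\"";

        Console.WriteLine("Split:   " + ViaSplit(line).Length + " items");
        Console.WriteLine("Matches: " + ViaMatches(line).Length + " items");
    }
}
```

On this sample, `ViaMatches` yields the four fields, while `ViaSplit` yields a longer array that mixes separators, empty strings, and repeated group captures.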
