0

I have an input file like this:

input.txt

aa@aa.com bb@bb.com "Information" "Hi there"
cc@cc.com dd@dd.com "Follow up" "Interview"

I have used this method:

string[] words = item.Split(' ');

However, it splits on every space. The quoted strings also contain spaces, and I don't want to split on those.

Basically I want to parse this input from file to this output:

From = aa@aa.com
To = bb@bb.com
Subject = Information
Body = Hi there

How do I split these strings in C#?

bkm
  • Look up how to read from a file line by line, then look up how to split a string into an array by a space. – Spencer Wieczorek Apr 30 '17 at 03:37
  • I just made it more specific with an edit. Reading from the file is done, and splitting into an array is done. However, it splits at every space, including the ones inside the quotes. @SpencerWieczorek – bkm Apr 30 '17 at 03:39

5 Answers

5

You can simply use a Regex, as described in this question:

var stringValue = "aa@aa.com bb@bb.com \"Information\" \"Hi there\"";

var parts = Regex.Matches(stringValue, @"[\""].+?[\""]|[^ ]+")
            .Cast<Match>()
            .Select(m => m.Value)
            .ToList();

// parts: aa@aa.com
//        bb@bb.com
//        "Information"
//        "Hi there"

The quoted parts still include their " characters; you can remove them with the Replace method.
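
As an alternative to calling Replace afterwards, the quotes can be dropped during matching by using capture groups. The following is only a sketch under a few assumptions: the class name QuotedFieldSplitter and the method name Split are made up for illustration, the pattern is a variation on the one above (not the same pattern), and quoted fields are assumed not to contain embedded double quotes.

```csharp
using System;
using System.Collections.Generic;
using System.Text.RegularExpressions;

// Hypothetical helper, not part of any library.
public static class QuotedFieldSplitter
{
    // Either a double-quoted run (its content is captured in group 1, without the quotes)
    // or a run of non-space characters (captured in group 2).
    private static readonly Regex Token = new Regex("\"([^\"]*)\"|([^ ]+)");

    public static List<string> Split(string line)
    {
        var fields = new List<string>();
        foreach (Match m in Token.Matches(line))
        {
            // Prefer the quoted capture when that alternative matched.
            fields.Add(m.Groups[1].Success ? m.Groups[1].Value : m.Groups[2].Value);
        }
        return fields;
    }
}
```

Calling QuotedFieldSplitter.Split on one line of input.txt should give a list whose elements 0 to 3 can be assigned to From, To, Subject and Body, provided each line really has four fields.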

Hossein Narimani Rad
3

The String.Split() method has an overload that allows you to specify the number of splits required. You can get what you want like this:

  1. Read one line at a time
  2. Call input.Split(new string[] { " " }, 3, StringSplitOptions.None). This returns an array of at most 3 strings. Since email addresses don't contain spaces, the first two strings are the from/to addresses, and the third is the subject and message combined. Assuming the result is stored in firstSplit[], firstSplit[0] is the from address, firstSplit[1] is the to address, and firstSplit[2] is the subject and message combined.
  3. Call firstSplit[2].Split(new string[] { @""" """ }, 2, StringSplitOptions.None). This searches for the string " " (quote, space, quote) in the combined subject+message from the previous call, which pinpoints the separator between the end of the subject and the start of the message. It gives you the subject and message in another array. (The double quotes inside the verbatim string are doubled to escape them.) The subject still starts with a quote and the message still ends with one, so trim those off, for example with Trim('"').

This assumes you disallow double quotes in your subject and message. If you do allow them, you need to escape them before writing the line to the file in the first place.
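
The steps above could be put together roughly as follows. This is a minimal sketch, not a hardened parser: the class name MailLineParser and the method name Parse are invented for illustration, and it assumes every line is well formed (two addresses followed by exactly two quoted strings), so there is no error handling.

```csharp
using System;

// Hypothetical helper, not part of any library.
public static class MailLineParser
{
    // Returns { from, to, subject, body } for one line of the input file.
    public static string[] Parse(string line)
    {
        // Stage 1: at most 3 parts, so everything after the second space stays together.
        string[] firstSplit = line.Split(new[] { " " }, 3, StringSplitOptions.None);

        // Stage 2: split subject from body on the quote-space-quote separator.
        string[] rest = firstSplit[2].Split(new[] { "\" \"" }, 2, StringSplitOptions.None);

        return new[]
        {
            firstSplit[0],
            firstSplit[1],
            rest[0].TrimStart('"'), // subject still carries its opening quote
            rest[1].TrimEnd('"')    // body still carries its closing quote
        };
    }
}
```

Each line read from the file can be passed to Parse, and the four returned strings assigned to From, To, Subject and Body in that order.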

Phylyp
1

You can do this without regex by using just IndexOf and Substring. Put it in a loop if you have multiple emails to parse.

It's not pretty, but it should be faster than Regex if you're parsing a lot of them.

string content = @"abba@aa.com dddb@bdd.com ""Information"" ""Hi there""";

// The first email runs up to the first space.
string firstEmail = content.Substring(0, content.IndexOf(" ", StringComparison.Ordinal));

// The second email starts just after that space and runs up to the next one.
int secondStart = firstEmail.Length + 1;
string secondEmail = content.Substring(secondStart, content.IndexOf(" ", secondStart, StringComparison.Ordinal) - secondStart);

// Everything from the first quote onward is the subject and message.
int firstQuote = content.IndexOf("\"", StringComparison.Ordinal);
string subjectandMessage = content.Substring(firstQuote);

// Split on the quote-space-quote sequence between subject and message.
String[] words = subjectandMessage.Split(new string[] { "\" \"" }, StringSplitOptions.None);

Console.WriteLine(firstEmail);
Console.WriteLine(secondEmail);
Console.WriteLine(words[0].Remove(0, 1));                // drop the leading quote
Console.WriteLine(words[1].Remove(words[1].Length - 1)); // drop the trailing quote

Output:

abba@aa.com
dddb@bdd.com
Information
Hi there
prospector
  • @bkm Yeah, but one problem is that if there are quotes in the email content itself, you'd have to handle that in the regex. In my solution, instead of using Split, you should do an `IndexOf` on the first quote after the subject and on the last quote, to get everything in between. – prospector Apr 30 '17 at 04:16
  • There is only one issue, in secondEmail: it is parsed with the same length as firstEmail. If firstEmail => aa@aa.com and secondEmail => bb@bbbb.com, it comes out as secondEmail = bb@bbb.c @prospector – bkm Apr 30 '17 at 05:42
0

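As Spencer pointed out, read the file line by line using the File.ReadAllLines() method, and then apply the String.Split() method to each line to split on spaces, using something like this (line is one line of the file):

string[] elements = line.Split(new char[0]); // an empty separator array splits on whitespace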

UPDATE

Not a pretty solution, but this is how I think it can work:

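   string[] lines = File.ReadAllLines("input.txt");

   foreach (string line in lines)
   {
    string[] fields = line.Split(' ');
    //Take value of first 3 fields by simple fields[index]; (index: 0-2)

    //Re-join the remaining pieces, which were split on the spaces inside the last quoted string
    string temp = "";

    for(int i=3; i<fields.Length; i++)
    {
     temp += (i > 3 ? " " : "") + fields[i];
    }
   }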
Failed Scientist
  • That won't work, since there are going to be spaces in the subject and message of the email. – prospector Apr 30 '17 at 03:41
  • Reading from the file and splitting on spaces is fine, but it splits on every space. I need to split on spaces for the first 2 strings, and after that only on the quotes, not on the spaces inside them. @TalhaIrfan – bkm Apr 30 '17 at 03:42
0

This requires a reference to Microsoft.VisualBasic, but it is a bit more reliable than Regex:

using (var tfp = new Microsoft.VisualBasic.FileIO.TextFieldParser("input.txt")) {
    for (tfp.SetDelimiters(" "); !tfp.EndOfData;) {
        string[] fields = tfp.ReadFields(); 
        Debug.Print(string.Join(",", fields)); // "aa@aa.com,bb@bb.com,Information,Hi there"
    }
}
Slai