0

I'm trying to split a string into parts to pass into a LINQ database, but I've run into a problem. The file is a .csv file, so its fields are separated by commas, for example:

1,Ms,Aleshia,Tomkiewicz,14 Taylor St,St. Stephens Ward,Kent,CT2 7PP,01835-703597,atomkiewicz@hotmail.com.

However, some fields contain commas inside the data itself, such as a county or an address. I don't want those fields to be split; the value should stay together, for example the address: London,Wimbledon.

I'm currently using this code to do the splitting:


    public static List<string> ReturnCSVFromWeb(string url)
    {
        System.Net.WebClient client = new WebClient();
        string CSVContent = client.DownloadString(url);

        List<string> splitted = new List<string>();
        string csvFile = CSVContent;
        string[] tempStr;

        tempStr = csvFile.Split(',', '\n');

        foreach (string item in tempStr)
        {
            if (!string.IsNullOrWhiteSpace(item))
            {
                splitted.Add(item);
            }
        }

        return splitted;
    }
Backs
H.Pearson
  • 3
    Please **[Stop Rolling Your Own CSV Parser](http://www.secretgeek.net/csv_trouble)**. Also respect a standard format for your data. You don't seem to have CSV but rather some custom format if `London,Wimbledon` should not be separated. You should be thinking of how to properly encode your data in the first place. – Darin Dimitrov Nov 29 '15 at 15:54
  • We're not allowed to use third-party software, only standard .NET libraries – H.Pearson Nov 29 '15 at 15:57
  • 2
    Then better be prepared to descend to the abyss. And please don't tag your question with `CSV` because clearly that's not the format that you have as input. Rather, specify that you have some custom format that somebody invented and make sure that you define this format very precisely before trying to parse it. – Darin Dimitrov Nov 29 '15 at 15:57
  • Are you sure that the address values are not just individual columns? – DavidG Nov 29 '15 at 15:59
  • Sorry, I meant we have an address like this in one field: Broxburn, Uphall and Winchburg, which is separated by commas – H.Pearson Nov 29 '15 at 15:59
  • Sorry for the bad description. The CSV file looks like this: 14,Mr,Niesha,Bruch,24 Bolton St,"Broxburn, Uphall and Winchburg",West Lothian,EH52 (notice how "Broxburn, Uphall" has a comma in the address) – H.Pearson Nov 29 '15 at 16:02
  • Do *all* addresses have commas in them in your data? – Lasse V. Karlsen Nov 29 '15 at 16:06
  • No, not all addresses have commas in the data, only the ones surrounded by "" like "Broxburn, Uphall and Winchburg" – H.Pearson Nov 29 '15 at 16:08
  • As @Backs said in his answer's comments, split by `"` first - you'll get an address, then split the other parts by `,` ... Also, your company needs to re-evaluate two things. First, it's stupid to do everything in house and reject any third-party code outright, as the core .NET framework won't even implement some features because of well-established libraries such as Newtonsoft.Json and many others. Second, if you store data in a format such as CSV, don't use illegal characters in the data in the first place. If you absolutely needed `,` in the data, then the CSV should've used `;` or `\t` separators... – LostBalloon Nov 29 '15 at 16:30
  • Possible duplicate of [How to split csv whose columns may contain ,](http://stackoverflow.com/questions/6542996/how-to-split-csv-whose-columns-may-contain) – LostBalloon Nov 29 '15 at 16:39
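
Given the "only standard .NET libraries" constraint raised in the comments above, one option that stays inside the framework is `Microsoft.VisualBasic.FileIO.TextFieldParser`, which can read delimited text with quoted fields. The following is only a minimal sketch under that assumption; the class name `CsvReading` and the method `ParseCsv` are hypothetical and not part of the question.

```csharp
using System.Collections.Generic;
using System.IO;
using Microsoft.VisualBasic.FileIO; // part of .NET; older .NET Framework projects need a reference to Microsoft.VisualBasic

public static class CsvReading
{
    // Hypothetical helper: parses CSV text into rows of fields,
    // treating commas inside double-quoted fields as data.
    public static List<string[]> ParseCsv(string csvContent)
    {
        var rows = new List<string[]>();
        using (var parser = new TextFieldParser(new StringReader(csvContent)))
        {
            parser.TextFieldType = FieldType.Delimited;
            parser.SetDelimiters(",");
            parser.HasFieldsEnclosedInQuotes = true;

            while (!parser.EndOfData)
            {
                rows.Add(parser.ReadFields());
            }
        }
        return rows;
    }
}
```

The string returned by `client.DownloadString(url)` could be passed straight to `ParseCsv`; each row then comes back as its own array instead of one flat list of values.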

2 Answers

6

There is no way to solve this problem unless you know ahead of time which fields contain commas. A better option would be to have each entry in the CSV surrounded by double quotes and then separated by commas.
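
When fields are quoted like that, a reader has to track whether it is currently inside quotes and split only on commas outside them. Here is a rough sketch of that idea, assuming one record per line (no line breaks inside quoted fields) and `""` as the escape for a literal quote; `QuotedCsv` and `SplitLine` are hypothetical names.

```csharp
using System.Collections.Generic;
using System.Text;

public static class QuotedCsv
{
    // Hypothetical helper: splits one CSV line on commas that are outside double quotes.
    public static List<string> SplitLine(string line)
    {
        var fields = new List<string>();
        var current = new StringBuilder();
        bool inQuotes = false;

        for (int i = 0; i < line.Length; i++)
        {
            char c = line[i];
            if (c == '"')
            {
                if (inQuotes && i + 1 < line.Length && line[i + 1] == '"')
                {
                    current.Append('"'); // "" inside a quoted field is an escaped quote
                    i++;
                }
                else
                {
                    inQuotes = !inQuotes; // opening or closing quote, not part of the value
                }
            }
            else if (c == ',' && !inQuotes)
            {
                fields.Add(current.ToString()); // field boundary
                current.Clear();
            }
            else
            {
                current.Append(c);
            }
        }

        fields.Add(current.ToString()); // last field has no trailing comma
        return fields;
    }
}
```

Unlike a plain `Split(',')`, this keeps empty fields, so column positions stay aligned from row to row.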

yemista
  • Hmm, thought so, but we are not allowed to alter the data as it's part of a coursework :/ – H.Pearson Nov 29 '15 at 15:57
  • Sorry for the bad description. The CSV file looks like this: 14,Mr,Niesha,Bruch,24 Bolton St,"Broxburn, Uphall and Winchburg",West Lothian,EH52 (notice how "Broxburn, Uphall" has a comma in the address) – H.Pearson Nov 29 '15 at 16:04
  • Notice how that string is enclosed in double quotes. That makes it standard CSV. Why did you omit the quotes in the original post? Did you not see them as significant? – Jim Garrison Nov 29 '15 at 17:15
2

If the number of columns is fixed and only the address column contains commas, you can do this:

  1. Split the row on commas, giving M parts
  2. Take the first N1 columns before the address
  3. Take the last N2 columns after the address
  4. Take the M-N1-N2 parts left in the middle and join them - that's the address
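
The steps above could be sketched roughly as follows. The parameters `expectedColumns` and `addressIndex` (zero-based) are assumptions about the file layout that the caller has to supply, and `FixedColumnCsv.SplitRow` is a hypothetical name.

```csharp
using System;
using System.Collections.Generic;

public static class FixedColumnCsv
{
    // Hypothetical helper. Assumes every row has expectedColumns logical columns
    // and that only the column at addressIndex may contain commas.
    public static List<string> SplitRow(string row, int expectedColumns, int addressIndex)
    {
        string[] parts = row.Split(',');
        int extra = parts.Length - expectedColumns; // surplus parts caused by commas in the address
        if (extra < 0)
            throw new FormatException("Row has fewer columns than expected.");

        var fields = new List<string>();

        // Columns before the address are taken as-is.
        for (int i = 0; i < addressIndex; i++)
            fields.Add(parts[i]);

        // Rejoin the middle parts into the single address field.
        fields.Add(string.Join(",", parts, addressIndex, extra + 1));

        // Columns after the address are taken as-is.
        for (int i = addressIndex + extra + 1; i < parts.Length; i++)
            fields.Add(parts[i]);

        return fields;
    }
}
```

For example, `FixedColumnCsv.SplitRow("3,Ms,A,B,London,Wimbledon,Kent,CT2", 7, 4)` yields seven fields with `London,Wimbledon` as the fifth. Any quote characters around the address are left in place by this approach.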
Backs
  • 3
    This assumes of course that only one column contains a comma in the value – Rotem Nov 29 '15 at 16:00
  • Sorry for the bad description. The CSV file looks like this: 14,Mr,Niesha,Bruch,24 Bolton St,"Broxburn, Uphall and Winchburg",West Lothian,EH52 (notice how "Broxburn, Uphall" has a comma in the address) – H.Pearson Nov 29 '15 at 16:04
  • 1
    @hpearson split by `"` first - you'll get an address, then split other parts by `,` – Backs Nov 29 '15 at 16:12
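
The "split by `"` first" suggestion in the last comment might look something like this. It is a simplified sketch, not a full CSV parser: it assumes quotes are balanced and never escaped, and it drops empty fields in the same way the question's code does, which would shift columns in rows that have blanks. `QuoteFirstCsv` and `SplitLine` are hypothetical names.

```csharp
using System;
using System.Collections.Generic;

public static class QuoteFirstCsv
{
    // Hypothetical helper: splits on '"' first, then splits only the unquoted pieces on ','.
    public static List<string> SplitLine(string line)
    {
        var fields = new List<string>();
        string[] segments = line.Split('"');

        for (int i = 0; i < segments.Length; i++)
        {
            if (i % 2 == 1)
            {
                // Odd-numbered segments were between quotes: keep them whole.
                fields.Add(segments[i]);
            }
            else
            {
                // Even-numbered segments are outside quotes: split on commas,
                // discarding the empty pieces left next to a quoted field.
                fields.AddRange(segments[i].Split(
                    new[] { ',' }, StringSplitOptions.RemoveEmptyEntries));
            }
        }

        return fields;
    }
}
```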