
I'm having two problems reading my .csv file with `StreamReader`. I'm trying to read the values into variables, which I'll later enter into a browser via Selenium. Here's my code (the `Console.WriteLine` at the end is just for debugging):

        string[] read;
        char[] seperators = { ';' };
        StreamReader sr = new StreamReader(@"C:\filename.csv", Encoding.Default, true);
        string data = sr.ReadLine();

        while((data = sr.ReadLine()) != null)
        {
            read = data.Split(seperators);
            string cpr = read[0];
            string ydelsesKode = read[1];
            string startDato = read[3];
            string stopDato = read[4];
            string leverandoer = read[5];
            string leverandoerAdd = read[6];
            Console.WriteLine(cpr + " " + ydelsesKode + " " + startDato + " " + stopDato + " " + leverandoer + " " + leverandoerAdd);
        }

The code in and of itself works just fine - but I have two problems:

  1. The file has values in Danish, so it contains å, ø and æ, but they show up as '?' in the console. In Notepad those characters look fine.
  2. Blank values also show up as '?'. Is there any way I can turn them into a blank space so Selenium won't get "confused"? Sample output:

1372 1.1 01-10-2013 01-10-2013 Bakkev?nget - dagcenter ?

Bakkev?nget should be Bakkevænget and the final '?' should be blank (or rather, a blank space).

P01y6107
  • Why do you provide `Encoding.Default`? Also, don't roll your own CSV parsing code, this will break when a field contains the value ";". The replacement character `?` means that a certain byte sequence can't be translated to a code point of the chosen encoding. This also means that your "blank" values aren't blank. – CodeCaster Jul 22 '20 at 10:05
  • The Encoding.Default was an attempt at getting the Danish characters to work - with or without it I get the same result. And there are no instances of ';' anywhere in the file, so no breaking. – P01y6107 Jul 22 '20 at 10:07
  • Do not write your own CSV parser. There are edge cases (like you have found) that nuget packages have already found and sorted. – Neil Jul 22 '20 at 10:09
  • @Neil any ones you'd suggest? – P01y6107 Jul 22 '20 at 10:10
  • Open up nuget package manager and take a look. There are many to choose from. – Neil Jul 22 '20 at 10:14
  • Perhaps there's a problem with your file encoding. I created a file with notepad++ using UTF-8 encoding, it only contained one line "Bakkevænget;Bakkevænget" your code works fine and gave me the expected result. Converting the file to ANSI however led me to your result. – Pumkko Jul 22 '20 at 10:30
  • @Pumkko The file was created in Excel (Danish locale) and verified in notepad. Neither have problems, it's only the console that screws it up :-/ – P01y6107 Jul 22 '20 at 10:32
  • @P01y6107 Same with my sample, with ANSI encoding the file looks fine on notepad++ but not on the console – Pumkko Jul 22 '20 at 10:33
  • Oh and the console [can't print such characters by default](https://stackoverflow.com/questions/5750203/how-to-write-unicode-characters-to-the-console). – CodeCaster Jul 22 '20 at 11:35
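
To illustrate what the comments describe, here is a minimal sketch, assuming the file was saved in a single-byte ANSI encoding (as Pumkko's test suggests) and is being decoded as UTF-8. The class and method names are made up for this example. It uses ISO-8859-1, which .NET supports out of the box and which agrees with Windows-1252 for æ, ø and å. The `CodePoints` helper is only a debugging aid for finding out what a seemingly blank field really contains.

```csharp
using System;
using System.Linq;
using System.Text;

static class EncodingDemo
{
    // ISO-8859-1 is built into .NET; it matches Windows-1252 for æ, ø and å.
    static readonly Encoding Latin1 = Encoding.GetEncoding("iso-8859-1");

    // "Bakkevænget" as a single-byte ANSI file would store it (æ is the one byte 0xE6).
    public static byte[] AnsiBytes() => Latin1.GetBytes("Bakkev\u00E6nget");

    // Wrong guess: a lone 0xE6 is not valid UTF-8, so the decoder emits U+FFFD,
    // which a console that cannot render it prints as '?'.
    public static string DecodedAsUtf8() => Encoding.UTF8.GetString(AnsiBytes());

    // Matching guess: every byte maps back to the character that was written.
    public static string DecodedAsLatin1() => Latin1.GetString(AnsiBytes());

    // Lists the UTF-16 code units of a string in hex, to inspect "blank" fields.
    public static string CodePoints(string s) =>
        string.Join(" ", s.Select(c => ((int)c).ToString("X4")));

    static void Main()
    {
        Console.OutputEncoding = Encoding.UTF8; // ask the console to emit UTF-8
        Console.WriteLine(CodePoints(DecodedAsUtf8()));
        Console.WriteLine(DecodedAsLatin1());
    }
}
```

If that assumption holds for the real file, passing the same encoding to the reader, e.g. `new StreamReader(path, Encoding.GetEncoding("iso-8859-1"))`, should give the expected characters. Whether the console then displays them also depends on its font and code page.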

1 Answer


"Fixed" it by going with tab delimited unicode .txt file instead of .csv. For some reason my version of excel doesn't have the option to save in unicode .csv...

I don't quite understand the problem with "rolling my own" parser, but maybe someday someone will take the time to explain it to me better. I'm still new-ish at this C# stuff...
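
For what it's worth, the concern raised in the comments can be shown with a small example. The row below is made up: its second field is quoted because it contains a ';'. A plain `Split` cuts that field in two, while a quote-aware parser keeps it whole. This sketch uses `TextFieldParser` from `Microsoft.VisualBasic.FileIO`, which ships with .NET, as one possible option; the NuGet packages the commenters allude to would be another.

```csharp
using System;
using System.IO;
using Microsoft.VisualBasic.FileIO;

static class QuotedFieldDemo
{
    // Hypothetical row: the second field is quoted because it contains the delimiter.
    public const string Line = "1372;\"Bakkevej 3; 2. sal\";01-10-2013";

    // What the code in the question does: split on every ';'.
    public static string[] NaiveSplit(string line) => line.Split(';');

    // Quote-aware parsing: delimiters inside quotes stay part of the field.
    public static string[] QuoteAwareSplit(string line)
    {
        using (var parser = new TextFieldParser(new StringReader(line)))
        {
            parser.TextFieldType = FieldType.Delimited;
            parser.SetDelimiters(";");
            parser.HasFieldsEnclosedInQuotes = true;
            return parser.ReadFields();
        }
    }

    static void Main()
    {
        Console.WriteLine(NaiveSplit(Line).Length);      // 4: the quoted field was cut in two
        Console.WriteLine(QuoteAwareSplit(Line).Length); // 3: the quoted field stays whole
    }
}
```

With the naive split, every index after the broken field shifts by one, so `read[3]` would silently pick up the wrong column.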

P01y6107