
I am trying to read a CSV file.

The following is a sample line:

"0734306547          ","9780734306548       ","Jane Eyre Pink PP                       ","Bronte Charlotte                        ","FRONT LIST",20/03/2013 0:00:00,0,"PAPERBACK","Y","Pen"

Here is the code I am using to read the CSV:

public void readCSV()
{
    StreamReader reader = new StreamReader(File.OpenRead(@"C:\abc\21-08-2013\PNZdatafeed.csv"), Encoding.ASCII);
    List<string> ISBN = new List<String>();

    while (!reader.EndOfStream)
    {
        string line = reader.ReadLine();
        if (!String.IsNullOrWhiteSpace(line))
        {
            string[] values = line.Split(',');
            if (values[9] == "Pen")
            {
                ISBN.Add(values[1]);
            }
        }
    }
    MessageBox.Show(ISBN.Count().ToString());
}

The comparison if (values[9] == "Pen") never succeeds, because when I debug the code it shows the value of values[9] as "\"Pen\"".

How do I get rid of the special characters?

user2636163
  • String.Replace("\"", "") is my first thought – VsMaX Aug 21 '13 at 21:41
  • This is the debugger attempting to be helpful. The value is actually "Pen", but the debugger is showing it as an escaped string. Link: http://stackoverflow.com/a/9922604/1822164 – It'sNotALie. Aug 21 '13 at 21:41
  • You should use a CSV-reading library to read it - CSV may look like a trivial format, but it's not, as you're discovering. Think about what happens when one of the values contains a quote, for instance. – RichieHindle Aug 21 '13 at 21:42
  • @It'sNotALie.: That's slightly misleading. The value does have quotes, but not backslashes. It's five characters, `"` `P` `e` `n` `"` – RichieHindle Aug 21 '13 at 21:42
  • @RichieHindle I agree, but this is a "I don't know how to use the debugger" issue. – It'sNotALie. Aug 21 '13 at 21:43
  • @It'sNotALie.: It's at least partly a "how do I get rid of these quote characters?" issue. – RichieHindle Aug 21 '13 at 21:43
  • @RichieHindle I know, that is what I point out when I say *The value is actually **"Pen"*** (emphasis mine). – It'sNotALie. Aug 21 '13 at 21:44
  • Keep in mind that splitting by comma will also create a problem when you have commas in the value. i.e: "A", "b", "a, b" should be 3 values, since you are using " as a text qualifier – Nick Bray Aug 21 '13 at 21:58
  • For reading CSV with quoting and escaping, see http://stackoverflow.com/a/769713/4525 – harpo Aug 21 '13 at 22:19

1 Answer

The problem here is that you're splitting the line at every comma and then using the pieces exactly as they come out. For example, if this is the line you're reading in:

 "A","B","C"

and you split it at commas, you'll get "A", "B", and "C" as your data, with the quote characters still part of each value. According to your description, you don't want quotes around the data.

To throw away quotes around a string:

  1. Check if the leftmost character is ".
  2. If so, check if the rightmost character is ".
  3. If so, remove the leftmost and rightmost characters.

In pseudocode:

 if (data.left(1) == "\"" && data.right(1) == "\"") {
      data = data.trimleft(1).trimright(1)
 }
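
As a rough C# sketch of the steps above, applied to the sample line from the question: StripQuotes is a hypothetical helper name, not a framework method, and the length check is an added guard so that a lone " is left untouched. The Trim() call on the ISBN is also an assumption, made because the sample fields are padded with spaces inside the quotes.

```csharp
using System;

public static class CsvQuotes
{
    // Hypothetical helper: removes one pair of enclosing double quotes, if present.
    public static string StripQuotes(string data)
    {
        if (data.Length >= 2 && data[0] == '"' && data[data.Length - 1] == '"')
        {
            // Keep everything between the first and last character.
            return data.Substring(1, data.Length - 2);
        }
        return data;
    }

    public static void Main()
    {
        // The sample line from the question, written as a C# string literal.
        string line =
            "\"0734306547          \",\"9780734306548       \"," +
            "\"Jane Eyre Pink PP                       \"," +
            "\"Bronte Charlotte                        \"," +
            "\"FRONT LIST\",20/03/2013 0:00:00,0,\"PAPERBACK\",\"Y\",\"Pen\"";

        string[] values = line.Split(',');

        Console.WriteLine(values[9].Length);                // 5: the quotes are part of the value
        Console.WriteLine(StripQuotes(values[9]) == "Pen"); // True
        Console.WriteLine(StripQuotes(values[1]).Trim());   // 9780734306548
    }
}
```

This only works while no value contains a comma or an embedded quote, which happens to be true for the sample line.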

At this point you might have a few questions (I'm not sure how much experience you have). If any of these apply to you, feel free to ask them, and I'll explain further.

  1. What does "\"" mean?
  2. How do I extract the leftmost/rightmost character of a string?
  3. How do I extract the middle of a string?
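
The comments on the question point out that a plain Split(',') also breaks as soon as a quoted value contains a comma. One possible way around that, sketched here rather than offered as the definitive fix, is the TextFieldParser class from the Microsoft.VisualBasic.FileIO namespace (on .NET Framework this needs a reference to the Microsoft.VisualBasic assembly). ReadPenIsbns is a hypothetical helper name, and the row in Main is made up for illustration, with a comma inside the quoted title.

```csharp
using System;
using System.Collections.Generic;
using System.IO;
using Microsoft.VisualBasic.FileIO;

public static class PenIsbnReader
{
    // Hypothetical helper: collects column 1 (ISBN) of every row whose column 9 is "Pen".
    public static List<string> ReadPenIsbns(TextReader source)
    {
        var isbns = new List<string>();
        using (var parser = new TextFieldParser(source))
        {
            parser.TextFieldType = FieldType.Delimited;
            parser.SetDelimiters(",");
            // Treat "..." as a text qualifier, so commas inside quotes do not split a field.
            parser.HasFieldsEnclosedInQuotes = true;

            while (!parser.EndOfData)
            {
                string[] fields = parser.ReadFields();
                if (fields != null && fields.Length > 9 && fields[9].Trim() == "Pen")
                {
                    // Trim explicitly, since the feed pads values with spaces.
                    isbns.Add(fields[1].Trim());
                }
            }
        }
        return isbns;
    }

    public static void Main()
    {
        // Made-up row: the title contains a comma, which would confuse Split(',').
        string sample =
            "\"0734306547  \",\"9780734306548  \",\"Eyre, Jane\",\"Bronte Charlotte\"," +
            "\"FRONT LIST\",20/03/2013 0:00:00,0,\"PAPERBACK\",\"Y\",\"Pen\"";

        foreach (string isbn in ReadPenIsbns(new StringReader(sample)))
        {
            Console.WriteLine(isbn);
        }
    }
}
```

To read the real file, a StreamReader opened on the CSV path can be passed in place of the StringReader.
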
Joe