How can I read Persian lines in a CSV file in C#?

Question

I want to read a simple comma-separated CSV file with this code:

    var reader = new StreamReader(File.OpenRead(@"d:\34.csv"));

    List<string> listA = new List<string>();
    List<string> listB = new List<string>();
    while (!reader.EndOfStream)
    {
        var line = reader.ReadLine();
        var values = line.Split(',');

        listA.Add(values[0]);
        listB.Add(values[1]);
    }
    MessageBox.Show("READ IT!!!");

But when I read the file and debug this code, it cannot read Persian or Arabic characters. How can I solve that? I think my file does not have a valid encoding?

`var reader = new StreamReader(File.OpenRead(@"d:\34.csv"), Encoding.Unicode);` have you tried Unicode? — fubo, Apr 28 '15 at 08:24
Could you post a Persian CSV file so we can test it ourselves? — Trevi Awater, Apr 28 '15 at 08:43

score 0 · Answer 1 · edited May 23 '17 at 12:06

If your CSV file contains just one line, ReadToEnd could be acceptable, but if you have a log file composed of more than one line, it is better to read it line by line using the ReadLine method of the StreamReader object.

link for true answer and more information

    using (StreamReader sr = new StreamReader("c:/temp/34.csv"))
    {
        string currentLine;
        // currentLine will be null when the StreamReader reaches the end of file
        while ((currentLine = sr.ReadLine()) != null)
        {
            // Search, case insensitive, if the currentLine contains the searched keyword
            if (currentLine.IndexOf("I/RPTGEN", StringComparison.CurrentCultureIgnoreCase) >= 0)
            {
                Console.WriteLine(currentLine);
            }
        }
    }

More information

Ibrahim, what are you saying? My file uses Persian characters, but when I read it I see these characters: �� 1393,\"1393,01,01\",\"1393,03,01\" — elnaz irani, Apr 28 '15 at 08:26

VERYNET · Answer 2 · 2015-04-28T08:39:50.397

You can create a class with a get/set property for each field of a CSV line. You can then build a list of those objects to hold the CSV lines. Try it this way:

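    using System;
    using System.Collections.Generic;
    using System.IO;
    using System.Text;

    class Program
    {
        static void Main(string[] args)
        {
            List<Customer> customer = new List<Customer>();

            // Encoding.Unicode is UTF-16LE; pass the encoding the file was actually saved with.
            using (var reader = new StreamReader(File.OpenRead(@"YourCSV"), Encoding.Unicode))
            {
                while (!reader.EndOfStream)
                {
                    var line = reader.ReadLine();
                    var tokens = line.Split(',');

                    // One Customer per CSV line, one property per field
                    Customer c = new Customer
                    {
                        m_line1 = tokens[0],
                        m_line2 = tokens[1],
                    };
                    customer.Add(c);
                }
            }

            foreach (var s in customer)
            {
                // Print the fields; printing the object itself would only show the type name
                Console.WriteLine(s.m_line1 + "," + s.m_line2);
            }
            Console.ReadLine(); // keep the console window open
        }
    }

    class Customer
    {
        private string line1;
        public string m_line1
        {
            get { return line1; }
            set { line1 = value; }
        }

        private string line2;
        public string m_line2
        {
            get { return line2; }
            set { line2 = value; }
        }
    }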

Martijn · Answer 3 · 2015-04-28T12:13:06.663

You will have to pass the character encoding to the StreamReader constructor. There is no such thing as plain text. Reading text requires knowing its encoding.

The line

using (StreamReader sr = new StreamReader("c:/temp/34.csv"))

should be

using (StreamReader sr = new StreamReader("c:/temp/34.csv", myencoding))

What myencoding is is something only you can know. With what encoding was the file saved? That's the encoding you need there. If the file was generated on Windows, an educated guess at the most likely encoding would be UTF-16LE. That encoding is available as Encoding.Unicode, which is a bad name (it should have been Encoding.UTF16LE), but that's the name the .NET Framework uses.

Other possible encodings that are supported by StreamReader are listed on https://msdn.microsoft.com/en-us/library/System.Text.Encoding_properties(v=vs.110).aspx
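
To make the difference between these encodings concrete, here is a small sketch that prints the bytes produced for the same Persian letter by Encoding.Unicode and Encoding.UTF8. The class name EncodingBytesDemo, the helper ToHex and the sample letter are my own choices for illustration and do not come from the question.

```csharp
using System;
using System.Text;

static class EncodingBytesDemo
{
    // Hypothetical helper: formats a byte array as space-separated hex.
    public static string ToHex(byte[] bytes)
    {
        return BitConverter.ToString(bytes).Replace("-", " ");
    }

    static void Main()
    {
        string letter = "\u0633"; // the letter seen (U+0633), used in Persian and Arabic

        // Encoding.Unicode is UTF-16LE: two bytes, low byte first.
        Console.WriteLine(ToHex(Encoding.Unicode.GetBytes(letter))); // 33 06

        // UTF-8 encodes the same letter as a different two-byte sequence.
        Console.WriteLine(ToHex(Encoding.UTF8.GetBytes(letter)));    // D8 B3
    }
}
```

The same text yields different bytes per encoding, so a reader configured for the wrong one cannot reconstruct the original characters.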

If you don't know with what encoding the file was saved, some encodings leave hints in the form of a byte order mark, sometimes abbreviated to BOM. A byte order mark is the first few bytes of a text document that tell you its encoding. You can find more information on the byte order mark, and some of its values, at http://en.wikipedia.org/wiki/Byte_order_mark

Relying on the BOM is generally a bad idea, because:

It's not a foolproof solution: some encodings don't use a BOM, or make the BOM optional.
Even if you successfully determine the encoding, that doesn't mean that StreamReader knows how to handle that encoding (this is unlikely, but possible).
The BOM might not be a BOM at all, but part of the actual text (also unlikely, but possible).
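
With those caveats in mind, here is a minimal sketch of what a BOM check can look like. It assumes the StreamReader constructor overload that takes a detectEncodingFromByteOrderMarks flag and the CurrentEncoding property; the class BomDemo, the helper DetectEncoding and the temporary file are hypothetical and only there to make the example runnable.

```csharp
using System;
using System.IO;
using System.Text;

static class BomDemo
{
    // Hypothetical helper: returns the encoding StreamReader settles on after
    // looking for a BOM, or the fallback when the file does not start with one.
    public static Encoding DetectEncoding(string path, Encoding fallback)
    {
        using (var reader = new StreamReader(path, fallback, true))
        {
            reader.Read(); // detection only happens on the first read
            return reader.CurrentEncoding;
        }
    }

    static void Main()
    {
        string path = Path.GetTempFileName();
        try
        {
            // Writing with Encoding.Unicode puts a UTF-16LE BOM (FF FE) at the start.
            File.WriteAllText(path, "\u0633\u0644\u0627\u0645,1393", Encoding.Unicode);

            Encoding detected = DetectEncoding(path, Encoding.UTF8);
            Console.WriteLine(detected.WebName);
        }
        finally
        {
            File.Delete(path);
        }
    }
}
```

If the file has no BOM, the helper simply returns the fallback, which is exactly the case where you still need to know the encoding yourself.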

In some cases it is impossible to know the encoding of a file, notably if the file comes from a file upload on the web, or if someone just mailed you the file and they don't know how they encoded it. This can be a good reason not to allow "plain text" uploads (which is reasonable because, and this can do with a little repetition, there is no such thing as plain text).
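
When the encoding really is unknown, one thing you can at least do is rule candidates out. The sketch below is my own addition, not part of the advice above: it uses a strict UTF8Encoding (throwOnInvalidBytes set to true) to check whether bytes are valid UTF-8. The class StrictUtf8Demo, the helper IsValidUtf8 and the sample legacy bytes (assumed to be the word in Windows-1256) are illustrative assumptions.

```csharp
using System;
using System.Text;

static class StrictUtf8Demo
{
    // Hypothetical helper: true when the bytes decode as UTF-8 without errors.
    public static bool IsValidUtf8(byte[] bytes)
    {
        // throwOnInvalidBytes: true makes decoding fail instead of silently
        // substituting U+FFFD replacement characters.
        var strict = new UTF8Encoding(false, true);
        try
        {
            strict.GetString(bytes);
            return true;
        }
        catch (DecoderFallbackException)
        {
            return false;
        }
    }

    static void Main()
    {
        // A Persian word encoded as UTF-8.
        byte[] utf8 = Encoding.UTF8.GetBytes("\u0633\u0644\u0627\u0645");

        // The same word as it is assumed to look in a legacy single-byte
        // code page (Windows-1256); these bytes are not valid UTF-8.
        byte[] legacy = { 0xD3, 0xE1, 0xC7, 0xE3 };

        Console.WriteLine(IsValidUtf8(utf8));   // True
        Console.WriteLine(IsValidUtf8(legacy)); // False
    }
}
```

A passing check does not prove the file is UTF-8 (pure ASCII bytes pass too, and so can some UTF-16 content), so treat it as a way to reject a guess rather than confirm one.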

tl;dr: The most likely thing to work is one of

    using (StreamReader sr = new StreamReader(File.OpenRead(@"c:/temp/34.csv"), Encoding.Unicode))
    {
        ...
    }

or

    using (StreamReader sr = new StreamReader(File.OpenRead(@"c:/temp/34.csv"), Encoding.UTF8))

or

    using (StreamReader sr = new StreamReader(File.OpenRead(@"c:/temp/34.csv"), Encoding.UTF32))
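
Putting this together with the loop from the question, a self-contained sketch could look like the following. It assumes the file is UTF-8; the class CsvReadDemo, the helper ReadTwoColumns and the temporary sample file are hypothetical stand-ins for your own file and code.

```csharp
using System;
using System.Collections.Generic;
using System.IO;
using System.Text;

static class CsvReadDemo
{
    // Hypothetical helper: reads the first two comma-separated columns of each
    // line, decoding the file with the encoding supplied by the caller.
    public static void ReadTwoColumns(string path, Encoding encoding,
                                      List<string> listA, List<string> listB)
    {
        using (var reader = new StreamReader(File.OpenRead(path), encoding))
        {
            string line;
            while ((line = reader.ReadLine()) != null)
            {
                string[] values = line.Split(',');
                if (values.Length < 2) continue; // skip lines without two columns
                listA.Add(values[0]);
                listB.Add(values[1]);
            }
        }
    }

    static void Main()
    {
        string path = Path.GetTempFileName();
        try
        {
            // Sample data: a Persian word and a year, saved as UTF-8.
            File.WriteAllText(path, "\u0633\u0644\u0627\u0645,1393\r\n", Encoding.UTF8);

            var listA = new List<string>();
            var listB = new List<string>();
            ReadTwoColumns(path, Encoding.UTF8, listA, listB);

            // Compare in code rather than printing the Persian text, because
            // the console may not be able to render it.
            Console.WriteLine(listA[0] == "\u0633\u0644\u0627\u0645"); // True
            Console.WriteLine(listB[0]);                               // 1393
        }
        finally
        {
            File.Delete(path);
        }
    }
}
```

Note that Split(',') does not understand quoted fields that themselves contain commas, so data like "1393,01,01" inside quotes would need a real CSV parser.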
