If the two code snippets below give the same result, what is the advantage of using csv.reader?

1)

import csv
f = open('a.csv', 'rb')
spamreader = csv.reader(f)
for a in spamreader:
    print a

2)

f = open('a.csv', 'rb')
for a in f:
    print a.split(',')

Result:

['SNO', ' Name', ' Dept']
['1', ' Def', ' Electronics']
['2', 'Abc', 'Computers']
S.B
user2225190
  • In your case, it's the same. But when there are quotes or commas in the fields, csv is much easier to deal with. – Jean-François Fabre Jan 13 '17 at 15:05
  • Have you noticed how you did not provide a separator to `csv.reader`? – Abdou Jan 13 '17 at 15:05
  • Change the first line in your CSV to `1,"Evil,Commas",Electronics` and you'll see the difference :-) – Sean Vieira Jan 13 '17 at 15:05
  • @Abdou the `csv` module does not do any magic there. Commas are the default delimiter. There is a `Sniffer` class in the module that can try to guess the settings, but that's another story. – Ma0 Jan 13 '17 at 15:11
  • @Ev.Kounis, it's still magic to me if you did not have to specify or split in this case. I am aware that it gets more complicated with much messier files. But as you said, that's another story. – Abdou Jan 13 '17 at 15:14
  • @user2225190 I am all for writing custom parsers even in cases where battle-tested modules exist. That being said, you will most likely come to realise at some point why the modules are so valuable. That is when you come across weird cases in which your simplistic approach will either fail or be too slow, and so on. But you have to understand the problem before you can understand or appreciate any solution. That is nothing more than my humble opinion though. – Ma0 Jan 13 '17 at 15:19

3 Answers

In your example, I don't see an advantage of using the csv module. However, things change when you have quoted elements:

SNO,Name,Dept
1,Def,Electronics
2,Abc,Computers
3,"here is the delimiter, in quotes",ghi

With the csv module, it is simply

import csv
with open('a.csv', 'rb') as f:
    csv_reader = csv.reader(f, delimiter=',', quotechar='"')
    for row in csv_reader:
        print(row)

but splitting on ',' would ignore the quotes and break the quoted field in two at the comma inside it.
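To make the difference concrete, here is a minimal sketch that parses the quoted row from the sample above both ways. It assumes Python 3 and uses io.StringIO as a stand-in for a.csv so that it runs without a file; the variable names are only illustrative.

```python
import csv
import io

# The quoted row from the sample file above, held in memory instead of a.csv.
line = '3,"here is the delimiter, in quotes",ghi\n'

# Naive approach: split on every comma, whether it is quoted or not.
naive_row = line.rstrip('\n').split(',')

# csv module: the comma inside the quotes stays part of the field,
# and the surrounding quotes are removed.
csv_row = next(csv.reader(io.StringIO(line)))

print(naive_row)  # four pieces, with stray quote characters left in
print(csv_row)    # three fields
```

The naive split produces four pieces for a three-column row, so any code that indexes columns by position would silently read the wrong value.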

(That said, I recommend using pandas, as shown here, for reading CSV files. Please also note that you should close files you've opened. With the with statement, this happens implicitly.)

Martin Thoma

I clarified your question, since csv.reader() is an iterator. Your question compares the csv module with writing your own parser.

The advantage is that the csv module actually implements the CSV format (quoting, escaping, embedded newlines, and so on), while the naive parser you wrote does none of that. So using the csv module is more correct, and it is actually simpler code too.
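As a small illustration of those cases, the sketch below feeds one record containing a comma, an escaped (doubled) quote, and an embedded newline to both approaches. It assumes Python 3; the sample data and variable names are made up for the example.

```python
import csv
import io

# A header row plus one record. The record's second field contains a comma,
# a doubled quote (CSV's way of escaping a literal quote), and a newline.
data = 'id,comment\r\n1,"She said ""hi"", then left.\nSecond line"\r\n'

# newline='' lets the csv module handle the line endings itself.
rows = list(csv.reader(io.StringIO(data, newline='')))

# Splitting line by line treats the embedded newline as a record boundary.
naive_rows = [line.split(',') for line in data.splitlines()]

print(len(rows))        # number of records the csv module found
print(rows[1][1])       # the multi-line field, with quotes unescaped
print(len(naive_rows))  # number of "records" the naive parser found
```

The csv module returns two rows and restores the field's original text, while the line-by-line split reports three rows because it cannot tell that the newline sits inside a quoted field.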

dsh

As others have already mentioned, it is useful when you have quoted elements. But I'm going to show another brilliant feature of the csv module.

What if you receive files from different sources and you don't know which delimiter each one uses to separate its fields? You can't predict it, and you don't want to write logic that handles every possible delimiter when the csv module's Sniffer class already does that for you.

sample:

line1|hey|20|40|50
line2|hey|20|40|50
line3|hey|20|40|50
line4|hey|20|40|50
line5|hey|20|50|60
line6|hey|20|50|60
...

code:

import csv

# The larger the sample, the more accurately the Sniffer can guess the dialect.
sample_bytes = 200
sniffer = csv.Sniffer()

with open('s.txt') as f:
    dialect_object = sniffer.sniff(f.read(sample_bytes))

    # to start from the beginning
    f.seek(0)

    reader = csv.reader(f, dialect=dialect_object)
    for line in reader:
        print(line)

output:

['line1', 'hey', '20', '40', '50']
['line2', 'hey', '20', '40', '50']
['line3', 'hey', '20', '40', '50']
['line4', 'hey', '20', '40', '50']
...
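If you already know the handful of delimiters a file might use, sniff() accepts an optional delimiters argument that limits the guess to those characters. The sketch below is a variation on the code above, assuming Python 3, with an in-memory string standing in for s.txt; the candidate list "|;,\t" is an assumption about the input, not something the module requires.

```python
import csv
import io

# In-memory stand-in for s.txt, using the pipe-separated sample above.
sample = (
    "line1|hey|20|40|50\n"
    "line2|hey|20|40|50\n"
    "line3|hey|20|40|50\n"
)

# Restrict the guess to a few plausible delimiters (assumed candidates).
dialect = csv.Sniffer().sniff(sample, delimiters="|;,\t")

# The detected delimiter is available on the returned dialect.
print(repr(dialect.delimiter))

rows = list(csv.reader(io.StringIO(sample), dialect=dialect))
print(rows[0])
```

Inspecting dialect.delimiter before parsing is a cheap sanity check, since sniffing is a heuristic and can guess wrong on short or irregular samples.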
S.B