
I'm new to Python and am trying to understand how to use the filter() function on a csv.DictReader to filter rows from a CSV file. filter() can be used on an "iterable", and as far as I understand, DictReader fits this definition.

However when I try

import csv

f = open('file1.csv', 'r')
dialect = csv.Sniffer().sniff(f.read(1024))
f.seek(0)
reader = csv.DictReader(f, None, None, None, dialect)

filteredReader = filter(None, reader)  # None will be replaced with my function
for i in filteredReader:
    print(i)

I get TypeError: normcase() argument must be str or bytes, not 'DictReader'.

Please note that I don't want to filter on the file pointer (e.g. here), but on the parsed CSV rows. Do you have an idea how to do that?

ACNB
  • What is `filteredReader` here? What is the full traceback? When using `dialect`, there is no need to pass in 3 `None` arguments, just use `csv.DictReader(f, dialect=dialect)` instead. – Martijn Pieters Apr 14 '14 at 15:58
  • As for the exception you posted, it cannot be raised by the code you posted here; it looks as if you passed `reader` to the [`os.path.normcase()` function](https://docs.python.org/2/library/os.path.html?highlight=normcase#os.path.normcase) or something. The `fr = filteredReader()` line is entirely a red herring here; you are not even using that object. – Martijn Pieters Apr 14 '14 at 16:18
  • Sorry for the confusion. My question is not valid, other pieces of code raised the error. – ACNB Apr 14 '14 at 17:41

2 Answers


Yes, DictReader() can be used as an iterable, and can be used with filter() just fine.

filter() passes each row (a dictionary) to your function in turn; if the function returns a true value for that row, the row is passed on:

>>> from io import StringIO
>>> import csv
>>> demo = StringIO('''\
... foo,bar,baz
... 42,88,131
... 17,19,23
... ''')
>>> reader = csv.DictReader(demo)
>>> def only_answers(row):
...     return '42' in row.values()
... 
>>> for row in filter(only_answers, reader):
...     print(row)
... 
{'baz': '131', 'bar': '88', 'foo': '42'}
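
One detail worth spelling out: in Python 3, filter() returns a lazy iterator rather than a list, so rows are only read as you loop over the result. The following is a minimal sketch under that assumption, reusing the demo data and the only_answers function from above as a plain script; the names matching and rows are illustrative only.

```python
import csv
from io import StringIO

# Sample data copied from the interactive demo above.
demo = StringIO("foo,bar,baz\n42,88,131\n17,19,23\n")

def only_answers(row):
    """Keep rows in which any field equals the string '42'."""
    return '42' in row.values()

reader = csv.DictReader(demo)

# In Python 3 this is a lazy filter object; no rows have been read yet.
matching = filter(only_answers, reader)

# list() consumes the reader and collects the rows that passed the test.
rows = list(matching)

for row in rows:
    print(row['foo'], row['bar'], row['baz'])
```

Because the reader is consumed while iterating, wrap the result in list() if you need to go over the filtered rows more than once.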
Martijn Pieters

Filter works as you expect with DictReader.

Suppose you have this csv file:

numeral, English, Spanish
1, one, uno
2, two, dos
3, three, tres
4, four, cuatro
5, five, cinco

(Note the leading spaces in columns 2 and 3.)

And you just want the rows with an odd numeral:

>>> import csv
>>> with open('/tmp/nums.csv') as f:
...     print(list(filter(lambda d: int(d['numeral']) % 2, csv.DictReader(f))))
...
[{'numeral': '1', ' English': ' one', ' Spanish': ' uno'}, {'numeral': '3', ' English': ' three', ' Spanish': ' tres'}, {'numeral': '5', ' English': ' five', ' Spanish': ' cinco'}]

Note that the leading spaces came through into the keys and values. OK, try csv.Sniffer this way:

with open('/tmp/nums.csv') as f:
    dialect = csv.Sniffer().sniff(f.read(1024))
    f.seek(0)  # rewind so DictReader sees the header row again
    print(list(filter(lambda d: int(d['numeral']) % 2, csv.DictReader(f, dialect=dialect))))
# [{'numeral': '1', 'English': 'one', 'Spanish': 'uno'}, {'numeral': '3', 'English': 'three', 'Spanish': 'tres'}, {'numeral': '5', 'English': 'five', 'Spanish': 'cinco'}]

OK, the sniffer correctly detected that the dialect should use skipinitialspace.
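
If you already know the file has a space after each comma, you can skip the sniffing step and pass skipinitialspace=True straight to DictReader, which forwards extra keyword arguments to the underlying csv.reader. A minimal sketch, assuming the same layout as the sample file but held in an in-memory StringIO so it runs without /tmp/nums.csv; the helper name is_odd is made up for illustration.

```python
import csv
from io import StringIO

# Same layout as the sample file above, kept in memory for this sketch.
data = StringIO(
    "numeral, English, Spanish\n"
    "1, one, uno\n"
    "2, two, dos\n"
    "3, three, tres\n"
    "4, four, cuatro\n"
    "5, five, cinco\n"
)

def is_odd(row):
    """Return True when the 'numeral' field holds an odd integer."""
    return int(row['numeral']) % 2 == 1

# skipinitialspace=True drops the whitespace that follows each delimiter,
# so the keys are 'English' and 'Spanish' rather than ' English' and ' Spanish'.
reader = csv.DictReader(data, skipinitialspace=True)
odd_rows = list(filter(is_odd, reader))

for row in odd_rows:
    print(row['numeral'], row['English'], row['Spanish'])
```

Sniffing is still the better choice when the file format is not known in advance.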

dawg