Search a single column for a particular value in a CSV file and return an entire row

Question

Issue

The code does not correctly identify the input (item). It simply dumps to my failure message even if such a value exists in the CSV file. Can anyone help me determine what I am doing wrong?

Background

I am working on a small program that asks for user input (function not given here), searches a specific column in a CSV file (Item) and returns the entire row. The CSV data format is shown below. I have shortened the data from the actual amount (49 field names, 18000+ rows).

Code

import csv
from collections import namedtuple
from contextlib import closing

def search():
    item = 1000001
    raw_data = 'active_sanitized.csv'
    failure = 'No matching item could be found with that item code. Please try again.'
    check = False

    with closing(open(raw_data, newline='')) as open_data:
        read_data = csv.DictReader(open_data, delimiter=';')
        item_data = namedtuple('item_data', read_data.fieldnames)
        while check == False:
            for row in map(item_data._make, read_data):
                if row.Item == item:
                    return row
                else:
                    return failure

CSV structure

active_sanitized.csv
Item;Name;Cost;Qty;Price;Description
1000001;Name here:1;1001;1;11;Item description here:1
1000002;Name here:2;1002;2;22;Item description here:2
1000003;Name here:3;1003;3;33;Item description here:3
1000004;Name here:4;1004;4;44;Item description here:4
1000005;Name here:5;1005;5;55;Item description here:5
1000006;Name here:6;1006;6;66;Item description here:6
1000007;Name here:7;1007;7;77;Item description here:7
1000008;Name here:8;1008;8;88;Item description here:8
1000009;Name here:9;1009;9;99;Item description here:9

Notes

My experience with Python is relatively little, but I thought this would be a good problem to start with in order to learn more.

I determined the methods to open (and wrap in a close function) the CSV file, read the data via DictReader (to get the field names), and then create a named tuple to be able to quickly select the desired columns for the output (Item, Cost, Price, Name). Column order is important, hence the use of DictReader and namedtuple.

While there is the possibility of hard-coding each of the field names, I felt that if the program can read them on file open, it would be much more helpful when working on similar files that have the same column names but different column organization.

Research

CSV Header and named tuple: What is the pythonic way to read CSV file data as rows of namedtuples?
Converting CSV data to tuple: How to split a CSV row so row[0] is the name and any remaining items are a tuple?
There were additional links of research, but I cannot post more than two.

You don't need to use ``closing()`` on a file - it's context manager already closes the file by default. Just use ``with open(...) as file:``. — Gareth Latty, Apr 18 '12 at 12:54
What is `while check == False:` all about? It will always be true ... — John Machin, Apr 18 '12 at 13:14
Okay. I was basing my usage of closing() on http://docs.python.org/py3k/library/contextlib.html?highlight=closing#contextlib.closing. I read in the docs for open() that it did close the IO, but I just had felt it was better safe then sorry since I was not certain what the difference was. — TheGreatFox, Apr 18 '12 at 13:15
John, that was a leftover from another statement. I tried to check the code thoroughly, but I had missed it. After a while, the lines start to blend in. — TheGreatFox, Apr 18 '12 at 13:16
Yeah, ``closing()`` is for adding support for context managers to objects that don't already support it. As file objects returned by ``open()`` do, you can safely not use it (although it produces basically the end result). — Gareth Latty, Apr 18 '12 at 13:18

Gareth Latty · Accepted Answer · 2012-04-18T13:17:46.693

3

You have three problems with this:

You return on the first failure, so it will never get past the first line.
You are reading strings from the file, and comparing to an int.
_make iterates over the dictionary keys, not the values, producing the wrong result (item_data(Item='Name', Name='Price', Cost='Qty', Qty='Item', Price='Cost', Description='Description')).
```
for row in (item_data(**data) for data in read_data):
    if row.Item == str(item):
        return row
return failure
```

This fixes the issues at hand - we check against a string, and we only return if none of the items matched (although you might want to begin converting the strings to ints in the data rather than this hackish fix for the string/int issue).

I have also changed the way you are looping - using a generator expression makes for a more natural syntax, using the normal construction syntax for named attributes from a dict. This is cleaner and more readable than using _make and map(). It also fixes problem 3.

edited Apr 18 '12 at 13:17

answered Apr 18 '12 at 13:03

Gareth Latty

86,389
17
178
183

Ah okay. Thank you! I see where my error is now. Is there a particular reason why the map() function exists if you are able to write it as a more natural syntax? The reason I had used map() is that there were a number of other examples that had been using it. – TheGreatFox Apr 18 '12 at 13:12
Sometimes it can be more readable to use ``map()`` - for instance ``map(int, list_of_strings)`` makes every item in the list an ``int``, which some prefer to ``[int(x) for x in list_of_strings]`` as you don't need to invent variables. Generally list, set, and dict comps or generator expressions are a better solution. ``map()`` is also a familiar concept to people used to functional languages. It's also a hangover from before the comprehensions existed in the language. – Gareth Latty Apr 18 '12 at 13:14

Search a single column for a particular value in a CSV file and return an entire row

1 Answers1

Linked

Related