
I am to take a csv with 4 columns: brand, price, weight, and type.

The types are orange, apple, pear, plum.

Parameters: I need to maximize the total weight while selecting exactly 1 orange, 2 pears, 3 apples, and 1 plum, without exceeding a $20 budget. I cannot repeat brands of the same fruit (for example, selecting the same brand of apple 3 times).

I can open and read the csv file through Python, but I'm not sure how to create a dictionary or a list of tuples from it.

For more clarity, here's an idea of the data.

Brand, Price, Weight, Type
brand1, 6.05, 3.2, orange
brand2, 8.05, 5.2, orange
brand3, 6.54, 4.2, orange
brand1, 6.05, 3.2, pear
brand2, 7.05, 3.6, pear
brand3, 7.45, 3.9, pear
brand1, 5.45, 2.7, apple
brand2, 6.05, 3.2, apple
brand3, 6.43, 3.5, apple
brand4, 7.05, 3.9, apple
brand1, 8.05, 4.2, plum
brand2, 3.05, 2.2, plum

Here's all I have right now:

import csv
test_file = 'testallpos.csv'
csv_file = csv.DictReader(open(test_file, 'rb'), ["brand"], ["price"], ["weight"], ["type"])
dawg
Sean
  • Yeah, i got some feedback to change the title and clarity of the question. This is more specific and gives a better understanding of the problem. – Sean Sep 13 '13 at 00:34
  • then please delete the old question, no point in having two of them about the same topic. Also please post your code so far, this will make people much more likely to help you – Tymoteusz Paul Sep 13 '13 at 00:41
  • Yeah I deleted it a while ago, don't know if it takes a while or not to be removed. New to the site, sorry! – Sean Sep 13 '13 at 00:54
  • All of the field names need to be in a single list, like so `csv.DictReader(open(test_file, 'rb'), ["brand", "price", "weight", "type"])`. – Asad Saeeduddin Sep 13 '13 at 01:13
  • Does your file have the blanks as your example does? – dawg Sep 13 '13 at 01:39
  • My file doesn't, just used that for readability. Replace the spaces with commas and that's the csv – Sean Sep 13 '13 at 01:44
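
The single-list form suggested in the comments can be sketched as follows. This is a minimal Python 3 illustration, not the asker's final code: the in-memory `sample` text stands in for the real file and is an assumption, and it has no header row, because when `fieldnames` is passed explicitly `DictReader` treats the first line as data.

```python
import csv
import io

# Assumption: stand-in for open('testallpos.csv', newline='') with no header row.
sample = io.StringIO(
    "brand1,6.05,3.2,orange\n"
    "brand2,8.05,5.2,orange\n"
)

# All field names go in ONE list, passed as the fieldnames argument.
reader = csv.DictReader(sample, fieldnames=["brand", "price", "weight", "type"])
rows = list(reader)

# Values are still strings at this point; convert with float() where needed.
print(rows[0]["brand"], rows[0]["price"])
```

If the file does start with a header line, leaving `fieldnames` out lets `DictReader` take the keys from that line instead.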

2 Answers


You can ponder this:

import csv

def fitem(item):
    # Strip padding and convert numeric fields to float; leave other text as-is.
    item=item.strip()
    try:
        item=float(item)
    except ValueError:
        pass
    return item

with open('/tmp/test.csv', 'r') as csvin:
    reader=csv.DictReader(csvin)
    # Seed one list per column from the first data row.
    data={k.strip():[fitem(v)] for k,v in next(reader).items()}
    for line in reader:
        for k,v in line.items():
            k=k.strip()
            data[k].append(fitem(v))

print data

Prints:

{'Price': [6.05, 8.05, 6.54, 6.05, 7.05, 7.45, 5.45, 6.05, 6.43, 7.05, 8.05, 3.05],
 'Type': ['orange', 'orange', 'orange', 'pear', 'pear', 'pear', 'apple', 'apple', 'apple', 'apple', 'plum', 'plum'], 
 'Brand': ['brand1', 'brand2', 'brand3', 'brand1', 'brand2', 'brand3', 'brand1', 'brand2', 'brand3', 'brand4', 'brand1', 'brand2'], 
 'Weight': [3.2, 5.2, 4.2, 3.2, 3.6, 3.9, 2.7, 3.2, 3.5, 3.9, 4.2, 2.2]}

If you want the csv file literally as tuples by rows:

import csv
with open('/tmp/test.csv') as f:
    data=[tuple(line) for line in csv.reader(f)]

print data
# [('Brand', ' Price', ' Weight', ' Type'), ('brand1', ' 6.05', ' 3.2', ' orange'), ('brand2', ' 8.05', ' 5.2', ' orange'), ('brand3', ' 6.54', ' 4.2', ' orange'), ('brand1', ' 6.05', ' 3.2', ' pear'), ('brand2', ' 7.05', ' 3.6', ' pear'), ('brand3', ' 7.45', ' 3.9', ' pear'), ('brand1', ' 5.45', ' 2.7', ' apple'), ('brand2', ' 6.05', ' 3.2', ' apple'), ('brand3', ' 6.43', ' 3.5', ' apple'), ('brand4', ' 7.05', ' 3.9', ' apple'), ('brand1', ' 8.05', ' 4.2', ' plum'), ('brand2', ' 3.05', ' 2.2', ' plum')]
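
If one dict per row is more convenient than one list per column, a Python 3 sketch along these lines may help. It assumes the file looks like the sample in the question (a header row, with a space after each comma); `load_rows` is a hypothetical helper name, and the in-memory `sample` stands in for the real file. `skipinitialspace=True` is a standard `csv` dialect option that drops the space following each delimiter, so no manual `strip()` is needed.

```python
import csv
import io

# Assumption: stand-in for open('/tmp/test.csv', newline=''),
# using the first rows of the sample data from the question.
sample = io.StringIO(
    "Brand, Price, Weight, Type\n"
    "brand1, 6.05, 3.2, orange\n"
    "brand2, 8.05, 5.2, orange\n"
)

def load_rows(f):
    """Return one dict per data row, with Price and Weight as floats."""
    reader = csv.DictReader(f, skipinitialspace=True)
    rows = []
    for row in reader:
        row["Price"] = float(row["Price"])
        row["Weight"] = float(row["Weight"])
        rows.append(row)
    return rows

rows = load_rows(sample)
print(rows[0])
```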
dawg
import csv
with open("some.csv") as f:
    r = csv.reader(f)
    print filter(None,r)

Or, with a list comprehension:

import csv
with open("some.csv") as f:
    r = csv.reader(f)
    print [row for row in r if row]

For comparison:

In [3]: N = 100000

In [4]: the_list = [randint(0,3) for _ in range(N)]

In [5]: %timeit filter(None,the_list)
1000 loops, best of 3: 1.91 ms per loop

In [6]: %timeit [i for i in the_list if i]
100 loops, best of 3: 4.01 ms per loop

[edit] Since your actual file does not have blank lines, you do not need the list comprehension or the filter; you can just call list(r).

Final answer, for a file without blank lines:

import csv
with open("some.csv") as f:
    print list(csv.reader(f))

If you want dicts, you can do:

import csv
with open("some.csv") as f:
    rows = list(csv.reader(f))
    header, records = rows[0], rows[1:]  # keep the header row out of the data
    print [dict(zip(header, x)) for x in records]
    # or
    print map(lambda x: dict(zip(header, x)), records)
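
For readers on Python 3, where `print` is a function and `filter` and `map` return lazy iterators rather than lists, the same idea might be written as below. This is only a sketch: the in-memory `sample` (including its blank lines) is an assumption standing in for `some.csv`.

```python
import csv
import io

# Assumption: stand-in for open("some.csv", newline=""), with blank lines
# between records as in the originally posted data.
sample = io.StringIO("Brand,Price\n\nbrand1,6.05\n\nbrand2,8.05\n")

# csv.reader yields an empty list for a blank line, so "if row" drops it.
rows = [row for row in csv.reader(sample) if row]

# Split the header off so it is not turned into a record.
header, records = rows[0], rows[1:]
dicts = [dict(zip(header, rec)) for rec in records]
print(dicts)
```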
Joran Beasley
  • don't do `filter(bool, ...)`, use `filter(None, ...)`, `filter()` has a special case that can avoid the excessive conversion to bool (since the result of calling bool is also checked for truthyness). Also, don't do `filter(..., list(seq))`, just do `filter(..., seq)`, filter knows how to iterate over sequences, the intermediate list just wastes space. – SingleNegationElimination Sep 13 '13 at 01:14
  • yeah not sure what I was thinking with the list conversion ... I didnt know about the None filter – Joran Beasley Sep 13 '13 at 01:44
  • Worthless use of `filter` You can just do `print [e for e in r]` which is faster and more readable. -1 –  Sep 13 '13 at 02:12
  • please explain? those are not equivalent.... why is this a waste of filter? (See edit for timeit results) – Joran Beasley Sep 13 '13 at 02:18
  • The OP stated there were no blank lines in his CSV file. He posted it incorrectly. If there were blank lines -- you would have a point... –  Sep 13 '13 at 02:20
  • he had blank lines ... he changed it later (after I answered) in which case he can just call `list(r)` he doesnt even need a list comprehension – Joran Beasley Sep 13 '13 at 02:23
  • Thanks for the help! Playing around now that I've got the csv into something I can mess with. – Sean Sep 13 '13 at 02:53
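
Once the rows are loaded, the selection described in the question (fixed counts per fruit, distinct brands within a fruit, a price cap, maximum weight) can be brute-forced for data of this size. The following Python 3 sketch is one possible approach, not something proposed in the answers above: `best_basket`, `rows`, and `counts` are hypothetical names, and it assumes each brand appears at most once per fruit type, as in the sample data.

```python
from itertools import combinations, product

# Sample data from the question as (brand, price, weight, type) tuples.
rows = [
    ("brand1", 6.05, 3.2, "orange"), ("brand2", 8.05, 5.2, "orange"),
    ("brand3", 6.54, 4.2, "orange"), ("brand1", 6.05, 3.2, "pear"),
    ("brand2", 7.05, 3.6, "pear"), ("brand3", 7.45, 3.9, "pear"),
    ("brand1", 5.45, 2.7, "apple"), ("brand2", 6.05, 3.2, "apple"),
    ("brand3", 6.43, 3.5, "apple"), ("brand4", 7.05, 3.9, "apple"),
    ("brand1", 8.05, 4.2, "plum"), ("brand2", 3.05, 2.2, "plum"),
]
counts = {"orange": 1, "pear": 2, "apple": 3, "plum": 1}

def best_basket(rows, counts, budget):
    """Return (weight, cost, items) for the heaviest basket within budget, or None."""
    by_type = {}
    for brand, price, weight, kind in rows:
        by_type.setdefault(kind, []).append((brand, price, weight))
    # combinations() never reuses a row, so brands stay distinct within a fruit.
    options = [list(combinations(by_type.get(kind, []), n))
               for kind, n in counts.items()]
    best = None
    for choice in product(*options):
        picked = [item for group in choice for item in group]
        cost = sum(price for _, price, _ in picked)
        weight = sum(w for _, _, w in picked)
        if cost <= budget and (best is None or weight > best[0]):
            best = (weight, cost, picked)
    return best

print(best_basket(rows, counts, 20))
```

With the twelve sample rows, the cheapest basket that satisfies the counts already costs 40.13, so a budget of 20 yields `None`; the real file presumably contains cheaper items. Exhaustive search is fine for a handful of brands per fruit but grows quickly with larger inputs.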