2

I have a CSV file that contains, among other things, names and phone numbers. I'm only interested in a name if it has a phone number.

with open(phone_numbers) as f:
    reader = csv.DictReader(f)
    names =  [record['Name'] for record in reader if record['phone']]

But I also want the respective phone number. I tried this:

user_data = {}
with open(phone_numbers) as f:
    reader = csv.DictReader(f)
    user_data['Name'] =  [record['Name'] for record in reader if record['phone']]
    user_data['phone'] = [record['phone'] for record in reader if record['phone']]

But for the second item I got an empty list. I'm guessing that reader behaves like a generator and that's why I can't iterate over it twice.

I've tried using tuples, but it only worked this way:

user_data = {}
with open(phone_numbers) as f:
    reader = csv.DictReader(f)
    user_data['Name'] =  [(record['Name'],record['phone']) for record in reader if record['phone']]

In that case I have both values, phone and Name, stored in user_data['Name'], which isn't what I want.

And if I try this:

user_data = {}
with open(phone_numbers) as f:
    reader = csv.DictReader(f)
    user_data['Name'],user_data['phone'] =  [(record['Name'],record['phone']) for record in reader if record['phone']]

I got the following error:

ValueError: too many values to unpack

Edit:

This is a sample of the table:

+--------+---------------+
| Name   | Phone         |
+--------+---------------+
| Luis   | 000 111 22222 |
+--------+---------------+
| Paul   | 000 222 3333  |
+--------+---------------+
| Andrea |               |
+--------+---------------+
| Jorge  | 111 222 3333  |
+--------+---------------+

So all rows have a Name but not all have phones.

Luis Ramon Ramirez Rodriguez

6 Answers

1

Your guess is quite right. If iterating twice is the approach you want to take, you should use seek(0) to rewind the file between passes:

reader = csv.DictReader(f)
user_data['Name'] =  [record['Name'] for record in reader if record['phone']]
f.seek(0)   # roll back to the beginning of the file ...
reader = csv.DictReader(f)
user_data['phone'] = [record['phone'] for record in reader if record['phone']]
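To see the exhaustion and the effect of seek(0) in isolation, here is a minimal sketch. It assumes Python 3 and uses an in-memory io.StringIO with made-up sample rows as a stand-in for the real file; the variable name phones_without_seek is only for illustration.

```python
import csv
import io

# In-memory stand-in for the CSV file (hypothetical sample data).
f = io.StringIO("Name,phone\nLuis,000 111 22222\nAndrea,\nJorge,111 222 3333\n")

reader = csv.DictReader(f)
names = [record['Name'] for record in reader if record['phone']]

# The reader has already consumed the whole file, so a second pass yields nothing.
phones_without_seek = [record['phone'] for record in reader if record['phone']]

f.seek(0)                   # rewind the underlying file object
reader = csv.DictReader(f)  # new reader, so the header line is parsed again
phones = [record['phone'] for record in reader if record['phone']]

print(names)                # ['Luis', 'Jorge']
print(phones_without_seek)  # []
print(phones)               # ['000 111 22222', '111 222 3333']
```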

However, this is not very efficient, and you should try to get your data in one pass. The following should do it in one pass:

user_data = {}

def extract_user(user_data, record):
    if record['phone']:
        name = record.pop('name')
        user_data.update({name: record})

[extract_user(user_data, record) for record in reader]

Example:

In [20]: cat phones.csv
name,phone
hans,01768209213
grettel,
henzel,123457123

In [21]: f = open('phones.csv')

In [22]: reader = csv.DictReader(f)

In [24]: %paste
user_data = {}

def extract_user(user_data, record):
    if record['phone']:
        name = record.pop('name')
        user_data.update({name: record})

[extract_user(user_data, record) for record in reader]

## -- End pasted text --
Out[24]: [None, None, None]

In [25]: user_data
Out[25]: {'hans': {'phone': '01768209213'}, 'henzel': {'phone': '123457123'}}
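If the goal is the two parallel lists from the question rather than a dictionary keyed by name, a plain loop can fill both in the same single pass. This is only a sketch, assuming Python 3 and reusing the phones.csv sample above through an in-memory io.StringIO:

```python
import csv
import io

# Same rows as phones.csv above, held in memory so the sketch is self-contained.
f = io.StringIO("name,phone\nhans,01768209213\ngrettel,\nhenzel,123457123\n")

user_data = {'name': [], 'phone': []}
for record in csv.DictReader(f):
    if record['phone']:  # skip rows whose phone field is empty
        user_data['name'].append(record['name'])
        user_data['phone'].append(record['phone'])

print(user_data)
# {'name': ['hans', 'henzel'], 'phone': ['01768209213', '123457123']}
```

The two lists stay aligned by position, because each kept row appends to both of them.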
oz123
1

I think there is a much easier approach. Because it is a CSV file with column headings, as you indicate, every row has a value for phone: it is either empty or not. So this tests for an empty value and, if the value is not empty, adds the name and phone to user_data:

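import csv
user_data = []
# 'rb' is the Python 2 idiom; on Python 3 use open(f, newline='') instead
with open(f, 'rb') as fh:   # f holds the path to the CSV file
    my_reader = csv.DictReader(fh)
    for row in my_reader:
        if row['phone'] != '':   # skip rows with no phone number
            user_details = dict()
            user_details['Name'] = row['Name']
            user_details['phone'] = row['phone']
            user_data.append(user_details)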

By using DictReader we are letting the magic happen so we don't have to worry about seek etc.

If I misunderstood and you want a dictionary, that's easy enough:

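import csv
user_data = dict()
with open(f, 'rb') as fh:
    my_reader = csv.DictReader(fh)
    for row in my_reader:
        if row['phone'] != '':
            # Note: the literal key 'Name' is overwritten on every row, so only
            # the last phone survives; use user_data[row['Name']] to key by name.
            user_data['Name'] = row['phone']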
PyNEwbie
  • the OP wanted a dictionary as final result, your construct will give him a list of dictionaries – oz123 Apr 03 '16 at 22:01
  • Thanks I am still not clear but both options will work – PyNEwbie Apr 03 '16 at 22:03
  • @PyNEwbie I've tried your second code. I got one phone number assigned to a name, but I want both the name and the phone, if the phone exists. Also, for some reason I'm only getting the value of one row, although the file has several rows. – Luis Ramon Ramirez Rodriguez Apr 03 '16 at 22:23
  • @Luis You are probably getting the value of only one row because Python dicts don't support duplicate keys: the last one wins. If you need duplicate keys, possible workarounds are here: http://stackoverflow.com/questions/10664856/make-dictionary-with-duplicate-keys-in-python – kjarsenal Apr 03 '16 at 22:49
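One possible workaround for repeated names, sketched here under the assumption of Python 3 and made-up sample rows, is to map each name to a list of phones with collections.defaultdict, so that a second row for the same name appends to the list instead of overwriting the first:

```python
import csv
import io
from collections import defaultdict

# Hypothetical data in which 'Luis' appears twice.
f = io.StringIO("Name,phone\nLuis,000 111 22222\nLuis,999 888 7777\nAndrea,\n")

phones_by_name = defaultdict(list)
for row in csv.DictReader(f):
    if row['phone']:
        # Each name maps to a list, so repeated names accumulate their phones.
        phones_by_name[row['Name']].append(row['phone'])

print(dict(phones_by_name))
# {'Luis': ['000 111 22222', '999 888 7777']}
```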
1

Is it possible that what you're looking for is throwing away some info in your data file?

In [26]: !cat data00.csv
Name,Phone,Address
goofey,,ade
mickey,1212,heaven
tip,3231,earth

In [27]: f = open('data00.csv')

In [28]: r = csv.DictReader(f)

In [29]: lod = [{'Name':rec['Name'], 'Phone':rec['Phone']} for rec in r if rec['Phone']]

In [30]: lod
Out[30]: [{'Name': 'mickey', 'Phone': '1212'}, {'Name': 'tip', 'Phone': '3231'}]

In [31]: 

On the other hand, should your file contain ONLY Name and Phone columns, it's just

In [31]: lod = [rec for rec in r if rec['Phone']]
gboffi
1

You can use dict to convert your list of tuples into a dictionary. You also need to use get if some records have no phone value.

import csv

user_data = {}
with open(phone_numbers) as f:
    reader = csv.DictReader(f)
    user_data = dict([(record['Name'], record['phone']) for record in reader if (record.get('phone') or '').strip()])

If you want the names and phones separately, you can use the * expression with zip:

with open(phone_numbers) as f:
    reader = csv.DictReader(f)
    names, phones = zip(*[(record['Name'], record['phone']) for record in reader if (record.get('phone') or '').strip()])
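To make the zip(*...) step easier to follow, here is a small sketch on a hard-coded list of pairs (made-up sample data, Python 3 assumed). It also shows one way to guard against the case where no row has a phone, since unpacking an empty zip into two names raises ValueError; the names pairs, no_rows, no_names and no_phones are only for illustration.

```python
pairs = [('Luis', '000 111 22222'), ('Paul', '000 222 3333'), ('Jorge', '111 222 3333')]

# zip(*pairs) is zip(pairs[0], pairs[1], pairs[2]): it regroups the first
# elements together and the second elements together.
names, phones = zip(*pairs)
print(names)   # ('Luis', 'Paul', 'Jorge')
print(phones)  # ('000 111 22222', '000 222 3333', '111 222 3333')

# With no pairs at all, zip() yields nothing to unpack, so fall back explicitly.
no_rows = []
no_names, no_phones = zip(*no_rows) if no_rows else ((), ())
```

Note that the results are tuples, not lists; wrap them in list() if you need to modify them later.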
styvane
  • Thanks, both worked. Will the dict approach work for more than two items? Also, it takes the first value as the key; does this mean that it will break if there are duplicate values? – Luis Ramon Ramirez Rodriguez Apr 03 '16 at 22:41
  • Yes, it will work for more than two items. It will not break if you have a duplicate `name` in your file, but only the last value will be kept. The best thing to do if you have a duplicate key, I mean `name`, is to keep your result as a list of `tuple`s. BTW, that is what `tuple` is used for. Also, don't forget to accept the answer if it helped. – styvane Apr 03 '16 at 22:52
1

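I normally index into each row by column position:

import csv

# Column positions are placeholders; adjust them to match your file.
NAME_COL, PHONE_COL = 0, 1

user_data = {}
with open('mycsv.csv', 'r') as infile:
    # Note: csv.reader also yields the header row, if the file has one.
    for row in csv.reader(infile):
        if row[PHONE_COL]:
            name = row[NAME_COL]
            user_data[name] = row[PHONE_COL]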
kjarsenal
1

You were correct the whole time, except for the unpacking.

result = [(record["name"], record["phone"]) for record in reader if record["phone"]]
# this gives [(name1, phone1), (name2,phone2),....]

You have to do [dostuff for name, phone in result], not name, phone = result. The latter tries to unpack the whole list into two variables, which fails unless the list has exactly two items and is not what you mean in any case.
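The difference can be sketched as follows, with made-up sample tuples and Python 3 assumed. The first part unpacks each (name, phone) tuple inside the loop; the second part reproduces the error from the question by unpacking a three-item list into two names.

```python
result = [('Luis', '000 111 22222'), ('Jorge', '111 222 3333')]

# Unpack one tuple per iteration: name and phone are rebound for every row.
lines = ['{}: {}'.format(name, phone) for name, phone in result]
print(lines)  # ['Luis: 000 111 22222', 'Jorge: 111 222 3333']

# Unpacking the whole list at once only works if it has exactly two items.
three_rows = [('a', '1'), ('b', '2'), ('c', '3')]
try:
    name, phone = three_rows
except ValueError as exc:
    error = str(exc)
    print(error)  # too many values to unpack (expected 2)
```

With exactly two rows the assignment would succeed silently, but name and phone would each hold a whole (name, phone) tuple, which is rarely what is intended.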

C Panda