1

I am working on a project where I am trying to take Excel files (read in via xlrd) and geocode addresses in them. For this, I am using a list of directories, with each directory entry a separate site.

Something like addressList[0] will result in the following:

{text:u'First name ': u'John',
 text:u'Site City': u'Indio',
 text:u'Site State': u'CA',
 text:u'Last name': u'Doe',
 text:u'Site Phone': u'760-555-1234',
 text:u'Site Zip': u'92201',
 text:u'Site Address1': u'1313 Mockingbird Lane',
 text:u'Site Name': u'Tyrell Industries',
 text:u'Hours': u'Mon-Fri 12:00-1:00',
 text:u'Affliation': u'Boys & Girls Clubs of America'}

(And I just realized in the spreadsheet, "affiliation" was spelled incorrectly. Meh.)

Now, I know from looking around that keys in Python can have spaces in them, and that this shouldn't be a problem. But entering addressList[0]['Site Phone'] results in a KeyError. In fact, trying to get the value of the 'Hours' key results in a similar KeyError.

Based on a question on Unicode keys, I tried the following:

STRING_DATA = dict([(str(k), v) for k, v in addressList[0].items()])

Which resulted in a dictionary with entries like:

"text:u'Site Name'": u'Tyrell Industries',

This is reasonably okay, except I'm now having to access the value via STRING_DATA["text:u'Site Name'"], which seems like a pain.

Is there a quicker/easier way to use the keys?

Community
  • 1
  • 1
  • I don't understand what that data structure is supposed to be. Where are those 'text' prefixes coming from? How are you receiving it, and how are you outputting it? – Daniel Roseman Mar 13 '13 at 20:52
  • I get the data by opening the workbook using `xlrd.workbook_open`, then read in the data by creating a list and appending the dictionaries: `addressList.append(dict(zip(column_names, siteSheet.row_values(rownum))))` where `column_names` are obtained from the first row of the sheet. (The 'text:u' prefixes are added in automatically.) – techstepper Mar 13 '13 at 20:57

2 Answers2

2

text:u'First name ': u'John', is not a valid dict entry.

the reason you have text: prefixs before keys like text:u'First Name' is, because you're using xlrd cells as dict's keys.

you should explicitly extract values from cells by using cell.value

something like:

new_keys = [k.value for k in addresslist[0]]

thkang
  • 11,215
  • 14
  • 67
  • 83
  • Using `cell_value` when I read in the field names from the Excel file works well. Something like: `for i in range(NUMBER_OF_COLUMNS): column_names.append(site_sheet.cell_value(0, i))` gives me keys I can reference without a problem. Thanks! – techstepper Mar 13 '13 at 21:46
1

All you should need to get your column headings is this:

column_names = site_sheet.row_values(rowx=0, end_colx=NUMBER_OF_COLUMNS)

However it appears the headings need a bit of a scrub e.g. u'First name ' and u'Last name' could be made conform with other headings, by removing leading and trailing spaces and enforcing Title Case:

column_names = [x.strip().title()
    for x in site_sheet.row_values(row=0, end_colx=NUMBER_OF_COLUMNS)]
John Machin
  • 81,303
  • 11
  • 141
  • 189
  • Thank you. I'm getting this from a third party source, so there's a bit of cleanup I know I can still do. But the main problem was doing something I probably shouldn't have been doing when values from the worksheet. – techstepper Mar 14 '13 at 19:39