Best way to get data from excel and sort it using Python 3.6

Question

I am in need of some major help as I am relatively new to Python. I need to perform a task which will read data from Excel, and then allow me to sort it out in descending order. I have initially started working to put into a dictionary as the Excel File has a header row, and then thousands of rows after which contain data. I know dictionaries are not "sortable" per se, but I thought a dictionary would be the best approach, given what I need. However, I'm wondering if maybe a dataframe would work, but need any and all guidance as I'm struggling to figure this out.

I need to get the 3 largest commodities by share for each county, including the commodity name. For instance, for the first row, Lynd County, I want the following returned - Corn - 19.52, Cattle - 13.68, Strawberry - 12.31. It could also be like this Corn: 19.52, Cattle: 13.68, Strawberry: 12.31. However, I need to sort the data by the values for each commodity.

I found the following code online and used it to read Excel Data into a list structure that consisted of dictionaries, but I'm not really sure if this is the best approach.

import xlrd
from xlrd import open_workbook
book = open_workbook('DictionaryProject.xlsx')
sheet = book.sheet_by_name('Sheet1')
keys = [sheet.cell(0, col_index).value for col_index in range(sheet.ncols)]
dict_list =[]

for row_index in range(1, sheet.nrows):
    d= {keys[col_index]: sheet.cell(row_index, col_index).value
        for col_index in range(sheet.ncols)}
    dict_list.append(d)

print(dict_list)

rafaelc · Accepted Answer · 2018-08-04T20:39:55.007

3

You can use pandas to parse the data, and then just pure python to display however you want.

df = pd.read_excel('filename.xlsx')
d = df.set_index('county').to_dict('index')

For example, for the dataframe

    county    cattle    strawberry  corn
0   CountyA   10        30          1
1   CountyB   2         2           2
2   CountyC   50        15          3

You would get

{'CountyA': {'cattle': 10, 'strawberry': 30, 'corn': 1},
 'CountyB': {'cattle': 2, 'strawberry': 2, 'corn': 2},
 'CountyC': {'cattle': 50, 'strawberry': 15, 'corn': 3}}

And you can do, for instance,

for k,v in d.items():
    i = sorted(v.items(), key=lambda x: x[1], reverse=True)
    print(k, ', '.join(['{}: {}'.format(com, value) for com, value in i]))

You can also use f-strings as per @jpp's comment below

print(k, ', '.join([f'{com}: {value}' for com, value in i]))

which would output

CountyA strawberry: 30, cattle: 10, corn: 1
CountyB cattle: 2, strawberry: 2, corn: 2
CountyC cattle: 50, strawberry: 15, corn: 3

edited Aug 04 '18 at 20:39

answered Aug 04 '18 at 20:16

rafaelc

57,686
15
58
82

1

Nice solution. I would also recommend f-strings (Python 3.6+), i.e. `f'{com}: {value}'`. – jpp Aug 04 '18 at 20:32
1

I think I understand the vast majority of that and it seems simple enough, so thank you. I will give it a go. However, could you explain the last line a little bit. I see it includes a list comprehension, and I see what it is outputting, but just struggling to follow exactly how it is working. Thanks so much! – Shawn Schreier Aug 04 '18 at 20:38
1

JPP - Could you explain the benefit to using f-strings? I'm not that familiar with the concept. – Shawn Schreier Aug 04 '18 at 20:40
I am getting the following error - TypeError: '<' not supported between instances of 'str' and 'float' when it tries to run the i = sorted line. Any thoughts? – Shawn Schreier Aug 04 '18 at 20:51
@jpp, thanks for the suggestion man :) @ Shawn, that happens when you try to compare float and strings. Do you have any string in your `.items()` ? – rafaelc Aug 04 '18 at 20:54
@ShawnSchreier, [F-strings are more efficient](https://stackoverflow.com/a/43123579/9209546). The syntax is also simpler / less verbose. – jpp Aug 04 '18 at 20:57
Rafael - I do have some strings and some floats in my data. Is there a way to fix the code to work for that, or is there a way where I can essentially filter out what is included in the dictionary to only include the fields that have floats? – Shawn Schreier Aug 05 '18 at 00:08

Best way to get data from excel and sort it using Python 3.6

1 Answers1