
I'm working with a CSV file and want to output:

  • Total sales
  • Total sales for each month
  • Lowest month sales
  • Highest monthly sales

Code

import csv

def read_data():
    data = []

    with open('sales.csv', 'r') as sales_csv:
        spreadsheet = csv.DictReader(sales_csv)
        for row in spreadsheet:
            data.append(row)

        return data

def run():
    data = read_data()
    sales_by_month = {}
    total_sales = {}

    for row in data:
        month = row['month']
        sale = int(row['sales'])
        if month not in sales_by_month:
            sales_by_month[month] = []
        sales_by_month[month].append(sale)

    for month, sales in sales_by_month.items():
        total_sales = sum(sales)
        print('Total sales for {}: {}'.format(month, total_sales))


run()


def run():
    data = read_data()

    sales = []
    for row in data:
        sale = int(row['sales'])
        sales.append(sale)

    total = sum(sales)
    print('Total sales: {}'.format(total))

run()

def run():
    data = read_data()

    lowest = []

    for row in data:
        month = row['month']
        sale = int(row['sales'])
        lowest.append(sale)
    lowest_month = min(lowest)
    print('Lowest month sales in {}: {}'.format(month, lowest_month))

run()

Issue

My code works, except for the lowest monthly sales. I haven't done the highest monthly sales yet.

The value for 'sales amount' comes back correct, but it prints the wrong month. The month comes back as dec (should be feb).

Question

How do I get the correct month for lowest monthly sales?

hc_dev
Lucia_End
  • Format the code correctly as code (the `{}` button in the editor can help). – Michael Butscher Feb 22 '23 at 21:15
  • Thank you - first time posting so a bit off the mark – Lucia_End Feb 22 '23 at 21:21
  • Just use three tildes to wrap your code blocks – georgwalker45 Feb 22 '23 at 21:23
  • Learn the quasi-standard [markdown (fenced code-blocks)](https://markdown.land/markdown-code-block#1): code surrounded by 3 backticks, with an optional language for code highlighting. Together with a basic structure (Context/Wanted, Code, Issue, Question), this improved question will help to find a quick answer. – hc_dev Feb 22 '23 at 22:45

2 Answers


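A simple trick is to take the minimum of (sale, month) tuples, so that each month travels with its sales value. The only drawback: tuples are compared element by element, so if the lowest sales value appears more than once, the month that sorts first alphabetically is chosen, not the one that comes first in the CSV.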

def run():
    data = read_data()

    lowest = []

    for row in data:
        month = row['month']
        sale = int(row['sales'])
        lowest.append((sale, month))
    lowest_month = min(lowest)
    print('Lowest month sales in {}: {}'.format(lowest_month[1], lowest_month[0]))

run()
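
If the alphabetical tie-break is not wanted, one possible variation is to pass a key function to min so that only the sales values are compared; min then returns the first row it encountered with the lowest value. This is a minimal sketch, and the rows list is a hypothetical stand-in for the result of read_data():

```python
# Hypothetical sample rows standing in for the result of read_data().
rows = [
    {'month': 'jan', 'sales': '5'},
    {'month': 'feb', 'sales': '2'},
    {'month': 'apr', 'sales': '2'},
]

# Compare rows by sales only; on a tie, min() keeps the first row it saw.
lowest_row = min(rows, key=lambda row: int(row['sales']))
print('Lowest month sales in {}: {}'.format(lowest_row['month'], int(lowest_row['sales'])))
```

With these rows the tuple approach would report apr (it sorts before feb), while the key-based variant reports feb, the first row in file order.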
Michael Butscher

Provide your input as an example

Assume you have this CSV:

month,sales
jan, 0
feb, 2
dec, 12

Issues in code

Assume read_data() is working correctly and returns the data as a list of dicts, one per CSV row (in Python 3.6 and 3.7 each row is technically an OrderedDict; since Python 3.8 it is a plain dict):

[OrderedDict([('month', 'jan'), ('sales', ' 0')]), OrderedDict([('month', 'feb'), ('sales', ' 2')]), OrderedDict([('month', 'dec'), ('sales', ' 12')])]

Then your function:

def run():
    data = read_data()

    lowest = []

    for row in data:
        month = row['month']  # always the current iteration's month
        sale = int(row['sales'])
        lowest.append(sale)   # add the sales quantity or amount
    lowest_month = min(lowest)  # WARN: only find the lowest sales, not corresponding month
    print('Lowest month sales in {}: {}'.format(month, lowest_month))
    # month is still the last iterated one, here: dec
run()

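will only compute the minimum of the lowest list, which contains just the sales amounts or quantities, not the related months. The variable month still holds the value from the last loop iteration, which is why dec is printed.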

How to solve

You have to keep each sales value together with its month. Using the for-loop that you already have, collect both into two parallel lists:

    sales = []
    months = []
    for row in data:
        month = row['month']  # always the current iteration's month
        sale = int(row['sales'])
        sales.append(sale)  # the sales are added ordered (index)
        months.append(month)   # also add the month to the ordered list

Note the index of the list items: each CSV row ends up at the same index in both lists (sales and months). This shared index is the relation you need.

See Finding the index of an item in a list

Now correlate after min:

    lowest_sale = min(sales)  # TIP: find the lowest sales, and find its index
    index = sales.index(lowest_sale)  # index of the first row with the lowest sales
    lowest_month = months[index]  # the related month has the same index (CSV row)
    print('Lowest month sales in {}: {}'.format(lowest_month, lowest_sale))
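
Put together, the parallel-list idea could look like the following self-contained sketch. The inline SAMPLE_CSV string and the helper name lowest_month_by_index are assumptions made for illustration; they replace reading sales.csv so the snippet runs on its own:

```python
import csv
import io

# Assumed sample data, mirroring the example CSV above.
SAMPLE_CSV = "month,sales\njan, 0\nfeb, 2\ndec, 12\n"

def lowest_month_by_index(csv_text):
    """Return (month, sale) for the first row with the lowest sales."""
    sales = []
    months = []
    for row in csv.DictReader(io.StringIO(csv_text)):
        sales.append(int(row['sales']))  # int() tolerates the leading space
        months.append(row['month'])
    lowest_sale = min(sales)
    index = sales.index(lowest_sale)  # first position of the lowest value
    return months[index], lowest_sale

month, sale = lowest_month_by_index(SAMPLE_CSV)
print('Lowest month sales in {}: {}'.format(month, sale))
```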

Drawback of min with index-lookup

If the lowest sales value (e.g. 0) appears multiple times, for different months in your CSV, then index() returns only the first match, so only the first of those months is reported.

A similar limitation was already mentioned in Michael's answer, which uses the tuple approach.
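
If all tied months should be reported, one option is to compute the minimum first and then collect every month that matches it. This is only a sketch; the rows list is hypothetical sample data with two months sharing the lowest value:

```python
# Hypothetical rows in which two months share the lowest sales value.
rows = [
    {'month': 'jan', 'sales': '0'},
    {'month': 'feb', 'sales': '2'},
    {'month': 'mar', 'sales': '0'},
]

lowest_sale = min(int(row['sales']) for row in rows)
# Keep every month whose sales equal the minimum, in CSV order.
lowest_months = [row['month'] for row in rows if int(row['sales']) == lowest_sale]
print('Lowest month sales in {}: {}'.format(', '.join(lowest_months), lowest_sale))
```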

Improved sorting

Instead of your loop and min, use Python's built-in sorting functions with the key parameter. See the Sorting HOW TO in the Python 3.11.2 documentation.

The key parameter accepts a lambda or a function. Here I will show you a lambda which sorts the dict entries (CSV rows) by the field sales. The default sort order is ascending, so sorted returns a list with the "lowest sales" row as its first element:

sorted_rows = sorted(data, key=lambda row: int(row['sales']))  # sort ascending (lowest first)
print(sorted_rows)
lowest_entry = sorted_rows[0]  # get first element by index 0 (lowest sales)
print(lowest_entry)
lowest_month = lowest_entry['month']
lowest_sales = int(lowest_entry['sales'])  # convert, since the CSV value is a string like ' 0'
print('Lowest month sales in {}: {}'.format(lowest_month, lowest_sales))

Pass the additional parameter reverse=True to the sorting function to sort descending (highest first); the first element is then the row with the highest monthly sales.
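
For the still-missing highest monthly sales, a descending sort could be sketched as follows. The rows list is assumed sample data matching the example CSV (including the leading spaces in the values):

```python
# Assumed sample rows, matching the example CSV above.
rows = [
    {'month': 'jan', 'sales': ' 0'},
    {'month': 'feb', 'sales': ' 2'},
    {'month': 'dec', 'sales': ' 12'},
]

# Sort by numeric sales, highest first.
sorted_desc = sorted(rows, key=lambda row: int(row['sales']), reverse=True)
highest_entry = sorted_desc[0]
print('Highest month sales in {}: {}'.format(highest_entry['month'], int(highest_entry['sales'])))
```

If only the single highest row is needed, max(rows, key=lambda row: int(row['sales'])) should give the same row without sorting the whole list.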

hc_dev