Python: Sorting data from a text file in an order based upon a new variable

Question

I have a text file that stores data in the following format:

Mike 6 4 3
Terry 4 3 4
Paul 10 7 8

jvecsei helped me yesterday with some code to retrieve the data and identify the highest score for each person. I've modified it slightly so that it now selects the scores and prints an average for each person.

with open ("classa.txt") as f:
    content = f.read().splitlines()
    for line in content:
        splitline = line.split(" ")
        name = splitline[0]
        score = splitline[1:]
        total = int(splitline[-1]) + int(splitline[-2]) + int(splitline[-3])
        average = int(total/3)
        print("{} : {}".format (name, average))

It outputs like this, which is great:

Mike : 4
Terry : 3
Paul : 8

Question: I'd really like it to sort the three people into order of highest score so that they appear with the highest scoring person at the top and the lowest scoring at the bottom, like this:

Paul : 8
Mike : 4
Terry : 3

I have used this in the past to retrieve from a text file and sort into order alphabetically but since the average is a new variable and isn't stored in the text file with the original numbers, I don't know how to reference/implement it.

with open('classc.txt', 'r') as r:
    for line in sorted(r):
        print(line, end='')

Thanks very much for your help. I'm slowly becoming more familiar with this stuff but I have a long way to go yet.

Check this link http://stackoverflow.com/questions/9001509/how-can-i-sort-a-dictionary-by-key . Note, though, that a dictionary doesn't accept duplicate keys, so if the scores are unique you can use an ordered dictionary. Remember also that the dictionary data structure does not have inherent order. Also, take a look at this link http://stackoverflow.com/questions/613183/sort-a-python-dictionary-by-value because you could print the dictionary "sorted" by value (in case you have multiple scores with the same value) – spaghettifunk, Nov 17 '15 at 22:22
That's an interesting link. I hadn't seen that before, but chances are the scores won't be unique, I'm afraid, so I'm not sure it will work in this scenario. This is only a sample of the data and there would be instances of a user with scores such as 5 5 5, which may end up causing a problem. Edit: Your second link is pretty helpful too. I'm going to give this a proper look in the morning when I'm more awake. Thanks – mjolnir, Nov 17 '15 at 22:27

letsc · Answer 1 · 2015-11-17T22:32:41.670

0

Store your Name : Average output in a dictionary and then use operator.itemgetter to sort the dictionary's items by value:

import operator

d = {}
with open("file.txt") as f:
    content = f.read().splitlines()
    for line in content:
        splitline = line.split(" ")
        name = splitline[0]
        score = splitline[1:]
        # Add up the last three fields (the scores) and truncate the mean to an int
        total = int(splitline[-1]) + int(splitline[-2]) + int(splitline[-3])
        average = int(total / 3)
        d[name] = average

# Sort the (name, average) pairs by the average, highest first
sorted_d = sorted(d.items(), key=operator.itemgetter(1), reverse=True)

for i in sorted_d:
    print('{} : {}'.format(*i))

Output:

Paul : 8
Mike : 4
Terry : 3
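As a small illustration of what the sort key in this answer does: operator.itemgetter(1) builds a function that returns element 1 of whatever it is given, so on (name, average) pairs it behaves like lambda pair: pair[1]. The sketch below assumes the averages are already in a dictionary (the values are the ones from the output above); the names averages and get_average are only for illustration.

```python
import operator

# Averages as they would sit in the dictionary after reading the file
averages = {"Mike": 4, "Terry": 3, "Paul": 8}

# itemgetter(1) returns a callable that picks index 1 of its argument
get_average = operator.itemgetter(1)

# Both calls sort the (name, average) pairs by average, highest first
by_itemgetter = sorted(averages.items(), key=get_average, reverse=True)
by_lambda = sorted(averages.items(), key=lambda pair: pair[1], reverse=True)

print(by_itemgetter)  # [('Paul', 8), ('Mike', 4), ('Terry', 3)]
```

Note that sorted() returns a list of pairs; the dictionary itself is left unchanged.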

edited Nov 17 '15 at 22:32

answered Nov 17 '15 at 22:24

letsc

With this way of doing it, would I need to have the new 'average' variable written to the file so that I can read it into the dictionary? would name = key and average = val? I'm just thinking, or rather, you've given me an idea, that I could write the new data to a temp text file and then split the data into name/average and sort on the -1 column. What do you think? EDIT: Just seen your edit. That looks good. I'll give it a try and let you know how it goes. Thanks very much. – mjolnir Nov 17 '15 at 22:45

Sebastian Wozny · Answer 2 · 2015-11-18T13:13:34.607

0

I took this problem to illustrate some of the nicer features of recent Python 3 releases. You can use the statistics module (new in Python 3.4) and extended iterable unpacking (name, *scores) to solve this in a very Pythonic way:

>>> from statistics import mean  # Cool new module!
>>> lines = (l.split() for l in open("classa.txt"))  # Generator consuming the file
>>> # Now split each row by unpacking into name, *scores
>>> persons = ((mean(int(x) for x in scores), name) for name, *scores in lines)
>>> for score in sorted(persons, reverse=True):  # Some boring I/O
...     print("{} : {}".format(score[1], int(score[0])))
...
Paul : 8
Mike : 4
Terry : 3 #  Terry really needs to step up his game

The following is more traditional Python code:

>>> def mean(x):
...     return sum(x) / len(x)
...
>>> lines = (l.split() for l in open("classa.txt"))
>>> persons = ((mean([int(x) for x in l[1:]]), l[0]) for l in lines)
>>> for score in sorted(persons, reverse=True):
...     print("{} : {}".format(score[1], int(score[0])))
...
Paul : 8
Mike : 4
Terry : 3
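Both versions work because Python compares tuples element by element: the mean is compared first, and the name only matters when two means are equal. A minimal sketch of that behaviour follows, using whole-number averages for readability; "Anna" is a hypothetical extra row added only to create a tie.

```python
# (average, name) pairs; "Anna" is a made-up row that ties with Mike
pairs = [(4, "Mike"), (3, "Terry"), (8, "Paul"), (4, "Anna")]

# reverse=True flips the whole comparison, so tied names come out Z-to-A
descending = sorted(pairs, reverse=True)

# Negating only the average keeps scores high-to-low with names A-to-Z on ties
tie_friendly = sorted(pairs, key=lambda p: (-p[0], p[1]))

print(descending)    # [(8, 'Paul'), (4, 'Mike'), (4, 'Anna'), (3, 'Terry')]
print(tie_friendly)  # [(8, 'Paul'), (4, 'Anna'), (4, 'Mike'), (3, 'Terry')]
```

Which tie order is preferable depends on the data; for the three sample rows there are no ties, so both give the same result.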

edited Nov 18 '15 at 13:13

answered Nov 17 '15 at 22:29

Sebastian Wozny

Unfortunately, I'm a little restricted to 3.3.4 at work at the moment although I'm going to try this out at home and see how it goes. – mjolnir Nov 17 '15 at 22:37
The traditional Python method looks a little more possible for me. I'll give it a try and report back. Thank you. – mjolnir Nov 17 '15 at 22:56
You're a star! P.S. Terry is a big strapping bloke but he definitely needs to try harder. I don't know how Paul ended up with the highest average, he's always drunk :-) – mjolnir Nov 17 '15 at 23:00
I'm getting the error: TypeError: mean() missing 1 required positional argument. – mjolnir Nov 18 '15 at 13:03
I know it's not the most efficient way to do it, but I put the contents of the text file in a new temp.txt file, so the data is currently averaged and formatted but not sorted. It looks like this: Dave 6. Is there a way to simply sort them now based on the value of the second column using the split function? – mjolnir Nov 18 '15 at 13:05
I think you have the wrong filename. Can you double check? – Sebastian Wozny Nov 18 '15 at 13:08
Sorry, I checked the filename and it's all correct. Same error. – mjolnir Nov 18 '15 at 13:24
I don't know what could be causing that, did you copy paste? – Sebastian Wozny Nov 18 '15 at 13:31
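On the question in the comments about sorting already-averaged lines such as "Dave 6" by their second column: one possible approach is to pass sorted() a key that splits each line and converts the second field to an integer. This is only a sketch; the function name sort_by_second_column and the rows other than "Dave 6" are made up for illustration.

```python
def sort_by_second_column(lines):
    """Return the non-blank lines ordered by their second column, highest first."""
    rows = [line.strip() for line in lines if line.strip()]  # drop blanks and newlines
    # int() matters here: compared as strings, "10" would sort before "9"
    return sorted(rows, key=lambda row: int(row.split()[1]), reverse=True)

# "Dave 6" comes from the comment above; the other rows are hypothetical
temp_lines = ["Dave 6\n", "Anna 9\n", "Raj 4\n"]
for row in sort_by_second_column(temp_lines):
    print(row)
```

With a real file, the open file object can be passed in directly, for example inside a with open("temp.txt") as f: block, since iterating over a file yields its lines.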

score 0 · Answer 3 · answered Jul 29 '19 at 07:10

You can write a function that calculates the average of the scores and then sort based on that. Note that the function does the calculation itself, so the average doesn't need to be "stored" anywhere in your original file/data:

def mysort(line):
    # Sort key: negated average (so the highest average comes first), then the name
    fields = line.split()
    score1, score2, score3 = map(int, fields[1:])
    average = (score1 + score2 + score3) / 3
    return -1 * average, fields[0]

with open("score-sheet.txt", "r") as f:
    text = f.readlines()
    for line in sorted(text, key=mysort):
        # Each line already ends with a newline, so suppress print's own
        print(line, end='')
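The key-function idea above prints the original lines. A variation that prints the "name : average" format asked for in the question, and that accepts any number of scores per line, might look like the following sketch. The helper names average_of and by_average_desc are hypothetical, and the sample rows are the ones from the question.

```python
def average_of(line):
    """Mean of all integer scores that follow the name on one line."""
    scores = [int(s) for s in line.split()[1:]]
    return sum(scores) / len(scores)

def by_average_desc(line):
    # Negating the average sorts high-to-low; the name breaks ties A-to-Z
    return -average_of(line), line.split()[0]

lines = ["Mike 6 4 3", "Terry 4 3 4", "Paul 10 7 8"]  # sample rows from the question

# int() truncates the average, matching the output shown in the question
report = ["{} : {}".format(l.split()[0], int(average_of(l)))
          for l in sorted(lines, key=by_average_desc)]
for row in report:
    print(row)
```

This assumes every line has a name followed by at least one score; a blank line would need to be skipped before sorting.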
