Suppose that song i
has been played f
i times but that Zipf’s Law predicts that it would
have been played z
i times. Then you define the quality of song i
to be q
i = f
i/z
i
.Your software should select the songs with the highest values of q
i.
The first line of input contains two integers n
and m
(1 <= n < 50 000
, 1 <= m <= n
), the number of songs on the album, and the number of songs to select. Then follow n
lines. The i
'th of these lines contains an integer f
i and string s
i, where 0 <= f
i< 10^12
is the number of times the i
'th song was listened to, and s
i is the name of the song.
Each song name is at most 30 characters long and consists only of the characters a-z
, 0-9
, and underscore (_
).
Output a list of the m songs with the highest quality q
i, in decreasing order of quality. If two songs have the same quality, give precedence to the one appearing first on the album (presumably there was a reason for the producers to put that song before the other).
sample input
4 2
30 one
30 two
15 three
25 four
sample output
four
two
I am pretty new to python and i am trying to solve this puzzle I think i am getting the right answer but I have to do it faster, any recommendations ?
from __future__ import division
def main():
import sys
from operator import itemgetter
data = sys.stdin.readlines()
line1 = data[0].split(" ")
numberofselect = line1[1]
qualitydict = {};
songdict = {};
a = 0
for x in range(1, len(data)):
item = data[x].split(" ");
item1 = item[1].split("\n");
f = float(item[0])
z = float(1/x)
qualitydict[item1[0]] = (f/z)
if ((f/z) in songdict.keys()):
songdict[(f/z)].append(item1[0])
else:
songdict[(f/z)] = [item1[0]]
items = songdict.items()
items.sort(key = itemgetter(0), reverse=True)
for key, value in items:
for element in value:
if (a < int(numberofselect)):
print element
a = a + 1
main();