0

I am new to Python and I am trying to return the 50 most common words in the lyrics of some songs, but I have a problem and I don't understand why it happens.

The name "lyrics" in the code is a string containing a song's lyrics, read from a text file. Each iteration of the loop gets a different lyrics string, which I need to include in the total count of how many times each word appears across the songs.

If someone knows where the problem is and can help, I would appreciate it.

My output contains characters instead of words: "[(' ', 46), ('o', 24), ('e', 23), ('n', 15), ('t', 15), ('h', 12), ('a', 12), ('w', 8), ('r', 8), ('s', 8), ('\n', 7), ('f', 7), ('d', 6), ('u', 5), ('y', 5), ('m', 5), ('I', 4)..." and I need to get something like: ("the", 555), ("you", 365)... without including whitespace or \n.

    count = {}
    for songs in the_dict.values():
        songs = songs[0]
        for lyrics in songs.values():
            lyrics = lyrics[2]
            count = Counter(lyrics)
    return count.most_common(50)
0m3r
roee

2 Answers

3

Call the split() method on the lyrics before counting:

    split_lyrics = lyrics[2].split()
    count = Counter(split_lyrics)

See https://www.geeksforgeeks.org/find-k-frequent-words-data-set-python/
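
To illustrate why the split matters, here is a minimal sketch. The `sample` string is made up for the example and is not taken from the question's data. A `Counter` built directly from a string iterates over its characters, while `str.split()` with no arguments breaks the text on any run of whitespace, including newlines.

```python
from collections import Counter

# Hypothetical sample text standing in for one song's lyrics.
sample = "la la la\nhey la"

# Iterating over a str yields single characters, so this counts characters,
# including spaces and the newline.
char_counts = Counter(sample)

# split() with no arguments splits on any whitespace (spaces, tabs, newlines),
# so this counts whole words.
word_counts = Counter(sample.split())

print(char_counts.most_common(3))
print(word_counts.most_common(3))
```

The first result contains entries such as `' '` and `'\n'`, which matches the character-level output shown in the question; the second contains only words.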

Eno Gerguri
2

You should split your lyrics at every whitespace character and newline, so that you get a list of words. Right now you pass the string straight to Counter, which iterates over it one character at a time and therefore counts characters.

So you should use:

    lyrics = lyrics[2].split()
fynsta
  • Yeah, that code now gives me the common words without the \n, but it only gives me the words from the last iteration. I want to include the words from all the songs together, because I have several sets of lyrics. – roee May 09 '21 at 16:46
  • So this is basically two questions? – fynsta May 09 '21 at 16:48
  • You can just add the Counters together after each iteration; then you have the words from every iteration combined: https://stackoverflow.com/questions/19356055/summing-the-contents-of-two-collections-counter-objects – fynsta May 09 '21 at 16:49
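
Putting the two suggestions together, one possible sketch is shown below. The function name `most_common_words`, the loop variable names, and the `example` data are assumptions made for illustration; the nesting only mirrors the indexing used in the question (`[0]` for the songs dict, `[2]` for the lyrics string). Instead of rebinding `count` on every iteration, a single `Counter` is created once and extended with `update()`, which adds to the existing counts.

```python
from collections import Counter


def most_common_words(the_dict, n=50):
    """Return the n most common words across every lyrics string in the_dict."""
    total = Counter()  # created once, shared by all songs
    for entry in the_dict.values():
        songs = entry[0]
        for song in songs.values():
            lyrics = song[2]
            # update() adds these word counts to the running total
            # instead of replacing it.
            total.update(lyrics.split())
    return total.most_common(n)


# Hypothetical data shaped the way the question's indexing suggests.
example = {
    "album_a": (
        {
            "song_1": ("writer", "3:10", "hello world\nhello"),
            "song_2": ("writer", "2:45", "hello again"),
        },
    ),
}

print(most_common_words(example, 1))
```

Note that this counts words case-sensitively and leaves punctuation attached, so "The" and "the," would be separate entries. If that matters, calling `lyrics.lower()` before splitting is one option.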