3

I have a list and I want to build a vocabulary from it. Then I want to show each word together with the number of times it appears in the list.

The sample list is below.

    new_list = ['one', 'thus', 'once', 'one', 'count', 'once', 'this', 'thus']

First, I created a vocabulary with the code below.

    vocabulary = []
    for i in range(0, len(new_list)):
        # keep only the first occurrence of each word
        if new_list[i] not in vocabulary:
            vocabulary.append(new_list[i])
    print(vocabulary)

The resulting vocabulary contains the words count, once, one, this and thus.

I want to show the count of each word in the list, like this: [count][1], [once][2], [one][2], [this][1], [thus][2].

To get the above result, I tried the code below.

    matris = []

    for i in range(0, len(new_list)):
        temp = []
        # stores only the count, not the word itself
        temp.insert(0, new_list.count(new_list[i]))
        matris.append(temp)

    for x in matris:
        print(x)

The above code prints only the counts. Can someone advise how I can print each word together with its count, in a format such as [once][2]?

Behzat

2 Answers

6

Use a Counter dict to get the word counts, then iterate over its .items():

from collections import Counter

new_list = ['one', 'thus', 'once', 'one', 'count', 'once', 'this', 'thus']

cn = Counter(new_list)
for k, v in cn.items():
    print("{} appears {} time(s)".format(k, v))

If you want that particular output, put the brackets in the str.format template:

for k, v in cn.items():
    print("[{}][{}]".format(k, v))

Output:

[thus][2]
[count][1]
[one][2]
[once][2]
[this][1]

To get the output from highest count to lowest use .most_common:

cn = Counter(new_list)
for k,v in cn.most_common():
    print("[{}][{}]".format(k,v))

Output:

[once][2]
[thus][2]
[one][2]
[count][1]
[this][1]

If you want the counts sorted from highest to lowest and words with equal counts sorted alphabetically, pass sorted a key that returns (-x[1], x[0]). Negating the count sorts it in descending order, while the word still sorts in ascending order:

for k, v in sorted(cn.items(), key=lambda x: (-x[1],x[0])):
    print("[{}][{}]".format(k, v))

Output:

[once][2]
[one][2]
[thus][2]
[count][1]
[this][1]
Padraic Cunningham
  • Thanks Padraic. It works well. I also want to sort the words by count from maximum to minimum: first by count, then alphabetically, such as [once][2], [one][2], [thus][2], [count][1], [this][1]. – Behzat Jun 07 '15 at 11:10
  • See the wiki: https://wiki.python.org/moin/HowTo/Sorting Especially the section on key functions. Sort by word first, THEN by value to get what you expect. Also, note that sorting is stable in Python according to this question: http://stackoverflow.com/questions/1915376/is-pythons-sorted-function-guaranteed-to-be-stable – Kevin Anderson Jun 07 '15 at 11:17
  • @Padraic it does not sort alphabetically. With "for k, v in sorted(cn.items(), key=lambda x: -x[1]):" it still sorts by count only. – Behzat Jun 07 '15 at 11:40
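
The two-pass approach suggested in the comments (sort by word first, then by count) can be sketched as follows. This is only an illustration that relies on Python's sort being stable; the name `pairs` is introduced here and is not from the original answer.

```python
from collections import Counter

new_list = ['one', 'thus', 'once', 'one', 'count', 'once', 'this', 'thus']
cn = Counter(new_list)

# Pass 1: sort by the secondary key (the word, ascending).
pairs = sorted(cn.items(), key=lambda kv: kv[0])

# Pass 2: sort by the primary key (the count, descending).
# sorted() is stable, also with reverse=True, so words with equal
# counts keep the alphabetical order established in pass 1.
pairs = sorted(pairs, key=lambda kv: kv[1], reverse=True)

for word, count in pairs:
    print("[{}][{}]".format(word, count))
```

This should give the same ordering as the single sorted call with the (-x[1], x[0]) key; the tuple key is shorter, while the two-pass form also works when a key cannot simply be negated.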
0
new_list = ['one', 'thus', 'once', 'one', 'count', 'once', 'this', 'thus']
vocabulary = list(dict.fromkeys(new_list))
print(*vocabulary, sep = "\n")

OUTPUT:
one
thus
once
count
this

#######################
matris = ["[" + str(item) + "]" + "[" + str(new_list.count(item)) + "]" for item in new_list]
print(*list(dict.fromkeys(matris)), sep = "\n")

OUTPUT:
[one][2]
[thus][2]
[once][2]
[count][1]
[this][1]
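
Calling new_list.count(item) for every item rescans the whole list each time. As a possible alternative, here is a minimal sketch that counts in a single pass with a plain dict. It assumes Python 3.7 or later, where dicts preserve insertion order; the name `counts` is introduced here for illustration.

```python
new_list = ['one', 'thus', 'once', 'one', 'count', 'once', 'this', 'thus']

# Count every word in one pass; keys are added in order of first appearance.
counts = {}
for word in new_list:
    counts[word] = counts.get(word, 0) + 1

# The dict keys double as the vocabulary (assumes insertion-ordered dicts).
vocabulary = list(counts)

for word, count in counts.items():
    print("[{}][{}]".format(word, count))
```

For this sample list the printed lines should match the output shown above.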
Jamal