Fastest way to count the occurrences of a string

Question

I'm counting strings that I read from a text file. My code already works, but I want to know whether there is a faster way to do it. My code is below.

First I read every string into a list. Then I build a list of the unique strings, and finally I use the list's count method to get the count for each one.

input.txt

shoes
memory card
earphones
led bulb
mobile
earphones
led bulb
mobile

Above is my input file.

new = []
with open("input.txt") as inf:
    for line in inf:
        line = line.strip("\n")
        new.append(line)
unique = list(set(new))
for i in unique:
    cnt = new.count(i)
    print i, cnt  # Python 2 print statement

The output should look like this:

   mobile 2
   memory card 1
   led bulb 2
   shoes 1
   earphones 2

Can you give an example of the input data? Is there a single word on each line, or something else? – Marcin, Feb 18 '15 at 06:58
Possible duplicate: http://stackoverflow.com/questions/893417/item-frequency-count-in-python Check it out for other solutions, such as collections.defaultdict or itertools.groupby. – Alex Belyaev, Feb 18 '15 at 08:34

Marcin · Accepted Answer · 2015-02-18T07:23:37.737

3

You could use collections.Counter:

from collections import Counter

with open("input.txt") as inf:
   c = Counter(l.strip() for l in inf)

Printing c gives:

Counter({'led bulb': 2, 'earphones': 2, 'mobile': 2, 'memory card': 1, 'shoes': 1})

Or, to print each string with its count:

for k,v in c.items():
    print(k,v)

which gives:

memory card 1
mobile 2
earphones 2
led bulb 2
shoes 1
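
If the order of the output matters, Counter can also return the entries sorted by count. The following is a minimal sketch rather than part of the original answer. It assumes an in-memory io.StringIO holding the question's sample data as a stand-in for input.txt, so the snippet runs without a file, and it uses most_common() to list entries from most to least frequent.

```python
import io
from collections import Counter

# Stand-in for open("input.txt"): the sample data from the question (assumption).
sample = io.StringIO(
    "shoes\nmemory card\nearphones\nled bulb\nmobile\nearphones\nled bulb\nmobile\n"
)

c = Counter(line.strip() for line in sample)

# most_common() returns (string, count) pairs sorted by descending count.
for key, n in c.most_common():
    print(key, n)

# A Counter is a dict subclass, so lookups and dict() conversion work as usual.
plain = dict(c)
```

print(c) shows the object's repr, which is why the word Counter appears in the first output above; iterating over it, or converting it with dict(c), yields the plain pairs.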

edited Feb 18 '15 at 07:23

answered Feb 18 '15 at 07:01


Hi @Marcin! With your answer, `with open("input.txt") as inf: c = Counter(l.strip() for l in inf)` and then `print(c)`, why does the word Counter appear when I print c? `Counter({'joyeux': 4, 'amour': 2, 'hello': 1})` – Papouche Guinslyzinho Feb 18 '15 at 07:16
What do you mean? Isn't that what you want? If not, please provide an example of your input data and the expected output; otherwise it's a guessing game. – Marcin Feb 18 '15 at 07:19
@s_m I updated the answer, and it now gives the results you expect. So I think it's OK? – Marcin Feb 18 '15 at 07:24

score 1 · Answer 2 · answered Feb 18 '15 at 06:59

It would be much better to count them as they come in, using a dictionary:

count = {}
with open("input.txt") as inf:
    for line in inf:
        key = line.rstrip("\n")  # strip the newline so identical strings share one key
        count[key] = count.get(key, 0) + 1

You end up with a dictionary that maps each line to its count.
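
The same single-pass idea can also be written with collections.defaultdict, which a comment on the question mentions as an alternative. This is only a sketch: the io.StringIO object and its contents are a hypothetical stand-in for the input file, so the snippet runs on its own.

```python
import io
from collections import defaultdict

# Hypothetical stand-in for the input file so the snippet runs on its own.
lines = io.StringIO("shoes\nmobile\nearphones\nmobile\n")

count = defaultdict(int)  # missing keys start at 0
for line in lines:
    count[line.rstrip("\n")] += 1  # strip the newline before using the line as a key
```

With defaultdict(int) the explicit get(key, 0) call is not needed, because a missing key is created with the value 0 on first access.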

The count method is fast because it is implemented in C, but it still has to scan the full list once for each unique string. Your implementation is therefore O(n^2) in the worst case, when all n strings are distinct.
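
To make the difference concrete, here is a rough sketch that puts the two approaches side by side. The function names and the generated test data are made up for illustration; the only claim is that both produce the same counts while doing very different amounts of work.

```python
from collections import Counter

def count_with_list_count(items):
    """The question's approach: one full scan of the list per unique string."""
    return {s: items.count(s) for s in set(items)}

def count_single_pass(items):
    """One pass over the data; each update is an average O(1) dict operation."""
    counts = {}
    for s in items:
        counts[s] = counts.get(s, 0) + 1
    return counts

# Worst case for the first function: every string is distinct (made-up data).
distinct = ["item%d" % i for i in range(2000)]

a = count_with_list_count(distinct)
b = count_single_pass(distinct)
same = a == b == dict(Counter(distinct))
```

With 2000 distinct strings the first function scans the 2000-element list 2000 times, while the second touches each element once. Timing each call with timeit should show the gap widening as the input grows.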
