Search for repeating word in text

Question

I haven't found any straight answers.

I need to find the word that is repeated most often in a text / string. For example, given a string with the following values:

000587\local_users
000587\local_users
4444\et-4444
et\pmostowiak
et\pmostowiak
et\pmostowiak

Then the result needs to be et\pmostowiak

How should I accomplish this?

EDIT: I'm using an older version of Jython, so I can't use the collections library with its Counter class.

This collects all values that are found more than once:

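d = {}

for x in users:
  # d[x] is False on the first sighting of x and True on every later one
  d[x] = x in d

# keep only the values that were seen more than once
_result = [x for x in d if d[x]] # [1]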

Could I build on this further?

I think this is a duplicate of http://stackoverflow.com/questions/2161752/how-to-count-the-frequency-of-the-elements-in-a-list ... — Sundeep, Oct 21 '16 at 11:31
if `s` is string, `import collections` and `print(collections.Counter(s.split('\n')).most_common(1)[0][0])` — Sundeep, Oct 21 '16 at 11:32
It's nearly a duplicate - unless the problem was splitting a string, which will be a duplicate too - but it's two questions really — doctorlove, Oct 21 '16 at 11:36
I think this is a better duplicate http://stackoverflow.com/questions/10390989/python-program-that-finds-most-frequent-word-in-a-txt-file-must-print-word-and — Bhargav Rao, Oct 21 '16 at 11:40

score 2 · Accepted Answer · answered Oct 21 '16 at 11:31

Once you have some iterable container of words, collections.Counter does exactly what you need.

>>> import collections
>>> words = ['000587\local_users', '000587\local_users', '4444\et-4444', 'et\pmostowiak', 'et\pmostowiak', 'et\pmostowiak']
>>> print collections.Counter(words).most_common(1)
[('et\\pmostowiak', 3)]

This raises the question of how to split the string. This works:

>>> text = """000587\local_users
... 000587\local_users
... 4444\et-4444
... et\pmostowiak
... et\pmostowiak
... et\pmostowiak"""
>>> text.split('\n')
['000587\\local_users', '000587\\local_users', '4444\\et-4444', 'et\\pmostowiak', 'et\\pmostowiak', 'et\\pmostowiak']
>>> words = text.split('\n')
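
For interpreters too old to ship collections.Counter, such as the older Jython mentioned in the question, a plain dictionary can do the counting. This is only a minimal sketch and has not been run on Jython: most_frequent is a hypothetical helper name, and it assumes the words come from splitting the text on '\n' as above. It sticks to basic loops and dict.get so that it does not depend on newer language features.

```python
def most_frequent(words):
    """Return the word that occurs most often in words, or None if words is empty."""
    counts = {}
    for word in words:
        # dict.get supplies 0 the first time a word is seen
        counts[word] = counts.get(word, 0) + 1

    best_word = None
    best_count = 0
    # walk the original list so that ties go to the word that appears first
    for word in words:
        if counts[word] > best_count:
            best_word = word
            best_count = counts[word]
    return best_word


# raw string, so the backslashes are kept literally
text = r"""000587\local_users
000587\local_users
4444\et-4444
et\pmostowiak
et\pmostowiak
et\pmostowiak"""

print(most_frequent(text.split('\n')))
```

If the counts themselves are needed as well, the counts dictionary could be returned alongside the word.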

doctorlove

Sorry guys, my bad: – Toube Oct 21 '16 at 11:43
I kind of forgot to mention that I'm dealing with an older version of Jython, so the collections library is not supported. Sorry about this. – Toube Oct 21 '16 at 11:44
@user2023042 In that case you should probably update the actual question and tags to reflect this. – Totem Oct 21 '16 at 11:47
@user2023042 once you have the words, you could figure out how to count the frequencies; a dictionary of words to their counts would work – doctorlove Oct 21 '16 at 11:50
counting freq with dictionary: http://stackoverflow.com/a/2161792/4082052 – Sundeep Oct 21 '16 at 11:52
Thanks, I'll check that dictionary approach out – Toube Oct 21 '16 at 12:05