Find multiple occurring string in array and output index

Question

I have an array filled with e-mail addresses which change constantly, e.g.:

mailAddressList = ['chip@plastroltech.com','spammer@example.test','webdude@plastroltech.com','spammer@example.test','spammer@example.test','support@plastroltech.com']

How do I find multiple occurrences of the same string in the array and output its indexes?

What have you tried yourself, and what is your expected output? – Mazdak May 07 '15 at 07:41

score 2 · Accepted Answer · answered May 07 '15 at 07:42

Just group the indexes by email and print only those items where the length of the index list is greater than 1:

from collections import defaultdict
mailAddressList = ['chip@plastroltech.com',
    'spammer@example.test',
    'webdude@plastroltech.com',
    'spammer@example.test',
    'spammer@example.test',
    'support@plastroltech.com'
]

index = defaultdict(list)
for i, email in enumerate(mailAddressList):
    index[email].append(i)

print([(email, positions) for email, positions in index.items()
       if len(positions) > 1])
# [('spammer@example.test', [1, 3, 4])]
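
The same grouping can be wrapped in a small helper so it can be reused on a list whose contents keep changing. This is only a sketch: the function name `duplicate_positions` and the `sample` list are made up for illustration and are not part of the answer above.

```python
from collections import defaultdict


def duplicate_positions(items):
    """Map each item that occurs more than once to the list of its indexes."""
    index = defaultdict(list)
    for i, item in enumerate(items):
        index[item].append(i)
    # Keep only the items that were seen at more than one position.
    return {item: positions for item, positions in index.items()
            if len(positions) > 1}


# Hypothetical sample data with two different repeated addresses.
sample = ['a@example.test', 'b@example.test', 'a@example.test',
          'c@example.test', 'b@example.test']
print(duplicate_positions(sample))
```

For this sample, `a@example.test` maps to `[0, 2]` and `b@example.test` maps to `[1, 4]`, while the address that occurs only once is left out.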

This answer is beautiful. You are awesome :D – elhombre May 07 '15 at 09:28

score 0 · Answer 2 · answered May 07 '15 at 07:37

Try this:

query = 'spammer@example.test'
indexes = [i for i, x in enumerate(mailAddressList) if x == query]

Output:

[1, 3, 4]

Alex Lisovoy

I don't know what e-mail address I am searching for as they change constantly. So this won't help – elhombre May 07 '15 at 07:41

score 0 · Answer 3 · answered May 07 '15 at 07:56

In [7]: import collections
In [8]: q=collections.Counter(mailAddressList).most_common()

In [9]: indexes = [i for i, x in enumerate(mailAddressList) if x == q[0][0]]

In [10]: indexes
Out[10]: [1, 3, 4]
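
Note that `q[0][0]` is only the single most common address, so the session above reports one address even when several different addresses are repeated. A minimal sketch that covers every repeated address, assuming the same `mailAddressList` as in the question (the names `counts` and `duplicate_indexes` are illustrative):

```python
from collections import Counter

mailAddressList = ['chip@plastroltech.com', 'spammer@example.test',
                   'webdude@plastroltech.com', 'spammer@example.test',
                   'spammer@example.test', 'support@plastroltech.com']

# Count every address once, then collect indexes only for the repeated ones.
counts = Counter(mailAddressList)
duplicate_indexes = {
    address: [i for i, x in enumerate(mailAddressList) if x == address]
    for address, n in counts.items()
    if n > 1
}
print(duplicate_indexes)
# {'spammer@example.test': [1, 3, 4]}
```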

Ajay

marmeladze · Answer 4 · 2015-05-07T08:38:52.700

Note: the solutions submitted earlier are more Pythonic than mine, but in my opinion the lines I've written are easier to understand. I simply create a dictionary, then add the mail addresses as keys and their indexes as values.

First, declare an empty dictionary.

>>> dct = {}

Then iterate over the mail addresses (m) and their indexes (i) in mailAddressList and add them to the dictionary.

>>> for i, m in enumerate(mailAddressList):
...     if m not in dct:
...         dct[m] = [i]
...     else:
...         dct[m].append(i)
...

Now dct looks like this.

>>> dct
{'support@plastroltech.com': [5], 'webdude@plastroltech.com': [2], 
'chip@plastroltech.com': [0], 'spammer@example.test': [1, 3, 4]}

There are many ways to grab the [1, 3, 4]. One of them (also not so Pythonic :) ):

>>> [i for i in dct.values() if len(i)>1][0]
[1, 3, 4]

Or this:

>>> [i for i in dct.items() if len(i[1])>1][0] #you can add [1] to get [1,3,4]
('spammer@example.test', [1, 3, 4])
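
The if/else in the loop can also be written with `dict.setdefault`, and the trailing `[0]` can be dropped so that every repeated address is kept rather than only the first one found. A sketch under the assumption that `mailAddressList` is the list from the question (the name `duplicates` is illustrative):

```python
mailAddressList = ['chip@plastroltech.com', 'spammer@example.test',
                   'webdude@plastroltech.com', 'spammer@example.test',
                   'spammer@example.test', 'support@plastroltech.com']

dct = {}
for i, m in enumerate(mailAddressList):
    # setdefault returns the list already stored for m,
    # or stores a new empty list first and returns that.
    dct.setdefault(m, []).append(i)

# Keep every address that has more than one index.
duplicates = {m: indexes for m, indexes in dct.items() if len(indexes) > 1}
print(duplicates)
# {'spammer@example.test': [1, 3, 4]}
```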

score 0 · Answer 5 · edited May 23 '17 at 10:24

Here's a dictionary comprehension solution:

result = {email: [i for i, other in enumerate(mailAddressList) if other == email] for email in mailAddressList}
# Gives you: {'webdude@plastroltech.com': [2], 'support@plastroltech.com': [5], 'spammer@example.test': [1, 3, 4], 'chip@plastroltech.com': [0]}

It's not ordered, of course, since it's a hash table (before Python 3.7 a dict does not guarantee any order; from 3.7 on it keeps insertion order). If you want it sorted by key, you can use the OrderedDict collection. For instance, like so:

from collections import OrderedDict
final = OrderedDict(sorted(result.items(), key=lambda t: t[0]))
# Gives you: OrderedDict([('chip@plastroltech.com', [0]), ('spammer@example.test', [1, 3, 4]), ('support@plastroltech.com', [5]), ('webdude@plastroltech.com', [2])])

This discussion is less relevant, but it might also prove useful to you.

EvenLisle · Answer 6 · 2015-05-07T08:49:29.413

mailAddressList = ["chip@plastroltech.com","spammer@example.test","webdude@plastroltech.com","spammer@example.test","spammer@example.test","support@plastroltech.com"]
print([index for index, address in enumerate(mailAddressList) if mailAddressList.count(address) > 1])

This prints [1, 3, 4], the indices of the addresses occurring more than once in the list.
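
Calling `list.count` inside the comprehension rescans the whole list for every element, so the work grows quadratically with the list length. A hedged sketch that produces the same flat list with a single counting pass, using `collections.Counter` (the names `counts` and `repeated` are illustrative):

```python
from collections import Counter

mailAddressList = ["chip@plastroltech.com", "spammer@example.test",
                   "webdude@plastroltech.com", "spammer@example.test",
                   "spammer@example.test", "support@plastroltech.com"]

# One pass to count, one pass to pick the indexes of repeated addresses.
counts = Counter(mailAddressList)
repeated = [index for index, address in enumerate(mailAddressList)
            if counts[address] > 1]
print(repeated)
# [1, 3, 4]
```

Either way the result is a flat list, so if two different addresses were repeated, their indexes would be mixed together without saying which address each index belongs to.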

edited May 07 '15 at 08:49

answered May 07 '15 at 08:43
