Occurrence of non-repeated strings in a list of lists of strings (update with more conditions)

Question

I have a list of lists like this one:

List=[['Jhon', 'Rana'], ['Rana', 'Rana'], ['Rana', 'Zhang'], ['Rana', ['Z', 'Y']]]

Update: This list can also contain another structure, where the first element is a single string and the second element is a list, like ['Rana', ['Z', 'Y']]. This means that Rana has a specific relationship with Z and Y, which is not the same relationship as the one between Rana and Jhon.

I want to count the occurrences of the words in this list, and I need two kinds of output. In the first one, when an inner list contains a duplicated (repeated) word, we ignore that inner list. In the second one, when we detect a repeated word, we count it once, not twice.

Update: I want this type of relation to be included in the second output.

For example, for the first solution the result will be Rana:2 Jhon:1 Zhang:1

For the second solution it will be Rana:3 (Update: 5 instead of 3, since we also consider Z and Y) Jhon:1 Zhang:1

I have tried the following lines of code, but I did not get the expected results:

from collections import Counter
List1=[["Rana", "Jhon"], ["Rana", "Rana"], ["Jhon", "Rana"], ["Rana", "Alex"]]
count=0
n=0
for j in range (0, len(List1)-1):
  if (List1[j][0] == List1[j][1] ) or (List1[j][0] != List1[j][1] ):
    count += 1
print(count)
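
A quick check of the condition used in the loop above, as a sketch only (the names `always_true` and `count` here are illustrative): a test of the form `a == b or a != b` holds for any two values, so it filters nothing, and `range(0, len(List1)-1)` stops one element short of the end.

```python
List1 = [["Rana", "Jhon"], ["Rana", "Rana"], ["Jhon", "Rana"], ["Rana", "Alex"]]

# "a == b or a != b" is true for every pair, so it cannot select anything.
always_true = all((a == b) or (a != b) for a, b in List1)

# The loop therefore just counts its iterations; range(0, len(List1) - 1)
# yields 0, 1, 2, so the last pair is never visited.
count = sum(1 for j in range(0, len(List1) - 1))

print(always_true)  # True
print(count)        # 3
```

So the printed value depends only on the length of the list, not on which names repeat.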

"If they're the same, or they're not the same..." is a strange thing to test. — tadman, Jun 05 '23 at 14:11

Talha Tayyab · Accepted Answer · 2023-06-05T15:56:27.133

1

For the first part, you can ignore the inner lists with repeating values by:

Lst=[['Jhon', 'Rana'], ['Rana', 'Rana'], ['Rana', 'Zhang']]

l = [x for x in Lst if len(set(x))==len(x)]     #len(set()) will make sure only unique elements are there in inner-list
#[['Jhon', 'Rana'], ['Rana', 'Zhang']]

#Flatten l by

m = [item for sublist in l for item in sublist]
#['Jhon', 'Rana', 'Rana', 'Zhang']

from collections import Counter as c
c(m)

#output
Counter({'Jhon': 1, 'Rana': 2, 'Zhang': 1})

set(x) converts x into a set, which keeps only the unique values, so len(set(x)) is the number of distinct elements in x. If x contains repeating values, len(set(x)) will NOT be equal to len(x).
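
As a small illustration of that comparison (a sketch; `report` and `kept` are names chosen here, not part of the answer), the length of each inner list can be printed next to its number of distinct values:

```python
Lst = [['Jhon', 'Rana'], ['Rana', 'Rana'], ['Rana', 'Zhang']]

# For each inner list: (the list, its length, its number of distinct values).
report = [(x, len(x), len(set(x))) for x in Lst]
for pair, n_items, n_unique in report:
    # The last column is True only when nothing repeats inside the pair.
    print(pair, n_items, n_unique, n_items == n_unique)

# Keep only the inner lists where the two lengths agree.
kept = [x for x in Lst if len(set(x)) == len(x)]
print(kept)  # [['Jhon', 'Rana'], ['Rana', 'Zhang']]
```

Only ['Rana', 'Rana'] has 2 items but 1 distinct value, so it is the only inner list filtered out.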

For the second part, if an inner list contains the same value twice, you can keep only one copy of it by:

Lst=[['Jhon', 'Rana'], ['Rana', 'Rana'], ['Rana', 'Zhang']]

l = [x if len(set(x))==len(x) else [x[0]] for x in Lst ]
#[['Jhon', 'Rana'], ['Rana'], ['Rana', 'Zhang']]

#Flatten l by:

n = [item for sublist in l for item in sublist]
#['Jhon', 'Rana', 'Rana', 'Rana', 'Zhang']

from collections import Counter as c
c(n)

#output
Counter({'Jhon': 1, 'Rana': 3, 'Zhang': 1})

Edit:

Now, if you have a list with reversed elements like:

Lst1=[["Rana", "Jhon"], ["Rana", "Rana"], ["Jhon", "Rana"], ["Rana", "Alex"]]

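You can remove ["Jhon", "Rana"] (the reverse of a pair already kept) and ["Rana", "Rana"] (a pair with a repeated value) by:

#len(set(x)) == len(x) drops pairs with a repeated value such as ["Rana", "Rana"];
#the seen set drops a pair whose reverse has already been kept.

seen = set()
new_Lst1 = [x for x in Lst1 if len(set(x)) == len(x) and tuple(x[::-1]) not in seen and not seen.add(tuple(x))]

print(new_Lst1)
#[['Rana', 'Jhon'], ['Rana', 'Alex']]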


#Flatten new_Lst1 by

p = [item for sublist in new_Lst1 for item in sublist]
#['Rana', 'Jhon', 'Rana', 'Alex']

from collections import Counter as c
c(p)

#output
Counter({'Rana': 2, 'Jhon': 1, 'Alex': 1})
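
An alternative sketch for the same clean-up, not the answer author's code: it assumes every inner list is a flat pair of strings and uses a `frozenset` as an order-insensitive key, so a reversed pair and an exact repeat of a pair are both treated as already seen. The names `unique_pairs` and `counts` are illustrative.

```python
from collections import Counter

Lst1 = [["Rana", "Jhon"], ["Rana", "Rana"], ["Jhon", "Rana"], ["Rana", "Alex"]]

seen = set()
unique_pairs = []
for pair in Lst1:
    key = frozenset(pair)      # same key for ["Rana", "Jhon"] and ["Jhon", "Rana"]
    if len(key) < len(pair):   # a value repeats inside the pair, e.g. ["Rana", "Rana"]
        continue
    if key in seen:            # this relation was already kept, in either order
        continue
    seen.add(key)
    unique_pairs.append(pair)

counts = Counter(name for pair in unique_pairs for name in pair)
print(unique_pairs)  # [['Rana', 'Jhon'], ['Rana', 'Alex']]
print(counts)        # Counter({'Rana': 2, 'Jhon': 1, 'Alex': 1})
```

An explicit loop is used instead of a comprehension with a side effect, which makes the order of the checks easier to follow.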

edited Jun 05 '23 at 15:56

answered Jun 05 '23 at 14:26

Talha Tayyab


Hello, I tried the code, but it only checks whether the same string appears twice in a list, in which case it takes only one. The case where I have the same list but with the positions swapped is not handled, like the example [[Rana, Jhon], [Jhon, Rana]] – spain engy Jun 05 '23 at 14:39
@spainengy If the position is changed, what do you want in output? – Talha Tayyab Jun 05 '23 at 14:48
The output should count it once, not twice. [Rana, Jhon], [Jhon, Rana] ---> Rana 1, Jhon 1 – spain engy Jun 05 '23 at 15:12
gotcha.. let me update my answer – Talha Tayyab Jun 05 '23 at 15:12
In addition, if possible, can you explain the expression you use to extract the sublists that have no duplicates: l = [x for x in Lst if len(set(x))==len(x)]. – spain engy Jun 05 '23 at 15:14
@spainengy did you see the updated answer? – Talha Tayyab Jun 06 '23 at 06:38
Yes, thank you. I checked the code and it works for my list. Now I will try to check other conditions. I hope I will manage this by myself; otherwise I will come back – spain engy Jun 08 '23 at 08:49

score 1 · Answer 2 · answered Jun 05 '23 at 15:46

From what I can understand of your description, using https://stackoverflow.com/a/30357006:

from collections import defaultdict

list_1 = [['Jhon', 'Rana'], ['Rana', 'Rana'], ['Rana', 'Zhang']]
# list_1 = [['Rana', 'Jhon'], ['Jhon', 'Rana']]

seen = []

for sub in list_1:
    sub = sorted(sub)
    if sub not in seen:
        seen.append(sub) 

# res_1: pairs with a repeated value are ignored
# res_2: a repeated value is counted once
res_1 = defaultdict(int)
res_2 = defaultdict(int)

for sub in seen:
    a, b = sub
    if a == b:
        res_2[a] += 1
    else:
        res_2[a] += 1
        res_2[b] += 1
        res_1[a] += 1
        res_1[b] += 1


print(dict(res_1)) #=> {'Jhon': 1, 'Rana': 2, 'Zhang': 1}
print(dict(res_2)) #=> {'Jhon': 1, 'Rana': 3, 'Zhang': 1}

Or add more cases, as already commented.
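
One such extra case is the nested form from the question's update, ['Rana', ['Z', 'Y']]. The following is only a sketch under stated assumptions, not part of the answer above: it assumes the nested list means one relation per name (Rana with Z, Rana with Y), and that Z and Y are counted as well, which the question's expected output does not spell out. The helper `expand` and the name `data` are hypothetical, and reversed pairs are not deduplicated here.

```python
from collections import Counter

data = [['Jhon', 'Rana'], ['Rana', 'Rana'], ['Rana', 'Zhang'], ['Rana', ['Z', 'Y']]]

def expand(pair):
    """Turn ['Rana', ['Z', 'Y']] into ['Rana', 'Z'] and ['Rana', 'Y']; flat pairs pass through."""
    first, second = pair
    if isinstance(second, list):
        return [[first, other] for other in second]
    return [pair]

counts = Counter()
for pair in data:
    for a, b in expand(pair):
        # The set {a, b} collapses a repeated value, so ['Rana', 'Rana'] adds 1 to Rana, not 2.
        counts.update({a, b})

print(sorted(counts.items()))
# [('Jhon', 1), ('Rana', 5), ('Y', 1), ('Z', 1), ('Zhang', 1)]
```

Under these assumptions Rana reaches 5, matching the updated second output in the question; if Z and Y should not appear in the result, they can be filtered out afterwards.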

Thank you, your solution is good and easy to understand. Now I will work with more conditions, and if I find problems I will come back with more cases and conditions. – spain engy Jun 08 '23 at 08:51
