Is there an efficient way to check whether a list contains any element other than a given filler value?

I have many lists that look like this:

a = ["_", "_", "_", "_"]
b = ["_", "b", "_", "_"]
c = ["a", "_", "x", "_"]

Most elements are underscores, but some lists contain an element that is not an underscore. I only want to process the lists with "content". At the moment, I use for-loops to check whether a list has a non-underscore element. Is there a more efficient way?

khelwood
Hemmelig
  • Looping of some sort will be required. What processing do you do and what is the resulting data structure? – user2390182 May 05 '21 at 09:32
  • I am looping through an NLP dataset, searching for semantic roles in specific columns. Most columns don't contain any, so they are empty. Also, the number of columns equals the number of verbs in a sentence, so I can't be sure about the number of elements either. Maybe it's worth counting the underscores: if the count differs from the length, the list must contain something? – Hemmelig May 05 '21 at 09:34

2 Answers

6

For the bare boolean check, you can do something fancy like this, using map and any:

for lst in (a, b, c):
    print(any(map("_".__ne__, lst)))
# False
# True
# True

Or be a little more explicit:

any(x != "_" for x in lst)
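
To filter the lists before processing, the generator form could be wrapped in a small helper. This is only a minimal sketch: has_content and its filler parameter are hypothetical names chosen for illustration, not something from the question.

```python
def has_content(lst, filler="_"):
    """Return True if lst holds at least one element other than filler."""
    # any() stops at the first non-filler element (short-circuit).
    return any(x != filler for x in lst)


a = ["_", "_", "_", "_"]
b = ["_", "b", "_", "_"]
c = ["a", "_", "x", "_"]

# Keep only the lists worth processing.
to_process = [lst for lst in (a, b, c) if has_content(lst)]
print(to_process)
# [['_', 'b', '_', '_'], ['a', '_', 'x', '_']]
```

An empty list yields False here, since any() of an empty iterable is False.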

any has the advantage of short-circuiting (stopping at the first hit), compared to an approach that always inspects the whole list, such as checking the set length:

len(set(lst)) > 1

Whether that makes it faster in practice depends on your data. If most of your lists contain only underscores and are therefore iterated in their entirety, the set conversion (C-optimized) may well be faster.
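
One way to settle this for your own data is a quick timeit comparison. The sketch below rests on assumptions: the helper names and the two sample lists are made up for illustration, and no particular timings are claimed. It also includes the counting idea from the comments, and shows a case where the set test gives a different answer, namely a list made of a single repeated non-underscore value.

```python
import timeit


def by_any(lst):
    return any(x != "_" for x in lst)


def by_count(lst):
    # Counting idea from the comments: fewer underscores than elements.
    return lst.count("_") != len(lst)


def by_set(lst):
    # Only equivalent to the others when the list contains an underscore.
    return len(set(lst)) > 1


all_filler = ["_"] * 50          # worst case for any(): full scan
early_hit = ["x"] + ["_"] * 49   # best case for any(): stops at index 0

for name, func in (("any", by_any), ("count", by_count), ("set", by_set)):
    for label, data in (("all filler", all_filler), ("early hit", early_hit)):
        seconds = timeit.timeit(lambda: func(data), number=10_000)
        print(f"{name:5} {label:10} {seconds:.4f}s")

# The set test misses content when every element is the same non-filler value.
print(by_any(["a", "a"]), by_count(["a", "a"]), by_set(["a", "a"]))
# True True False
```

If lists without any underscore can occur in your data, the any or count variants are the safer choice.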

user2390182
1

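Use zip to iterate over the lists position by position, collect the unique items at each position in a set, and check the size of that set. Note that this tests each position (column) across a, b and c, not each list individually: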

a = ["_", "_", "_", "_"]
b = ["_", "b", "_", "_"]
c = ["a", "_", "x", "_"]

has_value = [len(set(col)) > 1 for col in zip(a, b, c)]
print(has_value)

output:

[True, True, True, False]
Tom McLean