I have two lists.

x=['billed_qty','billed_amt','sale_value']

y=['george billed_qty', 'sam billed_amt', 'ricky sale_value', 'donald billed_qty']

I need to remove from each string in y the words that occur in x, so that the resulting list is:

z=['george','sam','ricky','donald']

How can I achieve this?

Thanks

petezurich
vicky
  • I hate to be *"that guy"* but you might want to look into reducing the complexity of your data structures. – Malekai Jun 18 '19 at 13:31
  • @LogicalBranch Nothing wrong with being that guy who gives sage advice instead of rushing to answer the question. – cs95 Jun 18 '19 at 16:56

6 Answers

Use a regex with a list comprehension:

import re

# Escape each keyword so it is matched literally
comp = re.compile('|'.join(map(re.escape, x)))
z = [comp.sub('', i).strip() for i in y]

print(z)
['george', 'sam', 'ricky', 'donald']
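
A plain alternation also matches inside longer words, so a token such as billed_qty_total would be partly stripped. The sketch below is one possible way to guard against that, assuming the keywords are always separated from the rest of the string by whitespace. The function name strip_keywords is illustrative and not part of the original answer.

```python
import re

def strip_keywords(items, keywords):
    """Remove whitespace-delimited occurrences of any keyword from each string."""
    # re.escape makes each keyword match literally; the lookarounds require
    # whitespace (or the string boundary) on both sides of the match
    alternation = "|".join(map(re.escape, keywords))
    pattern = re.compile(r"(?<!\S)(?:" + alternation + r")(?!\S)")
    # split/join collapses the whitespace left behind by the removal
    return [" ".join(pattern.sub("", s).split()) for s in items]

x = ['billed_qty', 'billed_amt', 'sale_value']
y = ['george billed_qty', 'sam billed_amt', 'ricky sale_value', 'donald billed_qty']

z = strip_keywords(y, x)
print(z)

# A longer token that merely starts with a keyword is left untouched
untouched = strip_keywords(['george billed_qty_total'], x)
print(untouched)
```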
Space Impact

Use str.join with str.split in a list comprehension:

z = [' '.join(w for w in s.split() if w not in x) for s in y]
print(z)

Output:

['george', 'sam', 'ricky', 'donald']
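
As a variation on the same idea, if x is large it may help to convert it to a set first, since set membership tests take constant time on average. This is only a sketch: the names unwanted and drop_words are illustrative, and the 'mary ann' row is made up to show that multi-word names keep their order.

```python
x = ['billed_qty', 'billed_amt', 'sale_value']
y = ['george billed_qty', 'sam billed_amt', 'ricky sale_value', 'donald billed_qty']

# Build the lookup set once instead of scanning the list for every word
unwanted = set(x)

def drop_words(s, words):
    """Return s without the whitespace-separated words found in `words`."""
    return ' '.join(w for w in s.split() if w not in words)

z = [drop_words(s, unwanted) for s in y]
print(z)

# Hypothetical extra row: the remaining words keep their original order
multi = drop_words('mary ann billed_qty', unwanted)
print(multi)
```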
Chris

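Why not use a set difference? Note that sets are unordered, so this is only safe when a single word remains per entry: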

print([' '.join(set(i.split()).difference(set(x))) for i in y])

Output:

['george', 'sam', 'ricky', 'donald']
U13-Forward

I don't know if it covers all your cases, but a simple solution would be this:

for i in x:
  for idx, k in enumerate(y):
    y[idx] = k.replace(" "+i, "")

For every value in list x, this replaces each occurrence in the strings of list y (together with the space on its left) with an empty string. Note that it modifies y in place.
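
If y should stay unchanged and the result should go into a separate list z, as the question asks, the same replacement can be sketched like this. It relies on the same assumption as the loop above, namely that each keyword is preceded by a single space.

```python
x = ['billed_qty', 'billed_amt', 'sale_value']
y = ['george billed_qty', 'sam billed_amt', 'ricky sale_value', 'donald billed_qty']

z = []
for s in y:
    # Strings are immutable, so rebinding s here does not touch y
    for kw in x:
        s = s.replace(" " + kw, "")
    z.append(s)

print(z)
```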

user1798707

First, split each element of y:

for i in range(0,len(y)):
    y[i] = y[i].split(' ')

So, y is:

[['george', 'billed_qty'], ['sam', 'billed_amt'], ['ricky', 'sale_value'], ['donald', 'billed_qty']]

Now check whether any element of x occurs in the second part of each entry:

for i in range(len(y)):
    for j in range(len(x)):
        if x[j] in y[i][1]:
            y[i] = y[i][0]
            break  # y[i] is now a string, so stop checking it

y becomes:

['george', 'sam', 'ricky', 'donald']
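
The two steps above can also be sketched without modifying y, using str.rpartition to separate the last word from the rest. This assumes the keyword, when present, is always the last space-separated word; the 'solo' row is a made-up example of an entry with no keyword.

```python
x = ['billed_qty', 'billed_amt', 'sale_value']
y = ['george billed_qty', 'sam billed_amt', 'ricky sale_value', 'donald billed_qty']

def drop_trailing_keyword(s, keywords):
    """Drop the last word of s if it is one of the keywords."""
    # rpartition splits at the last space: (before, separator, after)
    head, _, tail = s.rpartition(' ')
    return head if tail in keywords else s

z = [drop_trailing_keyword(s, x) for s in y]
print(z)

# Hypothetical entry with no keyword: returned unchanged
unchanged = drop_trailing_keyword('solo', x)
print(unchanged)
```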
Hamed Baziyad

You can solve this with itertools.

The solution is as follows:

import itertools

z = [i.split() for i in y]

# This gives us z = [['george', 'billed_qty'], ['sam', 'billed_amt'], ['ricky', 'sale_value'], ['donald', 'billed_qty']]

w = list(itertools.chain.from_iterable(z))

# This gives us w = ['george', 'billed_qty', 'sam', 'billed_amt', 'ricky', 'sale_value', 'donald', 'billed_qty']

output = [a for a in w if a not in x]

# This gives us output = ['george', 'sam', 'ricky', 'donald']
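
The three steps can be condensed into a single pass by feeding a generator to itertools.chain.from_iterable, as sketched below. Because the words are flattened, the result lines up with y only when exactly one word remains per entry, which holds for the sample data.

```python
from itertools import chain

x = ['billed_qty', 'billed_amt', 'sale_value']
y = ['george billed_qty', 'sam billed_amt', 'ricky sale_value', 'donald billed_qty']

# Lazily split every string and flatten the words into one stream
words = chain.from_iterable(s.split() for s in y)

# Keep only the words that are not in x
output = [w for w in words if w not in x]
print(output)
```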

Urvi Soni