
Here is a list of tuples in which each word is tagged with its type:

t = [('The','OTHER'),('name','OTHER'),('is','OTHER'),('Wall','ORGANIZATION'),('Mart','ORGANIZATION'),('and','OTHER'),('Thomas','ORGANIZATION'),('Cook','ORGANIZATION')]

I want to check whether consecutive tuples are tagged as 'ORGANIZATION' and, if so, concatenate their words with a space, repeating this over the entire list.

Expected output:

Wall Mart, Thomas Cook

org_list = ''  # must be initialized before the loop, otherwise a NameError is raised
for x in t:
    if x[1] == 'ORGANIZATION':
        org_list = org_list + ' | ' + x[0]

With this I was only able to extract the individual words; I could not find a way to concatenate consecutive words tagged as 'ORGANIZATION'.

I referred to another question: Concatenate elements of a tuple in a list in python


2 Answers


Assuming there is always at least one 'OTHER' tuple between two separate organization names, one approach is to use itertools.groupby to group consecutive tuples by their second element, and then str.join the first items of each group whose key is 'ORGANIZATION':

t = [('The','OTHER'),('name','OTHER'),('is','OTHER'),('Wall','ORGANIZATION'),
     ('Mart','ORGANIZATION'),('and','OTHER'),('Thomas','ORGANIZATION'),
     ('Cook','ORGANIZATION')]

from itertools import groupby
from operator import itemgetter as g

[' '.join(i[0] for i in v) for k, v in groupby(t, key=g(1)) if k == 'ORGANIZATION']
# ['Wall Mart', 'Thomas Cook']
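
The question asks for a single comma-separated string rather than a list, so the grouped names can be joined once more. A minimal sketch of that step, assuming the same input `t`; the names `org_names` and `result` are chosen here for illustration and do not appear in the question:

```python
from itertools import groupby
from operator import itemgetter

t = [('The', 'OTHER'), ('name', 'OTHER'), ('is', 'OTHER'),
     ('Wall', 'ORGANIZATION'), ('Mart', 'ORGANIZATION'), ('and', 'OTHER'),
     ('Thomas', 'ORGANIZATION'), ('Cook', 'ORGANIZATION')]

# Group consecutive tuples by their tag, keep only the ORGANIZATION runs,
# and join the words of each run with a space.
org_names = [
    ' '.join(word for word, _ in group)
    for tag, group in groupby(t, key=itemgetter(1))
    if tag == 'ORGANIZATION'
]

# Join the names into the single string from the expected output.
result = ', '.join(org_names)
print(result)  # Wall Mart, Thomas Cook
```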

If you prefer a for-loop solution without any imports, you can do the following. Note that it works only when each organization name consists of exactly two consecutive tagged words:

f = False
out = []
for i in t:
    if i[1] == 'ORGANIZATION':
        if not f:
            out.append(i[0])
            f = True
        else:
            out[-1] += f' {i[0]}'
            f = False

print(out)
# ['Wall Mart', 'Thomas Cook']
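
The loop above toggles a flag, so it relies on every name having exactly two words. One possible way to lift that restriction, still without imports, is to compare each tag with the previous one. This is only a sketch: the function name `join_tagged_runs` and the second input `extra` are hypothetical and chosen for illustration.

```python
def join_tagged_runs(pairs, tag='ORGANIZATION'):
    """Join consecutive words carrying `tag` into space-separated names."""
    out = []
    prev_tag = None
    for word, current_tag in pairs:
        if current_tag == tag:
            if prev_tag == tag:
                # Still inside the same run: extend the last name.
                out[-1] += f' {word}'
            else:
                # A new run starts here.
                out.append(word)
        prev_tag = current_tag
    return out


t = [('The', 'OTHER'), ('name', 'OTHER'), ('is', 'OTHER'),
     ('Wall', 'ORGANIZATION'), ('Mart', 'ORGANIZATION'), ('and', 'OTHER'),
     ('Thomas', 'ORGANIZATION'), ('Cook', 'ORGANIZATION')]
print(join_tagged_runs(t))
# ['Wall Mart', 'Thomas Cook']

# Hypothetical input with a one-word and a three-word name.
extra = [('Apple', 'ORGANIZATION'), ('and', 'OTHER'),
         ('Bank', 'ORGANIZATION'), ('of', 'ORGANIZATION'),
         ('America', 'ORGANIZATION')]
print(join_tagged_runs(extra))
# ['Apple', 'Bank of America']
```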
yatu

You can use the following solution:

t = [('The','OTHER'),('name','OTHER'),('is','OTHER'),('Wall','ORGANIZATION'),('Mart','ORGANIZATION'),('and','OTHER'),('Thomas','ORGANIZATION'),('Cook','ORGANIZATION')]

result = [[]]
for i, j in t:
    if j == 'ORGANIZATION':
        result[-1].append(i)
    elif result[-1]:
        result.append([])

result = [' '.join(i) for i in result if i]
# ['Wall Mart', 'Thomas Cook']
Mykola Zotko