
I have a list of (token, tag) tuples that looks like the following:

 token_tags = [('book', 'noun'),
               ('run', 'noun'),
               (',', ','),
               ('book', 'verb'),
               ('run', 'adj'),
               ('run', 'verb')]

I am trying to find out how many times a token was first tagged as a 'noun' and then as a 'verb' at its next appearance in the list. So I should not count 'run', because it was tagged as an adjective between its 'noun' and 'verb' assignments. Any suggestions on how to do that?

I have converted the list of tuples into a dict as follows:

d = {}
for x, y in token_tags:
    d.setdefault(x, []).append(y)

So, now d contains:

 {'book': ['noun', 'verb'], 'run': ['noun', 'adj', 'verb'], ',': [',']}

I tried to solve this with regular expressions, but it did not work.

Nina
  • The problem description makes sense. So what's your question? – Brian Cain Sep 20 '16 at 01:34
  • SO isn't a code writing service, can you show what you have tried? One thing to consider is transforming this list of tuples into an alternative data structure that would make it easier to examine the order of tag assignment (e.g.: `{token:[tags]}`) – AChampion Sep 20 '16 at 01:37
  • Do not post the same question as another question that is the same as this one you just posted - http://stackoverflow.com/questions/39582639/counting-items-inside-tuples-in-python –  Sep 20 '16 at 02:14
  • @JarrodRoberson it is not the same – Nina Sep 20 '16 at 02:16
  • Honestly this edit is not much of an improvement. It does have some code, which is unrelated, and it now has a claim about trying regex, but it is still **fundamentally a send me teh codez** question at its basis. A well-formatted one, but one nonetheless. –  Sep 20 '16 at 02:20
  • Could you explain what didn't work? What specific error did you encounter? What output did you get, and why isn't that what you expected? – Makoto Sep 20 '16 at 02:34

1 Answer


Now that you have the tags in a dictionary, counting how many times a certain pair appears is simple. The idea is to take every two consecutive elements in a token's tag list and check whether they are the desired pair, for example:

>>> data = {'book': ['noun', 'verb'], 'run': ['noun', 'adj', 'verb'], ',': [',']}
>>> result={}
>>> for token, tag_list in data.items():
...     count = 0
...     # compare each tag with the one immediately before it
...     for i in range(1, len(tag_list)):
...         if tag_list[i-1] == "noun" and tag_list[i] == "verb":
...             count += 1
...     result[token] = count
...
>>> result
{',': 0, 'book': 1, 'run': 0}
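The same idea can also be sketched as a single function that starts from the original list of tuples and pairs each tag with the next one using `zip`. This is only a sketch: the function name `count_noun_then_verb` and its `first`/`second` parameters are made up for illustration and are not part of the question.

```python
from collections import defaultdict


def count_noun_then_verb(token_tags, first="noun", second="verb"):
    """Hypothetical helper: per token, count how often `first` is
    immediately followed by `second` among that token's own tags."""
    # Group tags by token, preserving the order in which they appear.
    tags_by_token = defaultdict(list)
    for token, tag in token_tags:
        tags_by_token[token].append(tag)

    counts = {}
    for token, tags in tags_by_token.items():
        # zip(tags, tags[1:]) pairs each tag with the one right after it.
        counts[token] = sum(
            1 for prev, cur in zip(tags, tags[1:])
            if prev == first and cur == second
        )
    return counts


token_tags = [('book', 'noun'), ('run', 'noun'), (',', ','),
              ('book', 'verb'), ('run', 'adj'), ('run', 'verb')]

counts = count_noun_then_verb(token_tags)
total = sum(counts.values())  # overall number of noun -> verb transitions
print(counts)
print(total)
```

Summing the per-token counts gives one overall number, which may be closer to what the question asks for; 'run' contributes nothing because 'adj' sits between its 'noun' and 'verb' tags.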
Copperfield