I am getting started with NLP in spaCy. I am using the EntityRuler with a Spanish model, and I would like to know whether it is possible to write a pattern that matches the verb form and the tense at the same time.

In the Spanish model, the TAG attributes look something like:

AUX__Mood=Sub|Number=Sing|Person=3|Tense=Pres|VerbForm=End

I need a pattern to match Tense=Pres and VerbForm=Fin

Based on this question I have tried to modify the regex:

matcher.add("MOOD_SUB", [[{"TAG": {"REGEX": ".*Tense=Pres|VerbForm=End.*"}}]])

but I have not been able to obtain the expected results. Any ideas?

  • Hey, this is a great question. Could you add an example of input and correct output for people who understand spaCy but not Spanish like myself? – polm23 Nov 09 '20 at 03:44
  • For the record, the code from the question you link to seems to be able to select things with your regex, though I had to modify "End" to "Fin" to match the example sentence given there, since I'm not sure what the VerbForm values mean. – polm23 Nov 09 '20 at 03:48
  • Hello, thank you for your interest in helping me. I studied regex a bit and was able to get the result I expected with the following pattern: `.*Tense=Past.*VerbForm=Fin` – Diego Alvarez Nov 09 '20 at 23:18
  • Oh. I guess maybe the pipe was an issue, but it looks like the main problem was you used `End` instead of `Fin` then... – polm23 Nov 10 '20 at 02:47
  • Yeah, the pipe was the issue. But you are right, it is `VerbForm=Fin`, not `VerbForm=End`. It was misspelled when I translated and wrote the question. – Diego Alvarez Nov 10 '20 at 03:46
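
To illustrate the pipe issue discussed in the comments: in a regex, an unescaped `|` means "or", so `.*Tense=Pres|VerbForm=Fin.*` accepts a tag that contains either feature, not both. Below is a minimal sketch using only Python's `re` module. It assumes the `REGEX` predicate searches the tag string the way `re.search` does, and the second tag string is made up for illustration; it is not taken from the model.

```python
import re

# Tag in the format shown in the question (with VerbForm=Fin).
tag_pres_fin = "AUX__Mood=Sub|Number=Sing|Person=3|Tense=Pres|VerbForm=Fin"
# Illustrative tag (an assumption, not from the model): finite, but past tense.
tag_past_fin = "VERB__Mood=Ind|Number=Sing|Person=3|Tense=Past|VerbForm=Fin"

# Unescaped pipe: alternation, i.e. "Tense=Pres" OR "VerbForm=Fin".
either = re.compile(r".*Tense=Pres|VerbForm=Fin.*")
# Escaped pipe: the literal "|" separator, so both features must be adjacent.
both_adjacent = re.compile(r"Tense=Pres\|VerbForm=Fin")
# ".*" between the features: both must occur, in this order.
both_in_order = re.compile(r"Tense=Pres.*VerbForm=Fin")

results = {
    name: (bool(rx.search(tag_pres_fin)), bool(rx.search(tag_past_fin)))
    for name, rx in [
        ("either", either),
        ("both_adjacent", both_adjacent),
        ("both_in_order", both_in_order),
    ]
}

for name, (pres_fin, past_fin) in results.items():
    print(f"{name}: present+finite={pres_fin}, past+finite={past_fin}")
```

Only the first pattern also accepts the past-tense tag. The last two patterns rely on `Tense` appearing before `VerbForm` in the tag string, as it does in the example above; if the order could vary, a lookahead per feature would be a safer choice.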
