
I'm trying to extract verbs from German sentences. The problem shows up, for example, in this sentence:

Ich rufe noch einmal an.

I'm getting rufe as the verb, but it should be anrufe. I'm using TextBlob and don't really know anything about linguistics. While using TextBlob I came across POS tags: it tagged an as "RP" (I don't know what that means) and rufe as "VB". I could just glue every "RP" and "VB" together, but then again there could be more than one verb in a sentence.

What is the right way of doing this?

  • Hello! What kind of data do you want to extract from the text? A list of verbs, or verb phrases? – roddar92 Oct 20 '20 at 11:05
  • I want to extract the verbs in a given German sentence, using textblob.download_corpora. It's a Python package. I don't know if that helps. – Sisam Khanal Oct 20 '20 at 12:26

1 Answer


If I understand correctly, download_corpora is a module that ships with textblob, and you run it once after installing the package, like this:

$ pip install -U textblob
$ python -m textblob.download_corpora

Then, you can use textblob for text analysis:

>>> from textblob import TextBlob
>>> test = TextBlob("Ich rufe noch einmal an.")
>>> test.tags
[('Ich', 'PRON'), ('rufe', 'VB'), ..., ('an', 'RP')]

Another useful extension for German is textblob-de: https://pypi.org/project/textblob-de/

Maybe this answer will help you dig deeper into POS tagging, because your POS tagger probably uses the Penn Treebank tags listed there ('VB' and 'RP' both come from that tagset): Java Stanford NLP: Part of Speech labels?

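P.S. In German, the word an is part of the verb here: it is the separable prefix of anrufen, which moves to the end of the clause. 'RP' stands for particle. Hence, the POS tags 'VB' and 'RP' belong to one and the same verb.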
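
To address the worry about several verbs in one sentence, here is a minimal sketch of the gluing idea. It works on a plain list of (word, tag) pairs such as the one returned by test.tags, and attaches each 'RP' particle to the nearest preceding verb that has no particle yet. The function name extract_verbs is hypothetical, and the tags for noch and einmal are assumed for illustration, since the output above elides them. This is only a heuristic: it will not handle subordinate clauses or every word order.

```python
def extract_verbs(tags):
    """Return the verbs found in a list of (word, tag) pairs.

    Heuristic: a particle ('RP') is prefixed to the nearest preceding
    verb (any tag starting with 'VB') that has no particle yet.
    """
    verbs = []  # each entry is [verb, particle or None]
    for word, tag in tags:
        if tag.startswith("VB"):
            verbs.append([word, None])
        elif tag == "RP" and verbs and verbs[-1][1] is None:
            verbs[-1][1] = word
    # Glue the particle in front of its verb, e.g. 'an' + 'rufe'.
    return [(particle or "") + verb for verb, particle in verbs]


# Tags for 'noch' and 'einmal' are assumed here for illustration.
tags = [("Ich", "PRON"), ("rufe", "VB"), ("noch", "RB"),
        ("einmal", "RB"), ("an", "RP")]
print(extract_verbs(tags))  # ['anrufe']
```

Because each particle is attached only to the closest preceding verb, a sentence with two separable verbs yields two separate results instead of one merged string.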

roddar92