Any recommendations for languages/libraries to convert a sentence like:
"X bumped Y, who in turn kicked Z."
to
- X: Bumped
- Y: Was bumped, kicked Z
I would suggest you use the Stanford Parser (http://nlp.stanford.edu/software/lex-parser.shtml), which is open source and relatively simple, as these things go. With it, you can extract a typed dependency parse. A dependency parse decomposes a sentence into a set of binary relations r(B, A), where word A grammatically depends on word B.
Take your sentence
X bumped Y, who in turn kicked Z.
Here, both X and Y depend on bumped for their grammatical role. The Stanford Parser would extract the following relations for them:
nsubj(bumped, X)
dobj(bumped, Y)
This means the subject of bumped is X and the direct object of bumped is Y. You could then use this information to make a grammatical relation: bumped(X, Y). Likewise, the Stanford Parser extracts the following relations for the rest of the sentence:
nsubj(kicked, who)
rcmod(Y, kicked)
dobj(kicked, Z)
In this case, the subject of kicked is "who", and kicked is attached to Y as an rcmod (relative clause modifier). I'm not sure what the goal of your system is, but you would probably find that you need to write a number of rules by hand to cover such situations. Here, your rule could replace the nsubj "who" with the noun that the rcmod points back to, in order to produce kicked(Y, Z).
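As a rough illustration of such a hand-written rule, here is a minimal Python sketch. It does not call the Stanford Parser; it assumes the relations listed above have already been read into (relation, head, dependent) tuples. The names DEPENDENCIES, RELATIVE_PRONOUNS and extract_predicates are made up for this example, and the words are used as dictionary keys for brevity, so a sentence that repeats a word would need something more careful.

```python
# Typed dependencies as (relation, head, dependent) tuples, copied from the
# relations listed above. This data layout is an assumption of the sketch.
DEPENDENCIES = [
    ("nsubj", "bumped", "X"),
    ("dobj", "bumped", "Y"),
    ("nsubj", "kicked", "who"),
    ("rcmod", "Y", "kicked"),
    ("dobj", "kicked", "Z"),
]

# Hypothetical list of pronouns the rule is allowed to resolve.
RELATIVE_PRONOUNS = {"who", "whom", "which", "that"}


def extract_predicates(dependencies):
    """Return (verb, subject, object) triples built from typed dependencies."""
    subjects = {}     # verb -> its nsubj
    objects = {}      # verb -> its dobj
    antecedents = {}  # verb of a relative clause -> noun the clause modifies
    for relation, head, dependent in dependencies:
        if relation == "nsubj":
            subjects[head] = dependent
        elif relation == "dobj":
            objects[head] = dependent
        elif relation == "rcmod":
            antecedents[dependent] = head

    predicates = []
    for verb, subject in subjects.items():
        # Hand-written rule: a relative pronoun acting as the subject of a
        # relative clause stands for the noun that the clause modifies.
        if verb in antecedents and subject.lower() in RELATIVE_PRONOUNS:
            subject = antecedents[verb]
        predicates.append((verb, subject, objects.get(verb)))
    return predicates


for verb, subject, obj in extract_predicates(DEPENDENCIES):
    print(f"{verb}({subject}, {obj})")
```

Each new construction (passives, conjunctions, and so on) would likely need its own rule of this kind.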
For more information on using the Stanford Parser typed dependencies, there is an excellent tutorial on the subject at the Stanford Parser website (http://nlp.stanford.edu/software/dependencies_manual.pdf).
To blatantly rip off this answer, why not try the Natural Language Toolkit?
The Stanford Parser, as suggested by ealdent, would do the job. I would prefer to encode it as:
A POS tagger could also work, but your sentence is complicated ("who in turn").
Apart from the Stanford parser, RASP is a possibility too - it can produce lists of grammatical relations as part of its output. See this question.
It looks like you are interested in identifying the semantic roles in the sentence. SRL tools tag the entities with their corresponding roles. You can play with the demo of one of the tools.
In the relation bumped, X is tagged as A0 (agent) and Y is tagged as A1 (patient).
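To show how role labels like these could be turned into the kind of per-entity list the question asks for, here is a small Python sketch. It does not run any SRL tool: the FRAMES list is hand-written, the dictionary layout is an assumption rather than the output format of a particular tool, and the frame for kicked is assumed by analogy with the one described for bumped. The summarize function and the exact wording of the output are also just one possible choice.

```python
# Hand-written frames standing in for SRL output. The layout is an assumption;
# the "kicked" frame is assumed by analogy with the "bumped" frame.
FRAMES = [
    {"predicate": "bumped", "A0": "X", "A1": "Y"},
    {"predicate": "kicked", "A0": "Y", "A1": "Z"},
]


def summarize(frames):
    """Group what each entity did (A0) or had done to it (A1)."""
    summary = {}
    for frame in frames:
        verb = frame["predicate"]
        agent = frame.get("A0")
        patient = frame.get("A1")
        if agent is not None:
            phrase = verb if patient is None else f"{verb} {patient}"
            summary.setdefault(agent, []).append(phrase)
        if patient is not None:
            summary.setdefault(patient, []).append(f"was {verb}")
    return summary


for entity, phrases in summarize(FRAMES).items():
    print(f"- {entity}: {', '.join(phrases)}")
```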