TL;DR
You have to define the ultimate task you want to perform and define what exactly is "intent" / "main information" or "meaning of text".
In Long
At first glance, it seems like you're asking to solve a natural language problem magically. But let's look at the question and at what you're really asking. Let's set aside the notions of intent/labels or language (for a while) and just look at the in-/output:
[in]: "How Can I raise my voice against harassment"
[out]: "raise voice against harassment"
[in]: "Donald Duck is created by which cartoonist/which man/whom ?"
[out]: "Donald duck is created by"
[in]: "How to retrieve the main intent of a sentence using spacy or nltk ?"
[out]: "retrieve main intent of sentence using spacy nltk"
It seems like all your output tokens/words are just quoted from your input. In that case, what if you simply treat your problem as a "span/sequence annotation" task, i.e.
[in]: "How Can I raise my voice against harassment"
[out]: [0, 0, 0, 1, 0, 1, 1, 1]
[in]: "Donald Duck is created by which cartoonist/which man/whom ?"
[out]: [1, 1, 1, 1, 1, 0, 0, 0, 0]
[in]: "How to retrieve the main intent of a sentence using spacy or nltk ?"
[out]: [0, 0, 1, 0, 1, 1, 1, 0, 1, 1, 1, 0, 1, 0]
Assuming that each (whitespace-separated) word gets a binary label, the output should be 1 for the words that you want to extract from the input and 0 for the ones you don't.
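To make the labeling concrete, here's a minimal sketch that derives such binary labels from an (input, output) pair. It assumes whitespace tokenization and that the output words appear in the input in the same order; `binary_labels` is a made-up helper name, not something from spaCy or NLTK.

```python
def binary_labels(sentence, extracted):
    """Label each whitespace token 1 if it is kept in `extracted`, else 0.

    Assumes `extracted` is an in-order subsequence of `sentence`
    (compared case-insensitively).
    """
    tokens = sentence.split()
    wanted = [w.lower() for w in extracted.split()]
    labels = []
    i = 0  # position of the next extracted word still to be matched
    for tok in tokens:
        if i < len(wanted) and tok.lower() == wanted[i]:
            labels.append(1)
            i += 1
        else:
            labels.append(0)
    return labels


print(binary_labels("How Can I raise my voice against harassment",
                    "raise voice against harassment"))
```

The matching is greedy, so if a kept word also occurs earlier in the sentence as a word you wanted to drop, the earlier occurrence gets the 1. For real data you'd annotate the labels directly.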
Now, given that it's a simple binary sequence labeling task, one could simply train a sequence tagger on such (sentence, labels) pairs.
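A toy illustration of treating this as binary sequence labeling: the sketch below "trains" by counting, per word, how often it was labeled 0 or 1, and predicts the most frequent label. This is only a baseline I made up for illustration (the names `train_baseline` and `predict` are hypothetical); a real tagger would more likely be a CRF or a neural model that looks at context.

```python
from collections import Counter, defaultdict


def train_baseline(examples):
    """Count, per lowercased word, how often it was labeled 0 and 1."""
    counts = defaultdict(Counter)
    for sentence, labels in examples:
        for word, label in zip(sentence.split(), labels):
            counts[word.lower()][label] += 1
    return counts


def predict(counts, sentence, default=1):
    """Tag each token with its most frequent training label.

    Words never seen in training get `default` (an arbitrary choice: keep them).
    """
    labels = []
    for word in sentence.split():
        seen = counts.get(word.lower())
        labels.append(seen.most_common(1)[0][0] if seen else default)
    return labels


examples = [
    ("How Can I raise my voice against harassment", [0, 0, 0, 1, 0, 1, 1, 1]),
    ("How to retrieve the main intent of a sentence using spacy or nltk ?",
     [0, 0, 1, 0, 1, 1, 1, 0, 1, 1, 1, 0, 1, 0]),
]
model = train_baseline(examples)
print(predict(model, "How can I retrieve my voice"))
```

With two training sentences this obviously won't generalize; the point is only that once the labels are defined, the task becomes an ordinary supervised learning problem.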
But step back a little,
- Is it really true that the intent is always part of the input?
- What exactly is an intent? How is it defined?
- What happens if the intent is not in the input?
Okay, even if we don't talk about "intent" and just want to extract the main meaning,
- What exactly is the meaning of the sentence? Is it just the "important words"? If so, what makes the words "important"? How is "important" defined?
- Are only the non-stopwords important? If so, then you can simply remove the stopwords, e.g. Stopword removal with NLTK. And then, what exactly are stopwords?
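If "important" really does mean "not a stopword", the whole task collapses to a filter. Here's a minimal sketch; the `STOPWORDS` set is a tiny hand-picked list that I'm assuming purely for illustration. With NLTK you'd typically take the list from `nltk.corpus.stopwords.words("english")` instead (after downloading the corpus), and a real list may well disagree with this one about words such as "against".

```python
# A tiny, hand-picked stopword set for illustration only (an assumption,
# not NLTK's actual list).
STOPWORDS = {"how", "can", "i", "my", "to", "the", "of", "a", "or",
             "is", "by", "which"}


def remove_stopwords(sentence, stopwords=STOPWORDS):
    """Keep the whitespace tokens whose lowercased form is not a stopword."""
    return [tok for tok in sentence.split() if tok.lower() not in stopwords]


print(remove_stopwords("How Can I raise my voice against harassment"))
```

Notice that the result depends entirely on which words you put in the list, which is just another way of asking how "important" is defined.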
But I've heard of people doing it with dependency parsing
What is dependency parsing?
In short, it provides a structured representation of text. But none of the structures in traditional dependency formalisms has a notion of "intent".
Proof: Ctrl + F for "intent" on https://web.stanford.edu/~jurafsky/slp3/15.pdf
So I don't think just simply parsing the text with dependency trees would help unless the notion of "intent" is better defined in your scenario.
How about this spaCy tool that trains a model for intent?
From https://github.com/explosion/spaCy/blob/master/examples/training/train_intent_parser.py
Yes, that's an example of using a combination of parsing labels and sequence labeling and defining that as "intent". More specifically, we see these examples at https://github.com/explosion/spaCy/blob/master/examples/training/train_intent_parser.py#L31
TRAIN_DATA = [
    (
        "find a cafe with great wifi",
        {
            "heads": [0, 2, 0, 5, 5, 2],  # index of token head
            "deps": ["ROOT", "-", "PLACE", "-", "QUALITY", "ATTRIBUTE"],
        },
    ),
    (
        "find a hotel near the beach",
        {
            "heads": [0, 2, 0, 5, 5, 2],
            "deps": ["ROOT", "-", "PLACE", "QUALITY", "-", "ATTRIBUTE"],
        },
    ),
    # ... more examples in the linked script
]
Each training example is made up of
- text
- the index of the dependency head
- the "intent" labels related to the dependency head
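Since the three parts have to line up token by token, a quick sanity check on each example can save some debugging. The sketch below is my own helper (`check_example` is a hypothetical name, not part of spaCy), and it assumes that whitespace tokenization matches the tokenization the annotation was written for.

```python
def check_example(text, annotation):
    """Return a list of problems found in one (text, annotation) pair."""
    tokens = text.split()  # assumes whitespace tokens match the annotation
    heads, deps = annotation["heads"], annotation["deps"]
    problems = []
    if len(heads) != len(tokens):
        problems.append(f"{len(tokens)} tokens but {len(heads)} heads")
    if len(deps) != len(tokens):
        problems.append(f"{len(tokens)} tokens but {len(deps)} deps")
    for i, head in enumerate(heads):
        if not 0 <= head < len(tokens):
            problems.append(f"head {head} of token {i} is out of range")
    return problems


example = (
    "find a cafe with great wifi",
    {
        "heads": [0, 2, 0, 5, 5, 2],
        "deps": ["ROOT", "-", "PLACE", "-", "QUALITY", "ATTRIBUTE"],
    },
)
print(check_example(*example))  # an empty list means nothing looked wrong
```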
And an example of the in/outputs from https://github.com/explosion/spaCy/blob/master/examples/training/train_intent_parser.py#L173
[in]: find a hotel with good wifi
[out]:
[
    ('find', 'ROOT', 'find'),
    ('hotel', 'PLACE', 'find'),
    ('good', 'QUALITY', 'wifi'),
    ('wifi', 'ATTRIBUTE', 'hotel')
]
The example above shows that the whole list of triplets is defined as the intent, rather than just the raw strings. Each triplet is (dependent, relation, head), e.g. the hotel is the PLACE to find, from the triplet ('hotel', 'PLACE', 'find').
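To see how the heads and labels turn into (dependent, relation, head) triplets, here's a minimal sketch that builds them straight from the annotation format shown in the training data. It reads the hand-written annotations rather than a trained model's predictions, assumes whitespace tokenization, and `intent_triplets` is a name I made up for illustration.

```python
def intent_triplets(text, heads, deps, ignore="-"):
    """Build (dependent, relation, head) triplets, skipping unlabeled tokens."""
    tokens = text.split()  # assumes whitespace tokens match the annotation
    return [
        (tokens[i], dep, tokens[head])
        for i, (head, dep) in enumerate(zip(heads, deps))
        if dep != ignore
    ]


triplets = intent_triplets(
    "find a cafe with great wifi",
    heads=[0, 2, 0, 5, 5, 2],
    deps=["ROOT", "-", "PLACE", "-", "QUALITY", "ATTRIBUTE"],
)
for triplet in triplets:
    print(triplet)
```

Tokens labeled "-" carry no intent information in this scheme, so they are dropped; everything else is kept together with the word it attaches to.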
Note: This is solely spaCy's notion of "semantics" or "intent". It is not wrong; it is well-defined, and hence a model to perform this task is trainable in a supervised machine learning paradigm. For details, see https://spacy.io/usage/examples
Depending on how and what you define as intent/semantics, the in/outputs will change and the model to train may be different.
But why do you have to make it so complicated, I just want the intent string?!
Because what does "main meaning" or "intent" mean if it's just a string?
We go back to the lack of definition that makes the task a magical one rather than one that computers can perform.