
I am working on very large documents {news + articles}, classifying natural-language sentences into classes. Please look at the following example:

1- The System enables a user to shut down the server remotely ==> class 1

2- The Application allows a customer to close the machine online ==> (must also be) class 1. Why?

Because both sentences contain many synonym pairs {System ~ Application, enables ~ allows, user ~ customer, shut down ~ close, server ~ machine, remotely ~ online}. So I am training a classifier on some data using similarity rules (synonyms of the words) plus stemming and maybe lemmatization; the more rules there are, the better the results should be.
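To make the synonym idea concrete, here is a minimal sketch that maps each word to a canonical form before comparing sentences. The `CANONICAL`, `PHRASES`, and `STOPWORDS` tables and the `normalize` function are hypothetical names written by hand for this example; a real system would more likely derive synonyms from a thesaurus such as WordNet, and stemming or lemmatization is left out here for brevity.

```python
import re

# Hypothetical hand-written synonym table: each word maps to a canonical form.
CANONICAL = {
    "application": "system",
    "allows": "enables",
    "customer": "user",
    "close": "shut_down",
    "machine": "server",
    "online": "remotely",
}

# Multi-word phrases are merged into a single token before the lookup.
PHRASES = {"shut down": "shut_down"}

# Function words that carry no class information in this toy example.
STOPWORDS = {"the", "a", "an", "to"}


def normalize(sentence):
    """Lowercase, merge phrases, drop stopwords, map synonyms to one form."""
    text = sentence.lower()
    for phrase, token in PHRASES.items():
        text = text.replace(phrase, token)
    tokens = re.findall(r"[a-z_]+", text)
    return [CANONICAL.get(t, t) for t in tokens if t not in STOPWORDS]


s1 = normalize("The System enables a user to shut down the server remotely")
s2 = normalize("The Application allows a customer to close the machine online")
print(s1)
print(s2)
print(s1 == s2)
```

After normalization both sentences produce the same token list, so any classifier trained on these tokens sees them as the same input.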

So the question is: what is the best strategy to configure/adjust the classifier for these ideas? Thank you in advance.

MWiesner

1 Answer


Have you taken a look at this?

Is there an algorithm that tells the semantic similarity of two phrases

The most important step is to determine what similarity means. Once you have done that, choosing a classifier is the easy part of the task (ID3, C4.5, naive Bayes, etc., for example over bag-of-words features).
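As a rough illustration of "define similarity first, then classify", the sketch below uses Jaccard overlap of word sets as the similarity and assigns the label of the most similar labelled sentence. Both choices are assumptions made for this example, not the algorithms named above, and `tokens`, `jaccard`, `classify`, and `TRAINING` are hypothetical names.

```python
def tokens(sentence):
    """Split a sentence into a set of lowercase words."""
    return set(sentence.lower().split())


def jaccard(a, b):
    """Jaccard similarity: size of the shared word set over the combined set."""
    ta, tb = tokens(a), tokens(b)
    if not ta and not tb:
        return 1.0
    return len(ta & tb) / len(ta | tb)


def classify(sentence, labelled):
    """Return the label of the labelled sentence most similar to `sentence`."""
    best = max(labelled, key=lambda pair: jaccard(sentence, pair[0]))
    return best[1]


# Tiny made-up training set, for illustration only.
TRAINING = [
    ("the system enables a user to shut down the server remotely", "class 1"),
    ("the report lists all invoices from last month", "class 2"),
]

query = "The system lets a user shut down the server"
print(classify(query, TRAINING))
```

Swapping in a different similarity function (for instance one that treats synonyms as equal) changes the behaviour without touching `classify`, which is why the similarity definition deserves most of the effort.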

Community
rpd