
I have trained a Hidden Markov Model (HMM) tagger to extract some user-defined entities. I am now trying to run a classifier to extract various relationships and to resolve ambiguity among the extracted entities. For both of these supervised algorithms I have kept 80% of the data for training and 20% for testing. Since I am not comparing the performance of different models, I am not keeping any data for validation or cross-validation. Is this approach fine? I have tried to read some material: a Stackexchange Post, Previous post1, Previous Post2, and a Wikipedia Article.
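A minimal sketch of the 80/20 split described above, using NLTK's supervised HMM trainer; the example sentences and the PERS/LOC/O tag set are hypothetical placeholders, not the actual data:

```python
# Sketch: 80/20 train/test split for a supervised HMM tagger.
# The sentences below are made-up placeholders for the real annotated data.
import random
from nltk.tag import hmm
from nltk.probability import LidstoneProbDist

tagged_sents = [
    [("Alice", "PERS"), ("lives", "O"), ("in", "O"), ("Paris", "LOC")],
    [("Bob", "PERS"), ("visited", "O"), ("Berlin", "LOC")],
    [("Carol", "PERS"), ("works", "O"), ("in", "O"), ("London", "LOC")],
    [("Dave", "PERS"), ("moved", "O"), ("to", "O"), ("Madrid", "LOC")],
    [("Eve", "PERS"), ("flew", "O"), ("to", "O"), ("Rome", "LOC")],
]

random.seed(42)
random.shuffle(tagged_sents)
split = int(0.8 * len(tagged_sents))            # 80% for training
train_sents, test_sents = tagged_sents[:split], tagged_sents[split:]

trainer = hmm.HiddenMarkovModelTrainer()
# Lidstone smoothing so unseen words in the test set get nonzero probability.
tagger = trainer.train_supervised(
    train_sents,
    estimator=lambda fd, bins: LidstoneProbDist(fd, 0.1, bins))

# Tagging accuracy on the held-out 20%; this measures generalization of
# the single chosen model, not model selection.
print(tagger.evaluate(test_sents))
```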

  • Fine with what? And if you are not comparing anything, then why do you have a test set at all? – lejlot Apr 26 '16 at 18:14
  • I meant to ask: is my approach fine? On the test data I am comparing the chosen model's performance against its performance on the training data. By "comparison" I meant a performance comparison among various models. – Coeus2016 Apr 26 '16 at 18:21
  • In order to answer whether the setting is OK, we will need an exact description of what you do and what you are **trying to achieve**. – lejlot Apr 26 '16 at 18:33
  • I have user-defined labels. I am training an HMM tagger on them, as if to extract named entities. Suppose the data in one document is tagged with PERS and LOC; I then take the base strings of these tagged entities, label each pair with a relation such as PERS-LOC, and train a simple multiclass classifier, say Naive Bayes (NB), on this new data to extract relations. – Coeus2016 Apr 26 '16 at 19:17
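A hedged sketch of the second step described in this comment thread: a multiclass Naive Bayes classifier over entity-pair strings with relation labels like PERS-LOC, again with an 80/20 split. The bag-of-words features, the scikit-learn pipeline, and the toy examples are illustrative assumptions, not the poster's actual setup:

```python
# Sketch: Naive Bayes relation classifier over entity-pair strings.
# Examples, labels, and features are illustrative assumptions.
from sklearn.feature_extraction.text import CountVectorizer
from sklearn.model_selection import train_test_split
from sklearn.naive_bayes import MultinomialNB
from sklearn.pipeline import make_pipeline

# Each example: the surface strings of an entity pair plus nearby context
# words; each label: the relation between the two entity types.
pairs = [
    "Alice lives in Paris",
    "Bob visited Berlin",
    "Carol works in London",
    "Acme employs Dave",
    "Globex hired Eve",
]
labels = ["PERS-LOC", "PERS-LOC", "PERS-LOC", "ORG-PERS", "ORG-PERS"]

X_train, X_test, y_train, y_test = train_test_split(
    pairs, labels, test_size=0.2, random_state=42)

model = make_pipeline(CountVectorizer(), MultinomialNB())
model.fit(X_train, y_train)
print(model.score(X_test, y_test))   # accuracy on the held-out 20%
```

Since no hyperparameter tuning or model selection happens in either step, the held-out 20% serves purely as a final generalization check, which is consistent with skipping a separate validation set.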

0 Answers