
What's the difference between text classification and sentence classification? Articles seem to treat them differently: a paper will present research on either text classification or sentence classification.

I wonder: if one applied sentence classification to a whole text and then labelled the paragraph with the class assigned to most of its sentences, would that count as proper text classification? Or does text classification have a different 'catch'?
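
To make the idea concrete, here is a minimal sketch of the majority-vote approach described above. Everything in it is an assumption for illustration: `toy_sentence_classifier` is a hypothetical keyword-based stand-in for a real trained sentence classifier, and `split_sentences` is a deliberately naive splitter.

```python
import re
from collections import Counter


def split_sentences(text):
    # Naive splitter (assumption): break after '.', '!' or '?' followed by whitespace.
    parts = re.split(r"(?<=[.!?])\s+", text.strip())
    return [p for p in parts if p]


def toy_sentence_classifier(sentence):
    # Hypothetical stand-in for a trained sentence classifier:
    # counts hits from two tiny hand-picked word lists.
    positive = {"good", "great", "nice"}
    negative = {"bad", "awful", "poor"}
    words = set(re.findall(r"[a-z']+", sentence.lower()))
    score = len(words & positive) - len(words & negative)
    if score > 0:
        return "positive"
    if score < 0:
        return "negative"
    return "neutral"


def classify_text_by_majority(text, classify_sentence=toy_sentence_classifier):
    # Classify every sentence, then return the most frequent label.
    # Ties go to the label that appeared first; empty input gives None.
    labels = [classify_sentence(s) for s in split_sentences(text)]
    if not labels:
        return None
    return Counter(labels).most_common(1)[0][0]


review = "The food was great. The service was good. The decor was awful."
label = classify_text_by_majority(review)
```

Any function that maps a sentence to a label can be passed in as `classify_sentence`, so the voting step is independent of how the sentences themselves are classified.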

Cheshie
  • @adi92, thanks for the reference (and a very nice answer too!) I notice that your answer and lejlot's are quite the opposite. Do you have any comment on what he wrote? – Cheshie May 05 '14 at 12:32
  • @Cheshie both our answers seem to say that there is no real difference. What makes you say that our answers are opposite? – Aditya Mukherji May 06 '14 at 05:30
  • @adi92, lejlot says that sentence classification is the same as text classification, just smaller. You said that, while the two are similar, you approach them differently. In sentence classification, you need to `squeeze each training instance for all the information it can give you`, meaning adding word order, POS tags, maybe skipping feature selection... I believe that is slightly different from the way you approach text classification, and that it's not only a smaller problem. – Cheshie May 06 '14 at 14:31
  • That was more of a side comment. In any ML task, when each individual training instance is small, you are more likely to need to be cleverer when extracting a feature vector from it. When you are classifying speeches by politicians (which might be long), a 0-1 feature vector indicating the presence or absence of certain words might be good enough for classification. When classifying tweets, since you have less text to work with, you might have to get cleverer by looking at POS tags, time since the previous tweet, number of retweets, etc. – Aditya Mukherji May 06 '14 at 16:27
  • Thanks @adi92. That 'side comment' of yours was the closest answer I've found till now (upvoted) :-) – Cheshie May 06 '14 at 20:08
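
The 0-1 presence/absence feature vector mentioned in the comments can be sketched roughly as follows. This is only an illustration under stated assumptions: the function names, the regex tokenizer and the sample documents are all made up here, and a real system would more likely use a library vectorizer.

```python
import re


def tokenize(text):
    # Simple lowercase word tokenizer (assumption, not a linguistic standard).
    return re.findall(r"[a-z']+", text.lower())


def build_vocabulary(documents):
    # Map every distinct word in the training documents to a column index.
    words = sorted({tok for doc in documents for tok in tokenize(doc)})
    return {word: i for i, word in enumerate(words)}


def presence_vector(text, vocabulary):
    # 1 if the vocabulary word occurs in the text at least once, else 0.
    # Word order and repeat counts are discarded; unknown words are ignored.
    vector = [0] * len(vocabulary)
    for tok in set(tokenize(text)):
        index = vocabulary.get(tok)
        if index is not None:
            vector[index] = 1
    return vector


# Hypothetical training documents, used only to fix the column order.
speeches = ["taxes taxes jobs", "jobs schools"]
vocab = build_vocabulary(speeches)
features = presence_vector("Taxes and more taxes", vocab)
```

For a long speech, a vector like this usually has many non-zero entries. For a tweet it is almost entirely zeros, which is why the comment suggests adding further signals such as POS tags or retweet counts.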

1 Answer


A task (or problem) is defined by what is to be done, not how. So it does not matter how you approach text classification: if you classify text, it is text classification. That's all. You could toss a coin to classify it, and it would still "count as proper text classification" if it achieved good scores.

Sentence classification can be seen as a "smaller-scale" problem, since the term text classification is mostly used for bigger chunks of text (like documents). But there are no strict lines drawn here. I would rather treat text classification as a general umbrella term covering word-level tasks (like POS tagging), sentence classification, sentiment analysis (at the level of words, sentences, paragraphs, or documents), etc.

lejlot