I am new to Mahout. I need to convert a text file to vectors so that I can run classification on them at a later stage.

Could anybody shed some light on the questions below?

  1. How do I convert a text file to a vector in Mahout? The file format is "username|comment about item|rating".
  2. The data will be a few TB. Which algorithm implementation can I use for classification with the vectors I create?

Thanks, Arun

Arun Vasu

1 Answer

You can check these two examples (here and here), which also show, and to some extent explain, how to use the SequenceFile API.

You should also definitely read this intro to text analysis.
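
To make the idea of "text to vector" concrete, here is a minimal sketch in plain Java. It does not use Mahout or the SequenceFile API; it only illustrates one common approach (hashed term frequencies) on the "username|comment about item|rating" format from the question. The class name `CommentVectorizer`, the constant `NUM_FEATURES`, the sample line, and the assumption that the rating is an integer are all hypothetical choices for illustration.

```java
import java.util.Locale;

public class CommentVectorizer {
    // Hypothetical size of the hashed feature space; tune it for your data.
    static final int NUM_FEATURES = 1 << 10;

    /** One parsed record: user name, rating (used as label), and comment features. */
    static final class Example {
        final String user;
        final int rating;
        final double[] features;

        Example(String user, int rating, double[] features) {
            this.user = user;
            this.rating = rating;
            this.features = features;
        }
    }

    /** Maps free text to a fixed-length term-frequency vector via the hashing trick. */
    static double[] hashText(String text, int numFeatures) {
        double[] vector = new double[numFeatures];
        for (String token : text.toLowerCase(Locale.ROOT).split("[^a-z0-9]+")) {
            if (token.isEmpty()) {
                continue; // split() yields an empty token when text starts with punctuation
            }
            // floorMod keeps the index non-negative even for negative hash codes.
            int index = Math.floorMod(token.hashCode(), numFeatures);
            vector[index] += 1.0;
        }
        return vector;
    }

    /** Parses "username|comment|rating"; the comment itself may contain '|'. */
    static Example parseLine(String line) {
        int first = line.indexOf('|');
        int last = line.lastIndexOf('|');
        if (first < 0 || first == last) {
            throw new IllegalArgumentException("Expected 3 '|'-separated fields: " + line);
        }
        String user = line.substring(0, first).trim();
        String comment = line.substring(first + 1, last);
        // Assumption: ratings are integers such as 1..5.
        int rating = Integer.parseInt(line.substring(last + 1).trim());
        return new Example(user, rating, hashText(comment, NUM_FEATURES));
    }

    public static void main(String[] args) {
        Example ex = parseLine("arun|great phone, great battery|5");
        double tokens = 0;
        for (double x : ex.features) {
            tokens += x;
        }
        System.out.println(ex.user + " rating=" + ex.rating + " tokens=" + tokens);
    }
}
```

Hashing keeps the vector length fixed without building a dictionary first, which matters at TB scale, at the cost of occasional collisions between different words. In a real Mahout pipeline you would store such records in SequenceFiles and use Mahout's own vector types instead of a raw `double[]`.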

Julian Ortega