I am working on a classification problem. The data consists of reviews of a particular product category from an e-commerce platform. Each attribute is described below:
- id: Unique identifier for each tuple.
- category: Label that marks each review as positive or negative. 0 represents positive reviews and 1 represents negative reviews.
- text: Tokenized text content of the review.
A sample of the dataset is shown in the attached picture.
I am thinking of trying TF-IDF; however, because the text column is already tokenized, I don't know how to apply it.
I want to predict the category from the text column.
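
For reference, here is a minimal sketch of the kind of approach I have in mind, assuming scikit-learn is available and that each entry in the text column is a Python list of tokens. The toy rows, the `identity` helper, and the choice of logistic regression are illustrative assumptions, not taken from the actual dataset. Passing a callable as `analyzer` tells `TfidfVectorizer` to use the tokens as given instead of running its own preprocessing and tokenization.

```python
from sklearn.feature_extraction.text import TfidfVectorizer
from sklearn.linear_model import LogisticRegression
from sklearn.pipeline import make_pipeline


def identity(tokens):
    """Return the token list unchanged, since the text is already tokenized."""
    return tokens


# Hypothetical toy rows standing in for the real dataset
# (0 = positive, 1 = negative, as in the attribute description).
texts = [
    ["great", "product", "works", "well"],
    ["love", "it", "great", "value"],
    ["terrible", "quality", "broke", "fast"],
    ["awful", "waste", "of", "money"],
]
labels = [0, 0, 1, 1]

# A callable analyzer receives each raw document (here a token list)
# and returns the features to count, so no re-tokenization happens.
model = make_pipeline(
    TfidfVectorizer(analyzer=identity),
    LogisticRegression(),
)
model.fit(texts, labels)

# Predict the category for new, already tokenized reviews.
predictions = model.predict([["great", "value"], ["terrible", "waste"]])
print(predictions)
```

If the column is read from a CSV file, each token list may arrive as a string such as "['great', 'product']". In that case it would probably need to be parsed first, for example with `ast.literal_eval`, before being passed to the vectorizer.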