I am testing out Personality Insights and I am curious whether I need to do any data cleansing before sending a Twitter profile's timeline to IBM as a single string.

For example, should I remove URLs included in the tweets, as well as other Twitter features such as hashtags or profile names that appear within a tweet?

I am currently not removing any data. I simply concatenate the tweets, separated by a full stop and a space, using text+=". "+tweetfulltext.

Dmitry Rastorguev

1 Answer

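You don't need to. However, URLs, hashtags and profile names don't count towards the personality analysis, so if you already have a cleanup module, stripping them will help with the word count. You will also want to filter out retweets, since they are not the profile owner's own words.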
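
As a rough illustration of that kind of cleanup, here is a minimal Python sketch. The function names (clean_tweet, is_retweet, build_profile_text) are hypothetical, the regular expressions are simple approximations rather than Twitter's official entity rules, and retweets are detected only by the conventional "RT @" prefix. If your tweet objects carry an explicit retweet indicator, that is likely more reliable. It drops hashtags entirely, which is an assumption; you may prefer to keep the word and remove only the "#".

```python
import re

# Approximate patterns; assumptions, not Twitter's official entity parsing.
URL_RE = re.compile(r"https?://\S+")
MENTION_RE = re.compile(r"@\w+")
HASHTAG_RE = re.compile(r"#\w+")


def clean_tweet(text):
    """Strip URLs, @mentions and #hashtags, then collapse whitespace."""
    for pattern in (URL_RE, MENTION_RE, HASHTAG_RE):
        text = pattern.sub(" ", text)
    return " ".join(text.split())


def is_retweet(text):
    """Heuristic: classic retweets start with 'RT @'."""
    return text.startswith("RT @")


def build_profile_text(tweets):
    """Join the cleaned, non-retweet tweets with '. ' as in the question."""
    cleaned = (clean_tweet(t) for t in tweets if not is_retweet(t))
    # Skip tweets that are empty after cleaning (e.g. a tweet that was only a link).
    return ". ".join(c for c in cleaned if c)
```

Calling build_profile_text on a list of tweet strings returns one string that can be sent in place of the raw concatenation.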

chughts