I have a text file that I read into a Java application, and I count the words in it line by line. Right now I am splitting each line into words with
String.split("[\\p{Punct}\\s+]")
But I know this misses some words in the text file. For example, the word "can't" should be split into two words, "can" and "t".
Commas and other punctuation should be ignored completely and treated as whitespace. I have been trying to work out a more precise regular expression for this, but I am a novice with regexes, so I need some help.
What could be a better regex for the purpose I have described?
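
One possible direction, as a minimal sketch rather than a definitive answer: inside a character class the `+` is a literal plus sign, so moving it outside the brackets makes a whole run of punctuation and whitespace count as a single delimiter. The class name `WordSplitter`, the method `splitWords`, and the constant `DELIMITERS` are hypothetical names chosen for illustration. The sketch also assumes ASCII punctuation only, since `\p{Punct}` does not match characters such as typographic apostrophes by default.

```java
import java.util.ArrayList;
import java.util.List;

class WordSplitter {
    // One or more punctuation or whitespace characters form a single delimiter.
    // The '+' sits outside the character class, so it acts as a quantifier.
    private static final String DELIMITERS = "[\\p{Punct}\\s]+";

    // Splits one line into words, treating punctuation like whitespace.
    static List<String> splitWords(String line) {
        List<String> words = new ArrayList<>();
        for (String token : line.split(DELIMITERS)) {
            // split() can return a leading empty string when the line
            // starts with a delimiter, so empty tokens are skipped.
            if (!token.isEmpty()) {
                words.add(token);
            }
        }
        return words;
    }

    public static void main(String[] args) {
        // prints [Well, I, can, t, do, that]
        System.out.println(splitWords("Well, I can't do that!"));
    }
}
```

With this pattern the apostrophe in "can't" is a delimiter like any other punctuation, so the word is counted as "can" and "t", and a comma followed by a space no longer produces an empty token.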