I tried to follow this question to create a regular expression that separates contractions from the word.
Here is my attempt:
line = re.sub( r'\s|(n\'t)|\'m|(\'ll)|(\'ve)|(\'s)|(\'re)|(\'d)', r" \1",line) #tokenize contractions
However, only the first alternative (n't) is tokenized; the other contractions are dropped. For example: should've can't mustn't we'll
changes to: should ca n't must n't we
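A likely cause, offered as a sketch rather than a definitive fix: the replacement string only refers to `\1`, which is the first capture group, `(n\'t)`. When any other alternative matches, group 1 is empty (Python 3.5+ substitutes an empty string for an unmatched group), so the matched suffix is replaced by a bare space. Wrapping the whole alternation in a single group makes `\1` hold whichever suffix matched. The helper name `split_contractions`, the trailing `\b`, and dropping the `\s` alternative are my own assumptions, not part of the original attempt.

```python
import re

# Suffix list copied from the attempt above. A single capture group wraps
# the whole alternation, so \1 is always the suffix that actually matched.
# The trailing \b (an added assumption) keeps "'s" from matching inside a
# longer token such as "'she".
CONTRACTION = re.compile(r"(n't|'m|'ll|'ve|'s|'re|'d)\b")


def split_contractions(line):
    """Insert a space before each contraction suffix (hypothetical helper)."""
    return CONTRACTION.sub(r" \1", line)


print(split_contractions("should've can't mustn't we'll"))
# should 've ca n't must n't we 'll
```

Because `n't` is listed before the apostrophe-initial suffixes and the regex scans left to right, `can't` is split as `ca n't` rather than `can 't`.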