Folks,
I have been using the Python library wordsegment by Grant Jenks for the past couple of hours. The library works fine for repairing broken words or separating run-together words, such as "e nd" ==> "end" and "thisisacat" ==> "this is a cat".
I am working on textual data that also contains numbers, and using the library on this data has the opposite effect. The perfectly fine text "increased $55 million or 23.8% for" is converted to something very weird, "increased 55millionor238 for" (after joining the returned list with spaces). Note that this happens seemingly at random (it may or may not happen) for any part of the text that involves numbers.
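One stopgap I am considering is to hand only purely alphabetic tokens to the segmenter and pass anything containing digits or symbols through untouched. Below is a minimal sketch of that idea. The helper name segment_preserving_numbers and its segment_fn parameter are my own (hypothetical), and it rests on my assumption that the damage comes from the library discarding punctuation and spaces before it segments.

```python
def segment_preserving_numbers(text, segment_fn):
    """Segment only purely alphabetic tokens; keep all other tokens as-is.

    segment_fn is any callable that takes a string and returns a list of
    words (hypothetical parameter, e.g. wordsegment.segment after load()).
    """
    pieces = []
    for token in text.split():
        if token.isalpha():
            # Letters only: safe to let the segmenter split run-together words.
            pieces.extend(segment_fn(token))
        else:
            # Contains digits, currency signs, percent signs, punctuation, etc.
            # Leave it untouched so "$55" and "23.8%" survive.
            pieces.append(token)
    return " ".join(pieces)


# Intended usage (assumes wordsegment is installed):
#   from wordsegment import load, segment
#   load()
#   segment_preserving_numbers("increased $55 million or 23.8% for", segment)
```

This has obvious limits: because it works token by token, it can no longer repair words broken by a space ("e nd"), and a word with punctuation attached (such as "cat.") is passed through without being segmented.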
- Has anybody worked with this library before?
- If yes, have you faced a similar situation and found a workaround?
- If not, do you know of any other Python library that does the trick?
Thank you.