I have a text with no spaces between the words (18,000 words in total), and I need to convert it automatically into something more readable.

For example, "accountcategorycode, customersizecode, address1_addresstypecode" should become "Account category code, Customer size code, Address 1 address type code".

I would really appreciate any suggestions.

Tony01
  • I would suggest reading [ask] – Julien Aug 29 '23 at 07:17
  • The simplest solution: download an English vocabulary sorted by word frequency. Remove garbage symbols such as underscores. Starting at the beginning of the string, take substrings of 1 letter, then 2 letters, and so on up to e.g. 30 letters. Look each substring up in the vocabulary (if it is missing, its frequency is 0) and select the one with the maximum frequency. Remove that substring from the beginning and repeat. This won't produce totally accurate English text, but the result will be readable – Alexey S. Larionov Aug 29 '23 at 07:26
  • The harder solution: use a language model (neural network) such as ChatGPT and ask it to split the text. It is very sophisticated at interpreting language and will likely produce high-quality English output. There are probably some free open-source language models that are less powerful but can be run in unlimited quantities. Try searching on Google – Alexey S. Larionov Aug 29 '23 at 07:28
  • See: [How to split text without spaces into list of words](https://stackoverflow.com/questions/8870261/how-to-split-text-without-spaces-into-list-of-words) – Abdul Aziz Barkat Aug 29 '23 at 07:31
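
The greedy, frequency-based idea from the comments could be sketched roughly as below. This is only an illustration under stated assumptions: `VOCAB` is a made-up toy dictionary with invented frequencies (a real run would load a frequency-sorted English word list), and the names `split_greedy`, `humanize`, and `MAX_WORD_LEN` are hypothetical. Treating digits as separate tokens is an extra assumption taken from the "Address 1" example in the question.

```python
import re

# Hypothetical toy vocabulary: word -> relative frequency (invented numbers).
# A real run would load a frequency-sorted English word list instead.
VOCAB = {
    "account": 50, "category": 40, "code": 60, "customer": 45,
    "size": 55, "address": 50, "type": 65,
}

MAX_WORD_LEN = 30  # upper bound on prefix length, as suggested in the comment


def split_greedy(text, vocab=VOCAB):
    """Greedily peel off the most frequent known prefix, one word at a time."""
    words = []
    while text:
        # Assumption: a run of digits is kept together as its own token.
        digits = re.match(r"\d+", text)
        if digits:
            words.append(digits.group())
            text = text[digits.end():]
            continue
        # Candidate prefixes of length 1..MAX_WORD_LEN; unknown ones score 0.
        limit = min(len(text), MAX_WORD_LEN)
        candidates = [text[:n] for n in range(1, limit + 1)]
        best = max(candidates, key=lambda w: vocab.get(w, 0))
        if vocab.get(best, 0) == 0:
            # No known word starts here: emit one character and move on.
            best = text[0]
        words.append(best)
        text = text[len(best):]
    return words


def humanize(identifier, vocab=VOCAB):
    """Split one identifier into words and capitalize the first letter."""
    words = []
    for part in identifier.lower().split("_"):  # underscores act as separators
        if part:
            words.extend(split_greedy(part, vocab))
    phrase = " ".join(words)
    return phrase[:1].upper() + phrase[1:]


if __name__ == "__main__":
    raw = "accountcategorycode, customersizecode, address1_addresstypecode"
    print(", ".join(humanize(name) for name in raw.split(", ")))
```

With a real frequency list, very common short words can win over the longer intended word, and anything missing from the vocabulary falls apart into single characters, so the output would likely need manual review. The question linked in the last comment covers more robust ways of choosing the split.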

0 Answers