-2

Python question, as it's the only language I know.
I've got many very long text files (8,000+ lines) with the sentences fragmented and split across multiple lines, i.e.

Research both sides and you should then
formulate your own answer. It's simple,
straightforward advice that is common sense.
And when it comes to vaccinations, climate
change, and the novel coronavirus SARS-CoV-2
etc.

I need to concatenate the fragments into full sentences breaking them at the full stops (periods) question marks, quoted full stops, etc. And write them to a new cleaned up text file, but I am unsure the best way to go about it.
I tried looping though but the results showed me that this method was not going to work.

I have never coded Generators (not sure if that is what is called for in this instance) before as I am an amateur developer and use coding to make my life easier and solve problems. Any help would be very greatly appreciated.

THarpster
  • 1
  • 1

1 Answers1

0

If you read the file into a variable f, then you can access the text one row at a time (as in f is similar to a list of strings). The functions that might be helpful to you are String.join and String.split. Join will take a list of strings, and join them with a string in between. 'z'.join["a", "b", "c"] will produce "azbzc". Split will take a string as a parameter, find each instance of that string, and split it up. "azbzc".split('z') will produce ["a", "b", "c"] again. Removing the newline after every line, then joining them with something like a space will rebuild the text back into a single string, then using split on things like question marks, etc. will split it up the way you want it.

xzkxyz
  • 527
  • 3
  • 4