I have a huge txt file like this (input.txt):
word1 word2 word3 word2
word5
word6 word7 word8 word9 word10
I want to transform it to have 3 words per line like this (output.txt):
word1 word2 word3
word2 word5 word6
word7 word8 word9
word10
Of course, the last line could have fewer than 3 words.
Note 1: 3 is a parameter (a more realistic value is 200).
Note 2: words are separated by spaces, so they can be obtained with split(" ").
I have a solution that works when I can load the whole input.txt into memory and process it, but my input.txt is around 300GB, so it doesn't fit. Loading the whole input.txt should not be necessary, since I think it could be processed in a streaming fashion, so memory should not be a real problem, but it shouldn't take ages either.
A pure Python solution would be great, but if a more performant or concise solution exists using some popular library, that would also be fine.
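To make the streaming idea concrete, here is a minimal sketch of what I have in mind, not a tested solution for the 300GB case. The function names (iter_words, rewrap, rewrap_file) are made up for illustration, and it assumes that every single input line fits in memory and that the file is UTF-8. It uses split() without arguments rather than split(" "), so the trailing newline and any repeated spaces do not produce empty words.

```python
from itertools import islice
from typing import Iterable, Iterator


def iter_words(lines: Iterable[str]) -> Iterator[str]:
    """Yield words one by one, reading the input line by line."""
    for line in lines:
        # split() with no argument also drops the trailing newline
        yield from line.split()


def rewrap(lines: Iterable[str], words_per_line: int) -> Iterator[str]:
    """Yield output lines holding at most words_per_line words each."""
    words = iter_words(lines)
    while True:
        # Take the next words_per_line words from the stream
        chunk = list(islice(words, words_per_line))
        if not chunk:
            return
        yield " ".join(chunk)


def rewrap_file(src: str, dst: str, words_per_line: int = 200) -> None:
    """Stream src into dst (hypothetical helper; assumes UTF-8 text)."""
    with open(src, encoding="utf-8") as fin, \
            open(dst, "w", encoding="utf-8") as fout:
        for out_line in rewrap(fin, words_per_line):
            fout.write(out_line + "\n")
```

It would be called as rewrap_file("input.txt", "output.txt", words_per_line=200). Only the current input line and at most words_per_line pending words are held in memory at a time, but I do not know whether this is fast enough for a file this large.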