
I have a file "text.txt" that is 1.1 MB right now. I want to split it up into 50 KB text files. I would use a loop if I could readlines() the file, but since it's one long string, I'm not sure I can do that.

natalie
  • Use [`seek()` and `read()`](https://docs.python.org/3.4/tutorial/inputoutput.html#reading-and-writing-files). – TigerhawkT3 Jul 14 '15 at 02:02
  • Read a file in chunks, like done [here](http://stackoverflow.com/questions/519633/lazy-method-for-reading-big-file-in-python) – mike.k Jul 14 '15 at 02:02
  • another option is found in [break a text file into smaller chunks](https://stackoverflow.com/questions/18761016/break-a-text-file-into-smaller-chunks-in-python?rq=1) – LinkBerest Jul 14 '15 at 02:05

2 Answers

Open the file, set up a range of byte offsets to iterate through, then seek() to each offset, read() the content, and, if there was content, write it to a new file. If there's no content, break out of the loop.

with open('myfile.txt', 'rb') as f:  # binary mode so offsets are exact byte positions
    for place in range(0, int(2e6), 50000):
        f.seek(place)                # jump to the next 50 KB boundary
        content = f.read(50000)      # read up to 50 KB
        if content:
            # name each piece after its starting offset
            with open('myfile{}.txt'.format(place), 'wb') as o:
                o.write(content)
        else:
            break                    # past the end of the file; nothing left to read
TigerhawkT3
  • 2e6 is two million in scientific notation. In Python, that produces a float, hence the `int()` call. It's longer and a tiny bit slower than `2000000`, but more readable. – TigerhawkT3 Jul 14 '15 at 02:31
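The lazy chunk-reading approach linked in the question comments can be sketched as a generator. This is an illustrative sketch, not the linked answer verbatim: the sample filename, part naming, and 50 KB chunk size are assumptions, and a small demo file stands in for the 1.1 MB text.txt.

```python
def read_in_chunks(path, chunk_size=50000):
    """Yield successive chunk_size-byte blocks from the file at path."""
    with open(path, 'rb') as f:
        while True:
            chunk = f.read(chunk_size)
            if not chunk:   # empty bytes means end of file
                break
            yield chunk

# Demo: create a 120 KB sample file, then split it into numbered parts.
with open('sample.txt', 'wb') as f:
    f.write(b'x' * 120000)

for i, chunk in enumerate(read_in_chunks('sample.txt')):
    with open('sample_part{}.txt'.format(i), 'wb') as out:
        out.write(chunk)
```

This avoids hard-coding an upper bound like 2e6: the generator simply stops when read() returns an empty bytes object, so it works for a file of any size.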

You could use the split command. For example:

split -b 50k text.txt

If you want to do this from Python, you could use subprocess.check_call().
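A minimal sketch of calling split via subprocess.check_call, assuming a POSIX system with split on the PATH. The demo file and the `chunk_` output prefix are illustrative; note that split's `50k` means 50 × 1024 = 51200 bytes per piece.

```python
import subprocess

# Create a sample file to split (a stand-in for the 1.1 MB text.txt).
with open('text.txt', 'wb') as f:
    f.write(b'x' * 120000)

# Split it into 50k pieces named chunk_aa, chunk_ab, ...
# check_call raises CalledProcessError if split exits with a nonzero status.
subprocess.check_call(['split', '-b', '50k', 'text.txt', 'chunk_'])
```

Passing the command as a list avoids shell quoting issues; check_call also fails loudly if split is missing or errors out.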

rohithvsm