
I have some simple code that writes a string s before each line of a very big JSON file with 3 million records. The problem is that only the first 4,430 records are processed, and the rest are missing from the output. This is the code:

s = '{ "index" : { "_index" : "gg2", "_type" : "log"} }'
with open('final.json', 'w') as out_file:
    with open('K2.2.json', 'r') as in_file:
        for line in in_file:
            out_file.write(s + '\n' + line.rstrip('\n') + '\n')

Do you have any ideas why this is happening?

Cœur
Aziz
  • Aziz, are you doing this on Windows? Does your file contain any "exotic" characters? In particular, if you find the point at which your output is getting truncated and look at the corresponding place in the input file, is there anything funny there? I ask because there are some contexts in which a character 0x1A (ctrl-Z) is treated by Windows as an "end-of-file" marker. (I doubt this is your problem, but my memory of exactly how this works on Windows is hazy enough that I'm not sure it isn't.) – Gareth McCaughan Apr 27 '16 at 14:30
  • @GarethMcCaughan: that's *exactly* what I'd be looking for, a 0x1A character. That's very much the most likely cause here. – Martijn Pieters Apr 27 '16 at 14:38
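
One way to test the 0x1A hypothesis from the comments is to scan the input in binary mode, where no byte is given special meaning, and then redo the copy in binary mode as well. The following is only a minimal sketch under that assumption; it has not been run against the asker's file, and the helper names find_byte and prefix_lines are hypothetical, introduced here for illustration.

```python
# Prefix from the question, as bytes, with its trailing newline.
PREFIX = b'{ "index" : { "_index" : "gg2", "_type" : "log"} }\n'


def find_byte(path, byte=b'\x1a', chunk_size=1 << 20):
    """Return the offset of the first `byte` in the file, or -1 if absent."""
    offset = 0
    with open(path, 'rb') as f:          # 'rb': raw bytes, no text-mode handling
        while True:
            chunk = f.read(chunk_size)
            if not chunk:                # end of file reached
                return -1
            pos = chunk.find(byte)
            if pos != -1:
                return offset + pos
            offset += len(chunk)


def prefix_lines(in_path, out_path, prefix=PREFIX):
    """Write `prefix` before every input line; return the number of lines."""
    count = 0
    with open(in_path, 'rb') as in_file, open(out_path, 'wb') as out_file:
        for line in in_file:             # binary files still iterate on b'\n'
            # Strip the line ending (LF or CRLF) and write a plain LF instead.
            out_file.write(prefix + line.rstrip(b'\r\n') + b'\n')
            count += 1
    return count
```

If find_byte('K2.2.json') returns an offset near the 4,430th record, that would support the ctrl-Z explanation; prefix_lines('K2.2.json', 'final.json') should then process every line, and its return value can be compared with the expected record count. Note that this sketch normalizes line endings to LF in the output.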

0 Answers