
I have a JSON Lines file like the one below:

{"id":0,"country":"fr"}
{"id":1,"country":"en"}
{"id":2,"country":"fr"}
{"id":3,"country":"fr"}

I have a list of codes, and I want to assign a code to each user by updating the lines of the file.

The result should be the following:

{"id":0,"country":"fr","code":1}
{"id":1,"country":"en","code":2}
{"id":2,"country":"fr","code":3}
{"id":3,"country":"fr","code":4}

This is how I do it now:

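import ujson
from tempfile import mkstemp

# mkstemp() returns an open file descriptor and the path of the temp file
fh, abs_path = mkstemp()

with open(fh, 'w') as tmp_file:
    with open(shooting.segment_filename) as segment_filename:
        for line in segment_filename:
            enriched_line = ujson.loads(line)
            code = compute_code()
            # only add the key when a code was actually computed
            if code:
                enriched_line["code"] = code
            tmp_file.write(ujson.dumps(enriched_line) + '\n')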

My question is: is there a faster way to do this? Maybe via a Linux command launched through sarge, for example? Or is there a Pythonic way that avoids having to read, write, and replace the original file?

Thank you!

Anina

2 Answers


For performance, you can skip the JSON deserialization and serialization steps completely and just replace the closing brace of each line with your code followed by a closing brace.

So this should perform much better:

content = ""
with open("inp.txt", "r") as inp:
    for line in inp:
        # rstrip() drops the newline, [:-1] drops the closing brace
        content += line.rstrip()[:-1] + ',"code":%s}\n' % compute_code()

with open("inp.txt", "w") as out:
    out.write(content)
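
The speed difference is easy to measure on your own data. Below is a rough timing sketch, not a rigorous benchmark: `compute_code` is a hypothetical stand-in that returns a constant, the sample lines are made up, and `via_json` / `via_splice` are names chosen for illustration. It assumes every line is a non-empty JSON object that ends with `}`.

```python
import json
import timeit


def compute_code():
    # Hypothetical stand-in for the asker's compute_code(); always returns 1.
    return 1


def via_json(line):
    # Parse the line, add the key, and serialize it again.
    record = json.loads(line)
    record["code"] = compute_code()
    return json.dumps(record) + "\n"


def via_splice(line):
    # Drop the trailing newline and the closing brace, then append the new member.
    return '%s,"code":%s}\n' % (line.rstrip()[:-1], json.dumps(compute_code()))


# Made-up sample data in the same shape as the question's file.
lines = ['{"id":%d,"country":"fr"}\n' % i for i in range(10000)]

# Sanity check: both approaches must describe the same JSON objects.
assert all(json.loads(via_json(l)) == json.loads(via_splice(l)) for l in lines[:100])

for fn in (via_json, via_splice):
    seconds = timeit.timeit(lambda: [fn(l) for l in lines], number=10)
    print("%s: %.3fs" % (fn.__name__, seconds))
```

The absolute numbers depend on the machine and on how expensive the real `compute_code` is, so treat the output as a relative comparison only.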

EDIT: If you don't want to load the whole file into memory, you can do something like this:

with open("inp.txt", "r") as inp, open("out.txt", "w") as out:
    for line in inp:
        # rstrip() drops the newline, [:-1] drops the closing brace
        out.write(line.rstrip()[:-1] + ',"code":%s}\n' % compute_code())
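
If the result has to end up under the original file name, one option is to write to a temporary file in the same directory and then swap it in with `os.replace`. The sketch below is one way to do that, not the only one. The function name `add_codes_in_place` is hypothetical, `compute_code` is passed in as a parameter, and every non-blank line is assumed to be a non-empty JSON object ending with `}`. As in the question, a falsy code leaves the line unchanged.

```python
import json
import os
import tempfile


def add_codes_in_place(path, compute_code):
    """Append a "code" member to each JSON object line of `path`."""
    directory = os.path.dirname(os.path.abspath(path))
    # Create the temp file next to the target so os.replace stays on one filesystem.
    fd, tmp_path = tempfile.mkstemp(dir=directory, suffix=".tmp")
    try:
        with os.fdopen(fd, "w") as out, open(path) as inp:
            for line in inp:
                body = line.rstrip()
                if not body:
                    continue  # skip blank lines
                code = compute_code()
                if code:
                    # Drop the closing brace and append the new member.
                    body = '%s,"code":%s}' % (body[:-1], json.dumps(code))
                out.write(body + "\n")
        # Swap the finished file in; readers see either the old or the new file.
        os.replace(tmp_path, path)
    except BaseException:
        # Clean up the partial temp file and leave the original untouched.
        os.unlink(tmp_path)
        raise
```

Calling `add_codes_in_place("inp.txt", compute_code)` rewrites `inp.txt` while reading it line by line. Note that `mkstemp` creates the temp file with owner-only permissions, so the replaced file ends up with those permissions unless you adjust them afterwards.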
tfeldmann

I do not know if this will satisfy you, but here is some "cleaner" code:

import json

# Read and parse every line up front (the whole file is held in memory)
with open(shooting.segment_filename, "r") as f:
    data = [json.loads(line) for line in f]

for json_line in data:
    code = compute_code()
    if code:
        json_line["code"] = code

# This overwrites the source file, so you might want to try it on a copy first
with open(shooting.segment_filename, "w") as f:
    # One object per line, including a newline after the last one
    f.writelines(json.dumps(elem) + "\n" for elem in data)
Valentin B.