
I am implementing a simple application that consists of multiple Python scripts running concurrently. The two scripts that run at the same time are the one that parses data and the one that looks up data in the database. Because of design decisions, I generate part of the data myself. This data isn't saved in the database but in JSON files. In my parser I save the data like this:

with open('a-1-test.json', 'w') as outfile:
    json.dump(lookup_table, outfile)
# no explicit close() needed: the with-block closes the file

The parser runs in a loop until a certain condition is met. Meanwhile, other scripts refer to the lookup script to get data from the database (the data that the parser saves). When the other scripts call the lookup script, it first needs to check the lookup table in the JSON file to determine which data specifically it has to fetch.

while trigger:
    time.sleep(10)
    with open('a-1-test.json', 'r') as data_file:
        data = json.load(data_file)

    for i in data.keys():
        print i, len(data[i])

This might work for some time, but I get two types of errors: JSON document not found and ValueError: Unterminated string starting at (...). I guessed this is because there are no concurrency measures in place when two different scripts try to access the same file. I know the first error happens because I use 'w' in the parser, which deletes an existing file and creates a new one, so in the meantime the lookup script won't be able to see the file.

I wonder what the best way to do this in Python is. Is there any way to put a lock on the file while it is being written, and unlock it when finished so the lookup script can read it?
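One option along these lines is advisory file locking. A minimal sketch, assuming a POSIX system (the `fcntl` module) and a hypothetical sidecar lock file `a-1-test.lock` that both scripts agree to lock before touching the JSON file:

```python
import fcntl
import json

LOCK_PATH = 'a-1-test.lock'  # hypothetical sidecar lock file


def write_table(table, path='a-1-test.json'):
    # Hold an exclusive lock while writing; readers holding a shared
    # lock will block this until they are done, and vice versa.
    with open(LOCK_PATH, 'w') as lock:
        fcntl.flock(lock, fcntl.LOCK_EX)
        with open(path, 'w') as outfile:
            json.dump(table, outfile)
        # lock is released when the file object is closed


def read_table(path='a-1-test.json'):
    # A shared lock lets multiple readers in at once but keeps the
    # writer's exclusive lock out while a read is in progress.
    with open(LOCK_PATH, 'w') as lock:
        fcntl.flock(lock, fcntl.LOCK_SH)
        with open(path, 'r') as data_file:
            return json.load(data_file)
```

Note that `flock` locks are advisory: they only work if every process that touches the file takes the lock. Locking a separate sidecar file also avoids the problem that opening the JSON file itself with 'w' truncates it before any lock could be acquired.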

Thank you in advance

Georgi Nikolov
  • I think for your application it would be much easier to use other means of interprocess communication, such as sockets for example. Probably this http://stackoverflow.com/questions/6920858/interprocess-communication-in-python will be useful for you. – kvorobiev Apr 03 '15 at 10:34
  • @kvorobiev I understand you can do the same thing with sockets but I don't want to connect to a server or anything as the json file is on the machine itself in the form of a configuration file *.json. I just wonder if there is a way to access it through various scripts using semaphores or something – Georgi Nikolov Apr 03 '15 at 10:54
  • In this case you could use the following algorithm in each process: try to lock the file (with some library like lockfile). If you acquire the lock, then open, write data, and close; otherwise wait. – kvorobiev Apr 03 '15 at 11:11
  • what if you work with an additional temporary file? – nicolallias Apr 03 '15 at 11:49
  • @kvorobiev I am testing lockfile right now. Will see if it works well. – Georgi Nikolov Apr 03 '15 at 11:51
  • @nicolallias How would you go around doing that? – Georgi Nikolov Apr 03 '15 at 11:52
  • The idea is that the "writing script" only writes to file A. When the writing loop is done, A is renamed to B. The "reading script" only opens file B. It is not simultaneous anymore, but for small files (less than a few megabytes) it will do just as well. – nicolallias Apr 03 '15 at 11:56
  • @nicolallias It's not a bad idea, I will try that too. – Georgi Nikolov Apr 03 '15 at 12:01
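The temp-file-and-rename approach from the comments can be sketched like this (Python 3; `save_atomically` is a hypothetical helper name). The key point is that `os.replace` is atomic on POSIX, so the reader sees either the old complete file or the new complete file, never a half-written one:

```python
import json
import os
import tempfile


def save_atomically(table, path='a-1-test.json'):
    # Write to a temp file in the same directory as the target (a
    # rename is only atomic within one filesystem), then rename it
    # over the target in a single step.
    dirname = os.path.dirname(os.path.abspath(path))
    fd, tmp_path = tempfile.mkstemp(dir=dirname, suffix='.tmp')
    try:
        with os.fdopen(fd, 'w') as tmp:
            json.dump(table, tmp)
        os.replace(tmp_path, path)  # atomic swap on POSIX
    except Exception:
        os.remove(tmp_path)  # clean up the temp file on failure
        raise
```

The lookup script then just reads `a-1-test.json` normally: at worst it sees the previous version of the table, but never a truncated or partially written one, so no locking is needed.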
