I'm ingesting data into Elasticsearch with Python (2.7.12, on pythonanywhere.com), and I wrote a separate Python script that deletes duplicate entries. When I start the deduplication script from the bash console, it works perfectly. When I import the file into one of the main scripts instead, it works once. If I then reproduce the duplicates and run it a second time, the deduplication script stops deleting after a couple of entries, and the error log shows that a hash was not found in Elasticsearch. Restarting the Python server helps, but then it again works only once.
PythonAnywhere support told me this:
My guess would be that you're relying on global state in some way. When you run it from the console, the global state is new every time you run it. If you're importing it into your web app, then the global state will only be reset when you reload your web app.
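To illustrate what "global state" could mean here, this is a minimal sketch and not my actual script: `seen_hashes` and `find_new` are hypothetical names standing in for whatever duplicates.py keeps at module level. Anything defined at module level is created once, on the first import, and then survives for as long as the process (the web app) keeps running.

```python
# Hypothetical stand-in for duplicates.py: the set below is module-level
# (global) state. It is created once, when the module is first imported.
seen_hashes = set()


def find_new(hashes):
    """Return the hashes not seen before and remember them in the global set."""
    new = [h for h in hashes if h not in seen_hashes]
    seen_hashes.update(new)
    return new


def find_new_fresh(hashes):
    """Same idea, but the state is local, so every call starts empty."""
    seen = set()
    new = []
    for h in hashes:
        if h not in seen:
            seen.add(h)
            new.append(h)
    return new


first = find_new(["a", "b"])
# Same process, second call: the global set still remembers "a" and "b".
second = find_new(["a", "b"])
fresh = find_new_fresh(["a", "b"])

print(first)
print(second)
print(fresh)
```

In a console run the process ends after each run, so the global set would start empty every time. In a long-running web app it would only be emptied by a reload, which would match the "works once" behaviour.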
I don't know exactly what that means, so instead of importing the file, I tried to execute the .py file with:
execfile('duplicates.py')
or
execfile(os.path.join('duplicates.py'))
But I get an "undefined name: execfile" error.
Any ideas why execfile doesn't work, or any suggestions for alternatives to import?
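For context, `execfile` is a Python 2 builtin that does not exist in Python 3, so one possible explanation (an assumption, since the version stated above is 2.7.12) is that the code is being run or checked under Python 3. One alternative that exists in both versions is `runpy.run_path` from the standard library. The sketch below uses a hypothetical throwaway script in a temp directory in place of duplicates.py, so it can be run on its own.

```python
import os
import runpy
import tempfile

# Hypothetical stand-in for duplicates.py, written to a temp file so the
# sketch is self-contained. The real script path would go here instead.
script_source = "counter = 0\ncounter += 1\n"
script_dir = tempfile.mkdtemp()
script_path = os.path.join(script_dir, "duplicates_demo.py")
with open(script_path, "w") as handle:
    handle.write(script_source)

# run_path executes the file in a fresh namespace on every call and
# returns that namespace (the script's globals) as a dict.
first_run = runpy.run_path(script_path)
second_run = runpy.run_path(script_path)

# The script's own module-level variables start from scratch each time.
print(first_run["counter"])
print(second_run["counter"])
```

This only resets the module-level names of the script itself. State held inside other modules that the script imports would still be shared across runs within the same process.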