I need to save once and load multiple times some big arrays in a Flask application with Python 3. I originally stored these arrays on disk with the json library. To speed this up, I moved to Redis on the same machine, storing each array as a JSON string. I wonder why I see no improvement (it actually takes more time on the server I use), even though Redis keeps the data in RAM. I guess the JSON serialization isn't optimized, but I have no clue how I could speed it up:
import json
import redis
import os
import time

current_folder = os.path.dirname(os.path.abspath(__file__))
file_path = os.path.join(current_folder, "my_file")

my_array = [1]*10000000

with open(file_path, 'w') as outfile:
    json.dump(my_array, outfile)

start_time = time.time()
with open(file_path, 'r') as infile:
    my_array = json.load(infile)
print("JSON from disk : ", time.time() - start_time)

r = redis.Redis()
my_array_as_string = json.dumps(my_array)
r.set("my_array_as_string", my_array_as_string)

start_time = time.time()
my_array_as_string = r.get("my_array_as_string")
print("Fetch from Redis:", time.time() - start_time)

start_time = time.time()
my_array = json.loads(my_array_as_string)
print("Parse JSON :", time.time() - start_time)
Result:
JSON from disk : 1.075700044631958
Fetch from Redis: 0.078125
Parse JSON : 1.0247752666473389
EDIT: It seems that fetching from Redis is actually fast, but the JSON parsing is quite slow. Is there a way to fetch an array directly from Redis without the JSON serialization/deserialization step? This is what we do with pyMySQL and it is fast.
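For what it's worth, here is a minimal sketch of the kind of thing I have in mind, using pickle instead of json purely as an illustration (the key name is made up for the example, and I haven't verified on my server that this is actually faster than JSON parsing):

import pickle
import redis
import time

r = redis.Redis()
my_array = [1]*10000000

# Store the array as a pickled bytes blob instead of a JSON string
r.set("my_array_as_pickle", pickle.dumps(my_array, protocol=pickle.HIGHEST_PROTOCOL))

start_time = time.time()
my_array_as_pickle = r.get("my_array_as_pickle")
print("Fetch from Redis:", time.time() - start_time)

start_time = time.time()
my_array = pickle.loads(my_array_as_pickle)
print("Parse pickle :", time.time() - start_time)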