I manage several user accounts on a website that exposes an API, and I regularly retrieve some information for every user.
To do this, I use a Python script that loads user data from a database and then uses an API connector to make the requests.
The endpoints I use are private. To authenticate, I have to call a specific endpoint with the user's api_key and api_secret as parameters; the response contains an access_token, which is then used to authenticate the user on the private endpoints.
This token is sent in the request headers and must be refreshed regularly.
The connector works well, but I recently tried to use it in a multi-threaded context: instead of looping over the users, I launch one thread per user and join them afterwards.
The connector also works in a multi-threaded context, but on rare occasions I noticed that user data were mixed up.
After further debugging, I found that in those cases the connector was using the access_token of another user.
I reproduced this issue with a simple example to expose the logic of the script.
#!/usr/bin/python3
from utils.database import Database
from urllib.parse import urlencode
import threading
import requests
import time


class User():
    def __init__(self, user_id, database):
        self.user_id = user_id
        self.database = database
        self.connector = None

    def get_connector(self):
        # Create the connector lazily, on first use
        if not self.connector:
            self.__create_connector()
        return self.connector

    def __create_connector(self):
        __user_api_df = self.database.get_table("users_apis", where=f"WHERE user_id = {self.user_id}")
        api_key = __user_api_df["api_key"].values[0]
        api_secret = __user_api_df["api_secret"].values[0]
        self.connector = ApiConnector(api_key, api_secret, self)

    def __str__(self):
        return f"User-{self.user_id}"


class ApiConnector():
    def __init__(self, api_key, api_secret, user=None):
        self.base_url = "https://www.api.website.com"
        self.api_key = api_key
        self.api_secret = api_secret
        self.user = user
        self.session = requests.Session()
        self.__auth_token = None
        self.__auth_timeout = None

    def api_call_1(self):
        return self.__request("GET", "endpoint_path_1", auth=True)

    def api_call_2(self):
        return self.__request("GET", "endpoint_path_2", auth=True)

    def api_call_3(self):
        return self.__request("GET", "endpoint_path_3", auth=True)

    def __request(self, method, path, payload={}, auth=False, headers={}):
        url = f"{self.base_url}{path}"
        headers["Accept"] = "application/json"
        if auth:
            # Fetch or refresh the token before calling a private endpoint
            if not self.__is_authenticated():
                self.__authenticate()
            headers["Authorization"] = "Bearer " + self.__auth_token
            print(f"[{self.user}] IN => {path} - {self.__auth_token}")
        if method == "GET":
            payload_str = f"?{urlencode(payload)}" if payload else ""
            response = self.session.request(method, f"{url}{payload_str}", headers=headers)
        else:
            response = self.session.request(method, url, params=payload, headers=headers)
        if auth:
            print(f"[{self.user}] OUT => {path} - {response.request.headers['Authorization']}")
        return response.json()

    def __authenticate(self):
        response = self.__request("GET", "authentication_endpoint", payload={
            "api_key": self.api_key,
            "api_secret": self.api_secret
        })
        self.__auth_token = response["result"]["access_token"]
        self.__auth_timeout = time.time() + response["result"]["expires_in"]

    def __is_authenticated(self):
        if not self.__auth_timeout:
            return False
        if self.__auth_timeout < time.time():
            return False
        return True


class RequestsTester:
    def __init__(self):
        self.database = Database("host",
                                 "user",
                                 "password",
                                 "database")
        self.user_ids = [1, 2, 3]
        self.threads = {}

    def run(self):
        # Start one thread per user, then wait for all of them
        for user_id in self.user_ids:
            user = User(user_id, self.database)
            thread_name = f"Thread-{user_id}"
            self.threads[thread_name] = threading.Thread(target=self.get_data, args=[user])
            self.threads[thread_name].start()
        for thread_name in self.threads.keys():
            self.threads[thread_name].join()

    def get_data(self, user):
        user.get_connector().api_call_1()
        user.get_connector().api_call_2()
        user.get_connector().api_call_3()


if __name__ == "__main__":
    RequestsTester().run()
Note 1: I didn't include the Database class since it isn't relevant here, but every one of its methods is mutex-protected to avoid concurrent access.
Note 2: I'm using Python 3.9.2 and requests 2.25.1.
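To illustrate what Note 1 means by "mutex-protected", here is a minimal sketch. It is not the actual Database class: `LockedCounter` is a hypothetical stand-in, and the only assumption is that each method takes one shared `threading.Lock` before touching the object's state.

```python
import threading


class LockedCounter:
    """Hypothetical stand-in for a class whose methods are all mutex-protected."""

    def __init__(self):
        self._lock = threading.Lock()
        self._value = 0

    def increment(self):
        # Only one thread at a time can run the body of this method.
        with self._lock:
            self._value += 1

    def value(self):
        with self._lock:
            return self._value


def work(counter, times):
    for _ in range(times):
        counter.increment()


counter = LockedCounter()
workers = [threading.Thread(target=work, args=(counter, 1000)) for _ in range(4)]
for worker in workers:
    worker.start()
for worker in workers:
    worker.join()
```

The lock only protects state that is reached through those methods; anything shared outside the class is not covered by it.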
Before making the call I print the access_token, and after the call I print the access_token found in the response's request headers.
The output generally looks like this:
[User-1] IN => /private/endpoint_path_1 - 1673482029231.1EPZ7Ya-
[User-3] IN => /private/endpoint_path_1 - 1673482029265.1Cdx06z2
[User-2] IN => /private/endpoint_path_1 - 1673482029284.1JrX_wyQ
[User-3] OUT => /private/endpoint_path_1 - Bearer 1673482029265.1Cdx06z2
[User-1] OUT => /private/endpoint_path_1 - Bearer 1673482029231.1EPZ7Ya-
[User-2] OUT => /private/endpoint_path_1 - Bearer 1673482029284.1JrX_wyQ
But on rare occasions it looks like this:
[User-1] IN => /private/endpoint_path_1 - 1673482029231.1EPZ7Ya-
[User-3] IN => /private/endpoint_path_1 - 1673482029265.1Cdx06z2
[User-2] IN => /private/endpoint_path_1 - 1673482029284.1JrX_wyQ
[User-3] OUT => /private/endpoint_path_1 - Bearer 1673482029231.1EPZ7Ya-
[User-1] OUT => /private/endpoint_path_1 - Bearer 1673482029231.1EPZ7Ya-
[User-2] OUT => /private/endpoint_path_1 - Bearer 1673482029284.1JrX_wyQ
The output access token is not the same as the input one: the token of another user is being used.
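Since the mismatch is rare, it can help to check the IN/OUT lines automatically instead of reading them by eye. The following is a minimal sketch that assumes the exact log format shown above; `LINE_RE` and `find_token_mismatches` are hypothetical names and are not part of the script.

```python
import re

# Matches the two kinds of lines printed by the example:
#   [User-1] IN => /path - <token>
#   [User-1] OUT => /path - Bearer <token>
LINE_RE = re.compile(
    r"^\[(?P<user>[^\]]+)\] (?P<kind>IN|OUT) => (?P<path>\S+) - (?:Bearer )?(?P<token>\S+)$"
)


def find_token_mismatches(lines):
    """Return (user, path, expected, sent) for every OUT line whose token
    differs from the most recent IN line of the same user and path."""
    expected = {}
    mismatches = []
    for line in lines:
        match = LINE_RE.match(line.strip())
        if not match:
            continue  # ignore anything that is not an IN/OUT line
        key = (match["user"], match["path"])
        if match["kind"] == "IN":
            expected[key] = match["token"]
        elif key in expected and expected[key] != match["token"]:
            mismatches.append((*key, expected[key], match["token"]))
    return mismatches


# The faulty output shown above, used as sample input.
sample_log = [
    "[User-1] IN => /private/endpoint_path_1 - 1673482029231.1EPZ7Ya-",
    "[User-3] IN => /private/endpoint_path_1 - 1673482029265.1Cdx06z2",
    "[User-2] IN => /private/endpoint_path_1 - 1673482029284.1JrX_wyQ",
    "[User-3] OUT => /private/endpoint_path_1 - Bearer 1673482029231.1EPZ7Ya-",
    "[User-1] OUT => /private/endpoint_path_1 - Bearer 1673482029231.1EPZ7Ya-",
    "[User-2] OUT => /private/endpoint_path_1 - Bearer 1673482029284.1JrX_wyQ",
]
mismatches = find_token_mismatches(sample_log)
```

On this sample, the only entry reported is the User-3 call, which went out with the token printed for User-1.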
This minimal example only shows how the script works; in real conditions I have far more than 3 users, and besides making API calls I also process data and store things in the database from the get_data function.
Every time this error occurs, the input token is the correct one but the output token belongs to another user, so the issue seems to come from the requests library.
If I use a loop instead of launching threads, the error never occurs, so it seems to come from the multi-threading context.
From what I've read, the requests library and its Session class are supposed to be thread-safe, so I don't understand where this error comes from.
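One detail of the example that may be worth ruling out: `__request` declares `headers={}` as a default argument. Python evaluates default values once, when the function is defined, so every call that omits `headers` works on the same dict, whichever connector instance or thread makes the call. The sketch below shows that behaviour in isolation. `build_headers` and `build_headers_safe` are hypothetical names, and whether this explains the mix-up is an assumption, not something confirmed here.

```python
def build_headers(token, headers={}):  # same default-argument pattern as __request
    headers["Authorization"] = "Bearer " + token
    return headers


first = build_headers("token-of-user-1")
second = build_headers("token-of-user-2")

# Both calls returned the very same dict object, so the second call
# overwrote the value the first caller was still holding.
same_object = first is second
first_header = first["Authorization"]


def build_headers_safe(token, headers=None):
    # Copy into a fresh dict on every call so nothing is shared between callers.
    headers = dict(headers or {})
    headers["Authorization"] = "Bearer " + token
    return headers


safe_first = build_headers_safe("token-of-user-1")
safe_second = build_headers_safe("token-of-user-2")
```

If this is a factor, it would be consistent with the loop never failing: in a sequential run each call writes its own token and sends the request straight away, whereas with threads another call could write to the shared dict between the assignment and the send.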
I'm not experienced with Python multi-threading, so I may be doing something wrong, but I can't find what.
Has anybody already had such an issue with the requests library mixing up headers in a multi-threaded context?