Preamble: I'm very new to APIs and computer science is not my background.
So, I was testing an API built with FastAPI. My function is very simple: it loads a dataset, performs a couple of validations, and then returns it in JSON format. A simplified version would look like this:
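from time import time
import pandas as pd
from fastapi import FastAPI

app = FastAPI()

@app.get('/data/{dataset}/{version}')
async def download_data(dataset: str, version: str):
    now = time()
    # Build the path to the stored feather file and load it
    file = f"data/{dataset}/{dataset}_{version}.feather"
    data = pd.read_feather(file)
    # Convert the DataFrame to a plain dict so it can be returned as JSON
    json_data = {"data": data.to_dict()}
    print(time() - now)  # time spent inside the function only
    return json_data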
If I run this code locally for a given dataset (which is stored on my computer), it takes about 7 seconds (the result of print(time() - now)), but if I run it through the API, like this:
r = requests.get('http://127.0.0.1:8000/data/dataset_name/version_name')
it takes about 80 seconds. On my coworker's computer it takes about 30 seconds (when he runs everything locally on his own machine). So, my question is: why does the data transmission take so long if everything is running on my machine? What am I missing?
Sorry I can't provide a fully reproducible example; I'm not sure it's possible in this case.
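In case it helps, here is how I would try to time the steps separately instead of only timing the whole request. This is only a sketch on made-up data: `timed` and `make_payload` are hypothetical helpers, the row and column counts are arbitrary rather than the size of my dataset, and the nested dict just imitates the `{column: {row_index: value}}` layout that `data.to_dict()` produces by default. It measures building the dict and `json.dumps` only, so anything the framework does on top of that is not covered.

```python
import json
from time import perf_counter


def timed(fn, *args):
    """Run fn(*args) and return (result, elapsed_seconds)."""
    start = perf_counter()
    result = fn(*args)
    return result, perf_counter() - start


def make_payload(n_rows, n_cols):
    # Hypothetical stand-in for {"data": data.to_dict()}:
    # one inner dict per column, keyed by row index.
    columns = {
        f"col_{c}": {r: float(r * c) for r in range(n_rows)}
        for c in range(n_cols)
    }
    return {"data": columns}


# Time each stage on its own so the slow one stands out.
payload, build_seconds = timed(make_payload, 10_000, 20)
body, dump_seconds = timed(json.dumps, payload)

print(f"build dict: {build_seconds:.3f}s")
print(f"json.dumps: {dump_seconds:.3f}s ({len(body) / 1e6:.1f} MB)")
```

With the real dataset, the same `timed` wrapper could go around `pd.read_feather`, `data.to_dict` and `requests.get` to see which of them accounts for the gap between 7 and 80 seconds.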