2

I'm wondering if anyone can help me out.

In FastAPI, I want to set up an endpoint that returns the contents of a generated CSV file as plain text. I don't want the file to be downloaded.

I've tried the following, which works fine, except that a file download is always initiated.

    @app.get('/hosts/last_hour')
    def hosts_last_hour():
        epoch_start = time.mktime(datetime.now().timetuple()) - 3600
        epoch_end = time.mktime(datetime.now().timetuple())
        process_hosts(epoch_start, epoch_end)

        def iterate_csv(epoch_start):
            with open(f'output/hosts_traffic_{int(epoch_start)}.csv', mode='rb') as csv_file:
                yield from csv_file

        response = StreamingResponse(iterate_csv(epoch_start), media_type="text/csv")
        return response

I need the contents of the file to be sent in the response body as text/csv (I don't want a download to be initiated, and I don't want the response in JSON format). Any ideas how to achieve this?

Thanks in advance.

José

  • Please have a look at related answers [here](https://stackoverflow.com/a/72053557/17865804) and [here](https://stackoverflow.com/a/73586180/17865804), as well as [here](https://stackoverflow.com/a/73580096/17865804), [here](https://stackoverflow.com/a/73694164/17865804) and [here](https://stackoverflow.com/a/70655118/17865804). – Chris Feb 24 '23 at 18:26
  • Thanks. This led me in the right direction. The solution is to pass 'Content-Disposition': 'inline' in the response headers. I'll be posting the solution. – José Oliveira Feb 24 '23 at 20:26
  • Why use `time.mktime(datetime.now().timetuple())` rather than just `time.time()`? – Sam Mason Mar 01 '23 at 12:03
  • @SamMason You're right. Thanks for pointing that out. Changed to `time.time()`. – José Oliveira Mar 02 '23 at 11:11
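
As a quick illustration of the last two comments, the sketch below compares the two ways of getting the current epoch time. Both return seconds since the epoch as a float; the `mktime` route drops the sub-second part because `timetuple()` carries no microseconds. The variable names are only for illustration.

```python
import time
from datetime import datetime

# Two ways to get "now" as seconds since the epoch.
via_mktime = time.mktime(datetime.now().timetuple())
via_time = time.time()

# timetuple() has no sub-second field, so the mktime result is a whole number.
print(via_mktime.is_integer())

# The two values agree to within the truncated fraction (plus call overhead).
print(abs(via_time - via_mktime) < 2)

# Start of a one-hour window, as in the question.
epoch_start = via_time - 3600
```

For building a query window, `time.time()` is the shorter call and keeps the fractional seconds.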

2 Answers

1

In case anyone is looking for a similar solution, here's what worked for me (see this answer for more details):

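    import time

    import aiofiles
    from fastapi.responses import StreamingResponse

    # app, TENANT, MAX_QUERY_PERIOD and process_hosts are assumed to be
    # defined elsewhere in the application.

    # 1 MB chunks
    CHUNK_SIZE = 1024 * 1024

    @app.get('/hosts/')
    async def hosts():
        tenant = TENANT
        # Get data for the period based on MAX_QUERY_PERIOD (default=3600s)
        epoch_start = time.time() - MAX_QUERY_PERIOD
        epoch_end = time.time()

        process_hosts(epoch_start, epoch_end)

        async def iter_file():
            # Read the file in chunks; it is closed once the generator finishes.
            async with aiofiles.open(f'output/hosts_traffic_{int(epoch_start)}.csv', mode='rb') as csv_file:
                while chunk := await csv_file.read(CHUNK_SIZE):
                    yield chunk

        # 'inline' asks the client to display the body instead of downloading it.
        headers = {'Content-Disposition': 'inline'}
        response = StreamingResponse(iter_file(), media_type="text/csv", headers=headers)

        return response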

Thanks!

  • Please have a look at [this answer](https://stackoverflow.com/a/73843234/17865804), regarding returning data using a `StreamingResponse`. – Chris Feb 25 '23 at 04:27
  • Fixed it so the csv_file doesn't remain open. I'll be testing this out with a larger data set to see what kind of response times I get. Thanks! – José Oliveira Feb 25 '23 at 12:29
  • Just edited the answer I posted, as FastAPI was taking forever to return the contents of the CSV file. Reading the file in chunks did the trick. Thank you all for your help and attention. – José Oliveira Feb 27 '23 at 16:42
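
The chunked-read idea from the last comment can be sketched without FastAPI or aiofiles. This is a minimal synchronous version, assuming a hypothetical helper named `iter_file_chunks` and a small temporary file standing in for the generated CSV. Iterating over a file object yields one line at a time, so a large CSV is sent as many small pieces, while fixed-size reads produce far fewer.

```python
import os
import tempfile

CHUNK_SIZE = 1024 * 1024  # 1 MB, as in the answer above


def iter_file_chunks(path, chunk_size=CHUNK_SIZE):
    """Yield the file's bytes in fixed-size chunks; the file closes when done."""
    with open(path, mode="rb") as csv_file:
        while chunk := csv_file.read(chunk_size):
            yield chunk


# Stand-in for the generated CSV: a header plus 1000 short rows.
with tempfile.NamedTemporaryFile(mode="wb", suffix=".csv", delete=False) as tmp:
    tmp.write(b"host,bytes\n")
    for i in range(1000):
        tmp.write(f"host{i},{i}\n".encode())
    sample_path = tmp.name

# Number of pieces produced when iterating line by line.
with open(sample_path, mode="rb") as f:
    line_count = sum(1 for _ in f)

# Number of pieces produced with fixed-size reads (small size for the demo).
chunks = list(iter_file_chunks(sample_path, chunk_size=4096))
os.remove(sample_path)

print(line_count, len(chunks))
```

A plain generator like this can also be passed to `StreamingResponse`, which accepts both synchronous and asynchronous iterators.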
0

This is easy with pandas:

Just read the CSV into a DataFrame using pd.read_csv and then turn the DataFrame into a dict using the to_dict() function.

Do note that pd.read_csv loads the whole file into memory, so large files can cause memory issues.

import pandas as pd

@app.get('/hosts/last_hour')
def hosts_last_hour():
    df_data = pd.read_csv("SampleCSVFile_11kb.csv")
    dict_data = df_data.to_dict()
    return dict_data

If you don't want to use pandas, then do this:

import csv

@app.get('/hosts/last_hour')
def hosts_last_hour():
    rows = []

    with open("data.csv", encoding='utf-8', newline='') as csv_file_handler:
        csv_reader = csv.DictReader(csv_file_handler)
        for row in csv_reader:
            rows.append(row)

    # Return the list itself: FastAPI serializes it to JSON, so calling
    # json.dumps here would encode the data twice.
    return rows
Lars K.
  • Thanks for the quick reply. As I don't want to use pandas, I turned to the second method. Unfortunately, this outputs JSON. I do have a solution that I'll be posting. – José Oliveira Feb 24 '23 at 20:14
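
To illustrate why the second method produces JSON rather than CSV, here is a small sketch using an in-memory sample (the column names `host` and `bytes` are made up for the example). `csv.DictReader` turns each row into a dict, and a list of dicts serializes as a JSON array; writing the same rows back out with `csv.DictWriter` gives CSV text again.

```python
import csv
import io
import json

sample_csv = "host,bytes\nalpha,10\nbeta,20\n"

# DictReader maps each data row to a dict keyed by the header row.
rows = list(csv.DictReader(io.StringIO(sample_csv)))

# Serializing those dicts gives JSON, which is what the comment describes.
as_json = json.dumps(rows)

# Writing the same rows with DictWriter restores CSV text.
buffer = io.StringIO()
writer = csv.DictWriter(buffer, fieldnames=["host", "bytes"], lineterminator="\n")
writer.writeheader()
writer.writerows(rows)
as_csv = buffer.getvalue()

print(as_json)
print(as_csv)
```

So if the response body has to stay CSV, the data should be sent as text (or as the raw file bytes) rather than as parsed rows.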