I have a dataframe which I need to convert to a CSV file, and then I need to send this CSV to an API. As I'm sending it to an API, I do not want to save it to the local filesystem and need to keep it in memory. How can I do this?

Josh

1 Answer

Easy way: convert your dataframe to a pandas DataFrame with toPandas(), then call to_csv with path_or_buf=None so that it returns the CSV as a string instead of writing a file. Then send the string in an API call.

From to_csv() documentation:

Parameters

path_or_buf : str or file handle, default None

File path or object, if None is provided the result is returned as a string.

So your code would likely look like this:

csv_string = df.toPandas().to_csv(path_or_buf=None)
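
To illustrate the "send the string in an API call" step, here is a minimal sketch using only the standard library. The helper name `build_csv_request`, the URL, and the `text/csv` content type are assumptions for illustration; your API may expect a different endpoint, headers, or a multipart upload. The literal `csv_string` stands in for the output of `to_csv`.

```python
import urllib.request


def build_csv_request(csv_string, url):
    """Build a POST request whose body is the CSV text; nothing touches disk."""
    body = csv_string.encode("utf-8")  # HTTP bodies are bytes, not str
    return urllib.request.Request(
        url,
        data=body,
        headers={"Content-Type": "text/csv; charset=utf-8"},  # assumed content type
        method="POST",
    )


# Stand-in for: df.toPandas().to_csv(path_or_buf=None, index=False)
csv_string = "id,name\n1,alice\n2,bob\n"

# Hypothetical endpoint, replace with the real API URL
request = build_csv_request(csv_string, "https://api.example.com/upload")

# Calling urllib.request.urlopen(request) would actually send it.
```

The request is built separately from sending it so the body and headers can be inspected before any network call is made.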

Alternatives: use tempfile.SpooledTemporaryFile with a large max_size to create an in-memory file. Or you can even use a regular file: make its buffer large enough and don't flush or close it. Take a look at Corey Goldberg's explanation of why this works.
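
A rough sketch of the SpooledTemporaryFile alternative follows. It uses the standard csv module with made-up rows so that it runs without pandas; the helper name `rows_to_csv_in_memory` and the 10 MB threshold are assumptions. With pandas you would presumably pass the file object to to_csv in place of the csv.writer call, though that is not tested here.

```python
import csv
import tempfile


def rows_to_csv_in_memory(rows, max_size=10 * 1024 * 1024):
    """Write rows as CSV into a spooled temp file and return the text."""
    # Data is held in memory until it exceeds max_size bytes; only then
    # does SpooledTemporaryFile roll over to a real file on disk.
    with tempfile.SpooledTemporaryFile(
        max_size=max_size, mode="w+", newline=""
    ) as buf:
        csv.writer(buf).writerows(rows)
        buf.seek(0)  # rewind before reading back what was written
        return buf.read()


# Made-up sample data for illustration
csv_string = rows_to_csv_in_memory([["id", "name"], [1, "alice"], [2, "bob"]])
```

Pick max_size comfortably above the expected CSV size, otherwise the data spills to disk and the point of the exercise is lost.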

Community
Sergey Kovalev
  • It looks like the argument name is actually `path_or_buf`. Works like a charm after making that change. – Josh May 06 '20 at 23:24
  • Agree with Josh, `path_or_buf` seems to be the current parameter name. Also, converting to pandas adds an index column automatically, so my code is `csv_string = df.toPandas().to_csv(path_or_buf=None, index=False)`. Hope it helps. – Fco Dec 21 '22 at 13:32