
I am trying to stream a CSV to Azure Blob Storage. The CSV is generated directly from Python scripts without a local copy. I have the following code, where `df` is the CSV file:

    with open(df,'w') as f:
        stream = io.BytesIO(f)
        stream.seek(0)
        block_blob_service.create_blob_from_stream('flowshop', 'testdata123', stream)

Then I got the error message:

    stream = io.BytesIO(f)
    TypeError: a bytes-like object is required, not '_io.TextIOWrapper'

I think the problem is an incorrect format. Can you please identify the problem? Thanks.


1 Answer


You opened `df` for write, then tried to pass the resulting file object as the initializer of `io.BytesIO` (which is supposed to take actual binary data, e.g. `b'1234'`). That's the cause of the `TypeError`; open files (read or write, text or binary) are not `bytes` or anything similar (`bytearray`, `array.array('B')`, `mmap.mmap`, etc.), so passing them to `io.BytesIO` makes no sense.
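
To make the distinction concrete, here is a minimal, standalone sketch of what `io.BytesIO` does and doesn't accept (no Azure involved; the file name is just a throwaway example):

    import io

    # io.BytesIO accepts actual binary data...
    buf = io.BytesIO(b'WO,1,2,3,4\n1,600,300,400,500\n')
    print(buf.read())   # b'WO,1,2,3,4\n1,600,300,400,500\n'

    # ...but a file object is not bytes-like, so this reproduces the error:
    with open('throwaway.csv', 'w') as f:
        try:
            io.BytesIO(f)
        except TypeError as e:
            print(e)    # a bytes-like object is required, not '_io.TextIOWrapper'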

It looks like your goal is to read from df, and you shouldn't need io.BytesIO at all for that. Just change the mode from (text) write, 'w', to binary read, 'rb'. Then pass the resulting file object to your API directly:

    with open(df, 'rb') as f:
        block_blob_service.create_blob_from_stream('flowshop', 'testdata123', f)

Update: Apparently `df` was your actual data, not a file name to open at all. Given that, you should really skip the stream API (which is pointless if the data is already in memory) and just use the bytes-based API directly:

    block_blob_service.create_blob_from_bytes('flowshop', 'testdata123', df)

or, if `df` is `str`, not `bytes`:

    block_blob_service.create_blob_from_bytes('flowshop', 'testdata123', df.encode('utf-8'))
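
Putting it together, a minimal end-to-end sketch, assuming the legacy azure-storage SDK that provides `BlockBlobService`, with placeholder account credentials and sample CSV content:

    from azure.storage.blob import BlockBlobService

    # Placeholder credentials -- substitute your own storage account values.
    block_blob_service = BlockBlobService(account_name='myaccount', account_key='mykey')

    # df is the CSV content already built in memory by the script (a str here).
    df = 'WO,1,2,3,4\n1,600,300,400,500\n2,200,500,100,300\n'

    # create_blob_from_bytes expects bytes, so encode when df is a str.
    block_blob_service.create_blob_from_bytes('flowshop', 'testdata123', df.encode('utf-8'))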
  • Hi, I did as your code said, but after checking the blob storage, the created testdata123 is an empty file. – Pepin Peng Mar 28 '18 at 21:58
  • @PepinPeng: Does whatever file is referenced by `df` have anything in it? Odds are it doesn't, since your previous code (which opened it for write) would have truncated it to an empty file as a side-effect of opening it (so even though the code failed immediately after, the data was already lost). – ShadowRanger Mar 28 '18 at 22:06
  • Hi, thanks, but I put a `print(f)` just before the blob API and got this printout: "<_io.BufferedReader name='WO,1,2,3,4\n1,600,300,400,500\n2,200,500,100,300\n3,100,500,200,300\n4,300,600,100,400\n'>". That means that after the open, the data is still there. Should I try writing to it? – Pepin Peng Mar 28 '18 at 22:09
  • @PepinPeng: Your `df` was the actual data, not a file name. That would have been useful to know initially. Given that, you don't need `open` at all. You want `block_blob_service.create_blob_from_stream('flowshop', 'testdata123', io.BytesIO(df))`, possibly `io.BytesIO(df.encode('utf-8'))` if `df` is a `str`. Or just skip the rigmarole, and use the API that accepts `bytes` directly: `create_blob_from_bytes('flowshop', 'testdata123', df)` (again, might need `df.encode('utf-8')` if `df` is `str`, not `bytes`). – ShadowRanger Mar 28 '18 at 22:13
  • I was struggling in the Shadow until I met your ShadowRanger; my whole day's coding ends with a success. Thank you, hahaha – Pepin Peng Mar 28 '18 at 22:16