Is it possible to upload a file to an API endpoint that accepts multipart/form-data as its content type when I only have the URL of that file?

Rule: Downloading the whole file into memory and then uploading it through this endpoint is not an option (there is no guarantee that the box will ever be big enough to hold a temporary file).

Question: I want to stream the file in chunks from one server (GET) to another (multipart/form-data POST). Is this possible? How can I achieve that?

Flow: file_server <-GET- my_script.py -POST-> upload server

Here is a simple example of the download-into-memory (RAM) option (which breaks the rule):

from io import BytesIO

import requests
from requests_toolbelt.multipart.encoder import MultipartEncoder

file_url = 'https://www.sysaid.com/wp-content/uploads/features/itam/image-banner-asset.png'
requested_file_response = requests.get(file_url, stream=True)

TOKEN_PAYLOAD = {
    'grant_type': 'password',
    'client_id': '#########',
    'client_secret': '#########',
    'username': '#########',
    'password': '#########'
}


def get_token():
    response = requests.post(
        'https://upload_server/oauth/token',
        params=TOKEN_PAYLOAD)
    response_data = response.json()
    token = response_data.get('access_token')
    if not token:
        print("token error!")
    return token

token = get_token()

file_object = BytesIO()
file_object.write(requested_file_response.content)

# Form content
multipart_data = MultipartEncoder(
    fields={
        '--': (
            'test.png',
            file_object  # AttributeError: 'generator' object has no attribute 'encode' when I try to pass generator here.
        ),  
        'id': '2217',
        'fileFieldDefId': '4258',
    }
)

# Create headers
headers = {
    "Authorization": "Bearer {}".format(token),
    'Content-Type': multipart_data.content_type
}

session = requests.Session()
response = session.post(
    'https://upload_server/multipartUpdate',
    headers=headers,
    data=multipart_data,
)

I suspect the answer lies in creating a file-like object that streams the data.

Thank you very much for any help. Cheers!

k.rozycki

1 Answer

If I read the requests_toolbelt source code right, it requires not only the ability to .read() the file (which we could get just by passing requests.get(..., stream=True).raw), but also some way to determine how much data is left in the stream.
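
To make that requirement concrete, here is a minimal sketch of the contract as I understand it: an object with a read(size) method plus a len attribute that shrinks as data is consumed. CountingReader is a hypothetical name, and io.BytesIO stands in for the network stream so the example runs offline; it is an illustration, not code taken from requests_toolbelt.

```python
import io


class CountingReader:
    """Hypothetical wrapper: read() plus a 'len' of bytes remaining."""

    def __init__(self, raw, total_length):
        self._raw = raw            # any object with a read(size) method
        self.len = total_length    # bytes still expected from the stream

    def read(self, size=-1):
        chunk = self._raw.read(size) or b''
        # Shrink the remaining count; force it to zero at end of stream.
        self.len = max(self.len - len(chunk), 0) if chunk else 0
        return chunk


# io.BytesIO is a stand-in (assumption) for a streamed HTTP response body.
payload = b'x' * 10
reader = CountingReader(io.BytesIO(payload), len(payload))
first = reader.read(4)
remaining_after_first = reader.len
rest = reader.read(100)
remaining_at_end = reader.len
```

Only one chunk is held in memory at a time, which is the property the question asks for.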

Assuming you are CONFIDENT that you always have a valid Content-Length header, this is the solution I would suggest:

import requests
from requests_toolbelt.multipart.encoder import MultipartEncoder

file_url = 'https://www.sysaid.com/wp-content/uploads/features/itam/image-banner-asset.png'
target = 'http://localhost:5000/test'


class PinocchioFile:
    """I wish I was a real file"""

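    def __init__(self, url):
        self.req = requests.get(url, stream=True)
        length = self.req.headers.get('content-length')
        if length is None:
            # read() does arithmetic on self.len, so fail early without a size.
            raise ValueError('no Content-Length header for {}'.format(url))
        # Number of bytes still left in the stream.
        self.len = int(length)
        self._raw = self.req.raw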

    def read(self, chunk_size):
        chunk = self._raw.read(chunk_size) or b''
        self.len -= len(chunk)
        if not chunk:
            self.len = 0
        return chunk


multipart_data = MultipartEncoder(
    fields={
        '--': (
            'test.png',
            PinocchioFile(file_url),
        ),
        'id': '2217',
        'fileFieldDefId': '4258',
    }
)

# Create headers
headers = {
    'Content-Type': multipart_data.content_type
}

response = requests.post(
    target,
    data=multipart_data,
    headers=headers,
)
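
Since the approach above depends entirely on the Content-Length header, it may be worth validating it before starting the upload; responses sent with chunked transfer encoding, for example, do not carry one. A small sketch, assuming a plain dict of headers; parse_content_length is a hypothetical helper, not part of requests or requests_toolbelt.

```python
def parse_content_length(headers):
    """Return Content-Length as a non-negative int, or raise ValueError."""
    # Header names are case-insensitive, so normalise the keys first.
    lowered = {str(key).lower(): value for key, value in headers.items()}
    value = lowered.get('content-length')
    if value is None:
        raise ValueError('no Content-Length header; stream size is unknown')
    length = int(value)  # raises ValueError for non-numeric values
    if length < 0:
        raise ValueError('negative Content-Length: {}'.format(length))
    return length


size = parse_content_length({'Content-Length': '1024'})
```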
Maciej Urbański