
I have been stuck on this issue for some days and cannot get past it with this configuration. Basically, I open a request stream with octet-stream content and, as the data arrives, write it to a file on the filesystem. This works perfectly both locally and hosted on Debian Linux servers, but there are issues when hosting it in Kubernetes, where I have only one pod (so a load balancer cannot be causing the problem). When I start creating and appending the bytes to the file, the file stays at size 0 the whole time; sometimes it does get written at the end of the upload, sometimes not. It is as if everything is being kept in memory, because when I pause the upload and resume it, the server finds the file size is 0 and starts again. There are other issues with this too, like uploading more than one file where the files are related to each other by business logic.

This is the small function I use to initialize the empty files:

def _initialize_file(uid: str) -> None:
    if not os.path.exists(nfs_path_that_is_shared):
        os.makedirs(nfs_path_that_is_shared)

    open(os.path.join(nfs_path_that_is_shared, f"{uid}"), "a").close()
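
For context, this is roughly where that initializer is called from. The creation endpoint below is only an illustrative sketch (the route, the response shape, and leaving out the metadata writing are simplifications, not the exact code):

from uuid import uuid4

@router.post("/", status_code=status.HTTP_201_CREATED)
async def create_upload() -> dict:
    # illustrative only: generate an id, create the empty file on the shared
    # volume, and (in the real code) also write the initial metadata record
    uid = str(uuid4())
    _initialize_file(uid)
    return {"uuid": uid}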

This PATCH endpoint accepts the file chunks and completes the upload:

@router.patch("/{uuid}", status_code=status.HTTP_204_NO_CONTENT)
async def upload_chunk(
    response: Response,
    uuid: str,
    content_type: str = Header(None),
    content_length: int = Header(None),
    upload_offset: int = Header(None),
    _: bytes | None = Depends(_get_request_chunk),
) -> Response:
    return await _get_and_save_the_file(
        response,
        uuid,
        content_type,
        content_length,
        upload_offset,
    )
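
For reference, this is roughly how a chunk gets sent to that endpoint from the client side (a sketch only; the host name, content type, and the way the chunk is produced are illustrative, not my actual client code):

import requests

def send_chunk(uid: str, data: bytes, offset: int) -> None:
    # illustrative client call: one PATCH per chunk, with the headers the
    # endpoint above reads (Content-Type, Content-Length, Upload-Offset)
    resp = requests.patch(
        f"http://upload-service.example.com/{uid}",
        data=data,
        headers={
            "Content-Type": "application/octet-stream",
            "Content-Length": str(len(data)),
            "Upload-Offset": str(offset),
        },
    )
    resp.raise_for_status()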

And here is the method that is not working as expected in Kubernetes deployments:

async def _get_request_chunk(
    request: Request,
    uuid: str = Path(...),
    post_request: bool = False,
) -> bool | None:
    log.info("Reading metadata")
    meta = _read_metadata(uuid)
    if not meta or not _file_exists(uuid):
        log.info("Metadata not found")
        return False

    path = f"{nfs_path_that_is_shared}/{uuid}"
    with open(path, "ab") as f:

        log.info("Getting chunks")
        async for chunk in request.stream():
            # note: `and` binds tighter than `or`, so any empty chunk stops processing here
            if (post_request and chunk is None) or len(chunk) == 0:
                log.info("Chunk is empty")
                return None

            log.info("Checking chunk size")
            if _get_file_length(uuid) + len(chunk) > MAX_SIZE:
                log.info("Throwing HTTP_ENTITY_TO_LARGE")
                raise HTTP_ENTITY_TO_LARGE

            log.info("Writing chunk")
            f.write(chunk)
            log.info("Modifying metadata")
            meta.offset += len(chunk)
            meta.upload_chunk_size = len(chunk)
            meta.upload_part += 1
            log.info("Writing metadata")
            _write_metadata(meta)
            log.info("Metadata written")

    return True
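
One variant I can still try is forcing the bytes out to the volume after every chunk with an explicit flush and fsync, in case the data is sitting in a buffer until the file is closed. This is only a sketch reusing the same names as the loop above (and with the size check and part counting abbreviated); I have not verified that it changes the behaviour on the cluster:

    path = f"{nfs_path_that_is_shared}/{uuid}"
    with open(path, "ab") as f:
        async for chunk in request.stream():
            if len(chunk) == 0:
                break

            f.write(chunk)
            f.flush()             # push Python's internal buffer down to the OS
            os.fsync(f.fileno())  # ask the OS to commit the bytes to the (NFS) volume

            meta.offset += len(chunk)
            _write_metadata(meta)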

I am planning to add Redis as a solution (it fits into our infrastructure) if I cannot find a fix for this...

Any help is much appreciated!

I tried adding an NFS volume that is shared between namespaces and pods, but the issue was the same: the file is sometimes written, but usually no bytes reach it until everything has been uploaded, whereas locally the bytes are written as they are received.

Edit: I also implemented async, non-blocking writes with aiofiles, but that did not help either.

  • Does this answer your question? [How to Upload a large File (≥3GB) to FastAPI backend?](https://stackoverflow.com/questions/73442335/how-to-upload-a-large-file-%e2%89%a53gb-to-fastapi-backend) – Chris Feb 01 '23 at 07:24
  • No, I did try the links you offered earlier on the other post, but it did not work... – Edi Feb 01 '23 at 07:46

0 Answers