403 Forbidden when trying to upload PDF as blob to S3 bucket using PUT

Question

What I'm Trying to Do

Upload a PDF file from a browser client without exposing any credentials or anything unsavory. Based on this, I thought it could be done, but it doesn't seem to work for me.

The premise is:

you request a pre-signed URL from an S3 Bucket based on a set of parameters supplied to a function that is part of the JavaScript AWS SDK
you supply this URL to the frontend, which can use it to place a file in the S3 Bucket without needing to use any credentials or authentication on the frontend.

GET a Pre-Signed URL From S3

This part is simple and it works for me. I just request a URL from S3 with this little JS nugget:

const s3Params = {
    Bucket: uploadBucket,
    Key: `${fileId}.pdf`,
    ContentType: 'application/pdf',
    Expires: 60,
    ACL: 'public-read',
}

let uploadUrl = s3.getSignedUrl('putObject', s3Params);

Use the Pre-Signed URL to Upload a File to S3

This is the part that doesn't work, and I can't figure out why. This little chunk of code basically sends a blob of data to the S3 bucket pre-signed URL using a PUT request.

const result = await fetch(response.data.uploadURL, {
        method: 'put',
        body: blobData,
});

PUT or POST?

I've found that using any POST requests results in 400 Bad Request, so PUT it is.

What I've Looked At

Content-Type (in my case, it'd be application/pdf, so blobData.type) -- they match between the backend and frontend.

x-amz-acl header

Frontend (fileUploader.js, using Vue):

...

uploadFile: async function(e) {
      /* receives file from simple input element -> this.file */
      // get signed URL
      const response = await axios({
        method: 'get',
        url: API_GATEWAY_URL
      });

      console.log('upload file response:', response);

      let binary = atob(this.file.split(',')[1]);
      let array = [];

      for (let i = 0; i < binary.length; i++) {
        array.push(binary.charCodeAt(i));
      }

      let blobData = new Blob([new Uint8Array(array)], {type: 'application/pdf'});
      console.log('uploading to:', response.data.uploadURL);
      console.log('blob type sanity check:', blobData.type);

      const result = await fetch(response.data.uploadURL, {
        method: 'put',
        headers: {
          'Access-Control-Allow-Methods': '*',
          'Access-Control-Allow-Origin': '*',
          'x-amz-acl': 'public-read',
          'Content-Type': blobData.type
        },
        body: blobData,
      });

      console.log('PUT result:', result);

      this.uploadUrl = response.data.uploadURL.split('?')[0];
    }

Backend (fileReceiver.js):

'use strict';

const uuidv4 = require('uuid/v4');
const aws = require('aws-sdk');
const s3 = new aws.S3();

const uploadBucket = 'the-chumiest-bucket';
const fileKeyPrefix = 'path/to/where/the/file/should/live/';

const getUploadUrl = async () => {
  const fileId = uuidv4();
  const s3Params = {
    Bucket: uploadBucket,
    Key: `${fileId}.pdf`,
    ContentType: 'application/pdf',
    Expires: 60,
    ACL: 'public-read',
  }

  return new Promise((resolve, reject) => {
    let uploadUrl = s3.getSignedUrl('putObject', s3Params);
    resolve({
      'statusCode': 200,
      'isBase64Encoded': false,
      'headers': { 
        'Access-Control-Allow-Origin': '*',
        'Access-Control-Allow-Headers': '*',
        'Access-Control-Allow-Credentials': true,
      },
      'body': JSON.stringify({
        'uploadURL': uploadUrl,
        'filename': `${fileId}.pdf`
      })
    });
  });
};

exports.handler = async (event, context) => {
  console.log('event:', event);
  const result = await getUploadUrl();
  console.log('result:', result);

  return result;
}

Serverless config (serverless.yml):

service: ocr-space-service

provider:
  name: aws
  region: ca-central-1
  stage: ${opt:stage, 'dev'}
  timeout: 20

plugins:
  - serverless-plugin-existing-s3
  - serverless-step-functions
  - serverless-pseudo-parameters
  - serverless-plugin-include-dependencies

layers:
  spaceOcrLayer:
    package:
      artifact: spaceOcrLayer.zip
    allowedAccounts:
      - "*"

functions:
  fileReceiver:
    handler: src/node/fileReceiver.handler
    events:
      - http:
          path: /doc-parser/get-url
          method: get
          cors: true
  startStateMachine:
    handler: src/start_state_machine.lambda_handler
    role: 
    runtime: python3.7
    layers:
      - {Ref: SpaceOcrLayerLambdaLayer}
    events:
      - existingS3:
          bucket: ingenio-documents
          events:
            - s3:ObjectCreated:*
          rules:
            - prefix: 
            - suffix: .pdf
  startOcrSpaceProcess:
    handler: src/start_ocr_space.lambda_handler
    role: 
    runtime: python3.7
    layers:
      - {Ref: SpaceOcrLayerLambdaLayer}
  parseOcrSpaceOutput:
    handler: src/parse_ocr_space_output.lambda_handler
    role: 
    runtime: python3.7
    layers:
      - {Ref: SpaceOcrLayerLambdaLayer}
  renamePdf:
    handler: src/rename_pdf.lambda_handler
    role: 
    runtime: python3.7
    layers:
      - {Ref: SpaceOcrLayerLambdaLayer}
  parseCorpSearchOutput:
    handler: src/node/pdfParser.handler
    role: 
    runtime: nodejs10.x
  saveFileToProcessed:
    handler: src/node/saveFileToProcessed.handler
    role: 
    runtime: nodejs10.x

stepFunctions:
  stateMachines:
    ocrSpaceStepFunc:
      name: ocrSpaceStepFunc
      definition:
        StartAt: StartOcrSpaceProcess
        States:
          StartOcrSpaceProcess:
            Type: Task
            Resource: "arn:aws:lambda:#{AWS::Region}:#{AWS::AccountId}:function:#{AWS::StackName}-startOcrSpaceProcess"
            Next: IsDocCorpSearchChoice
            Catch:
            - ErrorEquals: ["HandledError"]
              Next: HandledErrorFallback
          IsDocCorpSearchChoice:
            Type: Choice
            Choices:
              - Variable: $.docIsCorpSearch
                NumericEquals: 1
                Next: ParseCorpSearchOutput
              - Variable: $.docIsCorpSearch
                NumericEquals: 0
                Next: ParseOcrSpaceOutput
          ParseCorpSearchOutput:
            Type: Task
            Resource: "arn:aws:lambda:#{AWS::Region}:#{AWS::AccountId}:function:#{AWS::StackName}-parseCorpSearchOutput"
            Next: SaveFileToProcessed
            Catch:
              - ErrorEquals: ["SqsMessageError"]
                Next: CorpSearchSqsErrorFallback
              - ErrorEquals: ["DownloadFileError"]
                Next: CorpSearchDownloadFileErrorFallback
              - ErrorEquals: ["HandledError"]
                Next: HandledNodeErrorFallback
          SaveFileToProcessed:
            Type: Task
            Resource: "arn:aws:lambda:#{AWS::Region}:#{AWS::AccountId}:function:#{AWS::StackName}-saveFileToProcessed"
            End: true
          ParseOcrSpaceOutput:
            Type: Task
            Resource: "arn:aws:lambda:#{AWS::Region}:#{AWS::AccountId}:function:#{AWS::StackName}-parseOcrSpaceOutput"
            Next: RenamePdf
            Catch:
            - ErrorEquals: ["HandledError"]
              Next: HandledErrorFallback
          RenamePdf:
            Type: Task
            Resource: "arn:aws:lambda:#{AWS::Region}:#{AWS::AccountId}:function:#{AWS::StackName}-renamePdf"
            End: true
            Catch:
              - ErrorEquals: ["HandledError"]
                Next: HandledErrorFallback
              - ErrorEquals: ["AccessDeniedException"]
                Next: AccessDeniedFallback
          AccessDeniedFallback:
            Type: Fail
            Cause: "Access was denied for copying an S3 object"
          HandledErrorFallback:
            Type: Fail
            Cause: "HandledError occurred"
          CorpSearchSqsErrorFallback:
            Type: Fail
            Cause: "SQS Message send action resulted in error"
          CorpSearchDownloadFileErrorFallback:
            Type: Fail
            Cause: "Downloading file from S3 resulted in error"
          HandledNodeErrorFallback:
            Type: Fail
            Cause: "HandledError occurred"

Error:

403 Forbidden

PUT Response

Response {type: "cors", url: "https://{bucket-name}.s3.{region-id}.amazonaw…nedHeaders=host%3Bx-amz-acl&x-amz-acl=public-read", redirected: false, status: 403, ok: false, …} body: (...) bodyUsed: false headers: Headers {} ok: false redirected: false status: 403 statusText: "Forbidden" type: "cors" url: "https://{bucket-name}.s3.{region-id}.amazonaws.com/actionID.pdf?Content-Type=application%2Fpdf&X-Amz-Algorithm=SHA256&X-Amz-Credential=CREDZ-&X-Amz-Date=20190621T192558Z&X-Amz-Expires=900&X-Amz-Security-Token={token}&X-Amz-SignedHeaders=host%3Bx-amz-acl&x-amz-acl=public-read" proto: Response

What I'm Thinking

I'm thinking the parameters supplied to the getSignedUrl call using the AWS S3 SDK aren't correct, though they follow the structure suggested by AWS' docs (explained here). Aside from that, I'm really lost as to why my request is rejected. I've even tried exposing my Bucket to the public fully and it still didn't work.

Edit

#1:

After reading this, I tried to structure my PUT request like this:

      let authFromGet = response.config.headers.Authorization;      

      const putHeaders = {
        'Authorization': authFromGet,
        'Content-Type': blobData,
        'Expect': '100-continue',
      };

      ...

      const result = await fetch(response.data.uploadURL, {
        method: 'put',
        headers: putHeaders,
        body: blobData,
      });

this resulted in a 400 Bad Request instead of a 403; different, but still wrong. It's apparent that putting any headers on the request is wrong.

@ChumiestBucket I am curious if you got to a resolution? You did such a great job writing up the problem that it would be a shame to see no resolution even if you just wrote it yourself. — Nexeh, May 13 '20 at 15:05

James Beswick · Answer 1 · 2019-07-14T20:21:11.733

Digging into this, it's because you are trying to upload an object with a public ACL into a bucket that doesn't allow public objects.

Optionally remove the public ACL statement or...
Ensure the bucket set to either
- Be publicly viewable or
- Ensure no other policy blocks public access (e.g. Do you have an account policy preventing publicly viewable objects but attempting to upload an object with the public ACL?)

Basically, you cannot upload objects with a public ACL into a bucket where there is some restriction preventing that - you'll get the 403 error you describe. HTH.