This is Part Two of my journey to try and upload a file from the browser to an S3 bucket. Check out Part One.

What I'm Trying to Do

Upload a file (PDF) from my browser client (Vue/Nuxt app) to an S3 bucket.

What I'm Using

Front End

Nuxt, Vue, Axios, Fetch.

Back End

JS, Serverless Framework for easy AWS deployment.

Permissions

The permissions are good, unless there is some permission regarding POSTing to an S3 bucket that I don't know about. The S3 bucket is set up to allow CORS: all methods, all origins. And no, the Lambda shown here does not have incorrect permissions; if it did, I would not be able to get a pre-signed URL successfully.

How It Should Work

Based on this and this, I thought I'd be able to PUT (or POST) a PDF to an S3 bucket by first retrieving a pre-signed URL from my backend code shown below, which called s3.getSignedUrl('putObject'). Getting this URL to the frontend was easy, but PUTing (or POSTing) the PDF to the returned URL didn't work.

What I did in Part One proved useless, and I haven't been able to track down an answer as to why, so I decided to try another JS AWS SDK function called createPresignedPost.

Instead of a 403 error, I now get a 412 error. After some reading, it seemed like I wasn't passing the right options into the POST request. After looking at this, I set up my request as shown below, but still got the error.

Code

Backend (fileReceiver.js):

const getUploadUrl = async () => {
  const fileId = uuidv4();

  let params = {
    Bucket: 'the-chumiest-bucket',
    ContentType: 'application/pdf',
    Conditions: [
      {'bucket': 'the-chumiest-bucket'}, // added to both places 
      ['starts-with', '$key', 'path/to/where/the/file/should/go/'],
      {'acl': 'public-read'},
      {'success_action_status': '200'},
      ['content-length-range', 1, 1024 * 1024 * 15],
      ['starts-with', '$Content-Type', 'application/pdf'],
    ]
  };

  return new Promise((resolve, reject) => {
    s3.createPresignedPost(params, function(err, data) {
      if (err) {
        console.error('Presigning post data encountered an error', err);
        reject(err);
      } else {
        //data.Fields.key = 'path/to/uploads/${filename}';
        console.log('The post data is', data);
        resolve({
          'statusCode': 200,
          'headers': { 
            'Access-Control-Allow-Origin': '*',
            'Access-Control-Allow-Headers': '*',
            'Access-Control-Allow-Credentials': true,
            'Content-Type': 'application/pdf',
          },
          'body': JSON.stringify({
            'data': data,
          }),
        });
      }
    });
  });

};
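For comparison, here is a minimal sketch of how the parameters might be arranged instead. As far as I can tell from the AWS SDK for JavaScript v2 documentation, createPresignedPost accepts Bucket, Fields, Conditions and Expires, so a top-level ContentType is probably ignored. My understanding is also that the SDK adds an exact-match condition for every entry in Fields, which would leave Conditions for constraints that are not fixed values. The helper name buildPresignedPostParams, its arguments, the example file name and the Expires value are all hypothetical.

```javascript
// Hypothetical helper: builds a params object for s3.createPresignedPost (SDK v2).
// Fixed form values go under Fields; range-style constraints go under Conditions.
function buildPresignedPostParams(bucket, keyPrefix, fileName) {
  return {
    Bucket: bucket,
    Expires: 300, // lifetime of the signed policy in seconds (assumed value)
    Fields: {
      key: keyPrefix + fileName,
      acl: 'public-read',
      'Content-Type': 'application/pdf',
      success_action_status: '200',
    },
    Conditions: [
      // Reject empty files and anything larger than 15 MiB.
      ['content-length-range', 1, 1024 * 1024 * 15],
    ],
  };
}

const params = buildPresignedPostParams(
  'the-chumiest-bucket',
  'path/to/where/the/file/should/go/',
  'example.pdf' // placeholder file name
);
console.log(params.Fields.key);
```

With this layout the frontend would not need to rebuild key or acl itself, since both would come back in the fields object returned by the SDK.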

Frontend (fileUploader.js):

I wanted to structure my request to match this and this.

uploadFile: async function(e) {
      const response = await axios({
        method: 'get',
        url: API_GATEWAY_URL,
        data: {
          'fileName': this.fileNames[0],
          'contentType': this.file.type || 'application/pdf',
          'fileLength': this.file.length,
        },
        body: {
          'fileName': this.fileNames[0],
          'contentType': this.file.type || 'application/pdf',
          'fileLength': this.file.length,
        },
      });

      console.log('upload file response:', response);

      let binary = atob(this.file.split(',')[1]);
      let array = [];

      for (let i = 0; i < binary.length; i++) {
        array.push(binary.charCodeAt(i));
      }

      let fields = response.data.data.fields;
      let blobData = new Blob([new Uint8Array(array)], {type: 'application/pdf'});

      if (blobData.type != 'application/pdf') {
        console.error('Filetype is wrong! Upload a PDF or youre dead to me');
      }

      console.log(`Uploading "${this.fileNames[0]}" to bucket "${response.data.data.fields.bucket}"`);

      const postHeaders = {
        'Policy': fields.Policy,
        'X-Amz-Algorithm': fields['X-Amz-Algorithm'],
        'X-Amz-Credential': fields['X-Amz-Credential'],
        'X-Amz-Security-Token': fields['X-Amz-Security-Token'],
        'X-Amz-Signature': fields['X-Amz-Signature'],
        'Content-Type': blobData.type || 'application/pdf',
        'acl': 'public-read',
        'key': 'path/to/where/the/file/should/go/' + this.fileNames[0],
      }

      const result = await fetch(response.data.data.url, {
        method: 'post',
        body: blobData,
        headers: postHeaders,
      });

      this.uploadUrl = response.data.uploadURL.split('?')[0];
    },
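One thing worth checking in the request above: a GET request normally carries no body. The axios documentation describes data as applying only to PUT, POST, DELETE and PATCH, and body is not an axios option at all, so the file name may never reach the Lambda. Query-string parameters are the usual route for a GET (axios exposes them as params). A minimal sketch using the standard URL API follows; buildUploadUrlRequest and the example endpoint are hypothetical.

```javascript
// Hypothetical helper: encodes the file details as query-string parameters.
function buildUploadUrlRequest(apiUrl, fileName, contentType) {
  const url = new URL(apiUrl);
  url.searchParams.set('fileName', fileName);
  // Fall back to PDF when the browser did not report a type.
  url.searchParams.set('contentType', contentType || 'application/pdf');
  return url.toString();
}

const requestUrl = buildUploadUrlRequest(
  'https://example.invalid/doc-parser/get-url', // placeholder endpoint
  'my file.pdf',
  ''
);
console.log(requestUrl);
```

With the Lambda proxy integration, these values should then be available on the handler's event.queryStringParameters.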

Serverless Config (serverless.yml):

service: ocr-space-service

provider:
  name: aws
  region: ca-central-1
  stage: ${opt:stage, 'dev'}
  timeout: 20

plugins:
  - serverless-plugin-existing-s3
  - serverless-step-functions
  - serverless-pseudo-parameters
  - serverless-plugin-include-dependencies

layers:
  spaceOcrLayer:
    package:
      artifact: spaceOcrLayer.zip
    allowedAccounts:
      - "*"

functions:
  fileReceiver:
    handler: src/node/fileReceiver.handler
    role:
    events:
      - http:
          path: /doc-parser/get-url
          method: get
          cors: true
  startStateMachine:
    handler: src/start_state_machine.lambda_handler
    role: 
    runtime: python3.7
    layers:
      - {Ref: SpaceOcrLayerLambdaLayer}
    events:
      - existingS3:
          bucket: ingenio-documents
          events:
            - s3:ObjectCreated:*
          rules:
            - prefix: 
            - suffix: .pdf
  startOcrSpaceProcess:
    handler: src/start_ocr_space.lambda_handler
    role: 
    runtime: python3.7
    layers:
      - {Ref: SpaceOcrLayerLambdaLayer}
  parseOcrSpaceOutput:
    handler: src/parse_ocr_space_output.lambda_handler
    role: 
    runtime: python3.7
    layers:
      - {Ref: SpaceOcrLayerLambdaLayer}
  renamePdf:
    handler: src/rename_pdf.lambda_handler
    role: 
    runtime: python3.7
    layers:
      - {Ref: SpaceOcrLayerLambdaLayer}
  parseCorpSearchOutput:
    handler: src/node/pdfParser.handler
    role: 
    runtime: nodejs10.x
  saveFileToProcessed:
    handler: src/node/saveFileToProcessed.handler
    role: 
    runtime: nodejs10.x

stepFunctions:
  stateMachines:
    ocrSpaceStepFunc:
      name: ocrSpaceStepFunc
      definition:
        StartAt: StartOcrSpaceProcess
        States:
          StartOcrSpaceProcess:
            Type: Task
            Resource: "arn:aws:lambda:#{AWS::Region}:#{AWS::AccountId}:function:#{AWS::StackName}-startOcrSpaceProcess"
            Next: IsDocCorpSearchChoice
            Catch:
            - ErrorEquals: ["HandledError"]
              Next: HandledErrorFallback
          IsDocCorpSearchChoice:
            Type: Choice
            Choices:
              - Variable: $.docIsCorpSearch
                NumericEquals: 1
                Next: ParseCorpSearchOutput
              - Variable: $.docIsCorpSearch
                NumericEquals: 0
                Next: ParseOcrSpaceOutput
          ParseCorpSearchOutput:
            Type: Task
            Resource: "arn:aws:lambda:#{AWS::Region}:#{AWS::AccountId}:function:#{AWS::StackName}-parseCorpSearchOutput"
            Next: SaveFileToProcessed
            Catch:
              - ErrorEquals: ["SqsMessageError"]
                Next: CorpSearchSqsErrorFallback
              - ErrorEquals: ["DownloadFileError"]
                Next: CorpSearchDownloadFileErrorFallback
              - ErrorEquals: ["HandledError"]
                Next: HandledNodeErrorFallback
          SaveFileToProcessed:
            Type: Task
            Resource: "arn:aws:lambda:#{AWS::Region}:#{AWS::AccountId}:function:#{AWS::StackName}-saveFileToProcessed"
            End: true
          ParseOcrSpaceOutput:
            Type: Task
            Resource: "arn:aws:lambda:#{AWS::Region}:#{AWS::AccountId}:function:#{AWS::StackName}-parseOcrSpaceOutput"
            Next: RenamePdf
            Catch:
            - ErrorEquals: ["HandledError"]
              Next: HandledErrorFallback
          RenamePdf:
            Type: Task
            Resource: "arn:aws:lambda:#{AWS::Region}:#{AWS::AccountId}:function:#{AWS::StackName}-renamePdf"
            End: true
            Catch:
              - ErrorEquals: ["HandledError"]
                Next: HandledErrorFallback
              - ErrorEquals: ["AccessDeniedException"]
                Next: AccessDeniedFallback
          AccessDeniedFallback:
            Type: Fail
            Cause: "Access was denied for copying an S3 object"
          HandledErrorFallback:
            Type: Fail
            Cause: "HandledError occurred"
          CorpSearchSqsErrorFallback:
            Type: Fail
            Cause: "SQS Message send action resulted in error"
          CorpSearchDownloadFileErrorFallback:
            Type: Fail
            Cause: "Downloading file from S3 resulted in error"
          HandledNodeErrorFallback:
            Type: Fail
            Cause: "HandledError occurred"

POST Response

POST https://s3.{regionName}.amazonaws.com/{bucketName} 412 (Precondition Failed)

<Error>
  <Code>PreconditionFailed</Code>
  <Message>At least one of the pre-conditions you specified did not hold</Message>
  <Condition>Bucket POST must be of the enclosure-type multipart/form-data</Condition>
  <RequestId>934186E69EF6F90E</RequestId>
  <HostId>APU1d8pkKL3XxXjZ8T1oXkWuDRECAPZROklZbHBv+lmNRv/ivoLO/8BhoS8QYXA98850RhrGwhI=</HostId>
</Error>
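S3 returns errors like the one above as a small XML document, so pulling out the individual elements makes the failure easier to read while debugging. Here is a rough sketch that uses a regular expression rather than a real XML parser (acceptable for flat error bodies, not for arbitrary XML). parseS3Error is a hypothetical helper, and the sample body contains only the first three elements of the response.

```javascript
// Hypothetical helper: collects the leaf elements of a flat S3 error body.
function parseS3Error(xml) {
  const result = {};
  // Matches <Name>text</Name> where the text contains no nested tags.
  const pattern = /<(\w+)>([^<]*)<\/\1>/g;
  let match;
  while ((match = pattern.exec(xml)) !== null) {
    result[match[1]] = match[2];
  }
  return result;
}

const sampleBody =
  '<Error>' +
  '<Code>PreconditionFailed</Code>' +
  '<Message>At least one of the pre-conditions you specified did not hold</Message>' +
  '<Condition>Bucket POST must be of the enclosure-type multipart/form-data</Condition>' +
  '</Error>';

const s3Error = parseS3Error(sampleBody);
console.log(s3Error.Code + ': ' + s3Error.Condition);
```

In the upload code this could be fed from the fetch result, for example parseS3Error(await result.text()).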

Edit

Looks like the request is being rejected with a 412 because S3 expects multipart/form-data, not the application/pdf I specified.

New Code

I wrote a function to convert the PDF to base64 and upload it as part of a FormData object.

convertToBase64(file) {
      if (file.length > 0) {
        let fileToLoad = file[0];
        let fileReader = new FileReader();
        let base64;
        fileReader.onload = (fileLoadedEvent) => {
          console.log('fileLoadedEvent', fileLoadedEvent);
          base64 = fileLoadedEvent.target.result;
          console.log('base64', base64);
        };
        fileReader.readAsDataURL(fileToLoad);
      } else {
        console.log('file length was 0');
      }
    },
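Note that, as written, convertToBase64 has no return statement, and FileReader delivers its result asynchronously in onload, so a caller receives undefined. Wrapping the reader in a promise is one common way to get the value back out. This is only a sketch: readAsDataUrl is a hypothetical helper, and the ReaderClass parameter exists purely so the function can be exercised outside a browser.

```javascript
// Hypothetical helper: resolves with the data URL once the reader has finished.
// ReaderClass defaults to the browser's FileReader and is only injectable for testing.
function readAsDataUrl(file, ReaderClass = FileReader) {
  return new Promise((resolve, reject) => {
    const reader = new ReaderClass();
    reader.onload = () => resolve(reader.result);
    reader.onerror = () => reject(reader.error);
    reader.readAsDataURL(file);
  });
}

// Usage inside an async method (browser only):
//   const dataUrl = await readAsDataUrl(this.file[0]);
```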

...

let formData = new FormData();
formData.append('pdfFile', this.convertToBase64(this.file));

// upload code from above...

Still no love! Not sure if what I'm trying to do is even possible.
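For what it's worth, the S3 documentation for browser-based POST uploads expects the policy fields to be sent as form fields in the multipart body rather than as request headers. The file itself goes in a field named file, and S3 ignores any fields that come after it. FormData also accepts a File or Blob directly, so the base64 step may not be needed. Below is a minimal sketch of that ordering, assuming the extra fields mirror the conditions in the policy above. presignedPostEntries and toFormData are hypothetical helpers, and the field values in the demo are placeholders.

```javascript
// Hypothetical helper: returns [name, value] pairs in the order S3 expects,
// with the signed fields first and the file last.
function presignedPostEntries(fields, extraFields, file) {
  const entries = Object.entries({ ...fields, ...extraFields });
  entries.push(['file', file]); // S3 ignores form fields that come after "file"
  return entries;
}

// Hypothetical helper: copies the ordered pairs into a FormData object.
function toFormData(entries) {
  const form = new FormData();
  for (const [name, value] of entries) {
    form.append(name, value);
  }
  return form;
}

const demoEntries = presignedPostEntries(
  { Policy: 'policy-placeholder', 'X-Amz-Signature': 'signature-placeholder' },
  { key: 'path/to/where/the/file/should/go/example.pdf', acl: 'public-read' },
  'file-contents-placeholder' // in the browser this would be the File object
);
console.log(demoEntries.map(([name]) => name).join(', '));

// Browser usage, leaving Content-Type unset so the browser adds the multipart boundary:
//   const form = toFormData(presignedPostEntries(fields, extraFields, pdfFile));
//   await fetch(response.data.data.url, { method: 'POST', body: form });
```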

ChumiestBucket