
I have an endpoint that uploads an image file to the server, then to S3.

When I run on localhost, the MultipartFile byte size is correct, and the upload is successful.

However, the moment I deploy it to my EC2 instance the uploaded file size is incorrect.

Controller Code

@PostMapping("/{id}/photos")
fun addPhotos(@PathVariable("id") id: Long,
              @RequestParam("file") file: MultipartFile,
              jwt: AuthenticationJsonWebToken) = ApiResponse.success(propertyBLL.addPhotos(id, file, jwt))

Within the PropertyBLL.addPhotos method, printing file.size results in the wrong size.

Actual file size is 649305 bytes, however when uploaded to my prod server it reads as 1189763 bytes.

  • My production server is an AWS EC2 instance, behind HTTPS.
  • The Spring application yml files are the same. The only configurations I overrode were the max file size properties.
  • I'm using Postman to POST the request. I'm passing the body as form-data, with the key named "file".
  • Again, it works perfectly when running locally.

I did another test where I wrote the uploaded file to the server so I could compare.

Uploaded file's first n bytes in Hex editor:

EFBFBD50 4E470D0A 1A0A0000 000D4948 44520000 03000000 02400802 000000EF BFBDCC96 01000000 0467414D 410000EF BFBDEFBF BD0BEFBF BD610500 00002063 48524D00 007A2600

Original file's first n bytes:

89504E47 0D0A1A0A 0000000D 49484452 00000300 00000240 08020000 00B5CC96 01000000 0467414D 410000B1 8F0BFC61 05000000 20634852 4D00007A 26000080 840000FA 00000080

They both appear to have the text "PNG" in them and also have the ending EXtdate:modify/create markers.
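One reading of the two dumps, offered as an observation rather than something confirmed in the post: every place where the original has a byte of 0x80 or above that is not valid UTF-8 (89, B5, B1, 8F, FC), the uploaded copy has EF BF BD, which is the UTF-8 encoding of the replacement character U+FFFD. That is what happens when binary data is decoded as UTF-8 text and then encoded again somewhere along the way. A minimal Java sketch reproducing the effect on the first 16 bytes of the original file (the class and method names are hypothetical):

```java
import java.nio.charset.StandardCharsets;

class Utf8RoundTrip {

    // Decode arbitrary bytes as UTF-8 text, then encode that text back to bytes.
    // Malformed sequences become U+FFFD, which is written out as EF BF BD.
    static byte[] roundTrip(byte[] raw) {
        String asText = new String(raw, StandardCharsets.UTF_8);
        return asText.getBytes(StandardCharsets.UTF_8);
    }

    // Parse a hex string such as "89504E47" into bytes.
    static byte[] fromHex(String hex) {
        byte[] out = new byte[hex.length() / 2];
        for (int i = 0; i < out.length; i++) {
            out[i] = (byte) Integer.parseInt(hex.substring(2 * i, 2 * i + 2), 16);
        }
        return out;
    }

    // Format bytes as upper-case hex without separators.
    static String toHex(byte[] bytes) {
        StringBuilder sb = new StringBuilder();
        for (byte b : bytes) {
            sb.append(String.format("%02X", b));
        }
        return sb.toString();
    }

    public static void main(String[] args) {
        // First 16 bytes of the original file, taken from the hex dump above.
        byte[] original = fromHex("89504E470D0A1A0A0000000D49484452");
        byte[] mangled = roundTrip(original);
        System.out.println(toHex(mangled));
        System.out.println(original.length + " -> " + mangled.length + " bytes");
    }
}
```

The result starts with EFBFBD50 4E470D0A, like the uploaded dump. Each replaced byte grows from one byte to three, which would also explain a file that ends up much larger than the original.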

Per request, the core contents of addPhotos:

val metadata = ObjectMetadata()
metadata.contentLength = file.size
metadata.contentType = "image/png"
LOGGER.info("Uploading image of size {} bytes, name: {}", file.size, file.originalFilename)
val request = PutObjectRequest(awsProperties.cdnS3Bucket, imageName, file.inputStream, metadata)
awsSdk.putObject(request)

This works when I run the web server locally. imageName is just a custom-built name. There is other code involving Hibernate models, but it is not relevant.
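A cheap guard that is not part of the original code: check the 8-byte PNG signature (89 50 4E 47 0D 0A 1A 0A) before handing the stream to S3, so that a mangled upload is rejected instead of stored. This is only a sketch in Java; the class and method names are hypothetical, and it assumes the leading bytes of the upload are available as a byte array.

```java
import java.util.Arrays;

class PngSignatureCheck {

    // The fixed 8-byte signature that every PNG file starts with.
    private static final byte[] PNG_SIGNATURE = {
        (byte) 0x89, 0x50, 0x4E, 0x47, 0x0D, 0x0A, 0x1A, 0x0A
    };

    // Returns true only if the content begins with the PNG signature.
    static boolean looksLikePng(byte[] content) {
        if (content == null || content.length < PNG_SIGNATURE.length) {
            return false;
        }
        byte[] head = Arrays.copyOf(content, PNG_SIGNATURE.length);
        return Arrays.equals(head, PNG_SIGNATURE);
    }

    public static void main(String[] args) {
        byte[] intact = {(byte) 0x89, 0x50, 0x4E, 0x47, 0x0D, 0x0A, 0x1A, 0x0A, 0x00};
        // Starts with EF BF BD, like the uploaded file in the hex dump.
        byte[] mangled = {(byte) 0xEF, (byte) 0xBF, (byte) 0xBD, 0x50, 0x4E, 0x47, 0x0D, 0x0A, 0x1A, 0x0A};
        System.out.println("intact:  " + looksLikePng(intact));
        System.out.println("mangled: " + looksLikePng(mangled));
    }
}
```

In the upload path this could run on the first bytes of the MultipartFile before the PutObjectRequest is built. It only detects the problem; it does not fix whatever altered the bytes.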

Update

This appears to be Https/api proxy related. When I hit the EC2 node's http url, it works fine. However, when I go through the api proxy (https://api.thedomain.com), which proxies to the EC2 node, it fails. I will continue down this path.

Kenny Cason
  • are you doing any gzip compression on the files? – BarathVutukuri Jun 28 '19 at 05:30
  • Can you check [this](https://stackoverflow.com/questions/37601388/size-of-the-file-is-changing-after-upload-in-spring) once. Logging filter is altering the request, which is resulting in different file size. – BarathVutukuri Jun 28 '19 at 05:39
  • Can you provide for `PropertyBLL.addPhotos()` method as well? – Phenomenal One Jun 28 '19 at 05:39
  • @MaruthiAdithya Added. I also during my testing performed a Files.copy(from, to) to perform a sanity test on the binary files. (The results are in comments about the hex dump) – Kenny Cason Jun 28 '19 at 06:32
  • @BarathVutukuri I am not doing anything specifically in regards to compression that I know of. Also `file.size` when running locally matches the local file size. When I upload to the production server however file.size reports significantly larger. (I'll check that Logging filter link you posted as well, I did find that post earlier in my Google search) – Kenny Cason Jun 28 '19 at 06:34
  • @BarathVutukuri I do not see any logging filters on my end. Also there is no difference in my prod vs local configuration. It's even using the same application.yml config. – Kenny Cason Jun 28 '19 at 06:44
  • I also just updated various libraries/dependencies, problem still persists. – Kenny Cason Jun 28 '19 at 07:10
  • As I see the object uploaded is Image, are you able to see the contents properly after upload? This [link on serverfault](https://serverfault.com/questions/570310/amazon-s3-upload-bytes-transferred-is-larger-than-actual-file-size) says the file size is different due to metadata being added in the request. Correct me, If I am wrong. – BarathVutukuri Jun 28 '19 at 07:33
  • @KennyCason While testing the scenario locally, are you uploading the file to S3? or just to a local path? – BarathVutukuri Jun 28 '19 at 07:42
  • @KennyCason Can you check the md5sum for the file once it is uploaded? I think you can calculate md5 while putting the object in S3 using boto and then compare both of them. – BarathVutukuri Jun 28 '19 at 07:47
  • @BarathVutukuri I am doing the full S3 upload from localhost as well. I am programmatically doing nothing different between the two environments. If I upload via localhost the image file in S3 is fine. If I immediately copy the MultipartFile bytes to a local file, it's also readable. It's ONLY when I make the POST call to upload the file to the web server installed on my EC2 instance that the file data is strange. On the prod server – Kenny Cason Jun 28 '19 at 07:53
  • EVEN IF I directly take the MultipartFile byte data and write it directly to disk, the file data is corrupt (and the byte size does not match, per my example with the hex dump). I have taken all files and compared them and they do not match. That serverfault post is interesting, but I'm very surprised the byte difference would be so severe. Also, I'm not sure why it doesn't happen on localhost. When uploading from localhost the S3 file size matches the local file size. – Kenny Cason Jun 28 '19 at 07:57
  • @BarathVutukuri I discovered something interesting... Updated the question. It's likely an AWS configuration issue. Which makes me feel a lot better. – Kenny Cason Jun 28 '19 at 08:28
  • @KennyCason It looks like multipart file upload has issues with AWS API Gateway. There is a thread on the AWS developer forums that covers your issue completely. See [this](https://forums.aws.amazon.com/thread.jspa?threadID=252037) and [this](https://stackoverflow.com/questions/41756190/api-gateway-post-multipart-form-data). I found a lot of links about similar issues, and they all focus on changing the binary media type to use `multipart/form-data`. – BarathVutukuri Jun 28 '19 at 09:06
  • @BarathVutukuri I missed your latest comment sadly. I discovered the same thread. After a bit more tinkering I was able to resolve. Thanks for the help! :) – Kenny Cason Jun 28 '19 at 20:21
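The checksum comparison suggested in the comments can be sketched as follows. This version uses Java's MessageDigest rather than boto, and the class and method names are hypothetical. It assumes both the original file and the received upload are available as byte arrays.

```java
import java.security.MessageDigest;
import java.security.NoSuchAlgorithmException;

class Md5Compare {

    // Compute the MD5 digest of the data and return it as lower-case hex.
    static String md5Hex(byte[] data) {
        try {
            MessageDigest md = MessageDigest.getInstance("MD5");
            byte[] digest = md.digest(data);
            StringBuilder sb = new StringBuilder();
            for (byte b : digest) {
                sb.append(String.format("%02x", b));
            }
            return sb.toString();
        } catch (NoSuchAlgorithmException e) {
            // MD5 is a required algorithm on every Java platform.
            throw new IllegalStateException("MD5 not available", e);
        }
    }

    // Two byte arrays are treated as the same content if their digests match.
    static boolean sameContent(byte[] a, byte[] b) {
        return md5Hex(a).equals(md5Hex(b));
    }

    public static void main(String[] args) {
        byte[] sent = {(byte) 0x89, 0x50, 0x4E, 0x47};
        byte[] received = {(byte) 0xEF, (byte) 0xBF, (byte) 0xBD, 0x50, 0x4E, 0x47};
        System.out.println("sent:     " + md5Hex(sent));
        System.out.println("received: " + md5Hex(received));
        System.out.println("match:    " + sameContent(sent, received));
    }
}
```

Logging the digest of the received bytes on the server and comparing it with `md5sum` of the local file shows whether the bytes changed in transit, independently of any size reporting.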

3 Answers


After more debugging I discovered that when I POST to the EC2 instance directly, everything works as expected. Our primary public API URL proxies requests through Amazon's API Gateway service. For some reason, this service converts the data to Base64 instead of just passing the raw binary data through.

I have found documentation to update the API Gateway to passthrough binary data: here.

I am using the Content-Type value of multipart/form-data. Do not forget to also add it in your API Settings where you enable binary support.

I did not have to edit the header options, and I used the default "Method Request Passthrough" template.
[Image: Integration Example]

And finally, don't forget to deploy your api changes...

It's now working as expected.

Kenny Cason

Sorry, but many of the comments make no sense. file.size will return the size of the uploaded file in bytes, NOT the size of the request (which, yes, could be enlarged with additional information by different filters). Spring can't just magically double the size of a PNG file (in your case adding roughly another 540 KB on top of whatever you've sent). While I'd like to trust that you know what you're doing and that the numbers you are giving us are indeed correct, to me, all evidence points to human error... please double-, triple-, quadruple-check that you're indeed uploading the same file in all scenarios.

How did you get to 649305 bytes in the first place? Who gave you that number? Was it your code, or did you actually look at the file on disk and see how big it was? The only way the compression discussion makes any sense in this context is if 649305 bytes is the already compressed size of the file when running locally (its actual size on disk being 1189763 bytes), and compression is not turned on when deployed to AWS for some reason, so you receive the full uncompressed file (we don't even know how you are deploying it... is it really the same as locally? Are you running a standalone .jar in both cases? Are you perhaps deploying a .war to AWS instead? Are you really running the app in the same container and container version in both cases, or are you perhaps running Tomcat locally and Jetty on AWS? etc. etc. etc.). Are you sure your Postman request is not messed up and you're not sending something else by accident (or more than you think)?

EDIT:

<?xml version="1.0" encoding="UTF-8"?>
<project xmlns="http://maven.apache.org/POM/4.0.0"
         xmlns:xsi="http://www.w3.org/2001/XMLSchema-instance"
         xsi:schemaLocation="http://maven.apache.org/POM/4.0.0 http://maven.apache.org/xsd/maven-4.0.0.xsd">

    <modelVersion>4.0.0</modelVersion>

    <groupId>com.sandbox</groupId>
    <artifactId>spring-boot-file-upload</artifactId>
    <version>1.0-SNAPSHOT</version>

    <parent>
        <groupId>org.springframework.boot</groupId>
        <artifactId>spring-boot-starter-parent</artifactId>
        <version>2.1.6.RELEASE</version>
    </parent>

    <dependencies>
        <dependency>
            <groupId>org.springframework.boot</groupId>
            <artifactId>spring-boot-starter-web</artifactId>
        </dependency>
    </dependencies>

    <build>
        <plugins>
            <plugin>
                <groupId>org.springframework.boot</groupId>
                <artifactId>spring-boot-maven-plugin</artifactId>
            </plugin>
        </plugins>
    </build>

</project>

package com.sandbox;

import static org.springframework.http.ResponseEntity.ok;

import org.springframework.boot.SpringApplication;
import org.springframework.boot.autoconfigure.SpringBootApplication;
import org.springframework.http.ResponseEntity;
import org.springframework.web.bind.annotation.PostMapping;
import org.springframework.web.bind.annotation.RequestParam;
import org.springframework.web.bind.annotation.RestController;
import org.springframework.web.multipart.MultipartFile;

@SpringBootApplication
public class Application {

    public static void main(final String[] arguments) {
        SpringApplication.run(Application.class,
                              arguments);
    }

    @RestController
    class ImageRestController {

        @PostMapping(path = "/images")
        ResponseEntity<String> upload(@RequestParam(name = "file") final MultipartFile image) {
            return ok("{\"response\": \"Uploaded image having a size of '" + image.getSize() + "' byte(s).\"}");
        }
    }
}

The example is in Java because it was just faster to put together (the environment is a simple Java environment having the standalone .jar deployed - no extra configs or anything, except for the server port being on 5000). Either way, you can try it out yourself by sending POST requests to http://test123456.us-east-1.elasticbeanstalk.com/images

This is my Postman request and the response using the image you've provided:

[Screenshots: the Postman request and its response]

Everything seems to be looking fine on my AWS EB instance and all the numbers add up as expected. If you are saying your setup is as simple as it sounds then I'm unfortunately just as puzzled as you are. I can only assume that there's more to what you have shared so far (however, I doubt the issue is related to Spring Boot... then it is more likely that it has to do with your AWS configs/setup).

Khathuluu
  • > First, I'm not sure what part of the comments "make no sense", I re-read them and they seem pretty clear to me. > "file.size will return the size of the uploaded file in bytes, NOT the size of the request". I know, this is what I'm saying. This is Exactly why I'm so surprised. The file.size should be the same for both cases given the file being uploaded is the same. I can't figure out what is causing the discrepancy. The only variable that has changed is the web server running local vs on ec2 instance. Spring, or something in the flow, IS magically doubling the size. Hence, my insanity. – Kenny Cason Jun 28 '19 at 08:04
  • I got the file sizes by looking at them directly on disk. > 634K Jun 20 17:05 fayetteville_property.png, when I upload on localhost running webserver, the file.size (bytes) match perfectly. When I upload with the EXACT same postman request to prod instance (running same code/config), it's practically doubled at 1.1MB – Kenny Cason Jun 28 '19 at 08:07
  • The jar being run is a standalone jar, and both environments share the same application.yml. I do have an application-dev.yml, but the only difference is the database configs. I've also just used the same application.yml locally. They are even packaged in the jar, and not externally configured (new project), so I know I'm not screwing up the environment. – Kenny Cason Jun 28 '19 at 08:08
  • I'm not deploying a war. It's a fat jar. BUT, I am running locally in IntelliJ. That is one other difference... let me confirm it's not a problem with how the jar is packaged. – Kenny Cason Jun 28 '19 at 08:09
  • I ran the same jar i run on the EC2 server locally. The request works as expected and uploads to s3. Here's the url of the final uploaded file https://cdn.joinarrived.com/properties/ecfd23c0cbbd468bb08069fe641fa66e.png – Kenny Cason Jun 28 '19 at 08:13
  • Here's an example of a "bad" file uploaded on the prod server. Again, the MultipartFile is written straight to S3: https://cdn.joinarrived.com/properties/7bc7197de0d5431d82cca30129880f49.png I also did an experiment where I wrote the MultipartFile straight to a local file (/tmp/test.png) on the prod EC2 server, and it was also bad and ~2x the size. – Kenny Cason Jun 28 '19 at 08:16
  • I'm very open to ANY idea that could be causing this at this point. I've handled files in this exact same manner many many times in my career, but this issue is really stumping me. If I can't find something obvious, I'm going to have to dig in deeper with a remote debugger. – Kenny Cason Jun 28 '19 at 08:18
  • I think it's Https related. I'll update my Question with a new finding. #progress – Kenny Cason Jun 28 '19 at 08:25
  • I threw together a fast example myself and it seems to be working fine. I don't have any S3 related code, just a simple REST endpoint that gets a `MultipartFile` and reads and returns its size. – Khathuluu Jun 28 '19 at 08:38

CloudFormation Template snippet for achieving Kenny Cason's solution:

  MyApi:
    Type: AWS::Serverless::Api
    Properties:
      BinaryMediaTypes:
        - "multipart/form-data"