I have an Amazon S3 bucket that contains one file, and I am looking for a way to download that file by its file extension. Currently, I have code that lists multiple files in the bucket under a key, filters them by extension, and downloads them. Something like this:

s3.listObjects(operaBucketName, keyName)
    .getObjectSummaries()
    .forEach(s -> keys.add(s.getKey()));
List<String> filteredKeys =
    keys.stream().filter(s -> s.contains(extension)).collect(Collectors.toList());
// Add to file list
List<File> files = new ArrayList<>();
for (String key : filteredKeys) {
  File file = new File(FilenameUtils.getFullPath(localDirectory) + FilenameUtils.getName(key));
  downloadFileFromS3(operaBucketName, key, file);
  files.add(file);
}

I want to do the same thing but target a single file. I tried s3.getObject(operaBucketName, keyName);, but com.amazonaws.services.s3.model.S3Object doesn't have a way to check the file extension. I think I can write the S3Object contents to a file using the approach from "How to write an S3 object to a file?". Also, in case there are multiple files in the given S3 folder (key), will getObject throw an exception?

SDK version used:

<dependency>
  <groupId>com.amazonaws</groupId>
  <artifactId>aws-java-sdk-s3</artifactId>
  <version>1.11.792</version>
</dependency>

Thanks!

Abhinash Jha

1 Answer

AWS provides a BOM (bill of materials), and the SDK documentation shows how to include it, as reproduced below.

Importing the BOM

<dependencyManagement>
  <dependencies>
    <dependency>
      <groupId>com.amazonaws</groupId>
      <artifactId>aws-java-sdk-bom</artifactId>
      <version>1.12.411</version>
      <type>pom</type>
      <scope>import</scope>
    </dependency>
  </dependencies>
</dependencyManagement>

Using the SDK Maven modules

<dependencies>
  <dependency>
    <groupId>com.amazonaws</groupId>
    <artifactId>aws-java-sdk-ec2</artifactId>
  </dependency>
  <dependency>
    <groupId>com.amazonaws</groupId>
    <artifactId>aws-java-sdk-s3</artifactId>
  </dependency>
  <dependency>
    <groupId>com.amazonaws</groupId>
    <artifactId>aws-java-sdk-dynamodb</artifactId>
  </dependency>
</dependencies>

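The AWS SDK for Java documentation also shows how to create a project template. Note that this archetype uses the SDK 2.x coordinates (software.amazon.awssdk), not the 1.x com.amazonaws artifacts shown above.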

mvn -B archetype:generate \
 -DarchetypeGroupId=software.amazon.awssdk \
 -DarchetypeArtifactId=archetype-lambda -Dservice=s3 -Dregion=US_WEST_2 \
 -DgroupId=com.example.myapp \
 -DartifactId=myapp

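How to get a single file from S3 (the linked documentation has a fuller example). This snippet uses the SDK 2.x builder API; with the 1.x SDK from the question, the s3.getObject(operaBucketName, keyName) call you already tried does the same job.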

GetObjectRequest getObjectRequest = GetObjectRequest.builder()
    .bucket(bucketName)
    .key(key)
    .build();

s3.getObject(getObjectRequest);

One important note about file extension

An Amazon S3 bucket has no directory hierarchy such as you would find in a typical computer file system. You can, however, create a logical hierarchy by using object key names that imply a folder structure. For example, instead of naming an object sample.jpg, you can name it photos/2006/February/sample.jpg.

To get an object from such a logical hierarchy, specify the full key name for the object in the GET operation. For a virtual hosted-style request example, if you have the object photos/2006/February/sample.jpg, specify the resource as /photos/2006/February/sample.jpg. For a path-style request example, if you have the object photos/2006/February/sample.jpg in the bucket named examplebucket, specify the resource as /examplebucket/photos/2006/February/sample.jpg. For more information about request types, see HTTP Host Header Bucket Specification.
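Because a key is just a string, the "folder", file name, and extension can be recovered with plain string operations. The following is a minimal sketch, not part of the SDK: the class name S3KeyParts and its methods are hypothetical helpers, and treating the text after the last dot as the extension is an assumption about how the keys are named.

```java
import java.util.Locale;
import java.util.Optional;

/** Hypothetical helper: splits an S3 object key into prefix, file name, and extension. */
final class S3KeyParts {
    private S3KeyParts() {}

    /** Returns everything after the last '/', or the whole key if it contains no '/'. */
    static String fileName(String key) {
        return key.substring(key.lastIndexOf('/') + 1);
    }

    /** Returns the logical "folder" up to and including the last '/', or "" if there is none. */
    static String prefix(String key) {
        return key.substring(0, key.lastIndexOf('/') + 1);
    }

    /** Returns the lower-cased extension without the dot, or empty if the name has none. */
    static Optional<String> extension(String key) {
        String name = fileName(key);
        int dot = name.lastIndexOf('.');
        // A leading dot (".hidden") or a trailing dot ("name.") is not treated as an extension.
        if (dot <= 0 || dot == name.length() - 1) {
            return Optional.empty();
        }
        return Optional.of(name.substring(dot + 1).toLowerCase(Locale.ROOT));
    }
}
```

For the key photos/2006/February/sample.jpg from the quoted documentation, prefix gives photos/2006/February/, fileName gives sample.jpg, and extension gives jpg.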

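There are no folders or directories in S3, so getObject will not throw any "multiple files" exception: a key identifies exactly one object. If keyName is only a prefix and no object is stored under exactly that key, the request typically fails with a "no such key" error instead.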

AWS suggests including the file extension in the object name; that way you can request the object by its full key and there is no need to check the extension (see the javadoc).
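If only the prefix and the extension are known, one option is to list the keys as in the question and then insist on exactly one match before calling getObject. This is a sketch under assumptions: SingleKeySelector and selectSingleKey are hypothetical names, the keys are passed in as a plain list so the example runs without the SDK, and endsWith is swapped in for the question's contains so that a key such as data.csv.bak does not match csv.

```java
import java.util.List;
import java.util.Locale;
import java.util.stream.Collectors;

/** Hypothetical helper: picks the one key that ends with a given extension. */
final class SingleKeySelector {
    private SingleKeySelector() {}

    /**
     * Returns the only key ending with the extension (case-insensitive).
     * Throws IllegalStateException when zero or several keys match.
     */
    static String selectSingleKey(List<String> keys, String extension) {
        // Accept both "csv" and ".csv" as input.
        String suffix = extension.startsWith(".") ? extension : "." + extension;
        String wanted = suffix.toLowerCase(Locale.ROOT);

        List<String> matches = keys.stream()
                .filter(k -> k.toLowerCase(Locale.ROOT).endsWith(wanted))
                .collect(Collectors.toList());

        if (matches.size() != 1) {
            throw new IllegalStateException(
                    "Expected exactly one key ending with " + wanted
                            + " but found " + matches.size());
        }
        return matches.get(0);
    }
}
```

In the question's setup, the list would be the keys collected from s3.listObjects(operaBucketName, keyName).getObjectSummaries(), and the returned key would then be passed to s3.getObject(operaBucketName, key).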

To determine the type of an object or file, you can check its content type: get the Content-Type from the object metadata and use Apache Tika to find the matching extension. This is slower and uses more resources, so I suggest keeping the extension in the object name.
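As a lighter-weight illustration of the content-type route, the sketch below maps a Content-Type value to an extension with a small hand-written table. It is a stand-in for Apache Tika, not how Tika works: the class ContentTypeExtensions and its four-entry table are hypothetical and would need extending for real data. With the 1.x SDK, the input string would come from S3Object.getObjectMetadata().getContentType(), and it is only as reliable as whatever the uploader set.

```java
import java.util.HashMap;
import java.util.Locale;
import java.util.Map;
import java.util.Optional;

/** Hypothetical helper: maps a few Content-Type values to file extensions. */
final class ContentTypeExtensions {
    // Illustrative table only; a library such as Apache Tika covers far more types.
    private static final Map<String, String> KNOWN = new HashMap<>();
    static {
        KNOWN.put("text/csv", "csv");
        KNOWN.put("application/json", "json");
        KNOWN.put("application/pdf", "pdf");
        KNOWN.put("image/jpeg", "jpg");
    }

    private ContentTypeExtensions() {}

    /** Returns the extension for a Content-Type header value, or empty if unknown. */
    static Optional<String> extensionFor(String contentType) {
        if (contentType == null) {
            return Optional.empty();
        }
        // Drop parameters such as "; charset=utf-8" and normalise case.
        String mediaType = contentType.split(";", 2)[0].trim().toLowerCase(Locale.ROOT);
        return Optional.ofNullable(KNOWN.get(mediaType));
    }
}
```

An unknown or generic type yields an empty result, which is one more reason to prefer an extension in the key.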

ozkanpakdil