Previously, I stored client files locally on a server running PHP on Apache. Users would upload files, and each one was given a randomized name ending in a pdf / jpg file extension. The original file name was kept in a database alongside the randomized name, so the two could be linked back together when the user wanted the file.

I wanted to transition to storing files on a private bucket in S3. The first thing I'm seeing is this article which says to give Object keys a unique name, but all the examples I'm seeing just put the user's file name in there.

This is an issue because if one user stores test.pdf and another, entirely different user uploads test.pdf, the two keys collide, and by default the second upload overwrites the first. Another issue: if I use random file names as I have been doing, and the user fetches the file through a pre-signed request, they will get a file named with some random string rather than the name they uploaded it under.

What should I be doing to separate out each user's files while keeping the original file name on S3?

Alex

2 Answers

Personally, I do exactly what you describe in your first example. The S3 file gets a UUID generated for the file name in the bucket and all the metadata including the original file name goes in the database.

I don't even bother giving the S3 file an extension.
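As a rough sketch of that approach (not my production code): generate a random version-4 UUID for the object key and keep the original name in the database row. The `s3_filename` and `filename` columns match the model properties used further down; `user_id`, `mime_type`, and both function names are made up for illustration.

```php
<?php
declare(strict_types=1);

/**
 * Generate a random version-4 UUID to use as the S3 object key.
 */
function generateObjectKey(): string
{
    $bytes = random_bytes(16);
    // Set the version (4) and variant (RFC 4122) bits.
    $bytes[6] = chr((ord($bytes[6]) & 0x0f) | 0x40);
    $bytes[8] = chr((ord($bytes[8]) & 0x3f) | 0x80);

    // 32 hex characters, split into 8 groups of 4 and joined as 8-4-4-4-12.
    return vsprintf('%s%s-%s-%s-%s-%s%s%s', str_split(bin2hex($bytes), 4));
}

/**
 * Build the database row that links the opaque key back to the upload.
 * The column names user_id and mime_type are hypothetical.
 */
function buildAttachmentRecord(int $userId, string $originalName, string $mimeType): array
{
    return [
        'user_id'     => $userId,
        's3_filename' => generateObjectKey(),      // what the object is called in the bucket
        'filename'    => basename($originalName),  // what the user called it
        'mime_type'   => $mimeType,
    ];
}

$record = buildAttachmentRecord(42, 'test.pdf', 'application/pdf');
```

Two users uploading `test.pdf` then end up with two different keys, and the name they typed only ever lives in the database.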

To expand on my comments and the question about how to read the files back:

I'm using Laravel with Intervention\Image.

My GET endpoint for the attachment controller returns this function in my model:

/**
 * Fetches an image from S3 and returns it as an HTTP response
 * @param boolean $thumb whether to return the thumbnail instead of the full image
 * @return null|\Illuminate\Http\Response
 */
public function output($thumb = false)
{
    if ($this->s3_filename === null) {
        return null;
    }
    // Grab the image from S3
    $this->image = $this->s3->get('/' . $this->getPath() . '/' . ($thumb ? 'thumb/' : '') . $this->s3_filename);
    if ($this->image === null) {
        return null;
    }
    return Image::make($this->image)->response()->withHeaders([
        'content-disposition' => 'inline; filename="' . ($thumb ? 'thumb_' : '') . $this->filename . '"',
    ]);
}
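One caveat with the method above: the stored file name is dropped straight into the header. If names can contain quotes, line breaks, or non-ASCII characters, something along these lines may be safer. This is only a sketch; `contentDisposition` is a hypothetical helper, not part of Laravel or Intervention.

```php
<?php
declare(strict_types=1);

/**
 * Build a Content-Disposition value for a user-supplied file name.
 * $disposition is expected to be "inline" or "attachment".
 */
function contentDisposition(string $filename, string $disposition = 'inline'): string
{
    // Drop control characters (including CR/LF) so the name cannot inject headers.
    $clean = preg_replace('/[\x00-\x1F\x7F]/', '', $filename);

    // ASCII fallback for the quoted form: each non-ASCII byte, double quote,
    // or backslash is replaced with an underscore.
    $fallback = preg_replace('/[^\x20-\x7E]|["\\\\]/', '_', $clean);

    // The filename* parameter (RFC 5987 encoding) carries the exact name,
    // percent-encoded, for browsers that support it.
    return sprintf(
        "%s; filename=\"%s\"; filename*=UTF-8''%s",
        $disposition,
        $fallback,
        rawurlencode($clean)
    );
}
```

The result can be passed as the `content-disposition` value in `withHeaders()` in place of the hand-built string.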
Skrrp
  • I guess my question is, how do you serve this back to the user? Doesn't something like a pre-signed request just give you a URL back to the object in question? Do you transform it back into the original file name, or just serve it to them as a random UUID? – Alex Jan 27 '18 at 00:13
  • See my answer below, if you give them a url w/ the path to the file, the filename remains the same it's just the path to their file is a bit longer :) – ACVM Jan 27 '18 at 00:16
  • I have an endpoint in my application that is used as the img src= source. That bit is `myApp/attachments/{UUID}`. This hides the fact I'm using S3 completely from my users. My controller fetches the file from S3 and serves it back as the body with a `content-disposition` header for the file name in case the user wants to save the file. This does mean that my server has to process the file in memory. Not good for video, fine for most web images. – Skrrp Jan 27 '18 at 01:08
  • The other advantage of the application getting the file from S3 is that my bucket and its files are private and I use an IAM user account to pull the files. No public snooping on my bucket and I can implement per-file access control. – Skrrp Jan 27 '18 at 01:12
  • @Alex in case you missed it, note that `'content-disposition' => 'inline; filename="..."'` causes the browser to save the file with this name, not the name under which it's stored in S3. This can also be done with [`response-content-disposition` in a signed URL](https://docs.aws.amazon.com/AmazonS3/latest/API/RESTObjectGET.html) or by simply setting the `Content-Disposition` request header when saving the object to S3, which replays it back to the browser when the object is downloaded. – Michael - sqlbot Jan 27 '18 at 02:49

How about considering using buckets/folders?

Buckets need to have unique names (across ALL of AWS... not sure if that has changed). But the folders within them are fine.

But otherwise:

myBucket/
    user1/
        test.pdf
    user2/
        test.pdf

There's not an additional cost to having directories within buckets AFAIK so you should be good.

You can also use a UUID instead of user1, and have a table somewhere that maps usernames to UUID to generate the bucket/folder path.
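Keep in mind that a "folder" in S3 is really just a prefix on the object key, so nothing has to be created up front. A minimal sketch of building keys that way, with hypothetical function names and placeholder IDs:

```php
<?php
declare(strict_types=1);

/**
 * Key prefix under which all of one user's objects live.
 */
function userPrefix(string $userUuid): string
{
    return $userUuid . '/';
}

/**
 * Full object key: "<user uuid>/<object name>".
 */
function objectKey(string $userUuid, string $objectName): string
{
    return userPrefix($userUuid) . $objectName;
}

/**
 * True if $key sits under the given user's prefix.
 */
function keyBelongsTo(string $key, string $userUuid): bool
{
    $prefix = userPrefix($userUuid);

    return strncmp($key, $prefix, strlen($prefix)) === 0;
}

// Two users can both store test.pdf without colliding.
$first  = objectKey('user1', 'test.pdf');
$second = objectKey('user2', 'test.pdf');
```

The trailing slash in the prefix matters: without it, `user1` would also match keys that belong to `user10`. The same prefix can be handed to a list request to get only that user's files.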

ACVM
  • I was hoping I wouldn't have to get crazy with the paths, as there will realistically be thousands of sub-sub directories within the application, but I just might have to D: Thanks for the response! – Alex Jan 27 '18 at 00:18
  • How about updating your question to specify all your constraints? Then maybe we can help you come up with a better solution. I'm sure there are ways to get a web browser to download and rename a file :) – ACVM Jan 27 '18 at 00:20
  • Maybe this question might help you? https://stackoverflow.com/questions/10049259/change-name-of-download-in-javascript – ACVM Jan 27 '18 at 00:21
  • Completely forgot about content-disposition. Thanks for that! – Alex Jan 27 '18 at 00:23
  • Glad I could help! Feel free to upvote/accept the answer :) – ACVM Jan 27 '18 at 00:36
  • @Alex what's your concern about paths? You could simply use a unique identifier for the user as the top level of your key. You don't need to create any folders here - just upload files to uuid/filename. That will also allow you to list files owned by a given user, by doing a prefix list. And, you should ideally not use the native filename as part of your key. If you do that, you'll potentially have to escape certain characters. – jarmod Jan 27 '18 at 01:15