11

I have a fairly specific problem. We maintain an existing system that uses subdomains to route people to specific apps.

On a traditional server it works as follows: we have a wildcard subdomain, *.domain.com, that routes to nginx, which serves up a folder.

So: myapp.domain.com > nginx > serves up the myapp folder > the myapp folder contains a static site

I'm trying to migrate this to AWS and need to do something similar there. I toyed with the idea of putting each static app into an S3 bucket and pointing the wildcard domain at it in Route 53, but I'm unsure how S3 would know which folder to serve, as that functionality isn't part of Route 53.

Anyone have any suggestions?

Thanks for all your help

Grant

2 Answers

25

CloudFront + Lambda@Edge + S3 can do this "serverless."

Lambda@Edge is a CloudFront enhancement that allows attributes of requests and responses to be represented and manipulated as simple JavaScript objects. Triggers can be provisioned to fire during request processing, either before the cache is checked ("viewer request" trigger) or before the request proceeds to the back-end ("origin server", an S3 web site hosting endpoint, in this case) following a cache miss ("origin request" trigger)... or during response processing, after the response is received from the origin but before it is considered for storing in the CloudFront cache ("origin response" trigger), or when finalizing the response to the browser ("viewer response" trigger). Response triggers can also examine the original request object.

The following snippet is something I originally posted at the AWS Forums. It is an Origin Request trigger which compares the original hostname to your pattern (e.g. the domain must match *.example.com) and, if it does, a request for subdomain-here.example.com is served from a folder named for the hostname prefix, subdomain-here.

lol.example.com/cat.jpg        -> my-bucket/lol/cat.jpg
funny-pics.example.com/cat.jpg -> my-bucket/funny-pics/cat.jpg

In this way, static content from as many subdomains as you like can all be served from a single bucket.

In order to access the original incoming Host header, CloudFront needs to be configured to whitelist the Host header for forwarding to the origin, even though the net result of the Lambda function's execution will be to modify that value before the origin actually sees it.

The code is actually very simple -- most of the following is explanatory comments.

'use strict';

// if the end of incoming Host header matches this string, 
// strip this part and prepend the remaining characters onto the request path,
// along with a new leading slash (otherwise, the request will be handled
// with an unmodified path, at the root of the bucket)

const remove_suffix = '.example.com';

// provide the correct origin hostname here so that we send the correct 
// Host header to the S3 website endpoint

const origin_hostname = 'example-bucket.s3-website.us-east-2.amazonaws.com'; // see comments, below

exports.handler = (event, context, callback) => {
    const request = event.Records[0].cf.request;
    const headers = request.headers;
    const host_header = headers.host[0].value;

    if(host_header.endsWith(remove_suffix))
    {
        // prepend '/' + the subdomain onto the existing request path ("uri")
        request.uri = '/' + host_header.substring(0,host_header.length - remove_suffix.length) + request.uri;
    }

    // fix the host header so that S3 understands the request
    headers.host[0].value = origin_hostname;

    // return control to CloudFront with the modified request
    return callback(null,request);
};
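If you want to check the path mapping locally before deploying, the rewrite step can be pulled out into a pure function. This is only a sketch: `rewriteUri` is a hypothetical helper name that is not part of the snippet above, and it is meant to mirror the same `endsWith` / `substring` logic rather than replace it.

```javascript
'use strict';

// Hypothetical helper (not in the original snippet): returns the path the
// Origin Request trigger above would send to S3 for a given Host header.
function rewriteUri(hostHeader, uri, removeSuffix) {
    if (hostHeader.endsWith(removeSuffix)) {
        // everything before the suffix is treated as the folder name
        const subdomain = hostHeader.substring(0, hostHeader.length - removeSuffix.length);
        return '/' + subdomain + uri;
    }
    // no match: the path is left unmodified, i.e. served from the bucket root
    return uri;
}

console.log(rewriteUri('lol.example.com', '/cat.jpg', '.example.com'));        // /lol/cat.jpg
console.log(rewriteUri('funny-pics.example.com', '/cat.jpg', '.example.com')); // /funny-pics/cat.jpg
console.log(rewriteUri('other.test', '/cat.jpg', '.example.com'));             // /cat.jpg
```

Running this with plain `node` should be enough to confirm that a new suffix or hostname pattern maps to the folder you expect.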

Note that index documents and redirects from S3 may also require an Origin Response trigger to normalize the Location header against the original request. This will depend on exactly which S3 website features you use. But the above is a working example that illustrates the general idea.
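As a rough illustration of what such an Origin Response trigger might look like, here is a sketch that strips the subdomain folder from a redirect's Location header. It rests on several assumptions that you would need to verify against your own setup: that S3 returns a path-only Location that includes the folder (e.g. `/lol/foo/`), that every request reaching the origin was prefixed by the Origin Request trigger above, and that the request seen by this trigger carries the already-rewritten path. `stripFolderPrefix` and `handler` are hypothetical names.

```javascript
'use strict';

// Hypothetical helper: remove a leading "/<folder>" from a redirect target,
// but only when it is a whole path segment (so "/lolcats" is not touched
// when the folder is "lol").
function stripFolderPrefix(location, folder) {
    const prefix = '/' + folder;
    if (location === prefix || location.startsWith(prefix + '/')) {
        const rest = location.substring(prefix.length);
        return rest === '' ? '/' : rest;
    }
    return location;
}

// In a Lambda deployment this function would be assigned to exports.handler.
const handler = (event, context, callback) => {
    const cf = event.Records[0].cf;
    const response = cf.response;
    const location = response.headers.location;

    // ASSUMPTION: the first path segment of the origin-facing request is the
    // subdomain folder added by the Origin Request trigger.
    const folder = cf.request.uri.split('/')[1];

    if (location && folder) {
        location[0].value = stripFolderPrefix(location[0].value, folder);
    }

    // return control to CloudFront with the (possibly) modified response
    return callback(null, response);
};
```

If S3 returns absolute URLs in your configuration, or if some hostnames bypass the rewrite, the matching logic would need to change accordingly.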

Note that const origin_hostname needs to be set to the bucket's endpoint hostname as configured in the CloudFront origin settings. In this example, the bucket is in us-east-2 with the web site hosting feature active.
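As raised in the comments below, newer versions of the event structure include the origin on the request object, so the hostname may not need to be hard-coded. A minimal sketch, assuming the event carries `request.origin.custom.domainName` (for an S3 web site endpoint, which CloudFront treats as a custom origin) or `request.origin.s3.domainName` (for an S3 REST origin); `originHostname` is a hypothetical helper name, and the fallback keeps the hard-coded value for events that lack origin information.

```javascript
'use strict';

// Hypothetical helper: pick the origin hostname from the request if the
// event provides it, otherwise use the supplied hard-coded fallback.
function originHostname(request, fallback) {
    const origin = request.origin || {};
    if (origin.custom && origin.custom.domainName) {
        return origin.custom.domainName; // web site endpoint (custom origin)
    }
    if (origin.s3 && origin.s3.domainName) {
        return origin.s3.domainName;     // S3 REST endpoint (S3 origin)
    }
    return fallback;
}
```

Inside the handler, this could replace the constant, e.g. `headers.host[0].value = originHostname(request, origin_hostname);`.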

Michael - sqlbot
  • 1
    Thank you! But this didn't work without a couple changes. 1., you need to whitelist the `Host` header in order for the original incoming request host to be passed along, e.g., `host_header` (you [posted](https://forums.aws.amazon.com/thread.jspa?messageID=798876) this here originally!)([AWS Docs](https://aws.amazon.com/premiumsupport/knowledge-center/configure-cloudfront-to-forward-headers/)). 2., the `origin_hostname` should not contain the `s3-website-us-east-1` part. Change that to `example-bucket.s3.amazonaws.com` – mfink Feb 28 '19 at 16:11
  • 2
    @mfink thank you for the feedback. I did indeed forget to include the mention of `Host` header whitelisting. For the `origin_hostname`, though, using the REST endpoint hostname for the bucket (e.g. `example-bucket.s3.amazonaws.com`) will not work if the web site hosting feature is enabled and you are using the web site endpoint as the origin domain name -- which is typically the case in this application, because that is necessary in order for automatic index documents to be rendered. – Michael - sqlbot Feb 28 '19 at 18:01
  • Thanks for clarifying the `origin_hostname`. I guess my origin settings are set to the s3 bucket name sans `s3-website-us-east-1`. My s3 is also configured as a website so this might explain why I needed to specify index.html as my `Default Root Object` when not specifying my origin as the website url. – mfink Feb 28 '19 at 18:12
  • 1
    @mfink yes. Default Root Object only handles the actual path `/`. Index documents from the web site hosting feature can implicitly render any index.html file at any path (`/foo/bar` redirects to `/foo/bar/` which implicitly renders `/foo/bar/index.html`... but may require more Lambda@Edge in this context so that the `Location` header is correct). If your bucket is configured as a web site, you may want to use the web site endpoint as the origin domain name: https://stackoverflow.com/a/34065543/1695906. – Michael - sqlbot Feb 28 '19 at 21:36
  • @Michael-sqlbot would this be possible to route to different s3 buckets rather than folders within a single bucket? – h0bb5 Apr 07 '19 at 23:57
  • @user3648969 yes, but the code would need to work a little differently. More notably, if you're using a different bucket for each domain, it might be easier just to create multiple CloudFront distributions. Switching the bucket entails a possible security risk if you are trusting information from the browser to select the bucket -- you **need** to validate against a known list. If you just allow `*.example.com` and infer the bucket name from the request, then anyone can create a bucket with a name matching the wildcard -- a bucket you aren't using -- and your code will obediently select it. – Michael - sqlbot Apr 08 '19 at 00:30
  • @Michael-sqlbot Ahh, okay thanks for this. I think maybe multiple distributions might be the right option.. I hadn't thought of that. I guess each distribution could use the same SSL cert. Currently I am deploying my s3 buckets with the aws cli and setting the policy for the bucket.. would I add to this script to setup the cloudfront distribution when I push up each new subdomain/bucket? – h0bb5 Apr 08 '19 at 00:37
  • @Michael-sqlbot I have an open SO question if you'd be willing to take a look on what i'd need to change to configure this properly: https://stackoverflow.com/questions/55564199/how-to-point-wildcard-domain-example-com-to-s3-buckets-with-cloudfront-route – h0bb5 Apr 08 '19 at 00:43
  • This really helped me out! Thank you so much! – TomLisankie Jul 28 '20 at 22:39
  • In my case I could get the `origin_hostname` from `request.origin.s3.domainName`, that way I could avoid hard-coding the origin in the lambda. – Grabofus Apr 21 '23 at 18:07
  • 1
    @Grabofus That is a good point, and is partially correct. It depends on whether you are using the S3 web site hosting endpoint, which is treated as a custom origin, rather than an S3 origin. When I originally wrote this on the old AWS forums, the [event structure](https://web.archive.org/web/20170805190419/https://docs.aws.amazon.com/AmazonCloudFront/latest/DeveloperGuide/lambda-event-structure.html) didn't contain information about the origin. That had apparently been changed by the time I posted this answer on SO but I guess I didn't think to update it, since the code worked as written. – Michael - sqlbot Apr 25 '23 at 21:13
-1
  1. Create a CloudFront distribution.

  2. Add all of the subdomains as Alternate Domain Names (CNAMEs) on the CloudFront distribution.

  3. Add the EC2 server as a custom origin.

  4. Set behaviours as per your requirements.

  5. Configure nginx virtual hosts on the server to route each hostname to its specific folder.

Varun Chandak