
I have a single-page application that I want to make crawlable, so I have generated snapshots. My application stack is Rails + Unicorn + nginx (as a reverse proxy).

AWS OpsWorks generates an nginx config from this cookbook. I SSH-ed into the instance and modified the default config to include the following lines, which redirect all requests from search engine bots to the snapshots (the bots convert a URL containing #! and send a new request with _escaped_fragment_ in the query parameters):

if ($args ~ "_escaped_fragment_=(.+)") {
  rewrite ^ /snapshots$uri$1?;
}
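
To make the intended mapping concrete, here is a minimal Ruby sketch of what the rule above is meant to do to a request. The helper name `snapshot_path` and the example URLs are hypothetical, and it only approximates nginx: it concatenates the request path and the captured fragment without any decoding or normalization.

```ruby
require "uri"

# Hypothetical helper that mirrors the intent of the nginx rule:
#   if ($args ~ "_escaped_fragment_=(.+)") { rewrite ^ /snapshots$uri$1?; }
# Returns the snapshot path, or nil when the query has no escaped fragment.
def snapshot_path(request_uri)
  uri = URI.parse(request_uri)
  match = uri.query.to_s.match(/_escaped_fragment_=(.+)/)
  return nil unless match

  # uri.path stands in for $uri, match[1] for the captured fragment.
  # The fragment is left exactly as it appears in the query string.
  "/snapshots#{uri.path}#{match[1]}"
end

puts snapshot_path("http://example.com/app?_escaped_fragment_=/about").inspect
puts snapshot_path("http://example.com/app").inspect
```

Requests without `_escaped_fragment_` fall through untouched, which is why the rule can sit next to the normal application routing.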

Everything worked well when I loaded the URL in the browser. The issue I am facing is automating the same thing with Chef. Since the code I added lives in the config file that OpsWorks generates from its default cookbook, I need a way to define an nginx server block that achieves this. So I defined the following server block:

server {
    listen 80;
    server_name example.com;

    if ($args ~ "_escaped_fragment_=(.+)") {
        set $foo $1;
        rewrite ^ /snapshots$uri$foo?;
    }
}

But nginx will never select this block, because another server block with the same server_name already exists. So, is there a way to define a server block that nginx selects based on the presence of _escaped_fragment_ in $args?

Something like the following (I know this won't work, since the server_name regex is not matched against query parameters):

server {
    listen 80;
    server_name example.com(.+)_escaped_fragment_=(.+);

    ...
}
Mudassir Ali
  • Taking a step back, do you actually need snapshots? Most of the major search engines (including Google) are rendering the content they receive from the website, in our (Google's) case with something close to a headless browser, so whatever you do for the users the search engines will also get it. – methode Jul 21 '15 at 09:51
  • That was my initial plan too, but right now I don't have the infrastructure bandwidth to do that. I wanted to use https://github.com/prerender/prerender_rails but I have only one micro instance on the free AWS plan and I don't want to add to the CPU load by running a PhantomJS instance. This is a temporary solution for my MVP. – Mudassir Ali Jul 21 '15 at 11:19

1 Answer


To do this in Chef, you need to create a custom cookbook (if you don't have one already) with a recipe that overwrites the OpsWorks-generated file with your preferred file. The cookbook needs two files, an nginx template and a recipe that overwrites the default config with the custom one:

  1. mycookbook -> templates -> default -> custom_nginx.erb
  2. mycookbook -> recipes -> customise_nginx.rb

Content of (1):

Whatever you want your nginx config file to be, for example:

server {
    listen 80;
    server_name example.com;

    if ($args ~ "_escaped_fragment_=(.+)") {
        set $foo $1;
        rewrite ^ /snapshots$uri$foo?;
    }
}

Content of (2):

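template "/etc/nginx/sites-enabled/<nginx file name>" do
  source "custom_nginx.erb"
  owner "root"  # the template resource uses `owner`; `user` is not one of its properties
  group "root"
  mode "0644"
end

# Reload nginx so the new config takes effect
service "nginx" do
  action :reload
end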

Then add mycookbook::customise_nginx to the custom setup recipe section in your layer settings.

If you don't have a custom cookbook already, a bit more setup will be needed:
https://www.digitalocean.com/community/tutorials/how-to-create-simple-chef-cookbooks-to-manage-infrastructure-on-ubuntu
http://docs.aws.amazon.com/opsworks/latest/userguide/workingcookbook-installingcustom-enable.html

Edit: If you want to keep the OpsWorks config file, you have two options. Either take the template that OpsWorks is using (I'm guessing this one: https://github.com/aws/opsworks-cookbooks/blob/release-chef-11.10/nginx/templates/default/site.erb), create a copy, and put your changes there as file 1 above. Or use Chef to modify the existing file content, for example with the FileEdit library (check the second answer to this question: http://stackoverflow.com/questions/14848110/how-i-can-change-a-file-with-chef).
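
As a rough illustration of the second option, here is a plain Ruby sketch of the edit itself. It is not the FileEdit API; the constant `SNAPSHOT_RULE` and the method `inject_snapshot_rule` are hypothetical names, and the sketch assumes the generated config contains a `server_name` line to anchor on. It only touches the first `server_name` it finds, so a config with several server blocks would need more handling.

```ruby
# Hypothetical constant: the rule from the question's server block.
SNAPSHOT_RULE = [
  'if ($args ~ "_escaped_fragment_=(.+)") {',
  '    set $foo $1;',
  '    rewrite ^ /snapshots$uri$foo?;',
  '}'
].freeze

# Inserts SNAPSHOT_RULE right after the first `server_name` directive.
# Returns the text unchanged when the rule is already present (so repeated
# runs are safe) or when no `server_name` line is found.
def inject_snapshot_rule(config_text)
  return config_text if config_text.include?("_escaped_fragment_")

  lines = config_text.lines
  index = lines.index { |line| line =~ /\A[ \t]*server_name\s/ }
  return config_text if index.nil?

  # Reuse the indentation of the server_name line for the inserted rule.
  indent = lines[index][/\A[ \t]*/]
  rule = SNAPSHOT_RULE.map { |line| "#{indent}#{line}\n" }
  lines.insert(index + 1, *rule)
  lines.join
end

config = "server {\n  listen 80;\n  server_name example.com;\n  root /srv/www;\n}\n"
puts inject_snapshot_rule(config)
```

In a recipe, logic like this could be called from a `ruby_block` resource that reads the generated file, applies the function, and writes the result back only when it changed; that wiring is left out here.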

semirami
  • I already created a custom cookbook and a Chef recipe, in fact with the same content as you posted, but done my way it creates an additional file under the **sites-enabled** folder. Since this config file is loaded after the default generated file, nginx never picks up my `server` block due to equal specificity. What you are suggesting is to overwrite the content of the OpsWorks-generated config file. But I want the content of the OpsWorks-generated file plus the `if` condition. Is there a way to achieve that? Something like a block in the OpsWorks config generation where I can plug in my code. – Mudassir Ali Jul 21 '15 at 11:47
  • In that case, you have two options: to take the template that opsworks is using, I'm guessing this one? [https://github.com/aws/opsworks-cookbooks/blob/release-chef-11.10/nginx/templates/default/site.erb](https://github.com/aws/opsworks-cookbooks/blob/release-chef-11.10/nginx/templates/default/site.erb), create a copy and put your changes there in file 1 as above. Or use chef to modify the existing file content - for example using FileEdit library (check the second answer to this [question](http://stackoverflow.com/questions/14848110/how-i-can-change-a-file-with-chef)) – semirami Jul 21 '15 at 13:56
  • This sounds reasonable, can you add your comment to the answer so that I can accept it ? – Mudassir Ali Jul 21 '15 at 13:58