
I have created a React boilerplate configured with react-router that has 2 pages (home and about).
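For reference, here is a minimal sketch of what such a setup typically looks like (react-router v4, which was current at the time; the Home and About components are assumed, not copied from the actual project):

```jsx
// Two client-side routes: "/" and "/about". The key point is that
// /about exists only in the router; there is no /about object in S3.
import React from 'react';
import { BrowserRouter, Route } from 'react-router-dom';

const Home = () => <h1>Home</h1>;
const About = () => <h1>About</h1>;

const App = () => (
  <BrowserRouter>
    <div>
      <Route exact path="/" component={Home} />
      <Route path="/about" component={About} />
    </div>
  </BrowserRouter>
);

export default App;
```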

I have hosted this simple application in an Amazon S3 bucket with static website hosting enabled.

The issue I have is that any URL such as /about returns a 404 error to PageSpeed Insights. Example: https://developers.google.com/speed/pagespeed/insights/?url=http%3A%2F%2Fwww.workzey.com%2Fabout

In my S3 bucket I have set up both the index and error documents as index.html, as it should be so that react-router can do its thing.
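For anyone reproducing the setup, this is roughly what that configuration looks like expressed in the AWS CDK (a sketch; the construct names are illustrative, and the bucket must additionally be made publicly readable via a bucket policy):

```ts
import * as cdk from 'aws-cdk-lib';
import * as s3 from 'aws-cdk-lib/aws-s3';
import { Construct } from 'constructs';

export class SiteStack extends cdk.Stack {
  constructor(scope: Construct, id: string) {
    super(scope, id);

    // Static website hosting with both documents pointing at index.html:
    // the SPA loads on every path, but S3 still answers unknown paths
    // such as /about with a 404 status code, which is what crawlers see.
    new s3.Bucket(this, 'SiteBucket', {
      websiteIndexDocument: 'index.html',
      websiteErrorDocument: 'index.html',
    });
  }
}
```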

This is an issue for me, as I need search engines to be able to crawl my site. I have run PageSpeed Insights on internal URLs from other React sites, and they do not seem to get a 404.

This is the simplest version of a React app hosted on S3, so why is this happening when I run PageSpeed Insights on internal URLs configured with react-router? A resolution would be greatly appreciated!

Notorious
  • I have just seen that when I reload the page, a 404 error flashes in the console and then disappears, which shows why crawlers like PageSpeed Insights and GTmetrix pick up this 404 error. I gather this is because the only document found is index.html. But then how on earth do I avoid responding with a 404 error when reloading an inner page? – Notorious Feb 15 '18 at 13:20

2 Answers

8

OK, so after some time trying to figure this out, the answer to my question is here: S3 Static Website Hosting Route All Paths to Index.html

Setting the error document to index.html in S3 is the wrong way of doing things: the app still loads, but every missing path is served with a 404 status code, so search engines will not index the site.

The correct way to avoid this is to set up a custom error response in CloudFront that serves /index.html with a 200 response code whenever the origin returns a 404.
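In the CloudFront console this is the "Custom Error Responses" setting on the distribution. As a sketch, the same thing expressed in the AWS CDK might look like this (construct names are illustrative; depending on bucket permissions, S3 can return 403 rather than 404 for missing keys, so both are mapped):

```ts
import { Stack, Duration } from 'aws-cdk-lib';
import * as s3 from 'aws-cdk-lib/aws-s3';
import * as cloudfront from 'aws-cdk-lib/aws-cloudfront';
import * as origins from 'aws-cdk-lib/aws-cloudfront-origins';
import { Construct } from 'constructs';

export class CdnStack extends Stack {
  constructor(scope: Construct, id: string) {
    super(scope, id);

    const siteBucket = new s3.Bucket(this, 'SiteBucket');

    new cloudfront.Distribution(this, 'SiteDistribution', {
      defaultBehavior: { origin: new origins.S3Origin(siteBucket) },
      defaultRootObject: 'index.html',
      errorResponses: [
        // Serve the SPA shell with a 200 status whenever the origin
        // reports a missing object, so crawlers never see an error page.
        { httpStatus: 404, responseHttpStatus: 200, responsePagePath: '/index.html', ttl: Duration.seconds(0) },
        { httpStatus: 403, responseHttpStatus: 200, responsePagePath: '/index.html', ttl: Duration.seconds(0) },
      ],
    });
  }
}
```

The ttl value here corresponds to the "Error Caching Minimum TTL" setting discussed in the comments below.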

Notorious
  • Hi, what do you choose for this setting - Error Caching Minimum TTL (seconds)? (The minimum amount of time (in seconds) that you want CloudFront to cache an error response before forwarding another request to your origin. The default time is 300 seconds.) Does it somehow impact search engines? – SAndriy May 30 '20 at 11:18
  • @SAndriy I set the TTL to one month (2629746). Since the redirect is long-term, I see no reason to refresh it every 10 or 300 seconds. Cached results are supposed to be faster, so this should do no harm with respect to SEO. Just my best guess; feedback welcome. – wenzf Nov 20 '20 at 08:41
0

You need to create a sitemap.xml file in the root directory of your S3 bucket (i.e. www.workzey.com/sitemap.xml), which tells search engines which pages on your website should be crawled.
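For example, a minimal sitemap covering the two pages from the question would look like this (standard sitemaps.org format):

```xml
<?xml version="1.0" encoding="UTF-8"?>
<urlset xmlns="http://www.sitemaps.org/schemas/sitemap/0.9">
  <url><loc>http://www.workzey.com/</loc></url>
  <url><loc>http://www.workzey.com/about</loc></url>
</urlset>
```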

https://support.google.com/webmasters/answer/183668?hl=e

  • I have created a sitemap and added it to Webmaster Tools. This does not resolve the issue of PageSpeed Insights getting a 404 on internal pages. – Notorious Feb 15 '18 at 12:58