
I got my SPA website (built on Node/Express/Mongo/Angular) up and running. I created a sitemap.xml and submitted it to Microsoft Bing, and from the server log I can see that they have started crawling. However, I noticed that each page URL is requested, but the API associated with that page is not. So basically it is indexing only the static skeleton of each page, not the real dynamic content.

I googled this and found people saying "google can't index dynamic content", as suggested in this article. However, I also see other people saying that a crawler behaves just like a person browsing, so it should get the dynamic content.

I'm confused. Can somebody clarify? How do I fix it?

newman
  • Web crawlers don't execute JavaScript. Google probably has a more sophisticated crawler that does, but most don't. It's just too compute intensive for trillions of pages. Add a static link for crawlers to follow. See https://stackoverflow.com/a/28075506/148844 – Chloe May 11 '18 at 20:13
    @Chloe "Web crawlers don't execute JavaScript" is not true anymore. https://stackoverflow.com/a/1785101/8384 – McKay May 21 '18 at 20:05

1 Answer


Web crawlers don't execute JavaScript. Google probably has a more sophisticated crawler that does, but most don't. It's just too compute-intensive for trillions of pages. Add a static link for crawlers to follow. See https://stackoverflow.com/a/28075506/148844
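As a rough illustration of what a "static link" could look like, here is a minimal sketch that builds a plain HTML navigation list on the server, so the links are present in the markup itself and do not depend on script execution. The `PageLink` type, the `renderStaticLinks` helper, and the example paths are all hypothetical names chosen for this sketch, not part of any framework.

```typescript
// Hypothetical page descriptor; the paths and titles below are illustrative only.
interface PageLink {
  path: string;
  title: string;
}

// Escape characters that are significant in HTML text and attribute values.
function escapeHtml(text: string): string {
  return text
    .replace(/&/g, "&amp;")
    .replace(/</g, "&lt;")
    .replace(/>/g, "&gt;")
    .replace(/"/g, "&quot;");
}

// Build a plain <nav> of <a href> links that exists in the served HTML itself,
// so a crawler can discover the URLs without running any script.
function renderStaticLinks(pages: PageLink[]): string {
  const items = pages
    .map((p) => `  <li><a href="${escapeHtml(p.path)}">${escapeHtml(p.title)}</a></li>`)
    .join("\n");
  return `<nav>\n<ul>\n${items}\n</ul>\n</nav>`;
}

// Example usage with made-up routes.
const navHtml = renderStaticLinks([
  { path: "/products", title: "Products" },
  { path: "/about", title: "About & Contact" },
]);
console.log(navHtml);
```

The resulting string could be placed in the HTML the server sends for every route, for example in a footer, alongside the SPA's own script-driven navigation.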

Bing doesn't index JavaScript generated content.

https://www.bing.com/webmaster/help/webmaster-guidelines-30fba23a

Site technology: The technology used on your website can sometimes prevent Bingbot from being able to find your content. Rich media (Flash, JavaScript, etc.) can lead to Bing not being able to crawl through navigation, or not see content embedded in a webpage. To avoid any issue, you should consider implementing a down-level experience which includes the same content elements and links as your rich version does. This will allow anyone (Bingbot) without rich media enabled to see and interact with your website.

Rich media cautions – don’t bury links to content inside JavaScript

Rich media warnings – don’t bury links in JavaScript/Flash/Silverlight; keep content out of these as well

Down-level experience enhances discoverability – avoid housing content inside Flash or JavaScript – these block crawlers from finding the content
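One way a "down-level experience" might be wired up is to serve pre-rendered HTML to clients that are unlikely to run JavaScript. The sketch below is only an illustration under stated assumptions: the token list, `isLikelyCrawler`, and `selectHtml` are hypothetical names, and the snapshot is assumed to contain the same content and links as the rich version, as the guideline asks.

```typescript
// Substrings assumed for illustration; real crawler user agents vary and change over time.
const CRAWLER_TOKENS = ["bingbot", "googlebot"];

// Return true when the User-Agent header contains one of the assumed crawler tokens.
function isLikelyCrawler(userAgent: string | undefined): boolean {
  if (!userAgent) {
    return false;
  }
  const ua = userAgent.toLowerCase();
  return CRAWLER_TOKENS.some((token) => ua.indexOf(token) !== -1);
}

// Pick the pre-rendered snapshot for likely crawlers and the normal SPA shell otherwise.
// Both strings are supplied by the caller; how the snapshot is produced is out of scope here.
function selectHtml(
  userAgent: string | undefined,
  snapshotHtml: string,
  spaShellHtml: string
): string {
  return isLikelyCrawler(userAgent) ? snapshotHtml : spaShellHtml;
}

// Example usage with placeholder markup.
const chosen = selectHtml(
  "Mozilla/5.0 (compatible; bingbot/2.0; +http://www.bing.com/bingbot.htm)",
  "<html><body><h1>Product list</h1><a href=\"/products/1\">Item 1</a></body></html>",
  "<html><body><app-root></app-root></body></html>"
);
console.log(chosen);
```

In an Express app, a function like `selectHtml` could be called from a route handler with the request's User-Agent header. Rendering the full content on the server for every visitor is an alternative that avoids user-agent checks altogether.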

Chloe
  • Thanks, Chloe. I'm still confused. I can understand web crawlers not executing JavaScript if it needs user interaction. However, in my case, I don't have JavaScript on my page but I use Angular, and the API is normally called in ngOnInit, which doesn't need any user interaction. Also, what do you mean by "Add a static link for crawlers to follow"? Can you give an example? – newman May 13 '18 at 00:56
  • Both Bing and Google say that they follow some JavaScript. I know I have pages where both Google and Bing index content that is only available through JavaScript. https://stackoverflow.com/a/1785101/8384 – McKay May 21 '18 at 20:00
  • Having said that, a sitemap can make things easier for a developer to get the content indexed by the search engines. – McKay May 21 '18 at 20:01
  • Just because Bing says "sometimes causes difficulty" doesn't mean "bing doesn't index JS generated content". https://stackoverflow.com/a/1785101/8384 I have pages whose content both Bing and Google find, even though it's only accessible via JavaScript. – McKay May 21 '18 at 20:05
  • BingBot most definitely executes Javascript -- I see lots of GET requests from BingBot hitting dynamically-built analytics URLs on my server. (It's somewhat annoying because the URL is supposed to be a POST-endpoint only, so the frequent GET requests clutter up the error logs.) – zacronos Sep 20 '18 at 18:14