Questions tagged [google-crawlers]

"Crawler" is a generic term for any program (such as a robot or spider) used to automatically discover and scan websites by following links from one webpage to another. Google's main crawler is called Googlebot.

395 questions

4 answers

Passing arguments to process.crawl in Scrapy python

I would like to get the same result as this command line:
scrapy crawl linkedin_anonymous -a first=James -a last=Bond -o output.json
My script is as follows: import scrapy from linkedin_anonymous_spider import LinkedInAnonymousSpider from…

asked Dec 20 '15 at 15:06 by yusuf (reputation 3,591; badges: 8 gold, 45 silver, 86 bronze)

2 answers

Is including <meta name="fragment" content="!"> harmful for pages with hashbang?

Google says about this meta tag: The following important restrictions apply: The meta tag may only appear in pages without hash fragments. Only "!" may appear in the content field. The meta tag must appear in the head of the document. Source:…

seo meta-tags hashbang google-crawlers

asked Jun 18 '13 at 20:38 by Christoph (reputation 26,519; badges: 28 gold, 95 silver, 133 bronze)
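
For readers unfamiliar with the hashbang question above: under Google's AJAX crawling scheme (since deprecated), a crawler was expected to fetch a "#!" URL by moving the fragment into an _escaped_fragment_ query parameter, and a page without a hash fragment could opt in through the fragment meta tag. Below is a minimal Python sketch of that URL mapping, not Google's implementation. The function name to_escaped_fragment_url and the has_fragment_meta flag are hypothetical, and the set of escaped characters is an assumption based on my reading of the scheme's documentation.

```python
from urllib.parse import quote, urlsplit, urlunsplit

# Assumption: the scheme escapes control characters, space, '#', '%', '&', '+'
# and non-ASCII bytes; every other printable ASCII character is left as-is.
_SAFE = "".join(chr(c) for c in range(0x21, 0x7F) if chr(c) not in "#%&+")


def to_escaped_fragment_url(url, has_fragment_meta=False):
    """Return the URL a crawler following the scheme would request.

    has_fragment_meta (hypothetical flag) says whether the page declares the
    fragment meta tag in its head.
    """
    parts = urlsplit(url)
    if parts.fragment.startswith("!"):
        # Hashbang URL: the text after "#!" becomes the parameter value.
        token = quote(parts.fragment[1:], safe=_SAFE)
    elif not parts.fragment and has_fragment_meta:
        # No hash fragment, but the page opted in through the meta tag.
        token = ""
    else:
        # Plain URL, or an ordinary "#anchor": nothing to rewrite.
        return url
    param = "_escaped_fragment_=" + token
    query = parts.query + "&" + param if parts.query else param
    return urlunsplit((parts.scheme, parts.netloc, parts.path, query, ""))


if __name__ == "__main__":
    print(to_escaped_fragment_url("http://example.com/page#!key=value"))
    print(to_escaped_fragment_url("http://example.com/", has_fragment_meta=True))
```

The sketch treats the two triggers as separate branches, which mirrors the restriction quoted in the question that the meta tag may only appear in pages without hash fragments.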

3 answers

Avoid crawling part of a page with "googleoff" and "googleon"

I am trying to tell Google and other search engines not to crawl some parts of my web page. What I do is: