
I am new to programming, so please don't judge me if I say something stupid.

I was wondering whether there is any way to trick web crawlers so that some of a website's content is different for a human visitor than for a web spider.

So here's an idea I had.

Every time a visitor enters a page, a script identifies the user's gender through the Facebook API. If there is a response (i.e. the user is logged in to Facebook in the same browser), then some extra markup is printed into the page with PHP. If the visitor is a crawler, there is no response, so that markup never appears in the page source the crawler sees.

I know that PHP is a server-side language, so web crawlers never see the PHP source itself, only the HTML it outputs. If I am not right, please correct me.
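A simpler variant of this idea, without depending on the Facebook API, is to branch on the request's `User-Agent` header server-side. A minimal sketch (the bot list and the printed `<div>` are illustrative assumptions, not an exhaustive or reliable detection method, since crawlers can send any user agent they like):

```php
<?php
// Hypothetical sketch: emit extra markup only when the visitor does not
// look like a known crawler, based on the User-Agent header.
function looksLikeCrawler(string $userAgent): bool {
    // Illustrative substrings of common crawler user agents.
    $bots = ['googlebot', 'bingbot', 'slurp', 'duckduckbot', 'baiduspider'];
    foreach ($bots as $bot) {
        if (stripos($userAgent, $bot) !== false) {
            return true;
        }
    }
    return false;
}

$ua = $_SERVER['HTTP_USER_AGENT'] ?? '';
if (!looksLikeCrawler($ua)) {
    // Only (apparent) human visitors receive this markup.
    echo '<div>Content hidden from crawlers</div>';
}
```

Note that serving different content to crawlers than to humans is cloaking, which search engines penalize, so this is only a sketch of the mechanism, not a recommendation.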

Thank you.

Steve
  • See [1](http://stackoverflow.com/questions/677419/how-to-detect-search-engine-bots-with-php) and [2](http://www.cult-f.net/detect-crawlers-with-php/); they may help you. –  Mar 03 '13 at 15:41
  • You are assuming that all human visitors a) have a Facebook account, b) are logged in to Facebook when they visit your site, and c) their profile is public or they explicitly give your site access to their profile. – JJJ Jun 16 '13 at 05:50

1 Answer


I think what you are trying to do can be accomplished with robots.txt.

This file sits at the root of your web directory and defines the rules for web crawlers. See here: http://www.robotstxt.org/
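For example, a minimal robots.txt that asks all crawlers to skip a (hypothetical) /private/ directory while allowing the rest of the site might look like:

```
User-agent: *
Disallow: /private/
```

Well-behaved crawlers such as Googlebot honor these rules, but compliance is voluntary; nothing forces a crawler to obey them.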

jas
  • robots.txt is unreliable; it is more of a polite request to Google than anything binding. Plus, what do you think a user will see when they manually open robots.txt from your root? – samayo Mar 03 '13 at 15:46
  • I know that robots.txt can be ignored by some crawlers. Thanks for the answer anyway. – Steve Mar 03 '13 at 15:52