
Suppose I need a public web page that displays the email address of a user of my site. In addition to obfuscation, would JavaScript like this be helpful against spambots?

// id, username, and hostname are assumed to be defined elsewhere on the page.
setTimeout(function () {
    document.getElementById(id).innerHTML = "<span>" + username + "@" + hostname + "</span>";
}, 50);
TrtG

1 Answer


It entirely depends on the spambot. This could stop some spambots, but it wouldn't stop a scraper designed specifically to work around this defense.

That's how arms races work.

It would be pretty straightforward to build a bot that works around the defense you have in mind. A scraper could use a headless browser (such as PhantomJS) to fetch the page, evaluate all the JavaScript on it, wait an arbitrary amount of time (say, 10 seconds), and then scrape the DOM for email addresses.
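To make the last step concrete, here is a minimal sketch of the scraping part only. It does not drive a real headless browser; it assumes the bot already has the rendered markup as a string. The helper name `extractEmails`, the sample markup, and the address `alice@example.com` are all made up for illustration, and the regex is deliberately simple rather than a full email grammar.

```javascript
// Hypothetical helper: pulls anything that looks like an email address
// out of a string of rendered HTML. The pattern is intentionally loose.
function extractEmails(html) {
    const pattern = /[A-Za-z0-9._%+-]+@[A-Za-z0-9.-]+\.[A-Za-z]{2,}/g;
    const matches = html.match(pattern) || [];
    return [...new Set(matches)]; // drop duplicates, keep first-seen order
}

// Markup as served, before any script has run: nothing to harvest.
const beforeScripts = '<div id="contact"></div>';

// Markup after the delayed script has filled in the element, which is
// what a headless browser would see once it has waited long enough.
const afterScripts = '<div id="contact"><span>alice@example.com</span></div>';

console.log(extractEmails(beforeScripts)); // []
console.log(extractEmails(afterScripts));  // [ 'alice@example.com' ]
```

A bot that only reads the raw response sees the first string; a bot that runs the scripts and waits sees the second, at which point the delay no longer protects anything.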

Matt Ball
  • Ok, do you think that having to wait on each page would be a loss of efficiency for the bot, so that we can assume many of them won't do this? – TrtG Jun 04 '15 at 17:12
  • @TrtG, I highly doubt any spam bot currently does this. It'd be easy to work around if someone wants to specifically target your site (any reason to expect that?), but yeah, I wouldn't expect most spam bots to bother, for the same reason that most won't try running OCR on images to see if there's an email address in the image, despite that being very doable (it's just too resource-heavy to be worth it). – Kat Jun 04 '15 at 17:15
  • It all depends on which kinds of attacks _you_ feel it is necessary to defend against. There are plenty of existing strategies for fighting email scrapers, like putting the address behind a captcha. Why not use one of those? See: http://stackoverflow.com/q/23002711/139010 and https://en.wikipedia.org/wiki/Email_address_harvesting#Anti-harvesting_methods – Matt Ball Jun 04 '15 at 17:15
  • My website needs to show some user information publicly, which is why I don't want to use a captcha: it would hurt the UX. – TrtG Jun 04 '15 at 17:18
  • @Mike: Search crawlers run some JavaScript, so there's no reason to believe spam bots don't as well. Definitely a resource/reward equation, though. – T.J. Crowder Jun 04 '15 at 17:24
  • @TrtG: *"Ok, do you think that having to wait on each page would be a loss of efficiency for the bot..."* Probably not much. They can have the CPU rendering other pages while your page sits there in case something happens; it's not like they have to busy-wait. – T.J. Crowder Jun 04 '15 at 17:25