2

I have a link:

<a href="http://domain.com/?=register">Register</a>

I want to hide the URL from bots. I know that bots generally do not have JavaScript enabled, so I am thinking of an approach like this: in my HTML code, I have the URL reversed:

<a href="retsiger=?/moc.niamod//:ptth">Register</a>

Then, using JavaScript, I reverse it so the user sees the correct URL. How can I do this? Compatibility is obviously essential.

To users who do not have JS enabled, I simply display a message that JS is required.

Gary Woods
  • @doniyor spam bots tend to ignore robots.txt and why is this a crazy idea? Bots crawl the page for a registration URL, this seems like a good way to hide it. – Gary Woods Sep 24 '14 at 14:53
  • 1
    Depending on the bots you're targeting I wouldn't assume they all can't read JS: http://googlewebmastercentral.blogspot.com/2014/05/understanding-web-pages-better.html I haven't been able to find any hard stats on this however. – xDaevax Sep 24 '14 at 15:11
  • Also, you could use a delayed execution method. Since a bot will not likely trigger a mouse-move event, why not just populate the registration link on the mouse move or some other input event that a bot will not trigger? That way the link isn't in the markup at all. – xDaevax Sep 24 '14 at 15:14
  • Just have a text (span) that says that Javascript is needed to register, and add the entire registration link using Javascript. That way, the page won't contain any invalid links for non-JS users, and you won't risk any page rank penalties for having an invalid link. – GolezTrol Sep 24 '14 at 15:16

4 Answers

3

You can do what you're asking with this bit of script...

$("a").each(function() {
    var href = $(this).attr("href");
    href = href.split("").reverse().join("");
    $(this).attr("href", href);
});

It splits the href value into an array of characters, which can be reversed easily with Array.prototype.reverse(), and then joins the array back into a string.

Obviously put it in a document ready handler, as in your example.
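For anyone who wants the same technique without jQuery, here is a minimal sketch. The class name `reversed` is a hypothetical marker (not part of the original question) so that only deliberately reversed links get touched, and the helper names are illustrative. It relies on `querySelectorAll`, `addEventListener` and `Array.prototype.forEach`, so it should work in IE9 and later.

```javascript
// Reverse a string by splitting it into characters, reversing the array,
// and joining it again. Fine for plain ASCII URLs like the one in the question.
function reverseString(text) {
    return text.split("").reverse().join("");
}

// Un-reverse the href attribute of every link in `links`.
// `links` is any array-like of elements exposing getAttribute/setAttribute.
function restoreLinks(links) {
    Array.prototype.forEach.call(links, function (link) {
        // Read the attribute, not the property, to get the raw value
        var raw = link.getAttribute("href");
        if (raw) {
            link.setAttribute("href", reverseString(raw));
        }
    });
}

// In a browser, run once the DOM has been parsed.
// "a.reversed" is a hypothetical selector for links with reversed hrefs.
if (typeof document !== "undefined") {
    document.addEventListener("DOMContentLoaded", function () {
        restoreLinks(document.querySelectorAll("a.reversed"));
    });
}
```

Limiting the selector to marked links avoids reversing ordinary links on the page, which `$("a")` would do.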

Here's a jsfiddle example...

http://jsfiddle.net/1u0wtv0f/


Attributes vs Properties

In this case it is important that we use the href attribute value, as opposed to the href property. The reason for this is that if you get the href property then it is converted into an absolute URL. In the case of this example, the href value

retsiger=?/moc.niamod//:ptth

would become

http://domain.com/retsiger=?/moc.niamod//:ptth

By using the attribute value we use the value that was used when the link was created.
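The difference can be reproduced outside the DOM with the standard `URL` constructor, which resolves a relative value against a base in the same way the href property does. This is only an illustrative sketch: the base URL `http://domain.com/` is an assumption about where the page is served from, and the `URL` constructor itself is not available in older browsers such as IE.

```javascript
// Assumption: the page is served from http://domain.com/ (hypothetical base URL).
var base = "http://domain.com/";

// What $(this).attr("href") returns: the raw value from the markup
var attributeValue = "retsiger=?/moc.niamod//:ptth";

// What this.href returns: the attribute resolved against the page URL
var propertyValue = new URL(attributeValue, base).href;

function reverseString(text) {
    return text.split("").reverse().join("");
}

// Reversing the attribute gives the intended link
var fromAttribute = reverseString(attributeValue);

// Reversing the property also reverses the prefixed base URL,
// so the result is no longer the intended link
var fromProperty = reverseString(propertyValue);

console.log(propertyValue);
console.log(fromAttribute);
console.log(fromProperty);
```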

Reinstate Monica Cellio
  • No problem - back in a mo. – Reinstate Monica Cellio Sep 24 '14 at 15:07
  • Excellent, also, if you could highlight browser compatibility, that would be very helpful. – Gary Woods Sep 24 '14 at 15:08
  • I ended up having to change it, but it was mainly due to being in jsfiddle. `this.href` gets an absolute URL, whereas `$(this).attr("href")` gets the actual value of the attribute as you've created it. I'll update the answer code and add a link. – Reinstate Monica Cellio Sep 24 '14 at 15:10
  • I can't see there being any compatibility issues with this, especially since it's heavily using jQuery. – Reinstate Monica Cellio Sep 24 '14 at 15:13
  • As a side note, I would favor `prop("href",href);` instead of `attr("href", href);`. http://stackoverflow.com/questions/5874652/prop-vs-attr – xDaevax Sep 24 '14 at 15:27
  • 1
    @xDaevax in this particular case I ruled out using the property as it is automatically converted into an absolute URL. Since the href value is not a real URL it would get prefixed with the protocol and domain name, therefore breaking the link when the value is reversed. In this case it specifically MUST be the attribute and not the property. I'll put that in the answer to make it clear. – Reinstate Monica Cellio Sep 24 '14 at 15:37
  • Ah, I see. That's interesting behavior of the `prop` method. Thanks for that tidbit. – xDaevax Sep 24 '14 at 15:42
1

OK, sorry, it is not a crazy idea, but I think this way is better.

<script type="text/javascript">
$(document).ready(function() {
    // Copy each link's data-href value into its href attribute
    $('a[data-href]').each(function() {
        $(this).attr('href', $(this).attr('data-href'));
    });
});
</script>

Then construct your links in the following fashion.

<a href="" rel="nofollow" data-href="http://domain.com/?=register">register</a>

All hrefs will be filled in once the DOM is ready, so while bots are scanning the raw HTML, the hrefs are empty.

OR

<a href="" rel="nofollow" id="reg">register</a>

and the jQuery:

<script type="text/javascript">
$(document).ready(function() {
    // Set the href only after the DOM is ready
    $('#reg').attr('href', 'http://domain.com/?=register');
});
</script>
doniyor
  • 1
    `$(this).data('href')` is shorter and neater – hjpotter92 Sep 24 '14 at 14:56
  • 1
    The problem with this is that the register URL is still in the HTML and will be grabbed by the spam crawler. I know this for a fact because I have seen one running live. It downloads the page (doesn't have JS enabled), then grabs all the URL in the file, and finally looks for register type of words. Anyway to use the above code without the URL in the data-href? – Gary Woods Sep 24 '14 at 14:58
  • @GaryWoods what about the second solution? – doniyor Sep 24 '14 at 15:01
  • Great update, but unfortunately, the URL itself will still be there available in the script. As long as the word 'register' is **anywhere** within the script, then it is vulnerable. This is why I was asking for a Javascript reverse type of solution. I hope you understand the core issue. – Henrik Petterson Sep 24 '14 at 15:03
  • 2
    If I wrote a bot I'd scan all sources with a regex to detect URLs. This approach would still be vulnerable then. – Ke Vin Sep 24 '14 at 15:03
  • @HenrikPetterson what if you import the js file? – doniyor Sep 24 '14 at 15:05
  • Regardless, the URL will still be there in the JS file :) Do you know how to reverse the URL string using JS? – Gary Woods Sep 24 '14 at 15:06
0

If the bot specifically looks for the "href" part of an anchor tag, then you can fool it in a different way. This code is an HTML construct that looks and acts like an ordinary link when the browser displays it, but isn't one, because it calls a JavaScript function to do what a link normally does:

<a onclick="linkfunc(1);"
 style="color:#0000ff;text-decoration:underline;cursor:pointer;">
 A Page Being Linked</a>

In your block of JavaScript code, you would do the rest:

function linkfunc(wch)
{ var h="http://", r="?=register";
  window.name="ThisPage";
  switch (wch)
  { case 1:
      window.open(h+"domain.com/"+r, "ThisPage");
      break;
    case 2:
      //window.open(wherever you want linkfunc(2) to go, "ThisPage");
      break;
  }
  return;
}
  • This will not work because the URL will still be in the JS file, which will be downloaded and scraped by the bot. – Gary Woods Sep 24 '14 at 15:09
  • You can break the URL into finer pieces than I've done here, say, 3 letters per variable. – vernonner3voltazim Sep 24 '14 at 15:10
  • More, imagine one letter per variable, and assembling the URL as h+t+t+p+ ... --except you use different variable-letters for different letters, scrambling the appearance: q+e+e+n+ ... – vernonner3voltazim Sep 24 '14 at 15:17
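The finer-pieces idea from the comments above could look something like the sketch below. The fragment boundaries, the shuffled order and the name `assembleUrl` are arbitrary, illustrative choices. It keeps words such as "register" from appearing as one contiguous string in the source, but a bot that actually executes JavaScript would still end up with the assembled URL.

```javascript
// Sketch: store the URL as small fragments, out of order, so that neither
// the full URL nor the word "register" appears contiguously in the source.
var fragments = ["reg", "dom", "://", "ain", ".co", "htt", "m/?", "=", "ist", "p", "er"];

// Indexes into `fragments`, listed in the order needed to rebuild the URL
var order = [5, 9, 2, 1, 3, 4, 6, 7, 0, 8, 10];

// Rebuild the URL at call time by picking the fragments in the right order
function assembleUrl() {
    return order.map(function (i) {
        return fragments[i];
    }).join("");
}

// Browser usage, following the linkfunc pattern from the answer:
// function linkfunc() {
//     window.name = "ThisPage";
//     window.open(assembleUrl(), "ThisPage");
// }
```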
0

Alternatively, you could leave the link out of the markup entirely and load it via Ajax. In addition, you could add it to the markup only once a user event occurs that a bot will not trigger.

If you used something like this:

$(function () {
    $(document).on("mousemove", function (e, data) {
        //Make ajax call to get link
        var linkTarget = "http://www.test.com/register";
        $(".dyn-link").prop("href", linkTarget);
        $(".dyn-link").text("Register");
        console.log("fired");
      //Remove the event handler
        $(document).off("mousemove", null);
    });
});
<script src="https://ajax.googleapis.com/ajax/libs/jquery/2.1.0/jquery.min.js"></script>
<a class="dyn-link" href=""></a>
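The same one-shot idea can be sketched without jQuery. The function name `revealOnFirstEvent` and its parameters are hypothetical; the URL is hard-coded here just as in the snippet above, whereas a real version would presumably fetch it with an Ajax call inside the handler.

```javascript
// Sketch: populate the link only after the first occurrence of a user event.
// `target` is anything with addEventListener/removeEventListener (e.g. document);
// `link` is anything with setAttribute and textContent (e.g. an <a> element).
function revealOnFirstEvent(target, link, eventName, url, label) {
    function handler() {
        // Remove the handler first so the link is populated only once
        target.removeEventListener(eventName, handler);
        link.setAttribute("href", url);
        link.textContent = label;
    }
    target.addEventListener(eventName, handler);
}

// Browser usage, assuming the <a class="dyn-link"> element shown above:
// revealOnFirstEvent(document, document.querySelector(".dyn-link"),
//     "mousemove", "http://www.test.com/register", "Register");
```

Until the event fires, neither the href nor the link text exists in the DOM, so a crawler that only reads the downloaded HTML finds an empty anchor.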
xDaevax