4

Is it possible to add a target="_blank" to all <a> tags, with just one regular expression? I've been experimenting with negative and positive look aheads/behinds, to no avail. This should:

  • Match anchor tags whose href starts with http://
  • If the anchor tag does not have a target attribute, add target="_blank"
  • If it does have a target attribute, check whether it is already set to "_blank", and if not, replace it with target="_blank"

Is this possible? If not, what would be the least computationally intensive way to do this?

Smern
Karl Cassar
  • 3
    Are you trying to do this from a script (e.g. Javascript) embedded inside a html page or are you running a program against a html file? – Liv Apr 30 '13 at 12:54
  • you want a regular expression to set something? aren't they used to match strings with patterns ? – engma Apr 30 '13 at 12:54
  • 2
    [A regular expression to parse (and modify) HTML?](http://stackoverflow.com/questions/1732348/regex-match-open-tags-except-xhtml-self-contained-tags/1732454#1732454) – ssube Apr 30 '13 at 12:56
  • Keep in mind that in HTML, there's no strict order for tag attributes, so both `<a href="…" target="…">` and `<a target="…" href="…">` are valid. – Alessandro Vendruscolo Apr 30 '13 at 13:05
  • @peachykeen: Common people, stop pasting this link. I think everybody has already seen it and it's a bit unfair to gain your reputation by simply linking it... Although it's funny. – Dio F Apr 30 '13 at 13:27
  • 1
    @DioF: You don't gain rep from comment votes, so linking it is just good fun (and a warning to the OP to not actually try to use just regex). – ssube Apr 30 '13 at 13:40

4 Answers

3

You didn't really specify whether this would be the html from the DOM (among other things). Assuming you want to modify the DOM... Using jQuery, you could do something like this:

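$(document).ready(function () {
    // Only consider anchors that have an href; attr('href') is undefined otherwise
    $('a[href]').filter(function () {
        // Keep anchors whose href starts with http://
        return $(this).attr('href').substr(0, 7) === "http://";
    }).attr('target', '_blank');
});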

Here is a jsfiddle showing that it works (inspect the elements to see that the anchors whose href starts with http:// have target="_blank"): http://jsfiddle.net/amRrj/
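The same idea can be sketched without jQuery. This is only an illustration under assumptions: the function name `setBlankTargets` is made up for this example, and it relies on the standard `querySelectorAll` attribute-prefix selector `a[href^="http://"]` to pick the anchors.

```javascript
// Hypothetical helper (not from the original answer): sets target="_blank"
// on every anchor under `root` whose href starts with http://.
function setBlankTargets(root) {
    // The ^= attribute selector matches values that start with the given prefix.
    var links = root.querySelectorAll('a[href^="http://"]');
    var changed = 0;
    links.forEach(function (link) {
        // Only touch anchors whose target is missing or not already "_blank".
        if (link.getAttribute('target') !== '_blank') {
            link.setAttribute('target', '_blank');
            changed += 1;
        }
    });
    return changed; // number of anchors that were modified
}

// Run on page load when a browser document is available.
if (typeof document !== 'undefined') {
    document.addEventListener('DOMContentLoaded', function () {
        setBlankTargets(document);
    });
}
```

Like the jQuery version, this only affects anchors present when the page has loaded; links inserted later would need another call.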

Smern
  • It was originally intended to be server-side, and HTML from a CMS. However, this will be outputted to a page, and actually I like your idea of pushing the computation to the client-side, via JavaScript. – Karl Cassar Apr 30 '13 at 13:49
1

If the HTML is also well-formed XML then you would be better off using an XML tool with support for xpath and xslt. Take a look at XMLStarlet which provides tools similar to grep, sed etc for working with XML. I think this would do what you want, but I have not tried it:

xml ed -P -S -i //a -t attr -n target -v _blank somefile.html >somefile_2.html

xml ed invokes the XMLStarlet edit command

-P preserve formatting

-S preserve whitespace

-i //a insert at every <a> tag

-t attr insert type is attribute

-n target name of attribute to insert

-v _blank value of attribute to insert

For more complex editing you can use XMLStarlet with an xslt transform.

Dave Kirby
0

Another option is adding target="_blank" to the <base href…>. This sets the default target for all of the page's <a> tags, so you don't need to add it to each one individually. A target declared on an individual <a> tag will override this.

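    // `external` is assumed to be a boolean defined elsewhere
    var target = '_self';

    if (external) {
        target = '_blank';
    }

    // Assumes the page contains a <base> element with id="base_tag"
    $('#base_tag').attr('target', target);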
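If the page might not already contain a <base> element, a variant without jQuery could create one on demand. The following is a minimal sketch, not the answer's own code: `setBaseTarget` is a hypothetical name, and it assumes a standard DOM `document` with a `head`.

```javascript
// Hypothetical helper: ensures a <base> element exists and sets its target.
// `doc` is expected to be a DOM document; `external` is a boolean.
function setBaseTarget(doc, external) {
    var base = doc.querySelector('base');
    if (!base) {
        // No <base> yet: create one and attach it to <head>.
        base = doc.createElement('base');
        doc.head.appendChild(base);
    }
    base.setAttribute('target', external ? '_blank' : '_self');
    return base;
}

// Example call in a browser: make new-tab the default for every link.
if (typeof document !== 'undefined') {
    setBaseTarget(document, true);
}
```

Note that this applies to every link on the page, not only those whose href starts with http://, so it fits best when almost all links are external.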

Fiddle

<base> Documentation

robstarbuck
0

I wrote this one; maybe it helps:

Regex: /<a(.*?)href="(.*?)"(.*?)>(.*?)<\/a>/gi

console.log(`<a href="https://www.stackoverflow.com/test" target="_parent">Stackoverflow</a>`.replace(/<a(.*?)href="(.*?)"(.*?)>(.*?)<\/a>/gi, '<a href="$2" target="_blank">$4</a>'));
console.log(`<a href="https://www.stackoverflow.com/test" target="_target">Stackoverflow</a>`.replace(/<a(.*?)href="(.*?)"(.*?)>(.*?)<\/a>/gi, '<a href="$2" target="_blank">$4</a>'));
console.log(`<a href="https://www.stackoverflow.com/test">Stackoverflow</a>`.replace(/<a(.*?)href="(.*?)"(.*?)>(.*?)<\/a>/gi, '<a href="$2" target="_blank">$4</a>'));
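The replacement above rebuilds each tag from href and the link text, so other attributes such as class are dropped, and it does not check that the href starts with http://. A variation that keeps the remaining attributes could use a replacer function on the opening tag only. This is a rough sketch under assumptions: `addBlankTarget` is a hypothetical name, and like any regex approach it will misbehave on unusual markup (for example a `>` inside an attribute value).

```javascript
// Hypothetical helper: adds or replaces target="_blank" on opening <a> tags
// whose href starts with http://, leaving all other attributes in place.
function addBlankTarget(html) {
    // Match opening <a ...> tags; group 1 is the attribute text (starts with whitespace).
    return html.replace(/<a(\s[^>]*)>/gi, function (tag, attrs) {
        // Leave the tag untouched unless it has an href beginning with http://
        if (!/\shref\s*=\s*["']?http:\/\//i.test(attrs)) {
            return tag;
        }
        // Remove any existing target attribute (double-quoted, single-quoted, or unquoted).
        var cleaned = attrs.replace(/\s+target\s*=\s*("[^"]*"|'[^']*'|[^\s>]+)/gi, '');
        // Append the desired target at the end of the attribute list.
        return '<a' + cleaned + ' target="_blank">';
    });
}

console.log(addBlankTarget('<a target="_parent" href="http://example.com" class="c">x</a>'));
console.log(addBlankTarget('<a href="/local">x</a>'));
```

Because attribute order is not fixed in HTML, the sketch removes target wherever it appears and re-adds it at the end instead of trying to match one particular ordering.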
Ali Hesari