1

In the HTML below:

<a href="link1.php">1</a><a href="link2.php">2</a><a href="link3.php">3</a>

How do I extract link1.php, link2.php, and link3.php and push them into an array using a regex? (The text could contain any number of <a> tags.)

[Edit] I'm aware the regex for this is something like href="([^"]*)". But I'd like to know how to go about this in ActionScript. Specifically, how can the matches be pushed into an array in ActionScript?

jonathanasdf
Yeti

5 Answers

1
var str:String = '<a href="link1.php">1</a><a href="link2.php">2</a><a href="link3.php">3</a>';
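// split() with a capturing group splices each captured href into the result,
// so the captured links end up at the odd indices.
var result:Array = str.split(/<a[^>]*href="([^"]*)".*?>.*?<\/a>/);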
for (var i:int = 1; i < result.length; i += 2) {
    trace(result[i]);  // link1.php, link2.php, link3.php
}
jonathanasdf
0
<a[^>]*href="([^"]*)"
Simon
0

The regex href="([^"]*)" should work most of the time. It captures the URL into group 1 (\1). You may have to apply it in a loop to collect every match, and you may have to escape the " if you build the pattern from a double-quoted string.

polygenelubricants
0

RegExp.exec() returns an Object that you can access by index, like an array.

Also, you might find this and this post handy.
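
A minimal sketch of the exec() approach, assuming the simple double-quoted href pattern from the question. The names `pattern`, `links`, and `match` are illustrative. With the g flag set, each exec() call resumes from the regex's lastIndex, so a loop can collect every captured href:

```actionscript
var str:String = '<a href="link1.php">1</a><a href="link2.php">2</a><a href="link3.php">3</a>';

// The g flag makes exec() continue from lastIndex instead of restarting at 0.
var pattern:RegExp = /href="([^"]*)"/g;
var links:Array = [];
var match:Object;

// exec() returns null once no further match is found.
while ((match = pattern.exec(str)) != null) {
    links.push(match[1]);  // index 1 holds the first capturing group
}

trace(links.join(","));  // link1.php,link2.php,link3.php
```

Index 0 of each result holds the whole matched text; the capturing groups start at index 1.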

George Profenza
0

Are you sure you want to use regex to parse html?

How about something like:

var anchors:String = '<a href="link1.php">1</a><a href="link2.php">2</a><a href="link3.php">3</a>';
var html:XML = new XML('<p>' + anchors + '</p>');
var links:Array = [];
for each(var a:XML in html.a)
  links.push(String(a.@href));
Amarghosh
  • **The text might be like this**: 1 There could be lots of text here 2 These links are between lots of simple text 3. So making this an XML may not be a very good idea? – Yeti May 06 '10 at 09:56
  • As long as it is valid xml and all anchor tags are at the top level (not inside another p or div), this will work fine. If anchor tags are nested, you can use `html..a` instead of `html.a` inside `for each` to access all the descendants that are anchors. – Amarghosh May 06 '10 at 14:52
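
To illustrate the `html..a` suggestion for anchors mixed with plain text, here is a rough sketch. The sample string `text` and the nested `<div>` are made up for illustration, and the approach assumes the input is well-formed XML (an unclosed tag such as `<br>` would make the XML constructor throw):

```actionscript
// Hypothetical input: anchors scattered through plain text, one nested in a div.
var text:String = 'Intro <a href="link1.php">1</a> lots of text ' +
                  '<div><a href="link2.php">2</a></div> more text ' +
                  '<a href="link3.php">3</a>';

// Wrap the fragment in a single root element so it parses as one XML document.
var html:XML = new XML('<p>' + text + '</p>');
var links:Array = [];

// The descendant accessor (..) reaches <a> elements at any depth,
// not only direct children of the root.
for each (var a:XML in html..a) {
    links.push(String(a.@href));
}

trace(links.join(","));
```

If the input cannot be guaranteed to be well-formed, a regex-based loop is likely the safer choice.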