3

This seems simple, but I have never written a regex, so this is a basic question for the JS regex experts.

var str = '<a href="test">luckyy1</a> born on october 21, 1986 <a href="test">sdf2</a> born on september 22, 1985 <a href="test">erere</a> born on November 23, 1984 ';

I got values successfully as: luckyy1, sdf2, erere

But I need it as luckyy1+october+21+1986,sdf2+september+22+1985,erere+... and so on (maybe I need a regex, right?)

Any help would be appreciated.

Luckyy

5 Answers

1

Try this:

str = $("<div/>").html(str).text();
str = str.replace(/,/g, '').replace(/born on /g, '').trim().split(/\s+/).join('+');


Ram
0

While I recommend not parsing HTML with regex, this case is simple enough that you should be able to do it.

"test"\s*>(.+?\d{4})

That will capture everything after a "test"> tag, ending at the first run of four digits (the year in your example).

Your info is space-delimited in group 1 (along with the closing </a> tag). After that, I recommend splitting on spaces to get your individual elements to play with.

Play with the regex.
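
As a rough sketch of how the pattern above could be applied to the whole string: the function name `extractRecords` is made up for illustration, and the literal "test" has been relaxed to any opening `<a ...>` tag, which is an assumption about what the markup will look like. It only handles input shaped like the sample in the question.

```javascript
// Sketch only: extractRecords is a hypothetical helper, not part of any library.
// Assumes every record looks like: <a ...>name</a> born on month day, year
function extractRecords(str) {
  // Same idea as "test"\s*>(.+?\d{4}), but anchored on any opening <a> tag.
  var re = /<a\b[^>]*>(.+?\d{4})/g;
  var records = [];
  var m;
  while ((m = re.exec(str)) !== null) {
    var parts = m[1]
      .replace(/<\/a>/, '')   // group 1 still contains the closing tag
      .replace(/born on/, '') // drop the filler words
      .replace(/,/g, '')      // drop the comma after the day
      .trim()
      .split(/\s+/);          // split on runs of whitespace
    records.push(parts.join('+'));
  }
  return records.join(',');
}
```

Calling `extractRecords(str)` with the string from the question should give one `+`-joined entry per link, separated by commas.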

David B
  • Thanks for the awesome link, David, but to be clear: the href is not always going to be "test" – Luckyy Aug 13 '12 at 16:07
  • You need something to anchor to (a delimiter): what would a string without the `href` look like? – David B Aug 13 '12 at 16:08
  • David, I meant the href value will not always be "test", so it should be matched in regex as "test" – Luckyy Aug 13 '12 at 16:17
  • @luckycool If your tags are going to become arbitrary, I would recommend using `jQuery` like some other answers or a full-fledged HTML parser. – David B Aug 13 '12 at 16:19
0

Basically, you want to strip the HTML tags?

Give this a try:

var StrippedString = OriginalString.replace(/(<([^>]+)>)/ig,"");

from http://css-tricks.com/snippets/javascript/strip-html-tags-in-javascript/

If you wish to get separate strings for each DOM element (as your example suggests), you may traverse the DOM elements with jQuery and strip each one separately.

EDIT:

Something like this:

var $s = jQuery(the_string);
var result = [];
$s.each(function (i, item) {
  result.push($(item).text().replace(/(<([^>]+)>)/ig, ""));
});
frnhr
0

Strictly with the markup you provided you can do something like this:

var values = $('<div><a href="test">luckyy1</a> born on october 21, '+
  '1986 <a href="test">sdf2</a> born on september 22, 1985 ' +
  '<a href="test">erere</a> born on November 23, 1984</div>')
  .contents()
  .map(function(){
    return $(this).text().replace('born on', '').trim();
}).get();

console.log(values); // ["luckyy1", "october 21, 1986", "sdf2", "september 22, 1985", "erere", "November 23, 1984"]

The only thing I changed was adding a wrapping div to the string. You can then use values.join('+') to concatenate with a +, and another string replace on the whitespace.

values.join('+').replace(/\s/g, '+'); // to make all whitespace `+` 
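
To get the comma-separated grouping the question asks for, one possible follow-up is to pair each name with the date that follows it. This is a minimal sketch that assumes `values` always alternates name, date, name, date as in the logged array above; `joinPairs` is a hypothetical helper name.

```javascript
// Sketch only: joinPairs is a made-up name, and the alternating
// [name, date, name, date, ...] layout is an assumption.
function joinPairs(values) {
  var records = [];
  for (var i = 0; i + 1 < values.length; i += 2) {
    // "october 21, 1986" -> "october+21+1986"
    var date = values[i + 1].replace(/,/g, '').split(/\s+/).join('+');
    records.push(values[i] + '+' + date);
  }
  return records.join(',');
}
```

Stripping the commas inside each date first matters here, because otherwise they would be indistinguishable from the commas that separate records.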
Josiah Ruddell
0

This is a dirty solution, but it may help you:

var str = '<a href="test">luckyy1</a> born on october 21, 1986 <a href="test">sdf2</a> born on september 22, 1985 <a href="test">erere</a> born on November 23, 1984 ';

var r= /<a[^>]*>(.*)<\/a> born on ([\w]*) ([\d]*), ([\d]*) <a[^>]*>(.*)<\/a> born on ([\w]*) ([\d]*), ([\d]*) <a[^>]*>(.*)<\/a> born on ([\w]*) ([\d]*), ([\d]*)/;

r.exec(str).splice(1).join('+');
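
If the number of entries is not fixed at three, a variation on the same idea is to write the pattern for a single record and apply it repeatedly with the `g` flag. This is a sketch under the assumption that every record follows the exact "name</a> born on month day, year" shape; `toPlusFormat` is a hypothetical name.

```javascript
// Sketch only: toPlusFormat is a made-up helper, assuming each record
// has the form <a ...>name</a> born on month day, year
function toPlusFormat(str) {
  // One record per match; the lazy (.*?) keeps the name within one link.
  var re = /<a[^>]*>(.*?)<\/a> born on (\w+) (\d+), (\d+)/g;
  var out = [];
  var m;
  while ((m = re.exec(str)) !== null) {
    // m[0] is the whole match; m[1..4] are name, month, day, year.
    out.push(m.slice(1).join('+'));
  }
  return out.join(',');
}
```

Each match contributes one `name+month+day+year` entry, and the entries are joined with commas as in the format the question describes.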
Zango