0

Solved it. In case anyone needs it, here it is:

var feed = feeds.entries[i].content;
// Rename every src= so the browser does not fetch the resources when the markup is parsed
var parsedFeed = feed.replace(/src=/gi, "tempsrc=");
var tmpHolder = document.createElement('div');
tmpHolder.innerHTML = parsedFeed;
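
A minimal sketch of how the snippet above might be wrapped up to get the first image of a post, which is the goal described in the comments below. The helper names `deferImages` and `firstDeferredImage` are hypothetical, and the `doc` parameter is assumed to be the browser's `document`; only the `replace` call and the detached `div` come from the original snippet.

```javascript
// Sketch only: deferImages and firstDeferredImage are hypothetical helper names.

// Rename every src= to tempsrc= so parsing the markup does not trigger downloads.
// This is the same blunt replace as above: it also rewrites src= on non-image
// tags and any literal "src=" that appears in the text.
function deferImages(html) {
  return html.replace(/src=/gi, "tempsrc=");
}

// Parse the rewritten markup in a detached element and return the first image's
// original URL, or null if the post contains no image.
// `doc` is assumed to be the browser's document object.
function firstDeferredImage(html, doc) {
  var holder = doc.createElement('div');
  holder.innerHTML = deferImages(html);
  var img = holder.getElementsByTagName('img')[0];
  return img ? img.getAttribute('tempsrc') : null;
}
```

In a browser this would be called as `firstDeferredImage(feeds.entries[i].content, document)`.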

I have a string containing HTML markup which includes <img src='path.jpg'/>

I would like to run a regex against the string to rename every src attribute to tmpSrc

so

 <img src='path.jpg'/>

would turn into

 <img tmpSrc='path.jpg'/>

This is in JavaScript, by the way.

Here is the root issue, which has been posted in other places but has not been solved:

Browser parse HTML for jQuery without loading resources

How to parse AJAX response without loading resources?

Thanks

samccone
  • if HTML is the question, regexps ain't the answer... – Alnitak Jul 12 '11 at 21:38
  • It's contained within a string, i.e. `var result = '.... ...';` – samccone Jul 12 '11 at 21:40
  • HTML or string, it's still HTML, it doesn't matter. As @Alnitak said, *regexps ain't the answer* and I totally agree with him. Now the question I would like to ask you is the following: Why would you want to turn a visibly valid HTML snippet (missing an `alt` attribute to be valid) into an invalid one (by using a `tmpSrc` attribute)? – Darin Dimitrov Jul 12 '11 at 21:43
  • OK, so I am getting some HTML via AJAX, and I want to extract the first image from each post. The issue is that when I pull a post, it pulls the entire post each time (I can't change this). When I then parse it with jQuery, it instantiates the elements into the DOM and loads all of the resources for each post (lots of images), which is very slow. Changing src to tmpSrc prevents the DOM from loading the images by default. – samccone Jul 12 '11 at 21:46
  • hmm, I wonder if it'll try to load the images if you do the jQuery stuff I suggested like so: `$(myhtml).filter('img').each(...)` ? – Alnitak Jul 12 '11 at 21:49
  • the moment you call $(myhtml) it begins the load.. I have already tried – samccone Jul 12 '11 at 21:50
  • I don't suppose the content you're reading could be treated as valid XML? – Alnitak Jul 12 '11 at 21:58
  • I do suppose it could be... are you thinking of using the xml parser in jquery to deal with it? – samccone Jul 12 '11 at 22:01
  • @samccone yes, something like that. If it thinks it's XML it might not try to create DOM elements. – Alnitak Jul 12 '11 at 22:07
  • @Alnitak let us [continue this discussion in chat](http://chat.stackoverflow.com/rooms/1391/discussion-between-samccone-and-alnitak) – samccone Jul 12 '11 at 22:09
  • Yes this failed.. malformed XML as I expected – samccone Jul 12 '11 at 22:10
  • [Obligatory link to famous post about problems with using regexes to parse HTML.](http://stackoverflow.com/questions/1732348/regex-match-open-tags-except-xhtml-self-contained-tags/1732454#1732454) – Elias Zamaria Jul 12 '11 at 22:34
  • @mike - one comment is sufficient to get your point across. Thanks. – Kev Jul 13 '11 at 00:17
  • @samccone - please don't add your solution to the question. Make this an answer referencing jfriend00's post. Thanks. – Kev Jul 13 '11 at 00:18
  • Sorry about this. I tried posting it as an answer and I didn't see it like I usually do so I tried a few more times. – Elias Zamaria Jul 13 '11 at 17:30

5 Answers

2

If this is a string you control and not HTML retrieved from a web page, then you can indeed safely use a regex. To change all occurrences of <img src= to <img tmpSrc=, you can use this operation:

var str = "<img src='path.jpg'/>";   // whatever your source string is
str = str.replace(/<img src=/gi, "<img tmpSrc=");

What the other posters have been saying is that regexes are a poor fit for HTML retrieved from a web page: different browsers return different forms of HTML, which makes it hard to match reliably. But if the string you're doing the replace on is under your own control and you know its format, then this regex should work fine.
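
If the format is only mostly known, a slightly more defensive variant is possible. The sketch below is not part of the original answer: `renameImgSrc` is a hypothetical name, and it assumes no attribute value contains a literal `>`. It first isolates each `<img ...>` tag, then renames only a `src` attribute that is preceded by whitespace, so `src=` on other tags and attributes such as `data-src` are left alone.

```javascript
// Hypothetical helper, a sketch rather than a general HTML parser.
// Assumes no attribute value inside an <img> tag contains a literal ">".
function renameImgSrc(html) {
  // Outer pass: find each <img ...> tag, case-insensitively.
  return html.replace(/<img\b[^>]*>/gi, function (tag) {
    // Inner pass: rename src only when it starts an attribute
    // (preceded by whitespace), allowing optional spaces before "=".
    return tag.replace(/(\s)src(\s*=)/gi, "$1tmpSrc$2");
  });
}
```

For example, `renameImgSrc("<img src='path.jpg'/>")` returns `<img tmpSrc='path.jpg'/>`, while a `<script src=...>` tag in the same string is returned unchanged.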

jfriend00
1

Manipulating HTML with RegExps is error prone.

If you can tolerate including jQuery in your page:

$('img').each(function() {
    var src = this.src;
    // removeAttr takes the attribute name, not its value
    $(this).attr('tmpSrc', src).removeAttr('src');
});
Alnitak
0
function replace() {
    var images = document.getElementsByTagName('img'),
        srcValue,
        max,
        i;

    for (i = 0, max = images.length; i < max; i++) {
       srcValue = images[i].getAttribute('src');
       images[i].setAttribute('tmpSrc', srcValue);
       images[i].removeAttribute('src');
    }
}
user278064
0

as string:

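var input = " <img src='path.jpg'/>";
// A regex with the g flag replaces every occurrence; a plain string pattern replaces only the first
var output = input.replace(/src=/g, "tmpSrc=");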

using DOM:

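    var e = document.getElementById("theimg");
    // Copy the value into a real attribute (not just a JS property), then drop src
    e.setAttribute("tmpSrc", e.getAttribute("src"));
    e.removeAttribute("src");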
The Mask
0

Check out this post explaining the problems with using regexes to parse HTML. It is tricky and probably not worth the effort. I don't think there is any way that is guaranteed to work.

Elias Zamaria