I'm writing a native JavaScript app for Android, and it involves a short regex call. The following function should select the inner string from a block of HTML, shorten it if it's too long, then add it back into the HTML block. (Most of the time anyway -- I couldn't write a perfect HTML parser.)

My problem is that on certain inputs, this code crashes on the call "str.search(regex)". (It shows the alert right before the call, "Pre-regex string: ", but not the one afterwards, "Pos: ".) Since the app is running on an Android device, I can't see what error is being thrown.

Under what circumstances could JavaScript code crash when calling "search()" on a string? I don't think the regex itself is at fault, because this works most of the time. I can't reproduce the problem either: if I copy the string character by character and feed it into the function outside of the app, the function doesn't crash. Inside the app, the function crashes on the same string.

Here is the function. I tabbed the alert calls differently to make them easier to see.

trimHtmlString: function(str, len, append) {

    append = (append || '');

    if(str.charAt(0) !== '<') {
      if(str.length > len) return str.substring(0, len) + append;
      return str;
    }

      alert('Pre-regex string: '+str);

    var regex = />.+(<|(^>)$)/;

    var innerStringPos = str.search(regex);
    if(innerStringPos == -1) return str;

      alert('Pos: '+innerStringPos);

    var innerStringArray = str.match(regex);

      alert('Array: '+innerStringArray);

    var innerString = innerStringArray[0];    

      alert('InnerString: '+innerString);

    var innerStringLen = innerString.length;

    innerString = innerString.substring(1, innerString.length-1);

      alert(innerString.length);

    if(innerString.length > len) innerString = innerString.substring(0, len) + append;

    return str.substring(0, innerStringPos+1)
            + innerString
            + str.substring(innerStringPos+innerStringLen-1, str.length);
  }
NcAdams

2 Answers

First, do not parse HTML with regular expressions. You have been warned. Next, make sure you are always passing an actual string. Calling .search() on null or undefined throws a TypeError. Can you provide an example input that crashes?
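
To see what is actually going wrong on the device, one option is to check the input type and wrap the call in try/catch, then pass the message to alert. This is only a sketch: safeSearch and its report parameter are hypothetical names, not part of the original code.

```javascript
// Hypothetical helper (not from the original post): guards the input
// and reports any exception instead of failing silently.
function safeSearch(str, regex, report) {
  if (typeof str !== 'string') {
    // typeof null is 'object', so name null explicitly in the message.
    report('Expected a string but got: ' + (str === null ? 'null' : typeof str));
    return -1;
  }
  try {
    return str.search(regex);
  } catch (e) {
    // Surface the error name and message so they are visible on the device.
    report('search() threw: ' + e.name + ': ' + e.message);
    return -1;
  }
}
```

Inside trimHtmlString it could be called as safeSearch(str, regex, alert), so that the alert shows the cause when the input is not what you expect.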

Community

IMO, your regex fails because you use the start anchor ^ after the start of the string. For example:

<span>rabbit</span>   works
<span>rabbit          fails (no match)

The reason is that the first string is matched by the first alternative, i.e. <

whereas the second string can only be matched by the second alternative, (^>)$, which makes no sense: your pattern has already begun with >.+, and ^ can only match at the start of the string.
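
A quick check that can be pasted into any JavaScript console illustrates the two cases. Note that in a standard engine the second case should simply return -1 from search() rather than throw; the variable names here are only for illustration.

```javascript
var regex = />.+(<|(^>)$)/;

// The first alternative '<' matches, so search() returns the index of the first '>'.
var closed = '<span>rabbit</span>'.search(regex);

// No '<' follows, and (^>)$ cannot match after '>.+', so there is no match at all.
var unclosed = '<span>rabbit'.search(regex);

console.log(closed, unclosed); // 5 -1
```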

For example, if you want to obtain the word "rabbit" in both of the previous cases, you can use /(?<=>)[^<]+/ instead.
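
One caveat: lookbehind, (?<=...), is only available in ES2018 and later engines, and older Android WebViews may reject it with a SyntaxError. A capture group gives the same result without lookbehind. The sketch below assumes a hypothetical helper name, firstInnerText.

```javascript
// Hypothetical helper: returns the text between the first '>' that is
// followed by text and the next '<' (or the end of the string), or null
// when no such text exists.
function firstInnerText(html) {
  var m = />([^<]+)/.exec(html);
  return m ? m[1] : null;
}

console.log(firstInnerText('<span>rabbit</span>')); // rabbit
console.log(firstInnerText('<span>rabbit'));        // rabbit
```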

However, using the DOM to extract the text will be safer.

Casimir et Hippolyte