I want to replace a string of characters in an HTML tag using JavaScript. In this example I want to remove everything between the <table and the <tbody>. I'm using the replace function with a regular expression, but the regular expression must be constructed wrongly somewhere. Here is what I currently have:

str = str.replace(/([<table]\w*\W*[<tbody>])/, "");

The regular expression logic as I see it is like this (correct me where I'm wrong):

  1. I'm looking for the string match of <table so I put that string in the brackets as I want that to match exactly as written.

  2. Then I place a \w*\W* because I expect 1 or more of both alphanumeric and non alphanumeric characters to follow.

  3. Finally I place the "<tbody>" in the brackets because I expect that format exactly.

So the results are not as I expected. There is no other <tbody> or <table in my string so I don't know what I'm doing wrong.

This is what the string looks like before I replace the characters with nothing.

"\n\t\t\t\t\t\t\n                                                <div>\n\t\t\t\t\t\t\t
<table id=\"gvStation_ctl19_gvExtRows\" style=\"border-collapse: collapse;\" border=\"1\" rules=\"all\" cellspacing=\"0\">
\n\t\t\t\t\t\t\t\t<tbody>
Mr Lister
captainduh

1 Answer

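  1. Square brackets define a character class: [<table] matches a single character that is any one of <, t, a, b, l or e, not the literal string <table. To match the text exactly, write it without brackets. See http://www.w3schools.com/jsref/jsref_obj_regexp.asp.
  2. \w*\W* matches one run of word characters followed by one run of non-word characters. It cannot span text that alternates between the two several times, such as the attributes inside the <table> tag.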
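
To see what the pattern from the question actually matches, here is a minimal sketch. The sample string is a shortened stand-in for the one in the question, and the variable names are only illustrative:

```javascript
// Shortened, illustrative input shaped like the string in the question.
const sample = '<div>\n\t<table id="gvStation_ctl19_gvExtRows" border="1">\n\t\t<tbody>';

// The pattern from the question. Each [...] matches ONE character from the set,
// so [<table] accepts "<", and [<tbody>] accepts the "t" of "<table".
const original = /([<table]\w*\W*[<tbody>])/;

const firstMatch = sample.match(original)[0];
console.log(JSON.stringify(firstMatch)); // → "<div>\n\t<t"

// Replacing that match removes the wrong text and leaves "able id=..." behind.
const cleaned = sample.replace(original, "");
console.log(cleaned.startsWith("able id=")); // → true
```

The match starts at the "<" of <div>, because that single character already satisfies [<table], and it ends at the first "t" of <table.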

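Here is the solution: /<\s*table(?:.|\s)*<\s*tbody\s*>/i

The \s* parts tolerate optional whitespace inside the tags, (?:.|\s)* matches any run of characters including newlines (which . alone does not match), and the i flag makes the match case-insensitive.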

var str = '"\n\t\t\t\t\t\t\n < div>\n\t\t\t\t\t\t\t < table id=\"gvStation_ctl19_gvExtRows\" style=\"border-collapse: collapse;\" border=\"1\" rules=\"all\" cellspacing=\"0\"> \n\t\t\t\t\t\t\t\t< tbody>';

str = str.replace(/<\s*table(?:.|\s)*<\s*tbody\s*>/i, "");

alert(str);
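
One caveat: (?:.|\s)* is greedy, so if a string contained more than one <tbody>, the match would run to the last one. The question says there is only one, so this is not a problem there. For the general case, a lazy variant might look like the sketch below. The helper name stripTableOpening and the sample input are made up for illustration and are not part of the original answer:

```javascript
// Hypothetical helper: removes the text from the first "<table" through the
// first "<tbody>" that follows it.
function stripTableOpening(html) {
  // [\s\S] matches any character, newlines included; *? stops at the first <tbody>.
  return html.replace(/<\s*table[\s\S]*?<\s*tbody\s*>/i, "");
}

// Illustrative input with two tables, to show the difference from the greedy pattern.
const input = '<div>\n<table id="t1" border="1">\n<tbody><tr><td>1</td></tr></tbody>\n<table id="t2"><tbody>';

const lazyResult = stripTableOpening(input);
const greedyResult = input.replace(/<\s*table(?:.|\s)*<\s*tbody\s*>/i, "");

console.log(JSON.stringify(lazyResult));
// → "<div>\n<tr><td>1</td></tr></tbody>\n<table id=\"t2\"><tbody>"
console.log(JSON.stringify(greedyResult));
// → "<div>\n"
```

For anything beyond a one-off string cleanup, parsing the markup with the DOM is usually more robust than a regular expression.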
Glorfindel
Maxime Gélinas