Regex to match line entry padded with whitespace in textarea

Question

I have an HTML textarea whose entries can look like the following:

google.com
   youtube.com   word word
  netflix.com   
 twitch.tv
  vimeo.com  word
soundcloud.com  word    word

I want to make a feature that will search through the list for a URL and delete all entries of it. To do this, I first need a regex to find the first occurrence. Note that I only need and want to find the first occurrence.

The feature must only delete an exact match. That is,

DeleteEntry("youtube.com");

should NOT delete the second line, but

DeleteEntry("youtube.com   word word");

should.

So basically, I need to match this pattern

(beginningOfString OR newlineChar) then (anyWhiteSpaceExceptNewline) then (ENTRY) then (anyWhiteSpaceExceptNewline) then (endOfString OR newlineChar)

This is what I have so far

var expression = "\n|^[ \f\r\t\v]*" + entry + "[ \f\r\t\v]*\n|$";
var match = listbox.value.match(expression);

It doesn't seem to work the way I'm expecting it to.

You might try with `m` flag - `var expression = new RegExp("^[ \f\r\t\v]*" + entry + "[ \f\r\t\v]*$", "m");` — Wiktor Stribiżew, Jun 10 '17 at 19:14

ibrahim mahrir · Accepted Answer · 2017-06-10T19:36:48.280

Note: If you want to use \ inside a string, you'll have to escape it. "\some text" is wrong, but "\\some text" is correct.
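
To illustrate the note above: in a JavaScript string literal, an unrecognized escape such as `\s` or `\.` silently drops the backslash, so the regex engine never sees it. A small sketch (the variable names `wrong` and `right` are only for illustration):

```javascript
// "\s" and "\." are not valid string escapes, so the backslash is dropped.
var wrong = "^\s*a\.com\s*$";      // the string actually contains ^s*a.coms*$
var right = "^\\s*a\\.com\\s*$";   // the string actually contains ^\s*a\.com\s*$

console.log(wrong);                               // ^s*a.coms*$
console.log(new RegExp(wrong).test("  a.com  ")); // false: "s*" is not whitespace
console.log(new RegExp(right).test("  a.com  ")); // true
```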

var ta = document.getElementById("ta"),
    inp = document.getElementById("inp"),
    btn = document.getElementById("btn");
    
// escape text to be used inside a RegExp (from: https://stackoverflow.com/q/3115150/6647153)
function escape(text) {
    return text.replace(/[-[\]{}()*+?.,\\^$|#\s]/g, '\\$&');
}

function deleteEntry(query) {
  var text = ta.value;
      
  var regexText = query.trim()                                    // remove surrounding spaces
                       .split(/\s+/)                              // split into tokens ("a   b c" becomes ["a", "b", "c"])
                       .map(escape)                               // escape the tokens ("a.com" becomes "a\\.com" so it'll be /a\.com/ where the '.' is treated as a literal '.', not as the special character .)
                       .join("\\s+");                             // joins the tokens together with \s+ (["a", "b", "c"] becomes "a\\s+b\\s+c" so it'll be /a\s+b\s+c/)
      
  var regex = new RegExp("^\\s*" + regexText + "\\s*$", "gm");    // surround regexText with ^\s* and \s*$ and use the g modifier for multiple matches and the m modifier for multiline text
  
  ta.value = text.replace(regex, "");                             // replace the matched text with "" and reassign it back to the textarea
}
    
btn.onclick = function() {
  deleteEntry(inp.value);                                         // calling deleteEntry passing to it the input's value as the query
}

textarea {
  display: block;
  width: 100%;
  height: 100px;
}

<textarea id="ta">
google.com
   youtube.com   word word
  netflix.com   
 twitch.tv
  vimeo.com  word
soundcloud.com  word    word
</textarea>
<input id="inp"><button id="btn">Delete</button>
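
A few caveats with the snippet above: `\s*` can also match newline characters, the `g` flag removes every matching line, and replacing with `""` leaves an empty line behind. A minimal sketch of a variant closer to the question's wording (first occurrence only, horizontal whitespace only, the line's newline removed too) might look like this. The names `escapeRegExp` and `removeFirstEntry` are hypothetical, and the entry is compared as typed rather than split into tokens.

```javascript
// Escape characters that are special inside a regular expression.
function escapeRegExp(text) {
  return text.replace(/[.*+?^${}()|[\]\\]/g, "\\$&");
}

// Hypothetical helper: remove the first line that equals `entry`
// once the spaces/tabs around it are ignored.
function removeFirstEntry(text, entry) {
  var ws = "[ \\t\\f\\v\\r]*";                      // whitespace, but never "\n"
  var regex = new RegExp(
    "^" + ws + escapeRegExp(entry) + ws + "(?:\\n|$)",
    "m"                                             // no "g": first match only
  );
  return text.replace(regex, "");
}

var list = "google.com\n   youtube.com   word word\n  netflix.com   \n twitch.tv\n";

// No line consists of just "youtube.com", so nothing is removed.
console.log(removeFirstEntry(list, "youtube.com") === list);

// The whole second line, including its newline, is removed.
console.log(JSON.stringify(removeFirstEntry(list, "youtube.com   word word")));
```

If the matching line is the last one and has no trailing newline, this sketch leaves the newline of the preceding line in place.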

What I got from this is to use the multi-line option, which simplifies the expression a lot. It seems that my original regex would have worked with some additional parentheses, like so: `"(^|\n)[ \f\r\t\v]*" + entry + "[ \f\r\t\v]*(\n|$)"`. I'm not sure why the parens work, but I did get the expected behavior. — V. Rubinetti, Jun 11 '17 at 16:57
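
On why the parentheses matter: `|` has the lowest precedence in a regex, so without grouping the original pattern is read as three separate alternatives (`\n`, or `^...entry...\n`, or `$`) rather than as alternatives for just the two ends. A short sketch of the difference, using an assumed sample string and a pre-escaped entry:

```javascript
var ws = "[ \\f\\r\\t\\v]*";
var entry = "youtube\\.com";                 // already escaped for this sketch
var text = "google.com\n  youtube.com  \nnetflix.com";

// Ungrouped: parsed as  \n  |  ^ws entry ws \n  |  $
var ungrouped = new RegExp("\\n|^" + ws + entry + ws + "\\n|$");
// Grouped: (^|\n) ws entry ws (\n|$)
var grouped = new RegExp("(^|\\n)" + ws + entry + ws + "(\\n|$)");

var u = text.match(ungrouped);
var g = text.match(grouped);

console.log(JSON.stringify(u[0]), u.index);  // "\n" 10  (just the first newline)
console.log(JSON.stringify(g[0]), g.index);  // "\n  youtube.com  \n" 10
```

Note that the grouped version includes the surrounding newline characters in the match.
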
Additionally: For my purposes, I needed to get the start and end index of the match in the list string, because I wanted to select the match and run a delete command so the deletion would be added to the browser's undo/redo stack. So instead of using `replace`, I used `match()` and then `indexOf()` and `match.length` to get the start and end indexes. Using my alternative non-multi-line approach in the comment above, you can get the match string, match length, and start index all from one `match()` call. — V. Rubinetti, Jun 11 '17 at 17:06
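
As a rough sketch of the index lookup described in the comment above: the match object returned by `exec()` (or a non-global `match()`) carries an `index` property, so the start and end offsets can be read from one call. The helper names `escapeRegExp` and `findEntryRange` are hypothetical, and this version uses the `m` flag rather than the non-multi-line pattern.

```javascript
// Escape characters that are special inside a regular expression.
function escapeRegExp(text) {
  return text.replace(/[.*+?^${}()|[\]\\]/g, "\\$&");
}

// Hypothetical helper: return the offsets of the first line equal to
// `entry` (ignoring surrounding spaces/tabs), or null if there is none.
function findEntryRange(text, entry) {
  var ws = "[ \\t\\f\\v\\r]*";                       // whitespace, but never "\n"
  var regex = new RegExp("^" + ws + escapeRegExp(entry) + ws + "$", "m");
  var match = regex.exec(text);
  if (match === null) {
    return null;
  }
  return { start: match.index, end: match.index + match[0].length };
}

var sample = "google.com\n   youtube.com   word word\n  netflix.com   \n";
var range = findEntryRange(sample, "netflix.com");
console.log(JSON.stringify(sample.slice(range.start, range.end))); // "  netflix.com   "
console.log(findEntryRange(sample, "youtube.com"));                // null
```

In a browser, the resulting offsets could then be passed to the textarea's `setSelectionRange(start, end)` before issuing the delete command.
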
This answer also includes escaping the special regex characters from the url first, which I did not mention I was doing in this original post. Be sure to do this. — V. Rubinetti, Jun 11 '17 at 17:06
