
I'm trying to match all the words starting with # and the phrases enclosed between two # characters (see the example):

var str = "#The test# rain in #SPAIN stays mainly in the #plain"; 
var res = str.match(/(#)[^\s]+/gi);

The result is ["#The", "#SPAIN", "#plain"], but it should be ["#The test#", "#SPAIN", "#plain"].

Extra: it would be nice if the results came back without the #.

Does anyone have a solution for this?

Tushar
E. Fortes
  • Well, try [`#\w+(?:(?: +\w+)*#)?`](https://regex101.com/r/mY5zL9/1). I doubt it is that easy to get rid of the trailing `#` with a regex. – Wiktor Stribiżew Feb 12 '16 at 12:41
  • @E.Fortes The answer is by Wiktor, not me; he has added an answer with an explanation. Upvote and accept it if it solved the problem. – Tushar Feb 12 '16 at 12:55

2 Answers


You can use

/#\w+(?:(?: +\w+)*#)?/g

See the demo here

The regex matches:

  • # - a hash symbol
  • \w+ - one or more word characters (letters, digits, or underscores)
  • (?:(?: +\w+)*#)? - an optional group (zero or one occurrence) consisting of:
    • (?: +\w+)* - zero or more occurrences of one or more spaces followed by one or more word characters, and then
    • # - a closing hash symbol
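
As a quick check, the pattern can be run against the string from the question. This is only a minimal sketch for Node.js or a browser console; the names hashPattern, input, and found are illustrative and do not come from the answer.

```javascript
// Pattern from the answer: a #word, optionally extended to a "#several words#" span.
const hashPattern = /#\w+(?:(?: +\w+)*#)?/g;
const input = "#The test# rain in #SPAIN stays mainly in the #plain";

// With the g flag, String.prototype.match returns all full matches,
// or null when nothing matches, hence the fallback to an empty array.
const found = input.match(hashPattern) || [];
console.log(found); // [ '#The test#', '#SPAIN', '#plain' ]
```

The optional group only succeeds when a closing # directly follows the last word, which is why "#SPAIN stays mainly in the" falls back to just "#SPAIN".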

NOTE: If the tags can contain characters other than word characters (that is, anything outside [A-Za-z0-9_]), you can replace \w with [^ #]:

/#[^ #]+(?:(?: +[^ #]+)*#)?/g

See another demo

var re = /#[^ #]+(?:(?: +[^ #]+)*#)?/g;
var str = '#The test-mode# rain in #SPAIN stays mainly in the #plain #SPAIN has #the test# and more #here';
var m = str.match(re);
if (m) {
  // Get rid of the trailing # (ES6 arrow function)
  m = m.map(s => s.replace(/#$/, ''));

  // ES5 equivalent:
  // m = m.map(function (s) {
  //   return s.replace(/#$/, '');
  // });

  document.body.innerHTML = "<pre>" + JSON.stringify(m, null, 4) + "</pre>";
}
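
The snippet above drops only the trailing #. For the "Extra" part of the question (results without any #), one possible variation is to strip the hash at both ends of each match. The following is a sketch under that assumption, written to run outside the browser; stripHashes, tagPattern, and sample are hypothetical names, not part of the original answer.

```javascript
const tagPattern = /#[^ #]+(?:(?: +[^ #]+)*#)?/g;

// Hypothetical helper: returns every match with its leading
// and (if present) trailing # removed.
function stripHashes(text) {
  const matches = text.match(tagPattern) || [];
  return matches.map((s) => s.replace(/^#|#$/g, ""));
}

const sample = "#The test-mode# rain in #SPAIN stays mainly in the #plain";
console.log(stripHashes(sample)); // [ 'The test-mode', 'SPAIN', 'plain' ]
```

Stripping after matching keeps the pattern itself unchanged, at the cost of one extra replace per match.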
Wiktor Stribiżew

You can also try this regex.

#(?:\b[\s\S]*?\b#|\w+)

See the demo at regex101 (use it with the g global flag)
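
To compare this alternative with the expected output from the question, here is a small sketch with the g flag set on the regex literal. The names altPattern, question, and altMatches are made up for illustration.

```javascript
// Alternative pattern: either a "#...#" span delimited at word boundaries,
// or a single #word when no such closing # can be found.
const altPattern = /#(?:\b[\s\S]*?\b#|\w+)/g;
const question = "#The test# rain in #SPAIN stays mainly in the #plain";

const altMatches = question.match(altPattern) || [];
console.log(altMatches); // [ '#The test#', '#SPAIN', '#plain' ]
```

Here "#SPAIN" is not extended up to "#plain" because the closing # must be preceded by a word boundary, and the character before "#plain" is a space.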

bobble bubble