You must precisely define the groups that you want to extract before and after the word. If you define the group before the word as four or more non-whitespace characters, and the group after the word as one or more non-whitespace characters, you can use the following regular expression.
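// Assumes that word contains no regular-expression metacharacters.
var re = new RegExp('(\\S{4,})\\s+(?:\\S{1,3}\\s+)*?' + word + '.*?(\\S+)', 'i');
var groups = re.exec(text);
if (groups !== null) {
  // groups[1] is the group before the word; groups[2] is the group after it.
  var result = groups[1] + groups[2];
}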
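Because the word is concatenated directly into the pattern, any regular-expression metacharacters in it (such as . or *) would be interpreted as regex syntax. The sketch below shows one way to guard against that. The helpers escapeRegExp and buildPattern are hypothetical names introduced here for illustration; they are not part of the original answer.

```javascript
// Hypothetical helper: escape regex metacharacters so the word is matched literally.
function escapeRegExp(s) {
  return s.replace(/[.*+?^${}()|[\]\\]/g, '\\$&');
}

// Hypothetical helper: build the same pattern as above around an escaped word.
function buildPattern(word, flags) {
  return new RegExp(
    '(\\S{4,})\\s+(?:\\S{1,3}\\s+)*?' + escapeRegExp(word) + '.*?(\\S+)',
    flags
  );
}

// The dot in 'a.b' now matches only a literal dot.
var groups = buildPattern('a.b', 'i').exec('abcd -- a.b 42');
console.log(groups[1], groups[2]); // abcd 42

// Without escaping, 'a.b' would also match 'axb'; with escaping it does not.
var noMatch = buildPattern('a.b', 'i').exec('abcd -- axb 42');
console.log(noMatch); // null
```

If the word is always a plain alphanumeric token, the escaping step changes nothing and can be skipped.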
Let me break down the regular expression. Note that we have to escape the backslashes because we're writing a regular expression inside a string.
- (\\S{4,}) captures a group of four or more non-whitespace characters
- \\s+ matches one or more whitespace characters
- (?: indicates the start of a non-capturing group
- \\S{1,3} matches one to three non-whitespace characters
- \\s+ matches one or more whitespace characters
- )*? makes the non-capturing group match zero or more times, as few times as possible
- word matches whatever was in the variable word when the regular expression was compiled
- .*? matches any character (except line terminators) zero or more times, as few times as possible
- (\\S+) captures one or more non-whitespace characters
- the 'i' flag makes this a case-insensitive regular expression
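To make the backslash escaping concrete, here is a small sketch comparing the string-built pattern with the equivalent regex literal, in which each backslash is written only once. The input string is a fragment of the sample text used later in this answer, and the word is fixed to 'IDENTIFIER' for illustration.

```javascript
var word = 'IDENTIFIER';

// Built from a string: every backslash must be doubled.
var fromString = new RegExp('(\\S{4,})\\s+(?:\\S{1,3}\\s+)*?' + word + '.*?(\\S+)', 'i');

// Written as a literal: single backslashes, but the word is hard-coded.
var literal = /(\S{4,})\s+(?:\S{1,3}\s+)*?IDENTIFIER.*?(\S+)/i;

// Both forms compile to the same pattern source.
console.log(fromString.source === literal.source); // true

var m = fromString.exec('10.802.123/3843- 1 -- IDENTIFIER 48 A BC');
console.log(m[1]); // 10.802.123/3843-
console.log(m[2]); // 48
```

Here the short tokens "1" and "--" are consumed by the non-capturing group, so the first capture is the nearest token of four or more characters before the word.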
Observe that our use of the ? modifier, which makes the preceding quantifier lazy (*? instead of *), allows us to capture the nearest groups before and after the word.
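The effect of the lazy modifier is easiest to see by comparing it with a greedy variant. In this sketch, which uses a short made-up input, the only difference between the two patterns is .*? versus .* after the word.

```javascript
var text = 'abcd -- IDENTIFIER 48 A BC';

// Lazy: .*? consumes as little as possible, so the capture is the nearest token.
var lazy = /(\S{4,})\s+(?:\S{1,3}\s+)*?IDENTIFIER.*?(\S+)/i.exec(text);
console.log(lazy[2]); // 48

// Greedy: .* runs to the end of the text and gives back only one character.
var greedy = /(\S{4,})\s+(?:\S{1,3}\s+)*?IDENTIFIER.*(\S+)/i.exec(text);
console.log(greedy[2]); // C
```

With the greedy form, the second group ends up holding only the last non-whitespace character of the text, which is rarely what is wanted.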
You can match the regular expression globally in the text by adding the g flag. The snippet below demonstrates how to extract all matches.
function forward_and_backward(word, text) {
  var re = new RegExp('(\\S{4,})\\s+(?:\\S{1,3}\\s+)*?' + word + '.*?(\\S+)', 'ig');
  // Find all matches and make an array of results.
  // With the g flag, each call to exec resumes from re.lastIndex.
  var results = [];
  while (true) {
    var groups = re.exec(text);
    if (groups === null) {
      return results;
    }
    var result = groups[1] + groups[2];
    results.push(result);
  }
}
var sampleText = " GPX 10.802.123/3843- 1 -- IDENTIFIER 48 A BC 444.2345.1.1/99x 28 - - Identifier 580 X Y Z 9.22.16.1043/73+ 0 *** identifier 6800";
var results = forward_and_backward('IDENTIFIER', sampleText);
for (var i = 0; i < results.length; ++i) {
  document.write('result ' + i + ': "' + results[i] + '"<br><br>');
}
body {
font-family: monospace;
}
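As an alternative to the exec loop, environments that support String.prototype.matchAll (ES2020 and later) can collect the same results more compactly. This is a sketch under that assumption; the function name forwardAndBackwardAll is hypothetical, and the word is still assumed to contain no regex metacharacters.

```javascript
// Hypothetical variant of forward_and_backward built on matchAll.
function forwardAndBackwardAll(word, text) {
  // matchAll requires the g flag on the regular expression.
  var re = new RegExp('(\\S{4,})\\s+(?:\\S{1,3}\\s+)*?' + word + '.*?(\\S+)', 'ig');
  // Map each match to the concatenation of its two captured groups.
  return Array.from(text.matchAll(re), function (groups) {
    return groups[1] + groups[2];
  });
}

var sample = " GPX 10.802.123/3843- 1 -- IDENTIFIER 48 A BC 444.2345.1.1/99x 28 - - Identifier 580 X Y Z 9.22.16.1043/73+ 0 *** identifier 6800";
var all = forwardAndBackwardAll('IDENTIFIER', sample);
console.log(all);
// [ '10.802.123/3843-48', '444.2345.1.1/99x580', '9.22.16.1043/73+6800' ]
```

The exec loop remains the safer choice when older browsers must be supported.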