0

I know there are plenty of questions regarding RegEx, but I have been searching for at least three days and I cannot find a solution for my problem.

Given the title of a product, I need to extract some information. To do that I am provided with a list of words; so far so good. The catch is that I need to extract the number that comes right before any of the words in the list.

Example of list:

const words = ['temp', 'temperature', 'temperatures', 'degrees', 'heat', 'heating']

So far, what I have managed is to find some information with a hard-coded regex:

const textToSearch = 'Hair Dryer, 32JVT slopehill Professional Salon Negative Ions Hair Blow Dryer Powerful 1800W for Fast Drying, Lightweight Bioceramic with 3 Heating / 2 Speed/Cool Button, Magnetic Concentrator and Diffuser'
const regex = /(\d+(temp|\s(temp)|temperature|\s(temperature)|degrees|\s(degrees)|heat|\s(heat)|heating|\s(heating)))/g 
const found = textToSearch.match(regex);
if (found) {
  console.log(found[0]); 
}

But the output I get is, for example, '32JVT' and not the expected '3 Heating'. Also, I don't know how to feed in the full list that I am receiving from my API, as this list will vary and change. Another possible problem is that the word may be followed by a symbol such as / and I don't know how that will affect the regular expression.

ZetaPR
  • Could you please add more samples of the strings you're going to parse, the list of words you're going to use, and the result you want to get? Right now it's quite unclear what you expect as a result of the search – Max Jan 30 '20 at 19:00
  • Why is `32JVT` not what you want but `3 Heating` is what you want? Based on the instances with and without `\s`, it looks like you want to allow appearances with and without a space. – apsillers Jan 30 '20 at 19:07

4 Answers

1

You can create a RegExp dynamically from the array of words, like this:

const words = ['temp', 'temperature', 'temperatures', 'degrees', 'heat', 'heating']
const textToSearch = 'Hair Dryer, 32JVT slopehill Professional Salon Negative Ions Hair Blow Dryer Powerful 1800W for Fast Drying, Lightweight Bioceramic with 3 Heating / 2 Speed/Cool Button, Magnetic Concentrator and Diffuser'

const regex = RegExp("\\b(\\d+(\\.\\d+)?)\\s+(" + words.join("|") + ")\\b", "gi");

console.log(textToSearch.match(regex));

The backslashes are escaped because they appear in a string literal. This also matches numbers with decimals, and requires that the word that follows the number is not followed by more letters. So for instance, 3 temperament would not match, even though temp is in the word list.
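
Since the goal is the number itself, here is a minimal sketch of reading it from the first capturing group with `matchAll`. The shortened sample text and the `results` variable are made up for illustration; the pattern is the one built above.

```javascript
const words = ['temp', 'temperature', 'temperatures', 'degrees', 'heat', 'heating'];
// Hypothetical sample text, shortened from the product title and extended with a decimal.
const textToSearch = 'Lightweight Bioceramic with 3 Heating / 2 Speed/Cool Button, up to 87.5 degrees';

// Same pattern as above: group 1 is the number, group 3 is the matched word.
const regex = RegExp("\\b(\\d+(\\.\\d+)?)\\s+(" + words.join("|") + ")\\b", "gi");

// matchAll yields one match object per hit, including its capture groups.
const results = Array.from(textToSearch.matchAll(regex), (m) => ({
  value: Number(m[1]),
  word: m[3].toLowerCase(),
}));

console.log(results);
// [ { value: 3, word: 'heating' }, { value: 87.5, word: 'degrees' } ]
```

Note that `matchAll` requires the `g` flag, which the pattern already has.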

If your word list contains characters that have a special meaning in a regex, like |, ^, $, ..., then make sure to escape those. You can use an escape function for that.
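
A minimal sketch of such an escape function follows. The name `escapeRegExp`, the word list and the sample text are all made up for illustration. The trailing `\b` is left out here on purpose, because a word that ends in punctuation has no word boundary after it.

```javascript
// Hypothetical helper: backslash-escape every character that is special in a regex.
function escapeRegExp(text) {
  return text.replace(/[.*+?^${}()|[\]\\]/g, "\\$&");
}

// Hypothetical word list containing regex metacharacters.
const words = ['temp.', 'heat (max)', 'degrees'];
const pattern = "\\b(\\d+(\\.\\d+)?)\\s+(" + words.map(escapeRegExp).join("|") + ")";
const regex = RegExp(pattern, "gi");

const text = '3 heat (max) and 40 tempo, then 25 temp.';
const matches = text.match(regex);

console.log(matches);
// [ '3 heat (max)', '25 temp.' ]
```

Without the escaping, the `.` in `temp.` would match any character, so `40 tempo` would be reported as well, and the parentheses in `heat (max)` would be read as a group instead of literal characters.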

trincot
  • I am accepting this answer, the only thing that I am missing on it is if there is a way of dynamically giving the word `temperature` to have in the regex something similar to `[tT]emperature` without having to put Temperature and temperature both in the list – ZetaPR Jan 30 '20 at 19:28
  • The `i` flag is already provided, so that it is a case insensitive match. – trincot Jan 30 '20 at 19:29
0

What I would try is using the following syntax:

([1-9]+ +[Hh]eating) for each word. It consists of one or more digits between 1 and 9 (+ means one or more of the preceding term), one or more spaces, and the term Heating or heating. Because [1-9] excludes 0, a number such as 10 will not match; use [0-9]+ if it should.

This works fine for me with your example. You could do the same for the other words and should get a nice result.
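
As a quick check, this sketch runs the pattern against a made-up title and compares it with a `\d+` variant. That variant is an assumption for illustration, not part of the pattern above. The difference shows up as soon as a number contains the digit 0.

```javascript
// Hypothetical title containing both a single-digit number and a number with a 0.
const title = 'Bioceramic with 3 Heating / 2 Speed, also sold with 10 Heating levels';

const original = /[1-9]+ +[Hh]eating/g; // pattern from above: digits 1-9 only
const withZero = /\d+ +[Hh]eating/g;    // assumed variant: \d also accepts 0

const originalMatches = title.match(original);
const withZeroMatches = title.match(withZero);

console.log(originalMatches); // [ '3 Heating' ]
console.log(withZeroMatches); // [ '3 Heating', '10 Heating' ]
```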

PaulS
0

You can use (\d*\s|) to match the numbers preceding the words. I think your searches are also case insensitive.

const words = ['temp', 'temperature', 'temperatures', 'degrees', 'heat', 'heating'];

const textToSearch = 'Hair Dryer, 32JVT slopehill Professional Salon Negative Ions Hair Blow Dryer Powerful 1800W for Fast Drying, Lightweight Bioceramic with 34 Heating / 2 Speed/Cool Button, Magnetic Concentrator and Diffuser, 87 degrees'

const regex = /(\d*\s|)(temp|temperature|temperatures|degrees|heat|heating)/gi;
const found = textToSearch.match(regex);
if (found) {
  console.log(found); 
}
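
One thing to watch with this pattern: alternatives are tried from left to right and nothing anchors the end of the word, so `heat` wins over `heating` and `temp` over `temperature`. The sketch below shows the effect and one possible workaround, which is to sort the words longest first. The sample text and the `byLength` name are made up for illustration.

```javascript
const words = ['temp', 'temperature', 'temperatures', 'degrees', 'heat', 'heating'];
// Hypothetical sample text.
const text = 'with 34 Heating / 2 Speed, 87 degrees, 20 temperatures';

// Original order: the shorter prefix is tried first and cuts the match short.
const inListOrder = new RegExp("(\\d*\\s|)(" + words.join("|") + ")", "gi");
const truncated = text.match(inListOrder);
console.log(truncated); // [ '34 Heat', '87 degrees', '20 temp' ]

// Longest first, so 'heating' is tried before 'heat'.
const byLength = [...words].sort((a, b) => b.length - a.length);
const longestFirst = new RegExp("(\\d*\\s|)(" + byLength.join("|") + ")", "gi");
const full = text.match(longestFirst);
console.log(full); // [ '34 Heating', '87 degrees', '20 temperatures' ]
```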
Addis
0

const words = ['temp', 'temperature', 'temperatures', 'degrees', 'heat', 'heating'];
const words_re = words.join('|')

const textToSearch = 'Hair Dryer, 32JVT slopehill Professional Salon Negative Ions Hair Blow Dryer Powerful 1800W for Fast Drying, Lightweight Bioceramic with 3 Heating / 2 Speed/Cool Button, Magnetic Concentrator and Diffuser'
const regex = new RegExp('\\d+\\s*\\b(?:' + words_re + ')\\b', 'gi');
console.log(textToSearch.match(regex)[0]); 
Toto
  • Is there anything here that I did not include in the answer I posted 8 minutes before? – trincot Jan 30 '20 at 19:32
  • @trincot: sure not, I was posting my answer when you've posted yours and I avoid useless capturing groups. – Toto Jan 30 '20 at 19:33