I'm trying to extract a date, percentage, or number from a string. The strings can be:

  1. the response value 10 (from here I want to extract 10)
  2. the response value 10/12/2014 (from here I want to extract 10/12/2014)
  3. the response value 08/2015 (from here I want to extract 08/2015)

I've written the regex (?:\d{2}\/\d{4}|\d{2}(?:\/\d{2}\/\d{4})?), which matches 12/12/2014, 10, and 02/2012.

I'm also trying to modify the same regex to get 10, 08/2015, and 10/10/2015, but I can't work out how.

How can this be achieved?

  • Could you provide a list of values you want to match? Your current regex does not match `12/12/2014` as you claim, to match `12/12/2014`, you'd need `(\d{2})[\/](\d{2})[\/](\d{4})` – Chase Jun 17 '20 at 15:15
  • You could use an alternation to match all 3 formats `\b(?:\d{2}\/\d{4}|\d{2}(?:\/\d{2}\/\d{4})?)\b` https://regex101.com/r/xqApkK/1 Note that 4 digits also match 9999 – The fourth bird Jun 17 '20 at 15:17

3 Answers

To match your example data, you could use an alternation that matches either 2 digits / 4 digits, or 2 digits followed by an optional part that matches / 2 digits / 4 digits.

\b(?:\d{2}\/\d{4}|\d{2}(?:\/\d{2}\/\d{4})?)\b

Explanation

  • \b Word boundary; prevents the match from being part of a larger word
  • (?: Non capture group
    • \d{2}\/\d{4} Match 2 digits/4 digits
    • | Or
    • \d{2} Match 2 digits
    • (?:\/\d{2}\/\d{4})? Optionally match /2 digits/4 digits
  • ) Close group
  • \b Word boundary

Regex demo

Note that \d{2} and \d{4} accept any digits, so values such as 99 and 9999 also match. If you want to make your match more specific, this page can be helpful: https://www.regular-expressions.info/dates.html

const pattern = /\b(?:\d{2}\/\d{4}|\d{2}(?:\/\d{2}\/\d{4})?)\b/;
[
  "the response value 10",
  "the response value 10/12/2014",
  "the response value 08/2015"
].forEach(s => console.log(s.match(pattern)[0]));
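
As a rough sketch of what "more specific" could look like, the variant below restricts the day to 01-31 and the month to 01-12, and adds a (?!\/) lookahead so that a rejected date such as 99/9999 does not fall back to matching the bare 99. It assumes the dates are in dd/mm/yyyy order, which the question does not state; strictPattern and extractValue are names made up for this example.

```javascript
// Hypothetical stricter variant (assumes dd/mm/yyyy order):
//   dd/mm/yyyy with day 01-31 and month 01-12,
//   or mm/yyyy with month 01-12,
//   or a bare two-digit number that is not followed by a slash.
const strictPattern =
  /\b(?:(?:0[1-9]|[12]\d|3[01])\/(?:0[1-9]|1[0-2])\/\d{4}|(?:0[1-9]|1[0-2])\/\d{4}|\d{2}(?!\/))\b/;

// Returns the first matched value, or null when nothing matches,
// so callers never index into a null match result.
function extractValue(text) {
  const m = text.match(strictPattern);
  return m ? m[0] : null;
}

console.log(extractValue("the response value 10"));         // 10
console.log(extractValue("the response value 10/12/2014")); // 10/12/2014
console.log(extractValue("the response value 08/2015"));    // 08/2015
console.log(extractValue("the response value 99/9999"));    // null
```

It still accepts impossible dates such as 31/02/2014; checking the number of days per month is easier to do in code after the match than in the regex itself.
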
The fourth bird
  • I'm thinking, would `\b(?:(?:\d\d\/){1,2}\d{4}|\d\d)\b` be a good alternative? What are your thoughts on that? I'm not yet familiar enough with regex to tell the difference in performance of nested non-capture groups. – JvdV Jun 17 '20 at 15:51
  • @JvdV That is shorter :-) – The fourth bird Jun 17 '20 at 15:54
  • If so, feel free to use it. Otherwise I'll post it as an alternative. I've also read somewhere that `\d\d` would beat `\d{2}` in performance. What are your thoughts on that? Not sure if this is where I should ask you these things. But I find it interesting =) – JvdV Jun 17 '20 at 15:56
  • @JvdV Off the top of my head, I would not know. Engines can also pre-optimize the pattern, so the number of steps is not always a performance guarantee. Most of the time I look at the data and check what is the least amount of work, so as not to trigger the engine if you know upfront that it will not give a match. This page has some interesting information about optimizations: https://www.rexegg.com/regex-optimizations.html Also, IMHO, chapters 4 to 6 of the book Mastering Regular Expressions contain nice information about that. – The fourth bird Jun 17 '20 at 16:07
  • I also believe the same when it comes to steps as the only benchmark (they are useful, but not always). [This](https://stackoverflow.com/a/37979155/7571182) post might be useful. –  Jun 17 '20 at 18:34

Just for fun (regex is fun), an alternative to the accepted answer:

\b(?:(?:\d\d\/){1,2}\d{4}|\d\d)\b

See the Online Demo

  • \b - Match a word boundary.
  • (?: - 1st Non-capturing group.
    • (?: - 2nd Non-capturing group.
      • \d\d\/ - Match two digits and a literal forward slash.
      • ){1,2} - Close 2nd non-capturing group and use it once or twice.
    • \d{4} - Match four digits.
    • | - Alternation (OR).
    • \d\d) - Match two digits and close 1st non-capturing group.
  • \b - Match a word boundary.

Maybe we can do this even without alternation:

\b\d\d(?:(?:\/\d\d){1,2}\d\d)?\b

See the Online Demo

  • \b - Match a word boundary.
  • \d\d - Match two digits.
  • (?: - 1st Non-capturing group.
    • (?: - 2nd Non-capturing group.
      • \/\d\d - Match a literal slash and two digits.
      • ){1,2} - Close 2nd non-capturing group and use it once or twice.
    • \d\d - Match two digits.
    • )? - Close 1st non-capturing group and make it optional.
  • \b - Match a word boundary.
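
To check that the variants posted in this thread behave the same on the question's sample strings, something like the sketch below could be used. The patterns are copied from the answers; the names patterns, samples, firstMatch and agreement are only illustrative.

```javascript
// The three patterns posted in this thread, keyed by illustrative names.
const patterns = {
  alternation: /\b(?:\d{2}\/\d{4}|\d{2}(?:\/\d{2}\/\d{4})?)\b/,
  repeatedGroup: /\b(?:(?:\d\d\/){1,2}\d{4}|\d\d)\b/,
  noAlternation: /\b\d\d(?:(?:\/\d\d){1,2}\d\d)?\b/,
};

// Sample strings taken from the question.
const samples = [
  "the response value 10",
  "the response value 10/12/2014",
  "the response value 08/2015",
];

// Returns the first match of pattern in text, or null if there is none.
function firstMatch(pattern, text) {
  const m = text.match(pattern);
  return m ? m[0] : null;
}

// One row per sample, one column per pattern.
const agreement = samples.map((s) =>
  Object.values(patterns).map((p) => firstMatch(p, s))
);

samples.forEach((s, i) => console.log(s, "->", agreement[i].join(" | ")));
```

Agreement on three samples says nothing about performance or about inputs outside the question, so this is only a sanity check.
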

JvdV

The match method accepts a RegExp and, without the g flag, returns an array holding the full match followed by the captured groups:

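const date = "12/12/2014";

// arr[0] is the full match; arr[1] to arr[3] are the three captured groups
const arr = date.match(/(\d{2})\/(\d{2})\/(\d{4})/);

console.log(arr[0]); // 12/12/2014
console.log(arr[1]); // 12
console.log(arr[2]); // 12
console.log(arr[3]); // 2014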
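
The pattern above only handles the full dd/mm/yyyy form. As a sketch of how the same capture-group idea might be stretched to all three formats from the question, the example below uses named groups with an optional day part. It assumes dd/mm/yyyy order, and valuePattern and parseValue are hypothetical names introduced here, not something from the answer.

```javascript
// Hypothetical pattern: optional "dd/" prefix, then "mm/yyyy",
// or else a bare two-digit number. Assumes dd/mm/yyyy order.
const valuePattern =
  /\b(?:(?:(?<day>\d{2})\/)?(?<month>\d{2})\/(?<year>\d{4})|(?<number>\d{2}))\b/;

// Returns a small object describing the first value found, or null.
function parseValue(text) {
  const m = text.match(valuePattern);
  if (!m) return null; // match returns null when nothing is found

  const { day, month, year, number } = m.groups;
  if (number !== undefined) {
    return { kind: "number", value: Number(number) };
  }
  // Groups that did not participate in the match are undefined.
  return { kind: day ? "date" : "month", day: day || null, month, year };
}

console.log(parseValue("the response value 10"));
console.log(parseValue("the response value 10/12/2014"));
console.log(parseValue("the response value 08/2015"));
```

Named groups need an engine with ES2018 support; on older engines the same idea works with numbered groups.
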
sonEtLumiere