2

I want a JavaScript regular expression to capture text from a .pdf file. For example, I want to capture this text: 20-40-55 FIG 20

I wrote the following expression for that:

/[\d{2}-\d{2}-\d{2} FIG \d+]/g

It does not match the required text.

If I search for any other text from the .pdf file, it shows up in the found items.

Evan Carslake

2 Answers

3

You should remove the square brackets:

/\d{2}-\d{2}-\d{2} FIG \d+/g

The square brackets create a character class, so the regex matches just one character from that set at a time instead of the whole sequence. See Character Classes or Character Sets.
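
To make the difference concrete, here is a minimal sketch that runs both patterns against the sample string quoted in the comments. The variable names are only illustrative, and the bracketed pattern is used without the `u` flag, as in the question.

```javascript
// Sample text as quoted in the question's comments.
const sample = "(REF 10-52-44 FIG 20 ...some text..)";

// Bracketed version from the question: the whole body is ONE character class,
// so each match is a single character (a digit, "{", "}", "-", "+", space, F, I or G).
const bracketed = /[\d{2}-\d{2}-\d{2} FIG \d+]/g;

// Same pattern without the brackets: matches the full sequence.
const fixed = /\d{2}-\d{2}-\d{2} FIG \d+/g;

const bracketedMatches = sample.match(bracketed);
const fixedMatches = sample.match(fixed);

console.log(bracketedMatches); // many one-character strings
console.log(fixedMatches);     // [ '10-52-44 FIG 20' ]
```

This also explains why the bracketed pattern appears to "find" almost any text: nearly every line contains at least one digit, space, or one of the listed letters.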

Wiktor Stribiżew
  • I tried this as well, but it is not matching the expected text. If I give /welcome/g as the regex, it shows "25 instances found", but if I give this regex, it shows nothing. – Mohammed Jamadar Sep 01 '15 at 07:05
  • And? Did you get expected results? Are you sure you have regular spaces around `FIG`? Are there hard spaces? – Wiktor Stribiżew Sep 01 '15 at 07:05
  • The required text in the PDF looks like this: (REF 10-52-44 FIG 20 ...some text..). It should be matched from there, but it is not. I need only "10-52-44 FIG 20" to be matched. – Mohammed Jamadar Sep 01 '15 at 07:08
  • There are regular spaces. Is there any way to handle hard spaces? If you can suggest another regex, I will check it. – Mohammed Jamadar Sep 01 '15 at 07:13
  • [Here is my post enumerating Unicode spaces](http://stackoverflow.com/a/28618108/3832970). A regex to match them can be `/[\u0020\u00A0\u1680\u180E\u2000-\u200B\u202F\u205F\u3000\uFEFF]/g`. Please check again with `/\d{2}-\d{2}-\d{2}[\s\u0020\u00A0\u1680\u180E\u2000-\u200B\u202F\u205F\u3000\uFEFF]+FIG[\s\u0020\u00A0\u1680\u180E\u2000-\u200B\u202F\u205F\u3000\uFEFF]+\d+/g`. – Wiktor Stribiżew Sep 01 '15 at 07:33
  • Use [pastebin.com](http://pastebin.com/) to share the file contents that should be matched. – Wiktor Stribiżew Sep 01 '15 at 11:02
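
The whitespace-tolerant pattern suggested in the comments above can be tried out with a sketch like the following. The sample string is hypothetical: it assumes the PDF extraction produced a no-break space (U+00A0) before `FIG` and a thin space (U+2009) after it, which is only one of several ways the text could differ from what is visible. The names `SPACE`, `figRef` and `pdfText` are illustrative.

```javascript
// Character class from the comment: \s plus an explicit list of Unicode spaces.
const SPACE = "[\\s\\u0020\\u00A0\\u1680\\u180E\\u2000-\\u200B\\u202F\\u205F\\u3000\\uFEFF]";
const figRef = new RegExp("\\d{2}-\\d{2}-\\d{2}" + SPACE + "+FIG" + SPACE + "+\\d+", "g");

// Hypothetical extracted text (assumption): U+00A0 before FIG, U+2009 after it.
const pdfText = "(REF 10-52-44\u00A0FIG\u200920 ...some text..)";

// The plain-space pattern finds nothing in this text.
const strict = pdfText.match(/\d{2}-\d{2}-\d{2} FIG \d+/g);

// The tolerant pattern finds the reference despite the unusual spaces.
const tolerant = pdfText.match(figRef);

// Collapse any run of those spaces to one ordinary space for display.
const normalized = tolerant.map(m => m.replace(new RegExp(SPACE + "+", "g"), " "));

// Diagnostic: list the code points of the match to see which spaces are present.
const codePoints = [...tolerant[0]].map(
  ch => "U+" + ch.codePointAt(0).toString(16).toUpperCase().padStart(4, "0")
);

console.log(strict);
console.log(normalized);
console.log(codePoints.join(" "));
```

Dumping the code points of the text around `FIG` in the real extracted string is a quick way to check whether the separators are ordinary spaces (U+0020) or something else.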
0

In a regex, square brackets [...] define a character class that matches any single character from the listed options. For example, [abc] is the same as writing a|b|c. The brackets therefore change the meaning of the expression inside them. Remove the brackets.

var text = "test20-40-55 FIG 20test16-67-12 FIG 87";
text.match(/[0-9]{2}-[0-9]{2}-[0-9]{2} FIG [0-9]+/g);

Output:

["20-40-55 FIG 20", "16-67-12 FIG 87"]
Moishe Lipsker
  • I just checked by modifying the expression to /[0-9]{2}-[0-9]{2}-[0-9]{2} FIG [0-9]+/g, but it still does not match a single instance. – Mohammed Jamadar Sep 01 '15 at 06:58