
Hi, I need an ICU regex that I think is pretty basic, but I don't know how to build it correctly. The regex should match strings like:

font-size: 9pt;
font-size: 15pt;
font-size:2pt;
font-size:22pt;

I'm trying something like this, but it doesn't work:

regex = \bfont\-size: [0-9]{3}pt;\b

I'm really new to regex, so I'm not sure what I'm doing wrong here. Any help is much appreciated.

P.S.: Does anyone know a good resource to get the hang of this fast?

Horatiu Paraschiv

2 Answers


font\-size\: ?[0-9]{1,3}pt\;

Should do the trick. Essentially, escape all non-alphanumeric characters (just to be on the safe side; the hyphen, colon, and semicolon are literal here even without the backslash). Also, {1,3} matches the preceding [0-9] one to three times, instead of exactly three times as {3} does.

Edit: Updated the above regex. The trailing \b was removed, and the space before the number was made optional using ?.

Python demonstration:

>>> import re
>>> s = """
... font-size: 9pt;
... font-size: 15pt;
... font-size:2pt;
... font-size:22pt;
... """
>>> re.findall(r"font\-size\: ?[0-9]{1,3}pt\;", s)  # raw string keeps the backslashes for the regex engine
['font-size: 9pt;', 'font-size: 15pt;', 'font-size:2pt;', 'font-size:22pt;']
Håvard
  • Great, thanks! And how do I change this to match the same kind of string but with decimal numbers, e.g. font-size: 9.5pt;? I tried: font\-size\: ?[0-9]{1,3}(\.[0-9]+)?pt\; and it doesn't work. It is so frustrating... – Horatiu Paraschiv Dec 09 '10 at 18:24
  • A very simple approach would be `font\-size\: ?[0-9\.]{1,3}pt\;`, but it would also match stuff like 9.5.2pt. (**EDIT:** Actually it won't match 9.5.2, since the class only matches up to 3 characters, heh.) There's nothing wrong with your regex; perhaps you're using it incorrectly. – Håvard Dec 09 '10 at 18:32
  • Thank you for your time and help! You are right, I was using it wrong. It works now. – Horatiu Paraschiv Dec 09 '10 at 18:36
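
To make the difference between the two decimal variants from the comments concrete, here is a small sketch using Python's `re` module rather than ICU (the constructs used are assumed to behave the same way in both engines). The sample strings and the names `decimal_pattern`, `loose_pattern`, and `matches` are made up for illustration.

```python
import re

# Decimal-aware pattern from the comment above, unchanged:
# 1-3 integer digits, then an optional ".digits" part.
decimal_pattern = re.compile(r"font\-size\: ?[0-9]{1,3}(\.[0-9]+)?pt\;")

# Character-class shortcut: any mix of 1-3 digits and dots.
loose_pattern = re.compile(r"font\-size\: ?[0-9\.]{1,3}pt\;")

samples = [
    "font-size: 9pt;",
    "font-size:9.5pt;",
    "font-size: 12.5pt;",
    "font-size: 9.5.2pt;",
    "font-size: ...pt;",
]


def matches(pattern, text):
    """Return True when the whole string matches the pattern."""
    return pattern.fullmatch(text) is not None


decimal_results = [matches(decimal_pattern, s) for s in samples]
loose_results = [matches(loose_pattern, s) for s in samples]

for sample, d, l in zip(samples, decimal_results, loose_results):
    print(f"{sample!r}: decimal={d} loose={l}")
```

In this sketch the shortcut rejects 12.5pt (four characters) and accepts "...pt", while the pattern with the optional group handles both cases as intended.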

Two problems I see with your regex:

  1. {3} matches exactly three things. You probably want {1,3} to match 1 to 3.

  2. I don't think \b is going to do what you want right after a semicolon. Perhaps you want something like \s* (zero or more whitespace).
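
Both problems can be reproduced in a few lines. This is a rough sketch in Python's `re` module, not ICU, on the assumption that `{3}` and `\b` behave the same way in both for these inputs. The variable names and the `fixed` pattern are illustrative; `fixed` is only one possible repair, not the only one.

```python
import re

# The pattern from the question, unchanged.
original = re.compile(r"\bfont\-size: [0-9]{3}pt;\b")

text = "font-size: 9pt;\nfont-size: 15pt;\nfont-size:2pt;\nfont-size:22pt;\n"

# Problem 1: {3} demands exactly three digits, so none of the samples match.
original_hits = original.findall(text)

# Problem 2: even with three digits, \b after ";" needs a word character
# to follow the semicolon; a newline (or end of input) does not qualify.
boundary_hits = original.findall("font-size: 123pt;\n")
glued_hits = original.findall("font-size: 123pt;x")

# One possible repair: allow 1-3 digits, make the whitespace after the
# colon optional, and drop the trailing \b.
fixed = re.compile(r"\bfont-size:\s*[0-9]{1,3}pt;")
fixed_hits = fixed.findall(text)

print(original_hits, boundary_hits, glued_hits)
print(fixed_hits)
```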

If you want to learn regexes fast, your best bet might be to use a regex debugging tool and experiment.

Laurence Gonsalves