0

Is there any regex I can use to match blocks of exactly 10 digits? For instance, I have this:

/\d{10}(?!\d+)/g

And this matches 2154358383 when given 2154358383, which is fine, but it also matches 1213141516 when given 12345678910111213141516, which I don't want.

What I think I need is a look-behind assertion (in addition to the lookahead I already have) that checks that the character preceding the match is not a digit, but I can't figure out how to do that.

I tried

/(?:[^\d]+)\d{10}(?!\d+)/g

But that broke my first match of 2154358383, which is bad.

How can I write this to match only runs of exactly 10 digits (no more, no less) with unknown boundaries?

I should also note that I'm trying to extract these out of a much larger string, so ^ and $ are out of the question.

neezer
  • 3
    What do you mean by unknown boundaries? Should "abc1234567890def" match? – robert Jul 10 '12 at 19:49
  • Does it matter what it matches, or just that it matches in the first place? – robert Jul 10 '12 at 19:51
  • "unknown boundaries" in the sense that it's not wrapped in something predictable, like quotes or something... so that I could just match inside the quotes, or from the beginning and ending of the line, etc. Yes, your example `abc1234567890def` should match, but `abc123456789012334567890def` should not. – neezer Jul 10 '12 at 19:52
  • Amending my previous comment: the first given example should match as `1234567890`, not as-is. – neezer Jul 10 '12 at 20:31

8 Answers

4

This should work: ([^\d]|^)\d{10}([^\d]|$)

aquinas
3

Could you do something like:

([^\d]|^)(\d{10})([^\d]|$)

In other words, the beginning of the string or a non-digit, ten digits, then the end of the string or a non-digit. That should solve the cases you looked for above.

You can use the regex like this:

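var regex = /([^\d]|^)(\d{10})([^\d]|$)/;
var match = regex.exec(s); // s is the string being searched
// Group 2 holds only the ten digits; exec returns null when nothing matches.
var digits = match ? match[2] : null;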
Moishe Lettvin
  • This matches the lead and trailing character before and after the match, so `abc1234567890def` returns as `c1234567890d`. I tried this `(?:[^\d]|^)(\d{10})(?:[^\d]|$)`, but I had the same problem. – neezer Jul 10 '12 at 20:30
1

This should match numbers at the beginning of the string (the ^) or in the middle/end (the [^\d] and the (?!\d)). If you care about the exact match and not just whether it matches at all, you'll need to grab the first group in the match.

/(?:[^\d]|^)(\d{10})(?!\d)/g

This would be easier if JavaScript regular expressions supported lookbehind.
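As a rough illustration of "grab the first group", here is a minimal sketch that loops over every match and keeps only group 1. The helper name extractTenDigitRuns is made up for this example; the pattern is the one shown above.

```javascript
// Hypothetical helper (not from the answer): collect every run of
// exactly ten digits found anywhere in `text`.
function extractTenDigitRuns(text) {
  // A non-digit or the start of the string, then ten digits captured in
  // group 1, then a lookahead that rejects an eleventh digit.
  var re = /(?:[^\d]|^)(\d{10})(?!\d)/g;
  var results = [];
  var match;
  while ((match = re.exec(text)) !== null) {
    results.push(match[1]); // group 1 leaves out the leading character
  }
  return results;
}

console.log(extractTenDigitRuns("abc1234567890def"));        // ["1234567890"]
console.log(extractTenDigitRuns("12345678910111213141516")); // []
```

Because the trailing check is a lookahead, it does not consume the character after the digits, so two runs separated by a single character (for example "1234567890,2234567890") should both be picked up.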

robert
  • This still matches the leading character (not part of the match), so `abc1234567890def` comes back as `c1234567890`. I wouldn't think so, since you're wrapping it in `(?:)`, though... thoughts? – neezer Jul 10 '12 at 20:28
  • 1
    @neezer as I said, you need to grab the first group in the match rather than just using the match verbatim. See [How do you access the matched groups in a javascript regex?](http://stackoverflow.com/questions/432493/how-do-you-access-the-matched-groups-in-a-javascript-regex) for details on how to do this. – robert Jul 10 '12 at 20:39
1

What about the following?

perl -nle 'print if /(\b|\D)(\d{10})(\D|\b)/' <<EOF
123456789
x123456789
123456789x
1234567890
x1234567890
1234567890x
12345678901
x12345678901
x12345678901x
EOF

will print only

1234567890
x1234567890
1234567890x
cajwine
0

Try this

var re = /(?:^|[^\d])(\d{10})(?:$|[^\d])/g

re.exec ( "2154358383")
//["2154358383", "2154358383"]
re.exec ( "12345678910111213141516" )
//null
re.exec ( "abc1234567890def" )
//["c1234567890d", "1234567890"]

re.lastIndex = 0 // reset: with /g, exec resumes from where the previous match ended
var val = '1234567890 sdkjsdkjfsl 2234567890 323456789000 4234567890';
re.exec ( val )
//["1234567890 ", "1234567890"]
re.exec ( val )
//[" 2234567890 ", "2234567890"]
re.exec ( val )
//[" 4234567890", "4234567890"]
re.exec ( val )
//null
Esailija
0

I know you said "no ^" but maybe it's okay if you use it like this?:

rx = /(?:^|\D)(\d{10})(?!\d)/g

Here's a quick test:

> val = '1234567890 sdkjsdkjfsl 2234567890 323456789000 4234567890'
'1234567890 sdkjsdkjfsl 2234567890 323456789000 4234567890'
> rx.exec(val)[1]
'1234567890'
> rx.exec(val)[1]
'2234567890'
> rx.exec(val)[1]
'4234567890'
danfuzz
-1

Simple with lookbehind:

/(?<!\d)\d{10}(?!\d)/g
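The comment below reports a SyntaxError because JavaScript engines at the time did not implement lookbehind. Lookbehind assertions were later added to the language in ES2018, so on an engine that implements them the pattern should work as written. A minimal sketch, assuming such an engine (the variable names are only for illustration, and the test string is borrowed from another answer):

```javascript
// Assumes an engine that implements ES2018 lookbehind assertions;
// older engines throw a SyntaxError when parsing this literal.
var tenDigits = /(?<!\d)\d{10}(?!\d)/g;

var val = "1234567890 sdkjsdkjfsl 2234567890 323456789000 4234567890";

// With the g flag, match() returns the full matches (or null if none),
// so no capture group is needed to strip a leading character.
var found = val.match(tenDigits) || [];
console.log(found); // ["1234567890", "2234567890", "4234567890"]
```

Neither assertion consumes a character, so each match is exactly the ten digits.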
Andrew Cheong
  • JS doesn't seem to support lookbehinds `(?<!\d)`; doing that exits my tests with `SyntaxError: Invalid regular expression: /(?<!\d)\d{10}(?!\d)/: Invalid group` – neezer Jul 10 '12 at 19:57
-3

I would cheat and do something like

if (myvar.toString().substring(0, 10) === "1234567890") ....

:)

Losbear
  • 3
    That doesn't make any sense. Sorry. – Kobi Jul 10 '12 at 19:53
  • @Kobi maybe he's going for the [Peer Pressure badge](http://stackoverflow.com/badges/38/peer-pressure) – robert Jul 10 '12 at 19:56
  • oh poop - come on, i even added a smilie face! I was saying I would cheat by converting it to a string and then comparing the first 10 digits to whatever i was comparing it to. If it wasn't always the first 10 digits, I could do a IndexOf() on it. Dang - didn't expect to get downvoted :( – Losbear Jul 10 '12 at 20:01
  • I think you've missed the point. Completely. It's 10 arbitrary digits. Not 10 digits he knows in advance. – Andrew Cheong Jul 10 '12 at 21:06