Given a string, how do I extract all sequences of exactly 4 digits?

That is, for 1234 12 12345 1bc5 9876 I want to get [1234, 9876].

I got as far as re.findall(r'\D\d\d\d\d\D', text), but that fails at text boundaries (when there is no character before or after a match).


A solution in Python 2.7 is preferred, but I guess this is pretty general, so any language will do.

user124114

1 Answer

The general answer is surprisingly complicated; see here for more info. However, in this particular case we can simply use a word-boundary assertion, \b:

re.findall(r'\b\d{4}\b', ....)
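
As a rough illustration, here is the \b pattern applied to the sample input from the question, next to a lookaround variant that is not part of the answer above. The function names four_digit_words and four_digit_runs are made up for this sketch. The lookaround version only requires that the neighbouring characters are not digits, so it also accepts runs glued to letters or underscores, which \b rejects because it treats those as word characters. Which behaviour is wanted depends on the data.

```python
import re

def four_digit_words(text):
    # Pattern from the answer: four digits delimited by word boundaries.
    return re.findall(r'\b\d{4}\b', text)

def four_digit_runs(text):
    # Hypothetical alternative: four digits with no digit directly
    # before or after them (negative lookbehind and lookahead).
    return re.findall(r'(?<!\d)\d{4}(?!\d)', text)

sample = "1234 12 12345 1bc5 9876"
print(four_digit_words(sample))            # ['1234', '9876']
print(four_digit_runs(sample))             # ['1234', '9876']

# The two differ when digits touch letters or underscores.
print(four_digit_words("id1234_x 56789"))  # []
print(four_digit_runs("id1234_x 56789"))   # ['1234']
```

Both patterns work at the start and end of the string, because \b and the lookarounds are zero-width assertions and do not need an actual character to consume there.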
georg