1

Does Python's regular expression syntax have anything equivalent for matching numbers in a given range?

For example in bash, you can match test19.txt, test20.txt, test21.txt by test{19..21}.txt

I am not looking for a regular expression that matches all digits, like [1-2][0-9].

I want to match only a particular series of numbers starting from some number to another.

Update: The final aim is to create a regexp object with re.compile(), so that I can use it to search a big list of strings.

indiajoe
  • 3
    Note that what you show isn't a regex, but a shell expansion. Bash doesn't do regexes. – Marcin Oct 28 '13 at 21:48
  • @Marcin Thanks, True. I was looking for the regexp substitute for shell expansion. – indiajoe Oct 28 '13 at 22:14
  • 1
    @Marcin Bash does do regex, both for glob expansion (with extglob) and through `=~`. This -- brace expansion -- however, doesn't match anything. It merely iterates string combination. – that other guy Oct 28 '13 at 23:01
  • 1
    `extglob` does not use regexes, but so-called extended patterns. They provide some, but not all, of the functionality that regexes have compared to basic patterns. – chepner Oct 29 '13 at 12:38

4 Answers

5
['test' + str(i) + '.txt' for i in range(19, 22)]

Will give you that list:

['test19.txt', 'test20.txt', 'test21.txt']

So you can check whether each file name is in that list. For example, if you have a list of strings named words and want to keep only those that match:

r = ['test' + str(i) + '.txt' for i in range(19, 22)]
[x for x in words if x in r]

But if you really want a regexp:

re.compile('|'.join(re.escape('test' + str(i) + '.txt') for i in range(19, 22)))
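
As a minimal sketch of how that compiled alternation could be used on a big list of strings: the helper name build_range_regex and the sample names below are made up for illustration. Each alternative is passed through re.escape so the dot in ".txt" is matched literally, and fullmatch is used so that longer strings containing a matching name are rejected.

```python
import re

def build_range_regex(prefix, lo, hi, suffix):
    """Compile an alternation matching prefix + N + suffix for lo <= N <= hi."""
    alternatives = (
        re.escape("{}{}{}".format(prefix, i, suffix)) for i in range(lo, hi + 1)
    )
    return re.compile("|".join(alternatives))

# Hypothetical sample data; the real list would come from elsewhere.
names = [
    "test18.txt", "test19.txt", "test20.txt", "test21.txt",
    "test22.txt", "test20Xtxt", "mytest20.txt",
]

pattern = build_range_regex("test", 19, 21, ".txt")
matches = [n for n in names if pattern.fullmatch(n)]
print(matches)  # ['test19.txt', 'test20.txt', 'test21.txt']
```

The pattern grows by one alternative per number, so for very wide ranges it may be simpler to capture the digits and compare them as an integer instead.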
Maxime Chéramy
  • Actually, I wanted to generate a regular expression to match them, something like re.compile('the regexp magic'), so that I can use it for matching in a huge list of strings. – indiajoe Oct 28 '13 at 21:49
  • 1
    You can surely construct your regexp using the trick I gave and a regexp or. If you only have a list of strings and want to filter those that match `test{19..21}.txt`, then use `in ['test' + str(i) + '.txt' for i in range(19, 22)]` in a for loop or list comprehension. – Maxime Chéramy Oct 28 '13 at 21:52
  • 1
    Thanks a lot. Yes, I think I shall use the method you suggested to generate a list of regexps and use | to combine them. – indiajoe Oct 28 '13 at 22:12
1

There is a similar question (Regular Expression: Numeric Range) whose answers recommend using a regular expression only to detect that a number is present, with something along the lines of \d{1,3}. However, this answer points to the command-line tool rgxg, which can generate regular expressions that match a specified number range.
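
For readers who want to stay within standard Python, here is a rough sketch of the same idea: a function that turns a non-negative integer range into a regex fragment. This is not rgxg's algorithm or output format; the function name range_pattern and its helpers are made up for this example, and it assumes numbers are written without leading zeros.

```python
import re

def _any_digits(n):
    """Pattern for exactly n arbitrary ASCII digits."""
    if n == 0:
        return ""
    return "[0-9]" if n == 1 else "[0-9]{%d}" % n

def _digit_class(lo, hi):
    """Pattern for one digit between lo and hi inclusive."""
    return str(lo) if lo == hi else "[%d-%d]" % (lo, hi)

def _alternation(parts):
    """Join alternatives, grouping only when there is more than one."""
    return parts[0] if len(parts) == 1 else "(?:" + "|".join(parts) + ")"

def _same_length(a, b):
    """Pattern for digit strings s with len(s) == len(a) == len(b), a <= s <= b."""
    if a == b:
        return a
    if a == "0" * len(a) and b == "9" * len(b):
        return _any_digits(len(a))
    if a[0] == b[0]:
        return a[0] + _same_length(a[1:], b[1:])
    n = len(a) - 1
    lo_d, hi_d = int(a[0]), int(b[0])
    parts = []
    if a[1:] != "0" * n:
        # Lowest leading digit: the rest must be at least a[1:].
        parts.append(a[0] + _same_length(a[1:], "9" * n))
        lo_d += 1
    upper = None
    if b[1:] != "9" * n:
        # Highest leading digit: the rest must be at most b[1:].
        upper = b[0] + _same_length("0" * n, b[1:])
        hi_d -= 1
    if lo_d <= hi_d:
        # Leading digits strictly inside the range: any remaining digits.
        parts.append(_digit_class(lo_d, hi_d) + _any_digits(n))
    if upper is not None:
        parts.append(upper)
    return _alternation(parts)

def range_pattern(lo, hi):
    """Regex fragment matching the decimal form of every integer in [lo, hi]."""
    if lo < 0 or lo > hi:
        raise ValueError("need 0 <= lo <= hi")
    parts = []
    for length in range(len(str(lo)), len(str(hi)) + 1):
        start = max(lo, 10 ** (length - 1) if length > 1 else 0)
        end = min(hi, 10 ** length - 1)
        parts.append(_same_length(str(start), str(end)))
    return _alternation(parts)

# Hypothetical sample data for illustration.
names = ["test18.txt", "test19.txt", "test20.txt",
         "test21.txt", "test22.txt", "test210.txt"]
rx = re.compile(re.escape("test") + range_pattern(19, 21) + re.escape(".txt"))
selected = [n for n in names if rx.fullmatch(n)]
print(selected)  # ['test19.txt', 'test20.txt', 'test21.txt']
```

Unlike one alternative per number, the size of this pattern depends on the number of digits in the bounds rather than on the width of the range.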

Community
ojdo
  • Thanks. rgxg looks very nice, but I was thinking of using only standard Python. – indiajoe Oct 28 '13 at 22:13
  • Maybe you could take the code that implements this feature from the C source and reimplement it as a (I would assume quite) short Python function? – ojdo Oct 29 '13 at 11:52
1

Assume you have these files:

$ cd test
$ touch file{1..25}.txt
$ ls
file1.txt   file14.txt  file19.txt  file23.txt  file5.txt
file10.txt  file15.txt  file2.txt   file24.txt  file6.txt
file11.txt  file16.txt  file20.txt  file25.txt  file7.txt
file12.txt  file17.txt  file21.txt  file3.txt   file8.txt
file13.txt  file18.txt  file22.txt  file4.txt   file9.txt

You can use glob to match the general pattern file[numbers].txt:

import glob
import os
import re

os.chdir('/Users/andrew/test')

print(glob.glob('file[0-9]*.txt'))
# ['file1.txt', 'file10.txt', 'file11.txt', 'file12.txt', 'file13.txt', 'file14.txt', 'file15.txt', 'file16.txt', 'file17.txt', 'file18.txt', 'file19.txt', 'file2.txt', 'file20.txt', 'file21.txt', 'file22.txt', 'file23.txt', 'file24.txt', 'file25.txt', 'file3.txt', 'file4.txt', 'file5.txt', 'file6.txt', 'file7.txt', 'file8.txt', 'file9.txt']

Then use a list comprehension with regex to narrow that list:

def expand(x, lo=0, hi=float('inf')):
    # True if the first run of digits in x, read as an integer, lies in [lo, hi]
    return lo <= int(re.search(r'\d+', x).group(0)) <= hi

print([e for e in glob.glob('file[0-9]*.txt') if expand(e, 8, 12)])
# ['file10.txt', 'file11.txt', 'file12.txt', 'file8.txt', 'file9.txt']

Or use filter:

print(list(filter(lambda x: expand(x, 9, 12), glob.glob('file[0-9]*.txt'))))
# ['file10.txt', 'file11.txt', 'file12.txt', 'file9.txt']
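
Since the question is about searching a big list of strings rather than the file system, the same capture-and-compare idea can be sketched without glob. The names NUMBERED_FILE and number_in_range and the sample list are assumptions for illustration; the regex only extracts the number, and the range check is done in plain Python.

```python
import re

# Captures the numeric part of names shaped like file<digits>.txt
NUMBERED_FILE = re.compile(r"file([0-9]+)\.txt")

def number_in_range(name, lo, hi):
    """True if name looks like file<N>.txt with lo <= N <= hi."""
    m = NUMBERED_FILE.fullmatch(name)
    return m is not None and lo <= int(m.group(1)) <= hi

# Hypothetical sample data for illustration.
names = ["file7.txt", "file8.txt", "file10.txt", "file12.txt",
         "file13.txt", "file9.txt.bak", "notes.txt"]

selected = [n for n in names if number_in_range(n, 8, 12)]
print(selected)  # ['file8.txt', 'file10.txt', 'file12.txt']
```

Names that do not fit the pattern at all are skipped instead of raising an error, which re.search(...).group(0) would do on a string with no digits.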
dawg
0

What are you looking for?

There is always range(19, 22), which, depending on what you need, comes close to brace expansion.

Alexander Oh