Python matching some characters into a string

Question

I'm trying to extract data from a string using a regular expression, but I can't get it to work.

I want to extract i386 from the following string (the text between the last - and .iso):

/xubuntu/daily/current/lucid-alternate-i386.iso

This should also work in case of:

/xubuntu/daily/current/lucid-alternate-amd64.iso

The result should be either i386 or amd64, depending on the input.

Thanks a lot for your help.

score 3 · Answer 1 · edited May 23 '17 at 12:07

You could also use split in this case (instead of regex):

>>> str = "/xubuntu/daily/current/lucid-alternate-i386.iso"
>>> str.split(".iso")[0].split("-")[-1]
'i386'

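split returns a list of the pieces of the string found between occurrences of the separator. Indexing that list ([0] for the first piece, [-1] for the last) then gets you the part you need. Note that naming a variable str shadows the built-in type, so a different name is preferable in real code.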
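As a minimal sketch, the same split approach can be wrapped in a helper and checked against both paths from the question. The function name `arch_from_path` is made up for illustration and is not part of the original answer.

```python
def arch_from_path(path):
    """Return the text between the last '-' and the '.iso' suffix."""
    # Drop everything from '.iso' onward, then keep what follows the last dash.
    return path.split(".iso")[0].split("-")[-1]


print(arch_from_path("/xubuntu/daily/current/lucid-alternate-i386.iso"))   # i386
print(arch_from_path("/xubuntu/daily/current/lucid-alternate-amd64.iso"))  # amd64
```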

edited May 23 '17 at 12:07 by Community

answered May 27 '10 at 22:15 by ChristopheD

`str.rsplit('.iso', 1)[0].rsplit('-', 1)[-1]` – jfs May 27 '10 at 22:44
`str.rpartition('.iso')[0].rpartition('-')[-1]` – jfs May 27 '10 at 22:45

score 1 · Answer 2 · answered May 27 '10 at 22:03

r"/([^-]*)\.iso/"

The bit you want will be in the first capture group.

answered May 27 '10 at 22:03 by Amber

Were you trying to use `match()` or `search()`? Since this is a partial-match pattern, it should be used with `search()`, not `match()` (since `match()` only matches at the beginning of the string, whereas `search()` looks for a match anywhere in it). – Amber May 27 '10 at 22:30
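To illustrate the `match()` versus `search()` point, here is a small sketch. It assumes the pattern is written without the surrounding slashes, since Python regular expressions are not slash-delimited; the variable names are illustrative only.

```python
import re

path = "/xubuntu/daily/current/lucid-alternate-i386.iso"
pattern = re.compile(r"([^-]*)\.iso")

# match() only tries at position 0; from there the run of non-dash
# characters is never followed by '.iso', so there is no match.
match_result = pattern.match(path)
print(match_result)  # None

# search() scans forward and succeeds once it starts at 'i386'.
search_result = pattern.search(path)
print(search_result.group(1))  # i386
```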

score 1 · Answer 3 · answered May 27 '10 at 22:03

First off, let's make our life simpler and only get the file name.

>>> import os
>>> os.path.split("/xubuntu/daily/current/lucid-alternate-i386.iso")
('/xubuntu/daily/current', 'lucid-alternate-i386.iso')

Now it's just a matter of catching all the letters between the last dash and the '.iso'.
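One possible way to finish that last step is sketched below. The pattern and the variable names are assumptions added for illustration; the answer itself does not specify them.

```python
import os
import re

path = "/xubuntu/daily/current/lucid-alternate-i386.iso"
filename = os.path.split(path)[1]  # 'lucid-alternate-i386.iso'

# Capture the run of non-dash characters that sits between the
# last dash and the '.iso' suffix at the end of the file name.
m = re.search(r"-([^-]+)\.iso$", filename)
arch = m.group(1) if m else None
print(arch)  # i386
```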

answered May 27 '10 at 22:03 by badp

I still can't manage to extract the desired text :( (I've never been good with regexp) – user175259 May 27 '10 at 22:15

score 1 · Accepted Answer · answered May 27 '10 at 22:17

If you will be matching several of these lines, using re.compile() and saving the resulting regular expression object for reuse is more efficient.

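import re

s1 = "/xubuntu/daily/current/lucid-alternate-i386.iso"
s2 = "/xubuntu/daily/current/lucid-alternate-amd64.iso"

# The greedy leading .+ runs up to the last '-'; the group then
# captures everything up to the last '.' before the extension.
pattern = re.compile(r'^.+-(.+)\..+$')

m = pattern.match(s1)
print(m.group(1))  # i386

m = pattern.match(s2)
print(m.group(1))  # amd64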

answered May 27 '10 at 22:17 by Peter McG

You don't need regexes for this: http://stackoverflow.com/questions/2925306/python-matching-some-characters-into-a-string/2925399#2925399 – jfs May 27 '10 at 22:46
I know but it is tagged python and regex – Peter McG May 27 '10 at 23:03

score 0 · Answer 5 · answered May 27 '10 at 22:18

The expression should be written without the leading and trailing slashes; Python regular expressions are not delimited by slashes.

import re

line = '/xubuntu/daily/current/lucid-alternate-i386.iso'
rex = re.compile(r"([^-]*)\.iso")
m = rex.search(line)
print(m.group(1))

Yields 'i386'

answered May 27 '10 at 22:18 by koblas

score 0 · Answer 6 · answered May 27 '10 at 22:19

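import re

# Example value for subject, taken from the question
subject = "/xubuntu/daily/current/lucid-alternate-i386.iso"

# \w does not match '-', so the group stops at the last dash
reobj = re.compile(r"(\w+)\.iso$")
match = reobj.search(subject)
if match:
    result = match.group(1)
else:
    result = ""

Here subject contains the path including the file name.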

answered May 27 '10 at 22:19 by Turtle

score 0 · Answer 7 · answered May 27 '10 at 22:23

>>> import os
>>> path = "/xubuntu/daily/current/lucid-alternate-i386.iso"
>>> file, ext = os.path.splitext(os.path.split(path)[1])
>>> processor = file[file.rfind("-") + 1:]
>>> processor
'i386'

answered May 27 '10 at 22:23 by manifest
