18

I have a string in Python:

string = "/foo13546897/bar/Atlantis-GPS-coordinates/bar457822368/foo/"

The expected output is:

"Atlantis-GPS-coordinates"

I know that the expected output is ALWAYS surrounded by "/bar/" on the left and "/" on the right:

"/bar/Atlantis-GPS-coordinates/"

My current solution looks like this:

a = string.find("/bar/")
b = string.find("/", a + 5)
output = string[a + 5:b]

This works, but I don't like it. Does anyone know a cleaner function or idiom for this?

Vincent

4 Answers

26

You can use split:

>>> string.split("/bar/")[1].split("/")[0]
'Atlantis-GPS-coordinates'

Adding a maxsplit of 1 should be slightly more efficient, since each split then stops at the first match:

>>> string.split("/bar/", 1)[1].split("/", 1)[0]
'Atlantis-GPS-coordinates'

Or use partition:

>>> string.partition("/bar/")[2].partition("/")[0]
'Atlantis-GPS-coordinates'

Or a regex:

>>> import re
>>> re.search(r'/bar/([^/]+)', string).group(1)
'Atlantis-GPS-coordinates'

Which one to use depends on what speaks to you and on your data.

dawg
  • Love your answer. I will accept it. What are the advantages and drawbacks of split versus partition? – Vincent Jan 17 '16 at 11:29
  • The main difference is how each behaves when `/bar/` is not present. `partition` always returns a three-element tuple; if the separator is not found, the tuple holds the original string followed by two empty strings. `split` instead changes the number of elements in the list it returns. That makes it easier to test whether `partition` did what it was supposed to do. I would use `split` if I knew the string would split successfully, and `partition` or a regex if I needed to test. – dawg Jan 17 '16 at 16:14
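
A minimal sketch of that difference. The `missing` input and the helper name `between` are made up for illustration; only `str.partition` and `str.split` come from the answer:

```python
string = "/foo13546897/bar/Atlantis-GPS-coordinates/bar457822368/foo/"
missing = "/foo13546897/baz/Atlantis-GPS-coordinates/"  # hypothetical input with no "/bar/"

# partition always returns a 3-tuple; the separator slot is empty on a miss
head, sep, tail = missing.partition("/bar/")
print(repr(sep))   # ''

# split returns a one-element list on a miss, so indexing [1] would raise IndexError
parts = missing.split("/bar/", 1)
print(len(parts))  # 1

def between(s, left="/bar/", right="/"):
    """Return the text between left and right, or None if left is absent."""
    _, found, rest = s.partition(left)
    if not found:
        return None
    return rest.partition(right)[0]

print(between(string))   # Atlantis-GPS-coordinates
print(between(missing))  # None
```

Checking the middle element of the tuple is the "easy test" mentioned above; with `split` you would compare `len(parts)` instead.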
4

What you have isn't all that bad. I'd write it as:

start = string.find('/bar/') + 5
end = string.find('/', start)
output = string[start:end]

as long as you know that /bar/WHAT-YOU-WANT/ is always going to be present. Otherwise, I would reach for the regular expression knife:

>>> import re
>>> PATTERN = re.compile('^.*/bar/([^/]*)/.*$')
>>> s = '/foo13546897/bar/Atlantis-GPS-coordinates/bar457822368/foo/'
>>> match = PATTERN.match(s)
>>> match.group(1)
'Atlantis-GPS-coordinates'
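
If the `/bar/` segment might be absent, both approaches need a guard: `str.find` returns -1 instead of raising, and `PATTERN.match` returns `None`. A rough sketch, reusing the pattern above; the helper name `segment_after_bar` and the input `'/foo/baz/'` are invented for the example:

```python
import re

PATTERN = re.compile(r'^.*/bar/([^/]*)/.*$')  # same pattern as in the answer

def segment_after_bar(s):
    """Return the captured segment, or None when the pattern does not match."""
    match = PATTERN.match(s)
    return match.group(1) if match else None

s = '/foo13546897/bar/Atlantis-GPS-coordinates/bar457822368/foo/'
print(segment_after_bar(s))            # Atlantis-GPS-coordinates
print(segment_after_bar('/foo/baz/'))  # None

# The find() version fails silently on a miss: -1 + 5 gives a bogus start index
bad_start = '/foo/baz/'.find('/bar/') + 5
print(bad_start)                       # 4
```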
D.Shawley
1
import re

pattern = '(?<=/bar/).+?/'
string = "/foo13546897/bar/Atlantis-GPS-coordinates/bar457822368/foo/"

result = re.search(pattern, string)
print(string[result.start():result.end() - 1])
# "Atlantis-GPS-coordinates"

The print call is written so that it runs on both Python 2.x and 3.x. The pattern has two parts:

1. (?<=/bar/) is a lookbehind: the rest of the regex is only tried at a position immediately preceded by /bar/.
2. .+?/ lazily matches one or more characters up to and including the next '/' character, which is why the slice drops the last character of the match.

Hope that helps some.

If you need to run this kind of search many times, it is better to compile the pattern once with re.compile for performance; if you only need it once, don't bother.
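
A small sketch of the compiled variant, assuming the same lookbehind pattern. The name `BAR_SEGMENT` and the second path are made up for illustration:

```python
import re

# Compiled once at import time, then reused for every path
BAR_SEGMENT = re.compile(r'(?<=/bar/).+?/')

paths = [
    "/foo13546897/bar/Atlantis-GPS-coordinates/bar457822368/foo/",
    "/x/bar/El-Dorado/y/",  # hypothetical second input
]

segments = []
for path in paths:
    result = BAR_SEGMENT.search(path)
    if result:  # search() returns None when nothing matches
        segments.append(result.group()[:-1])  # drop the trailing '/'

print(segments)  # ['Atlantis-GPS-coordinates', 'El-Dorado']
```

The re module also keeps an internal cache of recently used patterns, so the gain from compiling by hand may be modest in practice.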

cmaceachern
0

Using re (slower than other solutions):

>>> import re
>>> string = "/foo13546897/bar/Atlantis-GPS-coordinates/bar457822368/foo/"
>>> re.search(r'(?<=/bar/)[^/]+(?=/)', string).group()
'Atlantis-GPS-coordinates'
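
To check the speed claim on your own machine, something like the following timeit sketch could be used. The `candidates` dict and the repeat count are arbitrary choices for the example, and the actual numbers will vary by interpreter and hardware:

```python
import re
import timeit

string = "/foo13546897/bar/Atlantis-GPS-coordinates/bar457822368/foo/"

# One callable per approach from this thread
candidates = {
    "split": lambda: string.split("/bar/", 1)[1].split("/", 1)[0],
    "partition": lambda: string.partition("/bar/")[2].partition("/")[0],
    "regex": lambda: re.search(r'(?<=/bar/)[^/]+(?=/)', string).group(),
}

timings = {}
for name, fn in candidates.items():
    # Make sure every approach agrees before timing it
    assert fn() == "Atlantis-GPS-coordinates"
    timings[name] = timeit.timeit(fn, number=20000)

for name, seconds in sorted(timings.items(), key=lambda item: item[1]):
    print("{:<10} {:.4f} s".format(name, seconds))
```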
heemayl