2

Using Python, I am trying to extract the part of a string that occurs between two delimiters. The strings vary in length, so counting characters will not work. An example of what I am looking at would be:

172.-.221 - - [07/-20-:16:36:27 -0500] Firefox/17.0" ** 0 s/ 950 ms **

The desired section of the string is 0 s/ 950 ms, and I have noticed that it occurs between the pairs of double asterisks (** **) consistently.

How would I grab just the part of the string between the two double asterisks (**)? How would I either output that to the screen or save that to a file?

thegrinner

5 Answers

3

This is exactly the sort of thing that the re module is made for, e.g.:

import re

TheString = '172.-.221 - - [07/-20-:16:36:27 -0500] Firefox/17.0" ** 0 s/ 950 ms **'

wc = re.compile(r'\*\*(.*)\*\*')
matches = wc.findall(TheString)

# matches == [' 0 s/ 950 ms ']  (the surrounding spaces are captured too)
Steve Barnes
  • You can also rewrite the second line in order to take the spaces into account: wc = re.compile(r'\*\* (.*) \*\*') – Raydel Miranda Dec 11 '13 at 21:20
  • You can if desired - personally I would use `r'\*\* *(.*) *\*\*'` so as to allow for a variable number of spaces. – Steve Barnes Dec 11 '13 at 21:34
  • Yes, you will match any number of spaces, and all of them will be collected inside the "matches" variable. That's not what the user wants (according to the question). – Raydel Miranda Dec 11 '13 at 21:46
  • Oops use `r'\*\* *(.*) *\*\*'` lost a star - the match[1] will only give the part that excludes the **s and spaces. – Steve Barnes Dec 11 '13 at 21:50
  • You know what? I haven't tested it ... but doesn't that last pattern you posted look a little bit ambiguous to you? What if you replace the last stars you added with +? Something like: r'\*\* +(.*) +\*\*' – Raydel Miranda Dec 11 '13 at 22:18
  • That would require at least one space rather than allowing 0 or more. Also, you need to escape the *s to match them. – Steve Barnes Dec 11 '13 at 22:34
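
One way to settle the whitespace question from these comments, offered only as a sketch and not as the answerer's code: put `\s*` on both sides of a lazy group, so the surrounding spaces are matched but not captured. The helper name `between_asterisks` and the constant `PATTERN` are hypothetical, introduced just for this illustration.

```python
import re

# Hypothetical pattern for this sketch (not from the original answer).
# \s* absorbs optional whitespace next to each ** marker, and the lazy
# group (.*?) stops at the first closing **, so no padding is captured.
PATTERN = re.compile(r'\*\*\s*(.*?)\s*\*\*')


def between_asterisks(line):
    """Return the text between the first pair of ** markers, or None."""
    match = PATTERN.search(line)
    return match.group(1) if match else None


line = '172.-.221 - - [07/-20-:16:36:27 -0500] Firefox/17.0" ** 0 s/ 950 ms **'
result = between_asterisks(line)
print(result)
```

Returning None when no marker pair is found is a design choice for the sketch; raising an exception would be equally reasonable if every line is expected to contain the markers.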
2
>>> s='172.-.221 - - [07/-20-:16:36:27 -0500] Firefox/17.0" ** 0 s/ 950 ms **'
>>> s.split('**')[1].strip()
'0 s/ 950 ms'
ndpu
1

You could use str.split to extract it:

myString.split("**")[1]

This creates a list of strings by splitting the string at each appearance of "**", then takes the second item, index 1, from that list.
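
As a rough sketch of the split approach, including the "output to the screen or save to a file" part of the question: the helper name `extract_between` and the output file name `timings.txt` are assumptions made up for this example, and the length check is an added safeguard that the one-liner above does not have.

```python
import os
import tempfile


def extract_between(line, marker="**"):
    """Return the stripped text between the first two markers, or None."""
    parts = line.split(marker)
    # Fewer than three parts means there was no opening/closing marker pair.
    if len(parts) < 3:
        return None
    return parts[1].strip()


my_string = '172.-.221 - - [07/-20-:16:36:27 -0500] Firefox/17.0" ** 0 s/ 950 ms **'
timing = extract_between(my_string)

# Output to the screen.
print(timing)

# Save to a file; the location and name are arbitrary choices for this sketch.
out_path = os.path.join(tempfile.gettempdir(), "timings.txt")
with open(out_path, "w") as out_file:
    out_file.write(timing + "\n")
```

When processing a whole log, the same helper could be called once per line, skipping lines for which it returns None.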

jonrsharpe
1

This is good as well :)

>>> import re
>>> string = '172.-.221 - - [07/-20-:16:36:27 -0500] Firefox/17.0" ** 0 s/ 950 ms **'
>>> re.search(r'\*{2}(.+)\*{2}', string).group(1)
' 0 s/ 950 ms '
GMPrazzoli
0

You can use regular expressions (the sub method). Here is a good tutorial by Google: https://developers.google.com/edu/python/regular-expressions
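
The answer does not show how sub would be applied, so the following is only one possible reading: replace the entire line with the captured group between the asterisks. The pattern and the variable names `log_line` and `timing` are assumptions for this sketch.

```python
import re

log_line = '172.-.221 - - [07/-20-:16:36:27 -0500] Firefox/17.0" ** 0 s/ 950 ms **'

# The pattern spans the whole line and captures only the text between the
# first ** pair (without surrounding whitespace); \1 then replaces the line
# with that capture. If nothing matches, re.sub returns the line unchanged.
timing = re.sub(r'^.*?\*\*\s*(.*?)\s*\*\*.*$', r'\1', log_line)
print(timing)
```

Because re.sub silently returns the input when the pattern does not match, re.search is arguably the safer choice when a missing marker pair should be detected.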

Ali Rasim Kocal