
I have a string containing dates, for example:

text = '2010; 04/20/2010; 04/2009'

I want to only find the first standalone '2010', but applying the following code:

re.findall(r'\d{4}', text)

will also find the second '2010' embedded in the mm/dd/yyyy format.

Is there a way to achieve this (not using the ';' sign)?

Darth BEHFANS

2 Answers


You can use re.search to find only the first occurrence:

>>> import re
>>> text = '2010; 04/20/2010; 04/2009'
>>> re.search(r'\d{4}', text)
<_sre.SRE_Match object; span=(0, 4), match='2010'>
>>> re.search(r'\d{4}', text).group()
'2010'

From the documentation:

re.search(pattern, string, flags=0)

Scan through string looking for the first location where the regular expression pattern produces a match, and return a corresponding match object. Return None if no position in the string matches the pattern; note that this is different from finding a zero-length match at some point in the string.

Emphasis mine.

  • Hi, thank you for your help. Actually, the standalone 2010 values are randomly distributed in a long string, and I need to locate them and replace each with 01/31/2010. I intended to use a negative lookbehind, (?<!...), but it only supports fixed-length patterns. I think I might split the string at each \d{4}, put the pieces into a Pandas Series, and work on each row, which might be viable. – Darth BEHFANS Aug 05 '17 at 18:24
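
For the replacement described in this comment, one possible sketch uses single-character lookarounds, which stay within the fixed-width limit of Python's lookbehind. It assumes that "standalone" means a four-digit run with no digit or slash directly before or after it; that definition and the name STANDALONE_YEAR are illustrative assumptions, not something stated in the thread.

```python
import re

text = '2010; 04/20/2010; 04/2009'

# Hypothetical pattern: four digits that are neither preceded nor followed
# by a digit or a slash. Each lookaround inspects exactly one character,
# so the lookbehind is fixed-width.
STANDALONE_YEAR = re.compile(r'(?<![\d/])\d{4}(?![\d/])')

print(STANDALONE_YEAR.findall(text))
# ['2010']

# \g<0> refers to the whole match, so each standalone year is prefixed
# with '01/31/' and the embedded dates are left untouched.
print(STANDALONE_YEAR.sub(r'01/31/\g<0>', text))
# 01/31/2010; 04/20/2010; 04/2009
```

If the real data uses other separators inside dates (for example '-' or '.'), the character classes in both lookarounds would need to include them.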

I don't know whether you have to use a regex, but str.find() in Python 3 returns the lowest index at which the substring you are looking for starts. From there, if you know the length of the substring (which I assume you do), you can extract it with a slice in one more line of code. I'm not sure whether it's better or worse than a regex, but it seems like a less complex way to do the same thing in this case. Here is a Stack Overflow question about it, and here are the Python docs on it.
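
A minimal sketch of the find-and-slice idea, assuming the substring to look for (here the hypothetical variable target) is known in advance:

```python
text = '2010; 04/20/2010; 04/2009'
target = '2010'  # assumed to be known beforehand

# str.find returns the lowest index where target starts, or -1 if absent.
start = text.find(target)

if start != -1:
    # Slice out exactly len(target) characters from that index.
    first = text[start:start + len(target)]
    print(start, first)
```

Note that str.find only reports where the first occurrence begins; on its own it does not tell a standalone year apart from one embedded in a longer date.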

Matthew Barlowe
  • Hi, thank you for your help. But in the real text, such standalone cases are randomly distributed, so it's not always the first occurrence. I am thinking of transforming the string into a Pandas Series and working row by row. – Darth BEHFANS Aug 05 '17 at 18:26