
Quick silly question... I am trying to create a pattern to match the following type of string '12/06/06 08:15'.

I tried `r'/d{1-2}///d{1-2}///d{1-2} /d{1-2}:/d{1-2}'` and `'/d{1-2}///d{1-2}///d{1-2} /d{1-2}:/d{1-2}'`, but neither works. Can anyone give me a leg up?

Radar
  • Why not use a datetime lib? See [doc links on this answer](http://stackoverflow.com/a/466376/868044) – Dan Jan 25 '17 at 22:41
  • @Dan Because some of the strings I want to test don't follow that format. That's precisely the goal: to separate the strings that match from those that don't. – Radar Jan 25 '17 at 22:43
  • `/` is not the same as `\\`. – melpomene Jan 25 '17 at 22:48
  • `-` is not the same as `,`. – melpomene Jan 25 '17 at 22:48
  • @Radar Does error handling work? If the string format is different from the one you specify in your `datetime.strptime()` it will throw a `ValueError`. – spicypumpkin Jan 25 '17 at 22:50
  • @Dan How would you do the error handling? For instance, one of the strings encountered is `'30/06/06-01/07/06'` (this is what triggers the error at the moment). – Radar Jan 25 '17 at 23:11
  • @Radar if you get an exception (try...except), then try next most common pattern (e.g. day/month/year), if still an exception, write the line number out to an exceptions file and review those to see if there is a pattern. – Dan Jan 25 '17 at 23:22
  • But if you're using pandas (which was not stated in your original question), you can use something like `pd.to_datetime(df.Timestamp, format='%m/%d/%Y %H:%M', errors='raise')` to raise an exception, then try parsing again with `to_datetime()` using `format='%d/%m/%Y %H:%M'`. You can also experiment with `errors='coerce'` and `errors='ignore'`. [See the docs](http://pandas.pydata.org/pandas-docs/version/0.19.2/generated/pandas.to_datetime.html). Note that pandas uses strftime-style format codes, so the strftime documentation applies to the `format` keyword argument. – Dan Jan 25 '17 at 23:28

1 Answer


This should work for the pattern shown:

r'\d{2}/\d{2}/\d{2} \d{2}:\d{2}'
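As a rough sketch of how that pattern could be used to separate matching strings from the rest: the helper name `looks_like_timestamp` and the sample strings are made up for illustration, and `re.fullmatch` is used on the assumption that the whole string must match rather than just a prefix.

```python
import re

# Pattern from the answer: exactly two digits per field, as in '12/06/06 08:15'.
# Note the backslash in \d (not /d); for "one or two digits" the quantifier
# would be written {1,2} (not {1-2}).
TIMESTAMP_RE = re.compile(r'\d{2}/\d{2}/\d{2} \d{2}:\d{2}')


def looks_like_timestamp(text):
    """Return True only if the entire string has the dd/dd/dd dd:dd shape."""
    return TIMESTAMP_RE.fullmatch(text) is not None


# Hypothetical sample data, including the problem string from the comments.
samples = ['12/06/06 08:15', '30/06/06-01/07/06', '12/06/06 08:15 extra']
matching = [s for s in samples if looks_like_timestamp(s)]
rejected = [s for s in samples if not looks_like_timestamp(s)]
```

Keep in mind that this only checks the shape of the string: something like `'99/99/99 99:99'` also matches, because the regex knows nothing about valid days, months, or hours.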

That said, this is really better done with a datetime library: parse each line and handle the exception raised for lines that don't match.
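A minimal sketch of the datetime approach, using only the standard library. The two format strings are assumptions (the question does not say whether `'12/06/06'` is day-first or month-first), and `parse_timestamp` is a hypothetical helper name. Strings that fit none of the formats are collected with their line numbers for later review.

```python
from datetime import datetime

# Candidate formats, tried in order. Both are assumptions about the data.
FORMATS = ('%d/%m/%y %H:%M', '%m/%d/%y %H:%M')


def parse_timestamp(text, formats=FORMATS):
    """Return a datetime for the first format that fits, or None if none do."""
    for fmt in formats:
        try:
            return datetime.strptime(text, fmt)
        except ValueError:
            # strptime raises ValueError when the string does not fit fmt.
            continue
    return None


# Hypothetical input lines, including the problem string from the comments.
lines = ['12/06/06 08:15', '30/06/06-01/07/06', '06/30/06 23:59']
parsed, failures = [], []
for lineno, line in enumerate(lines, start=1):
    result = parse_timestamp(line)
    if result is None:
        failures.append((lineno, line))  # keep for manual review
    else:
        parsed.append(result)
```

Unlike the regex, this also rejects impossible values such as month 30, at the cost of having to decide up front which field order to try first.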

Dan
  • Sorry, I was using grep for testing, and forward slashes do mean something in grep. I suppose this is not true of Python. – Dan Jan 25 '17 at 22:57
  • No, grep doesn't care about `/` either. – melpomene Jan 25 '17 at 23:00
  • @Dan Hmm, OK. Yeah, it's true, I was using the pandas `to_datetime()` and not `strptime` or `strftime`. I will look into that. Thx! – Radar Jan 25 '17 at 23:00
  • @Dan regexr.com uses your browser's JavaScript regex engine (JavaScript uses slashes to delimit regex literals in its syntax, but they're not part of the regex per se; they work like `"` does for strings). It doesn't have anything to do with grep. – melpomene Jan 25 '17 at 23:11
  • @melpomene good to know. Next time I'll be sure to fire up a python interpreter instead. Thanks! – Dan Jan 25 '17 at 23:23
  • @Dan Yeah I love it. The python code generator sucks - it generates 30 lines where 3 would do. However, the rest of the site is top notch. – RobertB Jan 25 '17 at 23:36