Let's say that several of my DataFrames contain a Series like this:

serie_complete_days = pd.Series(['20190320','20190321','20190322', '20190323', '20190324', '20190325', '20190326', '20190327'])

I'm trying to keep only two parts of each string (the day and the month) and put them in European order (day first, then month), like this:

the_goal_is = pd.Series(['20-03','21-03','22-03', '23-03', '24-03', '25-03', '26-03', '27-03'])

I started by isolating each part with the str.slice() function:

days_only = serie_complete_days.str.slice(start = 6, stop = 8)
months_only = serie_complete_days.str.slice(start = 4, stop = 6)

I thought this was the easiest way, because it doesn't change the index of my DataFrame. But I'm stuck on the next step: I don't know which function is best for combining the two parts, whether str.join(), str.replace() or str.update()...

Thanks in advance!

EDIT: I want to keep these values as strings, so no to_datetime(), please.

Shubham Sharma
Raphadasilva
  • Does this answer your question? [Extracting just Month and Year separately from Pandas Datetime column](https://stackoverflow.com/questions/25146121/extracting-just-month-and-year-separately-from-pandas-datetime-column) – Ehsan May 16 '20 at 13:54
  • `serie_complete_days.apply(lambda x : x[6:] + "-" + x[4:6])` – sushanth May 16 '20 at 13:56

1 Answer

You can use Series.str.replace:

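# regex=True is needed on pandas >= 2.0, where str.replace treats the pattern as a literal string by default
result = serie_complete_days.str.replace(r'\d{4}(\d{2})(\d{2})', r'\g<2>-\g<1>', regex=True)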
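
For reference, here is a minimal self-contained sketch of the regex approach. It assumes a shortened sample Series, and it adds `^`/`$` anchors (not in the line above) so that only strings of exactly eight digits are rewritten:

```python
import pandas as pd

# Shortened sample data, same format as in the question (YYYYMMDD strings)
serie_complete_days = pd.Series(['20190320', '20190321', '20190322'])

# Group 1 captures the month, group 2 the day; the year digits are dropped.
# regex=True makes pandas treat the pattern as a regular expression.
result = serie_complete_days.str.replace(
    r'^\d{4}(\d{2})(\d{2})$', r'\g<2>-\g<1>', regex=True
)

print(result.tolist())
```

The values stay plain strings throughout, and the index of the Series is left untouched.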

Or, if a round trip through datetime is acceptable, you can use Series.dt.strftime:

result = pd.to_datetime(serie_complete_days).dt.strftime('%d-%m')

Either approach returns the following Series:

0    20-03
1    21-03
2    22-03
3    23-03
4    24-03
5    25-03
6    26-03
7    27-03
dtype: object
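
The slices from the question can also be combined directly. The following is a sketch of that alternative rather than part of the approaches above; it assumes the two slices share the same index, which holds when both come from the same Series. The name `day_month` is only illustrative:

```python
import pandas as pd

serie_complete_days = pd.Series(['20190320', '20190321', '20190322'])

# Slices as in the question: characters 6-7 are the day, 4-5 the month
days_only = serie_complete_days.str.slice(start=6, stop=8)
months_only = serie_complete_days.str.slice(start=4, stop=6)

# Series.str.cat joins two Series element-wise, aligned on the index
day_month = days_only.str.cat(months_only, sep='-')

# Plain string concatenation with + gives the same values
same_result = days_only + '-' + months_only

print(day_month.tolist())
```

Both forms keep the values as strings and never go through to_datetime().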
Shubham Sharma