I have this code to convert the dates into the format I want:

df['issue_d'] = df['issue_d'].replace({'Jan-':'1-', 'Feb-':'2-', 'Mar-': '3-', 'Apr-': '4-', 'May-': '5-', 'Jun-': '6-', 'Jul-': '7-', 'Aug-':'8-', 'Sep-': '9-', 'Oct-': '10-', 'Nov-': '11-', 'Dec-': '12-'}, regex=True).apply(lambda x:dt.strptime('01-'+x,'%d-%m-%y').date())
df['issue_d'] = pd.to_datetime(df['issue_d'],  format = '%Y-%m-%d')

But when I run it, this error appears:

ValueError                                Traceback (most recent call last)
~\AppData\Local\Temp/ipykernel_25672/2570429248.py in <module>
----> 1 df['issue_d'] = df['issue_d'].replace({'Jan-':'1-', 'Feb-':'2-', 'Mar-': '3-', 
'Apr-': '4-', 'May-': '5-', 'Jun-': '6-', 'Jul-': '7-', 'Aug-':'8-', 'Sep-': '9-', 'Oct- ': '10-', 'Nov-': '11-', 'Dec-': '12-'}, regex=True).apply(lambda x:dt.strptime('01-'+x,'%d-%m-%y').date())
  2 df['issue_d'] = pd.to_datetime(df['issue_d'],  format = '%Y-%m-%d')

~\anaconda3\lib\site-packages\pandas\core\series.py in apply(self, func, convert_dtype, 
args, **kwargs)
4355         dtype: float64
4356         """
-> 4357         return SeriesApply(self, func, convert_dtype, args, kwargs).apply()
 4358 
 4359     def _reduce(

~\anaconda3\lib\site-packages\pandas\core\apply.py in apply(self)
 1041             return self.apply_str()
 1042 
 -> 1043         return self.apply_standard()
 1044 
 1045     def agg(self):

 ~\anaconda3\lib\site-packages\pandas\core\apply.py in apply_standard(self)
 1096                 # List[Union[Callable[..., Any], str]]]]]"; expected
 1097                 # "Callable[[Any], Any]"
 -> 1098                 mapped = lib.map_infer(
 1099                     values,
 1100                     f,  # type: ignore[arg-type]

 ~\anaconda3\lib\site-packages\pandas\_libs\lib.pyx in pandas._libs.lib.map_infer()

 ~\AppData\Local\Temp/ipykernel_25672/2570429248.py in <lambda>(x)
  ----> 1 df['issue_d'] = df['issue_d'].replace({'Jan-':'1-', 'Feb-':'2-', 'Mar-': '3-', 'Apr-': '4-', 'May-': '5-', 'Jun-': '6-', 'Jul-': '7-', 'Aug-':'8-', 'Sep-': '9-', 'Oct-': '10-', 'Nov-': '11-', 'Dec-': '12-'}, regex=True).apply(lambda x:dt.strptime('01-'+x,'%d-%m-%y').date())
  2 df['issue_d'] = pd.to_datetime(df['issue_d'],  format = '%Y-%m-%d')

  ~\anaconda3\lib\_strptime.py in _strptime_datetime(cls, data_string, format)
  566     """Return a class cls instance based on the input string and the
  567     format string."""
   --> 568     tt, fraction, gmtoff_fraction = _strptime(data_string, format)
  569     tzname, gmtoff = tt[-2:]
  570     args = tt[:6] + (fraction,)

  ~\anaconda3\lib\_strptime.py in _strptime(data_string, format)
  347     found = format_regex.match(data_string)
  348     if not found:
  --> 349         raise ValueError("time data %r does not match format %r" %
  350                          (data_string, format))
  351     if len(data_string) != found.end():

   ValueError: time data '01-15-Dec' does not match format '%d-%m-%y'

Update:

The info for my ['issue_d'] column looks like this:

issue_d              1048563 non-null  object

It contains year and month-name pairs such as:

15-Dec
16-Jan
and so on.

First, we should change the month names (Jan, Feb, Mar, ...) to their numbers (01, 02, 03, ...), so that the column looks like:

15-12
16-01
and so on.

Then add the day (1) to each, so that the dates are arranged like:

01-01-15
01-02-15
01-03-15
and so on.

The apply part is where I tried to add the day 1; the second field is the month and the third is the year.

You can see in my first line that I tried to do this:

df['issue_d'] = df['issue_d'].replace({'Jan-':'1-', 'Feb-':'2-', 'Mar-': '3-', 'Apr-': '4-', 'May-': '5-', 'Jun-': '6-', 'Jul-': '7-', 'Aug-':'8-', 'Sep-': '9-', 'Oct-': '10-', 'Nov-': '11-', 'Dec-': '12-'}, regex=True).apply(lambda x:dt.strptime('01-'+x,'%d-%m-%y').date())

I replaced the abbreviated month names in the first line because pandas' datetime conversion couldn't parse them. In the second line, I tried to rearrange the dates into %Y-%m-%d format and convert the column to datetime so that I can do further work on my dataset. Unfortunately, that error appeared. I'd appreciate your help. Thank you.

  • please add sample input and expected output – srinath Jul 28 '22 at 09:27
  • You need to post sample data that replicates your problem. From the error message: You are trying to use `strptime()` with `format='%d-%m-%y'` on the string `'01-15-Dec'` -- which obviously doesn't work, because `%y` can't deal with `'Dec'`. So, look again at the input format, and adjust the `.replace()` part accordingly. – Timus Jul 28 '22 at 13:12
  • Your next line `df['issue_d'] = pd.to_datetime(df['issue_d'], format = '%Y-%m-%d')` is puzzling: `df['issue_d']` has already datetime values, why do it again. And why use `format='%Y-%m-%d'` here, when there are no strings to parse. Also the format is not the same? – Timus Jul 28 '22 at 13:17
  • _" because in the second line of code I changed it"_: Well, it's the 2. line, and the exception is raised in the 1., so the 2. is never executed. – Timus Jul 28 '22 at 13:18
  • I've updated and added the input and output column – Mohammad Ghanbari Jul 28 '22 at 16:51
  • [You should not post code (or error/exception messages, data etc.) as an image](https://meta.stackoverflow.com/a/285557/14311263). Please provide a MRE (see [How to create a Minimal, Reproducible Example](https://stackoverflow.com/help/minimal-reproducible-example) and [How to make good reproducible pandas examples](https://stackoverflow.com/q/20109391/14311263)) that replicates your problem. – Timus Jul 28 '22 at 17:34
  • Your input only shows `15-Dec` and it is not clear how the output you show is related to that? – Timus Jul 28 '22 at 17:35
  • No, my dataset includes 1 million dates like 15-dec or 16-sep, and so on. The first part is the year and the second is the month. – Mohammad Ghanbari Jul 29 '22 at 01:54
  • The dataset has entries like `15-Dec` and you are trying to match `Dec-`. The `replace()` above therefore does nothing and passes `15-Dec` as is to the `apply()` function in line 1, which in turn prefixes `01-` to it, making it `01-15-Dec`. This does not match the format specifier `%d-%m-%y`. Maybe you want to replace `-Dec` with `-12` instead. – Azhar Khan Jul 31 '22 at 14:40
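
The diagnosis in the comments can be sketched as follows. This is a minimal, hedged adjustment of the question's first line, assuming every value has the form `YY-Mon` (for example `15-Dec`). The sample frame and the name `month_numbers` are illustrative and not taken from the real dataset. Two things change: the month name is replaced without the trailing hyphen, and the format becomes `%d-%y-%m`, because the year comes before the month in the input.

```python
from datetime import date, datetime as dt

import pandas as pd

# Illustrative sample; the real column is assumed to hold "YY-Mon" strings.
df = pd.DataFrame({"issue_d": ["15-Dec", "16-Jan", "21-Oct"]})

# Hypothetical helper mapping: month abbreviation -> zero-padded month number.
month_numbers = {
    "Jan": "01", "Feb": "02", "Mar": "03", "Apr": "04",
    "May": "05", "Jun": "06", "Jul": "07", "Aug": "08",
    "Sep": "09", "Oct": "10", "Nov": "11", "Dec": "12",
}

# regex=True makes replace() substitute substrings, so "15-Dec" becomes "15-12".
replaced = df["issue_d"].replace(month_numbers, regex=True)

# After prefixing "01-" the string is day-year-month, hence "%d-%y-%m".
parsed = replaced.apply(lambda x: dt.strptime("01-" + x, "%d-%y-%m").date())
print(parsed.tolist())
```

The original keys such as `'Dec-'` never match because the hyphen precedes the month name in the data, so the strings reach `strptime()` unchanged.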

1 Answer

Okay, next try:

If you're not using an English locale, then you could try:

import pandas as pd

df = pd.DataFrame({"issue_d": ["15-Dec", "16-Jan", "21-Oct"]})

mapping = {"Jan": "1", "Feb": "2", "Mar": "3", "Apr": "4", "May": "5", "Jun": "6",
           "Jul": "7", "Aug": "8", "Sep": "9", "Oct": "10", "Nov": "11", "Dec": "12"}
df["issue_d"] = pd.to_datetime(
    df["issue_d"].str[:-3] + df["issue_d"].str[-3:].replace(mapping),
    format="%y-%m"
).dt.strftime("%d-%m-%y")

Result:

    issue_d
0  01-12-15
1  01-01-16
2  01-10-21

You get the first of the month automatically.
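
As a small illustration of that default (a sketch with a single made-up value, not data from the question): when the format contains no day directive, both `pd.to_datetime` and `datetime.strptime` fill in day 1.

```python
from datetime import datetime

import pandas as pd

# No %d in the format, so the day defaults to the first of the month.
ts = pd.to_datetime("15-12", format="%y-%m")
py_dt = datetime.strptime("15-12", "%y-%m")

print(ts)     # pandas Timestamp for 1 December 2015
print(py_dt)  # datetime for 1 December 2015
```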

If you are using an English locale, then this gives the same result:

df = pd.DataFrame({"issue_d": ["15-Dec", "16-Jan", "21-Oct"]})

df["issue_d"] = pd.to_datetime(df["issue_d"], format="%y-%b").dt.strftime("%d-%m-%y")

Regarding your question extension: If the issue_d column needs to be datetime for further processing, then remove the .dt.strftime("%d-%m-%y") at the end (because this makes strings out of the datetimes), do what you need to do, and convert it to strings later. For example:

...
df["issue_d"] = pd.to_datetime(
    df["issue_d"].str[:-3] + df["issue_d"].str[-3:].replace(mapping),
    format="%y-%m"
)
df["issue_y"] = df["issue_d"].dt.year
df["issue_d"] = df["issue_d"].dt.strftime("%d-%m-%y") 

results in

    issue_d  issue_y
0  01-12-15     2015
1  01-01-16     2016
2  01-10-21     2021
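
If the conversion still raises an error on the full column, one possible way to find the offending rows is `errors="coerce"`, which turns unparseable values into `NaT` instead of raising. The sketch below is only an assumption about what might go wrong: the sample values (including the malformed `"Dec-15"` and the missing value) are invented for illustration, and `mapping` is the same dictionary as above.

```python
import pandas as pd

# Invented sample with two problematic entries: a reversed value and a missing one.
raw = pd.Series(["15-Dec", "16-Jan", "Dec-15", None, "21-Oct"], name="issue_d")

mapping = {"Jan": "1", "Feb": "2", "Mar": "3", "Apr": "4", "May": "5", "Jun": "6",
           "Jul": "7", "Aug": "8", "Sep": "9", "Oct": "10", "Nov": "11", "Dec": "12"}

# Same string manipulation as above; values that don't fit stay unparseable.
as_numbers = raw.str[:-3] + raw.str[-3:].replace(mapping)

# errors="coerce" yields NaT for anything that doesn't match the format.
parsed = pd.to_datetime(as_numbers, format="%y-%m", errors="coerce")

# Rows that could not be parsed, shown with their original values.
bad_rows = raw[parsed.isna()]
print(bad_rows)
```

Inspecting `bad_rows` on the real data should show which entries deviate from the `YY-Mon` pattern.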
Timus
  • First, I'd like to change the month names to their month numbers, which is what I'm trying to do in my first line of code. Then I want to convert the result to datetime, which is what I'm trying to do in the second line. It's not just about December; sorry for the misunderstanding. – Mohammad Ghanbari Jul 29 '22 at 01:57
  • @MohammadGhanbari I know, my proposal isn't restricted to `Dec` (I've included some other year-month combinations in the last edit). – Timus Jul 29 '22 at 06:52
  • Thanks, my friend. I couldn't use the code that you kindly shared with me. Could you please edit the code that I wrote in the first line? First, I'd like to change the month names to their numbers; in that line I also tried to add the day (1) to each month. After that, in the second line, I tried to convert the dates to datetime with pandas. – Mohammad Ghanbari Jul 30 '22 at 10:42
  • @MohammadGhanbari Please provide some data, not screenshots, that you are actually using. Do something like `df['issue_d'].head(10).to_dict()` and edit the result into a code block. And also provide the expected result! We ask for sample data/mre/etc. for a reason, not to bother you: Without them helping becomes a guessing game. (Again: [How to create a Minimal, Reproducible Example](https://stackoverflow.com/help/minimal-reproducible-example).) – Timus Jul 30 '22 at 13:01
  • Thanks for your tips. I updated the question and I'll be thrilled to get your help. – Mohammad Ghanbari Jul 31 '22 at 14:22
  • Thanks for your tips, dear Timus. Yet another error occurred. – Mohammad Ghanbari Aug 03 '22 at 06:09