Please see the attached image; it shows the df.info() output for my original df.

I want to add a date column in %Y-%m-%d format that has a date type (i.e. column 16; it is currently named Day, but I meant date, and it needs to be based on Date/Time, which is a timestamp). My date columns keep turning up as objects, and I don't know how to make them dates. Once I have that column, I need to group by it, and I want the resulting dataframe to know that the column is a date as well. I don't know how to perform operations and ensure that pandas knows the resulting column is a date. If I can solve that, I think all my issues go away.

So what I want is to get column 16, my date column, to be a date type. I created it using `df["Day"] = df["Date/Time"].dt.date`, thinking that would make it a date.

  • [Please don't post pictures of text](https://meta.stackoverflow.com/q/285551/4518341). Post the text itself please. – wjandrea Oct 25 '22 at 16:00
  • It looks like your date parsing here is unnecessarily complicated and could maybe be affecting your filter, but it just looks unnecessary no matter what. Why not just `rundate = datetime.date(int(get_year),int(get_month),int(get_day))` – scotscotmcc Oct 25 '22 at 16:01
  • What's the problem exactly? What inputs do you provide, what output do you expect, and what do you get instead? Please [edit] to clarify. For reference, see [mre]. See also [How to make good reproducible pandas examples](/q/20109391/4518341). – wjandrea Oct 25 '22 at 16:03
  • Also, can you confirm the datatype of the date column in the df? Doing `df.info()` will give the types of them all. You want your date column to be a `datetime` of some sort, not an `object` (which is basically a string) – scotscotmcc Oct 25 '22 at 16:04
  • please don't write a narrative about your solution process. just a very concise, clear description of the issue, how you're trying to solve it (in code), and what's not working (in tracebacks), please. have you tried [`pd.to_datetime`](https://pandas.pydata.org/docs/reference/api/pandas.to_datetime.html)? it will raise an error if it's not able to convert to datetime, so there's no ambiguity involved. – Michael Delgado Oct 26 '22 at 03:43
  • I appreciate the need for reproducible examples, but I am struggling with how to set this up, and now that I think it's a straightforward question, I am going to stick with the picture. – Andrew Martin Oct 26 '22 at 03:47
  • Thanks, no I haven't tried that, I'm looking into it now... – Andrew Martin Oct 26 '22 at 03:51
  • I've found that if I export the grouped dataframe to Excel, then read it back in with `df2 = pandas.read_excel("df.xlsx", parse_dates=["Day"])`, pandas will know that it is a date type. Either Excel is doing the job of making it a date, or `parse_dates` is doing it, or both. But I can't get pandas to know that it's a date before the export – Andrew Martin Oct 26 '22 at 04:53

1 Answer

Here is why a reproducible example is really helpful. Create a new df:

In [4]: df = pd.DataFrame({"x": pd.to_datetime("2022-10-26"), "y": 5}, index=[0])

In [5]: df
Out[5]: 
           x  y
0 2022-10-26  5

In [6]: df.info()
<class 'pandas.core.frame.DataFrame'>
Int64Index: 1 entries, 0 to 0
Data columns (total 2 columns):
 #   Column  Non-Null Count  Dtype         
---  ------  --------------  -----         
 0   x       1 non-null      datetime64[ns]
 1   y       1 non-null      int64         
dtypes: datetime64[ns](1), int64(1)
memory usage: 24.0 bytes

Then create a new column the way you did:

In [7]: df["z"] = df.x.dt.date

In [8]: df.info()
<class 'pandas.core.frame.DataFrame'>
Int64Index: 1 entries, 0 to 0
Data columns (total 3 columns):
 #   Column  Non-Null Count  Dtype         
---  ------  --------------  -----         
 0   x       1 non-null      datetime64[ns]
 1   y       1 non-null      int64         
 2   z       1 non-null      object        
dtypes: datetime64[ns](1), int64(1), object(1)
memory usage: 32.0+ bytes

You can immediately see that `z` is reported with dtype object, even though each value in it is a date:

In [10]: type(df.loc[0, 'z'])
Out[10]: datetime.date

The reason this does not show up in df.info() is that the values are of type datetime.date, which is not a native pandas dtype, so the column is simply displayed as object. The same thing happens with strings.
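If you do want pandas to report a datetime dtype for the day column, one option is to keep the values as timestamps and only drop the time of day. The following is a minimal sketch, not the only way to do it; the sample data and the column names `Day_obj`, `Day` and `Day_converted` are made up for illustration, and `Date/Time` is taken from the question.

```python
import pandas as pd

# Hypothetical sample data; "Date/Time" mirrors the column name in the question.
df = pd.DataFrame(
    {
        "Date/Time": pd.to_datetime(
            ["2022-10-25 08:30", "2022-10-25 17:45", "2022-10-26 09:00"]
        ),
        "y": [1, 2, 3],
    }
)

# .dt.date produces Python datetime.date objects, so the column dtype is object.
df["Day_obj"] = df["Date/Time"].dt.date

# .dt.normalize() sets the time to midnight and keeps a datetime64 dtype.
df["Day"] = df["Date/Time"].dt.normalize()

# pd.to_datetime turns an object column of dates back into datetime64.
df["Day_converted"] = pd.to_datetime(df["Day_obj"])

print(df.dtypes)
```

The `%Y-%m-%d` format is then only a display concern: a datetime64 column whose times are all midnight is printed as plain dates.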

> how to make them dates

Just to add: Python uses duck typing, so it does not matter much what dtype pandas reports. What matters is that you can call all the date methods on the contents of the column.
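As a rough illustration of that point, grouping works on either kind of column. This is a sketch with made-up sample data; the names `Day_obj`, `Day`, `by_obj` and `by_day` are hypothetical.

```python
import pandas as pd

# Hypothetical sample data; "Date/Time" mirrors the column name in the question.
df = pd.DataFrame(
    {
        "Date/Time": pd.to_datetime(
            ["2022-10-25 08:30", "2022-10-25 17:45", "2022-10-26 09:00"]
        ),
        "y": [1, 2, 3],
    }
)
df["Day_obj"] = df["Date/Time"].dt.date      # object column of datetime.date
df["Day"] = df["Date/Time"].dt.normalize()   # datetime64 column at midnight

# Grouping on the object column works, because datetime.date values
# are hashable and sortable.
by_obj = df.groupby("Day_obj", as_index=False)["y"].sum()

# Grouping on the datetime64 column gives the same sums, and the key
# column is still reported as a datetime dtype in the result.
by_day = df.groupby("Day", as_index=False)["y"].sum()

# The values in the object column are real dates, so date methods work on them.
first = by_obj.loc[0, "Day_obj"]
print(first.year, first.isoformat())
print(by_day.dtypes)
```

If later steps rely on the `.dt` accessor or on resampling, the datetime64 variant is probably the more convenient one to group on.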

maow
  • Thanks maow, you gave me a good tip on how to reframe my issue. I thought that info() not showing a date type might have been my issue, but you've helped me reframe it – Andrew Martin Oct 26 '22 at 07:53