
As I have seen here, iterating over DataFrames is not advisable if you want your code to scale...

So I am importing a .xlsx spreadsheet with a 'Date' column that pandas automatically recognizes as datetime.datetime format.

Here is an example of the code:

import pandas as pd

# Read the spreadsheet; the 'Date' column is parsed as datetimes
df = pd.read_excel('Sheet.xlsx')

# Check the month of the first row only
df['Date'][0].month == 1

Output:
True

If I try df['Date'].month, it raises AttributeError: 'Series' object has no attribute 'month'

Meanwhile, df['Date'] on its own returns a Series in which every element is a datetime.datetime object.

So my question is: how can I get a boolean Series that tests every row against a given month, without iterating over the rows one by one?

I have also considered those methods to select rows with given value(s), but to be honest I am stuck because I'm filtering on objects.

I could be wrong, but I believe it would be much more efficient to iterate only over the month numbers rather than over each row...

Azgrom

1 Answer


Use Series.dt.month. The .dt accessor is necessary to extract datetime attributes from a column:

df['Date'] = pd.to_datetime(df['Date'], errors='coerce')

mask = df['Date'].dt.month == 1

Or use Series.eq for the comparison:

mask = df['Date'].dt.month.eq(1)
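
To see what the mask looks like and how it can be used, here is a minimal sketch on a small hand-made DataFrame. The sample dates and the 'Value' column are assumptions for illustration only and stand in for the contents of Sheet.xlsx; the Series.isin variant for testing several months at once is an addition, not part of the original question.

```python
import pandas as pd

# Hypothetical sample data standing in for the spreadsheet
df = pd.DataFrame({
    "Date": pd.to_datetime(["2020-01-05", "2020-02-10", "2021-01-20", "2020-03-01"]),
    "Value": [10, 20, 30, 40],
})

# Boolean Series: True where the month is January, computed without a Python loop
mask = df["Date"].dt.month == 1

# Use the mask to select the matching rows
january_rows = df[mask]

# Test several months in one vectorized call
jan_or_feb = df["Date"].dt.month.isin([1, 2])

print(mask.tolist())
print(january_rows)
```

The mask has the same index as df, so it can be passed directly to df[...] or df.loc[...] to filter rows.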
jezrael
  • Hey, it worked!! Well... not right away, as I got an `AttributeError: Can only use .dt accessor with datetimelike values` error. But I only needed to run `df['Data'] = pd.to_datetime(df['Data'], errors='coerce')` before using the .dt accessor, and everything worked fine. Thank you! – Azgrom May 11 '20 at 04:34
  • @Azgrom - Interesting, most of the time no conversion is necessary when datetimes are read from Excel ;) – jezrael May 11 '20 at 04:36
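
The error mentioned in the comments can be reproduced with a short sketch. It assumes the column ended up with object dtype because it contains at least one value that is not a date; the sample strings below are hypothetical, and the actual cause in the asker's spreadsheet is not known.

```python
import pandas as pd

# Hypothetical column that was read as plain objects (strings), not datetimes
raw = pd.Series(["2020-01-15", "2020-02-20", "not a date"], dtype=object)

# The .dt accessor refuses to work on non-datetimelike values
try:
    raw.dt.month
    dt_failed = False
except AttributeError:
    dt_failed = True

# errors='coerce' converts what it can and turns everything else into NaT
dates = pd.to_datetime(raw, errors="coerce")

# NaT never equals a month number, so unparseable rows come out as False
mask = dates.dt.month == 1

print(dt_failed)
print(dates.isna().tolist())
print(mask.tolist())
```

Checking df['Date'].dtype before using .dt is a quick way to tell whether the conversion step is needed.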