
I have the following dataset and function to try to detect if the columns are date types.

from dateutil.parser import parse
import pandas as pd

# function
def is_date(string, fuzzy=False):
    try:
        parse(string, fuzzy=fuzzy)
        return True

    except ValueError:
        return False

# data
df = pd.read_csv('https://data.calgary.ca/api/views/78gh-n26t/rows.csv?accessType=DOWNLOAD')

When I call the function on one of the columns, is_date(df['Date']), I get:

TypeError: Parser must be a string or character stream, not Series

How do I properly convert the column into the correct type to be able to loop through all values with the function?

Date column:

0        05/01/2020 12:00:00 AM
1        05/01/2020 12:00:00 AM
2        04/01/2020 12:00:00 AM
3        04/01/2020 12:00:00 AM
4        04/01/2020 12:00:00 AM

Other columns:

Sector     Community Name   Category
NORTHWEST  02E              Assault (Non-domestic)
WEST       ASPEN WOODS      Street Robbery
NORTHWEST  02E              Violence Other (Non-domestic)
NORTH      02K              Theft OF Vehicle
NORTHEAST  10E              Break & Enter - Commercial
bigbounty
Xin

2 Answers


You can do this:

df = pd.DataFrame(your_data_set)  # the class is DataFrame, with a capital F
print(str(df['Date'][0:len(df)]))
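Note that calling str() on a whole Series produces one printed representation of the column, not one string per value. To test each value individually, one option is Series.astype(str) followed by Series.apply. This is a minimal sketch; the sample values below are a small stand-in for the real CSV, not the actual data.

```python
import pandas as pd
from dateutil.parser import parse

def is_date(string, fuzzy=False):
    """Return True if dateutil can parse `string` as a date."""
    try:
        parse(string, fuzzy=fuzzy)
        return True
    except ValueError:
        return False

# Hypothetical sample column standing in for df['Date'].
column = pd.Series([
    "05/01/2020 12:00:00 AM",
    "04/01/2020 12:00:00 AM",
    "NORTHWEST",
])

# astype(str) converts each element to a string; apply() then calls
# is_date once per element instead of once on the whole Series.
checks = column.astype(str).apply(is_date)
all_dates = bool(checks.all())
```

Here checks is a boolean Series with one entry per row, and all_dates is True only when every value in the column parses.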
Seyi Daniel

You can use applymap to achieve this.

In [10]: df
Out[10]:
      Sector Community Name                       Category                    Date
0  NORTHWEST            02E         Assault (Non-domestic)  05/01/2020 12:00:00 AM
1       WEST    ASPEN WOODS                 Street Robbery  05/01/2020 12:00:00 AM
2  NORTHWEST            02E  Violence Other (Non-domestic)  05/01/2020 12:00:00 AM
3      NORTH            02K               Theft OF Vehicle  05/01/2020 12:00:00 AM
4  NORTHEAST            10E     Break & Enter - Commercial  05/01/2020 12:00:00 AM

In [11]: # function
    ...: def is_date(string, fuzzy=False):
    ...:     try:
    ...:         parse(string, fuzzy=fuzzy)
    ...:         return True
    ...:
    ...:     except ValueError:
    ...:         return False
    ...:

In [12]: df = df.astype(str)

In [13]: df[df.columns.tolist()].applymap(is_date)
Out[13]:
   Sector  Community Name  Category  Date
0   False           False     False  True
1   False           False     False  True
2   False           False     False  True
3   False           False     False  True
4   False           False     False  True

In [14]: df[df.columns.tolist()].applymap(is_date).any()
Out[14]:
Sector            False
Community Name    False
Category          False
Date               True
dtype: bool
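On newer pandas versions (2.1 and later), DataFrame.applymap is deprecated in favour of DataFrame.map. The sketch below picks whichever is available; the small frame is a hypothetical stand-in for the real data, and using .all() instead of .any() is a stricter variant that flags a column only when every value parses.

```python
import pandas as pd
from dateutil.parser import parse

def is_date(string, fuzzy=False):
    """Return True if dateutil can parse `string` as a date."""
    try:
        parse(string, fuzzy=fuzzy)
        return True
    except ValueError:
        return False

# Hypothetical sample rows mirroring the columns shown above.
df = pd.DataFrame({
    "Sector": ["NORTHWEST", "WEST"],
    "Category": ["Street Robbery", "Theft OF Vehicle"],
    "Date": ["05/01/2020 12:00:00 AM", "04/01/2020 12:00:00 AM"],
})

as_text = df.astype(str)

# Prefer DataFrame.map where it exists; fall back to applymap on older pandas.
if hasattr(pd.DataFrame, "map"):
    flags = as_text.map(is_date)
else:
    flags = as_text.applymap(is_date)

# A column counts as a date column only if every value parses.
date_columns = flags.all()
```

With .any(), a single parseable value is enough to mark a column as a date column, which may be too lenient for free-text columns.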
bigbounty
  • Thanks, I appreciate the reply. However, I am now getting the error: `TypeError: Parser must be a string or character stream, not float` – Xin Jul 31 '20 at 03:18
  • @Xin you need to convert all columns to string first: `df = df.astype(str)`. I have updated my answer as well. – bigbounty Jul 31 '20 at 03:28
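The float in that error is most likely a missing value: pandas represents empty CSV cells as NaN, which is a float, and dateutil raises TypeError for it instead of ValueError. As an alternative to converting everything to strings, a variant of is_date can catch TypeError as well. This is a sketch under that assumption, with a hypothetical two-value column.

```python
import pandas as pd
from dateutil.parser import parse

def is_date(value, fuzzy=False):
    """Return True if `value` parses as a date; non-strings count as False."""
    try:
        parse(value, fuzzy=fuzzy)
        return True
    except (ValueError, TypeError):
        # ValueError: a string that is not a date.
        # TypeError: a non-string input such as a float NaN.
        return False

# Hypothetical column with one missing value, kept as object dtype.
column = pd.Series(["05/01/2020 12:00:00 AM", float("nan")], dtype=object)

missing_type = type(column.iloc[1]).__name__
result = column.apply(is_date).tolist()
```

With this variant, missing values are simply reported as not being dates, and the original column dtypes are left untouched.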