I have a dataframe created from a CSV where I parse dates in a column named DATE. I am looping through the rows and I want to skip everything that has the year of 1980. Here is my code:

for index, row in df.iterrows():
    if df['DATE'].dt.year == 1980:
        continue

I get the following error:

ValueError: The truth value of a Series is ambiguous. Use a.empty, a.bool(), a.item(), a.any() or a.all().

I can print out the date fine. Any pointers would be appreciated.

MoreScratch
  • The fix is actually `row['DATE'].year == 1980` (a single row value is a scalar Timestamp, so it has no `.dt` accessor), but please vectorise this. What are you actually trying to do? – cs95 Dec 07 '18 at 03:25
  • I am looping through rows and adding them to a database. I just want to skip all the records that occur in 1980. Can you elaborate on what vectorise means? – MoreScratch Dec 07 '18 at 03:30
  • What database? "vectorize" means you don't have to implement a loop; the code takes care of looping internally. – cs95 Dec 07 '18 at 03:30
  • MySQL, but I have to run the rows through a middleware component that does some data enrichment. Still, I would like to see some examples using MySQL if you could point me to some. – MoreScratch Dec 07 '18 at 03:44
  • Please take a look at this: https://stackoverflow.com/questions/16476413/how-to-insert-pandas-dataframe-via-mysqldb-into-database You can call `to_sql` with a connection object; that should be enough. If you have to run your rows through some component, you can use `apply` or some sort of list comprehension. `iterrows` is an anti-pattern with pandas. – cs95 Dec 07 '18 at 03:47
  • Got it! Thanks. – MoreScratch Dec 07 '18 at 03:51
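
For anyone landing here later, the suggestion in the comments can be sketched roughly as below. This is only an illustration under assumptions: the sample data, the `VALUE` column and the `enrich` function are hypothetical stand-ins for the real CSV and the middleware step, which the question does not show.

```python
import pandas as pd

# Hypothetical sample data standing in for the CSV from the question.
df = pd.DataFrame(
    {
        "DATE": pd.to_datetime(["1979-12-31", "1980-06-15", "1981-01-01"]),
        "VALUE": [1, 2, 3],
    }
)

# df["DATE"].dt.year covers the whole column, so comparing it to 1980
# yields a boolean Series (one value per row), which `if` cannot reduce
# to a single True/False. That is what triggers the ValueError.
is_1980 = df["DATE"].dt.year == 1980

# Vectorised filter: keep only the rows whose year is not 1980.
kept = df[~is_1980]


def enrich(record):
    """Hypothetical placeholder for the middleware enrichment step."""
    return {**record, "YEAR": record["DATE"].year}


# Run the per-row step only on the rows that survived the filter.
enriched = [enrich(record) for record in kept.to_dict("records")]

# If an explicit loop is unavoidable, test the scalar value of each row.
looped = [row.VALUE for row in df.itertuples(index=False) if row.DATE.year != 1980]
```

Filtering first means the loop (or `to_sql`, if the enrichment can be skipped) never sees the 1980 rows, so no `continue` is needed.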

0 Answers