I am trying to drop the rows where play_count is greater than 5. When I run the line of code below, the shape of my data appears to increase from 130398 rows × 7 columns to 400730 rows × 7 columns. Does anyone know why?

df_final = df.drop(df[df.play_count > 5].index)

Karl Knechtel
Yara H
    Welcome to Stack Overflow! Please take the [tour]. Are you using Pandas? If you are, please add the tag for it. Next, it'd help to provide a [mre] including a sample of your data. See [How to make good reproducible pandas examples](/q/20109391/4518341) for specifics. For more tips, see [ask]. You can [edit]. – wjandrea May 25 '22 at 03:20
    Oh, and [please don't post pictures of text](https://meta.stackoverflow.com/q/285551/4518341). Instead, copy the text itself and use the formatting tools like [code formatting](/editing-help#code). – wjandrea May 25 '22 at 03:35

1 Answer

The drop() function in Pandas doesn't add rows; it removes them, which is the opposite of what you think you are seeing. (Documentation: https://pandas.pydata.org/docs/reference/api/pandas.DataFrame.drop.html)

My guess is that the number 130398 is the row count of a df_final created in a previous cell. In df_final=df.drop(df[df.play_count> 5].index), you drop rows from the original df, not from the df_final whose row count you observed, so the result is compared against the wrong starting point.
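A minimal sketch of that scenario, using made-up data: the column values and the earlier filtering step are assumptions for illustration, not taken from the question.

```python
import pandas as pd

# Hypothetical stand-in for the original, larger DataFrame.
df = pd.DataFrame({
    "user_id": range(10),
    "play_count": [1, 2, 3, 4, 5, 6, 7, 8, 9, 10],
})

# Hypothetical earlier cell: df_final was already a smaller subset of df.
df_final = df[df.play_count <= 2]
rows_before = len(df_final)

# The line from the question drops from df, not from df_final,
# and then overwrites df_final with the result.
df_final = df.drop(df[df.play_count > 5].index)
rows_after = len(df_final)

# df_final looks like it "grew", but it is still smaller than df.
print(rows_before, rows_after, len(df))
```

The result can never have more rows than df itself, so comparing against df.shape rather than the old df_final.shape shows that rows were removed.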

Try displaying df_final again to confirm that it has the number of rows you expect, then try:

df_final = df_final.drop(df_final[df_final.play_count > 5].index)
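As an alternative sketch, the same filter can be written with a boolean mask instead of drop(). The sample data below is invented, and the equivalence assumes the index is unique and play_count has no missing values.

```python
import pandas as pd

# Hypothetical sample data; only play_count matters for the filter.
df_final = pd.DataFrame({
    "song_id": list("abcdef"),
    "play_count": [1, 7, 5, 12, 3, 6],
})

# Approach from the answer: drop the rows whose play_count exceeds 5.
dropped = df_final.drop(df_final[df_final.play_count > 5].index)

# Equivalent boolean mask: keep the rows whose play_count is at most 5.
kept = df_final[df_final.play_count <= 5]

print(dropped.equals(kept))
print(df_final.shape, kept.shape)
```

Printing the shape before and after, as in the last line, is a quick way to check that the row count went down.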
Fony Lew