-1

I have a DataFrame loaded from a CSV file that contains Covid-19 data by date: the number of cases, deaths, and recoveries on each date. I need to find the date on which the first case and the first death occur. I'm trying data.groupby(['Cases',]).agg({'Date': [np.min]}), but this gives me every distinct case count with the date on which it first appears (as you can see below), when I only need the date of the first case, not counting 0, obviously. Thanks!

Edit: I got the first part. There is also a column for states; how do I get the first case for each state?

                 Date
                 amin
    Cases
    0      2020-02-20
    1      2020-02-20
    2      2020-02-24
    3      2020-02-27
    4      2020-02-26
    ...           ...
    34188  2020-04-02
    36249  2020-04-03
    37584  2020-04-04
    38723  2020-04-05
    40469  2020-04-06
  • Welcome to Stack Overflow. Please take the time to read this post on [how to provide a great pandas example](https://stackoverflow.com/questions/20109391/how-to-make-good-reproducible-pandas-examples) as well as how to provide a [Minimal, Complete, and Verifiable example](https://stackoverflow.com/help/mcve) and revise your question accordingly. These tips on how to ask a good question may also be useful. – yatu Apr 21 '20 at 21:39

2 Answers

1

For a pandas DataFrame df, first filter for rows with Cases > 0, then select the Date column and take its minimum:

    df.loc[df["Cases"] > 0, "Date"].min()
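As a minimal sketch, the same filter can be applied to both cases and deaths. The sample data and the column name Deaths are assumptions made for illustration (the question only shows Cases and Date). Parsing Date with pd.to_datetime is optional here, but it makes the minimum a real date comparison rather than a string comparison.

```python
import pandas as pd

# Hypothetical sample data; the column name "Deaths" is an assumption.
df = pd.DataFrame({
    "Date": ["2020-02-20", "2020-02-21", "2020-02-22", "2020-02-23"],
    "Cases": [0, 1, 3, 5],
    "Deaths": [0, 0, 0, 1],
})
df["Date"] = pd.to_datetime(df["Date"])

# Earliest date with at least one case, and with at least one death.
first_case = df.loc[df["Cases"] > 0, "Date"].min()
first_death = df.loc[df["Deaths"] > 0, "Date"].min()
print(first_case.date(), first_death.date())
```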
kjul
  • This worked, thanks! There is another column for states; how do I get the first case for each state? – user13375737 Apr 21 '20 at 21:48
  • Take the filtered data frame (only cases > 0), group on state, and use min as the aggregation for the date column. – kjul Apr 21 '20 at 21:50
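The per-state approach described in the comment above might look like the following sketch. The sample data and the column name State are assumptions; the real column name in the CSV may differ.

```python
import pandas as pd

# Hypothetical sample data; the column name "State" is an assumption.
data = pd.DataFrame({
    "Date": pd.to_datetime(["2020-02-20", "2020-02-21", "2020-02-22",
                            "2020-02-20", "2020-02-21", "2020-02-22"]),
    "State": ["A", "A", "A", "B", "B", "B"],
    "Cases": [0, 2, 4, 0, 0, 1],
})

# Keep only rows with at least one case, then take the earliest date per state.
first_case_by_state = data[data["Cases"] > 0].groupby("State")["Date"].min()
print(first_case_by_state)
```

Because the filter runs before the grouping, a state that never reports a case does not appear in the result.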
0

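Filter the rows, find the index label of the minimum date, then look up the entire row using loc. Note that idxmin returns an index label, not a position, so iloc would only be correct with a default integer index. It is safest to parse Date as datetime first, for example with pd.to_datetime:

    index = data.loc[data.Cases > 0, 'Date'].idxmin()
    data.loc[index]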
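A short sketch of this lookup, extended to one row per state, is shown below. The sample data, the column name State, and the non-default index are assumptions chosen to show that idxmin returns index labels, which is why loc is used for the lookup.

```python
import pandas as pd

# Hypothetical sample data with a non-default index; "State" is an assumed column name.
data = pd.DataFrame(
    {
        "Date": pd.to_datetime(["2020-02-20", "2020-02-21", "2020-02-22",
                                "2020-02-20", "2020-02-21", "2020-02-22"]),
        "State": ["A", "A", "A", "B", "B", "B"],
        "Cases": [0, 2, 4, 0, 0, 1],
    },
    index=[10, 11, 12, 13, 14, 15],
)

positive = data[data["Cases"] > 0]

# Entire row of the first case overall (label-based lookup).
first_row = data.loc[positive["Date"].idxmin()]

# Entire row of the first case in each state.
first_rows_by_state = data.loc[positive.groupby("State")["Date"].idxmin()]
print(first_row)
print(first_rows_by_state)
```

With this index, data.iloc[11] would raise an IndexError because the frame has only six rows, while data.loc[11] returns the intended row.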
jcaliz