I have a large dataframe with observations repeated each month.

I want to extract the first occurrence of each observation.

For instance, consider the following dataset:

Name date          value 
A.   June 2020.     15
A.   July 2020.     20
A.   August 2020.   10
B.   July 2020.     30
B.   August 2020.   40
C.   August 2020.   5

I want to obtain:

Name date          value
A.   June 2020.     15
B.   July 2020.     30
C.   August 2020.   5
MattDMo
  • Stack Overflow is not a code-writing or tutorial service. Please [edit] your question and post what you have tried so far, including example input, expected output, the actual output (if any), and the **full text** of any errors or tracebacks, all as formatted text in the question itself. Do not post images of text. – MattDMo Oct 16 '20 at 01:43
  • You can import the data into a pandas DataFrame and use pandas.DataFrame.groupby() – skibee Oct 16 '20 at 06:21

2 Answers

The LEAD/LAG window functions in SQL would be suitable here, so you might use SQLite or pandas in Python. For pandas, you could search for "pandas lead" on Google or Stack Overflow, for example: Pandas equivalent of Oracle Lead/Lag function.
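
As a rough sketch of that idea in pandas: shift() within each group plays the role of SQL's LAG/LEAD. Looking backward (a lag) is what identifies a first row, since the first row of each group has no previous value. The DataFrame below is rebuilt from the question's sample, with the trailing dots dropped and the dates parsed as month/year; both are assumptions about the real data.

```python
import pandas as pd

# Sample data from the question (assumed layout: one row per Name per month).
df = pd.DataFrame({
    "Name": ["A", "A", "A", "B", "B", "C"],
    "date": ["June 2020", "July 2020", "August 2020",
             "July 2020", "August 2020", "August 2020"],
    "value": [15, 20, 10, 30, 40, 5],
})

# Parse the month labels so they sort chronologically, not alphabetically.
df["date"] = pd.to_datetime(df["date"], format="%B %Y")
df = df.sort_values(["Name", "date"])

# Rough equivalent of SQL: LAG(date) OVER (PARTITION BY Name ORDER BY date)
prev_date = df.groupby("Name")["date"].shift(1)

# A row is the first occurrence of its Name when there is no previous row.
first_rows = df[prev_date.isna()].reset_index(drop=True)
print(first_rows)
```

For this particular task the window function is more machinery than strictly needed, but the same pattern extends to questions such as "how long since the previous observation".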

ElapsedSoul

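You can try drop_duplicates, which keeps the first row found for each Name (this assumes the rows are already in chronological order):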

df.drop_duplicates(subset=['Name'], keep='first', inplace=True)
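
If the rows are not guaranteed to be in date order, a slightly more defensive sketch is to parse and sort the dates first. The sample below reuses the question's data in shuffled order; the "%B %Y" date format is an assumption based on that sample. The groupby line at the end is one possible reading of the groupby suggestion from the comments, included only as a cross-check.

```python
import pandas as pd

# Question's sample data, deliberately out of chronological order.
df = pd.DataFrame({
    "Name": ["B", "A", "C", "A", "B", "A"],
    "date": ["August 2020", "July 2020", "August 2020",
             "August 2020", "July 2020", "June 2020"],
    "value": [40, 20, 5, 10, 30, 15],
})

# Month names sort alphabetically as strings, so convert to datetimes first.
df["date"] = pd.to_datetime(df["date"], format="%B %Y")

# Sort by date, then keep the earliest row for each Name.
first_seen = (
    df.sort_values("date")
      .drop_duplicates(subset=["Name"], keep="first")
      .sort_values("Name")
      .reset_index(drop=True)
)

# Alternative via groupby: pick the row holding the minimum date per Name.
via_groupby = df.loc[df.groupby("Name")["date"].idxmin()].reset_index(drop=True)

print(first_seen)
```

Returning a new DataFrame rather than using inplace=True keeps the original data available for later steps.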
Sky