Everyone.
I'm new to python and pandas, that I met a problem that I need to check whether a value of a certain columns changed over time(different rows). I totally have no idea that how to solve this problem.
I create a simple sample to illustrate it clearly:
df = pd.DataFrame({"Year":[2011,2012,2013,2014,2011,2014,2014,2014,2011,2012,2012,2015],"Product":['A','A','A','A','B','B','B','B','C','C','C','C'],"Value1":[1,1,1,0,1,1,0,1,0,1,1,1]},index=['A','A','A','A','B','B','B','B','C','C','C','C'])
df
Year Product Value1
A 2011 A 1
A 2012 A 1
A 2013 A 1
A 2014 A 0
B 2011 B 1
B 2014 B 1
B 2014 B 0
B 2014 B 1
C 2011 C 0
C 2012 C 1
C 2012 C 1
C 2015 C 1
I want to check the value change through the years depend on different products, here is the logic to check:
If the value1 is 0 for a product at first, just delete this row until value1 become 1. (e.g. Product C in 2011)
If the value1 for a product is 1, then set the result as 0.(e.g. Product A in 2011 etc.)
If the value1 change to 0 from 1 in a certain year then set the result as 1 and delete other rows of that product within same year (e.g. Product B in 2014)
So the final result should looks like this:
df2 = pd.DataFrame({"Year":[2011,2012,2013,2014,2011,2014,2012,2012,2015],"Product":['A','A','A','A','B','B','C','C','C'],"Value1":[1,1,1,0,1,0,1,1,1],'result':[0,0,0,1,0,1,0,0,0]},index=['A','A','A','A','B','B','C','C','C'])
df2
Year Product Value1 result
A 2011 A 1 0
A 2012 A 1 0
A 2013 A 1 0
A 2014 A 0 1
B 2011 B 1 0
B 2014 B 0 1
C 2012 C 1 0
C 2012 C 1 0
C 2015 C 1 0
I tried to use pd.apply
to solve this problem, but I have no idea how to get another row value and compare to give out a new value.
Can someone help me?