Example data follows:
id sy OC
13693 2017 1
13752 2017 5
13693 2017 4
44555 2018 3
What am I doing wrong in the following code?
SORs.pivot(index='id',columns="sy",values='OC').add_prefix('sy').reset_index()
I have never seen pivoting used in R before, but I am eager to learn once I get past this hurdle.
I wish for the final output to be something like the following:
id sy2017 sy2018
13693 1 na
13752 5 na
13693 4 na
44555 na 3
I adapted it from this Stack Overflow page.
I would also like to sum the values in the cells for any repeated ids (such as 13693).
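For reference, here is a minimal pandas sketch of the result I am after. It assumes the sample rows above stand in for SORs and that duplicate id/sy pairs should be summed. Swapping pivot for pivot_table with aggfunc="sum" is my assumption about how to handle the repeated id, not something I have confirmed on the full data.

```python
import pandas as pd

# Sample rows copied from the example data; SORs is the name used in the question.
SORs = pd.DataFrame({
    "id": [13693, 13752, 13693, 44555],
    "sy": [2017, 2017, 2017, 2018],
    "OC": [1, 5, 4, 3],
})

# pivot() raises a ValueError on this data because the pair (13693, 2017)
# occurs twice; pivot_table() aggregates the duplicates instead.
wide = (
    SORs.pivot_table(index="id", columns="sy", values="OC", aggfunc="sum")
    .add_prefix("sy")
    .reset_index()
)
wide.columns.name = None  # drop the leftover "sy" label on the column axis
print(wide)
```

With this approach the two 13693 rows collapse into a single row whose sy2017 value is their sum, and id/year combinations that never occur are left as NaN.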
Update
First, please let me apologize for mixing R and Python. That was just silly on my part.
I am still having problems with the data even though I used some of the solutions:
This now yields a df with over 200,000 records, but the logic works, and I am ready to spread the columns out.
I tried two different ways but neither worked.
First I tried:
reshape(dat2, idvar="id", timevar="sy", direction="wide")
This yielded a df with only two columns: the first was the subjectkey, and the second was named DistinctOrderCound.2017:2018 and contained nothing but NAs.
Then I tried:
spread(dat2, key = sy, value=value)
This yielded an error about duplicate values for rows, along with a sample listing of the duplicates.
I think reshape should work, and work nicely. I do not think there are any issues with the summation anymore, as I took care of that with a pre-query.