Example data follows:

 id   sy    OC
13693 2017  1
13752 2017  5
13693 2017  4
44555 2018  3

What am I doing wrong in the following code?

SORs.pivot(index='id',columns="sy",values='OC').add_prefix('sy').reset_index()

I have never seen "pivot"ing used within R before, but I am eager to learn, once I get past this hurdle.

I wish for the final output to be something like the following:

 id   sy2017  sy2018
13693 1       na
13752 5       na
13693 4       na
44555 na      3    

I adapted it from this Stack Overflow page.

I am also looking to sum the values in the cells for repeated ids (such as 13693).

Update

First, please let me apologize for mixing R and Python. That was just silly on my part.

I am still having problems with the data even though I used some of the solutions:

Now this yields a df with over 200,000 records - but the logic works, and I am ready to spread the columns out.

I tried two different ways but neither worked.

First I tried:

reshape(dat2, idvar="id", timevar="sy", direction="wide")

All this yielded was a df with two columns. The first was subjectkey, and the second was named DistinctOrderCound.2017:2018 and contained nothing but NAs.

Then I tried:

spread(dat2, key = sy, value=value)

This yielded an error saying there were duplicate values for rows, along with a sample listing of the duplicates.

I think the reshape should work and work nicely. I do not think there are any issues with the summation any more as I took care of that with a pre-query.
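
For readers following along, here is a minimal base R sketch of the sum-then-reshape idea, assuming the four-row example data from the top of the question rather than the real dat2 (whose columns are not shown). The names dat, totals and wide are made up for illustration.

```r
# Example data from the question (not the asker's real data frame).
dat <- data.frame(
  id = c(13693, 13752, 13693, 44555),
  sy = c(2017, 2017, 2017, 2018),
  OC = c(1, 5, 4, 3)
)

# Sum OC within each id/sy pair so every pair appears exactly once.
totals <- aggregate(OC ~ id + sy, data = dat, FUN = sum)

# Long to wide: one row per id, one OC.<year> column per school year.
wide <- reshape(totals, idvar = "id", timevar = "sy", direction = "wide")

# Rename OC.2017 / OC.2018 to sy2017 / sy2018.
names(wide) <- sub("^OC\\.", "sy", names(wide))
print(wide)
```

Summing first matters because a wide table can hold only one value per id/sy cell, so the two 2017 rows for 13693 have to be combined before spreading.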

halfer
Zach
  • You tried to use Python code in R (the question you linked is Python, not R). – Jan Boyer Aug 16 '18 at 22:45
  • *"I have never seen "pivot"ing used within R before"* Pivoting is a *very* common task in R (you will see at least a handful of questions about this every day here on SO); in the R domain it's more commonly known as "spreading" data, or "casting/reshaping data from long to wide". Some of the more popular methods to do this are `tidyr::spread`, `stats::reshape`, `data.table::dcast`. I have never heard of `SORs.pivot` nor do I know which R package this function comes from. I recommend sticking to the more popular packages/methods. – Maurits Evers Aug 16 '18 at 22:45
  • [continued] Take a look at [How to reshape data from long to wide format](https://stackoverflow.com/questions/5890584/how-to-reshape-data-from-long-to-wide-format). – Maurits Evers Aug 16 '18 at 22:48
  • `library(reshape2); dcast(DF, OC + id ~ paste0("sy", sy))[-1]` – markus Aug 16 '18 at 23:01

2 Answers

The R package tidyr provides the spread function for this task. In your case, you could try tidyr::spread(data, sy, OC). Note that spread needs each id/sy combination to appear only once, so repeated ids have to be summed beforehand. For more on tidyr::spread and tidyr::gather, see this blog post.
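
As a rough sketch of that suggestion, assuming the four-row example data from the question and an installed tidyr: the duplicates are summed with base R's aggregate first, and the sy values are relabelled so the new columns come out as sy2017 and sy2018. The names dat, totals and wide are illustrative only.

```r
library(tidyr)

# Example data from the question.
dat <- data.frame(
  id = c(13693, 13752, 13693, 44555),
  sy = c(2017, 2017, 2017, 2018),
  OC = c(1, 5, 4, 3)
)

# Collapse repeated id/sy pairs so spread() sees unique identifiers.
totals <- aggregate(OC ~ id + sy, data = dat, FUN = sum)

# Prefix the year so the spread columns are named sy2017, sy2018.
totals$sy <- paste0("sy", totals$sy)

# Long to wide; id/sy combinations with no data become NA.
wide <- spread(totals, key = sy, value = OC)
print(wide)
```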

Patton

dcast() solves everything. Kind of weird how simple it is.
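
The exact call is not shown in this answer, so the following is only one plausible form, assuming the example data from the question and the reshape2 package. The sy_label column and the names dat and wide are made up for illustration.

```r
library(reshape2)

# Example data from the question.
dat <- data.frame(
  id = c(13693, 13752, 13693, 44555),
  sy = c(2017, 2017, 2017, 2018),
  OC = c(1, 5, 4, 3)
)

# Helper column (hypothetical name) that carries the "sy" prefix.
dat$sy_label <- paste0("sy", dat$sy)

# One row per id, one column per school year, summing repeated ids.
wide <- dcast(dat, id ~ sy_label, value.var = "OC", fun.aggregate = sum)
print(wide)
```

Here fun.aggregate = sum handles the repeated 13693 rows in the same step as the reshape. One thing to watch: with sum as the aggregator, id/sy combinations that have no rows are typically filled with 0 rather than NA, so check the fill argument if missing values need to stay distinguishable.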

Thank you to everyone!

Zach
  • Would you edit this to show how you used `dcast`, so this answer is as useful as possible for future readers? Thank you. – halfer Aug 23 '18 at 09:35