reshape long to wide

Question

I ran a series of regressions for 400,000 IDs. I have the following long dataset (example for one ID):

ID              variable       measure          value
7.301004e+18    (Intercept)    Estimate         1.225463e+02
7.301004e+18    price          Estimate        -1.055974e+02
7.301004e+18    pricepromo     Estimate         3.085680e-01
7.301004e+18    feature        Estimate         1.629105e+00
7.301004e+18    display        Estimate         2.171643e+01
7.301004e+18    trend          Estimate        -1.148725e-02
7.301004e+18    addition_step  Estimate        -4.813033e-01
7.301004e+18    (Intercept)    Std. Error       1.674007e+01
7.301004e+18    price          Std. Error       1.724551e+01
7.301004e+18    pricepromo     Std. Error       2.051796e-01
7.301004e+18    feature        Std. Error       3.010596e+00
7.301004e+18    display        Std. Error       3.580683e+00
7.301004e+18    trend          Std. Error       1.297774e-02
7.301004e+18    addition_step  Std. Error       2.400008e+00
7.301004e+18    (Intercept)    Pr(>|t|)         1.022462e-12
7.301004e+18    price          Pr(>|t|)         1.885259e-09
7.301004e+18    pricepromo     Pr(>|t|)         1.332546e-01
7.301004e+18    feature        Pr(>|t|)         5.886688e-01
7.301004e+18    display        Pr(>|t|)         2.645076e-09
7.301004e+18    trend          Pr(>|t|)         3.765107e-01
7.301004e+18    addition_step  Pr(>|t|)         8.411398e-01

I need to reshape this to wide format so that there is one column for each combination of variable and measure. Variable has 7 levels and measure has 3 levels, so I expect 21 columns.

I use the following code (based on reshape2) to do so:

reshaped <- dcast(mydata, ID ~ measure + variable, value.var = "value")

The problem is that my output has only 3,229 observations left, although the long dataset had 2,885,342 observations.

I also get the following message:

"Aggregation function missing: defaulting to length".

I do not see why I would want to aggregate. I want one observation per unique ID.
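
For context, reshape2 prints this message whenever at least one output cell (one ID crossed with one measure/variable combination) would receive more than one value. It then falls back to counting the values with length. A minimal sketch for finding such cells, using base R on hypothetical toy data (the values and the names `cell_counts` and `dupes` are made up for illustration):

```r
# Toy data (hypothetical values): ID 1 appears with two separate sets of results
mydata <- data.frame(
  ID       = c(1, 1, 1, 1, 2, 2),
  variable = rep("price", 6),
  measure  = c("Estimate", "Std. Error", "Estimate", "Std. Error",
               "Estimate", "Std. Error"),
  value    = c(-105.6, 17.2, -98.1, 15.9, 3.4, 0.8)
)

# Count the rows that fall into each ID/measure/variable cell
cell_counts <- aggregate(value ~ ID + measure + variable,
                         data = mydata, FUN = length)
names(cell_counts)[names(cell_counts) == "value"] <- "n"

# Any cell with n > 1 makes dcast() aggregate
dupes <- cell_counts[cell_counts$n > 1, ]
print(dupes)
```

If `dupes` is non-empty on the real data, the IDs are not unique per measure/variable combination. One possible cause, which is only a guess here, is precision loss: an ID that prints as 7.301004e+18 is stored as a double, and doubles cannot represent every integer above 2^53, so distinct 19-digit IDs may round to the same number. Reading the ID column as character would rule that out.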

You have duplicated IDs. Generate a secondary ID column and try again. — A5C1D2H2I1M1N2O1R2T1, Feb 12 '16 at 16:11
thanks, that seems to be the problem. I created a secondary ID variable, but I still get the same result when I run the code — research111, Feb 12 '16 at 16:35
I hope you changed it to something like: `dcast(mydata, newID + ID ~ measure + variable, value.var = "value")`. — A5C1D2H2I1M1N2O1R2T1, Feb 12 '16 at 16:37
Ok yes, that works. I still have the same number of observations though, how do I get rid of the NAs? — research111, Feb 12 '16 at 16:55
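
On the remaining NAs: if the secondary ID is unique for every row (a plain row number, for example), each long row ends up on its own wide row, with a single filled column and NA everywhere else. A sketch of one alternative, assuming the repeated results for an ID appear in a consistent order, is to number the occurrences within each ID/measure/variable cell instead. The column name `rep` and the toy values are hypothetical:

```r
library(reshape2)

# Toy data (hypothetical values): ID 1 has two sets of results, ID 2 has one
mydata <- data.frame(
  ID       = c(1, 1, 1, 1, 2, 2),
  variable = rep("price", 6),
  measure  = c("Estimate", "Std. Error", "Estimate", "Std. Error",
               "Estimate", "Std. Error"),
  value    = c(-105.6, 17.2, -98.1, 15.9, 3.4, 0.8)
)

# Occurrence counter within each ID/measure/variable cell:
# the first Estimate for price under ID 1 gets 1, the second gets 2, and so on
mydata$rep <- ave(seq_len(nrow(mydata)),
                  mydata$ID, mydata$measure, mydata$variable,
                  FUN = seq_along)

# ID + rep now identifies each output cell uniquely, so nothing is aggregated
wide <- dcast(mydata, ID + rep ~ measure + variable, value.var = "value")
print(wide)
```

With this counter, the k-th Estimate and the k-th Std. Error of an ID land on the same row, so a row only contains NA when a measure/variable combination is genuinely missing for that occurrence.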
