0

I am varily new to R and data wrangling. I would like to calculate a correlation matrix using the cor()-function and for that I have to reshape my data frame. It consists of 1500 different articles (StockCodes) of an online store, with a sale history of 365 days per article (Quantity per Date per StockCode).

I need to reshape it in the form that the Dates become the new row names and the StockCodes become the new column names.

Concider this short example:

my_df <- data.frame(Stockcode = c("A","A", "B","B", "C", "C"), Quantity = c(1,5,3,2,1,4), Date = c("2010-12-01","2010-12-02","2010-12-01","2010-12-02","2010-12-01","2010-12-02") )

looking like this:

  Stockcode Quantity       Date
1         A        1 2010-12-01
2         A        5 2010-12-02
3         B        3 2010-12-01
4         B        2 2010-12-02
5         C        1 2010-12-01
6         C        4 2010-12-02

And I want it to be transformed into:

df_reshaped <- data.frame(A = c(1,5), B = c(3,2), C = c(1,4), row.names = c("2010-12-01","2010-12-02"))

looking like this:

           A B C
2010-12-01 1 3 1
2010-12-02 5 2 4

I achieved this with a for-loop (and successfully calculated my correlation matrix), but the loop took "ages" to be executed (approx. 4 hours).

Is there a proper and faster way?

I would highly appreciate any help!

Sten
  • 3
  • 1
  • Try `reshape(my_df, idvar = "Date", timevar = "Stockcode", direction = "wide", new.row.names = unique(my_df$Date))[, -1]` or maybe `unstack(my_df[, 1:2], Quantity ~ Stockcode)` – markus Jan 17 '20 at 15:36
  • Thank you! "unstack" works and is very fast - it alters the column names a little and drops the invoice date/replaces it with id. But for the correlation matrix this does not matter. :-) – Sten Jan 17 '20 at 15:47

1 Answers1

0

Here is a way using pivot_wider from tidyr:

my_df %>%
   pivot_wider(names_from = Stockcode, values_from = Quantity) %>% ## pivot columns in wide format
   column_to_rownames(var = "Date") ## convert Date column to row names

#           A B C
#2010-12-01 1 3 1
#2010-12-02 5 2 4
sm925
  • 2,648
  • 1
  • 16
  • 28