How do I transform a data frame to the desired structure and form?

Question

My data is in an .xlsx file that contains a pivot table. The file has several sheets, but I need only one of them for my analysis. On that sheet I have a data frame that looks like this:

df <- data.frame(ind = c("ind1", "ind1", "ind1", "ind1", 
                         "ind2", "ind2", "ind2", "ind2",
                         "ind3", "ind3", "ind3", "ind3",
                         "ind4", "ind4", "ind4", "ind4"),
                 shr = c(-0.23, 0, 0.12, 0.68,
                         -0.54, 0.80, 0.14, -0.23,
                          0.48, 0.94, -0.01, 0.31,
                          0.18, 0.11, 0.98, 0.05))

There are also other columns with different types of data, but I don't need them, only the two shown in the example. So df is:

df
#    ind   shr
#1  ind1 -0.23
#2  ind1  0.00
#3  ind1  0.12
#4  ind1  0.68
#5  ind2 -0.54
#6  ind2  0.80
#7  ind2  0.14
#8  ind2 -0.23
#9  ind3  0.48
#10 ind3  0.94
#11 ind3 -0.01
#12 ind3  0.31
#13 ind4  0.18
#14 ind4  0.11
#15 ind4  0.98
#16 ind4  0.05

What I need is to transform this dataframe to this form:

df
#      shr
# ind1 -0.23 0.00 0.12 0.68
# ind2 -0.54 0.80 0.14 -0.23
# ind3 .....
# ind4 .....

It would also be convenient if my data looked like this:

df
# ind1   ind2   ind3   ind4
# -0.23   .      .
#  0.00   .      .
#  0.12   .      .
#  0.68   .      .

In short, I want to make my data compact and convenient for further analysis. The main difficulties are that my source file is an .xlsx with several sheets and a pivot table.

(1) How do I extract data from an .xlsx file with several sheets? (2) How can I get the desired df structure?

Possible duplicate : https://stackoverflow.com/questions/11322801/transpose-reshape-dataframe-without-timevar-from-long-to-wide-format — Ronak Shah, May 14 '20 at 12:20

score 2 · Accepted Answer · answered May 14 '20 at 12:17

Here's how to transform your data. pivot_wider from tidyr needs a column that uniquely identifies each output row; here I create one using mutate(row = row_number()). To read the data from Excel, I suggest the readxl package. Its read_xlsx function lets you specify the sheet and the cell range.

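library(dplyr)
library(tidyr)  # pivot_wider() comes from tidyr

df %>%
  group_by(ind) %>%
  mutate(row = row_number()) %>%  # position of each value within its ind group
  ungroup() %>%
  pivot_wider(names_from = ind, values_from = shr) %>%
  select(-row)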

# A tibble: 4 x 4
   ind1  ind2  ind3  ind4
  <dbl> <dbl> <dbl> <dbl>
1 -0.23 -0.54  0.48  0.18
2  0     0.8   0.94  0.11
3  0.12  0.14 -0.01  0.98
4  0.68 -0.23  0.31  0.05
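
For the first part of the question (reading one sheet out of several), a minimal sketch with readxl might look like the following. So that it runs as-is, it uses the example workbook that ships with readxl rather than the asker's file; the sheet name "mtcars" and the range "A1:B11" are placeholders to be replaced with the sheet and cells that hold ind and shr.

```r
library(readxl)

# Example workbook bundled with readxl; substitute the path to your own .xlsx
path <- readxl_example("datasets.xlsx")

# List the sheet names to find the one you need
sheets <- excel_sheets(path)

# Read only two columns from one sheet: a header row plus 10 data rows
dat <- read_xlsx(path, sheet = "mtcars", range = "A1:B11")
```

Restricting the range at read time means the unwanted columns never enter the data frame, so no later select() is needed.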

score 1 · Answer 2 · answered May 14 '20 at 12:16

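You can use the code below. dcast is not part of base R; loading it from the reshape2 package is assumed here (data.table provides a dcast as well).

library(reshape2)

# Row number modulo 4 gives 1, 2, 3, 0 within each group of four rows,
# so the column labelled 0 holds the fourth value of each ind.
df$remainder <- seq_len(nrow(df)) %% 4

dcast(df, ind ~ remainder, value.var = "shr")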

>   ind     0     1    2     3
1 ind1  0.68 -0.23 0.00  0.12
2 ind2 -0.23 -0.54 0.80  0.14
3 ind3  0.31  0.48 0.94 -0.01
4 ind4  0.05  0.18 0.11  0.98

dcast(df,remainder~ind, value.var = "shr" )

>  remainder  ind1  ind2  ind3 ind4
1         0  0.68 -0.23  0.31 0.05
2         1 -0.23 -0.54  0.48 0.18
3         2  0.00  0.80  0.94 0.11
4         3  0.12  0.14 -0.01 0.98

`dcast(setDT(df), ind~rowid(ind), value.var = 'shr')` – Ronak Shah, May 14 '20 at 12:19
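
The one-liner in the comment above can be expanded into a self-contained sketch. It assumes the data.table package is installed; rowid(ind) numbers the rows within each ind, so no helper column is needed, and the result has one row per ind, which is the first layout asked for in the question.

```r
library(data.table)

# Same data as in the question
df <- data.frame(ind = rep(paste0("ind", 1:4), each = 4),
                 shr = c(-0.23, 0, 0.12, 0.68,
                         -0.54, 0.80, 0.14, -0.23,
                          0.48, 0.94, -0.01, 0.31,
                          0.18, 0.11, 0.98, 0.05))

# setDT() converts df to a data.table by reference;
# rowid(ind) supplies the within-group position used as column names
wide <- dcast(setDT(df), ind ~ rowid(ind), value.var = "shr")
```

The value columns are named "1" to "4", following the order in which the values appear within each ind.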
