I have a data structure like the following:

stat_names  values
stat1       0.3
stat2       0.5
stat_x      ...
stat16      0.8

I would like to reshape it so that each consecutive block of 4 rows is moved into a new column:

  A     B     C      D
stat1 stat5 stat9  stat13
stat2 stat6 stat10 stat14
 ...   ...   ...    ...

Here "A", "B", "C", "D" come from a vector of column names that the user defines in advance.

Although this is trivial to do by hand in Excel, I would like to do this in R with a script that can be iterated across multiple inputs.

  • You could do `df1 %>% group_by(grp = as.integer(gl(nrow(.), 4, nrow(.)))) %>% mutate(n = row_number()) %>% spread(stat_names, values) %>% ungroup %>% select(-grp)` – akrun Jun 20 '18 at 15:19
  • I would assume that you want the values instead of the 'stat1', 'stat2', – akrun Jun 20 '18 at 15:21
  • Thanks. But there is an issue with this command, as it generates a column for each stat, where in each column all but one value (the corresponding one) are NA. –  Jun 20 '18 at 15:27
  • I didn't have a reproducible example – akrun Jun 20 '18 at 15:27
  • You should create a reproducible example, including desired output. Otherwise, we can't test code to see if it works for your purposes. Some guidance: https://stackoverflow.com/questions/5963269/how-to-make-a-great-r-reproducible-example/28481250#28481250 – Frank Jun 20 '18 at 15:29

1 Answer

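Here, I reshape the stat_names column into a matrix with 4 rows (matrix() fills column by column, so each block of 4 values becomes one column) and then convert that matrix back to a data frame to produce the desired result.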

# Dummy data frame
df <- data.frame(stat_names = paste("stat", 1:16, sep = " "),
                 values = runif(16))

# Cast as matrix with 4 rows and then as a data frame
df <- as.data.frame(matrix(df$stat_names, nrow = 4))

# Rename columns
names(df) <- LETTERS[1:ncol(df)]

#        A      B       C       D
# 1 stat 1 stat 5  stat 9 stat 13
# 2 stat 2 stat 6 stat 10 stat 14
# 3 stat 3 stat 7 stat 11 stat 15
# 4 stat 4 stat 8 stat 12 stat 16

If you actually want the values rather than the stat names in this layout, change df$stat_names to df$values in the matrix() call.
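Since the question asks for user-defined column names and a script that can be run over several inputs, the same idea could be wrapped in a small function. This is only a sketch: the function name `reshape_blocks`, its arguments, and the two example inputs are made up for illustration, and it assumes the number of rows is an exact multiple of the block size.

```r
# Hypothetical helper (not part of the original answer): reshape a vector so
# that each consecutive block of `block_size` elements becomes one column.
reshape_blocks <- function(x, block_size, col_names) {
  # Assumption: the input length is an exact multiple of the block size
  stopifnot(length(x) %% block_size == 0)
  # matrix() fills column by column, so each block lands in its own column
  out <- as.data.frame(matrix(x, nrow = block_size), stringsAsFactors = FALSE)
  # Assumption: the caller supplies exactly one name per resulting column
  stopifnot(length(col_names) == ncol(out))
  names(out) <- col_names
  out
}

# Two made-up inputs with the same layout as the data in the question
inputs <- list(
  first  = data.frame(stat_names = paste0("stat", 1:16), values = runif(16)),
  second = data.frame(stat_names = paste0("stat", 1:16), values = runif(16))
)

# Apply the same reshape to every input, using user-defined column names
results <- lapply(inputs, function(d) {
  reshape_blocks(d$values, block_size = 4, col_names = c("A", "B", "C", "D"))
})
```

`results` is then a list with one reshaped data frame per input. The two `stopifnot()` checks are there so that an input with an unexpected number of rows, or a column-name vector of the wrong length, fails early instead of being silently recycled by `matrix()`.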

Dan
  • A similar idea: `data.frame(split(df$values, cut(1:nrow(df), 4, labels = LETTERS[1:4])))` – Frank Jun 20 '18 at 15:32