1

In this data, each person gave each of 3 items a score in columns 4-6.

> Sample.Score
  V1 V2 V3 V4 V5 V6
1  A  B  C 45 78 39
2  E  F  G 12 42 93
3  E  H  B 23 85 35
4  H  C  F 23 12 64

How do I convert it to 2 columns where it contains scores of all possible items, e.g.

> Sample.Score2
  V1 V2
1  A 45
2  B 78
3  C 39
4  E 12

There are repeating items, but I don't want to sum them up first.

Brian Tompsett - 汤莱恩
  • 5,753
  • 72
  • 57
  • 129
Saftever
  • 685
  • 6
  • 9

3 Answers3

1

We could subset the dataset, transpose it, and convert to vector with c to create the columns of new dataset

data.frame(V1 = c(t(df[1:3])), V2 = c(t(df[4:6])))
#    V1 V2
#1   A 45
#2   B 78
#3   C 39
#4   E 12
#5   F 42
#6   G 93
#7   E 23
#8   H 85
#9   B 35
#10  H 23
#11  C 12
#12  F 64
akrun
  • 874,273
  • 37
  • 540
  • 662
0

In the following, I am simply using lapply twice. Although the letters are in column-wise order (unlike your example output), the letters have the correct value in the same row.

data.frame(V1 = unlist(lapply(df[, 1:3], cbind)),
           V2 = unlist(lapply(df[, 4:6], cbind)))

#      V1 V2
# V11  A 45
# V12  E 12
# V13  E 23
# V14  H 23
# V21  B 78
# V22  F 42
# V23  H 85
# V24  C 12
# V31  C 39
# V32  G 93
# V33  B 35
# V34  F 64

Data used

df <- read.table(text="V1 V2 V3 V4 V5 V6
     1  A  B  C 45 78 39
     2  E  F  G 12 42 93
     3  E  H  B 23 85 35
     4  H  C  F 23 12 64", head =TRUE, stringsAsFactors = FALSE)
kangaroo_cliff
  • 6,067
  • 3
  • 29
  • 42
0

Here are several options, all with different ways of going from wide to long data. Each time, you need to do some operation twice, because you are turning the scores into long data, and turning the letters into long data. Beyond that, it's a matter of preference how you want to write your code.

First is with dplyr and tidyr, where you can use gather and pipe your data through in a single statement. The select lets you change the names to V1 and V2.

library(tidyverse)

df_long_gather <- df %>%
  gather(key = key1, value = letter, V1:V3) %>%
  gather(key = key2, value = score, V4:V6) %>%
  select(V1 = letter, V2 = score) %>%
  arrange(V1)
head(df_long_gather)
#> # A tibble: 6 x 2
#>   V1       V2
#>   <chr> <int>
#> 1 A        45
#> 2 A        78
#> 3 A        39
#> 4 B        45
#> 5 B        23
#> 6 B        78

You can also use melt from reshape2. You can do this in two steps (first version), or nest one call inside another to do in one step (second version). You can also use the %>% pipes to use both melts in a single statement.

Two steps:

library(reshape2)
melted1 <- melt(df, id.vars = c("V1", "V2", "V3"), 
    measure.vars = c("V4", "V5", "V6"), variable.name = "key1", 
    value.name = "score")
melted2 <- melt(melted1, measure.vars = c("V1", "V2", "V3"), 
    variable.name = "key2", value.name = "V1")
df_long_melt <- data.frame(V1 = melted2$V1, V2 = melted2$score)
head(df_long_melt)
#>   V1 V2
#> 1  A 45
#> 2  E 12
#> 3  E 23
#> 4  H 23
#> 5  A 78
#> 6  E 42

One step:

df_long_melt2 <- melt(
    melt(df, measure.vars = c("V1", "V2", "V3"), value.name = "key1"),
    measure.vars = c("V4", "V5", "V6"), value.name = "key2")[, c(2, 4)]
names(df_long_melt2) <- c("V1", "V2")
head(df_long_melt2)
#>   V1 V2
#> 1  A 45
#> 2  E 12
#> 3  E 23
#> 4  H 23
#> 5  B 45
#> 6  F 12
camille
  • 16,432
  • 18
  • 38
  • 60