I have this data table with two columns:

  1. gender - male or female
  2. score - between 0 and 100

I want to create a new table with 2 columns:

  1. male
  2. female

Each row should hold a score. How can I do this in R?


Martin Gal
Lilach
  • From @TylerGroves: *Can you tell us more about the table itself, e.g. what datatype, as well as more about how it is structured? It is difficult to tell from your post.* – r2evans Aug 26 '21 at 16:54
  • FYI, your recent edit (replacing sample data with loose descriptions) actually lowered the quality and reproducibility of the question. Making it easy for people to answer is fairly important, as it will give you more relevant answers, faster. I suggest you "rollback" your edit or add the sample data back into the question. Please read the following for good discussions on *reproducibility* in SO questions: https://stackoverflow.com/q/5963269, [mcve], and https://stackoverflow.com/tags/r/info. Thank you! – r2evans Aug 26 '21 at 17:05
  • Please do not post an image of code/data/errors: it breaks screen-readers and it cannot be copied or searched (ref: https://meta.stackoverflow.com/a/285557 and https://xkcd.com/2116/). Please just include the code, console output, or data (e.g., `data.frame(...)` or the output from `dput(head(x))`) directly. – r2evans Aug 26 '21 at 17:12
  • Please provide enough code so others can better understand or reproduce the problem. – Community Aug 27 '21 at 04:41

2 Answers


Up front: the rn column I add provides an "id" of sorts, so that the result has exactly one row per "id". There might be ways around it, but it really simplifies things, both in execution and in visualizing what is happening (when you look at the long and wide versions with rn still present). I remove the column in both answers ([,-1] and %>% select(-rn)) since it's just a transient column.
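To make the role of rn concrete, here is a minimal sketch of the long data with the counter attached. It rebuilds the sample data from the Data section below (minus rn) so that it runs on its own; the values are taken from that section and nothing else is assumed.

```r
# Sample data from the Data section, without the rn column
dat <- data.frame(
  score  = c(100L, 90L, 98L, 80L, 75L, 100L),
  gender = c("male", "female", "female", "male", "male", "female")
)

# Number the rows within each gender, in order of appearance
dat$rn <- ave(seq_len(nrow(dat)), dat$gender, FUN = seq_along)
dat
#   score gender rn
# 1   100   male  1
# 2    90 female  1
# 3    98 female  2
# 4    80   male  2
# 5    75   male  3
# 6   100 female  3
```

Each rn value appears once per gender, so rows sharing an rn can be placed side by side in the wide result.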

base R

dat$rn <- ave(seq_len(nrow(dat)), dat$gender, FUN = seq_along)
reshape(dat, timevar = "gender", idvar = "rn", direction = "wide", v.names = "score")[,-1]
#   score.male score.female
# 1        100           90
# 3         80           98
# 5         75          100

dplyr

library(dplyr)
library(tidyr) # pivot_wider
dat %>%
  group_by(gender) %>%
  mutate(rn = row_number()) %>%
  pivot_wider(id_cols = rn, names_from = gender, values_from = score) %>%
  select(-rn)
# # A tibble: 3 x 2
#    male female
#   <int>  <int>
# 1   100     90
# 2    80     98
# 3    75    100

Data

dat <- structure(list(score = c(100L, 90L, 98L, 80L, 75L, 100L), gender = c("male", "female", "female", "male", "male", "female"), rn = c(1L, 1L, 2L, 2L, 3L, 3L)), row.names = c(NA, -6L), class = "data.frame")
r2evans

We can use unstack in base R

unstack(df1, score ~ gender)
  female male
1     90  100
2     98   80
3    100   75

data

df1 <- structure(list(score = c(100L, 90L, 98L, 80L, 75L, 100L), gender = c("male", 
"female", "female", "male", "male", "female")), class = "data.frame", row.names = c(NA, 
-6L))
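One caveat worth keeping in mind: unstack only returns a data.frame when every group has the same number of rows; with unequal groups it returns a list instead. A minimal sketch of one possible workaround, padding the shorter group with NA, is shown below. The data frame df2 and the variable names are hypothetical, made up for illustration, and are not part of the question.

```r
# Hypothetical data with unequal group sizes: 3 female scores, 2 male scores
df2 <- data.frame(
  score  = c(100L, 90L, 98L, 80L, 100L),
  gender = c("male", "female", "female", "male", "female")
)

# Split the scores by gender into a named list of vectors
groups <- split(df2$score, df2$gender)

# Extend every vector to the longest length; the extra slots become NA
n <- max(lengths(groups))
wide <- as.data.frame(lapply(groups, `length<-`, n))
wide
#   female male
# 1     90  100
# 2     98   80
# 3    100   NA
```

With equal group sizes this produces the same table as unstack(df1, score ~ gender).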
akrun