I have synthetic data in a 2*4 array with 500 observations:

datax = array(c(120, 181, 50, 43, 41, 33,24,8), dim=c(2,4))
dimnames(datax) = list(gender= c('male', 'female')
                    , punishment = c('None', 'Community_service', 'Youth_prison', 'Normal_prison'))

I'd like to produce a data.frame from the table that represents the "source" of the frequency table.

I can represent it with a "Freq" column via as.data.frame(as.table(datax)), but I'd like to produce a data.frame with 500 rows and 2 columns (gender, punishment), one row per observation.

How would I do this in R?

ben_aaron

2 Answers

Try this:

long <- as.data.frame.table(datax)
longer <- long[rep(1:nrow(long), long$Freq), -3]
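
Why this works: rep(1:nrow(long), long$Freq) repeats each row index as many times as its count, and -3 drops the Freq column. A minimal sketch that checks the result, assuming datax is defined as in the question (the name roundtrip is only illustrative):

```r
datax <- array(c(120, 181, 50, 43, 41, 33, 24, 8), dim = c(2, 4))
dimnames(datax) <- list(
  gender = c("male", "female"),
  punishment = c("None", "Community_service", "Youth_prison", "Normal_prison")
)

long <- as.data.frame.table(datax)
# Repeat each row index Freq times, then drop the third (Freq) column.
longer <- long[rep(seq_len(nrow(long)), long$Freq), -3]

# Tabulating the expanded rows should give back the original counts.
roundtrip <- table(longer)
```

If the expansion is right, nrow(longer) equals sum(datax) and roundtrip matches datax cell by cell.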
G. Grothendieck

Using dplyr:

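library(dplyr)

as.data.frame.table(datax) %>%
    rowwise() %>%
    # name the columns explicitly; otherwise data.frame() derives awkward names from the expressions
    do(data.frame(gender = rep(.$gender, .$Freq), punishment = .$punishment))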

For every row of the frequency table, this builds a small data frame in which that row's gender/punishment combination is repeated Freq times, then binds all of the pieces into a single data frame.
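
The same build-and-bind idea can be sketched in base R without dplyr, which may make the steps easier to follow. This is only an illustration, not the code from the answer above; pieces and expanded are hypothetical names, and datax is assumed to be defined as in the question:

```r
datax <- array(c(120, 181, 50, 43, 41, 33, 24, 8), dim = c(2, 4))
dimnames(datax) <- list(
  gender = c("male", "female"),
  punishment = c("None", "Community_service", "Youth_prison", "Normal_prison")
)

long <- as.data.frame.table(datax)

# Build one small data frame per row of the frequency table ...
pieces <- lapply(seq_len(nrow(long)), function(i) {
  data.frame(
    gender = rep(long$gender[i], long$Freq[i]),
    punishment = rep(long$punishment[i], long$Freq[i])
  )
})

# ... then stack the pieces into a single data frame.
expanded <- do.call(rbind, pieces)
```

For a table this small the per-row approach is fine; the index-repetition approach in the other answer avoids creating the intermediate pieces.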

Konrad Rudolph