
I have a data frame with a post-stratification survey weight variable “n”. For my analysis, I need a version of this data frame in which each row represents a single observation.

How do I get a new (i.e., long) dataframe?

structure(list(grupo = structure(c(1L, 1L, 1L, 1L), .Label = c("A", 
"B"), class = "factor"), Categorias = structure(c(1L, 1L, 2L, 
2L), .Label = c("7 a 10 anos", "11 a 15 anos", "16 a 21 anos", 
"Adulto"), class = "factor"), Atuacao = structure(c(1L, NA, 1L, 
3L), .Label = c("Não", "NA", "Sim"), class = "factor"), n = c(16L, 
2L, 14L, 2L)), row.names = c(NA, -4L), class = c("tbl_df", "tbl", 
"data.frame"))
Cristiano
  • What's your desired result? Show it as a benchmark. – Peace Wang Apr 30 '21 at 13:33
  • The weight column "n" would be converted into repeated rows: each row would be repeated the number of times given in that column. For example, the combination "A, 7 a 10 anos, Não" will have 16 rows, the combination "A, 7 a 10 anos, NA" will have 2 rows, and so on. The resulting df will be in long form. – Cristiano Apr 30 '21 at 13:53

1 Answer


Is this your desired long format?

It's not the common long format I have in mind (columns: variable names + value).

So what you are really asking is how to repeat each row the number of times given in df$n. See the related question "Repeat rows of a data.frame N times".

df[rep(1:NROW(df),df$n),]
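To see why this one-liner works, it may help to look at the index vector on its own. The sketch below uses a made-up weight vector `w` (not from the question) and shows that `rep()` with a vector of `times` repeats the i-th index `w[i]` times; indexing the data frame with that vector then repeats the rows.

```r
# Hypothetical weights, kept small so the result is easy to read
w <- c(3L, 1L, 2L)

# rep() with a vector of times repeats the i-th element w[i] times
idx <- rep(seq_along(w), times = w)
print(idx)
#> [1] 1 1 1 2 3 3
```

Note that, per the `rep` documentation, non-integer values of `times` are truncated towards zero, so fractional survey weights would need to be rounded first.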

Data:

df <- structure(list(grupo = structure(c(1L, 1L, 1L, 1L), .Label = c("A", 
"B"), class = "factor"), Categorias = structure(c(1L, 1L, 2L, 
2L), .Label = c("7 a 10 anos", "11 a 15 anos", "16 a 21 anos", 
"Adulto"), class = "factor"), Atuacao = structure(c(1L, NA, 1L, 
3L), .Label = c("Não", "NA", "Sim"), class = "factor"), n = c(16L, 
2L, 14L, 2L)), row.names = c(NA, -4L), class = c("tbl_df", "tbl", 
"data.frame"))

df[rep(1:NROW(df),df$n),]
#> # A tibble: 34 x 4
#>    grupo Categorias  Atuacao     n
#>    <fct> <fct>       <fct>   <int>
#>  1 A     7 a 10 anos Não        16
#>  2 A     7 a 10 anos Não        16
#>  3 A     7 a 10 anos Não        16
#>  4 A     7 a 10 anos Não        16
#>  5 A     7 a 10 anos Não        16
#>  6 A     7 a 10 anos Não        16
#>  7 A     7 a 10 anos Não        16
#>  8 A     7 a 10 anos Não        16
#>  9 A     7 a 10 anos Não        16
#> 10 A     7 a 10 anos Não        16
#> # … with 24 more rows

Created on 2021-04-30 by the reprex package (v2.0.0)
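Since the expanded rows still carry the weight column, you may want to drop it and check that the row count matches the sum of the weights. Here is a minimal base R sketch of that; `df_small` and `long` are illustrative names, and the values are made up rather than taken from the question.

```r
# Toy data shaped like the question's data (values are made up)
df_small <- data.frame(
  grupo   = c("A", "A", "B"),
  Atuacao = c("Não", NA, "Sim"),
  n       = c(3L, 1L, 2L),
  stringsAsFactors = TRUE
)

# Repeat each row n times and keep every column except the weight
long <- df_small[rep(seq_len(nrow(df_small)), times = df_small$n),
                 setdiff(names(df_small), "n"), drop = FALSE]
rownames(long) <- NULL  # plain data frames otherwise get names like "1.1"

# Sanity checks: one row per observation, counts match the weights
stopifnot(nrow(long) == sum(df_small$n))
print(table(long$grupo, long$Atuacao, useNA = "ifany"))
```

If tidyr is available, `tidyr::uncount(df, n)` should give the same expansion in one call and removes the weight column by default.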

Peace Wang