I have a list of data frames in which each list element holds one family. For each family, I would like to count how often each category in the Statepath column occurs.

In the example below I have two families, and I am trying to get a table with the frequency of each Statepath category (233, 434, 323, etc.) in each family.

My input:

List <- 
'$`1`
Chr  Start   End Family Statepath
1   187546286   187552094   father  233
3   108028534   108032021   father  434
1   4864403 4878685 mother  323 
1   18898657    18904908    mother 322
2   460238  461771  offspring   322
3   108028534   108032021   offspring   434
$`2`
Chr  Start   End Family Statepath
1   71481449    71532983    father  535
2   74507242    74511395    father  233
2   181864092   181864690   mother  322
1   71481449    71532983    offspring   535
2   181864092   181864690   offspring   322
3   160057791   160113642   offspring   335'

Thus, my expected output Freq_statepath would look like:

Freq_statepath <- 'Statepath    Family_1    Family_2
233 1   1
434 2   0
323 1   0
322 2   2
535 0   2
335 0   1'
Tfg1005
  • Please use `dput` to provide us with your `list` so that this question is more [reproducible](http://stackoverflow.com/questions/5963269/how-to-make-a-great-r-reproducible-example). – bouncyball Nov 15 '16 at 15:48

1 Answer

I think you want something like this:

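# Toy data: one data frame per family
test <- list(data.frame(Statepath = c(233, 434, 323, 322, 322)), data.frame(Statepath = c(434, 323, 322, 322)))
# Count each Statepath value within each family
list_tables <- lapply(test, function(x) data.frame(table(x$Statepath)))
# Full outer join of the per-family counts on the Statepath value (Var1)
final_result <- Reduce(function(...) merge(..., by = "Var1", all = TRUE), list_tables)
# A Statepath missing from a family is NA after the merge; treat it as a count of 0
final_result[is.na(final_result)] <- 0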

> test
[[1]]
  Statepath
1       233
2       434
3       323
4       322
5       322

[[2]]
  Statepath
1       434
2       323
3       322
4       322

> final_result
  Var1 Freq.x Freq.y
1  233      1      0
2  322      2      2
3  323      1      1
4  434      1      1
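
To get closer to the column names requested in the question (Statepath, Family_1, Family_2), one possible variation is to stack the families into one long data frame and cross-tabulate it. This is only a sketch: `fam_list` is a hypothetical stand-in for the question's list that keeps just the Statepath column, and `long` and `counts` are illustrative names, not part of the original code.

```r
# Hypothetical stand-in for the question's list (Statepath column only)
fam_list <- list(
  `1` = data.frame(Statepath = c(233, 434, 323, 322, 322, 434)),
  `2` = data.frame(Statepath = c(535, 233, 322, 535, 322, 335))
)

# Stack all families into one long data frame, tagging each row with its family
long <- do.call(rbind, lapply(names(fam_list), function(id) {
  data.frame(Family = paste0("Family_", id),
             Statepath = fam_list[[id]]$Statepath)
}))

# Cross-tabulate: rows are Statepath values, columns are families.
# Combinations that never occur are counted as 0, so no NA handling is needed.
counts <- table(long$Statepath, long$Family)

# Turn the table into a plain data frame with an explicit Statepath column
Freq_statepath <- as.data.frame.matrix(counts)
Freq_statepath <- cbind(Statepath = rownames(Freq_statepath), Freq_statepath)
rownames(Freq_statepath) <- NULL
```

Because `table` fills in zeros itself, this avoids the merge and the NA replacement. The rows come out sorted by Statepath value rather than in the order shown in the question, and it generalizes to more than two families without the duplicated `Freq.x`/`Freq.y` column names that repeated merging would produce.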
Tobias Dekker