
I want to create a single contingency table from three columns of a data frame using the xtabs() function in R. The code below works fine for two columns:

xtabs(~B + C, data = theData) #contingency table for two columns

but when I add one more attribute, I get an error:

xtabs(~B + C + mean(A), data = theData)

Error in model.frame.default(formula = ~B + C +  : 
  variable lengths differ (found for 'mean(A)')

For example, for the data frame below

A   B   C
1   b1  c1
2   b1  c1
3   b1  c2
1   b1  c2
4   b2  c2
7   b2  c1

The output should be like:

B   C   A
b1  c1  1.5
    c2  2.0
b2  c1  7.0
    c2  4.0

What is the right way to create a table with the mean of one column for each combination of values in the other two columns? Thank you.

Helen Grey

1 Answer


We can use xtabs after summarising the data with aggregate:

xtabs(A ~ B + C , data = aggregate(A ~ B + C, theData, FUN = mean))
#   C
#B     c1  c2
#  b1 1.5 2.0
#  b2 7.0 4.0
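
For a reproducible check, here is a minimal sketch that rebuilds the example data from the question and runs the same call. The data.frame() construction and the intermediate name `means` are assumptions added for illustration; only `theData` and its columns come from the question. It also shows why the original attempt failed: mean(A) collapses the column to a single number, which cannot be lined up with the six values of B and C.

```r
# Rebuild the example data from the question (construction is an assumption)
theData <- data.frame(
  A = c(1, 2, 3, 1, 4, 7),
  B = c("b1", "b1", "b1", "b1", "b2", "b2"),
  C = c("c1", "c1", "c2", "c2", "c2", "c1")
)

# mean(A) is a single value, while B and C each have one value per row,
# hence the "variable lengths differ" error in the original formula
length(mean(theData$A))  # 1
nrow(theData)            # 6

# Step 1: one row per B/C combination, holding the group mean of A
means <- aggregate(A ~ B + C, data = theData, FUN = mean)

# Step 2: xtabs sums A within each B/C cell; with one row per cell,
# that sum is simply the group mean
tab <- xtabs(A ~ B + C, data = means)
tab
```

Because xtabs() with a left-hand side adds up the values in each cell, calling it on the raw data would give group sums rather than means, which is why the aggregation step comes first.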

Or, in this case, the output of aggregate alone may be enough:

aggregate(A ~ B + C, theData, FUN = mean)
#   B  C   A
#1 b1 c1 1.5
#2 b2 c1 7.0
#3 b1 c2 2.0
#4 b2 c2 4.0

Replacing the repeated labels with blank strings ("") to mimic the layout in the question is not recommended, as it can cause problems in later processing steps.
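
If the row order shown in the question matters, one option is to sort the aggregate result by B and then C while keeping every label in place. This is a sketch under the same assumed data construction as the question's example; the name `means` is hypothetical.

```r
# Rebuild the example data from the question (construction is an assumption)
theData <- data.frame(
  A = c(1, 2, 3, 1, 4, 7),
  B = c("b1", "b1", "b1", "b1", "b2", "b2"),
  C = c("c1", "c1", "c2", "c2", "c2", "c1")
)

# Group means of A for each B/C combination
means <- aggregate(A ~ B + C, data = theData, FUN = mean)

# Sort by B, then C, and reset the row names after reordering
means <- means[order(means$B, means$C), ]
rownames(means) <- NULL
means
#    B  C   A
# 1 b1 c1 1.5
# 2 b1 c2 2.0
# 3 b2 c1 7.0
# 4 b2 c2 4.0
```

Each row still carries its full B and C labels, so the result can be filtered, merged, or plotted without further repair.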

akrun
  • Thank you for your prompt response. This snippet returns the overall average of column A instead of the average of A for each combination of B and C. For example, for rows with B=b1 and C=c1 the average A value should be a1, and so on. Is there a way to do that? – Helen Grey Apr 24 '20 at 00:29
  • post has been modified – Helen Grey Apr 24 '20 at 00:41