I have multiple R data sets from which I am pulling the frequency of the occurrence of each number, 1 through 8. The data sets are each only 5 values long though, so not all of the numbers are represented. Here is an example of what one of those lists looks like:

T1 
#1 2 3 4 7 
#1 1 1 1 1

I am generating multiple of these lists from different sets of 5 numbers, and will be using them in side-by-side figures. In order to standardize the graphing parameters between all these lists, I want them each to have all numbers 1:8 represented, even the missing ones. My ideal result would look like this:

T1 
#1 2 3 4 5 6 7 8
#1 1 1 1 0 0 1 0

I have attempted various methods, including:

  • Creating a blank list with 1:8 to merge or rbind with the existing list. Merging didn't work, and rbind required the same number of columns.
  • Generating the list with a factor that includes levels = 1:8. This always resulted in a list of the values 1:8, but not populated with my data.

I can't tell if I am trying the right methods but performing them incorrectly, or if there is a different approach. Any help would be appreciated!


Additional Context, per @onyambu:

I am pulling this data from a data.frame where each column is a person and each of the 5 rows is a number 1-8. An example of the frame is:

      Layton Jared Jon Colby Brandon 
SC.1       7     4   2     5       3      
SC.2       3     7   4     6       1      
SC.3       1     8   3     5       4      
SC.4       4     3   1     5       8      
SC.5       2     8   1     3       7      

In order to get each column to a format compatible with a Pie Chart, I am using table(DF[n]) to create the following table:

table(DF[1])
Layton
1 2 3 4 7 
1 1 1 1 1 
table(DF[2])
Jared
3 4 7 8 
1 1 1 2 

In order to graph the Pie charts side-by-side with compatible colors and legends, I would like the final result to include missing numbers 1-8 as well. Something like this:

Layton
1 2 3 4 5 6 7 8
1 1 1 1 0 0 1 0 
Jared
1 2 3 4 5 6 7 8 
0 0 1 1 0 0 1 2 
OddWaller
    Please give an example of the list that can be reproduced in R, ie just length 3,4 or even 5 is enough. Ensure that the list captures all the necessary content for your question – Onyambu Nov 15 '22 at 00:43
  • [this is](https://stackoverflow.com/questions/3402371/combine-two-data-frames-by-rows-rbind-when-they-have-different-sets-of-columns) what you are probably looking for. – Eric Nov 15 '22 at 01:25
  • @onyambu - I added more descriptive examples from my data set. Hopefully this is sufficient – OddWaller Nov 15 '22 at 16:55
  • I've completely re-written the original answer based on your new information. – dcarlson Nov 15 '22 at 17:01
  • You should consider doing `t(table(stack(DF)))` – Onyambu Nov 15 '22 at 18:36

1 Answer

This is a complete revision of the original answer based on more details from the OP. First make your example data easily available using dput():

DF <- structure(list(Layton = c(7L, 3L, 1L, 4L, 2L), Jared = c(4L, 
7L, 8L, 3L, 8L), Jon = c(2L, 4L, 3L, 1L, 1L), Colby = c(5L, 6L, 
5L, 5L, 3L), Brandon = c(3L, 1L, 4L, 8L, 7L)), class = "data.frame",
 row.names = c("SC.1", "SC.2", "SC.3", "SC.4", "SC.5"))

Now convert your numbers to factors:

DF.fact <- lapply(DF, factor, levels=1:8)
str(DF.fact)
# List of 5
#  $ Layton : Factor w/ 8 levels "1","2","3","4",..: 7 3 1 4 2
#  $ Jared  : Factor w/ 8 levels "1","2","3","4",..: 4 7 8 3 8
#  $ Jon    : Factor w/ 8 levels "1","2","3","4",..: 2 4 3 1 1
#  $ Colby  : Factor w/ 8 levels "1","2","3","4",..: 5 6 5 5 3
#  $ Brandon: Factor w/ 8 levels "1","2","3","4",..: 3 1 4 8 7
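
To see why the factor step matters, here is a minimal sketch for a single column. table() counts every declared level of a factor, including levels that never occur, so the unused values come back as zeros. The names `layton` and `layton_counts` are only illustrative and are not part of the original data.

```r
# Layton's column from the example data
layton <- c(7L, 3L, 1L, 4L, 2L)

# Without declared levels, only the observed values are counted
table(layton)

# Declaring levels 1:8 makes table() report a zero for each unused value
layton_counts <- table(factor(layton, levels = 1:8))
layton_counts
```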

Now create the tables:

DF.tbls <- lapply(DF.fact, table)
DF.tbls
# $Layton
# 
# 1 2 3 4 5 6 7 8 
# 1 1 1 1 0 0 1 0 
# 
# $Jared
# 
# 1 2 3 4 5 6 7 8 
# 0 0 1 1 0 0 1 2 
# 
# $Jon
# 
# 1 2 3 4 5 6 7 8 
# 2 1 1 1 0 0 0 0 
# 
# $Colby
# 
# 1 2 3 4 5 6 7 8 
# 0 0 1 0 3 1 0 0 
# 
# $Brandon
# 
# 1 2 3 4 5 6 7 8 
# 1 0 1 1 0 0 1 1 

Or combine into a single table:

tbl <- do.call(rbind, DF.tbls)
tbl
#         1 2 3 4 5 6 7 8
# Layton  1 1 1 1 0 0 1 0
# Jared   0 0 1 1 0 0 1 2
# Jon     2 1 1 1 0 0 0 0
# Colby   0 0 1 0 3 1 0 0
# Brandon 1 0 1 1 0 0 1 1
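
Since the goal is side-by-side pie charts with matching colors, the combined table could be used roughly as follows. This is only a sketch using base graphics: the palette (`rainbow(8)`), the panel layout and the choice to blank the labels of zero-count slices are assumptions, not something specified in the question.

```r
# Example data from the question
DF <- data.frame(Layton = c(7, 3, 1, 4, 2), Jared = c(4, 7, 8, 3, 8),
                 Jon = c(2, 4, 3, 1, 1), Colby = c(5, 6, 5, 5, 3),
                 Brandon = c(3, 1, 4, 8, 7))

# One row of counts per person, with all values 1:8 present as columns
tbl <- do.call(rbind, lapply(DF, function(x) table(factor(x, levels = 1:8))))

# One fixed color per value, so a given value is the same color in every pie
cols <- rainbow(8)

# One panel per person, drawn in a single row
op <- par(mfrow = c(1, nrow(tbl)), mar = c(1, 1, 2, 1))
for (person in rownames(tbl)) {
  counts <- tbl[person, ]
  # Zero-count slices have no width, so their labels are left blank
  pie(counts, labels = ifelse(counts > 0, names(counts), ""),
      col = cols, main = person)
}
par(op)
```

Because every row has the same eight columns in the same order, the i-th color always refers to the value i, whichever values a given person happens to have.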
dcarlson
  • This method works with the example set of data, but the data sets include many sets of 5 numbers where the numbers repeat. In your example, that value would be set to 1 instead of the N number of times that value appears. – OddWaller Nov 15 '22 at 16:21
  • @OddWaller - if a method works with the examples you give but not with your real data then you need to do a better job creating a reproducible example. – Dason Nov 15 '22 at 16:47
  • @Dason - Noted, I already amended the post earlier to provide better examples that cover a wider scope of the data I am seeing – OddWaller Nov 15 '22 at 16:55
  • I do believe `t(table(stack(DF)))` will be easier and faster – Onyambu Nov 15 '22 at 18:36
  • Yes. Also `table(rep(colnames(DF), each=5), unlist(DF))` which would have the possible advantage of putting the rows in alphabetical order. – dcarlson Nov 15 '22 at 19:28
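
The `stack()` suggestion from the comments can be sketched as below. One caveat: `table(stack(DF))` only shows values that occur somewhere in the data frame, so this sketch also converts the stacked values to a factor with levels 1:8 to guarantee all eight columns. That extra step is an addition here, not part of the original comment, and the names `long` and `tbl_stack` are illustrative.

```r
# Example data from the question
DF <- data.frame(Layton = c(7, 3, 1, 4, 2), Jared = c(4, 7, 8, 3, 8),
                 Jon = c(2, 4, 3, 1, 1), Colby = c(5, 6, 5, 5, 3),
                 Brandon = c(3, 1, 4, 8, 7))

# Long format: one row per observation, with columns `values` and `ind`
long <- stack(DF)

# Declare all possible values so that none can drop out of the table
long$values <- factor(long$values, levels = 1:8)

# Cross-tabulate values by person, then transpose to one row per person
tbl_stack <- t(table(long))
tbl_stack
```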