0

I have many contingency tables (100) in many sheets gathered in one excel file. Some of them have more attributes than the other. But at the end they all have the same columns and rows name. (either more attributes or less). For example let's say we have these 2 tables.

enter image description here

enter image description here

I want to add each cell that belong to the same class (row - Person1 type B and column B_1 cell in the first table will add up with row - Person1 type B and column B_1 cell in the second table and so on)

The final table will be like this. Notice D was not there in the first table so it is going to be there as it is.

enter image description here

I want to sum up (pool) all the contingency tables into one with all the available attributes. How to achieve this in R?

Thank you

Amal Nasir
  • 164
  • 15

1 Answers1

0

Since you did not provide a excel-file, i cooked up a file based on your images provided..

It looks like this
enter image description here enter image description here

library(tidyverse)
library(tidyxl)
library(readxl)
library(data.table)
library(unpivotr)

file_to_read <- "./testdata.xlsx"
# Get all names of sheets in the file
sheet_names <- readxl::excel_sheets(file_to_read)
# Loop through sheets
L <- lapply(sheet_names, function(x) {
  all_cells <-
    tidyxl::xlsx_cells(file_to_read, sheets = x) %>%
    dplyr::select(sheet, row, col, data_type, character, numeric)
  # Cells with the actual data  
  cells_data <-
    dplyr::filter(all_cells, row >= 3, col >= 3) %>%
    dplyr::transmute(row, col, sheet = sheet, value = numeric)
  # Select the headers
  person.number.up <-
    dplyr::filter(all_cells, row == 1) %>%
    dplyr::select(row, col, person.number.up = character)
  person.type.up <- 
    dplyr::filter(all_cells, row == 2) %>%
    dplyr::select(row, col, person.type.up = character)
  person.number.left <-
    dplyr::filter(all_cells, col == 1) %>%
    dplyr::select(row, col, person.number.left = character)
  person.type.left <- 
    dplyr::filter(all_cells, col == 2) %>%
    dplyr::select(row, col, person.type.left = character)
  #put together
  final.df <- cells_data %>%
    unpivotr::enhead(person.number.up, "up-ish") %>%
    unpivotr::enhead(person.type.up, "up-ish") %>%
    unpivotr::enhead(person.number.left, "left-ish") %>%
    unpivotr::enhead(person.type.left, "left-ish") %>%
    dplyr::select(-(1:2))
})
# Put together in a data.table
DT <- data.table::rbindlist(L, use.names = TRUE)
# Cast to wide, summing values in the process
ans <- dcast(DT, person.number.left + person.type.left ~ person.number.up + person.type.up, 
      value.var = "value", 
      fun.aggregate = sum, na.rm = TRUE)

enter image description here

Wimpel
  • 26,031
  • 1
  • 20
  • 37
  • note: you might need `unpivotr::justify()` to set the headers to the upper left corners of your data to avoid ties... see `unpivotr::direction()` under the `-ish`-section – Wimpel May 18 '21 at 14:11
  • What is this line for? # Cells with the actual data cells_data <- dplyr::filter(all_cells, row >= 3, col >= 3) %>% dplyr::transmute(row, col, sheet = sheet, value = numeric) – Amal Nasir May 20 '21 at 17:42
  • Thanks for the preprocessing step. I added the headers manually for each table to refer to a specific person (just like what you did but manually to get more accurate results. I have like five types and each table is different). Then, I changed the numbers in this section (# Select the headers) to the names in that column instead. for example (person.number.up <- dplyr::filter(all_cells, row == "person1") %>% dplyr::select(row, col, person.number.up = character)). I got this error, (Error: Can't subset columns that don't exist. x Column `row` doesn't exist) – Amal Nasir May 20 '21 at 17:46
  • I updated the question with the data I have. – Amal Nasir May 20 '21 at 18:15
  • now you are asking a completely different question. better to start a new one. And make it reproducible (see FAQ). You will not get the best/fastest answers if you only post images of data. – Wimpel May 20 '21 at 18:38
  • Thank you so much. I updated the question here and it would be very appreciated if you help me out with this, https://stackoverflow.com/questions/67692461/sum-up-tables-results-from-multiple-sheets-into-one-table-in-r/67692975?noredirect=1#comment119651896_67692975 – Amal Nasir May 25 '21 at 17:59