
I am trying to subset a table based on one category value. Say we want to keep only the adults from the Titanic data. What I do is:

data("Titanic")
subset(Titanic, Age == "Adult")

This results in the error "object 'Age' not found". The same logic works with data frames: subset(as.data.frame(Titanic), Age == "Adult"). But how can we subset tables directly, i.e. without converting them to a data frame?

EDIT: Here Age (the variable that contains "Adult") is dimension number three. In my case I do not know which dimension it is, i.e. I would like to be able to subset by variable name, as in subset(Titanic, Age == "Adult"). Any other function is fine, I am not tied to subset, but I am looking for a base R solution.

My expected output is

structure(c(118, 154, 387, 670, 4, 13, 89, 3, 57, 14, 75, 192, 140, 80, 76, 20), .Dim = c(4L, 2L, 2L), .Dimnames = list(Class = c("1st", "2nd", "3rd", "Crew"), Sex = c("Male", "Female"), Survived = c("No", "Yes")), class = "table")
  • By the way, your expected output is wrong? – zx8754 Nov 23 '21 at 11:48
  • Looks like a duplicate of https://stackoverflow.com/q/14500707/680068 – zx8754 Nov 23 '21 at 12:00
  • @zx8754 Corrected the data structure. Regarding the link: the difference is that I don't know beforehand which dimension it is in, but the answers in the link expect the dimension as an argument. –  Nov 23 '21 at 12:18

2 Answers


You are not working with a two-dimensional data frame but with a four-dimensional array.
You therefore have to put your condition in the position of the matching dimension, as follows:

Titanic[,,"Adult",]

When you display the array, you can see that it has the following four dimensions:
1- Class
2- Sex
3- Age
4- Survived

You can inspect the dimension names with str() or dimnames():

str(Titanic)
dimnames(Titanic)
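
If the position of the dimension is not known in advance, it can be looked up from names(dimnames()) and the index built programmatically. The following is only a minimal sketch: subset_table_by_name is a hypothetical helper name, not a base R function, and it assumes the table has named dimnames.

```r
data("Titanic")

# Hypothetical helper (not part of base R): keep one level of a named dimension.
subset_table_by_name <- function(tab, dim_name, level) {
  # Position of the requested variable among the dimension names
  pos <- match(dim_name, names(dimnames(tab)))
  if (is.na(pos)) stop("no dimension named '", dim_name, "'")
  # One index per dimension: TRUE keeps everything along that dimension,
  # the level name selects a single slice along the requested one.
  idx <- rep(list(TRUE), length(dim(tab)))
  idx[[pos]] <- level
  # Equivalent to tab[TRUE, TRUE, "Adult", TRUE] for the Titanic example
  do.call(`[`, c(list(tab), idx))
}

adults <- subset_table_by_name(Titanic, "Age", "Adult")
dimnames(adults)
```

Because the selected dimension has length one after indexing, the default drop = TRUE removes it, which leaves the three remaining dimensions together with their names.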
Yacine Hajji

Get the index of the dimension by matching on dimnames, then use slice.index:

# user input
x <- "Adult"

#get index
ix1 <- which(sapply(dimnames(Titanic), function(i) sum(i == x)) == 1)
ix2 <- which(dimnames(Titanic)[[ ix1 ]] == x)

#subset and restore dimensions
res <- Titanic[ slice.index(Titanic, ix1) == ix2 ]
dim(res) <- dim(Titanic)[ -ix1 ]

#test
all(Titanic[,,"Adult",] == res)
# [1] TRUE

# not identical as the names are missing
identical(Titanic[,,"Adult",], res)
# [1] FALSE

res
# , , 1
# 
#      [,1] [,2]
# [1,]  118    4
# [2,]  154   13
# [3,]  387   89
# [4,]  670    3
# 
# , , 2
# 
#      [,1] [,2]
# [1,]   57  140
# [2,]   14   80
# [3,]   75   76
# [4,]  192   20
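
The missing names can probably be restored by copying over the dimnames of the remaining dimensions. This is a sketch under the same assumption as above, namely that x occurs in exactly one dimension; the final as.table() call is an addition to get the "table" class back.

```r
data("Titanic")
x <- "Adult"

# Same lookup as above: which dimension holds x, and at which position
ix1 <- which(sapply(dimnames(Titanic), function(i) sum(i == x)) == 1)
ix2 <- which(dimnames(Titanic)[[ix1]] == x)

# Subset, then restore both the dimensions and their names
res <- Titanic[slice.index(Titanic, ix1) == ix2]
dim(res) <- dim(Titanic)[-ix1]
dimnames(res) <- dimnames(Titanic)[-ix1]

# Assumption: the result should be a "table" again, as in the expected output
res <- as.table(res)
res["1st", "Male", "No"]
```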
zx8754
  • `slice.index` does the trick! Thank you! –  Nov 23 '21 at 12:19
  • @machine Glad it helped. Note that the dimnames are missing from the result, so it is not 100% the same as your expected output. – zx8754 Nov 23 '21 at 12:24
  • I've noticed that, thanks. Working with it will show whether this is a problem ;) –  Nov 23 '21 at 12:28