I have a categorical data frame in R as follows:

 Cat_0   Cat_1   Cat_2   Cat_3    Cat_4
 Baby    Travel  Bath    Towels   Age 0-1
 Baby    Travel  Bath    Towels   Age 1-2
 Baby    Travel  Box     NA       NA
 Baby    Chairs  Sit     NA       NA
 Animals Horse   Rider   Safety   Chaps
 Animals Horse   Rider   Caps     NA
 Animals pig     NA      NA       NA

I want to define the tree with the data.tree package for future calculations. The tree should look like this:

                       |----Chairs----sit 
          |            |                                     |---age 0-1
          |---- Baby---|              |----Bath----Towels----|
          |            |----Travel----|                      |---age 1-2
          |                           |----Box
Product --|
          |                                |---safety----chaps
          |            |---Horse---rider---|
          |-- Animals--|                   |---caps
          |            |---Pig

I am able to create the tree as above, but NA nodes appear in it. I would like to remove the NAs from the data.tree. This is my code:

tree$pathString <- paste("product",
                         tree$Cat_0,
                         tree$Cat_1,
                         tree$Cat_2,
                         tree$Cat_3,
                         tree$Cat_4,
                         sep = "/")

tree <- as.Node(tree)
print(tree)
user5424264

1 Answer

Using the data.tree package:

 library(data.tree)

The package author provided the answer: you must omit the NAs when pasting. One way is the alternative paste5 function from the Stack Overflow answer "suppress NAs in paste()":

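# paste() variant that can drop NA values before joining (na.rm = TRUE)
paste5 <- function(..., sep = " ", collapse = NULL, na.rm = FALSE) {
  if (!na.rm) {
    # Default: behave exactly like paste()
    return(paste(..., sep = sep, collapse = collapse))
  }
  # Join one vector of values, trimming whitespace and dropping NAs
  paste.na <- function(x, sep) {
    x <- gsub("^\\s+|\\s+$", "", x)
    ret <- paste(na.omit(x), collapse = sep)
    is.na(ret) <- ret == ""  # an all-NA input yields NA rather than ""
    ret
  }
  # Bind the arguments column-wise and paste each row separately
  df <- data.frame(..., stringsAsFactors = FALSE)
  ret <- apply(df, 1, FUN = function(x) paste.na(x, sep))

  if (is.null(collapse)) ret else paste.na(ret, sep = collapse)
}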

Then build the path string with na.rm = TRUE and convert the data frame to a tree:

tree$pathString <- paste5("product", 
                          tree$Cat_0, 
                          tree$Cat_1,
                          tree$Cat_2,
                          tree$Cat_3,
                          tree$Cat_4,
                          sep = "/",
                          na.rm = TRUE)       

htree <- as.Node(tree, na.rm=TRUE)
print(htree)
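
For anyone who wants to reproduce this, here is a minimal, self-contained sketch. It rebuilds the data frame from the question (the name df and the helper vector cat_cols are my own, not from the original post), shows that plain paste() writes a missing level as the literal string "NA", and then builds the path with a base R alternative to paste5: drop the NAs from each row with na.omit() before joining. The as.Node() call is guarded so that the path-building part still runs if data.tree is not installed.

```r
# Rebuild the example data frame from the question (NA marks a missing level).
df <- data.frame(
  Cat_0 = c("Baby", "Baby", "Baby", "Baby", "Animals", "Animals", "Animals"),
  Cat_1 = c("Travel", "Travel", "Travel", "Chairs", "Horse", "Horse", "pig"),
  Cat_2 = c("Bath", "Bath", "Box", "Sit", "Rider", "Rider", NA),
  Cat_3 = c("Towels", "Towels", NA, NA, "Safety", "Caps", NA),
  Cat_4 = c("Age 0-1", "Age 1-2", NA, NA, "Chaps", NA, NA),
  stringsAsFactors = FALSE
)

# Plain paste() converts NA to the string "NA", which is where the
# unwanted NA nodes in the tree come from.
naive <- paste("product", df$Cat_0, df$Cat_1, df$Cat_2, df$Cat_3, df$Cat_4,
               sep = "/")
print(naive[7])

# Alternative to paste5 (an assumption of this sketch, not the linked answer):
# for each row, drop the NA levels first, then join what is left with "/".
cat_cols <- paste0("Cat_", 0:4)
df$pathString <- apply(df[cat_cols], 1, function(row) {
  paste(c("product", na.omit(row)), collapse = "/")
})
print(df$pathString[7])

# Build the tree only if data.tree is available.
if (requireNamespace("data.tree", quietly = TRUE)) {
  htree <- data.tree::as.Node(df)
  print(htree)
}
```

The row-wise version does the same job as paste5(..., na.rm = TRUE) for this data, without the whitespace trimming that paste5 also performs.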
user25494