
I'm trying to convert strings that name polynomial terms into expressions in R, and I have seen this similar link. However, I'm struggling to come up with a method that takes a vector of variable names and produces labels that render correctly in a ggplot legend. For example, I have the following character vector:

variable.names <- c("x", "y", "yy", "xxx", "xxy", "yyy", 
                    "xxxxx", "xxxxy", "xxxyy", "xxyyy", "yyyyy")

and would like to convert it automatically to:

new.variable.names <- c("x", "y", expression(y^2),
                    expression(x^3), expression(paste(x^2, y)), expression(y^3),
                    expression(x^5), expression(paste(x^4, y)), expression(paste(x^3, y^2)),
                    expression(paste(x^2, y^3)), expression(y^5))

This is one example, but I am hoping to write a function that also works when variable.names contains more variables. I suspect regular expressions could help, but I don't know how to write a function that detects the pattern of letters and places them correctly inside the expression and paste calls to create the names automatically.

Thanks in advance.

AW27
  • How are your variable names created? Can you use that process to create expressions instead? – Roland Nov 24 '20 at 14:33
  • The variable names are created using a similar method with `strsplit`; they were created and reordered during an initial expansion of polynomials. The method you've provided below seems to work the way I want. – AW27 Nov 24 '20 at 14:38
  • "expansion of polynomials" I would first create the expressions and then the variable names from the expressions, not the other way around. I dislike the need for parsing. – Roland Nov 24 '20 at 14:41
  • The expansion was based on the `polym` function in `R`. I generated the variable names from the numbers in the resulting column names. This was a way to keep the columns ordered by polynomial degree rather than alphabetically. – AW27 Nov 24 '20 at 14:53
  • @Roland is it possible to get the same result in tidy? – AW27 Nov 24 '20 at 15:04
  • Probably. I have never felt the need for using the tidyverse. Thus, I can't help you there. – Roland Nov 24 '20 at 15:05
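Following Roland's suggestion in the comments above, the labels could be built straight from the `polym` output rather than from intermediate names such as "xxy". The sketch below assumes the column names have the form "2.1" (degree of the first variable, a dot, degree of the second); the helper `degrees_to_plotmath` and the data are made up for illustration.

```r
# Made-up data, only used to obtain the column names
x <- 1:10
y <- rev(1:10)^2

# raw = TRUE gives plain powers; each column name encodes the degrees,
# e.g. "2.1" is assumed to mean x^2 * y
degree.names <- colnames(polym(x, y, degree = 3, raw = TRUE))

# Hypothetical helper: turn "2.1" into the plotmath string "x^2*y"
degrees_to_plotmath <- function(name, vars = c("x", "y")) {
  d <- as.integer(strsplit(name, ".", fixed = TRUE)[[1]])
  # Write "x^n" only when the degree exceeds 1
  terms <- ifelse(d > 1, paste0(vars, "^", d), vars)
  # Drop variables with degree 0 and join the rest
  paste(terms[d > 0], collapse = "*")
}

# One expression per column, in the column order chosen by polym
labels <- parse(text = vapply(degree.names, degrees_to_plotmath, character(1)))
```

This keeps the labels in the same order as the columns, so no reordering by degree is needed afterwards.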

2 Answers

variable.names <- c("x", "y", "yy", "xxx", "xxy", "yyy", 
                    "xxxxx", "xxxxy", "xxxyy", "xxyyy", "yyyyy")

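foo <- function(x) {
  # Run-length encode the characters, e.g. c("x", "x", "y") gives x: 2, y: 1
  runlength <- rle(x)
  # Append "^n" only when a letter repeats, so "x^1" is never produced
  terms <- ifelse(runlength$lengths > 1,
                  paste(runlength$values, runlength$lengths, sep = "^"),
                  runlength$values)
  # Join the terms with "*" and parse the string into an expression
  parse(text = paste(terms, collapse = "*"))
}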

res <- lapply(strsplit(variable.names, ""), foo)



plot.new()

plot.window(c(0, 2), c(0, 11))
text(1, 10:0, do.call(c, res))

resulting plot
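Since the question is about a ggplot legend, here is a minimal sketch of how such an expression vector might be passed to a scale. It assumes ggplot2 is installed and that the terms are mapped to `colour`; the data frame `df` and the helper `to_plotmath` are made up for illustration and follow the same run-length idea as `foo()` above.

```r
variable.names <- c("x", "y", "yy", "xxx", "xxy", "yyy",
                    "xxxxx", "xxxxy", "xxxyy", "xxyyy", "yyyyy")

# Hypothetical helper: "xxy" -> "x^2*y", built from run lengths
to_plotmath <- function(name) {
  runs <- rle(strsplit(name, "")[[1]])
  terms <- ifelse(runs$lengths > 1,
                  paste0(runs$values, "^", runs$lengths),
                  runs$values)
  paste(terms, collapse = "*")
}

# One expression vector, in the same order as variable.names
legend.labels <- parse(text = vapply(variable.names, to_plotmath, character(1)))

if (requireNamespace("ggplot2", quietly = TRUE)) {
  library(ggplot2)
  # Made-up data: one point per term; the factor levels fix the legend order
  df <- data.frame(
    term  = factor(variable.names, levels = variable.names),
    value = seq_along(variable.names)
  )
  p <- ggplot(df, aes(x = term, y = value, colour = term)) +
    geom_point() +
    scale_colour_discrete(labels = legend.labels)
  print(p)
}
```

The labels are matched to the legend entries by position, which is why `term` is made a factor with its levels in the same order as `variable.names`.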

Roland

Here's a tidyverse solution (updated 2020-11-24T13:35:33) that works for any number of variables and returns an expression for each name, so exponents render as superscripts instead of a literal ^:

purrr::map(strsplit(variable.names, ""), function(chars) {
    # One term per distinct letter: "x" if it occurs once, "x^n" if it occurs n times
    terms <- purrr::map_chr(unique(chars), function(v) {
        n <- sum(chars == v)
        if (n == 1) v else paste0(v, "^", n)
    })
    # Join the terms with "*" and parse the string into an expression
    parse(text = paste0(terms, collapse = "*"))
})
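One difference between the two answers is worth noting: the `rle` approach counts consecutive runs, while this one counts every occurrence of a letter. For the names in the question the results agree, but they differ when the same letter appears in separate places. A small base R sketch, with both helper names made up for illustration:

```r
# Count consecutive runs, as the rle-based answer does
by_runs <- function(name) {
  r <- rle(strsplit(name, "")[[1]])
  terms <- ifelse(r$lengths > 1, paste0(r$values, "^", r$lengths), r$values)
  paste(terms, collapse = "*")
}

# Count total occurrences per distinct letter, as this answer does
by_counts <- function(name) {
  chars <- strsplit(name, "")[[1]]
  vars <- unique(chars)
  counts <- vapply(vars, function(v) sum(chars == v), integer(1))
  terms <- ifelse(counts > 1, paste0(vars, "^", counts), vars)
  paste(terms, collapse = "*")
}

by_runs("xxyyy")    # "x^2*y^3"
by_counts("xxyyy")  # "x^2*y^3"
by_runs("xyx")      # "x*y*x"
by_counts("xyx")    # "x^2*y"
```

If the names are always sorted so that identical letters are adjacent, either approach should work.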
  • The only problem I'm facing with the tidy solution is that the variables don't look right in the legend. For example, I wouldn't want the ```^``` symbol in the legend. – AW27 Nov 24 '20 at 14:39
  • So you would want x2y2 for x^2y^2? – Synchronicity Nov 24 '20 at 16:03
  • @AW27 I updated the above to do what I think you want it to do? – Synchronicity Nov 24 '20 at 16:15
  • Based on Roland's answer, I want the result as an ```expression```. The updated tidy solution gives a string still, but I need it in an ```expression``` such as ```expression(y^5)```. Thanks – AW27 Nov 24 '20 at 16:33
  • @AW27 You just need to update the output line to give an expression: `rlang::parse_expr(do.call(paste0,.vars))` as I've done above. – Synchronicity Nov 24 '20 at 18:27
  • I get the error ```Error: `x` must be a character vector or an R connection``` when attempting to change ```paste0``` with ```expression``` – AW27 Nov 24 '20 at 21:30
  • @AW27 The objects that output from the above code are already expressions... – Synchronicity Nov 24 '20 at 21:55
  • I'm having a hard time getting the same results as Roland did with his method. With the updated code for tidy, I get `y5` rather than `expression(y^5)`. The latter allows me to pass into `ggplot` and return what Roland shows. – AW27 Nov 24 '20 at 23:41
  • @AW27 Ok, your first comment said "I wouldn't want the `^` symbol in the legend", so I thought you didn't want that. It's updated to give you the `expression` call. – Synchronicity Nov 25 '20 at 01:19