0

I am using the R programming language. I am trying to replicate the answer provided in this stackoverflow post (visualizing the optimization path till convergence in R) for visualizing the results of an optimization algorithm.

First, I load the libraries and create some data:

#load libraries
library(dplyr)
library(optimization)


# create some data for this example
a1 = rnorm(1000,100,10)
b1 = rnorm(1000,100,5)
c1 = sample.int(1000, 1000, replace = TRUE)
train_data = data.frame(a1,b1,c1)

Then, I define the function I want to optimize :

fitness <- function(x) {
    #bin data according to random criteria
    train_data <- train_data %>%
        mutate(cat = ifelse(a1 <= x[1] & b1 <= x[3], "a",
                            ifelse(a1 <= x[2] & b1 <= x[4], "b", "c")))
    
    train_data$cat = as.factor(train_data$cat)
    
    #new splits
    a_table = train_data %>%
        filter(cat == "a") %>%
        select(a1, b1, c1, cat)
    
    b_table = train_data %>%
        filter(cat == "b") %>%
        select(a1, b1, c1, cat)
    
    c_table = train_data %>%
        filter(cat == "c") %>%
        select(a1, b1, c1, cat)
    
    
    
    
    #calculate  quantile ("quant") for each bin
    
    table_a = data.frame(a_table%>% group_by(cat) %>%
                             mutate(quant = ifelse(c1 > x[5],1,0 )))
    
    table_b = data.frame(b_table%>% group_by(cat) %>%
                             mutate(quant = ifelse(c1 > x[6],1,0 )))
    
    table_c = data.frame(c_table%>% group_by(cat) %>%
                             mutate(quant = ifelse(c1 > x[7],1,0 )))
    
    
    
    
    #group all tables
    
    final_table = rbind(table_a, table_b, table_c)
    # calculate the total mean : this is what needs to be optimized
    mean = mean(final_table$quant)
}

From here, I run the optimization algorithm:

#optimization algorithm:
Output <- optim_nm(fitness, k = 7, trace = TRUE)

From the optimization algorithm, I extract the necessary information needed for the visualization, and place this information into a data frame:

#extract information from the optimization and place into data frame
a = Output$trace
b = a[,c(1,2,3,5)]
b = data.frame(b)

Problem : Here is where things start to go wrong. When I try to plot the "static" visualization, several warnings are produced along with an incomplete plot:

#first part of the visualization
p <- ggplot(b, aes(x_1, x_3)) + 
    geom_raster(aes(fill = function_value)) +
    geom_contour(aes(z = function_value), col = 'grey40') +
    geom_path(data = b, col = 'white') +
    geom_point(data = b, col = 'white') +
    scale_fill_viridis_c() +
    coord_cartesian(expand = FALSE)

#view plot
p

enter image description here

Warning messages:
1: stat_contour(): Zero contours were generated 
2: In min(x) : no non-missing arguments to min; returning Inf
3: In max(x) : no non-missing arguments to max; returning -Inf
4: Raster pixels are placed at uneven horizontal intervals and will be shifted. Consider using geom_tile() instead. 
5: Raster pixels are placed at uneven vertical intervals and will be shifted. Consider using geom_tile() instead.

Then, when I try to animate the results, the following error is produced:

library(gganimate)
final <- p + transition_reveal(iteration)

#view animation
final

Error in `$<-.data.frame`(`*tmp*`, "group", value = "") : 
  replacement has 1 row, data has 0

A list of warnings is also produced:

#view warnings 
warnings()

Warning messages:
1: stat_contour(): Zero contours were generated
2: In min(x) : no non-missing arguments to min; returning Inf
3: In max(x) : no non-missing arguments to max; returning -Inf

Can someone please tell me what I am doing wrong? Is there a way to fix this error?

Thanks

stats_noob
  • 5,401
  • 4
  • 27
  • 83
  • 1
    It is very unclear what type of plot you are trying to make with that combination of geoms. You seem to only have 35 points and they don't really fill the plotting area. Trying to fill the area using so few points that aren't even on a regular grid doesn't really seem a like a good idea. – MrFlick Jul 10 '21 at 03:27
  • 1
    Perhaps this is what you had in mind? https://stackoverflow.com/questions/19339296/plotting-contours-on-an-irregular-grid – MrFlick Jul 10 '21 at 04:02
  • @MrFlick : thank you for your reply! I increased the size of the data frame ("train_data") to 10,000 rows ... but the the code still is not working. Am I doing something wrong? – stats_noob Jul 10 '21 at 04:29

0 Answers0