My data frame has x data, y data, and an identifier column that I call FILLCOL. It contains only two datasets, so FILLCOL has only two unique identifiers. Yet when I generate the plot, an extra label shows up in the legend. I'd appreciate any insight.

Update: I've been working on this all morning, and I was able to resolve the issue by removing the empty data cells. So the modified question is: how can I ignore those cells without modifying the dataframe? I'm essentially eliminating outliers for that treatment, but I don't want to delete the raw data. Thanks.

p<-ggplot(SCAT_PLOT.summary, aes(x=(SCAT_PLOT.summary$X_VALUE), y=(SCAT_PLOT.summary$Y_VALUE),colour=as.factor(SCAT_PLOT.summary$FILLCOL),  fill=as.factor(SCAT_PLOT.summary$FILLCOL))) +

  geom_point(shape=21,  size = 4, alpha = 0.5, show.legend = TRUE) + 
  scale_color_brewer(type='div', palette=2)+

  geom_smooth(method="glm", se=TRUE, fill = "blue", alpha = .05,          
             formula=formula)+
  stat_poly_eq(formula = formula, size = 4,hjust = -.05,vjust = .5,
               eq.with.lhs = "italic(y)~`=`~",
               eq.x.rhs = "~italic(x)", color = "blue",
               aes(label = paste(..eq.label.., ..rr.label.., sep = "*plain(\",\")~")), 
               parse = TRUE) +
   
  theme_classic()+
 
  xlab(x_axis_label)+
  ylab(y_axis_label)+
  theme(axis.text.x = element_text(color = "black", size = 15, angle = 0, hjust = .5, vjust = .5, face = "plain"),
        axis.text.y = element_text(color = "black", size = 15, angle = 0, hjust = 1, vjust = 0, face = "plain"),  
        axis.title.x = element_text(color = "black", size = 15, angle = 0, hjust = .5, vjust = 0, face = "plain"),
        axis.title.y = element_text(color = "black", size = 15, angle = 90, hjust = .5, vjust = .5, face = "plain"))+
  theme(panel.grid.minor = element_line(size = 0.25, linetype = 3,colour = "green"),
        panel.grid.major = element_line(size = 0.25, linetype = 3,colour = "green"))+
  theme( axis.line = element_line(colour = "darkblue", size = 1, linetype = "solid"))+
  
 
  scale_fill_discrete(name = legend_label, labels = legend_text, na.translate = FALSE)+
  scale_color_discrete(name = legend_label, labels = legend_text, guide = FALSE, na.translate = FALSE)

Plot of data showing legend issue

  • Hi OP. Welcome to SO. Can you share your data frame `SCAT_PLOT.summary`, please? Just type `dput(SCAT_PLOT.summary)` into the console. The output should start with `structure(...` and you can copy and paste that list of text and numbers as code into your question. If it's too long, you can take a sample of the data, for example: `dput(SCAT_PLOT.summary[sample(1:nrow(SCAT_PLOT.summary),20),])`. – chemdork123 Jul 18 '20 at 16:19
  • Please visit [How to make a great R reproducible example](https://stackoverflow.com/questions/5963269/how-to-make-a-great-r-reproducible-example). – UseR10085 Jul 18 '20 at 16:20
  • You can remove all instances of `SCAT_PLOT.summary$`. You don't need to, and should not, restate the name of the data frame when referring to column names inside `aes()`. Also, `scale_color_discrete` is overriding `scale_color_brewer()`. – eipi10 Jul 18 '20 at 16:30
  • Thank you - I'm pretty new to R-coding so I tend to keep the long forms just to help me track what I'm doing. I'll definitely go back and clean it up. Appreciate the insight. – Kurt Haunreiter Jul 18 '20 at 16:45
  • X_VALUE  Y_VALUE  FILLCOL
    15.137   0.961    Dataset1
    13.792   0.856    Dataset1
    12.8563  0.976    Dataset1
    26.451   1.236    Dataset1
    16.575   1.03     Dataset1
    29.479   1.348    Dataset1
    18.264   0.945    Dataset1
    27.756   1.166    Dataset1
    28.134   1.252    Dataset1
    42.739   1.767    Dataset1
    43.35    1.8      Dataset1
    51.566   2.049    Dataset1
    40.93    1.673    Dataset1
    13.504   1.076    Dataset2
    13.261   1.053    Dataset2
    11.966   0.86     Dataset2
    13.608   1.053    Dataset2
    17.747   1.05     Dataset2
    19.717   1.15     Dataset2
    19.927   1.135    Dataset2
    – Kurt Haunreiter Jul 18 '20 at 17:10
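
The cleanup eipi10 suggests in the comments might look roughly like the sketch below. The data frame `df` is a hypothetical stand-in built from a few of the rows posted above, and the legend title `"Treatment"` is a placeholder rather than the asker's `legend_label`. It uses bare column names inside `aes()` and only one colour scale, so nothing overrides `scale_color_brewer()`.

```r
library(ggplot2)

# Hypothetical sample: six rows taken from the data posted in the comments.
df <- data.frame(
  X_VALUE = c(15.137, 13.792, 26.451, 13.504, 13.261, 19.717),
  Y_VALUE = c(0.961, 0.856, 1.236, 1.076, 1.053, 1.150),
  FILLCOL = factor(c("Dataset1", "Dataset1", "Dataset1",
                     "Dataset2", "Dataset2", "Dataset2"))
)

# Column names are looked up in `df`, so no `df$` prefix is needed in aes().
p <- ggplot(df, aes(x = X_VALUE, y = Y_VALUE,
                    colour = FILLCOL, fill = FILLCOL)) +
  geom_point(shape = 21, size = 4, alpha = 0.5) +
  # A single colour scale: a later scale_color_*() call would replace this one.
  scale_color_brewer(type = "div", palette = 2, guide = "none") +
  # Placeholder legend title; substitute your own label.
  scale_fill_discrete(name = "Treatment") +
  theme_classic()

print(p)
```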

1 Answer

You most likely have a missing value or some other unexpected value (for example, an empty cell) in the FILLCOL column, which is what produces the extra item in the legend. Before plotting, you can subset the data into a new dataframe or data.table with the outliers removed, and then plot from that new object, leaving the raw data untouched.
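
As a minimal sketch of that approach: the data frame `raw` below is a hypothetical stand-in for `SCAT_PLOT.summary`, built from a few of the rows posted in the comments plus two made-up rows with an empty identifier. Whether the empty cells in the real data are `NA` or empty strings is an assumption, so the filter handles both. The filtered copy `plot_data` is what gets plotted; `raw` is never modified.

```r
library(ggplot2)

# Hypothetical stand-in for SCAT_PLOT.summary. The last two rows are invented
# to mimic "empty cells": one empty string and one NA in FILLCOL.
raw <- data.frame(
  X_VALUE = c(15.137, 13.792, 26.451, 42.739, 13.504, 13.261, 19.717, 30.000, 35.000),
  Y_VALUE = c(0.961, 0.856, 1.236, 1.767, 1.076, 1.053, 1.150, NA, 1.500),
  FILLCOL = c("Dataset1", "Dataset1", "Dataset1", "Dataset1",
              "Dataset2", "Dataset2", "Dataset2", "", NA),
  stringsAsFactors = FALSE
)

# Diagnose: count every distinct identifier, including NA and empty strings.
print(table(raw$FILLCOL, useNA = "ifany"))

# Keep only complete rows with a non-empty identifier. `raw` is left as-is.
keep <- !is.na(raw$FILLCOL) & nzchar(raw$FILLCOL) &
  !is.na(raw$X_VALUE) & !is.na(raw$Y_VALUE)
plot_data <- raw[keep, ]

# Rebuild the factor so its levels come only from the rows that remain.
plot_data$FILLCOL <- factor(plot_data$FILLCOL)

p <- ggplot(plot_data, aes(x = X_VALUE, y = Y_VALUE, fill = FILLCOL)) +
  geom_point(shape = 21, size = 4, alpha = 0.5) +
  theme_classic()

print(p)
```

The same filter can also be passed inline as the `data` argument of `ggplot()` if you would rather not keep a second object around.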

YBS