I have a dataframe like this:

library(tidyverse)  # provides as_tibble(), %>% and ggplot() used below

item <- c("item1", "item2", "item3", "item1", "item2", "item3", "item1", "item2", "item3")
group <- c("A", "B", "C", "A", "B", "C", "A", "B", "C")
level <- c(NA, NA, 40, NA, 25, NA, 30, NA, NA)

data <- cbind(item, group, level)
data <- as_tibble(data)
data <- type.convert(data, as.is = TRUE)

Which appears as follows:

item  group level
item1    A    NA
item2    B    NA
item3    C    40
item1    A    NA
item2    B    25
item3    C    NA
item1    A    30
item2    B    NA
item3    C    NA

Every item is uniquely associated with a specific group (item1 is always linked to group A, item2 always to group B, etc.).
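The one-to-one link between item and group can be checked directly. This is a minimal sketch in base R, assuming the same nine rows as above; `groups_per_item` is a name introduced here for illustration only.

```r
# Rebuild the example data (same values as in the question)
item  <- c("item1", "item2", "item3", "item1", "item2", "item3", "item1", "item2", "item3")
group <- c("A", "B", "C", "A", "B", "C", "A", "B", "C")
level <- c(NA, NA, 40, NA, 25, NA, 30, NA, NA)
data  <- data.frame(item, group, level)

# Count the distinct groups seen for each item; a count of 1 everywhere
# means each item belongs to exactly one group
groups_per_item <- tapply(data$group, data$item, function(g) length(unique(g)))
all(groups_per_item == 1)
```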

To plot the graph of the data, I use this code:

   graph <- data %>%
     ggplot(aes(x=group, y=level)) +
     geom_point(colour="blue", size=3, na.rm=TRUE)

which shows this result:

(image: scatter plot of level against group, all points in blue)

Now I would like to display the point with value 25 in red, selecting it by item. That is: "if item2 (which corresponds to group B) has a non-NA value, display that value in red and keep all the other values in blue".

I thought of using an if inside a for loop, but I don't know whether that is the right approach.

Thank you for helping!

CLOSED: SOLUTION FOUND

I have created a subset of the dataframe by item:

my_dot <- subset(data, item=="item2")
my_dot <- type.convert(my_dot, as.is = TRUE)

and added a layer to the ggplot call that uses the subsetted dataframe "my_dot" via geom_point(data=my_dot, ...):

    graph <- data %>%
      ggplot(aes(x=group, y=level)) +
      geom_point(colour="blue", size=3, na.rm=TRUE) +
      geom_point(data=my_dot, aes(x=group, y=level), colour="red", size=5, na.rm=TRUE)

Here is the result I was looking for:

(image: the same plot with the item2 point drawn larger and in red)

Vega
EmaK
  • I agree with @stefan's link. Regardless, I see no reason for a `for` loop or `if`-conditional to do this: the single plot with different point highlights should be done in a single step. – r2evans Jun 02 '21 at 21:40
  • I closed it as a dupe, since the other answer (which has much more traffic, more answers, more variety) really addresses it fully. I provided an answer anyway just to personalize it for you, you can still accept it if you like, but there is no requirement. If I missed something and the duplicate (and my answer) is insufficient, ping me and we can discuss and reopen. – r2evans Jun 02 '21 at 21:52

1 Answer

This is a dupe, but for your data, a few options:

  1. aes(color=(group == "B")): the condition is either FALSE or TRUE, so there are always exactly two colors (here blue for FALSE and red for TRUE).

    ggplot(data, aes(x=group, y=level)) +
      geom_point(aes(color=(group == "B")), size=3, na.rm=TRUE) +
      scale_color_manual(values=c("blue", "red"))
    


  2. aes(color=group), where we can specify different colors for each group. We'll name the color-vector:

    ggplot(data, aes(x=group, y=level)) +
      geom_point(aes(color=group), size=3, na.rm=TRUE) + 
      scale_color_manual(values=c(A="blue", B="red", C="blue"))
    


  3. Overlay a new point over the old:

    ggplot(data, aes(x=group, y=level)) +
      geom_point(color="blue", size=3, na.rm=TRUE) +
      geom_point(color="red", size=3, na.rm=TRUE, data = ~subset(., group == "B"))
    


    (This is generally not the "canonical" method of ggplot2; note that it does not add a legend, because we've not assigned a field to the color aesthetic.)

    (The data= here uses ~ subset(.). The tilde ~ and dot . are important. You can use dplyr::filter(., ...) if you prefer.)

(For those with legends, the legend can be removed with guide=NULL or renamed with name="..", added to the scale_color_manual call.)
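As a rough illustration of the legend note, here is option 1 with the legend hidden and, separately, with a custom title. It is only a sketch: it uses the documented `guide = "none"` spelling, names the color vector by "FALSE"/"TRUE" so the mapping does not depend on order, and the title "Group B?" and the object names `p_no_legend` and `p_named` are made up for this example.

```r
library(ggplot2)

# Same example data as in the question
data <- data.frame(
  item  = rep(c("item1", "item2", "item3"), times = 3),
  group = rep(c("A", "B", "C"), times = 3),
  level = c(NA, NA, 40, NA, 25, NA, 30, NA, NA)
)

base_plot <- ggplot(data, aes(x = group, y = level)) +
  geom_point(aes(color = (group == "B")), size = 3, na.rm = TRUE)

# Variant 1: keep the two colors but drop the legend
p_no_legend <- base_plot +
  scale_color_manual(values = c("FALSE" = "blue", "TRUE" = "red"), guide = "none")

# Variant 2: keep the legend and give it a readable title
p_named <- base_plot +
  scale_color_manual(values = c("FALSE" = "blue", "TRUE" = "red"), name = "Group B?")
```

Printing `p_no_legend` or `p_named` draws the respective plot.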

r2evans
  • Many thanks for your help, r2evans. Yes, my question is a "little" dupe, and my actual dataframe is more complex... I only simplified it to explain my issue. I think the linked answer doesn't really fix my problem... Moreover, I didn't want to select the "red point" using the label "group", but using the "item" linked to the group. Despite this, I found your last suggestion (3.) useful! In the end, I used the function `subset()` to select a section of my dataframe by "item", and that led me to the solution I was looking for! – EmaK Jun 02 '21 at 22:47
  • You can still use the first two, changing `color=(group=="B")` to `color=(item=="item2")` for the same effect; there is no restriction on using a variable that is already assigned to an axis/aesthetic. Glad it worked, regardless. – r2evans Jun 02 '21 at 23:11
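The item-based variant mentioned in the last comment might look like the sketch below. It assumes the example data from the question; the object name `p_item` and the legend title "item2" are illustrative choices, not part of the original answer.

```r
library(ggplot2)

# Same example data as in the question
data <- data.frame(
  item  = rep(c("item1", "item2", "item3"), times = 3),
  group = rep(c("A", "B", "C"), times = 3),
  level = c(NA, NA, 40, NA, 25, NA, 30, NA, NA)
)

# Color by a condition on item, while group stays on the x axis
p_item <- ggplot(data, aes(x = group, y = level)) +
  geom_point(aes(color = (item == "item2")), size = 3, na.rm = TRUE) +
  scale_color_manual(values = c("FALSE" = "blue", "TRUE" = "red"), name = "item2")
```

Because the condition is evaluated per row, this keeps working when the real dataframe has more items or groups, as long as the item label identifies the rows to highlight.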