This is not a duplicate of this, or this, or this.

I have a data.table that looks something like this:

animal_frame

first_and_last    animal    color
c(1, 2)           dog       red
c(2, 2)           cat       red
c(4, 2)           dog       green
c(3, 1)           dog       red
c(4, 6)           pig       green
c(3, 3)           cat       red
c(4, 2)           pig       red
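
For reproducibility, a table like the one above could be built roughly as follows. This is only a sketch: it assumes `first_and_last` is stored as a list column of length-2 numeric vectors, which the printed table suggests but does not confirm.

```r
library(data.table)

# Assumption: first_and_last is a list column, one length-2 vector per row
animal_frame <- data.table(
  first_and_last = list(c(1, 2), c(2, 2), c(4, 2), c(3, 1),
                        c(4, 6), c(3, 3), c(4, 2)),
  animal = c("dog", "cat", "dog", "dog", "pig", "cat", "pig"),
  color  = c("red", "red", "green", "red", "green", "red", "red")
)
```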

Running `animal_frame$num_entry = sample(1:nrow(animal_frame), nrow(animal_frame), replace=FALSE)` gives me a randomly ordered indexing column.

In the plot, the x-axis is num_entry and the y-axis is first_and_last, resulting in two points for every tick on the x-axis. Each pair of points is to be connected with a vertical line, as per this question:

ggplot(data=animal_frame, aes(x=num_entry, y=first_and_last)) +
  geom_line(aes(group=num_entry, color=color)) + 
  scale_color_manual(values = c("green"="green", "red"="red"))

This works well. Now, I'd like to facet this same plot according to animal, but I want an indexing column (beginning from 1) for each animal. So, using dplyr, I run:

library(dplyr)
library(magrittr)  # provides the %<>% assignment pipe

animal_frame %<>%
  group_by(animal) %>%
  mutate(facet_num_entry = sample(1:n(), n(), replace=FALSE)) %>%
  ungroup()

Now, I try:

ggplot(data=animal_frame, aes(x=facet_num_entry, y=first_and_last)) +
  geom_line(aes(group=facet_num_entry, color=color)) + 
  scale_color_manual(values = c("green"="green", "red"="red")) +
  facet_grid(animal ~ .)

But I receive this message: geom_path: Each group consists of only one observation. Do you need to adjust the group aesthetic?

When I look at the data frames, it looks like adding the num_entry column produces two entries for every sampled number. I suspect this comes from the fact that each entry in first_and_last is a vector, which appropriately gives me two observations to group by, and thus two points to draw a vertical line between. On the other hand, when I add the facet_num_entry column, there's only one entry for every sampled number. I think there may be something going on with the collapsing of first_and_last? But I've been fiddling with this for a while and can't figure it out.

Also, if there's an easier way to structure my data such that these vertical lines are possible, feel free to suggest it. I couldn't find anything as easy as making first_and_last a column of vectors.

AmagicalFishy
  • what's the downside of separating `first_` and `_last` ? – Stephen Henderson Apr 27 '18 at 13:37
  • @StephenHenderson I thought that might be easier, but I wasn't able to get a vertical line segment through those two points in the same way—so, I suppose, the downside would be that I wouldn't know how to connect them w/ a vertical line segment using `geom_line` (or whatever else might be used). – AmagicalFishy Apr 27 '18 at 13:44
  • Does your index need to be random, the way it would be from `sample`? Or do you want to just number the rows, which you could do for each group independently? – camille Apr 27 '18 at 14:53
  • @camille It doesn't *have* to be random, but if I don't randomize it, R orders them according to the first element of `first_and_last`, giving the impression of a relationship that isn't actually there (going from smallest to largest). (On a related note, though, I've solved my problem using `geom_segment`! I'll write up an answer when I finish) – AmagicalFishy Apr 27 '18 at 14:57
  • Okay. Try `df %>% group_by(animal) %>% mutate(index = row_number())`. That gives a numbered index within each group – camille Apr 27 '18 at 15:00
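
The `geom_segment` approach mentioned in the comments can be sketched as below. This is not the asker's final answer, which is not shown. It assumes the two values are split into separate columns, here given the hypothetical names `first` and `last`, and it shuffles the per-animal index with `sample(n())`. Replacing that with `row_number()` would give the unshuffled index suggested above.

```r
library(dplyr)
library(ggplot2)

# Assumption: one column per endpoint (hypothetical names `first` and `last`)
animal_frame <- data.frame(
  first  = c(1, 2, 4, 3, 4, 3, 4),
  last   = c(2, 2, 2, 1, 6, 3, 2),
  animal = c("dog", "cat", "dog", "dog", "pig", "cat", "pig"),
  color  = c("red", "red", "green", "red", "green", "red", "red")
)

set.seed(1)  # only so the shuffled order is repeatable

indexed <- animal_frame %>%
  group_by(animal) %>%
  # Shuffled index per animal, starting at 1 within each group
  mutate(facet_num_entry = sample(n())) %>%
  ungroup()

# One vertical segment per row, from `first` to `last`, faceted by animal
p <- ggplot(indexed) +
  geom_segment(aes(x = facet_num_entry, xend = facet_num_entry,
                   y = first, yend = last, color = color)) +
  scale_color_manual(values = c("green" = "green", "red" = "red")) +
  facet_grid(animal ~ .)
```

Because each row carries both endpoints, no `group` aesthetic is needed. Rows where `first` equals `last` produce a zero-length segment.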
