I am making a series of bar plots using ggplot. The plots show, e.g., the number of nests on a given day for a number of different years. I can make the plots no problem and use cowplot to arrange them in one figure; however, I want them all to be comparable and start on the same day. This is an example plot and here are the data.

dput(ringday2015)
structure(list(Var1 = structure(1:37, .Label = c("42", "43", "44",
"45", "46", "47", "48", "49", "50", "51", "52", "53", "54", "55",
"56", "57", "58", "59", "60", "61", "62", "63", "64", "65", "66",
"67", "69", "70", "72", "73", "74", "79", "83", "85", "88", "89",
"91"), class = "factor"), Freq = c(1L, 1L, 1L, 2L, 5L, 6L, 7L,
12L, 15L, 22L, 12L, 19L, 17L, 26L, 16L, 17L, 13L, 13L, 13L, 9L,
1L, 5L, 4L, 1L, 2L, 2L, 1L, 1L, 1L, 1L, 1L, 1L, 1L, 1L, 1L, 1L,
1L)), class = "data.frame", row.names = c(NA, -37L))

This plot starts on day 42, but the next plot starts at day 33 and another at day 31, so I wanted to have them all start on day 30. I've tried using scale_x_continuous, scale_x_discrete, xlim and every other suggestion I've seen, but none seem to have any effect on the plot, i.e. the x axis either does not change or changes so much that the plot is useless. What am I missing?

Code:

library(ggplot2)

ringdayplot2015 <- ggplot(data = ringday2015, aes(Var1, Freq)) +
  geom_bar(stat = "identity") +
  xlab("April Day (days after April 1st)") +
  ylab("No. of nests") +
  ggtitle("No. of nests ringed 2015")

Above is the code I've used to make the plot included.

McMahok
  • Oh I've also tried *coord_cartesian* – McMahok Mar 22 '21 at 14:47
  • The data and code you posted were to build one plot? Do you have the same thing for several plots? Also, please post your data in a way we can run code: paste the output from the `dput()` function. – Ricardo Semião e Castro Mar 22 '21 at 14:52
  • Ricardo, I only posted one of the plots for simplicity; they are all the same type of data, but for different years – McMahok Mar 22 '21 at 15:00
  • Ok. One more thing, in the `dput()` you posted, `Var1` is a factor, shouldn't it be numeric? – Ricardo Semião e Castro Mar 22 '21 at 15:05
  • I tried changing it to numeric with `ringday2015$Var1 <- as.numeric(ringday2015$Var1)`, but then the data change completely: `dput(ringday2015$Var1)` gives `c(1, 2, 3, 4, 5, 6, 7, 8, 9, 10, 11, 12, 13, 14, 15, 16, 17, 18, 19, 20, 21, 22, 23, 24, 25, 26, 27, 28, 29, 30, 31, 32, 33, 34, 35, 36, 37)` – McMahok Mar 22 '21 at 15:10
  • Ok I think I got it, I didn't convert to character before converting to numeric and now it's working how I wanted. – McMahok Mar 22 '21 at 15:16

2 Answers

1

From what you explained, I think the problem is that transforming a factor into numeric in R with as.numeric() returns the internal level codes rather than the labels, so it changes the data. You need to first transform it to character and then to numeric:

ringday2015$Var1 = as.numeric(as.character(ringday2015$Var1))
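To see why the intermediate as.character() step matters, here is a minimal sketch using a made-up factor (the vector `days` is a hypothetical stand-in, not the asker's data):

```r
# Hypothetical factor of day numbers stored as labels, much like table() output
days <- factor(c("42", "43", "45"))

# as.numeric() on a factor returns the internal level codes, not the labels
codes <- as.numeric(days)

# Converting to character first recovers the numbers written in the labels
values <- as.numeric(as.character(days))

print(codes)   # 1 2 3
print(values)  # 42 43 45
```

The codes simply count the levels in order, which is why the asker's days 42 to 91 turned into 1 to 37.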

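Then I simply added xlim(30, max(ringday2015$Var1) + 1) and got what you wanted. The upper limit has to come from Var1, the x variable, and the + 1 leaves room for the full width of the last bar, which xlim would otherwise drop.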
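As an alternative sketch, scale_x_continuous() accepts NA in `limits` to mean "use the value computed from the data", so only the starting day needs to be fixed. This assumes ggplot2 is installed; the data frame `nests` is made up for illustration.

```r
library(ggplot2)

# Hypothetical data: a numeric day column and nest counts
nests <- data.frame(Var1 = c(42, 43, 44, 50), Freq = c(1, 3, 5, 2))

p <- ggplot(nests, aes(Var1, Freq)) +
  geom_bar(stat = "identity") +
  # Fix the lower limit at day 30; NA lets ggplot2 pick the upper limit
  scale_x_continuous(limits = c(30, NA)) +
  xlab("April Day (days after April 1st)") +
  ylab("No. of nests")

print(p)
```

Because the upper limit is taken from the data, plots for different years will still end on different days; pass an explicit upper value if every panel should share the same range.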

I imagine that it'd be more efficient to correct the class of Var1 at the moment of creation of the datasets, if possible.

On a different matter, it can be better to, instead of creating a plot for each dataset, merge all your datasets together and create a faceted plot. One way of doing it would be:

#Empty vector to create the big dataset
#(run this line first, otherwise df still refers to the built-in function stats::df)
df = numeric()
data.names = c("ringday2015", "ringday2016") #The names of your small datasets

#Rbinding it and adding a column saying the name of the dataset
for(i in data.names){
  df = rbind(df, cbind(get(i), Graph = i))
}


#Then plotting it:
ggplot(df, aes(Var1, Freq)) +
  geom_bar(stat = "identity") +
  facet_grid(~Graph) +
  xlim(30, max(df$Var1) + 1) + #Upper limit from Var1 (the x variable), + 1 so the last bar fits
  xlab("April Day (days after April 1st)") +
  ylab("No. of nests") + ggtitle("No. of nests ringed 2015")

Even if you don't choose that approach, you still don't need to create each graph by repeating the ggplot code and changing the dataset, you could do:

for(i in data.names){
  assign(paste0("plot", i),
         ggplot(get(i), aes(...)) + ...)
}

One last thing: creating a bunch of variables can make your code poorly organized, which is why functions like get() and assign() should normally be a last resort. It'd be better if, instead of one variable for each dataset, you had a list containing them all, and then the loops go over the items in that list instead of over variable names. If you like, you can give me more context about how you get those datasets and I can help you with that.
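A minimal sketch of that list-based approach, assuming ggplot2 is installed and that Var1 is already numeric in every dataset. The list `ring_list` and its contents are hypothetical stand-ins for the yearly data frames.

```r
library(ggplot2)

# Hypothetical stand-ins for the yearly datasets, keyed by year
ring_list <- list(
  "2015" = data.frame(Var1 = c(42, 43, 44), Freq = c(1, 1, 2)),
  "2016" = data.frame(Var1 = c(33, 34, 36), Freq = c(2, 4, 1))
)

# Latest day across all years, so every plot shares the same x range
last_day <- max(sapply(ring_list, function(d) max(d$Var1)))

# One plot per list element, without get() or assign()
plots <- lapply(names(ring_list), function(yr) {
  ggplot(ring_list[[yr]], aes(Var1, Freq)) +
    geom_bar(stat = "identity") +
    xlim(30, last_day + 1) +  # + 1 keeps the last bar inside the limits
    xlab("April Day (days after April 1st)") +
    ylab("No. of nests") +
    ggtitle(paste("No. of nests ringed", yr))
})
names(plots) <- names(ring_list)

# The same list can be stacked into one data frame for a faceted plot
all_rings <- do.call(rbind, Map(function(d, yr) cbind(d, Graph = yr),
                                ring_list, names(ring_list)))
```

The resulting list can be handed to cowplot in one go, e.g. cowplot::plot_grid(plotlist = plots).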

  • Thanks Ricardo I will try that. – McMahok Mar 22 '21 at 15:21
  • You're welcome @McMahok, I edited the post with some new info – Ricardo Semião e Castro Mar 22 '21 at 15:29
  • I tried running your code and got this error *Error in rep(xi, length.out = nvar) : attempt to replicate an object of type 'closure'* – McMahok Mar 22 '21 at 15:30
  • In which step did that happen? "object of type closure" means that you supplied a function where a variable was expected. That can happen if `i` (in the loop) is a function name, which breaks `get(i)`. As I said in my edit, it'd be better if your data frames were in a list, so the loop didn't need to use `get(i)`. – Ricardo Semião e Castro Mar 22 '21 at 15:37
  • This happened when I tried to run `#Rbinding it and adding a column saying the name of the dataset for(i in data.names){ df = rbind(df, cbind(get(i), Graph=i))} ` – McMahok Mar 22 '21 at 15:57
  • anyway I am happy with the results, so don't worry about this part. For loops are for another day, thanks for your time and help. – McMahok Mar 22 '21 at 15:58
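One plausible explanation for the error in the comments, offered as a guess since the asker's session isn't shown: if the line `df = numeric()` was never run, `df` still refers to R's built-in F-distribution density function `stats::df`, and rbind() cannot combine a function with a data frame. A minimal sketch with a made-up data frame `small`:

```r
# Hypothetical small dataset standing in for ringday2015
small <- data.frame(Var1 = c(42, 43), Freq = c(1, 2))

# stats::df is a function (a closure), not a data object
is_function <- is.function(stats::df)

# Binding a function to a data frame fails; capture the message instead of stopping
msg <- tryCatch(
  {
    rbind(stats::df, small)
    "no error"
  },
  error = function(e) conditionMessage(e)
)
print(msg)

# Starting from an empty object such as NULL works as intended
combined <- rbind(NULL, cbind(small, Graph = "ringday2015"))
print(combined)
```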
0

So @Ricardo Semião e Castro was correct to point out that Var1 should have been numeric, but I did not convert to character before converting to numeric. See "Changing values when converting column type to numeric" for the answer to this.

Once I did this, the changes I wanted using xlim worked.

McMahok