The first argument of most ggplot2 layer functions is mapping, which expects the output of aes().
So in your function definition, the data frame data is passed positionally and ends up implicitly assigned to the mapping argument.
To get around this, name the argument explicitly (data = data) in your function definitions.
For example:
output <- ggplot() +
  geom_bar(data = data, mapping = mapping,
           stat = "identity",
           position = "stack")
}
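To see why naming the arguments matters, the sketch below inspects the argument order of geom_bar and compares a positional call with a named one. The data frame df and the names bad and good are made up for illustration, and the exact error text depends on your ggplot2 version, so it is only captured, not shown.

```r
library(ggplot2)

# The first formal argument of a layer function is mapping; data comes second.
arg_order <- names(formals(geom_bar))[1:2]
arg_order

df <- data.frame(x = c("a", "b"), y = c(1, 2))  # hypothetical toy data

# Passing the data frame positionally puts it in the mapping slot,
# which ggplot2 rejects because it was not created by aes().
bad <- tryCatch(
  ggplot() + geom_bar(df, aes(x = x, y = y), stat = "identity"),
  error = function(e) conditionMessage(e)
)

# Naming both arguments avoids the mix-up.
good <- ggplot() +
  geom_bar(data = df, mapping = aes(x = x, y = y), stat = "identity")
```

If the positional call fails as expected, bad holds the error message as a character string, while good is an ordinary ggplot object.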
EDIT:
There are many ways to do this, and the right one depends on how complex you want your function to be. If you are going to stick to a global aesthetic mapping, you can leave the mapping in the main ggplot call, set data = NULL there, and then specify which data frame belongs to which layer.
Consider the following reproducible example:
library(ggplot2)
data1 <- data.frame(v1 = rnorm(10, 50, 20), v2 = rnorm(10, 30, 5))
data2 <- data.frame(v1 = rnorm(10, 100, 20), v2 = rnorm(10, 50, 10))
plot_custom_ggplot <- function(df1, df2, mapping) {
  ggplot(data = NULL, mapping = mapping) +
    geom_point(data = df1, color = "blue") +
    geom_line(data = df2, color = "red")
}
plot_custom_ggplot(data1, data2, aes(x = v1, y = v2))
In this example, the mapping argument of each geom_* layer function is left blank, so the mapping is inherited from the main ggplot call.
Data normally works the same way: a layer uses the data frame given in the main ggplot function unless you give it its own. Whenever you specify a data argument in a layer, it replaces the inherited data for that layer. A mapping argument in a layer is combined with the inherited mapping: aesthetics you name in the layer take precedence, and any that are missing are taken from the main call.
library(ggplot2)
data1 <- data.frame(v1 = rnorm(10, 50, 20), v2 = rnorm(10, 30, 5))
data2 <- data.frame(v1 = rnorm(10, 100, 20), v2 = rnorm(10, 50, 10), z = c("A", "B"))
plot_custom_ggplot <- function(df1, df2, mapping) {
  ggplot(data = NULL, mapping = mapping) +
    geom_point(data = df1, color = "blue") +
    geom_line(data = df2, mapping = aes(color = z))  # inherits x and y mapping from main ggplot call
}
plot_custom_ggplot(data1, data2, aes(x = v1, y = v2))
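One way to check which aesthetics a layer ends up with is to compare the plot-level mapping, the layer-level mapping, and the computed layer data. This is a minimal sketch with a small made-up data frame; df, p, and the other object names are illustrative only.

```r
library(ggplot2)

df <- data.frame(v1 = 1:4, v2 = c(2, 4, 3, 5), z = c("A", "B"))  # toy data

p <- ggplot(data = NULL, mapping = aes(x = v1, y = v2)) +
  geom_line(data = df, mapping = aes(color = z))

plot_aes  <- names(p$mapping)              # aesthetics set in the main call
layer_aes <- names(p$layers[[1]]$mapping)  # aesthetics set in the layer only

# layer_data() returns the data the layer is drawn from,
# after the inherited and layer-level mappings have been combined.
computed <- layer_data(p, 1)
names(computed)
```

The layer itself only declares the colour aesthetic, yet the computed data also contains x and y, which come from the main ggplot call.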
But adding additional aes mappings is risky if you are also specifying data, because your data frame may not always contain the required columns.
plot_custom_ggplot(df1 = data2, df2 = data1, aes(x = v1, y = v2))
# Error in FUN(X[[i]], ...) : object 'z' not found
#
# The column z is not present in the data1 object.
# R then looked globally for a z object and didn't find anything.
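If you do keep a layer-specific mapping, one option is to check for the required column up front so the failure is easier to understand. The function name plot_custom_ggplot_checked, the message text, and the toy data frames below are all hypothetical; this is just a sketch of the idea.

```r
library(ggplot2)

# Hypothetical variant of plot_custom_ggplot that validates its input first.
plot_custom_ggplot_checked <- function(df1, df2, mapping) {
  # The line layer maps color to z, so df2 must provide that column.
  if (!"z" %in% names(df2)) {
    stop("df2 must contain a column named 'z'", call. = FALSE)
  }
  ggplot(data = NULL, mapping = mapping) +
    geom_point(data = df1, color = "blue") +
    geom_line(data = df2, mapping = aes(color = z))
}

with_z    <- data.frame(v1 = c(1, 2, 3, 4), v2 = c(2, 4, 3, 5), z = c("A", "B"))
without_z <- data.frame(v1 = c(1, 2, 3, 4), v2 = c(5, 3, 4, 2))

# Works: the second data frame has a z column.
p <- plot_custom_ggplot_checked(without_z, with_z, aes(x = v1, y = v2))

# Fails early with a clear message: the second data frame lacks z.
msg <- tryCatch(
  plot_custom_ggplot_checked(with_z, without_z, aes(x = v1, y = v2)),
  error = function(e) conditionMessage(e)
)
```

The check fires when the function is called rather than when the plot is printed, which makes the source of the problem easier to trace.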
I believe it is best practice to use tidy data when working with ggplot, because things become so much easier. There is usually no reason to use multiple data frames, especially if you plan to use one set of mappings for all of them. A good exception is a plotting function for a custom R object whose structure you know in advance.
Otherwise, compare how these two functions behave in this example:
set.seed(76)  # set the seed before drawing random numbers so the data are reproducible
data1 <- data.frame(v1 = rnorm(20, 50, 20), v2 = rnorm(20, 30, 5), letters = letters[1:20], id = "df1")
data2 <- data.frame(v1 = rnorm(20, 100, 20), v2 = rnorm(20, 50, 10), letters = letters[17:26], id = "df2")
# One data frame, one layer
plot_custom_ggplot2 <- function(df, mapping) {
  ggplot(data = df, mapping = mapping) +
    geom_bar(stat = "identity",
             position = "stack")
}
# Two data frames, two layers
plot_custom_ggplot <- function(df1, df2, mapping) {
  ggplot(data = NULL, mapping = mapping) +
    geom_bar(data = df1, stat = "identity",
             position = "stack") +
    geom_bar(data = df2, stat = "identity",
             position = "stack")
}
plot_custom_ggplot(data1, data2, aes(x = letters, y = v2, fill = id))
plot_custom_ggplot2(rbind(data1, data2), aes(x = letters, y = v2, fill = id))
In the first plot, the red bars for q, r, s, and t are hidden behind the blue bars. This is because the two data frames are drawn as separate layers, one on top of the other, and position = "stack" only stacks bars within a single layer. In the second plot, these values actually stack because they were added together in a single layer rather than two separate ones.
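The difference between overlaying and stacking can also be checked numerically by looking at the bar extents ggplot2 computes for each layer. The sketch below uses two one-row data frames with made-up values; all object names are illustrative.

```r
library(ggplot2)

df1 <- data.frame(x = "q", y = 3, id = "df1")  # toy data
df2 <- data.frame(x = "q", y = 5, id = "df2")

# Two layers: each bar is positioned independently of the other.
two_layers <- ggplot(data = NULL, mapping = aes(x = x, y = y, fill = id)) +
  geom_bar(data = df1, stat = "identity", position = "stack") +
  geom_bar(data = df2, stat = "identity", position = "stack")

# One layer: both bars are positioned together and therefore stack.
one_layer <- ggplot(data = rbind(df1, df2), mapping = aes(x = x, y = y, fill = id)) +
  geom_bar(stat = "identity", position = "stack")

first  <- layer_data(two_layers, 1)[, c("ymin", "ymax")]
second <- layer_data(two_layers, 2)[, c("ymin", "ymax")]
single <- layer_data(one_layer, 1)[, c("ymin", "ymax")]
```

With two layers, both bars start at zero, so the shorter one sits behind the taller one. With a single layer, one bar starts where the other ends, and the total height is the sum of the two values.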
I hope this gives you enough information to write your ggplot function.