[Image: Example dataset]

I am new to R programming and have a data frame df with categorical variables w, x, y, and z.

I need to use ggplot2 to plot bar charts for x and y, excluding all zeros.

I created a table using the code

graph1 <- table(df$x, df$y) 

and I used the code

barplot(unlist(graph1[1, graph1[1, ] != 0]))

to plot a bar chart of all nonzero entries. How can I use ggplot2 to get the same result?

  • 4
    Can you include a sample of your data or a similar dataset? – nd37255 Nov 22 '22 at 12:15
  • 2
    It's easier to help you if you include a simple [reproducible example](https://stackoverflow.com/questions/5963269/how-to-make-a-great-r-reproducible-example) with sample input and desired output that can be used to test and verify possible solutions. – MrFlick Nov 22 '22 at 14:46
  • Thank you. I have just added a picture of an example of the dataset I am working with. – Julius Julius Nov 22 '22 at 23:23

1 Answer

I made a simple example to see whether I could follow what you are doing. Whether x or y is zero is not stored in the cells of the table (those hold counts) but in its dimension names (the function table returns an array). So you need to select only the rows and columns whose names are not zero. Below is my code:

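set.seed(1)  # added so the random example is reproducible
df <- data.frame(x = floor(runif(1000) * 3), y = floor(runif(1000) * 4))
graph1 <- table(df$x, df$y)

# Full table, zeros included
graph1
barplot(graph1)

# The category labels are stored as character dimnames
keepRow <- dimnames(graph1)[[1]] != "0"
keepCol <- dimnames(graph1)[[2]] != "0"

# Keep only the rows and columns labelled something other than 0
graph2 <- graph1[keepRow, keepCol]
graph2
barplot(graph2)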
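
As a small illustration of the point about dimension names, here is a sketch with made-up character vectors (the labels "a", "b", "u", "v", and "0" are assumptions for illustration, not values from the question's data). The labels returned by dimnames() are character strings, so the same filtering should carry over to non-numeric categories:

```r
# Toy categorical data; the values are invented for illustration only.
x <- c("a", "b", "0", "a", "b", "a")
y <- c("0", "u", "v", "u", "u", "v")
tab <- table(x, y)

# The category labels live in the dimnames, as character strings.
row_labels <- dimnames(tab)[[1]]
col_labels <- dimnames(tab)[[2]]

# Keep only rows and columns whose label is not "0".
# drop = FALSE keeps the result two-dimensional even if a single row remains.
tab_nonzero <- tab[row_labels != "0", col_labels != "0", drop = FALSE]
print(tab_nonzero)
```

Note that a cell of the filtered table can still hold a count of zero (here, the combination "b" and "v" never occurs); the filter above removes categories labelled "0", not empty cells.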

To do the equivalent in ggplot2, I find it more straightforward to work with df directly, after creating factor variables from the numeric ones:

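library(ggplot2)

df$xx <- as.factor(df$x)
df$yy <- as.factor(df$y)

# All rows, zeros included: one bar per y value, stacked by x
ggplot(df, aes(x = yy)) +
  geom_bar(aes(fill = xx))

# Keep only the rows where neither x nor y is zero
nrow(df)
df2 <- df[df$x != 0 & df$y != 0, ]
nrow(df2)

# Check which rows the filter keeps
head(cbind(df, keep = df$x != 0 & df$y != 0))

# Same plot without the zeros
ggplot(df2, aes(x = yy)) +
  geom_bar(aes(fill = xx))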
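
If the variables are labels rather than numbers, one possible route is to convert the table to a long data frame with as.data.frame(), drop the combinations with a count of zero, and plot the remaining counts with geom_col(). This is only a sketch: the toy data frame and its values are invented, and it assumes character or factor columns named x and y.

```r
library(ggplot2)

# Invented categorical data standing in for the question's df.
df <- data.frame(
  x = c("red", "red", "blue", "blue", "blue", "green"),
  y = c("small", "large", "small", "small", "large", "small")
)

# as.data.frame() on a table gives one row per (x, y) combination,
# with the count stored in a column named Freq.
counts <- as.data.frame(table(x = df$x, y = df$y))

# Drop the combinations that never occur (count of zero).
counts_nonzero <- counts[counts$Freq != 0, ]

# One group of bars per y category, split by x.
# geom_col() plots the counts as given instead of recounting rows.
p <- ggplot(counts_nonzero, aes(x = y, y = Freq, fill = x)) +
  geom_col(position = "dodge")
print(p)
```

To mimic the single-row base R plot from the question, counts_nonzero could be filtered further to one level of x before plotting.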

  • Thanks for this. The reason I used the table() function is that all my variables are categorical and non-numeric (qualitative). – Julius Julius Nov 22 '22 at 20:38