1

I have a tibble (for simplicity's sake, 9 rows by 4 columns).

I want to output a ggplot (or separate data frames to then pipe to a plot function) for each group of rows with matching values in column A.

#Example dataset (simplified):
#(headings across columns are "A", "B", "C", "D")

Excel <- read_excel("C:/...")

#Excel file:
[A   B    C    D   ]
[100 a100 a200 a300]
[110 a110 a220 a330]
[111 a111 a222 a333]
[100 b100 b200 b300]
[110 b110 b220 b330]
[111 b111 b222 b333]
[100 c100 c200 c300]
[110 c110 c220 c330]
[111 c111 c222 c333]

--- 

#For example, I want to output all rows where A=100
#Then pipe to a three-part line graph using ggplot:
[100 a100 a200 a300]
[100 b100 b200 b300]
[100 c100 c200 c300]

The catch is that I cannot manually type in A=100, A=110, A=111... because I have over 9000 distinct values of A to plot.

I was thinking of subsetting by matching A values, but I don't know how to do that without typing in all the A values. I also considered the filter command, but again, I don't know how to use it without typing in every value of A.

E1i
  • 1
    You can do `split(df1, df1$A)` to split it to a `list` of data.frames – akrun Oct 04 '20 at 21:31
  • You can also use `by(df1, df1$A, my_plot_function)` to split and pass splits into function that receives a df as parameter. – Parfait Oct 04 '20 at 22:56
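A minimal sketch of what the two comments above suggest, assuming numeric stand-in values for columns B, C and D (the question's values are placeholders). The data frame `df1` below is a toy example and `my_plot_function` is a hypothetical helper; neither comes from the original post.

```r
library(ggplot2)

# Toy stand-in for the question's data (assumed numeric so it can be plotted)
df1 <- data.frame(
  A = rep(c(100, 110, 111), times = 3),
  B = c(1, 2, 3, 4, 5, 6, 7, 8, 9),
  C = c(2, 3, 4, 5, 6, 7, 8, 9, 10),
  D = c(3, 4, 5, 6, 7, 8, 9, 10, 11)
)

# Hypothetical helper: takes one sub-data-frame and returns one plot
my_plot_function <- function(d) {
  ggplot(d, aes(x = B, y = C)) +
    geom_line() +
    ggtitle(paste("A =", d$A[1]))
}

# split() returns a named list with one data frame per distinct value of A,
# so no value of A has to be typed in by hand
pieces <- split(df1, df1$A)

# Apply the helper to every piece; the result is a named list of plots
plots <- lapply(pieces, my_plot_function)
```

`by(df1, df1$A, my_plot_function)` does the split and the function call in one step; the `split()` plus `lapply()` form is shown here because it keeps the intermediate list of data frames available.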

3 Answers

0

group_by from dplyr will work with ggplot.

The example below uses iris and creates one plot for each unique value in the Species column, with Petal.Width on the x-axis and Petal.Length on the y-axis.

library(dplyr)
library(ggplot2)

input <- iris

# Build one plot per Species; the result has a list-column named plots
output <- input %>%
  group_by(Species) %>%
  do(plots = ggplot(data = .) +
       aes(x = Petal.Width, y = Petal.Length) +
       geom_point() +
       ggtitle(as.character(unique(.$Species))))

Group by column A in the same way, and this method will let you pipe each group into your desired plot.
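As a rough sketch of what can be done with the result, the plots can be pulled out of the list-column and written to files one by one. The file naming scheme and the use of `tempdir()` are assumptions made for illustration only.

```r
library(dplyr)
library(ggplot2)

# Same idea as above: one plot per Species, stored in a list-column
output <- iris %>%
  group_by(Species) %>%
  do(plots = ggplot(data = ., aes(x = Petal.Width, y = Petal.Length)) +
       geom_point())

# One row per group; output$plots[[i]] is the plot for output$Species[i]
out_dir <- tempdir()  # assumed output location, replace with your own folder
for (i in seq_len(nrow(output))) {
  ggsave(
    filename = file.path(out_dir, paste0("plot_", output$Species[i], ".pdf")),
    plot = output$plots[[i]],
    width = 5, height = 4
  )
}
```

With thousands of groups, saving to files inside a loop like this is usually more practical than printing every plot to the screen.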

I can't provide a plot-specific example without more information, but I can amend this answer if you provide more details.

jomcgi
0

Plotting with ggplot2 works best when the data is in long format. Since A is your key column here, we can keep that column as it is and pivot the remaining columns into long format.

library(tidyverse)
long_data <- df %>% pivot_longer(cols = -A)

As far as plotting is concerned, the question is not specific about what exactly needs to be done, but you can create a line graph with a different color for each value in A.

ggplot(long_data) + aes(name, value, color = A) + geom_line()

For anything more complicated, we can split the data by each unique value in A and then create a list of plots with map.

plot_list <- long_data %>%
  group_split(A) %>%
  map(~ ggplot(.x, aes(name, value)) + geom_point())  # swap geom_point() for your own plot code
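A self-contained sketch of the split-and-map approach, assuming numeric stand-in values for B, C and D (the question's values are placeholders). The `row_id` column is a hypothetical addition, not part of the original data; it lets each original row be drawn as its own line, which gives three lines per value of A.

```r
library(dplyr)
library(tidyr)
library(purrr)
library(ggplot2)

# Toy data in the shape of the question (values assumed numeric)
df <- tibble(
  A = rep(c(100, 110, 111), times = 3),
  B = c(1, 2, 3, 4, 5, 6, 7, 8, 9),
  C = c(2, 4, 6, 8, 10, 12, 14, 16, 18),
  D = c(3, 6, 9, 12, 15, 18, 21, 24, 27)
)

long_data <- df %>%
  mutate(row_id = factor(row_number())) %>%  # remember the original row of each value
  pivot_longer(cols = c(B, C, D))            # creates the name and value columns

# One tibble per distinct value of A, in sorted order of A
groups <- long_data %>% group_split(A)

# One plot per group, with one line per original row
plot_list <- groups %>%
  map(~ ggplot(.x, aes(name, value, group = row_id, color = row_id)) +
        geom_line() +
        ggtitle(paste("A =", .x$A[1])))

# group_split() returns an unnamed list, so label each plot by its A value
names(plot_list) <- map_chr(groups, ~ as.character(.x$A[1]))
```

After this, `plot_list[["100"]]` is the plot for A = 100, and no value of A was typed in to build the list.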
Ronak Shah
-1

It looks like you want to filter and transpose your data:

library(dplyr)


have <- data.frame(column_a = c(100, 110, 111, 100, 110, 111, 100, 110, 111),
                   column_b = c("a100", "a110", "a111", "b100", "b110", "b111", "c100", "c110", "c111"),
                   column_c = c("a200", "a220", "a222", "b200", "b220", "b222", "c200", "c220", "c222"),
                   column_d = c("a300", "a330", "a333", "b300", "b330", "b333", "c300", "c330", "c333")
                   )


filtered_df <- have %>%
  dplyr::filter(column_a == 100)
  
transposed_df <- as.data.frame(t(filtered_df)) # t() is a function from base R

transposed_df now contains:

[image: contents of transposed_df]

You want the data in this format so that each original row can be plotted as a series, which is much easier when the rows are represented as columns.

Alternatively, without transposing, all you need to do is the filter step, giving you filtered_df:

[image: contents of filtered_df]

Davide Lorino
  • 1
    I'm not sure how they want to plot it, but since they mentioned ggplot, I don't think what you've said about plotting each row as a series makes sense. Also, you're filtering manually for a specific value of A, but they said they *don't* want to do that: " I cannot manually type in A=100, A=110, A=111... because I have over 9000 variations of A to plot." – camille Oct 04 '20 at 23:09
  • Thanks for the solution. It won't work for our situation because we cannot input all the A values. There are too many to type in. – E1i Oct 12 '20 at 20:37