
I've got a copy of a figure that needs error bars added to it. My data is coming from one CSV file and I've tried separating this into 4 distinct, workable sections for 4 depth profiles (ligand and logK; station 5 and station 9). The following code is for one of these profiles in hopes that I'll be able to recreate this solution 3 more times. Right now, my code looks like this, where I've read in a CSV file, transformed it from wide to long, and plotted using the ggplot function. However, I'm having trouble visualizing how to add horizontal error bars without ggplot thinking the columns for error bars are actual points I want to plot on the graph. I have a feeling it has something to do with my data wrangling in the beginning, but I'm not sure what. (Note: this is my first post here, so if it's not an actual reprex, please let me know!! I tried my best to be clear, but if it's not I will try and amend).

What I have so far...without error bars

Note: the actual plot I have has many of the aesthetics adjusted as well, but to try and cut down on code, I left those lines out.

Figure with error bar data plotted as points on the graph

This figure uses an adjusted gather() call in which the error columns are included in the measurement column of the long data; it is listed below for reference:

station5_L1_long <- gather(station5_L1, col_names, measurement, dFe:L1_diff_from_mean, factor_key=TRUE)

Figure with adjusted aesthetics

library(ggplot2) #using the ggplot package to plot
library(magrittr) #using the magrittr package to pipe
library(dplyr) #using the dplyr package for select() and filter()
library(tidyr) #using the tidyr package to convert between wide and long data forms

dput(ligand_data[1:10, ]) #ligand data frame including both station 5 and 9

#dput() result
structure(list(Station = structure(c(1L, 1L, 1L, 1L, 1L, 1L, 
1L, 1L), .Label = c("5", "9"), class = "factor"), Depth = c(2L, 
2700L, 3000L, 30L, 3300L, 3600L, 3900L, 4200L), dFe = c(0.31, 
0.65, 0.66, 0.3, 0.65, 0.62, 0.61, 0.61), L1ship_nM = c(1.265, 
1.46, NA, 1.365, NA, NA, 1.33, NA), L1lab_nM = c(1.32, 1.93, 
1.92, 1.35, 2.23, 1.99, 1.8, 2.4), L1A_nM = c(1.18, 1.37, NA, 
1.39, NA, NA, 1.36, NA), L1B_nM = c(1.35, 1.55, NA, 1.34, NA, 
NA, 1.3, NA), L1all_nM = c(1.283333333, 1.616666667, NA, 1.36, 
NA, NA, 1.486666667, NA), L1freeze2013_nM = c(1.52, NA, NA, NA, 
NA, NA, NA, NA), L1_allfreeze_nM = c(1.42, 1.93, NA, 1.35, NA, 
NA, 1.8, NA), L1_ALL_nM = c(1.3425, 1.616666667, NA, 1.36, NA, 
NA, 1.486666667, NA), L1shipSD_nM = c("0.120208153", "0.127279221", 
"", "0.035355339", "", "", "0.042426407", ""), L1allSD_nM = c(0.090737717, 
0.285890422, NA, 0.026457513, NA, NA, 0.273007936, NA), L1_allfreezeSD_nM = c(0.141421356, 
NA, NA, NA, NA, NA, NA, NA), L1_ALL_SD_nM = c(0.139612559, 0.285890422, 
NA, 0.026457513, NA, NA, 0.273007936, NA)), row.names = c(NA, 
8L), class = "data.frame")

##################### CLEANING DATA ###########################
ligand_data <- merge(base3_cols, ligand, by.x = 0, by.y = 0, all.x = TRUE) %>%
  select(-Row.names) 

#filtering for depth profiles
station5_L1 <- ligand_data %>% filter(Station == 5) #filtering ligand df to include just station 5
#changing from wide to long
station5_L1_long <- gather(station5_L1, col_names, measurement, dFe:L1_diff_from_mean, factor_key=TRUE) 

################################ PLOTS w/ggplot ##################################
station5L1_depth_profile <- ggplot(data = station5_L1_long,  
                                        aes(color = col_names, 
                                            shape = col_names,
                                            fill = col_names,
                                            size = 0.25)
) + 
  geom_point(mapping = aes(
    x = as.numeric(measurement),
    y = as.numeric(Depth),
    size = 0.25
  )) +
  scale_y_reverse() + 
  guides(size = "none") + 
  scale_x_continuous(position = "top", breaks = scales::breaks_width(0.5)) + #moves x-axis to top 
  expand_limits(x = c(0, 2.5)) + 
  scale_shape_manual(values=c(4, 21, 21, 23, 22, 25, 24, 13, 5))+
  scale_fill_manual(#labels = c("dFe", "A", "B", "ship", "lab", "2013", "all pre-2013", "all"),
                    values=c("#000000", "#74ADD1", "#D73027", "#4575B4", "#ABD9E9", 
                             "#E0F3F8", "#F46D43", "#313695", "#000001")) +
  scale_color_manual(values=c("#000000", "#000000", "#000000", "#000000", "#000000", 
                            "#000000", "#000000", "#000000", "#000000"))
station5L1_depth_profile 
  • Rather than sharing the output of `head()`, use something like `dput()` and skip the `read.csv` parts we can't run because we won't have the file. See [how to create a reproducible example](https://stackoverflow.com/questions/5963269/how-to-make-a-great-r-reproducible-example) for more info. How exactly do you want to calculate the error bar distances? – MrFlick Apr 06 '21 at 19:28
  • Hi @MrFlick! Thank you for your response. I've updated the post to include the adjustments you mentioned. Hopefully this lets you run the code on your computer. I'd like the error bars to come from pre-calculated columns (these include diff_from_mean in the column name). This was part of the issue I was having: the error bars come from data already in the df, and ggplot was trying to plot these data as actual points when they were included in the long format. – Calyn Crawford Apr 16 '21 at 17:18

1 Answer

I hope I understand your problem correctly. Here is my solution. First I split the data into two data frames, since that makes the pivoting easier to handle. Next I pivot both into long format and prepare them for joining.

After joining, I can plot the result and draw a line range around each point.

Please also try to explain your dataset next time, as it's hard to understand without any further information.
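The split, pivot, and join steps described above can be sketched on a small made-up data frame. This is only an illustration: `toy`, `vals`, `sds`, and `toy_long` are hypothetical names, and the numbers are invented to resemble the question's columns. The point is that each measurement ends up on the same row as its own SD, so the SD never appears as a value to plot.

```r
library(dplyr)
library(tidyr)
library(stringr)

# Hypothetical toy data: two measurement columns and their SD columns
toy <- data.frame(
  Depth        = c(2, 30),
  L1ship_nM    = c(1.27, 1.37),
  L1_ALL_nM    = c(1.34, 1.36),
  L1shipSD_nM  = c(0.12, 0.04),
  L1_ALL_SD_nM = c(0.14, 0.03)
)

# Measurements in long form
vals <- toy %>%
  select(Depth, L1ship_nM, L1_ALL_nM) %>%
  pivot_longer(-Depth, names_to = "col_names", values_to = "val")

# SDs in long form; "_?SD" also drops the extra underscore in "L1_ALL_SD_nM",
# so the SD names line up with the measurement names
sds <- toy %>%
  select(Depth, L1shipSD_nM, L1_ALL_SD_nM) %>%
  pivot_longer(-Depth, names_to = "col_names", values_to = "sd") %>%
  mutate(col_names = str_replace(col_names, "_?SD", ""))

# One row per Depth x measurement, with its SD alongside
toy_long <- left_join(vals, sds, by = c("Depth", "col_names"))
toy_long
```

If a name in `sds` does not exactly match a name in `vals` after the replacement, the join leaves `sd` as `NA` for that measurement, so it is worth checking the names before plotting.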

library(tidyverse)

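# df is assumed to be the data frame from the dput() output in the question
# make everything numeric
# as far as I can see this makes sense
# (go through as.character() so the Station factor keeps its labels 5 and 9
# instead of becoming the level codes 1 and 2)
df <- df %>%
  mutate(
    across(everything(), ~ as.numeric(as.character(.x)))
  )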
# For easier manipulating we split the df
main_df <- df %>% select(Station:L1_ALL_nM)
sd_df   <- df %>% select(Station:Depth, L1shipSD_nM:L1_ALL_SD_nM)
# now we pivot longer
main_df <- main_df %>% 
  pivot_longer(cols = dFe:L1_ALL_nM, names_to = "col_names", values_to = "val")
sd_df <- sd_df %>%
  pivot_longer(cols = L1shipSD_nM:L1_ALL_SD_nM, names_to = "col_names", values_to = "sd") %>% 
  mutate(
    # remove SD from the name (and the extra underscore in L1_ALL_SD_nM),
    # so the names match those in main_df
    col_names = str_replace_all(col_names, "_?SD", "")
  )
# join the tables
plot_df <- main_df %>% full_join(sd_df, by = c("Station", "Depth", "col_names"))

# Plot our result
plot_df %>% 
  ggplot(
    aes(y = Depth, x = val, color = col_names, shape = col_names, fill = col_names)
  ) +
  geom_pointrange(
    aes(xmin = val - sd, xmax = val + sd)
    )
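
One thing to watch for: `geom_pointrange()` needs `xmin` and `xmax` for every row, so measurements without an SD (for example `dFe`) may be dropped from the plot with a "missing values" warning. A possible variant, sketched here on made-up data, draws the bars and the points in separate layers so that every measurement keeps its point. `toy_plot_df` is a hypothetical stand-in with the same columns as `plot_df`.

```r
library(ggplot2)

# Hypothetical long-format data shaped like plot_df; dFe has no SD
toy_plot_df <- data.frame(
  Depth     = c(2, 30, 2, 30),
  col_names = c("dFe", "dFe", "L1ship_nM", "L1ship_nM"),
  val       = c(0.31, 0.30, 1.27, 1.37),
  sd        = c(NA, NA, 0.12, 0.04)
)

p <- ggplot(toy_plot_df, aes(x = val, y = Depth, color = col_names)) +
  # horizontal bars only where an SD exists; rows with NA are skipped quietly
  geom_linerange(aes(xmin = val - sd, xmax = val + sd), na.rm = TRUE) +
  # every measurement still gets a point, SD or not
  geom_point() +
  scale_y_reverse() +
  scale_x_continuous(position = "top")
p
```

The shape, fill, and manual scale settings from the question can be added to this in the same way as before.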
