I'm looking to create a plot of multiple box plots in R. However, rather than each box plot being defined by a category on the x-axis, I want it to be defined by a data range (specifically, 20 different ranges of water depth from approximately 0 to 0.6 m).

I would like the series of box plots to display water depth range on the x-axis (x-x1, x1-x2, x2-x3, etc) and electrical conductivity on the y axis.

I have already created quantile based breakpoints using the following code:

x = dataset$Depth
ncells = 20
# 21 quantile breakpoints (0%, 5%, ..., 100%) give 20 equal-count bins
breaks = quantile(x, seq(0, 1, by = 1/ncells))
table(cut(x, breaks = breaks))

(0.155,0.158] (0.158,0.162] (0.162,0.166] (0.166,0.168]  (0.168,0.17] 
         1116          1116          1116          1116          1116 
 (0.17,0.171] (0.171,0.173] (0.173,0.174] (0.174,0.175] (0.175,0.176] 
         1116          1117          1116          1116          1116 
(0.176,0.177] (0.177,0.179] (0.179,0.185] (0.185,0.187] (0.187,0.189] 
         1116          1116          1116          1117          1116 
(0.189,0.191] (0.191,0.194] (0.194,0.206] (0.206,0.244] (0.244,0.592] 
         1116          1116          1116          1116          1117 

These ranges are what I'd like to use to define the limits of each boxplot in the series.

Do I need to label these somehow so that they are treated as discrete categories in the dataset (e.g. flow1, flow2, flow3, etc.), or is there some way of using these data ranges as they are to define the data included in each box plot in the series?

Ideally I'd like to plot in ggplot2 if that makes any difference.

Many thanks in advance!

  • Hi Elle! Welcome! Please provide [a minimal reproducible example](https://stackoverflow.com/questions/5963269/how-to-make-a-great-r-reproducible-example) – Mark Jun 26 '23 at 12:13
  • Thanks very much. I didn't know how to do this but have just read up so should be able to next time! Thanks for doing it for me this time! – Elle Jun 26 '23 at 14:43
  • The edits are nice for formatting, but it is certainly not reproducible! (You want electrical conductivity on the y-axis, but no data for that is in your question!) It would be great if you could edit your question to include something like `dput(dataset[1:20, c("Depth", "your_electrical_conductivity_column_name")])` so that we have sample data from the relevant columns. – Gregor Thomas Jun 26 '23 at 14:46
  • But generally, add the `cut` categories as a column in your data frame, then `ggplot(your_data, aes(x = your_cut_categories, y = electrical_conductivity)) + geom_boxplot()` – Gregor Thomas Jun 26 '23 at 14:47
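
A minimal sketch of the approach suggested in the comment above, not a tested answer for the real data. Since the actual `dataset` was never posted, the data frame below is simulated, and the column names `EC` and `DepthBin` are hypothetical placeholders. `cut()` returns a factor, so the intervals should already behave as discrete categories without any manual relabelling. `include.lowest = TRUE` is an addition here so that the minimum depth is not dropped from the first bin.

```r
library(ggplot2)

# Simulated stand-in data (assumption): the real dataset was not provided.
set.seed(1)
dataset <- data.frame(
  Depth = runif(2000, min = 0.155, max = 0.592),
  EC    = rnorm(2000, mean = 500, sd = 50)  # hypothetical conductivity column
)

# Quantile-based breakpoints, as in the question.
ncells <- 20
breaks <- quantile(dataset$Depth, probs = seq(0, 1, by = 1 / ncells))

# Store the binned depths as a factor column; each level is one depth range.
dataset$DepthBin <- cut(dataset$Depth, breaks = breaks, include.lowest = TRUE)

# One box plot per depth range, with conductivity on the y-axis.
p <- ggplot(dataset, aes(x = DepthBin, y = EC)) +
  geom_boxplot() +
  labs(x = "Water depth range (m)", y = "Electrical conductivity") +
  theme(axis.text.x = element_text(angle = 90, hjust = 1))

print(p)
```

With the real data, `EC` would be replaced by the actual conductivity column name. The rotated axis text is only there to keep 20 interval labels from overlapping.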

0 Answers