I'm trying to build several scatter plots of each element against Al2O3, using one column (Sample) for the shape of the points and another (Feature) for their color.

This is the dataset, with 19 rows and only two elements (Al2O3 and MgO):

  dput(SenAssuncao)
structure(list(Sample = c("SA-2021-04", "SA-2021-04", "SA-2021-04", 
"SA-2021-04", "SA-2021-04", "SA-2021-03", "SA-2021-03", "SA-2021-03", 
"SA-2021-03", "SA-2021-03", "SA-2021-02", "SA-2021-02", "SA-2021-02", 
"SA-2021-02", "SA-2021-02", "SA-2021-01", "SA-2021-01", "SA-2021-01", 
"SA-2021-01"), Feature = c("Light", "Light", "Mix", "Mix", "Layer", 
"Mix Light I Ti", "Mix II", "Main Ca", "Mix II", "Mix Light I", 
"Dark", "Blur", "Dark Cl", "Fracture Light", "Mix Mass", "Be pure", 
"Be pure", "Be rock", "Be rock"), Al2O3 = c(1.0381, 0.8721, 8.7012, 
11.3049, 1.2254, 10.7386, 15.025, 3.72, 17.7018, 11.258, 2.6827, 
14.9632, 2.253, 3.2849, 0.9544, 22.1522, 20.7351, 20.5441, 22.6549
), MgO = c(1e-06, 1.327, 1e-06, 1e-06, 1e-06, 1.1713, 1.0625, 
1e-06, 1.6261, 1.5062, 1e-06, 1e-06, 1e-06, 1.4719, 1.4962, 1.237, 
2.0311, 0.8032, 1e-06), LegendGroup = c("SA-2021-04 Light", "SA-2021-04 Light", 
"SA-2021-04 Mix", "SA-2021-04 Mix", "SA-2021-04 Layer", "SA-2021-03 Mix Light I Ti", 
"SA-2021-03 Mix II", "SA-2021-03 Main Ca", "SA-2021-03 Mix II", 
"SA-2021-03 Mix Light I", "SA-2021-02 Dark", "SA-2021-02 Blur", 
"SA-2021-02 Dark Cl", "SA-2021-02 Fracture Light", "SA-2021-02 Mix Mass", 
"SA-2021-01 Be pure", "SA-2021-01 Be pure", "SA-2021-01 Be rock", 
"SA-2021-01 Be rock")), row.names = c(NA, -19L), class = c("tbl_df", 
"tbl", "data.frame"))

Using this code, only 9 out of 19 rows are plotted:

library(ggplot2)       # ggplot(), geom_point(), scale_*_manual()
library(RColorBrewer)  # brewer.pal()

SenAssuncao$LegendGroup <- paste(SenAssuncao$Sample, SenAssuncao$Feature, sep = " ")

LableSymbols <- c(17,17,17,17,17, # 5 SA-2021-04
                  15,15,15,15,15, # 5 SA-2021-03
                  18,18,18,18,18, # 5 SA-2021-02
                  20,20,20,20) # 4 SA-2021-01

LabelFeatures <- as.character(SenAssuncao$LegendGroup)
  
    print(ggplot(SenAssuncao) + 
            geom_point(aes(x= Al2O3, y= MgO,
                           shape = LegendGroup, colour = LegendGroup)) + 
            ylab("MgO (wt%)") + xlab("Al2O3 (wt%)") +
            scale_y_continuous(labels=function(x) format(x, scientific = FALSE)) +
            scale_color_manual(name = "Sample and Feature",
                               labels = LabelFeatures,
                               # "Set1" has at most 9 colours; colorRampPalette() interpolates the rest
                               values = colorRampPalette(brewer.pal(9, "Set1"))(length(SenAssuncao$LegendGroup))) +
            scale_shape_manual(name = "Sample and Feature",
                               labels = LabelFeatures,
                               values = LableSymbols) +
            guides(color = guide_legend(ncol=2)) +
            theme(legend.position="right", legend.text = element_text(size = 6),
                  legend.title = element_text(size = 7)))

So I'd like to know what I need to do to plot the entire dataset.
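
One quick check before changing the plot, offered as a sketch rather than a diagnosis: manual scales are matched against the distinct values of the mapped column, not against its rows. The snippet below rebuilds only the `Sample` and `Feature` columns from the `dput()` output above (base R only, no plotting) and compares the number of distinct `LegendGroup` values with the number of rows, which is the length of the vectors passed to `labels` and `values`. The names `n_rows`, `n_groups` and `repeated` are introduced here for illustration.

```r
# Rebuild the two grouping columns from the dput() output above.
Sample <- rep(c("SA-2021-04", "SA-2021-03", "SA-2021-02", "SA-2021-01"),
              times = c(5, 5, 5, 4))
Feature <- c("Light", "Light", "Mix", "Mix", "Layer",
             "Mix Light I Ti", "Mix II", "Main Ca", "Mix II", "Mix Light I",
             "Dark", "Blur", "Dark Cl", "Fracture Light", "Mix Mass",
             "Be pure", "Be pure", "Be rock", "Be rock")
LegendGroup <- paste(Sample, Feature, sep = " ")

n_rows   <- length(LegendGroup)          # length of the per-row labels/values vectors
n_groups <- length(unique(LegendGroup))  # number of entries the scale actually has

cat("rows:", n_rows, " distinct groups:", n_groups, "\n")

# Groups that occur on more than one row
repeated <- names(which(table(LegendGroup) > 1))
print(repeated)
```

For this subset the two counts differ (19 rows, 14 distinct groups), so a per-row vector of labels or symbols does not line up one-to-one with the legend entries.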

  • Welcome to SO! It seems like your problem is unrelated to the looping through columns, so I'd recommend simplifying the code to just one y variable; maybe MgO because it comes 1st. :) Add an example dataset for the columns needed for that plot. See some ideas on how to do this [here](https://stackoverflow.com/a/5963610/2461552). I think the `dput()` option may help you (also see `datapasta::df_paste()`, which I love). The gist of your question appears to be about combining shape and color legends, so only provide a handful of rows that demonstrate the problem, not the whole thing. – aosmith Oct 06 '21 at 14:51
  • Hi, thank you for the suggestions. Since I'm new to programming, I'm afraid that if I write the code without the loop, I won't be able to write it again afterward. But I provided a part of the dataset as you said. And since it seems to work only for the first 16 rows, I put 20 of them – jlima.geol Oct 07 '21 at 11:17
  • Can you `dput()` the data or something similar? The key is getting the data in a format so someone can take it and paste it right in to R. Also, definitely keep your original code! :-D Making code just for a SO question means writing a little bit of separate code. It can feel like work at first when it seems easier to just put all your code in, but making simple examples really can increase folks' interest in helping. It's the classic "help us help you" phenom. :) – aosmith Oct 07 '21 at 14:21
  • Hi! Thank you again for your comments. If I understood them well, I guess now I have what you suggested =D – jlima.geol Oct 20 '21 at 10:59
  • Great, good job adding the dataset and simple example! I see right now you are making a legend that has multiple entries for the same value. Like "SA-2021-04 Light" is repeated twice, etc. What is your ideal legend? I understand you want samples to have different shapes but to color by feature. Does this mean, e.g., all "SA-2021-04" will have the same shape across features and "Light" will all have the same color across samples? – aosmith Oct 20 '21 at 14:55
  • Thank you! My ideal legend is actually what I get from the code, but the problem is that it doesn't plot all the features. Yes, I want samples with different shapes and features with different colors. But for example, the feature "Light" is only for "SA-2021-02". I've just seen that the dataset I put here had a restricted number of features, which makes them repeated. When I choose other rows with different features (I updated the dataset), it gives me exactly what I want. However, when I do the same inside the loop, for all the elements in the database, it doesn't plot all the features.... – jlima.geol Oct 21 '21 at 08:44
  • I found out that when I put the code inside the loop, it only plots the first 16 rows of each element. 16 is the number of features in the entire dataset. But the 16 first rows only encompass samples SA-2021-04 and SA-2021-03... I don't know what the problem might be – jlima.geol Oct 22 '21 at 09:03
  • Maybe you need to set the `limits` of your scales based on all label features? Like `limits = LabelFeatures` from the whole dataset. Then you can remove `labels`. Setting limits will allow extra legend items that aren't in your dataset. This does not repeat items though, so it will, e.g., only show "SA-2021-04 Light" in the legend once. You'll need to check that your symbols line up right (but it appears that they might). – aosmith Oct 22 '21 at 13:45
  • aosmith, it worked now! Thank you SO MUCH for your patience and enormous help. I really appreciate it =D =D =D – jlima.geol Oct 23 '21 at 11:41
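
For later readers, a minimal sketch of the `limits` suggestion from the comments, not the asker's final code. It assumes a tiny three-row data frame (`full`, with rows copied from the data above) standing in for the whole dataset, and plots only a subset of it, as would happen inside a loop. The names `full`, `groups`, `shape_by_sample`, `shapes`, `colours` and `subset_df` are hypothetical, and `hcl.colors()` is swapped in for the `brewer.pal()` palette just to keep the example short.

```r
library(ggplot2)

# Small stand-in for the whole dataset (three rows taken from the question's data).
full <- data.frame(
  Sample  = c("SA-2021-04", "SA-2021-04", "SA-2021-03"),
  Feature = c("Light", "Mix", "Mix II"),
  Al2O3   = c(1.0381, 8.7012, 15.025),
  MgO     = c(1e-06, 1e-06, 1.0625)
)
full$LegendGroup <- paste(full$Sample, full$Feature, sep = " ")

# One entry per distinct group, computed from the whole dataset.
groups <- unique(full$LegendGroup)

# Named vectors, so values are matched to groups by name rather than by position.
shape_by_sample <- c("SA-2021-04" = 17, "SA-2021-03" = 15)
shapes  <- setNames(shape_by_sample[full$Sample[match(groups, full$LegendGroup)]], groups)
colours <- setNames(hcl.colors(length(groups), "Dark 3"), groups)

# Plot only part of the data; the legend still lists every group via `limits`.
subset_df <- full[full$Sample == "SA-2021-04", ]

p <- ggplot(subset_df, aes(x = Al2O3, y = MgO,
                           shape = LegendGroup, colour = LegendGroup)) +
  geom_point() +
  scale_colour_manual(name = "Sample and Feature", values = colours, limits = groups) +
  scale_shape_manual(name = "Sample and Feature", values = shapes, limits = groups) +
  ylab("MgO (wt%)") + xlab("Al2O3 (wt%)")

print(p)
```

Because both scales share the same name and the same `limits`, the shape and colour legends should merge into one, and `labels` is no longer needed. Naming the `values` vectors is a design choice made here so the symbols cannot drift out of line with the groups.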
