I have a dataframe of students that includes their unique ids, names and test scores.

I am trying to plot each student's test scores with ggplot2, sorted by student ID: the graph should have student name on the x axis and test score on the y axis, with the names in ascending order of their IDs.

Also note that there are duplicates of certain test scores (i.e. 2 students could get the same grade on a test).

I already know how to plot it; I am just trying to order the x axis by the ID column, in ascending order. How can I go about doing this? Thanks!

Example:

ID  name test1score test2score
1   ted     92         94
2   jan     95         89
3   rob     92         96
4   jenny   92         94
5   risa    83         94
6   blake   80         90
7   court   77         89
8   aaron   98         83
9   austin  83         84
aport550
  • Can't you order the data before plotting? – Rui Barradas Feb 07 '20 at 06:17
  • Hi aport550, have a look at https://stackoverflow.com/questions/5963269/how-to-make-a-great-r-reproducible-example?rq=1 to create a great reproducible example ;-). Adding your current code is a big, useful step forward. To answer your question, you should add more detail: do you want to sort a (bar)plot? In that case, mutate your id column to an ordered factor (ordered as desired in the visual) – Arcoutte Feb 07 '20 at 06:18
  • Welcome to SO! Please, [edit] your question and show what you have tried so far, i.e., show the code and result. Thank you. – Uwe Feb 07 '20 at 06:45

1 Answer

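ggplot2 places the values of a character column on the x axis in alphabetical order, regardless of the row order of the dataframe. To order your bar graph by student ID and show the student names in that order, you can first arrange your dataframe by ID and then fix the factor levels of the variable "name" in that order.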

There are multiple ways of doing it; here, I'm using the dplyr package. (NB: I used the pivot_longer function from the tidyr package to reshape your dataframe into a longer format more suitable for ggplot2.)
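To make the snippets below reproducible, here is a minimal sketch that rebuilds the example table from the question. The object name `df` and the integer column types are assumptions inferred from the code and printed output in this answer. The last two lines only illustrate why the levels need fixing: a plain `factor()` sorts its levels, while passing `unique(name)` keeps the order of appearance.

```r
# Example data copied from the question; `df` is the name the code below expects.
df <- data.frame(
  ID = 1:9,
  name = c("ted", "jan", "rob", "jenny", "risa", "blake", "court", "aaron", "austin"),
  test1score = c(92L, 95L, 92L, 92L, 83L, 80L, 77L, 98L, 83L),
  test2score = c(94L, 89L, 96L, 94L, 94L, 90L, 89L, 83L, 84L),
  stringsAsFactors = FALSE
)

# Default factor levels are sorted, which is the order ggplot2 would use on the axis.
default_levels <- levels(factor(df$name))

# Levels taken in order of appearance follow the row order (here: ascending ID).
id_levels <- levels(factor(df$name, levels = unique(df$name)))
```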

library(tidyverse)
df %>%
  arrange(ID) %>%                                # sort rows by student ID
  mutate(name = factor(name, unique(name))) %>%  # fix name levels in that row order
  pivot_longer(-c(ID, name), names_to = "var", values_to = "val")  # one row per student and test

# A tibble: 18 x 4
      ID name   var          val
   <int> <fct>  <chr>      <int>
 1     1 ted    test1score    92
 2     1 ted    test2score    94
 3     2 jan    test1score    95
 4     2 jan    test2score    89
 5     3 rob    test1score    92
 6     3 rob    test2score    96
 7     4 jenny  test1score    92
 8     4 jenny  test2score    94
 9     5 risa   test1score    83
10     5 risa   test2score    94
11     6 blake  test1score    80
12     6 blake  test2score    90
13     7 court  test1score    77
14     7 court  test2score    89
15     8 aaron  test1score    98
16     8 aaron  test2score    83
17     9 austin test1score    83
18     9 austin test2score    84

Then, piping the result into ggplot2 gives a dodged bar plot with the names on the x axis in ID order:

library(tidyverse)
df %>%
  arrange(ID) %>%                                # sort rows by student ID
  mutate(name = factor(name, unique(name))) %>%  # fix name levels in that row order
  pivot_longer(-c(ID, name), names_to = "var", values_to = "val") %>%
  ggplot(aes(x = name, y = val, fill = var)) +
  geom_col(position = position_dodge())          # side-by-side bars, one per test

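As an alternative to fixing the levels with mutate(), the reordering can also be done inside aes() with base R's reorder(), which sorts the levels of its first argument by a summary (the mean by default) of its second. This is only a sketch under the same assumptions as above: the data frame is rebuilt from the question's table, and the names `long` and `p` are illustrative.

```r
library(ggplot2)
library(tidyr)

# Example data copied from the question.
df <- data.frame(
  ID = 1:9,
  name = c("ted", "jan", "rob", "jenny", "risa", "blake", "court", "aaron", "austin"),
  test1score = c(92L, 95L, 92L, 92L, 83L, 80L, 77L, 98L, 83L),
  test2score = c(94L, 89L, 96L, 94L, 94L, 90L, 89L, 83L, 84L),
  stringsAsFactors = FALSE
)

# Reshape to long format: one row per student and test.
long <- pivot_longer(df, -c(ID, name), names_to = "var", values_to = "val")

# reorder(name, ID) turns name into a factor whose levels follow ascending ID.
p <- ggplot(long, aes(x = reorder(name, ID), y = val, fill = var)) +
  geom_col(position = position_dodge()) +
  labs(x = "name")  # replace the default axis title "reorder(name, ID)"
```

Printing `p` draws the plot. Because each name has a single ID, the mean used by reorder() is just that ID; with repeated names the levels would instead follow the average ID per name.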

Does this answer your question?

dc37
  • Hi, I clarified the question a bit. ggplot is still sorting it in a strange order even though the IDs are in the first column – aport550 Feb 07 '20 at 06:45
  • I think I understand your issue now; I edited my answer accordingly. Does it answer your question now? – dc37 Feb 07 '20 at 06:56