
I am new to R. I have the data below and want to draw a histogram like this one with the ggplot2 package (R on Linux, or RStudio).

As you can see, the X axis (Function class) shows the letters A to Z and the Y axis shows the frequencies. The important point is that each bar has its own unique color.

In addition, there is a legend ("color help") that describes each class using the same color as its bar. I am not sure whether that is a feature of the ggplot2 package.

I have checked some online help, but I do not know how to get my data into ggplot2 and assign a unique color to each class.

My data sample:

A   5   RNA processing and modification 
B   2   Chromatin structure and dynamics 
C   18  Energy production and conversion 
D   26  Cell cycle control, cell division, chromosome partitioning
E   15  Amino acid transport and metabolism 
F   5   Nucleotide transport and metabolism 
G   13  Carbohydrate transport and metabolism 
H   6   Coenzyme transport and metabolism 
I   15  Lipid transport and metabolism 
J   20  Translation, ribosomal structure and biogenesis 
K   24  Transcription 
L   28  Replication, recombination and repair
M   18  Cell wall/membrane/envelope biogenesis 
N   1   Cell motility 
O   29  Posttranslational modification, protein turnover, chaperones 
P   19  Inorganic ion transport and metabolism 
Q   16  Secondary metabolites biosynthesis, transport and catabolism 
R   85  General function prediction only 
S   20  Function unknown 
T   32  Signal transduction mechanisms 
U   14  Intracellular trafficking, secretion, and vesicular transport 
V   6   Defense mechanisms 
Z   14  Cytoskeleton 
Farbod
  • Here's an example with fake data. Just substitute the name of your real data frame and real column names. `dat = data.frame(v1=LETTERS[1:10], v2=11:20, v3=state.name[1:10]); ggplot(dat, aes(v3, v2, fill=v3)) + geom_bar(stat="identity")`. – eipi10 Sep 18 '16 at 22:09
  • Your question is essentially: how to draw a barplot in ggplot2. A histogram tabulates the items into classes, whereas here that step has already been done. It seems clear from the documentation: "There are two types of bar charts, determined by what is mapped to bar height. By default, geom_bar uses stat="count" which makes the height of the bar proportional to the number of cases in each group (or if the weight aesthetic is supplied, the sum of the weights). If you want the heights of the bars to represent values in the data, use stat="identity" and map a variable to the y aesthetic." – IRTFM Sep 18 '16 at 23:18
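
The distinction quoted above can be illustrated with a small sketch. The data frames `raw` and `tab` are made up for illustration and are not the asker's data.

```r
library(ggplot2)

# Raw observations: one row per case, so ggplot2 has to count them
raw <- data.frame(class = c("A", "A", "B", "C", "C", "C"))
p_count <- ggplot(raw, aes(x = class)) +
  geom_bar()                      # stat = "count" is the default

# Pre-tabulated data: the bar heights are already stored in a column
tab <- as.data.frame(table(class = raw$class))
p_identity <- ggplot(tab, aes(x = class, y = Freq)) +
  geom_bar(stat = "identity")     # use the values in Freq as heights
```

Both plots should show the same bars; the data in the question is already in the second, pre-tabulated form.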

1 Answer


Are those numbers next to the letters A-Z the heights of the bars? If so, you're better off using a barplot:

library(ggplot2)
# Simulate some data
obs = rpois(10, 5)
group = factor(1:10)
df = data.frame(obs, group)
ggplot(data = df, aes(x = group, y = obs, color = group, fill = group)) + 
  geom_bar(stat = 'identity')

To get colored bars in general, make sure your data frame has a grouping variable. It should be a factor (or a character column); a numeric column gets a continuous color scale instead. Then, inside aes, map color/fill to that column.
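
Applied to the data in the question, a minimal sketch might look like the following. The column names `FunctionClass`, `Frequency`, `Description` and `Label` are assumptions chosen for illustration, and only the first five rows are typed in to keep it short.

```r
library(ggplot2)

# First five rows of the data from the question; column names are assumed
dat <- data.frame(
  FunctionClass = c("A", "B", "C", "D", "E"),
  Frequency     = c(5, 2, 18, 26, 15),
  Description   = c("RNA processing and modification",
                    "Chromatin structure and dynamics",
                    "Energy production and conversion",
                    "Cell cycle control, cell division, chromosome partitioning",
                    "Amino acid transport and metabolism")
)

# Legend entries of the form "A: RNA processing and modification"
dat$Label <- paste0(dat$FunctionClass, ": ", dat$Description)

# Mapping fill to a discrete column gives each bar its own color
# and makes ggplot2 draw the matching legend automatically
p <- ggplot(dat, aes(x = FunctionClass, y = Frequency, fill = Label)) +
  geom_bar(stat = "identity")
print(p)
```

Because the legend is generated from the `fill` mapping, no separate legend call is needed.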

Vandenman
  • Dear Vandenman and other friends, hi, and thank you for your help. 1) Is there a way to put my data in a text file and point R to it? 2) As I am new to R, I could not understand some functions such as "obs = rpois(10, 5)": I have about 26 classes, but the numbers here are 10 and 5. – Farbod Sep 19 '16 at 03:56
  • 1) see `?read.table` 2) `rpois` is a function for simulating random data from a Poisson distribution, here with lambda = 5. I sample 10 random values from this distribution. I did this to create a small reproducible example; the actual plotting only depends on the ggplot part. So instead of using `rpois`, try to build the data frame `df` from your own data. – Vandenman Sep 19 '16 at 08:00
  • Dear Vandenman, hi. I have drawn the plot using the barplot() code that Diego provided and the result was OK, but I do not know how to draw the "color guide", e.g. a red square = RNA processing and modification. – Farbod Sep 19 '16 at 17:20
  • You can add a legend via `legend`. See this post for an example. http://stackoverflow.com/questions/14883238/adding-simple-legend-to-plot-in-r – Vandenman Sep 19 '16 at 18:46
  • Dear Vandenman, hi, and thank you. I really do not like asking so many questions, but this R plotting task is full of issues. I have now used legend, but first, I do not know what to use for the colors (in the barplot it was "col=c(fill=rainbow(25))"). The second piece of bad luck is that the legend is placed on top of the plot! This is my legend script: legend("topright", legend = c("A:RNA processing and modification", "B", "C", "D", "E", "F", "G", "H", "I", "J", "K", "L", "M", "N", "O", "P", "Q", "R", "S", "T", "U", "V", "W", "Y", "Z"), fill = c("?")) – Farbod Sep 19 '16 at 20:01
  • Hi, I have used "fill = rainbow(25)" and the legend colors seem OK, but its location is still unsuitable. – Farbod Sep 19 '16 at 20:11
  • You can change the size of the legend with the `cex` argument. Or you can make the legend consist of multiple columns with the argument `ncol`. You could also increase the space for the legend by adjusting the plot height via `barplot(..., ylim=c(0, 400))`. Try to use those three things and then just play around with it a lot. It is kinda difficult to have a legend that consists of 26 things and is still readable. – Vandenman Sep 19 '16 at 20:33
  • Dear Vandenman, hi. In the end I used ggplot2 and the legend problem was solved automatically, but two other problems have appeared: 1) I cannot add a main title to the plot (it sounds silly, but I have tried many variants of the "theme" and "opts" functions; maybe RStudio does not accept them?), and 2) the X axis label is now "FunctionClass", and when I try to put a space in it (Function Class) I get: Error: unexpected symbol in "p <- ggplot(data=dat, aes(x=Function Class". My script is longer than the comment character limit; how can I show it to you? – Farbod Sep 20 '16 at 19:23
  • You can add a title via the function `ggtitle`. Axis labels you can manually adjust via the function `scale_x_discrete`. For further information on how to use ggplot, see this website: http://www.cookbook-r.com/Graphs/ . It has many examples on many things. – Vandenman Sep 20 '16 at 19:31
  • I have this script: p <- ggplot(data=dat, aes(x=FunctionClass, y=Frequency, fill=legend))+ geom_bar(stat="identity", position=position_dodge(), colour="seashell"), and at the end of my program I have "p + guides (fill = guide_legend(ncol = 1))" to show the legend in one column (which I need). When I insert "p + ggtitle ("Plant growth")+" before that, there is no main title. When I put it after that, there is a title, but the one-column legend turns into an ugly two-column legend! And the X axis label problem still exists. – Farbod Sep 20 '16 at 19:53
  • It is unclear to me what the problem is. Perhaps ask a new question on stackoverflow? – Vandenman Sep 21 '16 at 08:46
  • OK, thank you for all your knowledge and kindness. – Farbod Sep 21 '16 at 09:20
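
Pulling the suggestions from this thread together, the sketch below reads tab-separated text with `read.table`, then adds a title, an axis label containing a space, and a one-column legend in a single chained expression. It is only a sketch under assumptions: the column names are made up, the inline string `txt` stands in for a real file, and only three rows of the question's data are included.

```r
library(ggplot2)

# Tab-separated text standing in for a file. With a real file, its path
# (a hypothetical "cog.txt", say) would be passed as the first argument
# of read.table() instead of using `text = `.
txt <- "A\t5\tRNA processing and modification
B\t2\tChromatin structure and dynamics
C\t18\tEnergy production and conversion"

dat <- read.table(text = txt, sep = "\t", header = FALSE, quote = "",
                  col.names = c("FunctionClass", "Frequency", "Description"),
                  stringsAsFactors = FALSE)

# Every `+` returns a new plot object and leaves `p` untouched, so all
# layers and settings are chained here and stored in one object
p <- ggplot(dat, aes(x = FunctionClass, y = Frequency, fill = Description)) +
  geom_bar(stat = "identity", colour = "seashell") +
  ggtitle("Plant growth") +                   # main title
  xlab("Function Class") +                    # axis label with a space
  guides(fill = guide_legend(ncol = 1))       # legend in a single column
print(p)
```

Two points may explain the behaviour described in the comments. Typing `p + ggtitle(...)` and `p + guides(...)` as separate statements draws two different plots, each with only one of the changes, unless the result is assigned back to `p`. And a column name cannot contain a bare space inside `aes()`, so the displayed label is set with `xlab()` while the column keeps its original name.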