
This is only my second time making a histogram with automatic binning, and the result looks elementary. I'm looking for help improving it.

What I have tried:

DAta <- read.table(text="Species DNA LINE LTR SINE Helitron Unclassified Unmasked
darius 2.68 10.37 18.00 1.52 3.64 0.03 63.79
Derian 2.74 10.59 16.61 1.56 4.24 0.03 64.23
rats 2.77 10.97 15.20 1.57 4.69 0.03 64.77
Mouos 2.53 10.42 17.33 1.42 3.68 0.02 64.6", header=TRUE)
library(reshape2)
# reshape to long format: one row per Species/variable pair
DF1 <- melt(DAta, id.vars="Species")
library(ggplot2)
ggplot(DF1, aes(x = Species, y = value, fill = variable)) +
  geom_bar(stat = "identity")


How can I make the species names italic?

The bars should appear in the same order as in the input, from left to right: darius, Derian, rats, Mouos.

How can I make the colours and style look better?

James Z
BioInfo

1 Answer


There are three questions here:

  • To set the axis labels in italics, adjust the axis.text.x theme element; see the question and answers referenced at the bottom.
  • To change the ordering of the axis labels, convert the variable Species to a factor whose levels are listed in the desired order.
  • Finally, to change the color scheme, use one of the scale_fill_* functions. I like the ColorBrewer palettes, available through scale_fill_brewer, which offer several good color schemes. A few other predefined scale_fill_* options are also available.
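
To illustrate the second point: a discrete axis follows the order of the factor levels, and when no levels are given the values end up sorted rather than kept in input order. A minimal base-R sketch, where the vector species is only an illustrative copy of the names from the question:

```r
# Illustrative copy of the species names, in the order they appear in the input
species <- c("darius", "Derian", "rats", "Mouos")

# Default: levels are sorted (the exact order depends on the locale's collation)
default_levels <- levels(factor(species))

# Explicit: levels follow the order supplied, here the order of first appearance
input_levels <- levels(factor(species, levels = unique(species)))

print(default_levels)
print(input_levels)
```

Supplying the levels explicitly is what pins the left-to-right order of the bars.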

Note: this is a bar chart, not a histogram.

See the comments for additional details:

DAta <- read.table(text="Species DNA LINE LTR SINE Helitron Unclassified Unmasked
darius 2.68 10.37 18.00 1.52 3.64 0.03 63.79 
Derian 2.74 10.59 16.61 1.56 4.24 0.03 64.23
rats 2.77 10.97 15.20 1.57 4.69 0.03 64.77
Mouos 2.53 10.42 17.33 1.42 3.68 0.02 64.6", header=TRUE)

#updated method to reshape the data: tidyr is the replacement for reshape2
library(tidyr)
DF1 <- pivot_longer(DAta, cols=-1, names_to = "Classification", values_to = "Value" )

#Set Species as factors defining the order of the labels
DF1$Species<-factor(DF1$Species, levels=c("darius", "Derian", "rats", "Mouos"))
library(ggplot2)
ggplot(DF1, aes(x = Species, y = Value, fill = Classification)) + 
  geom_bar(stat = "identity") +
  scale_fill_brewer(palette = "Pastel1") + 
  theme(axis.text.x = element_text(face="italic"))
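
As one possible variation on the styling, here is a sketch that swaps in geom_col() (shorthand for geom_bar(stat = "identity")), the viridis discrete palette, and a minimal theme. The axis and legend titles are my own assumptions about what the values mean (each row sums to roughly 100), so adjust them to your data:

```r
library(tidyr)
library(ggplot2)

DAta <- read.table(text = "Species DNA LINE LTR SINE Helitron Unclassified Unmasked
darius 2.68 10.37 18.00 1.52 3.64 0.03 63.79
Derian 2.74 10.59 16.61 1.56 4.24 0.03 64.23
rats 2.77 10.97 15.20 1.57 4.69 0.03 64.77
Mouos 2.53 10.42 17.33 1.42 3.68 0.02 64.6", header = TRUE)

# Long format: one row per Species/Classification pair
DF1 <- pivot_longer(DAta, cols = -1, names_to = "Classification", values_to = "Value")

# Keep the species in the order they appear in the input
DF1$Species <- factor(DF1$Species, levels = unique(as.character(DAta$Species)))

p <- ggplot(DF1, aes(x = Species, y = Value, fill = Classification)) +
  geom_col() +                      # same as geom_bar(stat = "identity")
  scale_fill_viridis_d() +          # alternative to scale_fill_brewer()
  labs(x = NULL,
       y = "Genome share (%)",      # assumed meaning of the values
       fill = "Repeat class") +     # assumed legend title
  theme_minimal() +
  theme(axis.text.x = element_text(face = "italic"))  # after theme_minimal() so it is not overridden
print(p)
```

The theme() call comes after theme_minimal() because a complete theme replaces any earlier theme settings.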

Option: If the number of species or their names can change, here is a potential option for keeping the Species names in their input order:

#retrieves the species names from the original data frame, in row order
# assumes each species appears once in the "Species" column
DF1$Species<-factor(DF1$Species, levels=unique(as.character(DAta$Species)))
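
However the levels are built, every value has to appear among them exactly as spelled, including capitalization; a value with no matching level silently becomes NA. A small sketch of that failure mode, using an illustrative vector and a deliberately misspelled level:

```r
# Illustrative copy of the species names from the question
species <- c("darius", "Derian", "rats", "Mouos")

# "derian" is deliberately misspelled (lower-case d) to show the failure mode
bad <- factor(species, levels = c("darius", "derian", "rats", "Mouos"))
print(bad)                 # the unmatched entry shows up as <NA>
n_missing <- sum(is.na(bad))
print(n_missing)

# Deriving the levels from the data itself avoids the mismatch
good <- factor(species, levels = unique(species))
stopifnot(!anyNA(good))
```

A quick anyNA() check after building the factor is a cheap way to catch this before plotting.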

To adjust the axis labels, here is a good reference: Changing font size and direction of axes text in ggplot2

Dave2e
  • Thank you, you saved me time. I can't try it right now because I'm having trouble installing the tidyr package. – BioInfo Mar 20 '20 at 23:59
  • I have managed to install the required packages. After running it, the output is different: I have only three species. – BioInfo Mar 21 '20 at 00:43
  • Can you clarify what the difference is? The levels defined in the factor statement need to match your data frame, including spelling and capitalization. – Dave2e Mar 21 '20 at 01:14
  • It worked for me now. What if I want to change the fill name, from ggplot(DF1, aes(x = Species, y = value, fill = name)) + to fill = classification? – BioInfo Mar 21 '20 at 09:09
  • When I try to change it, an error occurs: " Error in FUN(X[[i]], ...) : object 'Classification' not found " – BioInfo Mar 21 '20 at 09:11
  • When data frames get reshaped, the column names change. For the `pivot_longer` function, the pivoted columns are named "name" and "value" by default. I edited the script above to define the column names explicitly. The new names are now "Classification" and "Value". I have updated the `ggplot` function to match. – Dave2e Mar 21 '20 at 13:45
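
The naming behaviour described in the last comment can be checked on a small example. This is a sketch with a made-up two-row data frame (wide is hypothetical, cut down from the data in the question):

```r
library(tidyr)

# Hypothetical two-row subset of the data from the question
wide <- data.frame(Species = c("darius", "Derian"),
                   DNA     = c(2.68, 2.74),
                   LINE    = c(10.37, 10.59))

# Without names_to/values_to the new columns are called "name" and "value"
default_long <- pivot_longer(wide, cols = -1)
print(names(default_long))

# With explicit names, aes(fill = Classification, y = Value) will find them
named_long <- pivot_longer(wide, cols = -1,
                           names_to = "Classification", values_to = "Value")
print(names(named_long))
```

The "object 'Classification' not found" error arises when the plot refers to a column name that the reshaped data frame does not contain.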