I have a column from one data frame (containing estimated proportions of cell counts) for which class() returns "factor", and a column from another data frame (containing the actual cell counts) for which class() returns "numeric". I need to plot these against one another to see whether there is a relationship between them, so I have to convert the factor values to numeric:

   > class(proportions$Neutrophils)
   [1] "factor"
   > head(proportions$Neutrophils)
   [1]           2.3  14.9          
   178 Levels:  #VALUE! 0.0 0.4 0.6 0.7 0.8 0.9 1.0 1.1 1.2 1.3 1.4 1.5 1.6 1.7 1.8 1.9 ... abs neutrophils
   > head(as.numeric(proportions$Neutrophils)) #notice that the numbers are completely trans
   [1]  1  1 82 57  1  1
   > head(as.numeric(proportions$Neutrophils))
   [1]  1  1 82 57  1  1
   > max(as.numeric(proportions$Neutrophils)) #factors converted to numeric
   [1] 176
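
The jump from 2.3 and 14.9 to 82 and 57 is what as.numeric() does when applied directly to a factor: it returns the internal integer level codes rather than the printed labels. A minimal sketch of the difference, using a made-up factor `f` (hypothetical values, not the real column):

```r
# Hypothetical factor mimicking the column: blanks, an Excel error string, numbers
f <- factor(c("", "", "2.3", "14.9", "#VALUE!", "0.4"))

# Levels are the unique labels, sorted as character strings
lev <- levels(f)

# Direct conversion returns level codes (positions in `lev`), not the values
codes <- as.numeric(f)

# Going through as.character() parses the labels themselves;
# "" and "#VALUE!" cannot be parsed and become NA
vals <- suppressWarnings(as.numeric(as.character(f)))

print(lev)
print(codes)
print(vals)
```

The codes only tell you which level each element belongs to, which is why max() above reports a value close to the number of levels rather than a proportion.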

I have reordered the rows so that the corresponding values in the two data frames align:

   ptr<-match(sample.details$barcode1, proportions$barcode2)
   proportions<-proportions[ptr,]
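
For reference, this is how match() lines the rows up, sketched on toy data frames (the barcodes and values are invented; only the column names barcode1 and barcode2 come from the code above). A barcode in sample.details that is absent from proportions yields an NA in ptr, and therefore an all-NA row after indexing, so it may be worth counting those:

```r
# Toy stand-ins for the two data frames (invented values)
sample.details <- data.frame(barcode1 = c("b3", "b1", "b9"),
                             stringsAsFactors = FALSE)
proportions <- data.frame(barcode2 = c("b1", "b2", "b3"),
                          Neutrophils = c(2.3, 14.9, 0.4),
                          stringsAsFactors = FALSE)

# For each barcode1, the row position of the same barcode in barcode2 (NA if absent)
ptr <- match(sample.details$barcode1, proportions$barcode2)

# Reorder proportions so that row i corresponds to sample.details row i
aligned <- proportions[ptr, ]

# Number of samples that had no matching row
n_unmatched <- sum(is.na(ptr))

print(ptr)
print(aligned)
print(n_unmatched)
```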

I convert the factor column to numeric and plot:

   plot(proportions$Neutrophils, as.numeric(SPVs[,7]), pch=19, ylab= "proportion estimates", xlab="counts", main="Neutrophils Proportions Validation")

When I don't convert it to numeric though:

   plot(proportions1$Neutrophils, SPVs[,7], pch=19, ylab= "proportion estimates", xlab="counts", main="Neutrophils Proportions Validation")

What worries me is that the two graphs look analogous, and yet the x-axis on the second graph is not arranged in ascending order...

All I want is an estimate of whether the two columns are related, but if the order is mixed up there is no way of telling...

How do I ensure that the x-axis values are in ascending order?

johnny utah
  • How did you wind up with a factor? That seems wrong. They are being sorted "alphabetically". You can convert back to numeric with `proportions$Neutrophils <- as.numeric(as.character(proportions$Neutrophils))` – MrFlick Apr 27 '16 at 17:56
  • Basically a duplicate of: http://stackoverflow.com/questions/3418128/how-to-convert-a-factor-to-an-integer-numeric-without-a-loss-of-information – MrFlick Apr 27 '16 at 17:57
  • Hi MrFlick, thanks for the suggestion. I tried both yours and the one from the linked post, but in both cases there were still NA values... Is this inevitable with this kind of operation? – johnny utah Apr 27 '16 at 19:02
  • Which values were changed to NA? I'm guessing you had some non-numeric values in there. From the output above it looks like you had at least one level named "#VALUE!", which cannot be turned into a number. You really should provide a [reproducible example](http://stackoverflow.com/questions/5963269/how-to-make-a-great-r-reproducible-example) for more precise assistance. – MrFlick Apr 27 '16 at 19:04
  • I read the proportions data from an Excel file with `proportions<-read.xls("expression_data.xlsx", 1)`. I have now added `stringsAsFactors=FALSE` but it didn't make a difference... Otherwise I didn't do anything to the data. – johnny utah Apr 27 '16 at 19:29
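
Pulling the comments together: stringsAsFactors=FALSE only changes the column from factor to character. Strings such as "#VALUE!" or blank cells still cannot be parsed as numbers, so NA for those cells is expected rather than a sign that the conversion failed. Separately, plot() with a factor on the x-axis draws one box per level in level order, and levels are sorted as strings, which would explain an axis that is not ascending. A hedged sketch on invented data (the vectors `neut` and `counts` are placeholders, not the real columns) that converts via as.character(), lists which labels turned into NA, and then plots and correlates only the complete pairs:

```r
# Invented stand-ins for proportions$Neutrophils and the count column
neut   <- factor(c("2.3", "14.9", "", "#VALUE!", "0.4", "7.1"))
counts <- c(1.9, 15.2, 3.0, 4.4, 0.6, 6.8)

# Convert the labels (not the level codes) to numbers; unparseable labels become NA
neut_num <- suppressWarnings(as.numeric(as.character(neut)))

# Which original labels could not be converted?
bad_labels <- unique(as.character(neut)[is.na(neut_num)])
print(bad_labels)

# Keep only the rows where both values are present
ok <- !is.na(neut_num) & !is.na(counts)

# With a numeric x, plot() draws a scatter plot on an ascending numeric axis
plot(counts[ok], neut_num[ok], pch = 19,
     xlab = "counts", ylab = "proportion estimates")

# Pearson correlation on the complete pairs
r <- cor(counts[ok], neut_num[ok])
print(r)
```

If the reader you use exposes an na.strings argument (read.csv does), listing "#VALUE!" and "" there marks those cells as NA at import time, so the column can come in as numeric directly.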
