I downloaded an open-access dataset, but I'm unable to convert columns 5 to 50 to numeric.

file_url <- "https://genelab-data.ndc.nasa.gov/genelab/static/media/dataset/GLDS-138_metabolomics_mx%20367428_NASA_bacteria%20cells_09-2017_submit%20.csv?version=1"
dst1 = 'GLDS-138_metabolomics_mx 367428_NASA_bacteria cells_09-2017_submit.csv'
download.file(file_url, dst1)
Bdata <- read.csv(dst1, stringsAsFactors = FALSE)

Bdata <- t(Bdata)
Bdata <- Bdata[-c(2:7, 114:118),]
Bdata <- Bdata[,-c(1,2,6,8)]
Bdata[1,1:4] <- Bdata[2,1:4]
Bdata <- Bdata[-c(2),]

columnName <- Bdata[1,]
rowName    <- Bdata[,1]
colnames(Bdata) <- columnName
rownames(Bdata) <- rowName
Bdata <- Bdata[-1 ,]
Bdata <- as.data.frame(Bdata)

Bdata[,5:50] <- as.numeric(as.character(Bdata[,5:50]))

I've tried numerous methods, most of which either introduce NAs by coercion or change the values.

Does anyone know how to solve this?

strki
  • Hey, thanks for providing a reproducible example. The example would have been even better, if it had been a **minimal** reproducible example. 700 variables x 100 observations is a little overkill. ;) – Georgery Feb 20 '20 at 08:34
  • Oh right, I'll keep that in mind next time! Thanks – strki Feb 20 '20 at 08:36

2 Answers

You need this instead of your last line of code:

Bdata[,5:50] <- as.data.frame(lapply(Bdata[,5:50], function(x) as.numeric(as.character(x))))

Your original line didn't work because it tries to convert a whole data frame (Bdata[,5:50]) into a single numeric vector, whereas what you actually want is to convert each column into a numeric vector. lapply() applies the passed function to every element of a list (a data frame is a special kind of list, with one element per column) and returns a list, which is why the result has to be converted back into a data frame afterwards.
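
To see the difference on something small, here is a minimal sketch. The `toy` data frame and its values are made up for illustration and merely stand in for Bdata; the columns are factors, which is what I'd assume you get after as.data.frame() on a character matrix in older R versions.

```r
# Hypothetical toy data standing in for Bdata (not the real dataset)
toy <- data.frame(
  id = c("s1", "s2", "s3"),
  a  = c("1.5", "2", "3.25"),
  b  = c("10", "20", "30"),
  stringsAsFactors = TRUE
)

# as.character() on a data frame returns one string per column,
# not one value per cell, so the per-row values are lost here
bad <- suppressWarnings(as.numeric(as.character(toy[, 2:3])))
length(bad)

# Converting column by column keeps the original values
toy[, 2:3] <- lapply(toy[, 2:3], function(x) as.numeric(as.character(x)))

str(toy)
```

Going through as.character() first matters for factor columns, because as.numeric() on a factor returns the internal level codes rather than the labels.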

Georgery

You can try the following code:

Bdata[,5:50]<- `class<-`(as.matrix(Bdata[,5:50]),"numeric")
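
If the backtick call looks cryptic, the same idea can be written out step by step. This is only a sketch on made-up data (the `toy` data frame is hypothetical), and it uses storage.mode<- in place of `class<-`, which I'd expect to behave the same way for this purpose:

```r
# Hypothetical toy data standing in for Bdata (not the real dataset)
toy <- data.frame(
  id = c("s1", "s2", "s3"),
  a  = c("1.5", "2", "3.25"),
  b  = c("10", "20", "30"),
  stringsAsFactors = TRUE
)

# as.matrix() on non-numeric columns gives a character matrix of the labels
m <- as.matrix(toy[, 2:3])

# Change how the values are stored; the matrix dimensions are kept
storage.mode(m) <- "double"

# Assign the numeric matrix back into the same columns
toy[, 2:3] <- m

str(toy)
```
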
ThomasIsCoding