I am trying to read the simplest of CSV files, but there seems to be a problem when converting the data to numeric. This is the minimal code and output:

initialsoltext=read.csv('initialsol.txt', header = FALSE, sep = "")
initialsoltext

output:
  V1 V2 V3 V4 V5 V6 V7 V8 V9 V10
1 [1  0  0  0  0  1  0  0  1  0]

as.numeric(initialsoltext)

output:
[1] 1 0 0 0 0 1 0 0 1 1

The CSV file is not the issue, since the text is read properly. What could be happening to the last 0 during the conversion? I tried replacing the "[" and "]", thinking that maybe the last bracket was read as a 1, but it doesn't make a difference. Thanks to anyone who can help.

s_scolary
Victoire
  • Is this exactly what you are seeing in R? This doesn't look right. If you indent your code input/output with 4 spaces it should look more like it does in R. – MrFlick Mar 29 '17 at 19:55
  • `as.numeric` should never add an additional element to a vector it's called on. Can you provide the file you're reading in so we can try to reproduce? – Pdubbs Mar 29 '17 at 19:57
  • It will be easier for us to help you if you provide us with [a reproducible example](http://stackoverflow.com/questions/5963269/how-to-make-a-great-r-reproducible-example). That way we can run the code and explore your issue. – Julian Wittische Mar 29 '17 at 19:58
  • I am new to this and cannot find how to attach a file in an easy way. It is a text file with just the following : [1 0 0 0 0 1 0 0 1 0] – Victoire Mar 29 '17 at 20:00
  • It is reading the first and last values as factors (as they are attached to the square brackets), and the `as.numeric` is then reproducing the factor number, which is 1, as there is only one value. Try `str(initialsoltext)` to check the variable types. – Andrew Gustar Mar 29 '17 at 20:06
  • Are you reading in a csv? If so you should change `sep = ","`. That might be having an impact on your file import – s_scolary Mar 29 '17 at 20:16
  • @AndrewGustar : correct, it is reading the first and last one as factors. The brackets are the problem. Assuming there is no way to write that file without the brackets, how can I read the file without the brackets? I am exploring the read.csv options as well as gsub function but so far no luck. – Victoire Mar 29 '17 at 20:20
  • It is probably easier to read it in as you are and then remove the brackets with something like `initialsoltext2 <- as.numeric(gsub("[\\[\\]]","",as.character(initialsoltext)))` - this produces a vector of integers that you can then convert to a data.frame if you wish – Andrew Gustar Mar 29 '17 at 20:26

2 Answers

It is reading the first and last values as factors (because they are attached to the square brackets), and `as.numeric` then returns each factor's internal integer code, which is 1 because each of those columns has only one level. Try `str(initialsoltext)` to check the variable types. To avoid this, add `stringsAsFactors = FALSE` to the `read.csv` arguments.
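To see the factor behaviour in isolation, here is a minimal sketch that feeds the one-line input from the question to `read.csv` through its `text` argument instead of a file. `stringsAsFactors = TRUE` is set explicitly as an assumption about the asker's setup: it was the default before R 4.0.0, and newer versions default to `FALSE`. The names `df`, `code` and `parsed` are only for illustration.

```r
# Same one-line input as in the question, supplied inline via `text =`.
# stringsAsFactors = TRUE reproduces the pre-4.0.0 default (assumption).
df <- read.csv(text = "[1 0 0 0 0 1 0 0 1 0]", header = FALSE, sep = "",
               stringsAsFactors = TRUE)

str(df)                      # V1 and V10 are factors, V2..V9 are integers

levels(df$V10)               # the only level is the text "0]"
code <- as.numeric(df$V10)   # internal code of that level, not the digit
code

# Going through as.character() exposes the real problem: "0]" is not a number,
# so the conversion yields NA (the warning is silenced here on purpose).
parsed <- suppressWarnings(as.numeric(as.character(df$V10)))
parsed
```

The last step shows why the brackets have to be stripped before any numeric conversion can succeed.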

It is probably easier to read it in as you are and then remove the brackets with something like `initialsoltext2 <- as.numeric(gsub("[]\\[]","",as.character(initialsoltext)))`. This produces a numeric vector that you can then convert to a data.frame if you wish.
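A self-contained sketch of that cleanup, again using inline text in place of the file. It swaps in `unlist()` to flatten the one-row data frame to a character vector before calling `gsub`, which is a substitution made here for clarity rather than the exact call above. The names `df`, `raw` and `values` are illustrative.

```r
# Read the bracketed line as plain strings, not factors.
df <- read.csv(text = "[1 0 0 0 0 1 0 0 1 0]", header = FALSE, sep = "",
               stringsAsFactors = FALSE)

# Flatten the single row; mixing character and integer columns
# coerces everything to character, e.g. "[1", "0", ..., "0]".
raw <- unlist(df, use.names = FALSE)

# "[]\\[]" is a bracket expression matching "]" or "[" (it also matches
# a backslash, which does not occur in this input).
values <- as.numeric(gsub("[]\\[]", "", raw))
values
```

Placing `]` first inside the bracket expression is what lets the pattern work with R's default regex engine, without `perl = TRUE`.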

Andrew Gustar

You can use `lapply` to apply a function to all columns of a data.frame. To read the file using `read.csv` and then apply the `gsub` call from Andrew Gustar's answer to every column, try:

initialsoltext <- read.csv('initialsol.txt', header = FALSE, sep = "",
  stringsAsFactors = FALSE)
initialsoltext[] <- lapply(initialsoltext, function(x) {
  as.numeric(gsub(pattern = "[\\[\\]]", replacement = "", x, perl = TRUE))})

> initialsoltext
  V1 V2 V3 V4 V5 V6 V7 V8 V9 V10
1  1  0  0  0  0  1  0  0  1   0
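If a plain numeric vector is enough, the brackets can also be stripped from the raw text before anything is parsed. The sketch below is an alternative rather than part of the answer above. It writes the sample line to a temporary file only so that it runs on its own; in practice `path` would be `'initialsol.txt'`. The names `path`, `clean` and `values` are hypothetical.

```r
# Write the sample line to a temporary file so the sketch is self-contained.
path <- tempfile(fileext = ".txt")
writeLines("[1 0 0 0 0 1 0 0 1 0]", path)

# Remove "[" and "]" from the raw text, then let scan() parse the numbers.
clean <- gsub("[]\\[]", "", readLines(path))
values <- scan(text = clean, quiet = TRUE)
values

unlink(path)  # clean up the temporary file
```

Because the text is cleaned before parsing, no column is ever read as a factor or string, so the conversion issue does not arise.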
Community