I have a CSV file with some character fields, some numeric fields, and some NaNs. I want to read the numeric fields as numeric and the character fields as character.

For example, my CSV file monthly.csv currently looks like this:

Datum,Index,D12,E12,b/m,tbl,AAA
187101,4.44,0.2600,0.4000,NaN,NaN,NaN
187102,4.50,0.2600,0.4000,NaN,NaN,NaN
...
...
...

I am reading it with the following code:

monthly <- read.csv2("monthly.csv", sep=',', header = T, na.strings = "NaN", stringsAsFactors=F)

After reading the file, when I inspect the monthly variable, most columns are still of type chr:

> str(monthly)
'data.frame':   1620 obs. of  7 variables:
 $ Datum     : int  187101 187102 187103 187104 187105 187106 187107 187108 187109 187110 ...
 $ Index     : chr  "4.44" "4.50" "4.61" "4.74" ...
 $ D12       : chr  "0.2600" "0.2600" "0.2600" "0.2600" ...
 $ E12       : chr  "0.4000" "0.4000" "0.4000" "0.4000" ...
 $ b.m       : chr  NA NA NA NA ...
 $ tbl       : chr  NA NA NA NA ...
 $ AAA       : chr  NA NA NA NA ...

Basically, only the first field is converted to int and the rest are all still chr. How do I make the others numeric too?

Anoop
  • `colClasses = c("numeric", 8)` – Vlo Aug 20 '14 at 23:11
  • @Vlo, please see the edit, It says NA now, but is still a `chr` – Anoop Aug 20 '14 at 23:14
  • @Anoop Your data fit `read.csv` better than `read.csv2`, the problem might be that you're specifying both field separator and decimal marker as a comma (when the decimal marker should be a period, according to your example data). Try using `read.csv` and report back. – tkmckenzie Aug 20 '14 at 23:24
  • Should it be 8 or 7? Shouldn't it be the number of columns? It's not working with either, though. – Anoop Aug 20 '14 at 23:24
  • Does this not work? `monthly <- read.table("monthly.csv", sep=',', header = T, na.strings = "NaN", stringsAsFactors=F, colClasses = rep("numeric", 7))` – Vlo Aug 20 '14 at 23:33
  • When I used `read.csv` instead of `read.csv2`, it worked perfectly. Thanks – Anoop Aug 20 '14 at 23:35

1 Answer

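For people who face the same issue, I am posting the solution that was given in the comments.

Changing read.csv2 to read.csv made it work as expected. As noted in the comments, read.csv2 assumes a comma as the decimal mark, so values such as 4.44 were left as character. With read.csv, I now get the expected structure: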

> str(monthly)
'data.frame':   1620 obs. of 7 variables:
 $ Datum     : int  187101 187102 187103 187104 187105 187106 187107 187108 187109 187110 ...
 $ Index     : num  4.44 4.5 4.61 4.74 4.86 4.82 4.73 4.79 4.84 4.59 ...
 $ D12       : num  0.26 0.26 0.26 0.26 0.26 0.26 0.26 0.26 0.26 0.26 ...
 $ E12       : num  0.4 0.4 0.4 0.4 0.4 0.4 0.4 0.4 0.4 0.4 ...
 $ b.m       : num  NA NA NA NA NA NA NA NA NA NA ...
 $ tbl       : num  NA NA NA NA NA NA NA NA NA NA ...
 $ AAA       : num  NA NA NA NA NA NA NA NA NA NA ...
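
For anyone who wants to reproduce the difference without the original file, here is a minimal sketch. It assumes only the two sample rows shown in the question, passed inline through the `text` argument; the variable names `csv_text`, `with_csv2`, `with_csv`, and `with_dec` are made up for illustration.

```r
# Two sample rows from the question, inlined so the sketch is self-contained.
csv_text <- "Datum,Index,D12,E12,b/m,tbl,AAA
187101,4.44,0.2600,0.4000,NaN,NaN,NaN
187102,4.50,0.2600,0.4000,NaN,NaN,NaN"

# read.csv2 defaults to dec = ",", so "4.44" is not recognised as a number
# and the column stays character, even though sep is overridden here.
with_csv2 <- read.csv2(text = csv_text, sep = ",", na.strings = "NaN",
                       stringsAsFactors = FALSE)

# read.csv defaults to sep = "," and dec = ".", which matches this file.
with_csv <- read.csv(text = csv_text, na.strings = "NaN",
                     stringsAsFactors = FALSE)

# Alternative: keep read.csv2 but override the decimal mark explicitly.
with_dec <- read.csv2(text = csv_text, sep = ",", dec = ".",
                      na.strings = "NaN", stringsAsFactors = FALSE)

# Compare the column classes of the two main variants.
print(sapply(with_csv2, class))
print(sapply(with_csv, class))
```

Note that in this two-row sample the last three columns contain nothing but NA, so their class here will not match the full file, where they are reported as num. The columns to compare are Index, D12, and E12.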
Anoop
  • read.csv vs read.csv2: http://stackoverflow.com/questions/22970091/difference-between-read-csv-and-read-csv2-in-r – Vlo Aug 21 '14 at 00:00