
Today I downloaded a dataset in CSV format from the Eurostat website. I loaded it into RStudio with the read.csv command and subset it to get the data I need. I now have 12 observations of around 9 variables. One of the variables is the value I am interested in, but the problem is that it is coded as a factor variable (with 754 levels).

This would be easy to fix with the as.numeric command, but the numbers are in a format like "48,478", so R doesn't see a single number (just my guess). If I use as.numeric I don't get 48478 but some different number, maybe a mean or something else, but definitely not 48478. After a few minutes I realized that the problem is probably the "," and started looking for a way to remove it.

One solution I found is to use the edit command and erase it manually, but I am planning to use more subsets from the original dataset, and I hope it isn't necessary to run edit and manually erase the offending symbol every time I make a new dataset.

halfer
Cejkaad

1 Answer


You can read the data in and then remove the "," before converting the string to numeric:

  1. Read the dataset with stringsAsFactors=FALSE:

    raw <- read.csv("a.csv", stringsAsFactors = FALSE)

  2. Convert the string to numeric (the same logic as replacing the "," in the editor):

    # strip every "," from numberAsString, then convert the result to numeric
    raw$number <- as.numeric(gsub(",", "", raw$numberAsString))
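
If the column has already been read in as a factor, the same idea still works, as long as it is converted to character first. The "different number" from the question appears because as.numeric on a factor returns the internal level codes rather than the printed values. Below is a minimal sketch of both effects; the vector `value` and its contents are made-up stand-ins for the Eurostat column, not data from the question.

```r
# Hypothetical values standing in for the factor column from the question
value <- factor(c("48,478", "1,203", "48,478"))

# as.numeric() on a factor gives the integer level codes, not the numbers
codes <- as.numeric(value)

# Convert to character, strip the thousands separator, then convert to numeric
clean <- as.numeric(gsub(",", "", as.character(value), fixed = TRUE))
clean
# [1] 48478  1203 48478
```

Here `fixed = TRUE` only tells gsub to treat "," as a literal string instead of a regular expression; for a plain comma the result is the same either way.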

Sixiang.Hu