
Here is my task:

There are some structural issues with this dataset. Write a function named 'convert_number' that will accomplish the following:
- change the numbers in a column so that the ',' is a '.'
- convert that column to a double


convert_number <- function(data, col) {
countries[col] <- as.character(countries[col])
  countries[col] <- scan(text=countries[col], dec=",", sep=".") (countries[col] <- as.double())
}

convert_number("countries", "Net.migration")

I first did the following:

countries$Net.migration <- sub("^$", "0", countries$Net.migration)

to change all the blanks to "0"s so that I could swap out the comma. However, I realized that a question further down in my assignment asks for the number of NAs in a column, so I can't have "0"s in those cells. I am guessing there is a better way to do this than scan(text=...)?

I'm a beginner (especially with functions) and I think I am overlooking a simpler way to do this.

Here is a sample: tail(countries, 5)


user12554068
  • What have you tried so far? Also, please tag your question with the relevant language (Java? Python?) – David Brossard Dec 17 '19 at 18:33
  • It's easier to help you if you include a simple [reproducible example](https://stackoverflow.com/questions/5963269/how-to-make-a-great-r-reproducible-example) with sample input and desired output that can be used to test and verify possible solutions. Normally you would replace characters with something like `gsub()` in R. – MrFlick Dec 17 '19 at 18:39
  • I have edited my question to address your comments – user12554068 Dec 17 '19 at 18:51

3 Answers

convert_number <- function(x){
    x <- as.character(x)
    x <- gsub(pattern = ",", replacement = ".",x = x, fixed = TRUE)
    x <- as.numeric(x)
    return(x)
}

This function is vectorized, so you can call it on a single column like this:

data$Coastline..coast.area.ratio <- convert_number(data$Coastline..coast.area.ratio)

or you could apply it to every column with lapply() (apply() would coerce the data frame to a matrix):

data[] <- lapply(data, convert_number)
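Regarding the blank cells mentioned in the question: as.numeric() turns an empty string into NA without a warning, so there should be no need to replace blanks with "0" first. A minimal sketch, assuming a hypothetical stand-in for the countries data (the values below are made up for illustration):

```r
# Same convert_number() as above, repeated so the sketch runs on its own
convert_number <- function(x) {
  x <- as.character(x)
  x <- gsub(",", ".", x, fixed = TRUE)
  as.numeric(x)
}

# Hypothetical stand-in for the real data; "" marks a missing value
countries <- data.frame(
  Country = c("A", "B", "C", "D"),
  Net.migration = c("23,06", "", "-4,93", "0"),
  stringsAsFactors = FALSE
)

countries$Net.migration <- convert_number(countries$Net.migration)

# The blank cell is now NA, so it can be counted later
n_missing <- sum(is.na(countries$Net.migration))
print(countries)
print(n_missing)
```

The genuine "0" stays a zero, and only the blank cell is counted as missing.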
Justin Landis

Here are some steps to accomplish this without creating your own function:

# random numbers with commas
charnum <- c("324,34","345435,50","234324",NA)

#switch commas with .
charnum3 <- gsub(",",".",charnum)
#change to numeric
charnum3 <- as.numeric(charnum3)
#sum missing values
sum(is.na(charnum3))

Update: the same steps wrapped in a function:

df <- data.frame(charnum, stringsAsFactors = FALSE)
convert_number <- function(data,col) {
  x1 <- gsub(",",".",data[[col]])
  x2 <- as.numeric(x1)
  return(x2)
}

df$charnum2 <- convert_number(df,"charnum")
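If the assignment expects the function to hand back the whole data frame, here is one possible variant. This is only a sketch: it assumes the data frame itself (not its name as a string) is passed in, and that the result is assigned back, since R functions do not modify their arguments in place.

```r
# Hypothetical variant: takes a data frame and a column name,
# returns the data frame with that column converted to double
convert_number <- function(data, col) {
  x <- as.character(data[[col]])        # also handles factor columns
  x <- gsub(",", ".", x, fixed = TRUE)  # decimal comma -> decimal point
  data[[col]] <- as.numeric(x)          # NA and "" both end up as NA
  data
}

df <- data.frame(charnum = c("324,34", "345435,50", "234324", NA),
                 stringsAsFactors = FALSE)

# Pass the object, not the string "df", and keep the returned value
df <- convert_number(df, "charnum")
str(df)
```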
Mike
  • Unfortunately my assignment requires a function which is the part I'm having trouble with :/ Any tips on how to incorporate this code with a function? – user12554068 Dec 17 '19 at 18:56
  • @user12554068 I updated the answer to address your issue (hopefully) – Mike Dec 17 '19 at 19:00

Here is an answer with a reproducible example

df <- data.frame("V1" = c("2,78", "3,54", "1,09", "0,08"),
                 "V2" = c("2,78", NA, NA, "0,08"),
                 "V3" = c("23,78", "31,54", "11,09", "88,08"))
my_fun <- function(x){
  # Convert every column: swap the decimal comma for a point, then parse as numeric
  x[] <- lapply(x, function(k){
    a <- gsub(",", ".", k, fixed = TRUE)
    as.numeric(a)
  })
  return(x)
}

res <- my_fun(df)

print(res)
    V1   V2    V3
1 2.78 2.78 23.78
2 3.54   NA 31.54
3 1.09   NA 11.09
4 0.08 0.08 88.08
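To confirm the conversion worked and to count the missing values per column, something like the following could be used. It is a sketch that rebuilds the printed result by hand so it runs on its own; `col_types` and `na_counts` are names introduced here for illustration.

```r
# The converted result from above, rebuilt directly for a standalone check
res <- data.frame(V1 = c(2.78, 3.54, 1.09, 0.08),
                  V2 = c(2.78, NA, NA, 0.08),
                  V3 = c(23.78, 31.54, 11.09, 88.08))

# Class of each column after conversion
col_types <- sapply(res, class)

# Number of NAs in each column
na_counts <- colSums(is.na(res))

print(col_types)
print(na_counts)
```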
Harry Smith