I tried to remove the commas in the Price column with the following code:

property2 %>%
    select(Price) %>%
    str_remove_all(",")

but it returns something like this:

\" \"525000\" \"300000\" \"490000\" \"4100000\" \"750000\" \"2130000\" \"585000\" \"2480000\" \"710000\" \"565000\" \"1400000\" \"880000\" \"3500000\" \"1230000\" \"3150000\" \"499000\" \"480000\" \"475000\" \"2700000\" \"6500000\" \"5100000\" \"5000000\" \"5500000\" \"480000\" \"540000\")"
Warning message:
In stri_replace_all_regex(string, pattern, fix_replacement(replacement),  :
  argument is not an atomic vector; coercing

The data looks like this:

      Location Price Rooms add_rooms Bathrooms `Car Parks`
   <chr>    <chr> <chr> <chr>         <dbl>       <dbl>
 1 KLCC     1,25~ 2     1                 3           2
 2 Damansa~ 6,80~ 6     NA                7          NA
 3 Dutamas  1,03~ 3     NA                4           2
 4 Cheras   NA    NA    NA               NA          NA
 5 Bukit J~ 900,~ 4     1                 3           2
 6 Taman T~ 5,35~ 4     2                 5           4
 7 Seputeh  NA    NA    NA               NA          NA
 8 Taman T~ 2,60~ 5     NA                4           4
 9 Taman T~ 1,95~ 4     1                 4           3
10 Sri Pet~ 385,~ 3     NA                2           1
K.W. LEE
  • There's something strange with your data structure. Please share sample input, `dput(head(property2["Price"], 5))` would be perfect as it will be copy/pasteable and include all relevant class and structure information. – Gregor Thomas Feb 26 '21 at 15:37
  • How was the column "price" created? – TarJae Feb 26 '21 at 16:03
  • structure(list(Price = c("1,250,000", "6,800,000", "1,030,000", NA, "900,000")), row.names = c(NA, -5L), class = c("tbl_df", "tbl", "data.frame")) – K.W. LEE Feb 26 '21 at 16:38
  • the data was created from importing csv – K.W. LEE Feb 26 '21 at 16:39
  • The REAL answer is to remove all formatting in your spreadsheet before saving it as csv – Hong Ooi Feb 26 '21 at 17:57
  • Does this answer your question? [How to read data when some numbers contain commas as thousand separator?](https://stackoverflow.com/questions/1523126/how-to-read-data-when-some-numbers-contain-commas-as-thousand-separator) – TarJae Feb 26 '21 at 18:03
  • Yes, it answered it – K.W. LEE Feb 27 '21 at 04:32

2 Answers

You can use gsub to remove the "," and then convert to a numeric vector.

test_df = structure(list(Price = c("1,250,000", "6,800,000", "1,030,000", NA, "900,000")), row.names = c(NA, -5L), class = c("tbl_df", "tbl", "data.frame"))

# Remove commas
test_df$Price <- gsub(",", "", test_df$Price)

# Convert to numeric
test_df$Price <- as.numeric(test_df$Price)

test_df$Price

Or, in tidyverse style:

test_df = structure(list(Price = c("1,250,000", "6,800,000", "1,030,000", NA, "900,000")), row.names = c(NA, -5L), class = c("tbl_df", "tbl", "data.frame"))

library(dplyr)  # provides %>% and mutate()

test_df <- test_df %>%
  mutate(Price = as.numeric(gsub(",", "", Price)))

test_df$Price
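
As for why the pipeline in the question misbehaves: select() returns a one-column tibble rather than a character vector, and str_remove_all() appears to coerce that whole tibble into a single string, which would explain both the odd output and the "not an atomic vector" warning. A minimal sketch of two ways around it, assuming dplyr and stringr are installed (the names prices and cleaned are only for illustration):

```r
library(dplyr)
library(stringr)

# Sample data taken from the question's dput() output
test_df <- structure(
  list(Price = c("1,250,000", "6,800,000", "1,030,000", NA, "900,000")),
  row.names = c(NA, -5L),
  class = c("tbl_df", "tbl", "data.frame")
)

# Option 1: pull() extracts the column as a plain character vector,
# which is the input str_remove_all() expects
prices <- test_df %>%
  pull(Price) %>%
  str_remove_all(",") %>%
  as.numeric()

# Option 2: keep the tibble and rewrite the column inside mutate()
cleaned <- test_df %>%
  mutate(Price = as.numeric(str_remove_all(Price, ",")))
```

Option 2 is usually the more convenient one when the rest of the analysis keeps working on the data frame.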
Damian

You can use readr::parse_number. It is vectorized, so it can be applied to the whole column directly; wrapping it in lapply would turn Price into a list column. Credit to: How to read data when some numbers contain commas as thousand separator?

# Your data
test_df = structure(list(Price = c("1,250,000", "6,800,000", "1,030,000", NA, "900,000")), row.names = c(NA, -5L), class = c("tbl_df", "tbl", "data.frame"))

test_df$Price <- readr::parse_number(test_df$Price)
test_df
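
Since the data comes from a CSV import, the commas can also be handled at read time. The following is only a sketch under assumptions: the file and its contents are made up here (csv_path is a hypothetical temporary file), and it relies on readr's col_number() column type, which drops grouping marks while parsing.

```r
library(readr)

# Hypothetical CSV standing in for the real file
csv_path <- tempfile(fileext = ".csv")
writeLines(
  c("Location,Price",
    "KLCC,\"1,250,000\"",
    "Cheras,",
    "Dutamas,\"1,030,000\""),
  csv_path
)

# Declare Price as a number column so the thousands separators
# are stripped during import; the empty field becomes NA
property2 <- read_csv(
  csv_path,
  col_types = cols(
    Location = col_character(),
    Price = col_number()
  )
)

property2$Price
```

This avoids a separate cleaning step, at the cost of having to name the affected columns up front.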


TarJae