
I'm working with this table:

Raumeinheit Langzeitarbeitslose
Hamburg 33,23
Berlin 44,56

I'm trying to calculate the mean of Langzeitarbeitslose, but I can't, because

is.numeric

returns FALSE: the column Langzeitarbeitslose is stored as character.

I think this might be because here in Germany we use "," as the decimal separator, not ".".

I already tried

as.numeric(gsub(",", ".", West_data$Langzeitarbeitslose))

That printed the converted values in the console, but when I looked at the table with

view(West_data)

it still showed the decimals of Langzeitarbeitslose separated with "," and

is.numeric(West_data$Langzeitarbeitslose)

came back as FALSE.

zx8754
GeoNerd
  • How did you import that data? Using read.table you can set decimals to comma: `dec = ","`. – zx8754 Nov 24 '22 at 11:27
  • @zx8754 I imported it with the `read.csv` command. Is there a way to define decimals to comma with my full table already read in? – GeoNerd Nov 24 '22 at 11:28
  • We need to update the column: `West_data$Langzeitarbeitslose <- as.numeric(gsub(",", ".", West_data$Langzeitarbeitslose))` – zx8754 Nov 24 '22 at 11:29
  • Try it again with `read.csv("myfile.csv", dec = ",")` – zx8754 Nov 24 '22 at 11:29
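The import-time fix suggested in the comments can be sketched as follows. This is only an illustration: the inline `csv_text` is a hypothetical stand-in for the real file, and the semicolon field separator is an assumption, since the original file layout is not shown. `read.csv2()` is the base R variant of `read.csv()` that defaults to `sep = ";"` and `dec = ","`.

```r
# Hypothetical inline data standing in for the CSV file.
# Assumption: fields are separated by semicolons, as is common for
# files that use a decimal comma.
csv_text <- "Raumeinheit;Langzeitarbeitslose
Hamburg;33,23
Berlin;44,56"

# read.csv2() defaults to sep = ";" and dec = ",", so the decimal
# comma is parsed while the data is being read.
West_data <- read.csv2(text = csv_text)

# The column is numeric straight after the import.
is.numeric(West_data$Langzeitarbeitslose)
mean(West_data$Langzeitarbeitslose)
```

If the file really is comma-separated with quoted values, `read.csv("myfile.csv", dec = ",")` from the comment above would be the variant to try instead.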

2 Answers


I think you need to assign the result of as.numeric(gsub(",", ".", West_data$Langzeitarbeitslose)) back to the column West_data$Langzeitarbeitslose. On its own, the call only returns a converted copy and leaves the data frame unchanged:

West_data$Langzeitarbeitslose <- as.numeric(gsub(",", ".", West_data$Langzeitarbeitslose))

The result of print(West_data) will be:

  Raumeinheit Langzeitarbeitslose
1     Hamburg               33.23
2      Berlin               44.56

The type conversion can be checked with str():

> str(West_data)
'data.frame':   2 obs. of  2 variables:
 $ Raumeinheit        : chr  "Hamburg" "Berlin"
 $ Langzeitarbeitslose: num  33.2 44.6
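To make the difference between calling the conversion and assigning it concrete, here is a minimal sketch that rebuilds the two-row table from the question (the `data.frame()` construction is an assumption about how the data looks after import) and checks the column before and after the assignment.

```r
# Rebuild the example table; the values are still character strings.
West_data <- data.frame(
  Raumeinheit = c("Hamburg", "Berlin"),
  Langzeitarbeitslose = c("33,23", "44,56"),
  stringsAsFactors = FALSE
)

# Without assignment the converted vector is only printed;
# the column inside West_data is not modified.
as.numeric(gsub(",", ".", West_data$Langzeitarbeitslose, fixed = TRUE))
is.numeric(West_data$Langzeitarbeitslose)  # still FALSE

# Assigning the result back replaces the character column.
West_data$Langzeitarbeitslose <-
  as.numeric(gsub(",", ".", West_data$Langzeitarbeitslose, fixed = TRUE))

is.numeric(West_data$Langzeitarbeitslose)  # now TRUE
mean(West_data$Langzeitarbeitslose)
```

`fixed = TRUE` is optional here; it just tells `gsub()` to treat the comma as a literal string rather than a regular expression.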
asd-tm

Are you assigning the result of the type conversion back to your data frame?

Using the tidyverse:

library(dplyr)  # provides mutate() and %>%

# reproducible data using dput()
df <- structure(list(Raumeinheit = c("Hamburg", "Berlin"), Langzeitarbeitslose = c("33,23", 
"44,56")), row.names = c(NA, -2L), class = c("tbl_df", "tbl", 
"data.frame"))

# replace the decimal comma, convert to numeric, and assign the result back to df
df <- df %>%
  mutate(Langzeitarbeitslose = as.numeric(gsub(",", ".", Langzeitarbeitslose)))

# A tibble: 2 × 2
  Raumeinheit Langzeitarbeitslose
  <chr>                     <dbl>
1 Hamburg                    33.2
2 Berlin                     44.6
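As a possible alternative to the string replacement, base R's `type.convert()` accepts a `dec` argument, so the decimal mark can be declared directly. This is only a sketch on the two sample values from the question; the vector names `raw` and `converted` are made up for the illustration.

```r
# Sample values copied from the question, still stored as character.
raw <- c("33,23", "44,56")

# dec = "," declares the decimal mark; as.is = TRUE keeps input that
# cannot be parsed as a number as character instead of a factor.
converted <- type.convert(raw, as.is = TRUE, dec = ",")

is.numeric(converted)
mean(converted)
```

As with `gsub()`, the result still has to be assigned back to the data frame column to take effect.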
pluke