I'm working on some clustering with k-means. Once the code is established, I want to import data from an Excel file. The basic script works just fine:
library(factoextra)  # provides get_dist() and fviz_dist()
df <- USArrests
df <- na.omit(df)
df <- scale(df)
head(df, top = 10)
distance <- get_dist(df)
fviz_dist(distance, gradient = list(low = "#33E3FF", mid = "white", high =
"#80FF33"))
But if I export the training data from RStudio to Excel and reimport it into RStudio,
I end up with an error and a warning:
1)
Error in colMeans(x, na.rm = TRUE) : 'x' must be numeric
2)
Warning message:
In stats::dist(x, method = method, ...) : NAs introduced by coercion
This is my script, which produces the errors:
library(xlsx)  # assumed source of write.xlsx() and read.xlsx(), based on the sheetIndex argument
df <- USArrests
write.xlsx(df, "c:/my_path/USArrests.xlsx")
df <- read.xlsx(file = "c:/my_path/USArrests.xlsx", sheetIndex = 1)
df <- na.omit(df)
df <- scale(df)
Error in colMeans(x, na.rm = TRUE) : 'x' must be numeric
head(df, top = 10)
NA. Murder Assault UrbanPop Rape
1 Alabama 13.2 236 58 21.2
2 Alaska 10.0 263 48 44.5
3 Arizona 8.1 294 80 31.0
4 Arkansas 8.8 190 50 19.5
5 California 9.0 276 91 40.6
6 Colorado 7.9 204 78 38.7
distance <- get_dist(df)
Warning message:
In stats::dist(x, method = method, ...) : NAs introduced by coercion
fviz_dist(distance, gradient = list(low = "#33E3FF", mid = "white", high =
"#80FF33"))
How can I fix this? Or how do I import Excel data for fviz_dist?
Edit: Highlighting how I exported the data to Excel and imported it again:
write.xlsx(df, "c:/my_path/USArrests.xlsx")
df <- read.xlsx(file = "c:/my_path/USArrests.xlsx", sheetIndex = 1)
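For reference, a minimal sketch of what seems to be going on, judging from the head() output above: after the round trip the state names come back as an ordinary text column (shown as "NA.") instead of row names, so scale() no longer sees an all-numeric table. The snippet simulates that situation in base R rather than reading a real file, so the way the data frame is built here is an assumption, not the actual Excel import. It then moves the first column back into the row names before scaling.

```r
# Simulate the reimported data (assumption: the real read.xlsx() result looks
# like this, with the state names in the first column and default row names).
df <- data.frame(state = rownames(USArrests), USArrests,
                 stringsAsFactors = FALSE)
rownames(df) <- NULL

# Move the first column back into the row names, then drop it from the data.
rownames(df) <- df[[1]]
df <- df[, -1]

# Only numeric columns remain, so na.omit() and scale() work as in the
# original script.
df <- na.omit(df)
df <- scale(df)
head(df, n = 10)

# From here the original calls should apply unchanged (needs factoextra):
# distance <- get_dist(df)
# fviz_dist(distance, gradient = list(low = "#33E3FF", mid = "white", high = "#80FF33"))
```

With a real file, the same two lines (setting the row names from the first column and dropping that column) would go directly after the read.xlsx() call.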