K-Means clustering in R error NA/NaN/Inf in foreign function call

Question

I have a dataset that I have created in R. It is structured as follows: [image of the dataset]

I am trying to cluster the observations using k-means. However, I get the following error message:

> cl <- kmeans(sample, 3)

Error in do_one(nmeth) : NA/NaN/Inf in foreign function call (arg 1)
In addition: Warning message:
In storage.mode(x) <- "double" : NAs introduced by coercion

What does this mean? Am I preprocessing the data incorrectly? What can I do to fix it?

Your picture shows a mixture of character (ID, Genre) and numeric data. The kmeans function only works with numeric data. What does `str(samples)` show? — dcarlson, Dec 07 '19 at 23:02

dc37 · Answer 1 · 2019-12-07T07:56:33.157

In the documentation of kmeans (run ?kmeans in the console to see it), the argument x is described as a:

numeric matrix of data, or an object that can be coerced to such a matrix (such as a numeric vector or a data frame with all numeric columns).
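As a quick check, something along these lines shows which columns kmeans can actually use. This is only a sketch on a made-up data frame: `toy` and its column names are my own invention, not your data.

```r
# Hypothetical toy data: an ID column, a character column, two numeric columns
toy <- data.frame(
  ID = c("a1", "a2", "a3", "a4"),
  Genre = c("rock", "pop", "rock", "jazz"),
  x1 = c(1.0, 1.2, 5.1, 5.3),
  x2 = c(0.9, 1.1, 4.8, 5.2),
  stringsAsFactors = FALSE
)

# TRUE for columns that are already numeric, FALSE otherwise
is_num <- sapply(toy, is.numeric)
print(is_num)

# Keep only the numeric columns; these are the ones kmeans() can work with
toy_num <- toy[, is_num, drop = FALSE]
str(toy_num)
```

Any column reported as FALSE either has to be dropped or converted before clustering.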

Here, the first row of your data is what prevents it from being used with kmeans. I believe that first row is supposed to be your column names.

Moreover, you can't cluster on your second column, Genre, because it is character, and I believe the first column (ID) should not be used either. Am I right?

So, if your dataset is called samples, try to do:

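# Use the first row as column names
colnames(samples) <- as.character(unlist(samples[1, ]))
# Drop that header row and the ID and Genre columns
samples_cluster <- samples[-1, 3:ncol(samples)]
# Cluster the remaining columns into 3 groups
cl <- kmeans(samples_cluster, 3)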
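Since I can't see your actual data, here is the same idea as a self-contained sketch. The data frame `raw` is hypothetical and only mimics what I think happened (the header line read in as the first data row, so every column is character). It also converts the columns to numeric explicitly and stops if any value fails to convert, so a leftover text cell shows up before kmeans is called.

```r
# Hypothetical data where the header line ended up as the first data row,
# so every column is stored as character
raw <- data.frame(
  V1 = c("ID", "a1", "a2", "a3", "a4", "a5", "a6"),
  V2 = c("Genre", "rock", "pop", "rock", "jazz", "pop", "jazz"),
  V3 = c("x1", "1.0", "1.2", "5.1", "5.3", "9.0", "9.2"),
  V4 = c("x2", "0.9", "1.1", "4.8", "5.2", "8.7", "9.1"),
  stringsAsFactors = FALSE
)

# Promote the first row to column names, then drop it
colnames(raw) <- as.character(unlist(raw[1, ]))
dat <- raw[-1, ]

# Keep the measurement columns and convert them from character to numeric
num <- as.data.frame(lapply(dat[, 3:ncol(dat)], as.numeric))

# Fail early if anything could not be converted
stopifnot(!anyNA(num))

set.seed(1)  # kmeans starts from random centers
cl <- kmeans(num, centers = 3, nstart = 10)
table(cl$cluster)
```

If the `stopifnot` line fails on your real data, the offending cells are the ones that produce the "NAs introduced by coercion" warning.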

Does this answer your question?

If not, can you provide a reproducible example of your dataset so that we can check the data frame before running kmeans? To do this, please see: How to make a great R reproducible example
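One common way to share a dataset is to post the output of `dput`, which prints R code that rebuilds the object. A minimal sketch, assuming your object is called `samples` (the data frame below is a made-up stand-in):

```r
# Hypothetical stand-in for your data; replace `samples` with your own object
samples <- data.frame(
  ID = c("a1", "a2", "a3"),
  Genre = c("rock", "pop", "jazz"),
  x1 = c(1.0, 5.1, 9.0),
  stringsAsFactors = FALSE
)

# Column types at a glance
str(samples)

# Text that rebuilds the first rows; paste this output into the question
dput(head(samples, 20))
```

The `str` output also answers the question of which columns are character and which are numeric.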
