I have data from a survey that I would like to analyze: I want to cluster the responses and display the result in 3D, since that makes the information easier to grasp visually.

I have many columns of questions that the respondents answer on a fixed scale: Agree (1), Somewhat agree (0.8), Neutral (0.6), Somewhat disagree (0.4), Disagree (0.2). The last column is a numerical rating question. This means the data are closer to categorical than to continuous.

A sample of the dataset is shown below:

q1,q2,q3,q4,q5,q6,q7
1,0.8,0.6,0.2,0.2,0.4,10
0.2,1,0.4,0.4,0.4,0.4,9
0.6,1,0.2,0.4,0.2,0.2,6

I am trying to write some code in R based on the following reference: https://plotly.com/r/t-sne-and-umap-projections/

And the code I've tried to run is the following:

library(cluster)
library(dplyr)  # provides %>%, mutate() and group_by() used below
gower_df <- daisy(data,
                    metric = "gower" ,
                    type = list(logratio = 2))

silhouette <- NA  # placeholder for k = 1, which has no silhouette width
for(i in 2:10){
  pam_clusters = pam(as.matrix(gower_df),
                 diss = TRUE,
                 k = i)
  silhouette = c(silhouette ,pam_clusters$silinfo$avg.width)
}
plot(1:10, silhouette,
     xlab = "Clusters",
     ylab = "Silhouette Width")
lines(1:10, silhouette)

pam_ = pam(gower_df, diss = TRUE, k = 2)
data[pam_$medoids, ]

pam_summary <- data %>%
  mutate(cluster = pam_$clustering) %>%
  group_by(cluster) %>%
  do(cluster_summary = summary(.))
pam_summary$cluster_summary[[1]]

library(Rtsne)
library(ggplot2)
tsne_object <- Rtsne(gower_df, is_distance = TRUE)
tsne_df <- tsne_object$Y %>%
  data.frame() %>%
  setNames(c("X", "Y")) %>%
  mutate(cluster = factor(pam_$clustering))
ggplot(aes(x = X, y = Y), data = tsne_df) +
  geom_point(aes(color = cluster))

library(plotly) 
library(umap) 
fig2 <- plot_ly(tsne_df)

fig2

But I only get a 2D representation. Any idea how I can get a 3D plot instead?

MSmith
  • Add `dims = 3` to your `Rtsne()` call. – SamR Jul 04 '22 at 14:42
  • @SamR I have tried to do it and modify the code I have posted but I can't get it. Could you help me? – MSmith Jul 04 '22 at 14:54
  • This example is not [reproducible](https://stackoverflow.com/questions/5963269/how-to-make-a-great-r-reproducible-example) - specifically you have not posted your data. It's also not particularly minimal, so it's hard to help further. You said you are following the plotly tutorial but you are also using a different package (Rtsne vs tsne) - presumably there's a reason but I don't know what it is so it's hard to say more. – SamR Jul 04 '22 at 14:58
  • @SamR was suggesting `tsne_object <- Rtsne(gower_df, is_distance = TRUE, dim=3)` – G5W Jul 04 '22 at 15:09
  • Hi! I get the following error by trying that: Error in initialize(...) : attempt to use zero-length variable name – MSmith Jul 05 '22 at 09:31
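
The suggestion in the comments can be sketched roughly as follows. This is only a minimal example under stated assumptions, not a verified fix for the original data: since the survey itself was not posted, the `data` frame here is randomly generated stand-in data (60 made-up respondents), `perplexity = 15` is an arbitrary value chosen to suit that small sample, and the `logratio` option from the question is left out for simplicity. The points it tries to illustrate are that `dims = 3` makes `Rtsne()` return three embedding columns, that `setNames()` then needs three names, and that `plot_ly()` needs explicit `x`, `y`, `z` mappings with `type = "scatter3d"`.

```r
library(cluster)  # daisy(), pam()
library(Rtsne)    # Rtsne()
library(plotly)   # plot_ly()

set.seed(42)

# Hypothetical stand-in for the survey (the real data was not posted):
# six Likert-style items coded 0.2 ... 1 and one 1-10 rating.
likert <- c(0.2, 0.4, 0.6, 0.8, 1)
data <- as.data.frame(replicate(6, sample(likert, 60, replace = TRUE)))
names(data) <- paste0("q", 1:6)
data$q7 <- sample(1:10, 60, replace = TRUE)

# Gower dissimilarities and a PAM clustering, as in the question
gower_df <- daisy(data, metric = "gower")
pam_ <- pam(gower_df, diss = TRUE, k = 2)

# dims = 3 asks for a three-dimensional embedding;
# perplexity is an assumed value (it must satisfy 3 * perplexity < n - 1)
tsne_object <- Rtsne(gower_df, is_distance = TRUE, dims = 3, perplexity = 15)

# Three embedding columns now, so three names are needed
tsne_df <- setNames(data.frame(tsne_object$Y), c("X", "Y", "Z"))
tsne_df$cluster <- factor(pam_$clustering)

# Map all three axes explicitly and request a 3D scatter trace
fig <- plot_ly(tsne_df, x = ~X, y = ~Y, z = ~Z, color = ~cluster,
               type = "scatter3d", mode = "markers")
fig
```

With the real survey data, only the block that builds `data` would need to be replaced. The perplexity may also need adjusting to the number of respondents.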
