I have data from a survey that I would like to analyze, cluster, and display in 3D, since that makes the information more tangible to visualize.
Most of the columns are questions that respondents answer with Agree (1), Somewhat agree (0.8), Neutral (0.6), Somewhat disagree (0.4), or Disagree (0.2); the last column is a numerical rating question. This means the data are categorical rather than continuous.
A sample of the dataset is shown below:
q1,q2,q3,q4,q5,q6,q7
1,0.8,0.6,0.2,0.2,0.4,10
0.2,1,0.4,0.4,0.4,0.4,9
0.6,1,0.2,0.4,0.2,0.2,6
I am trying to write some code in R based on the following reference: https://plotly.com/r/t-sne-and-umap-projections/
The code I have tried to run is the following:
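library(cluster)
library(dplyr)  # provides %>%, mutate(), group_by() and do() used below
# Gower dissimilarity between respondents
gower_df <- daisy(data,
                  metric = "gower",
                  type = list(logratio = 2))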
# Average silhouette width for k = 2..10 (k = 1 is undefined, hence NA)
silhouette <- c(NA)
for (i in 2:10) {
  pam_clusters <- pam(as.matrix(gower_df),
                      diss = TRUE,
                      k = i)
  silhouette <- c(silhouette, pam_clusters$silinfo$avg.width)
}
plot(1:10, silhouette,
     xlab = "Clusters",
     ylab = "Silhouette Width")
lines(1:10, silhouette)
# Final clustering with k = 2, then inspect the medoids and a per-cluster summary
pam_ <- pam(gower_df, diss = TRUE, k = 2)
data[pam_$medoids, ]
pam_summary <- data %>%
  mutate(cluster = pam_$clustering) %>%
  group_by(cluster) %>%
  do(cluster_summary = summary(.))
pam_summary$cluster_summary[[1]]
library(Rtsne)
library(ggplot2)
# t-SNE embedding computed from the Gower dissimilarities
tsne_object <- Rtsne(gower_df, is_distance = TRUE)
tsne_df <- tsne_object$Y %>%
  data.frame() %>%
  setNames(c("X", "Y")) %>%
  mutate(cluster = factor(pam_$clustering))
ggplot(aes(x = X, y = Y), data = tsne_df) +
  geom_point(aes(color = cluster))
library(plotly)
library(umap)
fig2 <- plot_ly(tsne_df)
fig2
However, I only get a 2D representation. Any idea how I can get a 3D plot instead?
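One possible direction, offered as a minimal sketch rather than a definitive fix: `Rtsne()` returns two dimensions by default, so the embedding has no third coordinate to plot. Passing `dims = 3` and then mapping the three columns to a `scatter3d` trace in `plot_ly()` should give a 3D view. The `data` frame below is a randomly generated stand-in for the survey (an assumption, not the real dataset), and the `perplexity` value and the column name `Z` are illustrative choices.

```r
library(cluster)  # daisy(), pam()
library(Rtsne)    # Rtsne()
library(plotly)   # plot_ly()

# Hypothetical stand-in for the survey: 60 respondents, six Likert-style
# questions coded 0.2..1 and one 1-10 rating (NOT the real data).
set.seed(42)
likert <- c(0.2, 0.4, 0.6, 0.8, 1)
data <- data.frame(
  q1 = sample(likert, 60, replace = TRUE),
  q2 = sample(likert, 60, replace = TRUE),
  q3 = sample(likert, 60, replace = TRUE),
  q4 = sample(likert, 60, replace = TRUE),
  q5 = sample(likert, 60, replace = TRUE),
  q6 = sample(likert, 60, replace = TRUE),
  q7 = sample(1:10, 60, replace = TRUE)
)

# Same pipeline as in the question: Gower dissimilarity, then PAM with k = 2
gower_df <- daisy(data, metric = "gower")
pam_ <- pam(gower_df, diss = TRUE, k = 2)

# dims = 3 requests a three-dimensional embedding (the default is 2).
# perplexity is lowered here only because the toy dataset is small.
tsne_object <- Rtsne(gower_df, is_distance = TRUE, dims = 3, perplexity = 10)

# Three embedding columns plus the cluster label
tsne_df <- data.frame(tsne_object$Y)
names(tsne_df) <- c("X", "Y", "Z")
tsne_df$cluster <- factor(pam_$clustering)

# Map all three axes explicitly and ask for a 3D scatter trace
fig <- plot_ly(tsne_df, x = ~X, y = ~Y, z = ~Z, color = ~cluster,
               type = "scatter3d", mode = "markers")
fig
```

With the real survey data, the stand-in `data` block would be dropped and the default perplexity could be kept, as long as the number of rows is large enough for it.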