Is there a way to access or export the label numbers in an R plot?

Question

Plotting the students' score change in two tests

I have a plot where x is one test (a) and y is another test (b). Each student is tested twice. Each dot represents one student's "post minus pre" score on x and on y. As you can see, I assigned labels to the points, but I want to export the ids from different parts of the plot. Is there a way to do this?

what do you mean by "I want to export the id on different parts in the plot" ? Are you looking for a clustering algorithm to identify the students that improved and the ones that did not? — RockScience, Mar 11 '15 at 04:05
I have their individual scores, and I want to somehow extract the groups on the plot. For example, there are two big groups on the plot and I want to know the ids in those two groups. What do you mean by clustering algorithm? I think that would be helpful too. Actually I have four tests, and I am trying to group students into similar growth patterns. Can you give me an example of your algorithm? Thank you! @RockScience — William Liu, Mar 11 '15 at 04:10
William, you should do some research on clustering; there are many ways to identify groups of ids from a data set. http://www.statmethods.net/advstats/cluster.html I think in your case a simple k-means clustering would work. — RockScience, Mar 11 '15 at 04:19
Example data and an example output would be really useful too. — tospig, Mar 11 '15 at 04:34

score 2 · Accepted Answer · answered Mar 11 '15 at 04:23

If myData is your data set, you can identify each group using a k-means algorithm. (Make sure x and y are centered and scaled appropriately beforehand.)

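# Simulate two groups of 50 points each as stand-in data
myData <- rbind(matrix(rnorm(100, sd = 0.3), ncol = 2),
                matrix(rnorm(100, mean = 1, sd = 0.3), ncol = 2))
colnames(myData) <- c("x", "y")
# Fit k-means with 2 clusters; the outer parentheses print the result
(cl <- kmeans(myData, 2))
# Color each point by its assigned cluster and mark the cluster centers
plot(myData, col = cl$cluster)
points(cl$centers, col = 1:2, pch = 8, cex = 2)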
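
To get from the fitted clusters back to the student ids, which is what the question asks for, something along these lines might work. This is only a sketch: the simulated data, the row-name ids (`id1`, `id2`, ...), the fixed seed, and the output path are all assumptions for illustration, and `scale()` stands in for the centering and scaling step mentioned above.

```r
set.seed(42)  # assumption: fixed seed so the sketch is reproducible

# Stand-in data as in the answer, with hypothetical student ids as row names
myData <- rbind(matrix(rnorm(100, sd = 0.3), ncol = 2),
                matrix(rnorm(100, mean = 1, sd = 0.3), ncol = 2))
colnames(myData) <- c("x", "y")
rownames(myData) <- paste0("id", seq_len(nrow(myData)))

# Center and scale each column before clustering
scaled <- scale(myData)
cl <- kmeans(scaled, centers = 2, nstart = 10)

# cl$cluster holds one cluster number per row, in row order
ids_by_cluster <- split(rownames(myData), cl$cluster)

# One row per student: id plus assigned cluster, ready for export
result <- data.frame(id = rownames(myData), cluster = unname(cl$cluster))
out_file <- file.path(tempdir(), "clusters.csv")  # hypothetical output path
write.csv(result, out_file, row.names = FALSE)
```

Here `ids_by_cluster[["1"]]` and `ids_by_cluster[["2"]]` would list the ids in each group. Which group gets which number is arbitrary and can change between runs.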

score 0 · Answer 2 · edited May 23 '17 at 11:50


Adding to the answer from @RockScience:

It may be better to first decide on the number of clusters instead of fixing it at 2. That way you will probably get the actual groups of people rather than forcing the whole group into just 2 clusters.

A link on how to find the number of clusters: find the number of clusters
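
One common heuristic for this is the "elbow" plot of total within-cluster sum of squares against the number of clusters. The sketch below is not necessarily the method described at the link; the simulated data, the seed, and the range of k from 1 to 6 are assumptions for illustration.

```r
set.seed(1)  # assumption: fixed seed so the sketch is reproducible

# Stand-in data with two groups, as in the accepted answer
myData <- rbind(matrix(rnorm(100, sd = 0.3), ncol = 2),
                matrix(rnorm(100, mean = 1, sd = 0.3), ncol = 2))
colnames(myData) <- c("x", "y")

ks <- 1:6
wss <- numeric(length(ks))

# For k = 1 the within-cluster sum of squares is the total sum of squares
wss[1] <- sum(scale(myData, scale = FALSE)^2)

# For k >= 2, fit k-means and record the total within-cluster sum of squares
for (k in ks[-1]) {
  wss[k] <- kmeans(myData, centers = k, nstart = 10)$tot.withinss
}

# Look for the k after which the curve flattens out (the "elbow")
plot(ks, wss, type = "b",
     xlab = "Number of clusters k",
     ylab = "Total within-cluster sum of squares")
```

With data like this the curve would typically drop sharply from k = 1 to k = 2 and flatten afterwards, which suggests two clusters. On real scores the elbow can be much less clear.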

answered Mar 11 '15 at 06:57 by wang892

score 0 · Answer 3 · answered Mar 11 '15 at 21:58


Why not select by thresholds?

You are interested in students in a particular range.

So why not formalize the range, and select where 0
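
Selecting by thresholds could look roughly like the sketch below. The data frame `scores`, its column names, and the cutoffs `x_min` and `y_min` are hypothetical placeholders, not values from the answer.

```r
set.seed(7)  # assumption: fixed seed so the sketch is reproducible

# Hypothetical score changes for 100 students
scores <- data.frame(
  id = paste0("id", 1:100),
  x  = c(rnorm(50, sd = 0.3), rnorm(50, mean = 1, sd = 0.3)),
  y  = c(rnorm(50, sd = 0.3), rnorm(50, mean = 1, sd = 0.3))
)

# Hypothetical cutoffs defining the region of interest
x_min <- 0.5
y_min <- 0.5

# Keep only the students whose change exceeds both cutoffs
selected <- subset(scores, x > x_min & y > y_min)
selected_ids <- selected$id
```

`selected_ids` then holds the ids that fall inside the chosen region, and the same pattern extends to upper bounds or to more than two tests.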

answered Mar 11 '15 at 21:58 by Has QUIT--Anony-Mousse
