
I had a group of students complete a "personality quiz". The quiz consisted of them rating themselves on a 1-10 scale for a number of different traits (e.g. introversion, ability to focus, etc.).

The students were then put into groups and given a couple of group assignments. Afterwards I had them complete another quiz in which they reflected on their performance on the assignments: how well the group got along, their mark on the assignments, how much disagreement they had, how well they were able to focus, etc. All on a 1-10 scale.

I now have a new set of students and made them complete the same personality quiz I gave to the first set of students.

I want to now make a machine learning algorithm and train it using the personality and performance data I got from the first set of students. I want it to now be able to group the new set of students using their personality quiz results so that the performance of the groups is maximized.

In other words, I had a set of students and I measured their personality and performance in groups I made up. I now have a new set of students and want a machine learning algorithm to learn from the original set of students' data and put the new students into groups so that their personalities work together to maximize performance.

Could someone point me in the right direction please? I have no experience in machine learning whatsoever and so have no idea what to use.

user4157124
  • 1
    @user5651239, recommendation requests are off topic on this site. – Don Reba Dec 04 '15 at 22:41
  • @DonReba Oh okay, I see. Any suggestions on a more appropriate way to find the answer to my question? – user5641293 Dec 04 '15 at 22:47
  • Welcome to StackOverflow. Please read and follow the posting guidelines in the help documentation. [on topic](http://stackoverflow.com/help/on-topic) applies here. – Prune Dec 04 '15 at 22:59
  • @Prune I looked it over and still think this is relevant. I'm not asking for a recommendation for a book or a tool, I'm asking what the most appropriate algorithm is. Here's an algorithm question just like this and it was on topic: http://stackoverflow.com/questions/22342854/what-is-the-optimal-algorithm-for-the-game-2048 – user5641293 Dec 04 '15 at 23:04
  • 1
    @user5641293 the difference is that that user already had code for an algorithm and was asking how to improve on what he originally had. As it stands, I believe your question is "Too Broad" for a question on SO – R Nar Dec 04 '15 at 23:09
  • @RNar What can I do to improve the question so that it's less broad? – user5641293 Dec 04 '15 at 23:12
  • 1
    @user5641293 try the question yourself first and come back when you encounter a specific problem or have a specific question on comparing certain algorithms. It seems rather blunt to pretty much tell you "go out onto the world and fend for yourself" but that's essentially what you should be doing. Research the topic, find out some of the algorithms that may look good, try it out – R Nar Dec 04 '15 at 23:15
  • You say you have no experience in machine learning, which means that you have a lot to learn. So learn it! It never hurts to search Google for how machine learning works and what the logic behind it is. – R Nar Dec 04 '15 at 23:16

1 Answer


First, as the comments mentioned, this is off topic for this site. But I am answering because I want to.

Now, the whole experiment you are doing is subject to some biases that may be problematic. I am not providing references because I am not willing to spend the time (you should find your own references for these points):

  1. People tend to rate group-related performance more optimistically
  2. Self-formed groups sometimes inherit social ties, which affects performance
  3. Self-assessment of work may not correlate with a global evaluation of the output
  4. Different tasks (classes in your case) require different kinds of collaboration. Hence your algorithm will very likely work only for one class, if it works at all.

You also have not defined the following:

  1. Metric of performance for each group
  2. Metric of goodness for your model trained on a set of groups
  3. Size of groups (uniform or variable)
  4. Number of groups (a pre-specified number or variable)

In general, you can run N-fold cross-validation on your dataset with most models. In your case this is likely an optimization problem: maximize a goodness metric, computed by combining the performances of the groups, over all possible partitions of the students. That will be computationally expensive and not scalable at all, because the number of possible partitions grows extremely quickly with the number of students, unless you devise a greedy algorithm.
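To make the greedy idea concrete, here is a minimal Python sketch, not a definitive implementation. It assumes a fixed group size and a `score` function that predicts how well a set of personality vectors will perform together; in practice that function would be a model trained on the first cohort. The `toy_score` below, the student names, and the trait values are all made up for illustration.

```python
from typing import Callable, Dict, List, Sequence

Traits = Sequence[float]


def greedy_groups(
    students: Dict[str, Traits],
    group_size: int,
    score: Callable[[List[Traits]], float],
) -> List[List[str]]:
    """Build groups one at a time, always adding the remaining student
    who gives the current group the highest predicted score."""
    remaining = sorted(students)  # sorted names keep the result deterministic
    groups: List[List[str]] = []
    while remaining:
        group = [remaining.pop(0)]  # seed each group with the next unassigned student
        while len(group) < group_size and remaining:
            best = max(
                remaining,
                key=lambda name: score([students[m] for m in group + [name]]),
            )
            remaining.remove(best)
            group.append(best)
        groups.append(group)
    return groups


def toy_score(traits: List[Traits]) -> float:
    # ASSUMPTION: a stand-in for a trained model. It simply rewards groups
    # whose members differ on the first trait (say, introversion).
    values = [t[0] for t in traits]
    return max(values) - min(values)


# Hypothetical quiz results: [introversion, focus] on a 1-10 scale.
students = {"ana": [1, 5], "ben": [9, 4], "cy": [2, 7], "dee": [8, 3]}
groups = greedy_groups(students, group_size=2, score=toy_score)
print(groups)
```

This evaluates the score function a number of times that grows roughly quadratically with the number of students, instead of enumerating every partition. The trade-off is that a greedy pass is not guaranteed to find the best overall partition; early groups get first pick and later groups get whoever is left.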

I will leave you here. Now it's your time to do some work.

Patrick the Cat