0

I would like to collect user information to determine whether they are male or female. I have zero labeled data for my users, but I know some features that can easily predict their gender. An example would be texts created by the users that contain words strongly associated with one gender (e.g., male: beer, football game, boxers; female: facial, makeup, bra).

Would this be considered unsupervised learning, since I don't have labelled data to train my models on?

Popcorn

2 Answers

1

This is neither supervised nor unsupervised. You are just applying some predefined rules to classify users as male or female.

This is also not machine learning, because you don't use any learning method...

dynamic
  • Is there an academic name for this kind of pattern recognition? I'd like to research similar problems that have been done. – Popcorn Jan 22 '15 at 19:58
  • There is no pattern recognition, as you are simply "spotting" words and classifying based on rules. Your system does not learn at all. – Umang Gupta Jul 28 '16 at 15:47
0

A supervised learning method would take all of the text written by the users and let the machine determine which words are important, and by how much. It does this by trying to guess each user's gender and then correcting itself using the label.
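The guess-then-correct loop can be illustrated with a perceptron, which is one simple supervised method among many. This is only a minimal sketch: the `+1`/`-1` label encoding, the function names, and the four training sentences are assumptions made up for illustration, and a real system would need actual labeled data.

```python
from collections import defaultdict

MALE, FEMALE = 1, -1  # hypothetical label encoding


def tokenize(text):
    # Naive whitespace tokenizer; an assumption for this sketch.
    return text.lower().split()


def train_perceptron(examples, epochs=10):
    """Learn one weight per word from (text, label) pairs."""
    weights = defaultdict(float)
    for _ in range(epochs):
        for text, label in examples:
            words = tokenize(text)
            score = sum(weights[w] for w in words)
            guess = MALE if score >= 0 else FEMALE
            if guess != label:
                # Wrong guess: nudge every word's weight toward the true label.
                for w in words:
                    weights[w] += label
    return weights


def predict(weights, text):
    score = sum(weights.get(w, 0.0) for w in tokenize(text))
    return MALE if score >= 0 else FEMALE


# Made-up labeled examples; the question states that no such labels exist.
examples = [
    ("beer and football tonight", MALE),
    ("new makeup and facial", FEMALE),
    ("football game with beer", MALE),
    ("makeup tutorial", FEMALE),
]
weights = train_perceptron(examples)
```

Here the word weights are learned from the labels rather than supplied by hand, which is the part the setup in the question is missing.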

An unsupervised method would be to give the machine all the text written by the users and let it try to form groups from the patterns it finds. However, there are many more ways to group users than 'male' and 'female', so this is not exactly an ideal unsupervised problem.
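As a rough illustration of the unsupervised case, the sketch below clusters texts into two groups with a bare-bones k-means on word counts. Everything here is an assumption for illustration: the choice of k-means, the naive initialisation from the first and last text, and the function names. Note that the output is only cluster ids `0` and `1`; nothing ties either cluster to a gender.

```python
from collections import Counter


def vectorize(text):
    # Bag-of-words counts using a naive whitespace tokenizer.
    return Counter(text.lower().split())


def distance(a, b):
    # Squared Euclidean distance between two sparse word-count vectors.
    keys = set(a) | set(b)
    return sum((a.get(k, 0) - b.get(k, 0)) ** 2 for k in keys)


def mean(vectors):
    # Average word counts over a non-empty list of vectors.
    total = Counter()
    for v in vectors:
        total.update(v)
    return {k: c / len(vectors) for k, c in total.items()}


def two_means(texts, iterations=10):
    """Assign each text to cluster 0 or 1; the ids carry no meaning."""
    vectors = [vectorize(t) for t in texts]
    # Naive deterministic start: first and last text as initial centroids.
    centroids = [dict(vectors[0]), dict(vectors[-1])]
    assignments = []
    for _ in range(iterations):
        assignments = [
            min((0, 1), key=lambda c: distance(v, centroids[c]))
            for v in vectors
        ]
        for c in (0, 1):
            members = [v for v, a in zip(vectors, assignments) if a == c]
            if members:
                centroids[c] = mean(members)
    return assignments
```

On real user text the two clusters could just as easily split by topic, language, or writing style as by gender, which is the limitation described above.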

Telling the system which words are important and separating the users into groups based on that would just be a regular program, and it can be written in any programming language that can match text and produce an output.

pro·gram

noun 1. a planned series of future events, items, or performances.
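A "regular program" of this kind might look like the sketch below. The word lists come from the question (with "football game" reduced to "football" for single-word matching); the function name, the tie handling, and the `"unknown"` fallback are assumptions for illustration.

```python
import re

# Hand-picked keyword lists, taken from the examples in the question.
MALE_WORDS = {"beer", "football", "boxers"}
FEMALE_WORDS = {"facial", "makeup", "bra"}


def classify_by_rules(text):
    """Count keyword hits for each list and return the label with more hits."""
    words = re.findall(r"[a-z]+", text.lower())
    male_hits = sum(w in MALE_WORDS for w in words)
    female_hits = sum(w in FEMALE_WORDS for w in words)
    if male_hits > female_hits:
        return "male"
    if female_hits > male_hits:
        return "female"
    return "unknown"  # no hits or a tie: the rules cannot decide
```

Nothing in this function changes in response to data, which is why neither answer counts it as learning.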

DoubleDouble