15

I want to implement a Python-based semantic search over a set of keywords (mainly hobbies, latest news, and other topics people might be interested in talking about). Is there an ontology database for this, and are there open-source search algorithms or implementations I could use?

E.g., my set = {talking, drinking, tweeting, Katrina Kaif, Katrina cyclone, rock collecting, coin collecting}

So, searching for "accumulate" might return rock collecting and coin collecting as the output.

Edit: The terms can contain multiple words. That is, "President. Barack Obama of United States" is a valid query.

w2lame
  • Can you explain the input set and the output set in detail? – cola Jan 20 '12 at 19:23
  • @guru First we need to build a database of hobbies, topics, or anything else that people want to talk about. It would be good if the database updated itself, but users would be adding their own entries anyhow. Given this database, we want to implement a semantic search over its terms and return a list of users whose interests match the interest searched. – w2lame Jan 21 '12 at 10:24

2 Answers

4

You might want to use "random indexing". It fits what you describe: it computes a feature vector for each word and defines a metric of semantic similarity between two words.

All you need is a copy of An Introduction to Random Indexing and the semanticvectors package to get you started ...
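
To make the idea concrete, here is a minimal pure-Python sketch of word-context random indexing. It is not the semanticvectors API (that package is a separate tool); the names `make_index_vector`, `train`, and `cosine`, the constants `DIM`, `NONZERO`, and `WINDOW`, and the toy corpus are all assumptions made up for illustration. Each word gets a sparse random "index vector", and a word's "context vector" is the sum of the index vectors of its neighbours, so words used in similar contexts end up with similar vectors.

```python
import math
import random
from collections import defaultdict

DIM = 300      # vector dimensionality (assumed value for this sketch)
NONZERO = 10   # number of +1/-1 entries per index vector (assumed)
WINDOW = 2     # neighbours counted on each side of a word (assumed)


def make_index_vector(rng):
    """Sparse ternary vector: NONZERO entries are +1 or -1, the rest are 0."""
    vec = [0] * DIM
    for i, pos in enumerate(rng.sample(range(DIM), NONZERO)):
        vec[pos] = 1 if i % 2 == 0 else -1
    return vec


def train(sentences, seed=0):
    """Return {word: context vector} built from tokenized sentences."""
    rng = random.Random(seed)
    index = {}                               # word -> fixed random index vector
    context = defaultdict(lambda: [0] * DIM)  # word -> accumulated context vector
    for sentence in sentences:
        for word in sentence:
            if word not in index:
                index[word] = make_index_vector(rng)
        for i, word in enumerate(sentence):
            lo = max(0, i - WINDOW)
            hi = min(len(sentence), i + WINDOW + 1)
            for j in range(lo, hi):
                if j == i:
                    continue
                # Add the neighbour's index vector to this word's context vector.
                ctx = context[word]
                for k, value in enumerate(index[sentence[j]]):
                    ctx[k] += value
    return dict(context)


def cosine(u, v):
    """Cosine similarity; 0.0 if either vector is all zeros."""
    dot = sum(a * b for a, b in zip(u, v))
    norm_u = math.sqrt(sum(a * a for a in u))
    norm_v = math.sqrt(sum(b * b for b in v))
    return dot / (norm_u * norm_v) if norm_u and norm_v else 0.0


# Hypothetical toy corpus; a real system would need far more text.
corpus = [
    "i enjoy collecting rare coins".split(),
    "i enjoy accumulating rare coins".split(),
    "i enjoy collecting rare rocks".split(),
    "i enjoy accumulating rare rocks".split(),
    "she likes drinking hot tea".split(),
    "she likes drinking cold tea".split(),
]
vectors = train(corpus)
sim_related = cosine(vectors["collecting"], vectors["accumulating"])
sim_unrelated = cosine(vectors["collecting"], vectors["drinking"])
print(sim_related, sim_unrelated)
```

In this toy corpus "collecting" and "accumulating" share the same neighbours, so their similarity should come out much higher than that of "collecting" and "drinking". For multi-word terms such as "coin collecting", one simple option (again an assumption, not something the paper prescribes) is to sum the vectors of the individual words before comparing.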

I hope this helps. If you need further advice, please comment ...

tuxdna
the swine
1

I hope this is helpful to you, though I am not sure it is.

Gnowsys

  • still under heavy development
0xc0de