3

I've read a bunch of posts such as:

And I've gone through and found the double metaphone code.

But the double metaphone algorithm returns a tuple for a given string, and it assumes that the string is a single word. Does anyone know of a phonetic algorithm that works on names made up of multiple words? If not, is the best approach simply to write a script that counts how often each word occurs in this column, and then run double metaphone on whichever word in each cell occurs most frequently in my dataset?

user1590499
  • What do you mean by "work on multiple words in the same name"? Do you mean strings like "John Bob"? – Hans Then Sep 18 '12 at 14:01
  • Also, the answer will depend a lot on what you want to achieve. What do you want to do with the phonetic matches? – Hans Then Sep 18 '12 at 14:02
  • Exactly, although the real examples tend to be more like "Hamburger Hamlet" – user1590499 Sep 18 '12 at 14:03
  • I'm trying to create an error check to see whether I mapped IDs correctly to differently spelled names (differently spelled due to internationalized code, and differences in human input for the same entity). In summary, each name should have a unique primary key so that I can import it into SQL properly. – user1590499 Sep 18 '12 at 14:04
  • Metaphone will not generate unique keys. The codes are too short for that. It is used as an approximation of the pronunciation. You can use it as a sorting key, to find names that are pronounced similarly. Keep in mind that it is designed to work with proper names (i.e. human names), not with dictionary words such as "Burger King". – Hans Then Sep 18 '12 at 14:24
  • Could you please describe what you're actually trying to achieve, perhaps using an example? It's difficult to understand what your question is actually asking. – mpenkov Oct 03 '12 at 07:27

1 Answer

4

Actually, it isn't true that metaphone or double metaphone are designed to work only with proper names and not with "dictionary words". Metaphone, double metaphone, and metaphone 3 were all designed to work with both names and words, and were developed against databases containing both.
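
For the multi-word case raised in the question, one possible approach is to encode each word separately and join the per-word codes into a single key, so that "Hamburger Hamlet" and a misspelling of it land in the same group. The sketch below is only an illustration under assumptions: `phonetic_key` and `group_by_key` are hypothetical helper names, and a plain Soundex implementation stands in for the single-word encoder so the example runs without any third-party package. To use double metaphone instead, pass your own single-word function as `encode` (for example, one that returns the primary code of the tuple).

```python
import re
from collections import defaultdict

# Standard American Soundex letter-to-digit table.
_CODES = {}
for letters, digit in (("BFPV", "1"), ("CGJKQSXZ", "2"), ("DT", "3"),
                       ("L", "4"), ("MN", "5"), ("R", "6")):
    for ch in letters:
        _CODES[ch] = digit


def soundex(word):
    """Stand-in single-word encoder: classic Soundex, NOT double metaphone."""
    word = re.sub(r"[^A-Za-z]", "", word).upper()
    if not word:
        return ""
    result = word[0]
    prev = _CODES.get(word[0], "")
    for ch in word[1:]:
        code = _CODES.get(ch, "")
        if code and code != prev:
            result += code
        if ch not in "HW":  # H and W do not separate letters with the same code
            prev = code
    return (result + "000")[:4]  # pad or truncate to letter + three digits


def phonetic_key(name, encode=soundex):
    """Hypothetical helper: encode every word of a name and join the codes."""
    words = re.findall(r"[A-Za-z]+", name)
    return "-".join(encode(w) for w in words)


def group_by_key(names, encode=soundex):
    """Hypothetical helper: bucket names that share the same multi-word key."""
    groups = defaultdict(list)
    for name in names:
        groups[phonetic_key(name, encode)].append(name)
    return dict(groups)


if __name__ == "__main__":
    sample = ["Hamburger Hamlet", "Hamberger Hamlett", "Burger King"]
    for key, members in group_by_key(sample).items():
        print(key, members)
```

As noted in the comments, keys like these are not unique identifiers. They are only useful for flagging names that probably refer to the same entity, so that the ID mapping for those rows can be checked by hand.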