Coding words as products of integers

Question

I am trying to write a program that checks if smaller words are found within a larger word. For example, the word "computer" contains the words "put", "rum", "cut", etc. To perform the check I am trying to code each word as a product of prime numbers, that way the smaller words will all be factors of the larger word. I have a list of letters and a list of primes and have assigned (I think) an integer value to each letter:

letters = ['a', 'b', 'c', 'd', 'e', 'f', 'g', 'h', 'i', 'j', 'k', 'l', 'm', 
'n', 'o', 'p', 'q', 'r', 's', 't', 'u', 'v', 'w', 'x', 'y', 'z']
primes = [2, 3, 5, 7, 11, 13, 17, 19, 23, 29, 31, 37, 41, 43, 47, 53, 59, 
61, 67, 71, 73, 79, 83, 89, 97, 101]

index = 0
while index <= len(letters)-1:
    letters[index] = primes[index]
    index += 1

The problem I am having now is how to get the integer code for a given word and be able to create the codes for a whole list of words. For example, I want to be able to input the word "cab," and have the code generate its integer value of 5*2*3 = 30.

Any help would be much appreciated.

This approach to finding smaller words is incorrect; it does not take into account the ordering of the letters. — univerio, Feb 28 '14 at 00:14
@univerio: if you look at his examples ('rum' in 'computer') that doesn't matter; it looks like he is trying to build some sort of anagram solver. — Hugh Bothwell, Feb 28 '14 at 00:21

Hugh Bothwell · Answer 1 · 2014-02-28T02:45:11.447

from functools import reduce  # only needed for Python 3.x
from operator import mul

primes = [
    2,  3,  5,  7,  11, 13, 17, 19, 23, 29, 31, 37, 41,
    43, 47, 53, 59, 61, 67, 71, 73, 79, 83, 89, 97, 101
]
lookup = dict(zip("abcdefghijklmnopqrstuvwxyz", primes))

def encode(s):
    return reduce(mul, (lookup.get(ch, 1) for ch in s.lower()))

then

encode("cat")   # => 710
encode("act")   # => 710

Edit: more to the point,

def is_anagram(s1, s2):
    """
    s1 consists of the same letters as s2, rearranged
    """
    return encode(s1) == encode(s2)

def is_subset(s1, s2):
    """
    s1 consists of some letters from s2, rearranged
    """
    return encode(s2) % encode(s1) == 0

then

is_anagram("cat", "act")      # => True
is_subset("cat", "tactful")   # => True

Beat me by 13 seconds *and* included case-handling. – johnsyweb Feb 28 '14 at 00:28 — johnsyweb, Feb 28 '14 at 00:28

score 1 · Answer 2 · edited May 23 '17 at 11:57

I would use a dict here to look-up the prime for a given letter:

In [1]: letters = ['a', 'b', 'c', 'd', 'e', 'f', 'g', 'h', 'i', 'j', 'k', 'l', 'm', 
'n', 'o', 'p', 'q', 'r', 's', 't', 'u', 'v', 'w', 'x', 'y', 'z']

In [2]: primes = [2, 3, 5, 7, 11, 13, 17, 19, 23, 29, 31, 37, 41, 43, 47, 53, 59, 
61, 67, 71, 73, 79, 83, 89, 97, 101]

In [3]: lookup = dict(zip(letters, primes))

In [4]: lookup['a']
Out[4]: 2

This will let you easily determine the list of primes for a given word:

In [5]: [lookup[letter] for letter in "computer"]
Out[5]: [5, 47, 41, 53, 73, 71, 11, 61]

To find the product of those primes:

In [6]: import operator

In [7]: reduce(operator.mul, [lookup[letter] for letter in "cab"])
Out[7]: 30

MxLDevs · Accepted Answer · 2014-02-28T02:01:38.863

0

You've got your two lists set up, so now you just need to iterate over each character in a word and determine what value that letter gives you.

Something like

total = 1
for letter in word:
   index = letters.index(letter)
   total *= primes[index]

Or whichever operation you decide to use.

You would generalize that to a list of words.

edited Feb 28 '14 at 02:01

answered Feb 28 '14 at 00:12

MxLDevs

19,048
36
123
194

1

This creates a *sum*, not a product. It's also doing an O(n) lookup for each letter in the word. A `dict` approach would be more efficient. – johnsyweb Feb 28 '14 at 01:52
@Johnsyweb I wrote the answer based on existing code provided. If a hash were used, I would demonstrate how the hash would be used. Efficiency was not my focus but to understand the logic involved, as the question suggests (how should I continue vs how should I do this). Plus the title said sum and so I assumed the math in the question was just a typo. – MxLDevs Feb 28 '14 at 01:57

score 0 · Answer 4 · answered Feb 28 '14 at 00:35

Hmmmm... It isn't very clear how this code is supposed to run. If it is built to find words in the english dictionary, think about using PyEnchant, a module for checking if words are in the dictionary. Something you could try is this:

letters = ['a', 'b', 'c', 'd', 'e', 'f', 'g', 'h', 'i', 'j', 'k', 'l', 'm', 'n', 'o', 'p', 'q', 'r', 's', 't', 'u', 'v', 'w', 'x', 'y', 'z']
primes = [2, 3, 5, 7, 11, 13, 17, 19, 23, 29, 31, 37, 41, 43, 47, 53, 59, 61, 67, 71, 73, 79, 83, 89, 97, 101] 
word = raw_input('What is your word? ')
word = list(word)
total = 1
nums = []
for k in word:
    nums.append(primes[letters.index(k)])
for k in nums:
    total = total*k
print total

This will output as:

>>> What is your word? cat
710
>>>

This is correct, as 5*2*71 equals 710

Coding words as products of integers

4 Answers4