
I am trying to solve Problem 22 on Project Euler:

Using names.txt (right click and 'Save Link/Target As...'), a 46K text file containing over five-thousand first names, begin by sorting it into alphabetical order. Then working out the alphabetical value for each name, multiply this value by its alphabetical position in the list to obtain a name score.

For example, when the list is sorted into alphabetical order, COLIN, which is worth 3 + 15 + 12 + 9 + 14 = 53, is the 938th name in the list. So, COLIN would obtain a score of 938 × 53 = 49714.

What is the total of all the name scores in the file?

The contents of names.txt are

"MARY","PATRICIA","LINDA","BARBARA","ELIZABETH","JENNIFER","MARIA",....[46k omitted]

I don't understand why I get an answer that is incorrect when I use this code:

import os
chart=open('names.txt')
doc=chart.read()
doc=doc.split(',')
doc.sort()
z=0

def nameNum(name):
    r=0
    for letter in name:
        r=r+ord(letter) - 64
    return r

for string in doc:
    z+=(doc.index(string)+1)*nameNum(string)
print z

I was trying to make z produce the answer but it isn't correct and I can't figure out why.

This is python 3 by the way.

  • Questions seeking debugging help (**"why isn't this code working?"**) must include the desired behavior, *a specific problem or error* and *the shortest code necessary* to reproduce it **in the question itself**. Questions without **a clear problem statement** are not useful to other readers. See: [How to create a Minimal, Complete, and Verifiable Example](http://stackoverflow.com/help/mcve). – MattDMo Mar 19 '16 at 20:48
  • Also, this cannot be Python 3, since you use `print z`, which only works in Python 2. – Antti Haapala -- Слава Україні Mar 19 '16 at 21:00
  • In the answer I cannot post, I advised you to write a testable function such as `allsum` and then apply it to simple inputs, starting with `' "" '` and `' "A", '`, for which you can calculate the answer. Only try `allsum(open('names.txt').read())` after something like `allsum(''' "CD","A","AB" ''')` works. – Terry Jan Reedy Mar 19 '16 at 22:01

2 Answers


You're scoring the quote characters (") along with the names. The easiest fix is to remove them before splitting on commas:

...
doc=chart.read()
doc=doc.replace('"', '')
doc=doc.split(',') 
...

Before the fix, you get a negative score for COLIN:

Name: "COLIN", pos: 938, score: -6566

...but after the fix, the score matches the example:

Name: COLIN, pos: 938, score: 49714
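
The two scores above can be reproduced by hand. This is only a minimal sketch: `name_num` is a hypothetical rewrite of the question's `nameNum` with the same arithmetic, and the position 938 is taken from the problem statement rather than computed from the file.

```python
def name_num(name):
    # Same arithmetic as the question's nameNum: A=1, B=2, ... via ord() - 64.
    return sum(ord(ch) - 64 for ch in name)

# ord('"') is 34, so each quote character contributes 34 - 64 = -30.
print(name_num('COLIN'))          # 53
print(name_num('"COLIN"'))        # 53 - 60 = -7
print(938 * name_num('"COLIN"'))  # -6566
print(938 * name_num('COLIN'))    # 49714
```
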
Joachim Isaksson

There is nothing wrong with your actual algorithm. The problem is that when you split the file on commas, the " characters around each name remain, so each name's score is offset by 2 * (ord('"') - 64) = -60, which is why you get an incorrect result.

The file is in a format that could be easily parsed by ast.literal_eval() into a tuple of strings:

import ast
with open('names.txt') as f:
    contents = f.read()
    names = sorted(ast.literal_eval(contents))

Now names is a sorted list of names, without any extra characters. Using this list instead of your doc, I got the correct result from your algorithm.
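
To see what the parsing step does without needing the full file, here is a small sketch on a hypothetical three-name sample in the same format as names.txt. It assumes the input holds at least two names: a single quoted name would evaluate to a plain string rather than a tuple.

```python
import ast

# Hypothetical sample in the same format as names.txt (not the real file).
sample = '"MARY","PATRICIA","LINDA"'

# A comma-separated run of string literals is a valid Python tuple expression,
# so literal_eval returns a tuple of the names with the quotes already removed.
parsed = ast.literal_eval(sample)
names = sorted(parsed)

print(parsed)  # ('MARY', 'PATRICIA', 'LINDA')
print(names)   # ['LINDA', 'MARY', 'PATRICIA']
```
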


Also, while this does not affect the algorithm itself, using .index to find the position of each item is very inefficient, because it rescans the list from the start on every iteration:

for string in doc:
    z+=(doc.index(string)+1)*nameNum(string)

You should use enumerate instead:

for number, name in enumerate(names, start=1):
    z += number * nameNum(name)

With these changes I got the correct result.
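
Following the suggestion in the comments to wrap the logic in a testable function and try it on tiny inputs first, one possible sketch is shown here. The names `name_num` and `total_score` are hypothetical, and the quotes are stripped with `replace` rather than `ast.literal_eval` so that empty and single-name inputs also work.

```python
def name_num(name):
    # Alphabetical value of an uppercase name: A=1, B=2, ..., Z=26.
    return sum(ord(ch) - 64 for ch in name)

def total_score(text):
    # Drop surrounding whitespace and the quote characters, then split and sort.
    names = sorted(text.strip().replace('"', '').split(','))
    # enumerate(..., start=1) supplies the 1-based alphabetical position.
    return sum(pos * name_num(name) for pos, name in enumerate(names, start=1))

# Sorted order is A, AB, CD: 1*1 + 2*3 + 3*7 = 28
print(total_score('"CD","A","AB"'))  # 28
```

Once the small cases give the values you can compute by hand, the same function can be applied to the real file with `total_score(open('names.txt').read())`.
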

Community