
I'm trying to extract all the numbers from a text file using re.findall() and compute their sum with a for-loop. At first I was getting a list of lists, so I collected them in one overall list and converted each number to an integer. But the answer I get is different from the provided answer. Can someone take a look at my code and see if I overlooked anything? The text file is at: http://python-data.dr-chuck.net/regex_sum_376410.txt

handle = open('regex_sum.txt','r')
import re

lst = list()
for line in handle:
    b = re.findall('[0-9]+', line)
    if b:
        lst.append(b)

total = 0

for numbers in lst:
    for num in numbers:
       c = int(num)
       total = total + c
print total

P.S. I did figure out why I got a different answer. My code extracted a few irrelevant numbers at the beginning of the output ('0, 8, 4, 376410') that are not present in the text file. Does anybody know how to fix my code so these numbers don't appear again?

Barmar
Tokaalmighty

1 Answer


If the expected answer should end with 629, change append to extend: append adds the whole list returned by re.findall() as a single element, while extend adds each match individually. While you're extending, you might as well convert the matches to ints. Then you can use the built-in sum function.

import re

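lst = []
# the with-block closes the file automatically when the loop finishes
with open('regex_sum.txt', 'r') as handle:
    for line in handle:
        # raw string so that \d reaches the regex engine unchanged
        b = re.findall(r'\d+', line)
        if b:
            lst.extend([int(x) for x in b])
            # this also works if you want to use append
            #for x in b:
            #    lst.append(int(x))

print(sum(lst))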
# outputs: 437629
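
To make the append/extend difference concrete, here is a minimal sketch that runs without the data file. The `sample_lines` list is made up for illustration and only stands in for the contents of regex_sum.txt; the names `appended` and `extended` are likewise hypothetical.

```python
import re

# Hypothetical sample lines standing in for the contents of regex_sum.txt
sample_lines = [
    "Why should you learn to write programs? 7746",
    "12 1929 8827",
    "no digits on this line",
]

appended = []
extended = []
for line in sample_lines:
    found = re.findall(r"\d+", line)
    if found:
        # append adds the whole list of matches as a single element
        appended.append(found)
        # extend adds each converted number individually
        extended.extend(int(x) for x in found)

print(appended)       # nested: a list of lists of strings
print(extended)       # flat: a list of ints
print(sum(extended))  # sum() works directly on the flat list
```

With the nested list you would need two loops (or a flattening step) before summing, which is why the flat list built by extend is the more convenient shape here.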
depperm