  • I have a regex that looks for numbers in a file.
  • I put the results in a list.

The problem is that it prints each result on a new line for every number it finds. It also ignores the list I've created.

What I want is to have all the numbers in one list. I used join(), but it doesn't work.

Code:

def readfile():
    regex = re.compile('\d+')
    for num in regex.findall(open('/path/to/file').read()):
        lst = [num]
        jn = ''.join(lst)
        print(jn)

Output:

122
34
764
Patrick Artner
shimo

3 Answers

What goes wrong:

# this iterates the single numbers you find - one by one
for num in regex.findall(open('/path/to/file').read()):  
    lst = [num]                  # this puts one number back into a new list
    jn = ''.join(lst)            # this gets the number back out of the new list
    print(jn)                    # this prints one number
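
A minimal sketch of why the loop prints one number per line: joining a one-element list simply gives back that element, so the list never accumulates anything. The sample string is an assumption standing in for the file contents.

```python
import re

# Sample text standing in for the file contents (an assumption for illustration).
text = "abc 122 def 34 ghi 764"

matches = re.findall(r'\d+', text)

# Joining a single-element list returns that element unchanged,
# so wrapping each match in a new list and joining it changes nothing.
per_item = [''.join([num]) for num in matches]

print(matches)    # the list findall already built
print(per_item)   # identical to matches
```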

Fixing it:

The documentation for re.findall() shows that it already returns a list.

There is not much need for a for loop just to print it.

If you want a list, simply use re.findall()'s return value. If you want to print it, use one of the methods from "Printing an int list in a single line python3" (there are several more posts on SO about printing on one line):

import re

my_r = re.compile(r'\d+')                 # define pattern as raw-string

numbers = my_r.findall("123 456 789")     # get the list

print(numbers)

# different methods to print a list on one line
# adjust sep  / end to fit your needs
print( *numbers, sep=", ")                # print #1

for n in numbers[:-1]:                    # print #2
    print(n, end = ", ")
print(numbers[-1])

print(', '.join(numbers))                 # print #3

Output:

['123', '456', '789']   # list of found strings that are numbers
123, 456, 789
123, 456, 789
123, 456, 789
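
Applied to the file case from the question, the same idea might look like the sketch below. The function name read_numbers and the temporary sample file are hypothetical; they exist only so the example runs on its own.

```python
import os
import re
import tempfile

def read_numbers(path):
    """Return every run of digits in the file at `path` as a list of strings."""
    with open(path) as f:              # the with-block closes the file afterwards
        return re.findall(r'\d+', f.read())

# Hypothetical sample file, created only so the sketch is self-contained.
with tempfile.NamedTemporaryFile('w', suffix='.txt', delete=False) as tmp:
    tmp.write("a122\nb34\nc764\n")

numbers = read_numbers(tmp.name)
os.remove(tmp.name)                    # clean up the sample file

print(numbers)                         # the whole list at once
print(', '.join(numbers))              # all numbers on one line
```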


Patrick Artner

In your case, regex.findall() already returns a list, but on each iteration you wrap a single match in a new list, join it, and print it.

That is why you're seeing this problem.

You can try something like this.

numbers.txt

Xy10Ab
Tiger20
Beta30Man
56
My45one

statements:

>>> import re
>>>
>>> regex = re.compile(r'\d+')
>>> lst = []
>>>
>>> for num in regex.findall(open('numbers.txt').read()):
...     lst.append(num)
...
>>> lst
['10', '20', '30', '56', '45']
>>>
>>> jn = ''.join(lst)
>>>
>>> jn
'1020305645'
>>>
>>> jn2 = '\n'.join(lst)
>>> jn2
'10\n20\n30\n56\n45'
>>>
>>> print(jn2)
10
20
30
56
45
>>>
>>> nums = [int(n) for n in lst]
>>> nums
[10, 20, 30, 56, 45]
>>>
>>> sum(nums)
161
>>>
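
The append loop and the int conversion can also be collapsed into a single comprehension. A sketch, with the contents of numbers.txt inlined as a string (an assumption, so that it runs without the file):

```python
import re

# Contents of numbers.txt from above, inlined so the sketch is self-contained.
text = "Xy10Ab\nTiger20\nBeta30Man\n56\nMy45one\n"

# findall returns the matches as strings; convert each one to int directly.
nums = [int(m) for m in re.findall(r'\d+', text)]

print(nums)
print(sum(nums))
```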
hygull

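Use the list's built-in append() method to add new values.

import re

def readfile():
    regex = re.compile(r'\d+')    # raw string so \d is not treated as an escape
    lst = []                      # create the list once, before the loop

    for num in regex.findall(open('/path/to/file').read()):
        lst.append(num)           # collect every match in the same list

    print(lst)                    # print the whole list after the loop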
ThunderMind