If you have a tab-separated file like this:
1 2
2 4
4 8
you can do the following:
from functools import reduce

def is_int(s):
    try:
        int(s)
        return True
    except ValueError:
        return False

filename = 'path'

def sum_columns(filename):
    with open(filename) as f:
        lines = f.readlines()
    return sum([
        reduce(lambda x, y: x * y, map(int, line.split("\t")))
        for line in lines
        if len(list(filter(is_int, line.split("\t")))) == 2
    ])
Explanation:
At the top I define a helper function that determines whether a string can be converted into an int. This will be used later to ignore lines that don't have 2 numbers. It's based on this answer
def is_int(s):
    try:
        int(s)
        return True
    except ValueError:
        return False
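As a quick illustration of what the helper accepts, here is a small check on a few sample strings (the strings are made up for this example). Note that `int()` ignores surrounding whitespace, which is why the trailing newline on the last field of each line does not cause problems:

```python
def is_int(s):
    try:
        int(s)
        return True
    except ValueError:
        return False

# Sample inputs chosen for illustration only.
print(is_int("4"))     # True
print(is_int("8\n"))   # True: int() tolerates surrounding whitespace
print(is_int("-3"))    # True
print(is_int("1.5"))   # False: not an integer literal
print(is_int("abc"))   # False
print(is_int(""))      # False: an empty field is not a number
```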
Then we open the file and read all lines into a variable. This is not the most efficient approach, since the file could be processed line by line without storing the whole thing in memory. For smaller files, however, the difference is negligible.
with open(filename) as f:
    lines = f.readlines()
Next comes a single expression that does the actual calculation. Let's break it down:
First, we iterate through all the lines:
for line in lines
Next, we only keep the lines that have exactly two numbers separated by tabs:
if len(list(filter(is_int, line.split("\t")))) == 2
Finally, we turn each number in the line into an int and multiply them all together:
reduce(lambda x, y: x * y, map(int, line.split("\t")))
We then sum all of these products and return the result.
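To make the steps concrete, here is a sketch that runs the same expression on the sample data, held in a list instead of a file. It assumes the columns are separated by tabs; the extra non-numeric line is added only to show the filter at work:

```python
from functools import reduce

def is_int(s):
    try:
        int(s)
        return True
    except ValueError:
        return False

# The sample rows as readlines() would return them, plus one
# non-numeric line (added for illustration) that should be skipped.
lines = ["1\t2\n", "2\t4\n", "4\t8\n", "not a number\n"]

# One product per line that contains exactly two integers.
products = [
    reduce(lambda x, y: x * y, map(int, line.split("\t")))
    for line in lines
    if len(list(filter(is_int, line.split("\t")))) == 2
]

print(products)       # [2, 8, 32]
print(sum(products))  # 42
```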
If performance is a concern, you can achieve the same thing by reading the contents line by line instead of pulling the whole file into a variable. It is less elegant, but more efficient:
def sum_columns(filename):
    total = 0
    with open(filename) as f:
        for line in f:
            if len(list(filter(is_int, line.split("\t")))) != 2:
                continue
            total += reduce((lambda x, y: x * y), map(int, line.split("\t")))
    return total
(Note that you still need the import and the helper from the example above.)
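One caveat with the filter used here: a line such as `1<TAB>2<TAB>foo` contains exactly two integers, so it passes the check, but the later `int` conversion then raises `ValueError` on `foo`. Below is a minimal sketch of a stricter variant that skips such lines. The name `sum_row_products` is hypothetical, and `math.prod` assumes Python 3.8 or newer:

```python
import math
import os
import tempfile

def sum_row_products(filename):
    """Sum the per-line products, skipping lines that are not exactly two ints."""
    total = 0
    with open(filename) as f:
        for line in f:
            fields = line.split("\t")
            if len(fields) != 2:
                continue  # wrong number of columns
            try:
                numbers = [int(field) for field in fields]
            except ValueError:
                continue  # at least one field is not an integer
            total += math.prod(numbers)
    return total

# Usage on a throwaway file: three valid rows and two rows to be skipped.
with tempfile.NamedTemporaryFile("w", suffix=".tsv", delete=False) as tmp:
    tmp.write("1\t2\n2\t4\n4\t8\n1\t2\tfoo\nheader\n")
path = tmp.name

result = sum_row_products(path)
os.remove(path)
print(result)  # 42
```

Converting inside a `try` block also removes the need for the `is_int` helper and parses each field only once.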