1

I am trying to read the first and third columns from a text file and add them together.

The following code works and gives me the result I need, but is there a better, more Pythonic way to write it?

with open('random.txt', 'r') as fn:
    next(fn)
    numbers = fn.readlines()
    first_col = [int(x.split(',')[0]) for x in numbers]
    third_col = [int(y.split(',')[2]) for y in numbers]

    result = [v + z for v, z in zip(first_col, third_col)]

    print(result)

The input file, random.txt, contains arbitrary sample data:

col1,col2,col3
44,65,78
55,87,98
12,32,62

Result:

[122, 153, 74]
Tom Cider

7 Answers

6

If you can use NumPy, I suggest the loadtxt function:

import numpy as np
np.loadtxt('random.txt', dtype=int, skiprows=1, delimiter=',', usecols=(0, 2)).sum(axis=1).tolist()
AGN Gazer
4

You can use zip:

with open('random.txt', 'r') as fn:
    next(fn)  # skip the header row
    # Convert each line to ints, then transpose the rows into columns
    first_col, _, third_col = zip(*(map(int, line.split(',')) for line in fn))
    results = [x + y for x, y in zip(first_col, third_col)]

Or, if you do not need to keep the columns:

with open('random.txt', 'r') as fn:
    next(fn)  # skip the header row
    results = [
        x + y for x, _, y in (map(int, line.split(',')) for line in fn)
    ]
Netwave
  • 1
    This is the most elegant approach in pure Python and this answer should have been the accepted answer. – AGN Gazer May 02 '19 at 07:34
3

In addition to the answers provided here, you can use the csv module to process the file.

import csv
with open('random.txt', 'r') as fn:
    csv_reader = csv.reader(fn)
    next(csv_reader, None)  # skip the headers
    result = [int(f) + int(t) for f, _, t in csv_reader]
    print(result)

The easiest solution is to use pandas, if you are comfortable with it.

import pandas as pd
df = pd.read_csv('random.txt')
print(df.col1 + df.col3)

If you want the result as a list:

import pandas as pd
df = pd.read_csv('random.txt')
res = df.col1 + df.col3
print(res.tolist())
Unni
3

I would say the easiest way is to stick to the basics; there is no single correct Pythonic way. You can make your code as simple or as complex as you want.

import csv

res = []
with open('file.txt', 'r') as fp:
    # Wrap the file in a csv reader
    reader = csv.reader(fp)
    next(reader)  # skip the header row
    # Iterate through the rows and append the sum of the first and third columns to a list
    for row in reader:
        res.append(int(row[0]) + int(row[2]))

print(res)
#[122, 153, 74]
Devesh Kumar Singh
2
import sys
import csv

with open(sys.argv[1]) as fh:
    reader = csv.reader(fh)
    next(reader)  # skip the header row, which cannot be converted to int
    rows = [list(map(int, row)) for row in reader]
    sums = [v + z for v, _, z in rows]
    print(sums)  # [122, 153, 74]
FMc
2

Your code is "pythonic" enough, but you're doing more work and using more space than you need to.

with open('random.txt', 'r') as fn:
    next(fn) # skip the first row
    total = 0
    for row in fn:
        first_col, _, third_col = row.split(',')
        total += int(first_col) + int(third_col)

print(total)

You could tidy this up with a function:

def sum_row(row):
    first_col, _, third_col = row.split(',')
    return int(first_col) + int(third_col)

with open('random.txt', 'r') as fn:
    next(fn) # skip the first row
    result = sum(sum_row(row) for row in fn)

print(result)

If you need an industrial strength solution, i.e., other people are using this too and you might need to maintain it in the future, use csv.

import csv

def sum_row(row):
    return int(row[0]) + int(row[2])

with open('random.txt', 'r') as fn:
    reader = csv.reader(fn)
    next(reader)  # skip the header row
    result = sum(sum_row(row) for row in reader)
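
As a rough illustration of why csv can be the more robust choice, here is a minimal sketch comparing it with a plain split(','). The in-memory data and the quoted field containing a comma are assumptions made up for this example; they are not part of the question's file.

```python
import csv
import io

# Hypothetical data (not from the question): the second field is quoted
# and itself contains a comma.
sample = io.StringIO('col1,col2,col3\n44,"65,5",78\n55,87,98\n')

reader = csv.reader(sample)
next(reader)  # skip the header row
csv_sums = [int(row[0]) + int(row[2]) for row in reader]
print(csv_sums)  # [122, 153]

# A plain split(',') cuts the quoted field in two, so the row has four pieces.
naive_fields = '44,"65,5",78'.split(',')
print(len(naive_fields))  # 4
```

With the naive split, unpacking into three names would raise a ValueError on that row, while csv.reader keeps the quoted field together.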
munk
  • Excellent! What does '_' do in first_col, _, third_col? – Tom Cider May 01 '19 at 04:03
  • 1
    it's a variable name that's usually used to indicate you're ignoring the value. You could replace it with `first_col, second_col, third_col = row.split(',')`, but when someone goes to read your code or you use a linter, they'll ask why you didn't use second_col. See https://stackoverflow.com/questions/5893163/what-is-the-purpose-of-the-single-underscore-variable-in-python – munk May 01 '19 at 04:06
2

Another option is a one-line list comprehension that uses the higher-order function methodcaller to split each line of the file.

The list comprehension gets lines from the file and then the map function executes a split(",") method on each one to transform it to a list of columns.

from operator import methodcaller
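with open('random.txt', 'r') as f:
    next(f)  # skip the header row
    # Named `result` rather than `sum` to avoid shadowing the built-in sum()
    result = [int(c1) + int(c3) for c1, _, c3 in map(methodcaller("split", ","), f)]
print(result)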

An additional advantage is that we can turn it into a generator expression, so no intermediate list has to be held in memory.

from operator import methodcaller
with open('data','r') as f:
    next(f)
    v = ( int(c1)+int(c3) for c1,_,c3 in map(methodcaller("split", ","),f))
    print(list(v)) # just to print the result
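
The lazy behaviour of the generator can be sketched as follows. The io.StringIO object is an assumption standing in for the real file so that the snippet is self-contained; the data is the sample from the question.

```python
import io
from operator import methodcaller

# In-memory stand-in for the file (an assumption, so the snippet runs on its own).
f = io.StringIO("col1,col2,col3\n44,65,78\n55,87,98\n12,32,62\n")
next(f)  # skip the header row

# The generator yields one row sum at a time; no intermediate list is built.
row_sums = (int(c1) + int(c3) for c1, _, c3 in map(methodcaller("split", ","), f))

first = next(row_sums)      # only the first data row has been processed so far
remaining = list(row_sums)  # consume the rest
print(first, remaining)     # 122 [153, 74]
```

Note that a generator can be consumed only once, so it must be read while the file is still open.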
szmurlor