Even if you fix the .split
stuff in
bal_fwd, junk, sdc, junk2, est_read=str(luserfld10).split(' ')
it won't do what you want because it's assigning the results of the split to local names bal_fwd
, sdc
, etc, that only exist inside the csv_read
function, not to the names you defined outside the function in the global scope.
You could use global
statements to tell Python to assign those values to the global names, but it's generally best to avoid using the global statement unless you really need it. Also, merely using a global
statement won't put the string data into your bal_fwd
list. Instead, it will bind the global name to your string data and discard the list. If you want to put the string into the list you need to .append
it, like you did with unique_id
. You don't need global
for that, since you aren't performing an assignment, you're just modifying the existing list object.
Here's a repaired version of your code, tested with the data sample you posted.
import csv
unique_id = []
total_amt = []
luserfld10 = []
bal_fwd = []
sdc = []
est_read = []
def csv_read(file_obj):
for line in csv.DictReader(file_obj, delimiter=','):
unique_id.append(line["LUSERFLD4"])
total_amt.append(line["LAMOUNT1"])
fld10 = line["LUSERFLD10"]
luserfld10.append(fld10)
t = fld10.split(' | ')
bal_fwd.append(t[0])
sdc.append(t[1])
est_read.append(t[2])
if __name__=="__main__":
with open("UT_0004A493.csv") as f_obj:
csv_read(f_obj)
print('id', unique_id)
print('amt', total_amt)
print('fld10', luserfld10)
print('bal', bal_fwd)
print('sdc', sdc)
print('est_read', est_read)
output
id ['VAAA0022179', 'VAAA0022179']
amt ['-12847.28', '12847.28']
fld10 ['N | N | N', 'N | N | N']
bal ['N', 'N']
sdc ['N', 'N']
est_read ['N', 'N']
I should mention that using t = fld10.split(' | ')
is a bit fragile: if the separator isn't exactly ' | '
then the split will fail. So if there's a possibility that there might not be exactly one space either side of the pipe (|
) then you should use a variation of Jinje's suggestion:
t = fld10.replace('|', ' ').split()
This replaces all pipe chars with spaces, and then splits on runs of white space, so it's guaranteed to split the subields correctly, assuming there's at least one space or pipe between each subfield (Jinje's original suggestion will fail if both spaces are missing on either side of the pipe).
Breaking your data up into separate lists may not be a great strategy: you have to be careful to keep the lists synchronised, so it's tricky to sort them or to add or remove items. And it's tedious to manipulate all the data as a unit when you have it spread out over half a dozen named lists.
One option is to put your data into a dictionary of lists:
import csv
from pprint import pprint
def csv_read(file_obj):
data = {
'unique_id': [],
'total_amt': [],
'bal_fwd': [],
'sdc': [],
'est_read': [],
}
for line in csv.DictReader(file_obj, delimiter=','):
data['unique_id'].append(line["LUSERFLD4"])
data['total_amt'].append(line["LAMOUNT1"])
fld10 = line["LUSERFLD10"]
t = fld10.split(' | ')
data['bal_fwd'].append(t[0])
data['sdc'].append(t[1])
data['est_read'].append(t[2])
return data
if __name__=="__main__":
with open("UT_0004A493.csv") as f_obj:
data = csv_read(f_obj)
pprint(data)
output
{'bal_fwd': ['N', 'N'],
'est_read': ['N', 'N'],
'sdc': ['N', 'N'],
'total_amt': ['-12847.28', '12847.28'],
'unique_id': ['VAAA0022179', 'VAAA0022179']}
Note that csv_read
doesn't directly modify any global variables. It creates a dictionary of lists and passes it back to the code that calls it. This makes the code more modular; trying to debug large programs that use globals can become a nightmare because you have to keep track of every part of the program that modifies those globals.
Alternatively, you can put the data into a list of dictionaries, one per row.
def csv_read(file_obj):
data = []
for line in csv.DictReader(file_obj, delimiter=','):
luserfld10 = line["LUSERFLD10"]
bal_fwd, sdc, est_read = luserfld10.split(' | ')
# Put desired data and into a new dictionary
row = {
'unique_id': line["LUSERFLD4"],
'total_amt': line["LAMOUNT1"],
'bal_fwd': bal_fwd,
'sdc': sdc,
'est_read': est_read,
}
data.append(row)
return data
if __name__=="__main__":
with open("UT_0004A493.csv") as f_obj:
data = csv_read(f_obj)
pprint(data)
output
[{'bal_fwd': 'N',
'est_read': 'N',
'sdc': 'N',
'total_amt': '-12847.28',
'unique_id': 'VAAA0022179'},
{'bal_fwd': 'N',
'est_read': 'N',
'sdc': 'N',
'total_amt': '12847.28',
'unique_id': 'VAAA0022179'}]