
I am trying to split a list that I have converted with str(), but I don't seem to be getting any results.

My code is as follows:

import csv

def csv_read(file_obj):
    reader=csv.DictReader(file_obj,delimiter=',')
    for line in reader:
        unique_id.append(line["LUSERFLD4"])
        total_amt.append(line["LAMOUNT1"])
        luserfld10.append(line["LUSERFLD10"])       
        break

    bal_fwd, junk, sdc, junk2, est_read=str(luserfld10).split(' ')

if __name__=="__main__":
    with open("UT_0004A493.csv") as f_obj:
        csv_read(f_obj)

print (luserfld10)
print (bal_fwd)
print (sdc)
print (est_read)

print (luserfld10) returns ['N | N | Y'] which is correct. (Due to system limitations when creating the csv file, this field holds three separate values)
All variables have been defined and I'm not getting any errors, but my last three print commands are returning empty lists?

I've tried dedenting the .split() line, but then I can unpack only one value.

How do I get them to each return N or Y?

Why isn't it working as it is?

I'm sure it's obvious, but this is my first week of coding and I haven't been able to find the answer anywhere here. Any help (with explanations please) would be appreciated :)

Edit: all defined variables are as follows:

luserfld10=[]
bal_fwd=[]
sdc=[]
est_read=[]

etc.

I'm not certain how to show the file contents. I hope this is okay:

LACCNBR,LAMOUNT1,LUSERFLD4,LUSERFLD5,LUSERFLD6,LUSERFLD8,LUSERFLD9,LUSERFLD10
1290,-12847.28,VAAA0022179,84889.363,Off Peak - nil,5524.11,,N | N | N
2540255724,12847.28,VAAA0022179,84889.363,Off Peak - nil,5524.11,,N | N | N
SparkAndShine
  • Unless you provide the file contents and the definitions for the variables, nobody can reproduce this. If I were you I'd edit this and provide these things. – Dimitris Fasarakis Hilliard Nov 25 '16 at 10:50
  • Looks like you are splitting a list instead of a string; try `luserfld10[0].split()` – Skycc Nov 25 '16 at 10:50
  • Is the line you are trying to split "N|N|Y"? If so, why do you split on ','? You should split on '|' – Ada Borowa Nov 25 '16 at 11:02
  • JimFasarakis-Hilliard - More information added. I hope that's okay? Skycc - I've tried that now, but am getting the same result. AdaBorowa - The string has spaces before and after the '|'. I tried using ' | ' but that didn't work either.. – SparkleBerries Nov 25 '16 at 11:09
  • Use globals inside of your method. – Lafexlos Nov 25 '16 at 11:13
  • @Lafexlos It worked! Thank you so much!! I'm not sure what you mean by local vs global variables though? – SparkleBerries Nov 25 '16 at 11:14
  • http://stackoverflow.com/questions/855493/referenced-before-assignment-error-in-python Check this one please. It explains way better than I can for sure. :) – Lafexlos Nov 25 '16 at 11:20
  • Your `csv_read` function doesn't have a `return` statement, so it returns `None`. Why do you have a `break` statement in the `for` loop? Do you only want to read the first line of data from the CSV file? Why do you want to put the data into separate lists: a single list of dictionaries (with each dictionary containing the data you want to keep from each row of the CSV) would be more useful, IMHO. – PM 2Ring Nov 25 '16 at 11:45
  • I think you will find these articles helpful: [Other languages have "variables", Python has "names"](http://python.net/~goodger/projects/pycon/2007/idiomatic/handout.html#other-languages-have-variables) and [Facts and myths about Python names and values](http://nedbatchelder.com/text/names.html). – PM 2Ring Nov 25 '16 at 11:49

2 Answers


If luserfld10 is ['N | N | Y'], then index the list to get the string and split that:

luserfld10[0].replace('|', '').split()

Result:

['N', 'N', 'Y']
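
To see why indexing the list first matters, here is a small sketch comparing str() applied to the whole list with the string stored inside it. It uses only the sample value from the question; the names as_text, pieces and clean are just illustrative.

```python
luserfld10 = ['N | N | Y']

# str() on a list gives its printed form, brackets and quotes included.
as_text = str(luserfld10)
pieces = as_text.split(' ')

# Indexing the list gives the actual string that was read from the CSV.
clean = luserfld10[0].replace('|', '').split()

print(pieces)
print(clean)
```

The first split happens to yield five pieces, which is probably why the five-name unpacking in the question raised no error, but the first and last pieces still carry the bracket and quote characters.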
wolendranh
kkiat

Even if you fix the .split stuff in

bal_fwd, junk, sdc, junk2, est_read=str(luserfld10).split(' ')

it won't do what you want because it's assigning the results of the split to local names bal_fwd, sdc, etc, that only exist inside the csv_read function, not to the names you defined outside the function in the global scope.
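
A minimal sketch of that scoping behaviour, stripped of the CSV code. The function name split_field is made up for the illustration; the field value is the one from the question.

```python
bal_fwd = []  # global name, bound to an empty list


def split_field(field):
    # Assigning here creates a separate local name called bal_fwd.
    # The global bal_fwd defined above is not touched.
    bal_fwd = field.split(' | ')[0]
    return bal_fwd


local_value = split_field('N | N | Y')
print(local_value)  # what the function saw
print(bal_fwd)      # the global list, still empty
```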

You could use global statements to tell Python to assign those values to the global names, but it's generally best to avoid using the global statement unless you really need it. Also, merely using a global statement won't put the string data into your bal_fwd list. Instead, it will bind the global name to your string data and discard the list. If you want to put the string into the list you need to .append it, like you did with unique_id. You don't need global for that, since you aren't performing an assignment, you're just modifying the existing list object.
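
The difference between rebinding a name and modifying a list can be sketched like this. The helper names rebind and append_to are hypothetical and exist only for the comparison.

```python
sdc = []
est_read = []


def rebind(value):
    global sdc
    # Assignment rebinds the global name to the string;
    # the original list object is discarded.
    sdc = value


def append_to(value):
    # No assignment to the name, so no global statement is needed;
    # this modifies the existing list in place.
    est_read.append(value)


rebind('N')
append_to('Y')
print(sdc)       # now a plain string
print(est_read)  # still a list, with one item
```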

Here's a repaired version of your code, tested with the data sample you posted.

import csv

unique_id = []
total_amt = []
luserfld10 = []
bal_fwd = []
sdc = []
est_read = []

def csv_read(file_obj):
    for line in csv.DictReader(file_obj, delimiter=','):
        unique_id.append(line["LUSERFLD4"])
        total_amt.append(line["LAMOUNT1"])
        fld10 = line["LUSERFLD10"]
        luserfld10.append(fld10)

        t = fld10.split(' | ')
        bal_fwd.append(t[0])
        sdc.append(t[1])
        est_read.append(t[2])

if __name__=="__main__":
    with open("UT_0004A493.csv") as f_obj:
        csv_read(f_obj)

    print('id', unique_id)
    print('amt', total_amt)
    print('fld10', luserfld10)
    print('bal', bal_fwd)
    print('sdc', sdc)
    print('est_read', est_read)

output

id ['VAAA0022179', 'VAAA0022179']
amt ['-12847.28', '12847.28']
fld10 ['N | N | N', 'N | N | N']
bal ['N', 'N']
sdc ['N', 'N']
est_read ['N', 'N']

I should mention that using t = fld10.split(' | ') is a bit fragile: if the separator isn't exactly ' | ' then the split won't yield the three subfields, and indexing t[1] or t[2] may raise an IndexError. So if there's a possibility that there might not be exactly one space either side of the pipe (|) then you should use a variation of Jinje's suggestion:

t = fld10.replace('|', ' ').split()

This replaces all pipe chars with spaces, and then splits on runs of whitespace, so it splits the subfields correctly as long as there's at least one space or pipe between each subfield. (Jinje's original suggestion, which removes the pipes instead, fails when a pipe has no space on either side of it.)
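
As a rough comparison of the two approaches, the sketch below runs both on a few inputs. Only the first sample comes from the question's data format; the other two are invented to show irregular spacing, and the function names are illustrative.

```python
def split_exact(field):
    # Works only when the separator is exactly ' | '.
    return field.split(' | ')


def split_loose(field):
    # Turn pipes into spaces, then split on runs of whitespace.
    return field.replace('|', ' ').split()


samples = ['N | N | Y', 'N|N|Y', 'N |  N|Y']
for s in samples:
    print(repr(s), split_exact(s), split_loose(s))
```

One caveat with the looser version: it assumes the subfields themselves never contain spaces, which holds for single-letter flags like these but might not for other columns.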


Breaking your data up into separate lists may not be a great strategy: you have to be careful to keep the lists synchronised, so it's tricky to sort them or to add or remove items. And it's tedious to manipulate all the data as a unit when you have it spread out over half a dozen named lists.
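
To illustrate the synchronisation problem, here is a small sketch with two parallel lists. The values are invented for the example and are not from the CSV in the question.

```python
# Two parallel lists: item i of each list belongs to the same row.
unique_id = ['B2', 'A1']
total_amt = ['12.50', '-3.00']

# Calling unique_id.sort() on its own would break the pairing.
# To sort safely, the lists have to be zipped, sorted, and unzipped.
pairs = sorted(zip(unique_id, total_amt))
unique_id, total_amt = (list(col) for col in zip(*pairs))

print(unique_id)
print(total_amt)
```

Every extra list adds another argument to both zip calls, which is part of what makes this layout tedious.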

One option is to put your data into a dictionary of lists:

import csv
from pprint import pprint

def csv_read(file_obj):
    data = {
        'unique_id': [],
        'total_amt': [],
        'bal_fwd': [],
        'sdc': [],
        'est_read': [],
    }

    for line in csv.DictReader(file_obj, delimiter=','):
        data['unique_id'].append(line["LUSERFLD4"])
        data['total_amt'].append(line["LAMOUNT1"])
        fld10 = line["LUSERFLD10"]

        t = fld10.split(' | ')
        data['bal_fwd'].append(t[0])
        data['sdc'].append(t[1])
        data['est_read'].append(t[2])

    return data

if __name__=="__main__":
    with open("UT_0004A493.csv") as f_obj:
        data = csv_read(f_obj)

    pprint(data)

output

{'bal_fwd': ['N', 'N'],
 'est_read': ['N', 'N'],
 'sdc': ['N', 'N'],
 'total_amt': ['-12847.28', '12847.28'],
 'unique_id': ['VAAA0022179', 'VAAA0022179']}

Note that csv_read doesn't directly modify any global variables. It creates a dictionary of lists and passes it back to the code that calls it. This makes the code more modular; trying to debug large programs that use globals can become a nightmare because you have to keep track of every part of the program that modifies those globals.
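
One practical benefit of returning the data is that the function can be exercised without a file on disk. The sketch below is a trimmed-down variant of csv_read (it keeps only two of the fields) and feeds it the sample rows from the question through io.StringIO; the SAMPLE constant is an assumption made for the illustration.

```python
import csv
import io

# Sample rows copied from the question, held in memory instead of a file.
SAMPLE = """LACCNBR,LAMOUNT1,LUSERFLD4,LUSERFLD5,LUSERFLD6,LUSERFLD8,LUSERFLD9,LUSERFLD10
1290,-12847.28,VAAA0022179,84889.363,Off Peak - nil,5524.11,,N | N | N
2540255724,12847.28,VAAA0022179,84889.363,Off Peak - nil,5524.11,,N | N | N
"""


def csv_read(file_obj):
    # Trimmed variant: collects only the id and the third subfield.
    data = {'unique_id': [], 'est_read': []}
    for line in csv.DictReader(file_obj):
        data['unique_id'].append(line['LUSERFLD4'])
        data['est_read'].append(line['LUSERFLD10'].split(' | ')[2])
    return data


result = csv_read(io.StringIO(SAMPLE))
print(result)
```

Because nothing outside the function is modified, calling it twice on different inputs gives two independent results.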


Alternatively, you can put the data into a list of dictionaries, one per row.

import csv
from pprint import pprint

def csv_read(file_obj):
    data = []
    for line in csv.DictReader(file_obj, delimiter=','):
        luserfld10 = line["LUSERFLD10"]
        bal_fwd, sdc, est_read = luserfld10.split(' | ')
        # Put the desired data into a new dictionary
        row = {
            'unique_id': line["LUSERFLD4"],
            'total_amt': line["LAMOUNT1"],
            'bal_fwd': bal_fwd,
            'sdc': sdc,
            'est_read': est_read, 
        }
        data.append(row)
    return data

if __name__=="__main__":
    with open("UT_0004A493.csv") as f_obj:
        data = csv_read(f_obj)

    pprint(data)

output

[{'bal_fwd': 'N',
  'est_read': 'N',
  'sdc': 'N',
  'total_amt': '-12847.28',
  'unique_id': 'VAAA0022179'},
 {'bal_fwd': 'N',
  'est_read': 'N',
  'sdc': 'N',
  'total_amt': '12847.28',
  'unique_id': 'VAAA0022179'}]
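
With one dictionary per row, sorting and filtering keep each row's fields together automatically. A short sketch, using rows shaped like the output above (the bal_fwd and sdc keys are left out for brevity); treating 'Y' in est_read as the value of interest is only an assumption for the example.

```python
rows = [
    {'unique_id': 'VAAA0022179', 'total_amt': '-12847.28', 'est_read': 'N'},
    {'unique_id': 'VAAA0022179', 'total_amt': '12847.28', 'est_read': 'N'},
]

# Sort whole rows by amount; the csv module reads every field as a string,
# so convert to float for a numeric comparison.
by_amount = sorted(rows, key=lambda r: float(r['total_amt']), reverse=True)

# Keep only the rows flagged 'Y' in est_read (none in this sample).
flagged = [r for r in rows if r['est_read'] == 'Y']

# Sum a column across all rows.
total = sum(float(r['total_amt']) for r in rows)

print(by_amount[0]['total_amt'])
print(flagged)
print(total)
```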
PM 2Ring