CSV Parsing, trying to understand some code

Question

Here's the code

import csv


def csv_dict_reader(file_obj):
    """
    read a CSV file using csv.DictReader
    """

    reader = csv.DictReader(file_obj, delimiter=',')
    for line in reader:
        print(line['first_name']),
        print(line['last_name']),

if __name__ == "__main__":
    with open("dummy.csv") as f_obj:
        csv_dict_reader(f_obj)

I wanted to try and do a quick breakdown, to see if I understand how exactly this works. Here we go:

1) import csv brings in the csv method

2) We define a function, which takes 'file_obj' as its argument

3) the reader variable makes a call to a function within csv called "DictReader", which subsequently takes arguments from 'file_obj' and specifies a 'delimiter'

4) I get confused with this for loop, why is it that we don't have to define line beforehand? Is it that line is already defined as part of 'reader'?

5) I'm really confused when it comes to '__name__' and '__main__', are these somehow related to how we specify a 'file_obj'? I'm equally confused with how we end up specifying the 'file_obj' in the end; I've been assuming 'f_obj' somehow manages to fill this role.

--edit--

Awesome, this is starting to make a whole lot more sense to me. So, when I make a 'class' call to DictReader(), I'm creating an instance of it in the variable 'reader'?

Maybe I'm going too far off the beaten path, but what in the DictReader() class allows for it to determine the structure of fields like 'last_name' or 'first_name'? I'm assuming it has something to do with how CSV files are structured, but I'm not entirely certain.

I think this post will help you http://stackoverflow.com/questions/419163/what-does-if-name-main-do — Hamed Moghaddam, Dec 03 '14 at 02:41

abarnert · Accepted Answer · 2014-12-03T02:52:38.357

1) import csv brings in the csv method

Well, not quite; it brings in the csv module.*

* … which includes the csv.DictReader class, which has a csv.DictReader.__next__ method that you call implicitly, but that's not important here.
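If you want to see the module/class distinction for yourself, here is a minimal sketch using the standard inspect module. It only inspects objects and reads no files.

```python
import csv
import inspect

# `csv` itself is a module object, not a function or a method.
print(type(csv).__name__)                   # module
print(inspect.ismodule(csv))                # True

# DictReader is one of the names defined inside that module, and it is a class.
print(inspect.isclass(csv.DictReader))      # True

# The class defines __next__, which a for loop ends up calling for you.
print(hasattr(csv.DictReader, "__next__"))  # True
```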

2) We define a function, which takes 'file_obj' as its argument

Exactly.*

* Technically, there's a distinction between arguments and parameters, or between actual vs. formal arguments/parameters. You probably don't want to learn that yet. But if you do, formal parameters go in function definitions; actual arguments go in function calls.
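If you do want to see the parameter/argument distinction, here is a tiny sketch. The function greet and the variable person are made up for illustration; they are not part of the code in the question.

```python
def greet(name):          # `name` is a (formal) parameter, named in the definition
    return "Hello, " + name


person = "Ada"
message = greet(person)   # the value of `person` is the (actual) argument in the call
print(message)            # Hello, Ada
```

In the question's code, file_obj plays the role of name, and f_obj plays the role of person.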

3) the reader variable makes a call to a function within csv called "DictReader", which subsequently takes arguments from 'file_obj' and specifies a 'delimiter'

Again, not quite; it makes a call to the class DictReader. Calling a class constructs an instance of that class. Arguments are passed the same way as in a function call.* You can see the parameters that DictReader takes by looking it up in the help.

* In fact, constructing a class actually calls the class's __new__ method, and then (usually) its __init__ method. But that's only important when you're writing new classes; when you're just using classes, you don't care about __new__ or __init__. That's why the documentation shows, e.g., class csv.DictReader(csvfile, fieldnames=None, restkey=None, restval=None, dialect='excel', *args, **kwds).
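Here is a small sketch of calling the class to get an instance. To keep it self-contained it uses io.StringIO with made-up sample rows in place of dummy.csv, so the data is an assumption. It also shows where the keys 'first_name' and 'last_name' come from: when fieldnames is left at its default of None, DictReader takes them from the first row of the file.

```python
import csv
import io

# Hypothetical sample data standing in for dummy.csv.
sample = io.StringIO("first_name,last_name\nAda,Lovelace\nAlan,Turing\n")

# Calling the class builds an instance; arguments work as in a function call.
reader = csv.DictReader(sample, delimiter=",")

print(isinstance(reader, csv.DictReader))  # True

# No fieldnames were passed, so the header row supplies the dictionary keys.
print(reader.fieldnames)                   # ['first_name', 'last_name']
```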

4) I get confused with this for loop, why is it that we don't have to define line beforehand? Is it that line is already defined as part of 'reader'?

No, that's exactly what for statements do: each time through the loop, the next value in reader gets assigned to line. The tutorial explains in more detail.

A simpler example may help:

for a in [1, 2, 3]:
    print(a)

This assigns 1 to a, prints out that 1, then assigns 2 to a, prints out that 2, then assigns 3 to a, prints out that 3, then it's done.

Also, you may be confused by other languages, which need variables to be declared before they can be used. Python doesn't do that; you can assign to any name you want anywhere you want, and if there wasn't a variable with that name, there is now.
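To make the assignment visible, here is roughly what the for loop over reader does, written out by hand with iter() and next(). This is a sketch of the behavior, not how you would normally write it, and the sample rows are made up.

```python
import csv
import io

# Hypothetical sample data standing in for dummy.csv.
sample = io.StringIO("first_name,last_name\nAda,Lovelace\nAlan,Turing\n")
reader = csv.DictReader(sample)

names = []
iterator = iter(reader)
while True:
    try:
        # This assignment is what creates (or rebinds) the name `line`.
        line = next(iterator)
    except StopIteration:
        # No rows left, so the loop ends.
        break
    names.append(line["first_name"] + " " + line["last_name"])

print(names)              # ['Ada Lovelace', 'Alan Turing']

# `line` is an ordinary variable, so it still holds the last row afterwards.
print(line["last_name"])  # Turing
```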

5) I'm really confused when it comes to '__name__' and '__main__'

This is a special case where you have to learn something reasonably advanced a little early.

The same source code file can be used as a script, to run on the command line, and also as a module, to be imported by other code. The way you distinguish between the two is by checking __name__. If you're being run as a script, it will be '__main__'. If you're being used as a module by some other script, it will be whatever the name of your module is.

So, idiomatically, you define all your public classes and functions and constants that might be useful to someone else, then you do if __name__ == '__main__': and put all the "top-level script" code there that you want to execute if someone runs you as a script.

Again, the tutorial explains in more detail.
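A minimal sketch of the idiom, in case it helps. The names describe_run_mode and main are made up for this example. Save it as a file and run it directly, and the guarded block executes; import it from another file, and it does not.

```python
def describe_run_mode(module_name):
    """Return how a file with this __name__ value is being used."""
    if module_name == "__main__":
        return "script"
    return "module imported as " + module_name


def main():
    # The "top-level script" code lives here.
    print("running as a " + describe_run_mode(__name__))


# Python sets __name__ for every module; it is "__main__" only for the
# file that was started directly.
if __name__ == "__main__":
    main()
```

Note that none of this has anything to do with file_obj. The guard only decides whether the with open(...) block runs at all; f_obj is then passed to csv_dict_reader as the argument for its file_obj parameter.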

Looking for tutorial links, I just realized the official tutorial doesn't explain how to use stdlib and other modules until after it's explained how to create your own modules, run them as scripts, install them on `sys.path`, and find `.pyc` files in the cache… That seems a little bit out of order… — abarnert, Dec 03 '14 at 02:54
