5

I'm trying to iterate over each row in a list of lists, append an element from each row to a new list, then find the unique elements in the new list.

I understand that I can do this easily with a for loop. I'm trying a different route because I want to learn more about classes and functions.

Here's an example of the list of lists. The first row is the header:

legislators = [
 ['last_name', 'first_name', 'birthday', 'gender', 'type', 'state', 'party'],
 ['Bassett', 'Richard', '1745-04-02', 'M', 'sen', 'DE', 'Anti-Administration'],
 ['Bland', 'Theodorick', '1742-03-21', '', 'rep', 'VA', ''],
 ['Burke', 'Aedanus', '1743-06-16', '', 'rep', 'SC', ''],
 ['Carroll', 'Daniel', '1730-07-22', 'M', 'rep', 'MD', ''],
 ['Clymer', 'George', '1739-03-16', 'M', 'rep', 'PA', ''],
 ['Contee', 'Benjamin', '', 'M', 'rep', 'MD', ''],...]

Here's my code:

import csv
f = open("legislators.csv")
csvreader = csv.reader(f)
legislators = list(csvreader)

class Dataset:
    def __init__(self, data):
        self.header = data[0] #Isolate header from CSV file
        self.data = data[1:] #Subset CSV data to remove header

legislators_dataset = Dataset(legislators)

def the_set_maker(dataset):
    gender = []
    for each in dataset:
        gender.append(each[3])
    return set(gender)

t=the_set_maker(legislators_dataset)
print(t)

I get the following error:

TypeError                                 Traceback (most recent call last)
<ipython-input-1-d65cb459931b> in <module>()
     20     return set(gender)
     21
---> 22 t=the_set_maker(legislators_dataset)
     23 print(t)

<ipython-input-1-d65cb459931b> in the_set_maker(dataset)
     16 def the_set_maker(dataset):
     17     gender = []
---> 18     for each in dataset:
     19         gender.append(each[3])
     20     return set(gender)

TypeError: 'Dataset' object is not iterable

I think the answer is to try to create a method using def __iter__(self) in my Dataset class, but I haven't been able to get this to work. Is this the right track? If not, what's a better one?

martineau
AwfulPersimmon
  • To make an object *iterable*, it needs to implement `__iter__`, which must return an *iterator*, i.e., an object that implements `__iter__` **and** `__next__`. Iterator `__iter__` methods should simply return `self`. – juanpa.arrivillaga Aug 15 '17 at 19:30
  • @juanpa.arrivillaga Thank you. I'll look into __next__. Can you demonstrate how to use __next__ and __iter__ in my code? – AwfulPersimmon Aug 15 '17 at 19:31
  • I'm not convinced this is a duplicate of that. You answered "How do I make my class iterable" with "what is an iterable," which requires at least one or two logical jumps to implement. I've reopened. – Adam Smith Aug 15 '17 at 19:32
  • You should be able to get the behavior you're looking for by defining your `__iter__` method as **return iter(self.data)**. Alternatively, take a look at making a Pandas DataFrame out of your CSV file (or out of the list objects), and reference the column by name instead – scnerd Aug 15 '17 at 19:33
  • What's more, with the detail in the question, I feel like this can generate a useful answer. – Adam Smith Aug 15 '17 at 19:33
  • @AdamSmith check out the second dupe target, "Build a basic Python iterator". – juanpa.arrivillaga Aug 15 '17 at 19:33
  • @TroubleZero uh, that is totally wrong. All built-in container types are iterable in Python - heck, even many non-container types are iterable (e.g. file objects). Also, your class shouldn't "return data[1:]", whatever that means. Rather, you *should make your class iterable*, which might involve simply delegating to `iter(self.data)` – juanpa.arrivillaga Aug 15 '17 at 19:39
  • Just an aside, you can do `self.header, *self.data = data`... – Jon Clements Aug 15 '17 at 19:40
  • @JonClements That's very cool, I never knew you could do that in an assignment. Great suggestion – scnerd Aug 15 '17 at 19:41
  • As a minor suggestion, you can also improve your set_maker function using a set comprehension: `return {each[3] for each in dataset}` – scnerd Aug 15 '17 at 19:48
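
To illustrate the iterable versus iterator distinction drawn in the comments above, here is a minimal sketch, assuming Python 3. `DatasetIterator` is a hypothetical helper name that does not appear in the question, and the sample rows are a shortened copy of the data shown earlier.

```python
class DatasetIterator:
    """Iterator: implements both __iter__ and __next__ (hypothetical helper)."""

    def __init__(self, rows):
        self._rows = rows
        self._index = 0

    def __iter__(self):
        # Iterators simply return themselves from __iter__.
        return self

    def __next__(self):
        if self._index >= len(self._rows):
            raise StopIteration
        row = self._rows[self._index]
        self._index += 1
        return row


class Dataset:
    """Iterable: __iter__ hands out a fresh iterator on every call."""

    def __init__(self, data):
        self.header = data[0]  # first row is the header
        self.data = data[1:]   # remaining rows

    def __iter__(self):
        return DatasetIterator(self.data)


sample = [
    ['last_name', 'first_name', 'birthday', 'gender', 'type', 'state', 'party'],
    ['Bassett', 'Richard', '1745-04-02', 'M', 'sen', 'DE', 'Anti-Administration'],
    ['Bland', 'Theodorick', '1742-03-21', '', 'rep', 'VA', ''],
]
dataset = Dataset(sample)
genders = {row[3] for row in dataset}  # the for clause calls iter(dataset)
print(genders)
```

In practice the hand-written iterator class is rarely needed, because delegating to `iter(self.data)` gives the same behavior; it is spelled out here only to show where `__next__` fits.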

3 Answers

5

According to the documentation for __iter__:

This method should return a new iterator object that can iterate over all the objects in the container.

You might try the following class definition:

class Dataset:
    def __init__(self, data):
        self.header = data[0] #Isolate header from CSV file
        self.data = data[1:] #Subset CSV data to remove header

    def __iter__(self):
        return iter(self.data)
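
Put together with the function from the question, a self-contained sketch might look like the following. The inline `csv_text` string stands in for `legislators.csv` (an assumption, so that the example runs without the file), and the set comprehension follows the suggestion made in the comments on the question.

```python
import csv
import io

# Stand-in for the real file; only a few rows from the question are included.
csv_text = """last_name,first_name,birthday,gender,type,state,party
Bassett,Richard,1745-04-02,M,sen,DE,Anti-Administration
Bland,Theodorick,1742-03-21,,rep,VA,
Carroll,Daniel,1730-07-22,M,rep,MD,
"""
legislators = list(csv.reader(io.StringIO(csv_text)))


class Dataset:
    def __init__(self, data):
        self.header = data[0]  # Isolate header from CSV data
        self.data = data[1:]   # Subset CSV data to remove header

    def __iter__(self):
        # Delegate to the list's own iterator.
        return iter(self.data)


def the_set_maker(dataset):
    # Index 3 is the 'gender' column in this data.
    return {each[3] for each in dataset}


legislators_dataset = Dataset(legislators)
t = the_set_maker(legislators_dataset)
print(t)
```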

If you're open to trying new options, consider using Pandas:

import pandas as pd
df = pd.read_csv('legislators.csv')
t = df['gender'].unique()  # unique values in the 'gender' column

Or, if you really want to read in the CSV yourself,

df = pd.DataFrame(legislators[1:], columns=legislators[0])
martineau
scnerd
  • The `pandas` seems irrelevant. Regardless, the `csv` module is just fine... – juanpa.arrivillaga Aug 15 '17 at 19:41
  • True, it's irrelevant to learning how to make objects iterable. In the specific case given, though, he mentioned wanting to get specific elements out of each row, in which case `numpy` or (since we have labeled columns) `pandas` is ideal for the purpose – scnerd Aug 15 '17 at 19:44
  • ... so is the standard library `csv` and it doesn't require the heavy, heavy, `numpy`/`pandas` dependency just to grab data from a column in a csv file. – juanpa.arrivillaga Aug 15 '17 at 19:45
  • I was able to get the code to work- thank you all for your help and for the alternative ideas, especially the DataFrame. As you can see, I'm new to all of this, so I really appreciate the support. – AwfulPersimmon Aug 15 '17 at 19:49
2

As you mentioned, you'll need to implement __iter__ in class Dataset. Note that it is the for loop in the_set_maker, not the set(...) call, that raises the error, since the loop is what tries to iterate over your Dataset instance.

Luckily, the elements you want to iterate over are just Dataset.data, which makes Dataset.__iter__ easy to write.

class Dataset(object):
    ...

    def __iter__(self):
        return iter(self.data)  # delegate to the list; iter(self) would recurse forever

I would point out, however, that your the_set_maker function seems a little too specialized to be top-level. It's also a bit trivial, since it's literally set([el[3] for el in container]). I would put this in Dataset as well.

class Dataset(object):
    ...

    def to_set(self):
        return set([el[3] for el in self.data])
        # Note that this throws away your header!
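
One way to make such a method less tied to column 3 is to look the column up by its header name. The following is only a sketch: `unique_values` and its `column` parameter are hypothetical names that are not part of the original code, and the sample rows are shortened from the question.

```python
class Dataset(object):
    def __init__(self, data):
        self.header = data[0]
        self.data = data[1:]

    def __iter__(self):
        return iter(self.data)

    def unique_values(self, column):
        # list.index raises ValueError if the name is not in the header.
        idx = self.header.index(column)
        return {row[idx] for row in self.data}


sample = [
    ['last_name', 'first_name', 'birthday', 'gender', 'type', 'state', 'party'],
    ['Bassett', 'Richard', '1745-04-02', 'M', 'sen', 'DE', 'Anti-Administration'],
    ['Bland', 'Theodorick', '1742-03-21', '', 'rep', 'VA', ''],
]
dataset = Dataset(sample)
print(dataset.unique_values('gender'))
print(dataset.unique_values('state'))
```

Keeping the header on the instance is what makes the name lookup possible, so the same method works for any column without hard-coding an index.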
Adam Smith
  • Thank you for this. I agree that the function is trivial- I'm trying to use classes and functions even if they are trivial so that I can become more comfortable with them, so I appreciate your example of how I can incorporate the function into the class! – AwfulPersimmon Aug 15 '17 at 19:58
-2

You need to change your class a little:

class Dataset:
    i = 0

    def __init__(self, data):
        self.header = data[0] #Isolate header from CSV file
        self.data = data[1:] #Subset CSV data to remove header

    def __iter__(self):
        return self
    def __next__(self):
        return self.next()

    def next(self):
        if self.i < len(self.data):
            self.i += 1
            return self.data[self.i-1]
        else:
            raise StopIteration()
Yuriy Arhipov
  • This is incorrectly implemented, you don't need *both `.next` and `__next__`*, the former is Python 2, the latter is Python 3, don't mix them. – juanpa.arrivillaga Aug 15 '17 at 19:34
  • Also, this makes `Dataset` an *iterator*, which is almost certainly not what you want. – juanpa.arrivillaga Aug 15 '17 at 19:35
  • That's fine, I suppose, if you want to write code that is runnable on Python 2/3, but the main issue here is that you've made `Dataset` an *iterator*, not merely *iterable*, which is not a good practice. Consider all the built-in container types, `list`, `tuple`, `set` etc - these return specialized iterator objects from `iter`, not themselves. The *iterators* return themselves. – juanpa.arrivillaga Aug 15 '17 at 19:37
  • Also, you've made `i` a *class variable*, then *shadow it with an instance variable*, not really what you want, and can lead to subtle bugs... – juanpa.arrivillaga Aug 15 '17 at 19:40
  • @Yuriy Arhipov I appreciate the help- like the answer I selected, it's a bit beyond me, but that's good. – AwfulPersimmon Aug 15 '17 at 20:01
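
The concern raised in these comments about making `Dataset` an iterator rather than an iterable can be seen in a short experiment. This is a sketch assuming Python 3; `OneShotDataset` and `ReusableDataset` are hypothetical names, the first class is a simplified Python 3 only variant of the answer above, and the two-column sample data is made up for brevity.

```python
class OneShotDataset:
    """Acts as its own iterator, so it can only be traversed once."""

    def __init__(self, data):
        self.header = data[0]
        self.data = data[1:]
        self.i = 0  # position is stored on the instance

    def __iter__(self):
        return self

    def __next__(self):
        if self.i >= len(self.data):
            raise StopIteration
        self.i += 1
        return self.data[self.i - 1]


class ReusableDataset:
    """Iterable only: the generator-based __iter__ starts over on every loop."""

    def __init__(self, data):
        self.header = data[0]
        self.data = data[1:]

    def __iter__(self):
        yield from self.data


rows = [['name', 'gender'], ['Bassett', 'M'], ['Bland', '']]

one_shot = OneShotDataset(rows)
first_pass = [row[0] for row in one_shot]
second_pass = [row[0] for row in one_shot]  # position was never reset

reusable = ReusableDataset(rows)
reusable_first = [row[0] for row in reusable]
reusable_second = [row[0] for row in reusable]

print(first_pass, second_pass)
print(reusable_first, reusable_second)
```

The second pass over `one_shot` comes back empty because the stored position is never reset, whereas `reusable` yields the same rows each time.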