What is the most Pythonic way to index collection data?

Question

I wrote a quick script to scrape various data about mixed martial arts fights and their associated odds.

Originally, the data was a tuple with the first entry being the name of a fighter (string) and the second being their odds (float). The script later accessed this data, and I defined two constants, FIGHTER = 0 and ODDS = 1, so that I could later use fight_data[FIGHTER] or fight_data[ODDS].

Since the data is immutable, a tuple made sense, and by defining constants my reasoning was that my IDE/Editor could catch typos as opposed to using a string index for a dictionary.

FIGHTER = 0
ODDS = 1
fight_data = get_data()

def process_data(fight_data):
    do_something(fight_data[FIGHTER])
    do_something(fight_data[ODDS])

What are the other alternatives? I thought of making a FightData class, but the data is strictly a value object with two small elements.

class FightData(object):
    fighter = None
    odds = None
    def __init__(self, fighter, odds):
        self.fighter = fighter
        self.odds = odds

fight_data = get_data()

def process_data(fight_data):
    do_something(fight_data.fighter)
    do_something(fight_data.odds)

In addition, I realized I could use a dictionary, and have fight_data["fighter"] but that seems both ugly and unnecessary to me.

Which one of these alternatives is the best?

No code-review requested, just a question in terms of style. — Viktor Chynarov, Dec 23 '14 at 19:45
@ViktorChynarov Do you have some code to show us? Are you running into any performance issues? I'm really not sure how to address this question. — wheaties, Dec 23 '14 at 19:49
"Pythonic" -- a linguistic and conceptual disappointment. (i.e. "I worry that I'm just not Pythonic enough.") Previously this concept was just called "idiomatic" -- and not so laden with Pythonic self-consciousness. :) — J0e3gan, Dec 23 '14 at 19:49

score 1 · Answer 1 · answered Dec 23 '14 at 20:03

Python is a "multi-paradigm" language, so in my opinion, either the procedural approach or the object-oriented approach is valid and Pythonic. For this use case, with such a limited amount of data, I don't think you need to worry too much.

However, if you're going down the OOP route, I would define your class to be called Fighter and give it attributes called name and odds, and then do_something with the entire Fighter instance:

class Fighter(object):
    def __init__(self, name, odds):
        self.name = name
        self.odds = odds

fighters = get_data()

# for example:
for fighter in fighters:
    do_something(fighter)
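
To make the snippet above runnable end to end, here is a minimal sketch. Note that get_data and do_something are hypothetical stand-ins (the question does not show their bodies), and the names and odds are invented for illustration.

```python
class Fighter(object):
    def __init__(self, name, odds):
        self.name = name
        self.odds = odds


def get_data():
    # Hypothetical stand-in for the scraper; values are invented
    return [Fighter("Andy Hug", 0.8), Fighter("Fighter B", 1.2)]


def do_something(fighter):
    # Placeholder: receives the whole instance rather than separate fields
    return "{} @ {}".format(fighter.name, fighter.odds)


results = [do_something(fighter) for fighter in get_data()]
```

Passing the whole instance keeps the name and odds together, so the caller never has to know how the fields are laid out.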

*There should be one-- and preferably only one --obvious way to do it.* – [PEP 20: The Zen of Python](https://www.python.org/dev/peps/pep-0020/) (aka `import this`) — Palec, Dec 23 '14 at 20:43

score 1 · Answer 2 · answered Dec 23 '14 at 20:14

These are my thoughts... unless you have serious performance issues or efficiency metrics you're trying to achieve, I would use a dict instead of a tuple. Just because the data is immutable doesn't mean you have to use a tuple. And IMO it looks cleaner and is easier to read. Using magic numbers like:

FIGHTER = 0
ODDS = 1

as index markers makes the code harder to understand. And a class is a bit overkill. But if you use a dict your code will look something like:

fight_data = get_data()

def process_data(fight_data):
    do_something(fight_data['fighter'])
    do_something(fight_data['odds'])

I just got rid of two lines of code, and now we don't have to use any magic variables to reference data. It's much easier to see exactly what you're doing without having to worry about FIGHTER and ODDS.

Don't use variables if you really don't have to. FIGHTER and ODDS really aren't necessary; that's what dicts are for.
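
One trade-off worth seeing next to this, since the question raises it: a mistyped dict key is only detected when the lookup actually runs. The sketch below assumes a hypothetical get_data that returns a dict with invented values.

```python
def get_data():
    # Hypothetical stand-in for the scraper; values are invented
    return {"fighter": "Andy Hug", "odds": 0.8}


def process_data(fight_data):
    return "{}: {}".format(fight_data["fighter"], fight_data["odds"])


fight_data = get_data()
summary = process_data(fight_data)

# A mistyped key raises KeyError only when this line executes
try:
    fight_data["fightr"]
    typo_caught = False
except KeyError:
    typo_caught = True
```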

score 1 · Answer 3 · edited May 23 '17 at 10:33

Simple pieces of immutable data that you want to reference by field name sound like the perfect use case for a namedtuple.

The linked SO question/answer gives a great explanation, but in summary: namedtuples are easily defined, memory-efficient, immutable data structures that support data access via attribute reference, much like Python classes, while also fully supporting tuple operations.

from collections import namedtuple

# Defining the form of the namedtuple is much more lightweight than a class
FightData = namedtuple("FightData", "fighter odds")

# You instantiate a namedtuple much like you would a class instance
fight_data1 = FightData("Andy Hug", 0.8)

# Fields can be referenced by name
print(fight_data1.fighter)
print(fight_data1.odds)

# Or by index, just like a normal tuple
print(fight_data1[0], fight_data1[1])

# They're tuples, so they can be iterated over as well
for data in fight_data1:
    print(data)
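
As a small extension, the sketch below illustrates the immutability and tuple behaviour mentioned in the summary. The values are illustrative only; _replace and _asdict are standard namedtuple methods.

```python
from collections import namedtuple

FightData = namedtuple("FightData", "fighter odds")
fight_data = FightData("Andy Hug", 0.8)  # illustrative values only

# Tuple unpacking works because a namedtuple is still a tuple
fighter, odds = fight_data

# Fields cannot be reassigned; attempting it raises AttributeError
try:
    fight_data.odds = 0.5
    mutated = True
except AttributeError:
    mutated = False

# _replace returns a new instance and leaves the original untouched
updated = fight_data._replace(odds=0.5)

# _asdict converts to a mapping if dict-style access is ever needed
as_dict = fight_data._asdict()
```

Because a misspelled field such as fight_data.fightr is an attribute access rather than a string key, editors and linters have a better chance of flagging it before the script runs.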
