I have a few pandas DataFrames that I want to loop through at once to do some initial verification and debugging, but I can't figure out how to automate it. I've read a bunch of posts here and on various blogs from people trying to do something similar, and the responses all tend towards "That's the wrong way to do that", but I haven't found anything that actually does what I'm looking for. In slightly-more-than-pseudocode, what I'm trying to do is:

for i in ('train', 'test', 'address', 'latlon'):
    print('{}:'.format(i))
    print(<i>.head())  ## what should <i> be?

In shell and Perl scripts it's as simple as wrapping the variable name in braces (${i}), but I can't find a Python equivalent. I've tried various permutations of format(), but I keep getting AttributeError: 'str' object has no attribute 'head' (I get the same error if I try just print(i.head())). I've also tried using globals(), but I get a KeyError on the first iteration of the loop.

This is just for early-stage development and will get removed, so it doesn't have to be super clean, but it's an issue that I've run into a few times now and it's really aggravating me.

EDIT: After some trial and error I got the below to work. Hopefully this will help someone in the future.

frames = {'train': train.head(), 'test': test.head(), 'address': address.head(), 'latlon': latlon.head()}

for i in frames:
    print('{}'.format(i))
    print(frames[i])

I still don't understand how this is supposed to be an improvement over something like ${i} available in other languages, but it works.
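
For reference, here is a minimal sketch of the same idea that stores the DataFrames themselves in the dictionary and calls head() inside the loop, so the full frames stay reachable by name. The two small frames below are made-up stand-ins for the real train and test data, and their column names are assumptions for illustration only.

```python
import pandas as pd

# Hypothetical stand-ins for the real train/test frames (made-up data).
train = pd.DataFrame({'id': [1, 2, 3], 'price': [10.0, 12.5, 9.9]})
test = pd.DataFrame({'id': [4, 5], 'price': [11.0, 8.75]})

# Map each display name to the DataFrame object itself, not to a string.
frames = {'train': train, 'test': test}

# items() yields the name and the object together on each iteration.
for name, df in frames.items():
    print('{}:'.format(name))
    print(df.head())
```

Iterating with items() gives the name and the object at the same time, which is roughly the role ${i} plays in a shell script: the dictionary key is the name and the value is the thing it refers to.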

jml
  • Don't do this. Just use a dictionary or list instead. – Carcigenicate Dec 09 '19 at 17:04
  • Just put the real variable in your tuple, instead of a string containing its name: `for i in (train, test ...): print(i.head())` – Thierry Lathuille Dec 09 '19 at 17:05
  • @Carcigenicate, that answer isn't helpful. There's a hundred examples of people saying literally that exact thing, but if I could see how a dictionary or list would solve my problem I wouldn't be here. – jml Dec 09 '19 at 17:09
  • @ThierryLathuille, that got me closer, but it causes the `print('{}:'.format(i))` to print the entire df instead of the name. If I can only have the name or the contents I'll take the latter, but it helps to have the name when it loops through to the next df. – jml Dec 09 '19 at 17:11
  • @jml Have a dictionary mapping, for example, `"train"` to the object that you want `.head` to be. Or just stick the train, test, address... objects in a list and iterate over them. `for obj in [train_obj, test_obj]: print(obj.head())` like Thierry showed. – Carcigenicate Dec 09 '19 at 17:12
  • That helped me get it sorted out. I added specifics as a separate answer to https://stackoverflow.com/questions/1373164/how-do-i-create-a-variable-number-of-variables – jml Dec 09 '19 at 18:02
