
A lot of API requests now return page tokens, and I was curious what the most appropriate way of dealing with them is.

I have usually just gone with recursion:

import json

import requests

def get_followers(user_id,
                  access_token,
                  cursor=''):
    url = ('https://api.instagram.com/v1/users/%s/followed-by'
           '?access_token=%s&cursor=%s' % (user_id, access_token, cursor))
    print(url)
    response = requests.get(url)
    json_out = json.loads(response.text)
    for user in json_out['data']:
        usrname = user['username']
        my_followers.add_follower(usrname)
        print("Added: %s" % usrname)
    try:
        cursor = json_out['pagination']['next_url']
        get_followers(user_id, access_token, cursor)
    except KeyError:
        print("Finished with %d followers" % len(my_followers.followers))

However, perhaps a:

while True:
  ..
  if condition:
    break

Or is some other implementation seen as more efficient/Pythonic?
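For comparison, here is a minimal, self-contained sketch of the loop version. The `fetch_page` helper and the canned `_PAGES` responses are hypothetical stand-ins for the real `requests.get(...)` call, just to show the control flow:

```python
# Hypothetical canned two-page response, standing in for the Instagram API.
_PAGES = {
    '': {'data': [{'username': 'alice'}, {'username': 'bob'}],
         'pagination': {'next_url': 'page2'}},
    'page2': {'data': [{'username': 'carol'}],
              'pagination': {}},
}

def fetch_page(user_id, access_token, cursor):
    """Stub for requests.get(url).json(); returns a canned page."""
    return _PAGES[cursor]

def get_followers_iter(user_id, access_token):
    """Collect follower usernames with a plain loop instead of recursion."""
    cursor = ''
    usernames = []
    while True:
        page = fetch_page(user_id, access_token, cursor)
        usernames.extend(user['username'] for user in page['data'])
        pagination = page.get('pagination', {})
        if 'next_url' not in pagination:
            break  # last page reached
        cursor = pagination['next_url']
    return usernames
```

The loop avoids any risk of hitting the recursion limit on accounts with many pages of followers, since each iteration reuses the same frame.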

mptevsion
  • Is there a chance you'll recurse beyond [the maximum recursion depth](http://stackoverflow.com/questions/3323001/maximum-recursion-depth)? – Peter Wood Jan 06 '16 at 10:03

2 Answers


You could change this to a generator function that yields the next follower when the structure is accessed, fetching more data only when necessary.

def get_followers(...):
    ...
    while token is not None:
        # fetch data
        ...
        for user in json_out['data']:
            yield user
        ...
        # fetch new token

Then, iterate through this generator to apply your data handling. This also has the advantage of separating data acquisition and handling logic.

followers = get_followers(...)
for user in followers:
    username = user['username']
    my_followers.add_follower(username)
    print("Added: %s" % username)
relet

Thanks for the help; I have never used generators apart from the implicit `range` vs. `xrange` distinction. I was curious whether, in this case, it provides any speed/memory advantage.

My final code-snippet:

def get_followers(user_id,
                  access_token,
                  cursor=''):
    """ Create a generator which will be accessed later to get the usernames of the followers"""
    while cursor is not None:
        # fetch data
        url = ('https://api.instagram.com/v1/users/%s/followed-by'
               '?access_token=%s&cursor=%s' % (user_id, access_token, cursor))
        print(url)
        response = requests.get(url)
        json_out = json.loads(response.text)
        for user in json_out['data']:
            yield user

        # fetch new token
        try:
            cursor = json_out['pagination']['next_url']
        except KeyError:
            print("Finished with %d followers" % len(my_followers.followers))
            cursor = None

followers_gen = get_followers(my_user, access_token)  # Get followers
for usr in followers_gen:
    usrname = usr['username']
    my_followers.add_follower(usrname)
    print("Added: %s" % usrname)

Compared to not using a generator:

def get_followers(user_id,
                  access_token,
                  cursor=''):
    """ Create a generator which will be accessed later to get the usernames of the followers"""
    while cursor is not None:
        # fetch data
        url = ('https://api.instagram.com/v1/users/%s/followed-by'
               '?access_token=%s&cursor=%s' % (user_id, access_token, cursor))
        print(url)
        response = requests.get(url)
        json_out = json.loads(response.text)
        for usr in json_out['data']:
            usrname = usr['username']
            my_followers.add_follower(usrname)
            print("Added: %s" % usrname)

        # fetch new token
        try:
            cursor = json_out['pagination']['next_url']
        except KeyError:
            print("Finished with %d followers" % len(my_followers.followers))
            cursor = None

get_followers(my_user, access_token)  # Get followers
mptevsion
  • It depends on what you do with the list afterwards. If the only goal is to generate the whole list of followers every single time, then no, you probably won't have a speed advantage. – relet Jan 07 '16 at 07:44
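To make that concrete: the generator's advantage shows up when you stop early, because later pages are never fetched. A small self-contained demo, where the canned `_PAGES` dictionary is a hypothetical stand-in for the API and `fetched` records which cursors were actually requested:

```python
from itertools import islice

# Hypothetical canned three-page response, standing in for the Instagram API.
_PAGES = {
    '': {'data': [{'username': 'alice'}, {'username': 'bob'}],
         'pagination': {'next_url': 'p2'}},
    'p2': {'data': [{'username': 'carol'}],
           'pagination': {'next_url': 'p3'}},
    'p3': {'data': [{'username': 'dave'}],
           'pagination': {}},
}

fetched = []  # cursors that were actually requested

def get_followers(cursor=''):
    """Yield followers page by page, fetching each page lazily."""
    while cursor is not None:
        fetched.append(cursor)
        json_out = _PAGES[cursor]  # stands in for requests.get(url).json()
        for user in json_out['data']:
            yield user
        cursor = json_out['pagination'].get('next_url')

# Take only the first two followers: the generator suspends after the
# first page, so pages two and three are never requested.
first_two = [user['username'] for user in islice(get_followers(), 2)]
```

If you always exhaust the generator to build the full list, the total work is the same as the non-generator version; the win is laziness and lower peak memory, not raw speed.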