
Is this possible? I want to print lines in my file 5 at a time (to send to an API in a batch). But the last few lines never print, because there are fewer than 5 of them, so my if statement is never triggered. So I figured one way to tackle this is to print the remaining lines when the loop closes.

The current code is messy and redundant but this is the idea:

urls = []
urls_csv = ""
counter = 0

with open(urls_file) as f:
    for line in f:

        # Keep track of the lines we've gone through
        counter = counter + 1

        # If we have 5 urls in our list it's time to send them to the API call
        if counter > 5:
            counter = 0
            urls_csv = ",".join(urls) # turn the python list into a string csv list
            do_api(urls_csv) # put them to work

            urls = [] # reset the value so we don't send the same urls next time
            urls_csv = "" # reset the value so we don't send the same urls next time
        # Else append to the url list
        else:
            urls.append(line.strip())

Also - Generally speaking, is there a better way to tackle this?

Thisisstackoverflow
  • is "hashes_csv" supposed to be "urls_csv" ? – TehTris Dec 11 '15 at 21:52
  • also, personally, i would just get all of the URLs and then split them into 5 sized chunks. and then print them(or use them or watever) like that. Also id probably keep them as lists for as long as i could, so i could iterate over it and use the strings inside without having to do weird splits and joins all the time. – TehTris Dec 11 '15 at 21:54
  • this is what you are looking for : http://stackoverflow.com/questions/1630320/what-is-the-pythonic-way-to-detect-the-last-element-in-a-python-for-loop – Romain Dec 11 '15 at 22:04
  • Yes, hashes should have been urls. As for Romain's link, I can't do anything special at the beginning of the loop (math with the remainder) because I don't know how long the file will be... because I'm reading the lines into memory one at a time. The file will be pretty big, but I guess I could read it in all at once. – Thisisstackoverflow Dec 11 '15 at 22:34
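As a sketch of the chunking approach suggested in the comments (read everything first, then slice into 5-sized chunks; the `urls` list below is a made-up stand-in for the file contents):

```python
# "Read everything, then split into 5-sized chunks" approach
# from the comments. The urls list here is illustrative only.
def chunks(items, size):
    """Yield successive slices of `items`, each at most `size` long."""
    for i in range(0, len(items), size):
        yield items[i:i + size]

urls = ["u1", "u2", "u3", "u4", "u5", "u6", "u7"]
for batch in chunks(urls, 5):
    print(",".join(batch))
# u1,u2,u3,u4,u5
# u6,u7
```

This keeps the URLs as lists for as long as possible, joining into a CSV string only at the moment of the API call.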

2 Answers


You can group the lines five at a time with the itertools grouper recipe.

import itertools

def grouper(iterable, n, fillvalue=None):
    args = [iter(iterable)] * n
    return itertools.zip_longest(*args, fillvalue=fillvalue)

with open(...) as f:
    for group in grouper(f, 5, fillvalue=""):
        do_api(",".join([g.strip() for g in group if g]))
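For example, run against an in-memory stand-in for the file (`io.StringIO` here, purely for illustration), the recipe yields one CSV string per batch, with the padding from `fillvalue` filtered out of the short final group:

```python
import io
import itertools

def grouper(iterable, n, fillvalue=None):
    args = [iter(iterable)] * n
    return itertools.zip_longest(*args, fillvalue=fillvalue)

# io.StringIO stands in for the real urls file.
fake_file = io.StringIO("a\nb\nc\nd\ne\nf\ng\n")
for group in grouper(fake_file, 5, fillvalue=""):
    # `if g` drops the "" padding added to the last group
    print(",".join(g.strip() for g in group if g))
# a,b,c,d,e
# f,g
```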
Adam Smith
    upvoted. I was trying out the same solution, but I got stuck on how to remove the fillvalue. Easily solved. Thank you – Pynchia Dec 11 '15 at 22:20

What do you think of

urls = []

with open(urls_file) as f:
    while True:
        try:
            for i in range(5):
                urls.append(next(f).rstrip())
            print(urls)  # i.e. you have the list of urls, now use it/put it to work
            urls = []
        except StopIteration:
            print(urls)
            break

with an input file of

line1
line2
line3
line4
line5
line6
line7

it produces

['line1', 'line2', 'line3', 'line4', 'line5']
['line6', 'line7']
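An equivalent loop can be sketched with `itertools.islice`, which avoids the explicit `try`/`except` and also never emits an empty final batch when the line count is an exact multiple of 5 (the `batches` helper name is made up here):

```python
import itertools

def batches(lines, size=5):
    """Yield lists of up to `size` stripped lines from any line iterator."""
    it = iter(lines)
    while True:
        # islice pulls at most `size` items; an empty result means EOF
        batch = [line.rstrip() for line in itertools.islice(it, size)]
        if not batch:
            return
        yield batch

print(list(batches(["line%d\n" % i for i in range(1, 8)])))
# [['line1', 'line2', 'line3', 'line4', 'line5'], ['line6', 'line7']]
```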
Pynchia