I have written a small program based on iterators to display a multicolumn calendar.

In that code I use itertools.groupby in the function group_by_months() to group the dates by month. There I yield the month name and the grouped dates as a list for every month. However, when I let that function yield the grouped dates directly as an iterator (instead of a list), the program leaves the days of all but the last column blank.

I can't figure out why that might be. Am I using groupby wrong? Can anyone help me to spot the place where the iterator is consumed or its output is ignored? Why is it especially the last column that "survives"?

Here's the code:

import datetime
from itertools import zip_longest, groupby

def grouper(iterable, n, fillvalue=None):
    """\
    copied from the docs:
    https://docs.python.org/3.4/library/itertools.html#itertools-recipes
    """
    args = [iter(iterable)] * n
    return zip_longest(*args, fillvalue=fillvalue)

def generate_dates(start_date, end_date, step=datetime.timedelta(days=1)):
    while start_date < end_date:
        yield start_date
        start_date += step

def group_by_months(seq):
    for k,v in groupby(seq, key=lambda x:x.strftime("%B")):
        yield k, v # Why does it only work when list(v) is yielded here?

def group_by_weeks(seq):
    yield from groupby(seq, key=lambda x:x.strftime("%2U"))

def format_month(month, dates_of_month):
    def format_week(weeknum, dates_of_week):
        def format_day(d):
            return d.strftime("%3e")
        weekdays = {d.weekday(): format_day(d) for d in dates_of_week}
        return "{0} {7} {1} {2} {3} {4} {5} {6}".format(
            weeknum, *[weekdays.get(i, "   ") for i in range(7)])
    yield "{:^30}".format(month)
    weeks = group_by_weeks(dates_of_month)
    yield from map(lambda x:format_week(*x), weeks)

start, end = datetime.date(2016,1,1), datetime.date(2017,1,1)
dates = generate_dates(start, end)
months = group_by_months(dates)
formatted_months = map(lambda x: (format_month(*x)), months)
ncolumns = 3
quarters = grouper(formatted_months, ncolumns)
interleaved = map(lambda x: zip_longest(*x, fillvalue=" "*30), quarters)
formatted = map(lambda x: "\n".join(map("   ".join, x)), interleaved)
list(map(print, formatted))

This is the failing output:

           January                          February                          March             
                                                                  09           1   2   3   4   5
                                                                  10   6   7   8   9  10  11  12
                                                                  11  13  14  15  16  17  18  19
                                                                  12  20  21  22  23  24  25  26
                                                                  13  27  28  29  30  31        
            April                             May                              June             
                                                                  22               1   2   3   4
                                                                  23   5   6   7   8   9  10  11
                                                                  24  12  13  14  15  16  17  18
                                                                  25  19  20  21  22  23  24  25
                                                                  26  26  27  28  29  30        
             July                            August                         September           
                                                                  35                   1   2   3
                                                                  36   4   5   6   7   8   9  10
                                                                  37  11  12  13  14  15  16  17
                                                                  38  18  19  20  21  22  23  24
                                                                  39  25  26  27  28  29  30    
           October                          November                         December           
                                                                  48                   1   2   3
                                                                  49   4   5   6   7   8   9  10
                                                                  50  11  12  13  14  15  16  17
                                                                  51  18  19  20  21  22  23  24
                                                                  52  25  26  27  28  29  30  31

This is the expected output:

           January                          February                          March             
00                       1   2   05       1   2   3   4   5   6   09           1   2   3   4   5
01   3   4   5   6   7   8   9   06   7   8   9  10  11  12  13   10   6   7   8   9  10  11  12
02  10  11  12  13  14  15  16   07  14  15  16  17  18  19  20   11  13  14  15  16  17  18  19
03  17  18  19  20  21  22  23   08  21  22  23  24  25  26  27   12  20  21  22  23  24  25  26
04  24  25  26  27  28  29  30   09  28  29                       13  27  28  29  30  31        
05  31                                                                                          
            April                             May                              June             
13                       1   2   18   1   2   3   4   5   6   7   22               1   2   3   4
14   3   4   5   6   7   8   9   19   8   9  10  11  12  13  14   23   5   6   7   8   9  10  11
15  10  11  12  13  14  15  16   20  15  16  17  18  19  20  21   24  12  13  14  15  16  17  18
16  17  18  19  20  21  22  23   21  22  23  24  25  26  27  28   25  19  20  21  22  23  24  25
17  24  25  26  27  28  29  30   22  29  30  31                   26  26  27  28  29  30        
             July                            August                         September           
26                       1   2   31       1   2   3   4   5   6   35                   1   2   3
27   3   4   5   6   7   8   9   32   7   8   9  10  11  12  13   36   4   5   6   7   8   9  10
28  10  11  12  13  14  15  16   33  14  15  16  17  18  19  20   37  11  12  13  14  15  16  17
29  17  18  19  20  21  22  23   34  21  22  23  24  25  26  27   38  18  19  20  21  22  23  24
30  24  25  26  27  28  29  30   35  28  29  30  31               39  25  26  27  28  29  30    
31  31                                                                                          
           October                          November                         December           
39                           1   44           1   2   3   4   5   48                   1   2   3
40   2   3   4   5   6   7   8   45   6   7   8   9  10  11  12   49   4   5   6   7   8   9  10
41   9  10  11  12  13  14  15   46  13  14  15  16  17  18  19   50  11  12  13  14  15  16  17
42  16  17  18  19  20  21  22   47  20  21  22  23  24  25  26   51  18  19  20  21  22  23  24
43  23  24  25  26  27  28  29   48  27  28  29  30               52  25  26  27  28  29  30  31
moooeeeep
  • Did you read [the documentation](https://docs.python.org/2/library/itertools.html#itertools.groupby)? – BrenBarn Jan 06 '16 at 22:26
  • @BrenBarn obviously not thoroughly enough. You mean this section? _"when the groupby() object is advanced, the previous group is no longer visible. So, if that data is needed later, it should be stored as a list"_ – moooeeeep Jan 06 '16 at 22:32
  • @BrenBarn Then it must be essentially this: http://stackoverflow.com/q/16598244/1025391 - But wait - that's still somehow different. I need to think about it... – moooeeeep Jan 06 '16 at 22:37
  • why not yield the values in each group as a list, as mentioned in the code? – Pynchia Jan 06 '16 at 22:50
  • @Pynchia I actually do that, but I thought I should try to understand why I have to do that. – moooeeeep Jan 06 '16 at 22:57
  • you have to do that because `groupby` gives you the values in each group as an iterator, which can be consumed (i.e. iterated on) once only. A `list` on the other hand is an iterable which generates a new iterator when iterated on. Try doing `lst=[1,2,3]; iter(lst) is lst` – Pynchia Jan 06 '16 at 23:11
  • unrelated: use `x.month` instead of `x.strftime('%B')` to group by month and [use `x.isocalendar()[1]` for a week number](http://stackoverflow.com/q/2600775/4279). – jfs Jan 07 '16 at 05:16
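
A minimal sketch of the suggestion in the last comment, assuming four sample dates around the January/February 2016 boundary (the sample dates and variable names are made up for illustration). Note that ISO weeks start on Monday, whereas the `%U` week number used in the question starts on Sunday, so swapping it in would change which days share a row.

```python
import datetime
from itertools import groupby

# Hypothetical sample: 30 Jan 2016 (a Saturday) through 2 Feb 2016.
dates = [datetime.date(2016, 1, 30) + datetime.timedelta(days=i) for i in range(4)]

# Group by the integer month instead of the locale-dependent month name.
# Each group is stored as a list right away, before groupby() advances.
by_month = [(month, [d.day for d in group])
            for month, group in groupby(dates, key=lambda d: d.month)]

# Group by ISO week number (second field of isocalendar()).
by_week = [(week, [d.day for d in group])
           for week, group in groupby(dates, key=lambda d: d.isocalendar()[1])]

print(by_month)  # [(1, [30, 31]), (2, [1, 2])]
print(by_week)   # [(4, [30, 31]), (5, [1, 2])]
```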

1 Answer

As the documentation for itertools.groupby states:

when the groupby() object is advanced, the previous group is no longer visible. So, if that data is needed later, it should be stored as a list

That means a group's iterator is consumed (if it is not exhausted yet) as soon as the groupby() object is advanced to the next group. When the code later accesses the returned iterators out of order, they are already exhausted. Here the iteration happens out of order because of the chunking and interleaving.
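
A small sketch of that behaviour, using a made-up list of integers rather than the dates from the question. It pulls two groups from `groupby()` by hand and only then reads them, which is the same kind of out-of-order access.

```python
from itertools import groupby

data = [1, 1, 2, 2, 3, 3]  # illustrative input, already sorted by key

grouped = groupby(data)
key1, group1 = next(grouped)
key2, group2 = next(grouped)  # advancing groupby() invalidates group1

first_items = list(group1)    # nothing left to read
second_items = list(group2)   # the current group is still intact
print(first_items)   # []
print(second_items)  # [2, 2]

# Storing each group as a list before advancing keeps the data.
stored = [(key, list(group)) for key, group in groupby(data)]
print(stored)  # [(1, [1, 1]), (2, [2, 2]), (3, [3, 3])]
```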

We observe this specific pattern (i.e., only the last column is fully displayed) because of the way we iterate. That is:

  1. The month names for the first row are printed. To obtain them, the groupby() object has to be advanced up to the last column's month, which consumes the iterators of the earlier columns and discards their content. The groupby() object can produce the last column's month name only after it has skipped the earlier columns' data.

  2. The first week line is printed. The already exhausted iterators of the earlier columns are filled with the fillvalue passed to zip_longest(). Only the last column still provides actual data.

  3. The same happens for each following row of months.
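
The steps above can be reproduced with a reduced version of the program. This is only a sketch: `column()` and `render()` are hypothetical stand-ins for format_month() and the final map/zip pipeline, and the six strings stand in for the dates. `grouper()` is the same recipe as in the question.

```python
from itertools import groupby, zip_longest

def grouper(iterable, n, fillvalue=None):
    # Chunking recipe from the question: n references to one iterator.
    args = [iter(iterable)] * n
    return zip_longest(*args, fillvalue=fillvalue)

def column(key, members):
    # Header line first, then the members (stand-in for format_month()).
    yield key.upper()
    yield from members

def render(items, store_groups):
    groups = groupby(items, key=lambda s: s[0])
    if store_groups:
        # Copy each group into a list before groupby() is advanced again.
        groups = ((key, list(members)) for key, members in groups)
    columns = map(lambda kv: column(*kv), groups)
    lines = []
    # An empty tuple pads an incomplete chunk with an empty column.
    for chunk in grouper(columns, 3, fillvalue=()):
        for row in zip_longest(*chunk, fillvalue="--"):
            lines.append(" ".join(row))
    return lines

items = ["a1", "a2", "b1", "b2", "c1", "c2"]
lazy = render(items, store_groups=False)
stored = render(items, store_groups=True)
print(lazy)    # ['A B C', '-- -- c1', '-- -- c2']
print(stored)  # ['A B C', 'a1 b1 c1', 'a2 b2 c2']
```

Building a chunk of three columns advances groupby() three times before any column is read, so only the third group is still current when the rows are interleaved.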

moooeeeep
  • the iterators are consumed (if they are not exhausted yet) when you call `next()` (explicitly or implicitly e.g., via a `for`-loop) on the `groupby()` object. When the code later accesses the returned iterators out of order they are already exhausted. – jfs Jan 07 '16 at 05:19