itertools.groupby( ) in python

Question

I have Set of tuples. For example:

set([(('E', ('T',)), 0),
 (('F', ('(', 'E', ')')), 0),
 (('T', ('F',)), 0),
 (('__S__', ('E', '$')), 0),
 (('E', ('E', '+', 'T')), 0),
 (('T', ('T', '*', 'F')), 0),
 (('F', ('id',)), 0)])

as you see every tuple has a tuple as it's first element ( ex. ('F', ('(', 'E', ')')) ).
first element of this tuple is single character and second element is another tuple ( ex. ('(', 'E', ')')) ). this tuple has one or more single character in it.
(It is actually Context Free Grammar. first element is LHS of rule(head), second tuple is RHS(body)
number in second element of each tuple is a pointer to one of the characters in RHS of this grammar.

what I am trying to do is grouping this tuples with respect to the element that has been pointed to.
for this purpose I wrote following code:

import itertools
S = set([(('E', ('T',)), 0), (('F', ('(', 'E', ')')), 0), (('T', ('F',)), 0), (('__S__', ('E', '$')), 0), (('E', ('E', '+', 'T')), 0), (('T', ('T', '*', 'F')), 0), (('F', ('id',)), 0)])
for v, h in itertools.groupby(S, lambda x: x[0][1][x[1]] if len(x[0][1]) > x[1] else None ):
     if (v is None):
         continue
     print '--'
     print v
     for hi in h:
         print hi

two tuples are in the same group if x[0][1][x[1]] are the same. x[0][1] is second tuple of first tuple(RHS of grammar) and x[1] is the pointer.
I get following result:

--
(
(('F', ('(', 'E', ')')), 0)
--
F
(('T', ('F',)), 0)
--
E
(('__S__', ('E', '$')), 0)
--
T
(('T', ('T', '*', 'F')), 0)
--
id
(('F', ('id',)), 0)
--
T
(('E', ('T',)), 0)
--
E
(('E', ('E', '+', 'T')), 0)

As you can see there is two group with key 'T'. I don't understand what am I doing wrong here!
I am almost new python programmer. In case the problem is too stupid!
thanks!

I'm not sure I follow the code well enough to make this an answer, but I think the problem is that you're using a `set` (which is unordered). You might need to sort the set using your groupby key function before you pass it to `groupby`. — mgilson, Jun 17 '13 at 20:53

Gareth Latty · Accepted Answer · 2013-06-17T20:58:29.887

itertools.groupby() requires the data to be sorted if you want all like data to be grouped, as per the documentation:

Generally, the iterable needs to already be sorted on the same key function.

The operation of groupby() is similar to the uniq filter in Unix. It generates a break or new group every time the value of the key function changes (which is why it is usually necessary to have sorted the data using the same key function). That behavior differs from SQL’s GROUP BY which aggregates common elements regardless of their input order.

Simply call sorted() on your data first (using your function as a key function), then do your grouping.

key_func = lambda x: x[0][1][x[1]] if len(x[0][1]) > x[1] else None
itertools.groupby(sorted(data, key=key_func), key_func)

Thanks man! It worked like charm! I didn't read documentation and just relied on few googling! — Ashkan, Jun 17 '13 at 21:05

itertools.groupby( ) in python

1 Answers1

Linked