How do I iterate through a dictionary/set in SLY?

Question

So, I'm trying to transition my code from my earlier PLY implementation to SLY. Previously, I had some code that loaded a binary file with a wide range of reserved words scraped from documentation of the scripting language I'm trying to implement. However, when I try to iterate through the scraped items in the lexer for SLY, I get an error message inside LexerMetaDict's __setitem__ when trying to iterate through the resulting set of:

Exception has occurred: AttributeError     
Name transition redefined    
  File "C:\dev\sly\sly\lex.py", line 126, in __setitem__    
    raise AttributeError(f'Name {key} redefined')    
  File "C:\dev\sly\example\HeroLab\HeroLab.py", line 24, in HeroLabLexer    
    for transition in transition_set:    
  File "C:\dev\sly\example\HeroLab\HeroLab.py", line 6, in <module>    
    class HeroLabLexer(Lexer):

The code in question:

from sly import Lexer
from transitions import transition_set, reference_set

class HeroLabLexer(Lexer):

    # initial token assignments

    for transition in transition_set:
        tokens.add(transition)

I might not be as surprised if it were happening when trying to add to the tokens, since I'm still trying to figure out how to interface with the SLY method for defining things, but if I change that line to a print statement, it still fails when I iterate through the second item in "transition_set". I've tried renaming the various variables, but to little avail.

I updated the answer with a possibly clearer (or at least longer) explanation of the scoping problem. If it's not clear, please let me know so I can improve it. Python's class scope rules are weird to start with, and Sly just makes it all weirder. — rici, Sep 17 '22 at 18:10

rici · Accepted Answer · 2022-09-20T04:45:49.243

The error you get is the result of a modification Sly makes to the Lexer metaclass, which I discuss below. But for a simple answer: I assume tokens is a set, so you can easily avoid the problem with

tokens |= transition_set

If transition_set were an iterable but not a set, you could use the update method, which works with any iterable (and any number of iterable arguments):

tokens.update(transition_set)

tokens doesn't have to be a set. Sly should work with any iterable. But you might need to adjust the above expressions. If tokens is a tuple, you'd write tokens += tuple(transition_set), since a tuple can only be concatenated with another tuple. If it is a list, you'd use += instead of |=, or extend instead of update. (There are some minor differences; for example, set.update can merge several iterables in one call, while list.extend takes only one.)
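As a quick illustration of those three cases, here is a plain-Python sketch. The sample keywords and the variable names tokens_set, tokens_tuple and tokens_list are made up for the example; nothing here depends on Sly.

```python
# Hypothetical sample data standing in for the scraped reserved words.
transition_set = {"IF", "THEN", "ELSE"}

# tokens as a set: in-place union, or update() with any number of iterables.
tokens_set = {"ID", "NUMBER"}
tokens_set |= transition_set
tokens_set.update(["WHILE"], ("FOR",))

# tokens as a tuple: tuples are immutable, so += rebinds the name to a new
# tuple, and the right-hand side must itself be a tuple.
tokens_tuple = ("ID", "NUMBER")
tokens_tuple += tuple(sorted(transition_set))

# tokens as a list: extend() (or +=) accepts any single iterable.
tokens_list = ["ID", "NUMBER"]
tokens_list.extend(sorted(transition_set))
```

sorted() is only there to make the order predictable; none of these approaches creates a loop variable in the surrounding scope.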

That doesn't answer your direct question, "How do I iterate... in SLY". I interpret that as asking:

How do I write a for loop at class scope in a class derived from sly.Lexer?

and that's a harder question. The games Sly plays with Python namespaces make it difficult to use for loops in the class scope of a lexer, because Sly replaces the attribute dictionary of the Lexer class (and its subclasses) with a special dictionary that doesn't allow redefinition of attributes with string values. The iteration variable of a for statement lives in the enclosing scope, which here is the class scope, so each pass through the loop rebinds a class attribute. Any for loop whose iteration variable is a string and whose body runs more than once will therefore trigger the "Name redefined" error you experienced, even if the body is just a print.
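The mechanism can be reproduced without Sly. The sketch below uses a simplified stand-in that I made up for illustration (StrictDict and StrictMeta are hypothetical names, not Sly's classes); it only mimics the rule that a name holding a string cannot be rebound to a non-callable value.

```python
class StrictDict(dict):
    """Hypothetical stand-in for the lexer's class namespace."""

    def __setitem__(self, key, value):
        # Refuse to rebind a name whose current value is a string,
        # unless the new value is callable.
        if key in self and isinstance(self[key], str) and not callable(value):
            raise AttributeError(f'Name {key} redefined')
        super().__setitem__(key, value)


class StrictMeta(type):
    @classmethod
    def __prepare__(mcs, name, bases, **kwargs):
        # The class body executes with this mapping as its namespace.
        return StrictDict()

    def __new__(mcs, name, bases, namespace, **kwargs):
        return super().__new__(mcs, name, bases, dict(namespace))


transition_set = ["if", "then"]  # made-up sample data

error_message = None
try:
    class Demo(metaclass=StrictMeta):
        # The loop variable is a class-scope name, so the second
        # iteration tries to rebind a string-valued name.
        for transition in transition_set:
            pass
except AttributeError as exc:
    error_message = str(exc)
```

The loop body does nothing at all, yet the second iteration still fails, which matches what you saw after swapping in a print statement.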

It's also worth noting that if you use a for statement at class scope in any class, the last value of the iteration variable will become a class attribute. That's hardly ever desirable, and, really, that construction is not good practice in any class. But it doesn't usually throw an error.
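A small plain-Python sketch of that leak (the class name Plain and the sample words are invented for the example):

```python
class Plain:
    names = []
    # A for statement at class scope: word is bound in the class namespace.
    for word in ("if", "then", "else"):
        names.append(word.upper())


# After the class body has run, the last value of the loop variable
# is an ordinary class attribute.
leftover = Plain.word
```

Here leftover ends up as "else", alongside the names list that the loop was actually meant to build.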

At class scope, you can use comprehensions (whose iteration variables are effectively locals). Of course, in this case, there's no advantage in writing:

tokens.update(transition for transition in transition_set)

But the construct might be useful in other situations. Note, however, that other variables at class scope (such as tokens) are not visible in the body of the comprehension; only the outermost iterable is evaluated in class scope. That might also create difficulties.
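Both halves of that can be seen in a short sketch, again in plain Python with made-up names (Scoped, Broken, prefix, words):

```python
class Scoped:
    words = ("if", "then")
    # Works: words is the outermost iterable, which is evaluated in class
    # scope, and w is local to the comprehension.
    upper = [w.upper() for w in words]


failure = None
try:
    class Broken:
        prefix = "KW_"
        words = ("if", "then")
        # Fails: prefix is looked up from inside the comprehension,
        # which cannot see class-scope names.
        named = [prefix + w for w in words]
except NameError as exc:
    failure = exc
```

Scoped gets its upper list and no stray w attribute, while Broken never finishes being defined because the lookup of prefix raises NameError.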

Although it's extremely ugly, you can declare the iteration variable as a global, which makes it a module variable rather than a class variable (and therefore just trades one bad practice for another one, although you can later remove the variable from the module).
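For completeness, this is roughly what the global trick looks like. It is a sketch in plain Python with invented names (WithGlobal, collected), not something I'd recommend:

```python
transition_set = ("if", "then", "else")  # made-up sample data


class WithGlobal:
    # The loop variable is now a module variable, so the class
    # namespace never sees it being rebound.
    global transition
    collected = []
    for transition in transition_set:
        collected.append(transition.upper())


# The variable leaked into the module; remove it once the class is built.
leaked = globals().pop("transition")
```

The class ends up with only collected, and the stray module-level name is gone after the pop.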

You could do the computation in a different scope (such as a global function), or you could write a (global) generator to use with tokens.update(), which is probably the most general solution.
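A minimal sketch of the generator approach follows. The helper keyword_tokens and the class LexerLike are hypothetical names, and the class is a plain one rather than a real Sly lexer:

```python
def keyword_tokens(words):
    """Hypothetical helper: yield a token name for each reserved word."""
    # The loop variable lives in the function's local scope,
    # not in any class namespace.
    for word in words:
        yield word.upper()


transition_set = {"if", "then", "else"}  # made-up sample data


class LexerLike:
    tokens = {"ID", "NUMBER"}
    # The class body only calls the generator; no name is rebound here.
    tokens.update(keyword_tokens(transition_set))
```

Because all the looping happens inside the function, the class body never rebinds a name, and any per-item logic can go in the generator.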

Finally, you can make sure that the iteration variable is never an instance of str.
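One way to do that is to wrap each string in a one-element tuple. The sketch below checks the idea against a simplified stand-in for the restriction described above; StrictDict and StrictMeta are hypothetical names invented for the example, and the real Sly code may differ in detail.

```python
class StrictDict(dict):
    """Hypothetical stand-in: a string-valued name cannot be rebound."""

    def __setitem__(self, key, value):
        if key in self and isinstance(self[key], str) and not callable(value):
            raise AttributeError(f'Name {key} redefined')
        super().__setitem__(key, value)


class StrictMeta(type):
    @classmethod
    def __prepare__(mcs, name, bases, **kwargs):
        return StrictDict()

    def __new__(mcs, name, bases, namespace, **kwargs):
        return super().__new__(mcs, name, bases, dict(namespace))


transition_set = ["if", "then"]  # made-up sample data


class Demo(metaclass=StrictMeta):
    tokens = {"ID"}
    # item is a 1-tuple, never a str, so rebinding it is allowed.
    for item in ((word,) for word in transition_set):
        tokens.add(item[0].upper())
```

The loop completes, although item is still left behind as a class attribute holding the last tuple.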

Thank you. Hmm... I'm trying to decide my best approach. Right now, I've got a slightly clumsy check where it resolves it at parse time, but since those are reserved words, I probably will want to handle them earlier so that there's less logic double-checking all of the values. Not that that helps with other aspects of this rather... interesting... language that allowed variables to just be numbers, which means a number might actually be a variable. — Sean Duggan, Sep 17 '22 at 18:16
@sean: If the `for` loop worked for you, then replacing it with `tokens |= transition_set` will work fine, as well as being more readable (imho) and more efficient. I think that's independent of the question of how you recognise keywords (if that's what you're referring to in your comment): you might want to take a look at [this section of the Sly documentation](https://github.com/dabeaz/sly/blob/master/docs/sly.rst#token-remapping), which provides a short-cut to the "traditional" Ply remapping mechanism. — rici, Sep 17 '22 at 18:38
Just a quick minor note, the tokens are built as a tuple, so the "|=" and "update" items don't work. I did verify that I can add items with "tokens += tuple(transition_set)". — Sean Duggan, Sep 19 '22 at 12:36
OK, I correct my previous comment. It could be a tuple. But a set would allow in-place modification. I'm a bit surprised that Sly lets you redefine a tuple member and not a string member; I thought that the only redefinitions it allowed were callables. I'll have to look at that code again someday. — rici, Sep 19 '22 at 20:19
Eyeh, my attempts to define ID exceptions as a list comprehension haven't gone so well so far. I have code that works in the debug console, but fails in the code, `[ID.__setitem__(value, value.upper()) for value in transition_set]`. I'm starting to think that it will make more sense to just add additional functionality in the SLY implementation instead of trying to fiddle with all of these funky ways to loop without looping. — Sean Duggan, Sep 19 '22 at 20:31
If that's exactly what you wrote, it will indeed fail, although I'm not sure that Sly is to blame. Python's class scope is a truly bizarre beast and even experienced pythonistas get it wrong sometimes. Read the last paragraph of [this reference manual section](https://docs.python.org/3.12/reference/executionmodel.html#resolution-of-names), paying particular notice to the example at the end, which I believe is precisely the issue you hit. — rici, Sep 20 '22 at 02:48
By the way, I managed to do this rather crudely with a `for` statement by ensuring that the iteration variable was not a string. For example: `for id in (lambda v=v:v for v in transition_set): ID[id()] = id().upper()`. There are probably less mysterious ways to do it; `lambda v=v:v for v in ...` is a way to convert an iterable of values into an iterable of functions which return those values, taking advantage of the fact that default values are immediately evaluated to avoid the common late-binding problem with creating functions in for loops. — rici, Sep 20 '22 at 16:59
