Do identical Abstract Syntax Trees guarantee the same behaviour?

Question

Given two programs that generate the same Abstract Syntax Tree (AST), are they guaranteed to run with the same behaviour given the same input?

For a concrete example, I want to run a formatter to a Python module in order to change the style. To check that the formatter hasn't modified the logic of the program, I want to compare the AST of the formatted module and the original. Is this a good way to do it?

Short answer: Yes! The same AST means the same abstract program. Yet, you need to be flexible about actual identifier names (it's the same program when a variable is renamed). — Apalala, Dec 03 '19 at 11:16
@Apalala Do you mean that if all identifiers are changed consistently the AST will be the same? Or the opposite? — Carles Garcia, Dec 03 '19 at 14:10
The issue is what to compare between ASTs. AST comparison is fine if the source code is the same and the parsers are different. But if you're doing source code transformations you must make allowance for changes in, for instance, line and column numbers, and perhaps other stuff. My previous comment meant that since programs are provably equivalent under well known refactorings, a simple-minded AST comparison may not be enough as a test. — Apalala, Dec 04 '19 at 15:40
Somewhat related: [elegant way to test python ASTs for equality (not reference or object identity) - Stack Overflow](https://stackoverflow.com/questions/3312989/elegant-way-to-test-python-asts-for-equality-not-reference-or-object-identity) — user202729, Dec 22 '22 at 16:46

kaya3 · Accepted Answer · 2019-12-02T18:30:45.430

The non-pedantic answer to this question is yes, they do, since the AST is an intermediate representation used by the compiler; once the AST is generated, that is what's used to generate the bytecode. A simple way to check that the two ASTs are the same is to use the ast.dump function and then compare the results as strings.

The pedantic answer is that it depends what you mean by "identical" - specifically, what properties of the two ASTs you wish to compare to determine if they are identical.

For example, x = 1; raise ValueError() and x = 1\nraise ValueError() compile to "identical" ASTs:

>>> import ast
>>> print(ast.dump(ast.parse('x = 1; raise ValueError()')))
Module(body=[
  Assign(targets=[Name(id='x', ctx=Store())], value=Num(n=1)),
  Raise(exc=Call(func=Name(id='ValueError', ctx=Load()), args=[], keywords=[]), cause=None)
])
>>> print(ast.dump(ast.parse('x = 1\nraise ValueError()')))
Module(body=[
  Assign(targets=[Name(id='x', ctx=Store())], value=Num(n=1)),
  Raise(exc=Call(func=Name(id='ValueError', ctx=Load()), args=[], keywords=[]), cause=None)
])

However, the AST also includes metadata about line numbers and positions, so these two ASTs are not exactly identical:

>>> ast.parse('x = 1; raise ValueError()').body[1].lineno
1
>>> ast.parse('x = 1\nraise ValueError()').body[1].lineno
2

Furthermore, these line numbers are available at runtime in error messages; the first one says line 1 and the second says line 2:

>>> exec('x = 1; raise ValueError()')
Traceback (most recent call last):
  File "<pyshell#13>", line 1, in <module>
    exec('x = 1; raise ValueError()')
  File "<string>", line 1, in <module>
ValueError
>>> exec('x = 1\nraise ValueError()')
Traceback (most recent call last):
  File "<pyshell#14>", line 1, in <module>
    exec('x = 1\nraise ValueError()')
  File "<string>", line 2, in <module>
ValueError

It is also technically possible for code to inspect the line number from the error message and then decide its behaviour based on that. Any such code is an abomination and should be taken out back and shot, but as a certified pedant it's my duty to note that such code can exist.

So technically, your code formatter won't result in truly "identical" ASTs because their line/position metadata may be different - your code formatter has to change that metadata in order to do anything useful. But this is a reasonable caveat for an automated code-formatting tool such as yours to have, because people who write code which breaks when you reformat it should know that their code is too fragile to be reformatted by an automatic tool.

For completeness, if you want to ensure that the compiled bytecode is the same, you can use the dis.get_instructions function: this is a stricter check than ast.dump because the bytecode includes the line numbers (but not the positions within lines), but if your formatter shouldn't move code between different lines then you might prefer this way.

>>> import dis
>>> instructions1 = list(dis.get_instructions('x = 1; y = 2'))
>>> instructions2 = list(dis.get_instructions('x = 1\ny = 2'))
>>> instructions1 == instructions2
False

pedantic comments help everyone learn. – Wai Ho Choy Jun 05 '23 at 03:38 — Wai Ho Choy, Jun 05 '23 at 03:38

Do identical Abstract Syntax Trees guarantee the same behaviour?

1 Answers1