3

Consider the following Abstract Data Type (using Haskell syntax):

data Expr = Literal String | Symbol String | And [Expr] | Or [Expr]

In Python, one can make use of dataclasses and inheritance to obtain a similar type construction:

@dataclass
class Expr:
    # "union"
    foo: str | None = None
    bar: list["Expr"] | None = None


@dataclass
class Literal(Expr):
    pass


@dataclass
class Symbol(Expr):
    pass


@dataclass
class And(Expr):
    pass


@dataclass
class Or(Expr):
    pass

As an interesting exercise, I am wondering whether it is possible to obtain a similar effect, but with a different definition which avoids duplication. I came up with the following theoretical notation:

# simulate Haskell's
# data Expr = Literal String | Symbol String | And [Expr] | Or [Expr]
# the following will bring names into scope
(
    make_adt(
        type_cons="Expr",
        possible_fields=[
            ("foo", str),
            ("bar", list["Expr"]),
        ],
    )
    .add_data_cons("Literal", fields=["foo"])
    .add_data_cons("Symbol", fields=["foo"])
    .add_data_cons("And", fields=["bar"])
    .add_data_cons("Or", fields=["bar"])
)

Here I am saying that there's a base type (the Expr type constructor) with 4 data constructors: Literal, Symbol, And, Or.

Each data constructor takes an additional argument (either str or list[Expr]), which is referred in the fields argument above (must be a subset of the possible_fields).

So:

  • Literal("foo"): sets the foo field for the instance
  • And([Literal("foo"), Symbol("baz")]): sets the bar field for the instance

The constraint here, as opposed to plain inheritance, is that that Literal and Symbol don't have the bar field, and similarly, And, Or, don't have the foo field. Or, to relax this a bit, we at least have to enforce that only non-null attributes are the ones defined in the fields list above.

My questions are:

  • Can something like this be implemented?
    • I'm thinking along the lines of attrs and dynamic class creation using type(...).
  • How brittle it would be?

P.S. I know it does not necessarily make sense to over-engineer this, especially in a dynamically typed language like Python, but I consider it to be an interesting exercise nonetheless.

Alexandru Dinu
  • 1,159
  • 13
  • 24
  • looks like there are ways to [dynamically add class methods](https://stackoverflow.com/questions/17929543/how-can-i-dynamically-create-class-methods-for-a-class-in-python) but I've never seen "dynamic class creation" in python. you might have to resort to templated code generation – willwrighteng Jan 01 '23 at 23:03
  • 1
    Re: "dynamic class creation", my understanding is that one can leverage the built-in function `type`, like: `MyClass = type("MyClass", (object,), {})`, then instantiate objects: `foo = MyClass()`. From the [docs](https://docs.python.org/3/library/functions.html#type): With three arguments, return a new type object. This is essentially a dynamic form of the class statement. – Alexandru Dinu Jan 01 '23 at 23:30
  • 1
    @AlexandruDinu Yep. It's also possible to just use a `class` statement in a function and return that class for simple cases. – Carcigenicate Jan 02 '23 at 00:31
  • It's unspecific. For example, your second question - 'How brittle...' depends on the implementation. What about trying to implement yourself and ask a specific question? They use ```eval()``` or ```exec()``` most cases, see [```namedtuple()```](https://github.com/python/cpython/blob/3.11/Lib/collections/__init__.py#L348) for example. – relent95 Jan 02 '23 at 00:46
  • 1
    Although Python can allow you to dynamically inject symbols in a namespace, I suggest imitating the philosophy of C/C++ preprocessor directives and dynamically generate Python modules proper instead. Python 3.9+ has [`ast.unparse`](https://docs.python.org/3/library/ast.html#ast.unparse), which means you can use whatever more expressive syntax you'd like and use Python's `ast` library to generate Python modules proper. – dROOOze Jan 05 '23 at 20:32

2 Answers2

1

Your idea inspired me to take my own twist on the problem. My solution in its current state is somewhat hacky, and has its own limitations (you cannot name parameters), but I think it feels much nicer to write.

The machinery to set it up:

from typing import Callable, ParamSpec, List, get_args
from collections import namedtuple

class ADT(type):
    def __new__(metacls, name, bases, namespace):
        new_type = super().__new__(metacls, name, bases, namespace)
        for cons_name, cons_type in new_type.__annotations__.items():
            param_types, ret_type = get_args(cons_type)
            type_const = type(cons_name, (new_type, namedtuple(cons_name, [""]*len(param_types), rename=True)), {})
            type_const.__annotations__.update({cons_name: Callable[param_types, new_type]})
            setattr(new_type, cons_name, type_const)
        return new_type
    
    @classmethod
    def __prepare__(metacls, name, bases, **kwargs):
        namespace = {"TypeCons": Callable[ParamSpec("P"), name]}
        return namespace

Afterwards, you can use it like so:

class Expr(metaclass=ADT):
    Literal: TypeCons[str]
    Symbol:  TypeCons[str]
    And:     TypeCons[List["Expr"]]
    Or:      TypeCons[List["Expr"]]

Note the quoting of "Expr" is typical for self-referential type annotations in Python. I'm fairly certain I could hack something together to remove that limitation using reflection, but it would be way hackier than what I have above even.

Moving on, here's what it would look like in use, using Python's new match-case statements:

expr = Expr.Or([Expr.Literal("1"), Expr.Symbol("x")])

match expr:
    case Expr.Or([]):
        print("Won't match")
    case Expr.Or([Expr.Literal("1"), Expr.Symbol(sym)]):
        print("Matching with symbol:", sym)
    case _:
        print("Catch-all")

Which of course prints "Matching with symbol: x".

I had also considered using method stubs instead of type annotations, like below:

class Expr2(metaclass=ADT2):
    def Literal(foo: str):
        ...
    def Symbol(foo: str):
        ...
    def And(bar: List["Expr"]):
        ...
    def Or(bar: List["Expr"]):
        ...

-but ultimately decided on the annotation based approach (partially for being a bit cleaner, partially for the implementation challenge).

Dillon Davis
  • 6,679
  • 2
  • 15
  • 37
0

Here's an initial attempt which can pave the way to further refinements.

Let's start with a simpler ADT:

data List = Null | Cons Integer List

We can support this in Python by implementing the following:

from dataclasses import dataclass, make_dataclass


def make_adt(
    type_cons: str, 
    data_cons: dict[str, list[tuple[str, type]]]
) -> None:
    """
    data <type_cons> = <data_cons>

    Nothing is returned, instead, the global namespace is updated.
    """
    base = dataclass(type(type_cons, (object,), {}))
    globals()[type_cons] = base

    for name, ts in data_cons.items():
        derived = make_dataclass(name, ts, bases=(base,))
        globals()[name] = derived

So we can use this as:

make_adt(
    type_cons="List",
    data_cons={
        "Null": [],
        "Cons": [("x", int), ("xs", "List")],
    },
)

Furthermore:

def show(xs: List) -> str:
    match xs:
        case Null():
            return "<END>"
        case Cons(head, tail):
            return f"{head}:{show(tail)}"

lst = Cons(1, Cons(2, Cons(3, Null())))
print(lst)
# Cons(x=1, xs=Cons(x=2, xs=Cons(x=3, xs=Null())))

show(lst)
# '1:2:3:<END>'

My original ADT can be defined like:

make_adt(
    type_cons="Expr",
    data_cons={
        "Literal": [('x', str)],
        "Symbol": [('x', str)],
        "And": [('xs', list["Expr"])],
        "Or": [('xs', list["Expr"])],
    },
)

x = Literal('42')
y = Symbol('<foo>')
z = And([x, y])
q = Or([x, y])

print(z)
# And(xs=[Literal(x='42'), Symbol(x='<foo>')])
Alexandru Dinu
  • 1,159
  • 13
  • 24