Storing a database or third-party connection in a Flink Stateful function module

Question

I'm trying to understand the scope of a Flink Statefun module. Let's say I have a third-party service that requires establishing a connection first (e.g. exchanging credentials, which takes a long time), and only after that can I interact with it.

Specifically, should I create that connection separately for each of my functions, or can I create it once per module?

score 2 · Answer 1 · answered Jul 23 '21 at 15:31

When a module is instantiated, it will create one physical instance of each function type. Those functions are then multiplexed to support multiple ids under the hood. For expensive connections that should be shared, create them when the physical instance is created.

For example, let's say you are using the Java SDK (though this will look the same for any language SDK). The most natural place to create resources is in the supplier of the StatefulFunctionSpec, passing the resource into the Java class's constructor.

StatefulFunctionSpec spec =
        StatefulFunctionSpec.builder(GreeterFn.TYPE)
            .withValueSpec(GreeterFn.SEEN)
            .withSupplier(() -> {
                 var resource = createExpensiveResource();
                 return new GreeterFn(resource);
            })
            .build();
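
How often the SDK calls the supplier is not spelled out above, so the following is only a minimal sketch of a defensive variant: build the resource once and let the supplier lambda capture it, so every function object shares the same connection regardless of how many times the supplier runs. It uses plain Java with no StateFun dependency; `ExpensiveResource`, `sharedSupplier`, `handleAll`, and this simplified `GreeterFn` are hypothetical stand-ins for illustration, not SDK types.

```java
import java.util.ArrayList;
import java.util.List;
import java.util.concurrent.atomic.AtomicInteger;
import java.util.function.Supplier;

public class SharedResourceSketch {

    // Hypothetical stand-in for a slow-to-create client (not a StateFun type).
    static final class ExpensiveResource {
        // Counts how many times the expensive setup actually ran.
        static final AtomicInteger CREATED = new AtomicInteger();

        ExpensiveResource() {
            CREATED.incrementAndGet();
        }

        String call(String id) {
            return "handled " + id;
        }
    }

    // Hypothetical function class that receives the resource via its constructor.
    static final class GreeterFn {
        private final ExpensiveResource resource;

        GreeterFn(ExpensiveResource resource) {
            this.resource = resource;
        }

        String apply(String id) {
            return resource.call(id);
        }
    }

    // Creates the resource once; the returned supplier only captures it.
    static Supplier<GreeterFn> sharedSupplier() {
        ExpensiveResource resource = new ExpensiveResource();
        return () -> new GreeterFn(resource);
    }

    // Simulates a runtime that asks the supplier for a function on every message.
    static List<String> handleAll(List<String> ids) {
        Supplier<GreeterFn> supplier = sharedSupplier();
        List<String> results = new ArrayList<>();
        for (String id : ids) {
            results.add(supplier.get().apply(id));
        }
        return results;
    }

    public static void main(String[] args) {
        System.out.println(handleAll(List.of("alice", "bob", "carol")));
        System.out.println("resources created: " + ExpensiveResource.CREATED.get());
    }
}
```

In the spec above, the same idea would mean calling createExpensiveResource() before building the spec and referencing the result inside the withSupplier lambda.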

Thanks @Seth, would you please provide the entry point for that in Python as well? — Omid, Jul 24 '21 at 00:27
