Questions tagged [flink-statefun]

Stateful Functions is an API that simplifies building distributed stateful applications. It’s based on functions with persistent state that can interact dynamically with strong consistency guarantees. The runtime is built on Apache Flink®.

Stateful Functions Applications

A stateful function is a small piece of logic/code existing in multiple instances that represent entities — similar to actors. Functions are invoked through messages and are:

  • Stateful: Functions have embedded, fault-tolerant state, accessed locally like a variable.
  • Virtual: Much like FaaS, functions don't reserve resources — inactive functions don't consume CPU/Memory.
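
As a rough illustration of these two properties, here is a minimal in-memory sketch. It is a toy model, not the Stateful Functions SDK: `ToyRuntime`, `greeter`, and the `seen` field are hypothetical names, and no fault tolerance is modelled. It only shows state being read and written like a local variable, and instances that occupy nothing until they first receive a message.

```python
class ToyRuntime:
    """Toy in-memory stand-in for a runtime. NOT the Stateful Functions API."""

    def __init__(self, handler):
        self.handler = handler
        # State is keyed by instance id. An entry exists only after the
        # instance has handled at least one message ("virtual" instances).
        self.state = {}

    def send(self, instance_id, message):
        # Load this instance's state (empty if it was never invoked).
        state = self.state.get(instance_id, {})
        self.handler(state, message)
        # Persist the possibly updated state under the same id.
        self.state[instance_id] = state


def greeter(state, name):
    # State is accessed like a plain local variable.
    state["seen"] = state.get("seen", 0) + 1
    state["last_greeting"] = f"Hello {name}, visit #{state['seen']}"


runtime = ToyRuntime(greeter)
runtime.send("user-1", "Ada")
runtime.send("user-1", "Ada")
runtime.send("user-2", "Grace")

print(runtime.state["user-1"]["last_greeting"])  # Hello Ada, visit #2
print(sorted(runtime.state))  # ['user-1', 'user-2']
```

An instance such as "user-3" that never receives a message has no entry at all, which is the sense in which inactive functions consume nothing in this toy.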

Applications are composed of modules of multiple functions that can interact arbitrarily with:

  • Exactly-once Semantics: State and messaging go hand-in-hand, providing exactly-once message/state semantics.
  • Logical Addressing: Functions message each other by logical addresses. No service discovery needed.
  • Dynamic and Cyclic Messaging: Messaging patterns don't need to be pre-defined as dataflows (dynamic) and are also not restricted to DAGs (cyclic).
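
Logical addressing and cyclic messaging can be sketched with a small in-memory dispatcher. This is a toy under stated assumptions, not the real runtime: `ToyDispatcher`, the `(function type, id)` address tuples, and the `ping`/`pong` handlers are all hypothetical, and exactly-once delivery is not modelled. Functions address each other only by logical address, and the message flow forms a cycle instead of a DAG.

```python
from collections import deque


class ToyDispatcher:
    """Toy message router keyed by logical address. NOT the real runtime."""

    def __init__(self):
        self.functions = {}     # function type -> handler
        self.state = {}         # (function type, id) -> per-instance state
        self.mailbox = deque()  # pending (address, message) pairs
        self.log = []           # record of invocations, for inspection

    def register(self, fn_type, handler):
        self.functions[fn_type] = handler

    def send(self, address, message):
        # The sender only names a logical address; no host or port is involved.
        self.mailbox.append((address, message))

    def run(self):
        while self.mailbox:
            address, message = self.mailbox.popleft()
            fn_type, _ = address
            state = self.state.setdefault(address, {})
            self.functions[fn_type](self, address, state, message)


def ping(ctx, self_address, state, remaining):
    ctx.log.append(("ping", remaining))
    if remaining > 0:
        # Message the "pong" function with the same id: a cycle, not a DAG.
        ctx.send(("pong", self_address[1]), remaining - 1)


def pong(ctx, self_address, state, remaining):
    ctx.log.append(("pong", remaining))
    if remaining > 0:
        ctx.send(("ping", self_address[1]), remaining - 1)


dispatcher = ToyDispatcher()
dispatcher.register("ping", ping)
dispatcher.register("pong", pong)
dispatcher.send(("ping", "game-1"), 3)
dispatcher.run()

print(dispatcher.log)
# [('ping', 3), ('pong', 2), ('ping', 1), ('pong', 0)]
```

The routing is decided at run time by whatever address a handler chooses to send to, which is what "dynamic" means in this sketch.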

A Runtime built for Serverless Architectures

The Stateful Functions runtime is designed to provide a set of properties similar to what characterizes serverless functions, but applied to stateful problems.

The runtime is built on Apache Flink, with the following design principles:

  • Logical Compute/State Co-location: Messaging, state access/updates and function invocations are managed tightly together. This ensures a high level of consistency out-of-the-box.
  • Physical Compute/State Separation: Functions can be executed remotely, with message and state access provided as part of the invocation request. This way, functions can be managed like stateless processes and support rapid scaling, rolling upgrades and other common operational patterns.
  • Language Independence: Function invocations use a simple HTTP/gRPC-based protocol so that Functions can be easily implemented in various languages.
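
The idea of physical compute/state separation can be sketched as follows. This is a toy model only: the JSON request and response shapes, the field names (`target_id`, `state`, `messages`, `outgoing`), and the `printer/` target are assumptions made for illustration and are not the actual wire format. It shows a handler that keeps nothing between calls, because the runtime side ships the current state with each invocation request and stores whatever state comes back.

```python
import json


def remote_counter(request_json):
    """Stateless handler: everything it needs arrives in the request."""
    request = json.loads(request_json)
    count = request["state"].get("count", 0)
    outgoing = []
    for message in request["messages"]:
        count += message["amount"]
        # Hypothetical outgoing message addressed to another function.
        outgoing.append({"target": "printer/" + request["target_id"],
                         "payload": count})
    # Return the new state and outgoing messages; keep nothing locally.
    return json.dumps({"state": {"count": count}, "outgoing": outgoing})


class ToyRuntimeSide:
    """Owns the state and passes it to the handler on every invocation."""

    def __init__(self, handler):
        self.handler = handler
        self.state = {}  # target id -> state dict

    def invoke(self, target_id, messages):
        request = {
            "target_id": target_id,
            "state": self.state.get(target_id, {}),
            "messages": messages,
        }
        # Serialising to JSON stands in for a network round trip.
        response = json.loads(self.handler(json.dumps(request)))
        self.state[target_id] = response["state"]
        return response["outgoing"]


runtime_side = ToyRuntimeSide(remote_counter)
runtime_side.invoke("acct-1", [{"amount": 2}, {"amount": 3}])
out = runtime_side.invoke("acct-1", [{"amount": 5}])

print(runtime_side.state["acct-1"])  # {'count': 10}
print(out)  # [{'target': 'printer/acct-1', 'payload': 10}]
```

Because the handler holds no state of its own, any replica of it could serve the second call in this sketch, which is why such functions can be scaled or upgraded like stateless processes.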

89 questions
3
votes
1 answer

Flink: The program's entry point class not found in the jar file

I'm trying to deploy a Flink stateful function as a Flink jar and I followed the instructions here. However, I'm getting an error saying that the program entry point class was not found in the jar even after I added the dependency in my…
flint_stone
3
votes
1 answer

Stateful Functions in Apache Flink

I'm examining the new Stateful Functions 2.0 API of Apache Flink. I read the following documentation: https://ci.apache.org/projects/flink/flink-statefun-docs-stable/. I also ran the examples in the Git repo.…
2
votes
1 answer

Using Flink connector within Flink StateFun

I've managed to plug in the GCP PubSub dependency into the Flink Statefun JAR and then build the Docker image. I've added the below to the pom.xml. org.apache.flink
N P
2
votes
0 answers

Is there a way to customise a Kafka Deserializer in Kafka Ingress of Flink stateful function?

In my project, I am going to use the Flink StateFun Kafka ingress for consuming Avro-serialized records from Kafka, but it seems there is no config parameter for users to specify the deserializer for deserializing the Kafka record's key, in the source…
Xin Li
2
votes
1 answer

Need advice on migrating from Flink DataStream Job to Flink Stateful Functions 3.1

I have a working Flink job built on the Flink DataStream API. I want to REWRITE the entire job based on Flink Stateful Functions 3.1. The functions of my current Flink job are: read messages from Kafka; each message is in the format of a slice of data packets,…
Yun Xing
2
votes
1 answer

Flink Stateful Functions with an existing Flink application

I'd appreciate some advice around the use of Stateful functions. We are currently using Flink whereby we consume from a number of kafka streams, aggregate, run a computation and then output to a new stream. The problem is that the computation…
gambino
2
votes
1 answer

Storing a database or third-party connection in a Flink Stateful function module

I'm trying to understand the scope of a Flink StateFun module. Let's say I have a third-party service that requires establishing a connection first (e.g. with credentials, which takes a long time), and after that I can interact with it. I'm trying to…
Omid
2
votes
1 answer

is it possible to set flink `state.checkpoint.dir` programmatically?

We have flink-conf.yaml for local runs in our project. We'd like to be able to run flink locally for testing. Part of our team uses Macs and the other part, PCs. We'd like to set state.checkpoint.dir to some universally acceptable path, preferably…
Yar
2
votes
1 answer

Apache Flink Stateful Functions python vs java performance

What are the advantages and disadvantages of using Python or Java when developing Apache Flink Stateful Functions? Is there any performance difference? Which one is more efficient for the same operation? Can we develop the application completely on…
2
votes
1 answer

Stateful Functions Fault Tolerant Message Distribution in Apache Flink

I'm trying to implement a messaging scenario using Apache Flink Stateful Functions. By design, I need to calculate some statistics from incoming messages and store them in the states. After that scenario functions will access these states and messages…
2
votes
1 answer

Flink Statefun connections to Flink Table API

We are interested in connecting to a regular Flink Streaming application from the new Stateful Functions, ideally using the Table API. The idea is to consult tables registered in Flink from StateFun. Is this possible, and what is the right way to do…
Leo
1
vote
0 answers

Programmatically determining when Flink Stateful Functions has fully processed a batch of Kafka events

We have a manually-executed migration program that publishes a burst of Kafka events. Other programs also publish a steady trickle of events on the same Kafka topic. We have a Flink StateFun cluster that ingests records from the Kafka topic and…
Gabriel Deal
1
vote
0 answers

Multiple Flink Statefun jobs on the same Flink cluster

Is it possible to register multiple StateFun jobs on the same Flink cluster? I saw this configuration: "statefun.flink-job-name" (here), but it seems to be relevant to the Flink cluster configuration. So it looks like I can define a single job…
Raz Omessi
1
vote
0 answers

In StateFun with Apache DataStream examples how do we connect to remote Stateful Functions

I have been going through the link below explaining the usage of Stateful Functions with DataStream. https://github.com/apache/flink-statefun/blob/7ec6664ed7edf14d441110745df4c8d79b5d3abd/docs/content/docs/sdk/flink-datastream.md I am not able to…
1
vote
1 answer

What is the correct way to scale flink statefun remote function

Figure out what is the correct way to scale up the remote function. Figure out scaling relations between replicas of the remote function, Flink parallelism.default configuration, ingress topic partition counts together with message partition keys.…
joeyinso