I have a use case where I want to implement an AWS Kinesis Data Application with Flink in Java. It will listen to multiple Kinesis streams via the Data Streams API. However, the analysis of those streams will be done in Python (since our data scientists prefer Python).
From this answer, there appears to be support for calling Python UDFs from Java. However, I want to be able to convert an incoming stream to a table, via
StreamTableEnvironment tableEnv = StreamTableEnvironment.create(env);
Table sessionsTable = tableEnv.fromDataStream(inputStream);
...and then have a Python processor that is invoked to process that stream.
I really have 3 questions here:
- Is this a supported use case?
- If so, is there documentation that describes how to do so?
- If so, will this add significant overhead to the application?