Questions tagged [flink-streaming]

Apache Flink is an open-source platform for scalable batch and stream data processing. Flink supports batch and streaming analytics in one system. Analytical programs can be written in concise and elegant APIs in Java and Scala.

Flink's streaming API provides rich semantics, including processing-time and event-time windows as well as stateful UDFs. Flink streaming uses a lightweight fault-tolerance mechanism with exactly-once processing guarantees.

Learn more about Apache Flink at the project website: https://flink.apache.org/

3185 questions
163 votes • 4 answers

What is/are the main difference(s) between Flink and Storm?

Flink has been compared to Spark, which, as I see it, is the wrong comparison because it compares a windowed event-processing system against micro-batching. Similarly, it does not make that much sense to me to compare Flink to Samza. In both cases…
fnl • 4,861 • 4 • 27 • 32
30 votes • 4 answers

could not find implicit value for evidence parameter of type org.apache.flink.api.common.typeinfo.TypeInformation[...]

I am trying to write some use cases for Apache Flink. One error I run into pretty often is: could not find implicit value for evidence parameter of type org.apache.flink.api.common.typeinfo.TypeInformation[SomeType]. My problem is that I can't really…
jheyd • 599 • 1 • 5 • 10
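The usual fix for this Scala-specific error is importing the Flink Scala API package object, whose createTypeInformation macro supplies the implicit evidence. A minimal sketch, assuming the Scala DataStream API (the word-count pipeline is only illustrative):

```scala
import org.apache.flink.streaming.api.scala._ // brings the implicit TypeInformation into scope

object TypeInfoExample {
  def main(args: Array[String]): Unit = {
    val env = StreamExecutionEnvironment.getExecutionEnvironment

    // Without the wildcard import above, this pipeline fails to compile with
    // "could not find implicit value for evidence parameter of type TypeInformation[(String, Int)]"
    val counts: DataStream[(String, Int)] = env
      .fromElements("to be or not to be")
      .flatMap(_.toLowerCase.split("\\s+"))
      .map(word => (word, 1))

    counts.print()
    env.execute("typeinfo-example")
  }
}
```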
21 votes • 2 answers

Some puzzles for the operator Parallelism in Flink

I just got the example below for parallelism and have some related questions: does setParallelism(5) set a parallelism of 5 just for sum, or for both flatMap and sum? Is it possible to set different parallelism for different…
YuFeng Shen • 1,475 • 1 • 17 • 41
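For orientation: setParallelism() attached to a single operator affects only that operator, while env.setParallelism() sets the job-wide default. A hedged sketch (operator choice and numbers are illustrative):

```scala
import org.apache.flink.streaming.api.scala._

object ParallelismExample {
  def main(args: Array[String]): Unit = {
    val env = StreamExecutionEnvironment.getExecutionEnvironment
    env.setParallelism(2) // default parallelism for every operator in this job

    env
      .fromElements("a a b", "b c c")
      .flatMap(_.split("\\s+"))
      .map(w => (w, 1)).setParallelism(4)   // only the map runs with 4 parallel subtasks
      .keyBy(_._1)
      .sum(1).setParallelism(5)             // only the sum runs with 5 parallel subtasks
      .print()

    env.execute("parallelism-example")
  }
}
```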
20 votes • 3 answers

Flink webui when running from IDE

I am trying to see my job in the web UI. I use createLocalEnvironmentWithWebUI; the code runs well in the IDE, but it is impossible to see my job at http://localhost:8081/#/overview val conf: Configuration = new Configuration() import…
GermainGum • 1,349 • 3 • 15 • 40
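The two usual ingredients are the flink-runtime-web dependency on the IDE classpath and createLocalEnvironmentWithWebUI. A hedged sketch, assuming a Flink 1.5+ setup where the port is configured through RestOptions (the socket source is just a placeholder to keep the job alive long enough to inspect):

```scala
// Requires flink-runtime-web on the IDE classpath, e.g. (sbt):
//   "org.apache.flink" %% "flink-runtime-web" % flinkVersion
import org.apache.flink.configuration.{Configuration, RestOptions}
import org.apache.flink.streaming.api.scala._

object LocalWebUiExample {
  def main(args: Array[String]): Unit = {
    val conf = new Configuration()
    conf.setInteger(RestOptions.PORT, 8081) // web UI at http://localhost:8081

    val env = StreamExecutionEnvironment.createLocalEnvironmentWithWebUI(conf)

    // A long-running source so the job stays visible in the UI.
    env
      .socketTextStream("localhost", 9999)
      .map(line => (line, 1))
      .keyBy(_._1)
      .sum(1)
      .print()

    env.execute("local-webui-example")
  }
}
```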
20 votes • 1 answer

How to implement HTTP sink correctly?

I want to send the calculation results of my DataStream flow to another service over HTTP. I see two possible ways to implement it: use the synchronous Apache HttpClient client in a sink public class SyncHttpSink extends…
Maxim • 9,701 • 5 • 60 • 108
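One frequently suggested shape is a RichSinkFunction that performs the request itself. The sketch below is a blocking variant built on plain java.net.HttpURLConnection with a placeholder endpoint; it deliberately ignores batching, retries and the asynchronous option raised in the question:

```scala
import java.io.OutputStream
import java.net.{HttpURLConnection, URL}
import java.nio.charset.StandardCharsets

import org.apache.flink.configuration.Configuration
import org.apache.flink.streaming.api.functions.sink.RichSinkFunction

// Minimal blocking HTTP sink: POSTs each record as a UTF-8 body.
// The endpoint URL passed in is a placeholder.
class SimpleHttpSink(endpoint: String) extends RichSinkFunction[String] {

  override def open(parameters: Configuration): Unit = {
    // A pooled client (e.g. Apache HttpClient) would be created here in a real job.
  }

  override def invoke(value: String): Unit = {
    val conn = new URL(endpoint).openConnection().asInstanceOf[HttpURLConnection]
    conn.setRequestMethod("POST")
    conn.setDoOutput(true)
    conn.setRequestProperty("Content-Type", "application/json")
    val out: OutputStream = conn.getOutputStream
    try out.write(value.getBytes(StandardCharsets.UTF_8)) finally out.close()
    // Reading the response code forces the request to complete.
    if (conn.getResponseCode >= 400) {
      throw new RuntimeException(s"HTTP sink got status ${conn.getResponseCode}")
    }
    conn.disconnect()
  }
}

// Usage: stream.addSink(new SimpleHttpSink("http://localhost:8080/ingest"))
```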
19 votes • 1 answer

How to output one data stream to different outputs depending on the data?

In Apache Flink I have a stream of tuples. Let's assume a really simple Tuple1. The tuple can have an arbitrary value in its value field (e.g. 'P1', 'P2', etc.). The set of possible values is finite, but I don't know the full set beforehand…
Jan Thomä • 13,296 • 6 • 55 • 83
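When the interesting values are known up front, side outputs via ProcessFunction and OutputTag are one way to fan the stream out; for a value set that is truly unknown beforehand, keyBy on the value is the usual alternative. A hedged sketch with illustrative tags "p1" and "p2":

```scala
import org.apache.flink.streaming.api.functions.ProcessFunction
import org.apache.flink.streaming.api.scala._
import org.apache.flink.util.Collector

object SideOutputExample {
  def main(args: Array[String]): Unit = {
    val env = StreamExecutionEnvironment.getExecutionEnvironment

    val p1Tag = OutputTag[String]("p1")
    val p2Tag = OutputTag[String]("p2")

    val main: DataStream[String] = env
      .fromElements("P1", "P2", "P1", "P3")
      .process(new ProcessFunction[String, String] {
        override def processElement(value: String,
                                    ctx: ProcessFunction[String, String]#Context,
                                    out: Collector[String]): Unit = value match {
          case "P1"  => ctx.output(p1Tag, value)
          case "P2"  => ctx.output(p2Tag, value)
          case other => out.collect(other) // everything else stays on the main stream
        }
      })

    main.getSideOutput(p1Tag).print("p1")
    main.getSideOutput(p2Tag).print("p2")
    main.print("rest")

    env.execute("side-output-example")
  }
}
```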
14 votes • 1 answer

Flink, how to set parallelism properly when using multiple Kafka source?

I still cannot get a clear idea of parallelism. Let's say we have a Flink cluster which has enough slots. In our Flink job, we consume 3 Kafka topics from 3 different Kafka clusters; each topic has 10 partitions. If we want to consume the message as…
gfytd • 1,747 • 2 • 24 • 47
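A common starting point is to give each source exactly as many subtasks as its topic has partitions, set per operator. A hedged sketch; broker addresses, topic names and the FlinkKafkaConsumer variant are placeholders that depend on the connector version in use:

```scala
import java.util.Properties

import org.apache.flink.api.common.serialization.SimpleStringSchema
import org.apache.flink.streaming.api.scala._
import org.apache.flink.streaming.connectors.kafka.FlinkKafkaConsumer

object KafkaParallelismExample {
  def main(args: Array[String]): Unit = {
    val env = StreamExecutionEnvironment.getExecutionEnvironment

    def kafkaSource(brokers: String, topic: String): FlinkKafkaConsumer[String] = {
      val props = new Properties()
      props.setProperty("bootstrap.servers", brokers)
      props.setProperty("group.id", "my-flink-job")
      new FlinkKafkaConsumer[String](topic, new SimpleStringSchema(), props)
    }

    // One source per cluster/topic; 10 subtasks each so every Kafka partition
    // gets its own consuming subtask (a single partition is never split further).
    val s1 = env.addSource(kafkaSource("cluster1:9092", "topic1")).setParallelism(10)
    val s2 = env.addSource(kafkaSource("cluster2:9092", "topic2")).setParallelism(10)
    val s3 = env.addSource(kafkaSource("cluster3:9092", "topic3")).setParallelism(10)

    s1.union(s2, s3).print()

    env.execute("kafka-parallelism-example")
  }
}
```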
12 votes • 1 answer

Flink: No operators defined in streaming topology. Cannot execute

I am trying to set up a very basic Flink job. When I try to run it, I get the following error: Caused by: java.lang.IllegalStateException: No operators defined in streaming topology. Cannot execute. at…
Luciano • 1,119 • 12 • 18
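execute() throws this IllegalStateException when nothing was registered on the environment, typically because no source/sink calls were made on the environment being executed (or a different environment instance was used). A minimal sketch of a topology that does run:

```scala
import org.apache.flink.streaming.api.scala._

object MinimalJob {
  def main(args: Array[String]): Unit = {
    val env = StreamExecutionEnvironment.getExecutionEnvironment

    // At least one source plus a sink (print, addSink, ...) must be registered
    // on *this* env before execute(), otherwise:
    // IllegalStateException: No operators defined in streaming topology.
    env
      .fromElements(1, 2, 3)
      .map(_ * 2)
      .print()

    env.execute("minimal-job")
  }
}
```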
11 votes • 2 answers

Apache Flink: Using filter() or split() to split a stream?

I have a DataStream from Kafka which has 2 possible values for a field in MyModel. MyModel is a POJO with domain-specific fields parsed from a message from Kafka. DataStream stream = env.addSource(myKafkaConsumer); I want to apply window…
Son • 877 • 1 • 12 • 22
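For two known values, the simplest pattern is two filter() calls on the same stream (split()/select() was later deprecated; side outputs are the other option). A hedged sketch with a placeholder MyModel case class:

```scala
import org.apache.flink.streaming.api.scala._

object FilterSplitExample {
  // Placeholder for the POJO parsed from Kafka.
  case class MyModel(category: String, payload: String)

  def main(args: Array[String]): Unit = {
    val env = StreamExecutionEnvironment.getExecutionEnvironment

    val stream: DataStream[MyModel] = env.fromElements(
      MyModel("A", "x"), MyModel("B", "y"), MyModel("A", "z"))

    // Two independent filters over the same input; each branch can get its own windowing.
    val streamA = stream.filter(_.category == "A")
    val streamB = stream.filter(_.category == "B")

    streamA.print("A")
    streamB.print("B")

    env.execute("filter-split-example")
  }
}
```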
11 votes • 2 answers

Apache Flume vs Apache Flink difference

I need to read a stream of data from some source (in my case it's a UDP stream, but it shouldn't matter), transform each record and write it to HDFS. Is there any difference between using Flume or Flink for this purpose? I know I can use…
Kateryna Khotkevych • 1,248 • 1 • 12 • 22
11 votes • 1 answer

Flink: How to handle external app configuration changes in flink

My requirement is to stream millions of records a day, and it has a huge dependency on external configuration parameters. For example, a user can go and change the required setting anytime in the web application, and after the change is made, the…
Neoster • 195 • 2 • 11
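A commonly recommended shape for this is the broadcast state pattern: configuration changes arrive as a second stream and are broadcast to every parallel instance of the main operator. A hedged sketch; the socket sources and the "key=value" update format are placeholders:

```scala
import org.apache.flink.api.common.state.MapStateDescriptor
import org.apache.flink.streaming.api.functions.co.BroadcastProcessFunction
import org.apache.flink.streaming.api.scala._
import org.apache.flink.util.Collector

object BroadcastConfigExample {
  def main(args: Array[String]): Unit = {
    val env = StreamExecutionEnvironment.getExecutionEnvironment

    // Main data stream and a (placeholder) stream of "key=value" config updates.
    val events: DataStream[String]        = env.socketTextStream("localhost", 9999)
    val configUpdates: DataStream[String] = env.socketTextStream("localhost", 9998)

    val configDescriptor =
      new MapStateDescriptor[String, String]("config", classOf[String], classOf[String])

    val result = events
      .connect(configUpdates.broadcast(configDescriptor))
      .process(new BroadcastProcessFunction[String, String, String] {

        override def processElement(value: String,
                                    ctx: BroadcastProcessFunction[String, String, String]#ReadOnlyContext,
                                    out: Collector[String]): Unit = {
          // Read the latest broadcast configuration while handling a data record.
          val mode = ctx.getBroadcastState(configDescriptor).get("mode")
          out.collect(s"[$mode] $value")
        }

        override def processBroadcastElement(update: String,
                                             ctx: BroadcastProcessFunction[String, String, String]#Context,
                                             out: Collector[String]): Unit = {
          // Config updates arrive here on every parallel instance.
          update.split("=", 2) match {
            case Array(k, v) => ctx.getBroadcastState(configDescriptor).put(k, v)
            case _           => // ignore malformed updates
          }
        }
      })

    result.print()
    env.execute("broadcast-config-example")
  }
}
```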
11 votes • 2 answers

Apache Flink vs Twitter Heron?

There are a lot of questions comparing Flink vs Spark Streaming, Flink vs Storm and Storm vs Heron. This question originates from the fact that both Apache Flink and Twitter Heron are true stream processing frameworks (not micro-batch, like…
experimenter • 878 • 1 • 11 • 27
11 votes • 1 answer

How to count unique words in a stream?

Is there a way to count the number of unique words in a stream with Flink Streaming? The result would be a stream of numbers which keeps increasing.
Jun • 639 • 10 • 25
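One way to do this, sketched below, is to key the stream by word, keep a per-word "seen" flag in keyed state so each word is emitted only once, and then run a global running sum over the de-duplicated stream (the socket source and all names are illustrative):

```scala
import org.apache.flink.api.common.functions.RichFlatMapFunction
import org.apache.flink.api.common.state.{ValueState, ValueStateDescriptor}
import org.apache.flink.configuration.Configuration
import org.apache.flink.streaming.api.scala._
import org.apache.flink.util.Collector

object UniqueWordCount {

  // Emits a word only the first time its key is seen, using keyed state.
  class FirstSeen extends RichFlatMapFunction[String, (String, Int)] {
    private var seen: ValueState[java.lang.Boolean] = _

    override def open(parameters: Configuration): Unit = {
      seen = getRuntimeContext.getState(
        new ValueStateDescriptor[java.lang.Boolean]("seen", classOf[java.lang.Boolean]))
    }

    override def flatMap(word: String, out: Collector[(String, Int)]): Unit = {
      if (seen.value() == null) {      // null => this word has not been counted yet
        seen.update(true)
        out.collect(("unique-words", 1))
      }
    }
  }

  def main(args: Array[String]): Unit = {
    val env = StreamExecutionEnvironment.getExecutionEnvironment

    env
      .socketTextStream("localhost", 9999)
      .flatMap(_.toLowerCase.split("\\W+").filter(_.nonEmpty))
      .keyBy(w => w)                   // state is scoped per distinct word
      .flatMap(new FirstSeen)
      .keyBy(_._1)                     // single artificial key => one global counter
      .sum(1)                          // emits an ever-increasing count of unique words
      .print()

    env.execute("unique-word-count")
  }
}
```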
11 votes • 1 answer

Flink: Sharing state in CoFlatMapFunction

I got stuck a bit with CoFlatMapFunction. It seems to work fine if I place it on the DataStream before the window, but it fails if placed after the window's “apply” function. I was testing two streams: the main “Features” on flatMap1 constantly ingesting data and…
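For reference, the usual way to share state between the two inputs of a CoFlatMapFunction is to key both connected streams by the same key and keep the shared value in keyed state; flatMap1 and flatMap2 then see the same state for a given key. A hedged sketch with placeholder feature/control streams:

```scala
import org.apache.flink.api.common.state.{ValueState, ValueStateDescriptor}
import org.apache.flink.configuration.Configuration
import org.apache.flink.streaming.api.functions.co.RichCoFlatMapFunction
import org.apache.flink.streaming.api.scala._
import org.apache.flink.util.Collector

object SharedStateCoFlatMap {

  // Input 1: (key, feature value). Input 2: (key, threshold). The threshold is
  // the "shared" piece of state both flatMap methods work against.
  class Enricher extends RichCoFlatMapFunction[(String, Double), (String, Double), String] {
    private var threshold: ValueState[java.lang.Double] = _

    override def open(parameters: Configuration): Unit = {
      threshold = getRuntimeContext.getState(
        new ValueStateDescriptor[java.lang.Double]("threshold", classOf[java.lang.Double]))
    }

    // Feature stream: compare against the latest threshold stored for this key.
    override def flatMap1(feature: (String, Double), out: Collector[String]): Unit = {
      val t = threshold.value()
      if (t != null && feature._2 > t) out.collect(s"${feature._1}: ${feature._2} exceeds $t")
    }

    // Control stream: update the per-key threshold.
    override def flatMap2(control: (String, Double), out: Collector[String]): Unit =
      threshold.update(control._2)
  }

  def main(args: Array[String]): Unit = {
    val env = StreamExecutionEnvironment.getExecutionEnvironment

    val features = env.fromElements(("sensor-1", 10.0), ("sensor-1", 42.0))
    val controls = env.fromElements(("sensor-1", 20.0))

    features.keyBy(_._1)
      .connect(controls.keyBy(_._1))   // both inputs keyed identically => same keyed state
      .flatMap(new Enricher)
      .print()

    env.execute("shared-state-coflatmap")
  }
}
```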
10 votes • 1 answer

Apache Flink: How to apply multiple counting window functions?

I have a stream of data that is keyed, and I need to compute counts for tumbling windows of different time periods (1 minute, 5 minutes, 1 day, 1 week). Is it possible to compute all four window counts in a single application?
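It is; the same keyed stream can feed several independent window operators, one per period. A hedged sketch using processing-time tumbling windows and a placeholder (key, 1) encoding:

```scala
import org.apache.flink.streaming.api.scala._
import org.apache.flink.streaming.api.windowing.assigners.TumblingProcessingTimeWindows
import org.apache.flink.streaming.api.windowing.time.Time

object MultiWindowCounts {
  def main(args: Array[String]): Unit = {
    val env = StreamExecutionEnvironment.getExecutionEnvironment

    // Placeholder input: one (key, 1) record per event.
    val counts = env
      .socketTextStream("localhost", 9999)
      .map(key => (key, 1))
      .keyBy(_._1)

    // Each window operator is independent; all four run in the same job.
    def windowed(size: Time) =
      counts.window(TumblingProcessingTimeWindows.of(size)).sum(1)

    windowed(Time.minutes(1)).print("1m")
    windowed(Time.minutes(5)).print("5m")
    windowed(Time.days(1)).print("1d")
    windowed(Time.days(7)).print("1w")

    env.execute("multi-window-counts")
  }
}
```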