I'm working with Apache Flink for data streaming and I have a few questions. Any help is greatly appreciated. Thanks.
1) Are there any restrictions on creating tumbling windows? For example, if I want a 2-second tumbling window per user ID and I have more than 10 million user IDs, would that be a problem? (I'm using keyBy on the user ID and then creating a timeWindow of 2 seconds.) How are these windows maintained internally in Flink?
2) I looked at rebalance for round-robin partitioning. Let's say I have a cluster set up, my source has a parallelism of 1, and I call rebalance after it. Will my data be shuffled across machines to improve performance? If so, is there a specific port over which the data is transferred to the other nodes in the cluster?
3) Are there any limitations on state maintenance? I'm planning to maintain some user-ID-related data which could grow very large. I read that Flink can use RocksDB to maintain state. Are there any limits on how much data can be kept there?
4) Also, where is the state maintained if the amount of data is small? (I guess in JVM memory.) If I have several machines in my cluster, can every node access the current version of the state?
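
To make question 1 concrete, here is how I currently picture per-key tumbling windows. This is a plain Java sketch of my mental model only, not Flink code and not a claim about how Flink implements windows. The class and method names are hypothetical, and it assumes non-negative millisecond timestamps and windows aligned to epoch 0.

```java
import java.util.HashMap;
import java.util.Map;

public class KeyedTumblingWindowSketch {
    // Start of the tumbling window that contains the timestamp.
    // Assumption: windows are aligned to epoch 0 and timestamps are >= 0.
    static long windowStart(long timestampMillis, long sizeMillis) {
        return timestampMillis - (timestampMillis % sizeMillis);
    }

    // Event counts per key and per window: userId -> (windowStart -> count).
    private final Map<String, Map<Long, Long>> counts = new HashMap<>();
    private final long sizeMillis;

    KeyedTumblingWindowSketch(long sizeMillis) {
        this.sizeMillis = sizeMillis;
    }

    // Assign the event to its window for this key and bump the count.
    void add(String userId, long timestampMillis) {
        long start = windowStart(timestampMillis, sizeMillis);
        counts.computeIfAbsent(userId, k -> new HashMap<>())
              .merge(start, 1L, Long::sum);
    }

    // Count for one key and one window, or 0 if nothing was recorded.
    long count(String userId, long start) {
        Map<Long, Long> perWindow = counts.get(userId);
        return perWindow == null ? 0L : perWindow.getOrDefault(start, 0L);
    }

    // Number of keys that currently hold at least one window.
    int activeKeys() {
        return counts.size();
    }

    public static void main(String[] args) {
        KeyedTumblingWindowSketch windows = new KeyedTumblingWindowSketch(2000L);
        windows.add("user-1", 500L);
        windows.add("user-1", 1900L);
        windows.add("user-1", 2100L);
        windows.add("user-2", 2500L);
        System.out.println(windows.count("user-1", 0L));
        System.out.println(windows.count("user-1", 2000L));
        System.out.println(windows.activeKeys());
    }
}
```

In this picture every active user ID holds its own entry per window, which is why I'm worried about memory with 10 million keys.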