19

We are using a Kafka broker setup with a Kafka Streams application that runs on Spring Cloud Stream Kafka. Although it seems to run fine, we do get the following error statements in our log:

2019-02-21 22:37:20,253 INFO kafka-coordinator-heartbeat-thread | anomaly-timeline org.apache.kafka.clients.FetchSessionHandler [Consumer clientId=anomaly-timeline-56dc4481-3086-4359-a8e8-d2dae12272a2-StreamThread-1-consumer, groupId=anomaly-timeline] Node 2 was unable to process the fetch request with (sessionId=1290440723, epoch=2089): INVALID_FETCH_SESSION_EPOCH. 

I searched the internet, but there is not much information on this error. I guessed that it could have something to do with a difference in time settings between the broker and the consumer, but both machines have the same time server settings.

Any idea how this can be resolved?

mmelsen

5 Answers

13

There is a concept of a fetch session, introduced with KIP-227 since the 1.1.0 release: https://cwiki.apache.org/confluence/display/KAFKA/KIP-227%3A+Introduce+Incremental+FetchRequests+to+Increase+Partition+Scalability

Kafka brokers, which are replica followers, fetch messages from the leader; consumers use the same fetch protocol, which is why the error shows up on the client side. In order to avoid sending full metadata each time for all partitions, only the partitions that changed are sent within the same fetch session.

When we look into Kafka's code, we can see an example of when this is returned:

if (session.epoch != expectedEpoch) {
  info(s"Incremental fetch session ${session.id} expected epoch $expectedEpoch, but " +
    s"got ${session.epoch}.  Possible duplicate request.")
  new FetchResponse(Errors.INVALID_FETCH_SESSION_EPOCH, new FetchSession.RESP_MAP, 0, session.id)
} else {

src: https://github.com/axbaretto/kafka/blob/ab2212c45daa841c2f16e9b1697187eb0e3aec8c/core/src/main/scala/kafka/server/FetchSession.scala#L493

In general, if you don't have thousands of partitions and this doesn't happen very often, it shouldn't worry you.

tgrez
    Unfortunately I don't think this is related to network issues as I'm experiencing them also on my local docker setup: anomaly-timeline-2 | 2019-02-22 14:45:39,593 INFO anomaly-timeline-db8558f2-cb17-4a87-b4ba-fe0fd1c47ec0-StreamThread-1 org.apache.kafka.clients.FetchSessionHandler [Consumer clientId=anomaly-timeline-db8558f2-cb17-4a87-b4ba-fe0fd1c47ec0-StreamThread-1-consumer, groupId=anomaly-timeline] Node 1001 was unable to process the fetch request with (sessionId=593140062, epoch=65): INVALID_FETCH_SESSION_EPOCH. – mmelsen Feb 22 '19 at 15:02
    which Kafka version do you use? – tgrez Feb 22 '19 at 15:26
    Kafka version: 2.0.1 Kafka commitId: fa14705e51bd2ce5 – mmelsen Feb 22 '19 at 16:12
  • Do you have an idea what is causing this? Would it help to post my consumer settings? – mmelsen Mar 04 '19 at 21:00
  • It's possible that you have some clients that are reading from a broker during the time the logs are being rotated based on the retention schedule; this will also cause you to receive the INVALID_FETCH_SESSION_EPOCH, as the partition in question happens to no longer exist in the log segment. – zen Mar 13 '19 at 18:00
  • This is not a network issue; the answer by Dan M is the accepted one. – Utsav Jha Jan 13 '20 at 11:39
10

It seems this might be caused by the KAFKA-8052 issue (https://issues.apache.org/jira/browse/KAFKA-8052), which was fixed in Kafka 2.3.0.
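
For reference, a minimal sketch of what pinning the client libraries to the broker version could look like in a Gradle build (the coordinates are the standard org.apache.kafka artifacts; 2.3.0 is taken from the fix version of KAFKA-8052, so adjust it to whatever your broker actually runs):

dependencies {
    // assumption: a plain Gradle build; keep the Kafka client artifacts in line with the broker version
    implementation 'org.apache.kafka:kafka-clients:2.3.0'
    implementation 'org.apache.kafka:kafka-streams:2.3.0'
}

With Spring Cloud Stream it is usually cleaner to bump the Spring BOM instead, so that it pulls in matching client versions for you (the last answer shows the BOM approach, albeit for a downgrade).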

Dan M
  • I recently upgraded my Kafka clients to match the broker, which was at least 2.3.0. That indeed solved the issue, thanks! – mmelsen Jul 08 '20 at 06:26
1

Indeed, you can get this message when log rolling or retention-based deletion occurs, as zen pointed out in the comments. It's not a problem if it doesn't happen all the time. If it does, check your log.roll and log.retention configurations.
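
For reference, a sketch of the relevant entries as they would appear in the broker's server.properties, shown here with the Kafka default values purely for illustration (not as recommendations):

# roll a new log segment after this many hours, even if it is not full
log.roll.hours=168
# retention-based deletion: time limit and size limit (-1 means no size limit)
log.retention.hours=168
log.retention.bytes=-1
# roll a new segment once the current one reaches this size (1 GiB)
log.segment.bytes=1073741824

If the log statements correlate with segment rolls or retention deletions, they are most likely the expected noise described above rather than a misconfiguration.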

yuranos
1

Updating the client version to 2.3 (the same version as the broker) fixed it for me.

Guilherme Torres Castro
  • This worked for me too. We were running broker version `2.3.0` but our clients were using `2.2.1`. Upgrading the clients to version `2.3.1` made these `INVALID_FETCH_SESSION_EPOCH` logs stop. – Jesse Webb Apr 24 '20 at 16:41
0

In our case, the root cause was Kafka broker-client incompatibility. If your cluster is behind the client version, you might see all kinds of odd problems such as this one.

Our Kafka broker was on 1.x.x and our Kafka consumer was on 2.x.x. As soon as we downgraded our spring-cloud-dependencies to Finchley.RELEASE, our problem was solved.

dependencyManagement {
    imports {
        mavenBom "org.springframework.cloud:spring-cloud-dependencies:Finchley.RELEASE"
    }
}
so-random-dude