TL;DR - The node name in the sessionId is not being updated to the current node name in the backup when the primary goes down.

Tomcat version - apache-tomcat-7.0.50

I have two nodes (2 instances of my application in 2 separate Tomcats) set up with session replication (sticky sessions are also in use). Below is the cluster config from server.xml, which sits inside the Engine tag. It is the same on both nodes, except for the port numbers:

<Cluster className="org.apache.catalina.ha.tcp.SimpleTcpCluster" channelSendOptions="8">
    <Manager className="org.apache.catalina.ha.session.DeltaManager"
             expireSessionsOnShutdown="false"
             notifyListenersOnReplication="true"/>

    <Channel className="org.apache.catalina.tribes.group.GroupChannel">
        <Membership className="org.apache.catalina.tribes.membership.McastService"
                    address="228.0.0.4"
                    port="45564"
                    frequency="500"
                    dropTime="3000"/>
        <Receiver className="org.apache.catalina.tribes.transport.nio.NioReceiver"
                  address="auto"
                  port="4050"
                  autoBind="100"
                  selectorTimeout="5000"
                  maxThreads="6"/>

        <Sender className="org.apache.catalina.tribes.transport.ReplicationTransmitter">
            <Transport className="org.apache.catalina.tribes.transport.nio.PooledParallelSender"/>
        </Sender>
        <Interceptor className="org.apache.catalina.tribes.group.interceptors.TcpFailureDetector"/>
        <Interceptor className="org.apache.catalina.tribes.group.interceptors.MessageDispatch15Interceptor"/>
        <Interceptor className="org.apache.catalina.tribes.group.interceptors.ThroughputInterceptor"/>
    </Channel>

    <Valve className="org.apache.catalina.ha.tcp.ReplicationValve"
           filter=""/>
    <Valve className="org.apache.catalina.ha.session.JvmRouteBinderValve"/>

    <Deployer className="org.apache.catalina.ha.deploy.FarmWarDeployer"
              tempDir="/tmp/war-temp/"
              deployDir="/tmp/war-deploy/"
              watchDir="/tmp/war-listen/"
              watchEnabled="false"/>
    <ClusterListener className="org.apache.catalina.ha.session.JvmRouteSessionIDBinderListener"/>
    <ClusterListener className="org.apache.catalina.ha.session.ClusterSessionListener"/>
</Cluster>

From the Tomcat Manager, I can see that a session (e.g. D042A0C5E380EB9E500224C87233119C.myNode1) is created on the primary node on login and replicated to the backup correctly.

But as soon as the primary node goes down, I expect the sessionId in the backup node to be updated with the current node name, i.e. D042A0C5E380EB9E500224C87233119C.myNode2

Example :

When user logs in :

Node 1 - Primary - jsessionIdSample.node1 
Node 2 - Backup - jsessionIdSample.node1 

When node 1 goes down (Expected) :

Node 1 - - jsessionIdSample.node1 (NODE GOES DOWN) 
Node 2 - Primary - jsessionIdSample.node2 

But what is happening :

Node 1 - - jsessionIdSample.node1 (NODE DOWN) 
Node 2 - Backup - jsessionIdSample.node1

I have two questions :

1) Is my understanding that the sessionID should be updated in the backup soon after the primary node goes down correct? I read the tomcat docs, and it seems it should.

2) If it should, can you please help me with the config to make this work? I have tried solutions from other questions on SO, but none of them seem to work.

Edit : Adding complete engine config as per suggestion :

Node 1

<Engine name="Catalina" defaultHost="localhost" jvmRoute="node1">
  <Cluster className="org.apache.catalina.ha.tcp.SimpleTcpCluster">
      <Manager className="org.apache.catalina.ha.session.DeltaManager"
               expireSessionsOnShutdown="false"
               notifyListenersOnReplication="true"/>

      <Channel className="org.apache.catalina.tribes.group.GroupChannel">
        <Membership className="org.apache.catalina.tribes.membership.McastService"
                    address="228.0.0.4"
                    port="45564"
                    frequency="500"
                    dropTime="3000"/>
        <Receiver className="org.apache.catalina.tribes.transport.nio.NioReceiver"
                  address="228.0.0.4"
                  port="4005"
                  autoBind="100"
                  selectorTimeout="5000"
                  maxThreads="6"/>

        <Sender className="org.apache.catalina.tribes.transport.ReplicationTransmitter">
          <Transport className="org.apache.catalina.tribes.transport.nio.PooledParallelSender"/>
        </Sender>
        <Interceptor className="org.apache.catalina.tribes.group.interceptors.TcpFailureDetector"/>
        <Interceptor className="org.apache.catalina.tribes.group.interceptors.MessageDispatch15Interceptor"/>
      </Channel>

      <Valve className="org.apache.catalina.ha.tcp.ReplicationValve"
             filter=""/>
      <Valve className="org.apache.catalina.ha.session.JvmRouteBinderValve"/>

      <Deployer className="org.apache.catalina.ha.deploy.FarmWarDeployer"
                tempDir="/tmp/war-temp/"
                deployDir="/tmp/war-deploy/"
                watchDir="/tmp/war-listen/"
                watchEnabled="false"/>
      <ClusterListener className="org.apache.catalina.ha.session.JvmRouteSessionIDBinderListener"/>
      <ClusterListener className="org.apache.catalina.ha.session.ClusterSessionListener"/>
  </Cluster>

  <Realm className="org.apache.catalina.realm.LockOutRealm">
    <Realm className="org.apache.catalina.realm.UserDatabaseRealm"
           resourceName="UserDatabase"/>
  </Realm>

  <Host name="localhost" appBase="webapps"
        unpackWARs="true" autoDeploy="true">
    <Valve className="org.apache.catalina.valves.AccessLogValve" directory="logs"
           prefix="localhost_access_log." suffix=".txt"
           pattern="%h %l %u %t &quot;%r&quot; %s %b" />
  </Host>
</Engine>

Node 2

<Engine name="Catalina" defaultHost="localhost" jvmRoute="node2">
  <Cluster className="org.apache.catalina.ha.tcp.SimpleTcpCluster">
      <Manager className="org.apache.catalina.ha.session.DeltaManager"
               expireSessionsOnShutdown="false"
               notifyListenersOnReplication="true"/>

      <Channel className="org.apache.catalina.tribes.group.GroupChannel">
        <Membership className="org.apache.catalina.tribes.membership.McastService"
                    address="228.0.0.4"
                    port="45564"
                    frequency="500"
                    dropTime="3000"/>
        <Receiver className="org.apache.catalina.tribes.transport.nio.NioReceiver"
                  address="228.0.0.4"
                  port="4010"
                  autoBind="100"
                  selectorTimeout="5000"
                  maxThreads="6"/>

        <Sender className="org.apache.catalina.tribes.transport.ReplicationTransmitter">
          <Transport className="org.apache.catalina.tribes.transport.nio.PooledParallelSender"/>
        </Sender>
        <Interceptor className="org.apache.catalina.tribes.group.interceptors.TcpFailureDetector"/>
        <Interceptor className="org.apache.catalina.tribes.group.interceptors.MessageDispatch15Interceptor"/>
      </Channel>

      <Valve className="org.apache.catalina.ha.tcp.ReplicationValve"
             filter=""/>
      <Valve className="org.apache.catalina.ha.session.JvmRouteBinderValve"/>

      <Deployer className="org.apache.catalina.ha.deploy.FarmWarDeployer"
                tempDir="/tmp/war-temp/"
                deployDir="/tmp/war-deploy/"
                watchDir="/tmp/war-listen/"
                watchEnabled="false"/>
      <ClusterListener className="org.apache.catalina.ha.session.JvmRouteSessionIDBinderListener"/>
      <ClusterListener className="org.apache.catalina.ha.session.ClusterSessionListener"/>
  </Cluster>

  <Realm className="org.apache.catalina.realm.LockOutRealm">
    <Realm className="org.apache.catalina.realm.UserDatabaseRealm"
           resourceName="UserDatabase"/>
  </Realm>

  <Host name="localhost" appBase="webapps"
        unpackWARs="true" autoDeploy="true">
    <Valve className="org.apache.catalina.valves.AccessLogValve" directory="logs"
           prefix="localhost_access_log." suffix=".txt"
           pattern="%h %l %u %t &quot;%r&quot; %s %b" />
  </Host>
</Engine>

Thanks in advance!

bub

1 Answer

Not sure if you're mixing things up a bit - e.g. you say

When user logs in :

Node 1 - Primary - jsessionIdSample.node1 
Node 2 - Backup -  jsessionIdSample.node1

Shouldn't Node 2 use a session id ending in .node2?

The session id used IMHO is coming from the jvmRoute attribute in <Engine> and sticks to the machine - when Node 1 is configured with <Engine jvmRoute="node1">, that's what Node 1 knows. Unfortunately you're not quoting your Engine configuration above.

The jvmRoute is merely a hint to the load balancer about which machine to route to, so it must be stable and reliable. It might be as simple as making extra, extra sure that your Node 2 has jvmRoute="node2" configured. I've never seen any different behaviour in Tomcat.

Even after your updates to the question I have the feeling that something is weird - see the quoted part in my answer, where Node 1 and Node 2 both use "node1" as the route indicator. That might be the culprit (or just a typo in the question). If you're using Apache httpd, each jvmRoute has to match the name of the corresponding worker in workers.properties (at the very least, the names have to be unique. See https://tomcat.apache.org/connectors-doc/reference/workers.html)
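To make the "hint to the load balancer" point concrete, here is a minimal sketch of sticky routing by session id suffix. It is only an illustration under my own assumptions: the class `StickyRouter` and its methods are hypothetical and are not mod_jk or Tomcat code. It shows why the worker names and the jvmRoute values need to line up and be unique.

```java
import java.util.LinkedHashMap;
import java.util.Map;

// Hypothetical illustration only; this is not mod_jk or Tomcat code.
final class StickyRouter {
    // Worker name -> whether that worker is currently reachable.
    private final Map<String, Boolean> workers = new LinkedHashMap<>();

    void setWorker(String name, boolean alive) {
        workers.put(name, alive);
    }

    // Returns the part after the first '.', or null when the id carries no route.
    static String routeOf(String sessionId) {
        int dot = sessionId.indexOf('.');
        if (dot < 0 || dot == sessionId.length() - 1) {
            return null;
        }
        return sessionId.substring(dot + 1);
    }

    // Prefer the worker named in the session id; otherwise fall back
    // to the first worker that is still alive.
    String pickWorker(String sessionId) {
        String route = routeOf(sessionId);
        if (route != null && Boolean.TRUE.equals(workers.get(route))) {
            return route;
        }
        for (Map.Entry<String, Boolean> entry : workers.entrySet()) {
            if (entry.getValue()) {
                return entry.getKey();
            }
        }
        return null; // no worker available
    }
}
```

In this model, a request carrying `jsessionIdSample.node1` keeps going to `node1` while it is alive, and only falls over to another worker once `node1` is marked down. If two workers shared the same name, the suffix could no longer identify one machine.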

Olaf Kock
  • Hi @Olaf, thank you for your response, I will update the question with the engine config. The jvmRoute attribute is as follows : Node 1 - node1, Node 2 - node2 – bub Nov 02 '16 at 12:09
  • Hi @Olaf, I have added the complete engine config in server.xml to the question. I am using sticky sessions in my Apache workers.properties. When the user logs in, his request is directed to node1, and all further requests continue to be served from node1. When the jsessionid is created in node1, the same id is replicated to node2 (jsessionIdSample.node1). I also tried this without an app deployed, i.e. by just logging into Tomcat Manager, and noticed that it replicated the session id with the jvmRoute of the primary node (jsessionIdSample.node1). Is this not supposed to happen? – bub Nov 02 '16 at 12:22
  • I am expecting the jvmRoute in the jsessionid to change when node1 is brought down or loses membership of the multicast group. In this case, node2 will be primary, and the jsessionid will be changed to jsessionIdSample.node2. This is happening, but only AFTER the first request has been served. Ex. when node1 goes down, the next request is sent to node2 with the session id jsessionIdSample.node1 (which causes a session timeout), but after this request is served, the session id is changed to jsessionIdSample.node2. I am expecting this change to be done as soon as node1 loses membership of the multicast group. – bub Nov 02 '16 at 12:29
  • Hi @Olaf, is my understanding of how JvmRouteSessionIDBinderListener works correct? – bub Nov 09 '16 at 06:11
  • Without looking at the implementation, I'd expect the first request after node1 shuts down to still carry the node1 session-id - because it's stored in a cookie and nobody has changed the cookie yet. With the response there can be a new Set-Cookie directive that will change the cookie to now have a node2 id. I'd expect node2 to find the node1 session though. I'll admit that I typically don't use session replication in clusters - thus if there are issues, I'm not running into them – Olaf Kock Nov 09 '16 at 07:32
  • @bub : did your issue get solved? I am facing the same problem. – Harsh Kanakhara Apr 25 '17 at 13:01
  • @HarshKanakhara : no, it did not get solved. Please tell me if you were able to solve it. If I understand correctly, the JvmRouteSessionIDBinderListener is supposed to update the route name in the session id when the active session goes down, but it is not happening immediately, which leads to invalidation of the active session :( – bub May 02 '17 at 06:56
  • @bub: error still remains. I am still not getting how the JvmRouteSessionIDBinderListener works ? – Harsh Kanakhara May 02 '17 at 13:32
  • @HarshKanakhara , same here :(. Please share this question, so we can hopefully find an answer. – bub May 04 '17 at 13:43