
I have an EE app I want to deploy to 2 WildFly 13 instances in a cluster. I have an entity using @Cache (from Hibernate) and a @NamedQuery with hints to use the cache as well: the entity can be queried both by id (which uses @Cache) and by another query (which uses the query hint).

The cache region used for the hint is "replicated-query". I use WildFly 13, so I have Hibernate 5.1.14 (non EE 8 preview mode), Infinispan 9.2.4, JGroups 4.0.11 and Java 10 (we can't go to Java 11 because some libs we depend on still use methods removed from the Unsafe class). The app is 100+ EJBs and close to 150k LOC, so upgrading WildFly is not an option for the moment.

Problem is: the replicated cache is not replicating; it does not even start as replicated.

The questions "Infinispan replicated cache not replicating objects for read" and "Replicated infinispan cache with Wildfly 11" were not helpful.

I use JGroups with TCPPING (the app will be deployed on a private cloud and we need to keep network traffic as low as possible, so UDP is not an option). The cluster forms well between the 2 WildFly instances (confirmed by the logs and JMX), but the replicated cache does not start on deployment, as if it could not find a transport.

The cache name I use for the type "replicated-cache" makes no difference, including the pre-configured "replicated-query".

Using the "non deprecated configuration" for JGroups, as mentioned by Paul Ferraro here, did not allow the cluster to form (which in my case is a step back, because the cluster does form with my configuration).

One weird thing though: the UpdateTimestamps cache, configured as replicated, is replicating (confirmed by logs and JMX: the name of the region is suffixed with repl_async).

The caches are in invalidation_sync by default and work fine: the SQL query is only issued once with the same parameters (confirmed by logs and statistics).

For the moment (for test/debug purposes), I deploy both instances on my local machine: omega1 with a port offset of 20000 and omega2 with a port offset of 30000.

I haven't tried a distributed cache because, from what I read, I would face the same kind of issue.

Here is the relevant part of the entity:

@Entity
@Table(name = "my_entity", schema = "public")
@NamedQueries({
        @NamedQuery(name = "myEntityTest", query = "select p from MyEntity p where p.value = :val", hints = {
                @QueryHint(name = org.hibernate.annotations.QueryHints.CACHEABLE, value = "true"),
                @QueryHint(name = org.hibernate.annotations.QueryHints.CACHE_REGION, value = "RPL-myEntityTest")
        })
})
@Cache(usage = CacheConcurrencyStrategy.NONE, region = "replicated-entity")
public class MyEntity {

    @Id
    private Long id;

    private String value;

    // getters/setters omitted
}
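For reference, the same hints can also be applied at query time via Query#setHint. The sketch below only builds the hint map so it stays self-contained; the string keys are the values behind Hibernate's QueryHints.CACHEABLE and QueryHints.CACHE_REGION constants (inlined here as an assumption rather than imported):

```java
import java.util.LinkedHashMap;
import java.util.Map;

// Sketch: the query-cache hints from the @NamedQuery annotation, as they
// would be passed one by one to javax.persistence.Query#setHint at runtime.
public class CacheHintsSketch {

    public static Map<String, Object> queryCacheHints(String region) {
        Map<String, Object> hints = new LinkedHashMap<>();
        // String values of org.hibernate.annotations.QueryHints.CACHEABLE
        // and CACHE_REGION (assumed, not imported, to keep this standalone).
        hints.put("org.hibernate.cacheable", "true");
        hints.put("org.hibernate.cacheRegion", region);
        return hints;
    }

    public static void main(String[] args) {
        System.out.println(queryCacheHints("RPL-myEntityTest"));
    }
}
```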

Here is the jgroups subsystem portion of standalone-full-ha.xml:

        <subsystem xmlns="urn:jboss:domain:jgroups:6.0">
            <channels default="omega-ee">
                <channel name="omega-ee" stack="tcpping" cluster="omega-ejb" statistics-enabled="true"/>
            </channels>
            <stacks>
                <stack name="tcpping">
                    <transport type="TCP" statistics-enabled="true" socket-binding="jgroups-tcp"/>
                    <protocol type="org.jgroups.protocols.TCPPING">
                        <property name="port_range">10</property>
                        <property name="discovery_rsp_expiry_time">3000</property>
                        <property name="send_cache_on_join">true</property>
                        <property name="initial_hosts">localhost[27600],localhost[37600]</property>
                    </protocol>
                    <protocol type="MERGE3"/>
                    <protocol type="FD_SOCK"/>
                    <protocol type="FD_ALL"/>
                    <protocol type="VERIFY_SUSPECT"/>
                    <protocol type="pbcast.NAKACK2"/>
                    <protocol type="UNICAST3"/>
                    <protocol type="pbcast.STABLE"/>
                    <protocol type="pbcast.GMS"/>
                    <protocol type="MFC"/>
                    <protocol type="FRAG2"/>
                </stack>
            </stacks>
        </subsystem>

Here is the socket-binding for jgroups-tcp:

<socket-binding name="jgroups-tcp" interface="private" port="7600"/>
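The ports in initial_hosts come from WildFly adding the port offset to every socket binding, so the jgroups-tcp base port of 7600 becomes 27600 and 37600. A minimal sketch of that arithmetic (the offset values are the ones from this setup):

```java
// Sketch: WildFly applies jboss.socket.binding.port-offset to every
// socket binding, so the jgroups-tcp port shifts by each node's offset.
public class JGroupsPorts {

    static final int JGROUPS_TCP_BASE = 7600;

    static int effectivePort(int portOffset) {
        return JGROUPS_TCP_BASE + portOffset;
    }

    public static void main(String[] args) {
        System.out.println("omega1: " + effectivePort(20000)); // 27600
        System.out.println("omega2: " + effectivePort(30000)); // 37600
    }
}
```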

And this is the infinispan hibernate cache container section of standalone-full-ha.xml:

            <cache-container name="hibernate" module="org.infinispan.hibernate-cache">
                <transport channel="omega-ee" lock-timeout="60000"/>
                <local-cache name="local-query">
                    <object-memory size="10000"/>
                    <expiration max-idle="100000"/>
                </local-cache>
                <invalidation-cache name="entity">
                    <transaction mode="NON_XA"/>
                    <object-memory size="10000"/>
                    <expiration max-idle="100000"/>
                </invalidation-cache>
                <replicated-cache name="replicated-query">
                    <transaction mode="NON_XA"/>
                </replicated-cache>
                <replicated-cache name="RPL-myEntityTest" statistics-enabled="true">
                    <transaction mode="BATCH"/>
                </replicated-cache>
                <replicated-cache name="replicated-entity" statistics-enabled="true">
                    <transaction mode="NONE"/>
                </replicated-cache>
            </cache-container>

And I've set the following properties in persistence.xml:

        <properties>
            <property name="hibernate.dialect" value="org.hibernate.dialect.PostgreSQL9Dialect"/>
            <property name="hibernate.cache.use_second_level_cache" value="true"/>
            <property name="hibernate.cache.use_query_cache" value="true"/>

            <property name="hibernate.show_sql" value="true"/>
            <property name="hibernate.format_sql" value="true"/>
        </properties>
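For what it's worth, cache hit/miss counts are easiest to confirm with Hibernate's statistics enabled; if not already set elsewhere, the standard property would go alongside the ones above (a sketch, assuming the usual Hibernate 5 property name):

```xml
<property name="hibernate.generate_statistics" value="true"/>
```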

I expect:

  1. the replicated caches to start on deployment (maybe even on server start, since they are configured in the infinispan subsystem)

  2. cached data to be replicated between nodes on read, and invalidated cluster-wide on update/expiration/invalidation

  3. data to be retrieved from the cache (locally, because it should have been replicated).

I feel that I'm not so far from the expected result, but I'm missing something.

Any help will be much appreciated!

Update 1: I just tried what @Bela Ban suggested and set initial_hosts to localhost[7600] on both nodes, with no success: the cluster does not form. I use a port offset to start both nodes on my local machine to avoid port conflicts.

With localhost[7600] on both hosts, how would one node know which port to connect to on the other, since I need to use port offsets?

I even tried localhost[7600],localhost[37600] on the node I start with offset 20000, and localhost[7600],localhost[27600] on the one I start with offset 30000. The cluster forms but the cache does not replicate.

Update 2: The entity's cache is in invalidation_sync and works as expected, which means JGroups is working as expected and confirms the cluster is well formed, so my guess is the issue is Infinispan- or WildFly-related.

will
  • Might be silly, but have you tried removing `CacheConcurrencyStrategy.NONE`? Or using any of the other options? – Galder Zamarreño May 22 '19 at 13:39
  • The other important thing here is to figure out which Infinsipan instance is in use, whether the one provided by WildFly, or whether you're providing your own Infinispan version within your deployment. For a Hibernate Cache use case, the former is recommended. If using the former, [this example](https://github.com/infinispan/infinispan-simple-tutorials/tree/master/hibernate-cache/wildfly-local) should work as is. – Galder Zamarreño May 22 '19 at 13:42
  • well, the entity cache is fine and invalidating correctly already. Does the @Cache annotation influence some cache other than the entity one? If I don't provide Infinispan's jar, I get NoClassDefFoundError on org.infinispan.Cache (smells like a classloading issue, but I can't seem to fix it)... – will May 22 '19 at 14:06
  • and also tried with CacheConcurrencyStrategy.READ_WRITE, same result. – will May 23 '19 at 07:35
  • Can you share more details? e.g. your pom.xml – Galder Zamarreño May 24 '19 at 15:24
  • Apart from the pom.xml, if you can build a test case to replicate this that'd be great too. – Galder Zamarreño May 24 '19 at 15:26
  • I'll take some time to create a reproducer project and post the link to it. I'll let you know. – will Jun 04 '19 at 07:19
  • 1
    Hi, @will, did you managed to solve this issue? I am having similar with session cache, which is not replicated I think. – zygimantus May 28 '20 at 14:47

2 Answers


If you use port 7600 (in jgroups-tcp.xml), then listing ports 27600 and 37600 won't work: localhost[27600],localhost[37600] should be localhost[7600].

Bela Ban
  • 2,186
  • 13
  • 12
  • I just tried what you suggest and the cluster is not forming. I start both WildFly instances on my local machine with a port offset: 20000 for the first one and 30000 for the second. So how can one instance know on what port it should try to connect to the other? – will May 10 '19 at 12:49
  • I would start with an offset of 100 (if you are running on the same system) and a vanilla WildFly with just the change of UDP->TCP. If you deploy a simple EJB or web-distributable application, the cluster should start and the two nodes should form a cluster (both nodes need to have the application!). The initial_hosts should have both servers with the correct ip[port]. – wfink May 11 '19 at 10:37
  • that does not help either. My initial settings allow the JGroups cluster to form (the EJB cache is distributed), but the Hibernate replicated cache is not replicating. – will May 13 '19 at 16:04

As well as correcting the ports as indicated in the other answer, I think you need <global-state/> in your <cache-container>, e.g.:

       <cache-container name="hibernate" module="org.infinispan.hibernate-cache">
            <transport channel="omega-ee" lock-timeout="60000"/>
            <global-state/>
            <local-cache name="local-query">
                <object-memory size="10000"/>
       ...etc...
batwad