
We've recently started experimenting with deployments in WebLogic 12c using the weblogic.Deployer utility. We can deploy an EAR fine, but whenever we try to undeploy that application while the Managed Server is still running, the server starts using 100% of our CPU (4-core Xeon, bare metal).

After some tinkering and countless thread dumps, we could narrow the problem down to 4 stuck threads. Each of them consumed 100% of one core. The load average would jump from around 0.10 to 4.00 within 5 minutes at most.
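For anyone trying to reproduce this kind of isolation, here is a minimal sketch of ranking the threads of a JVM by accumulated CPU time with the standard `ThreadMXBean`. The class and method names (`BusyThreads`, `topByCpu`) are made up for illustration, and the sketch assumes it runs inside the JVM you want to inspect (for a Managed Server that would mean calling it from deployed code or reading the same MXBean over JMX).

```java
import java.lang.management.ManagementFactory;
import java.lang.management.ThreadInfo;
import java.lang.management.ThreadMXBean;
import java.util.ArrayList;
import java.util.Collections;
import java.util.Comparator;
import java.util.List;

// Hypothetical helper, not part of WebLogic: lists the busiest threads of the current JVM.
public class BusyThreads {

    /** A thread name paired with its accumulated CPU time in nanoseconds. */
    public static final class Usage {
        public final String name;
        public final long cpuNanos;

        public Usage(String name, long cpuNanos) {
            this.name = name;
            this.cpuNanos = cpuNanos;
        }
    }

    /** Returns up to {@code limit} live threads of this JVM, busiest first. */
    public static List<Usage> topByCpu(int limit) {
        ThreadMXBean mx = ManagementFactory.getThreadMXBean();
        List<Usage> result = new ArrayList<Usage>();
        if (!mx.isThreadCpuTimeSupported() || !mx.isThreadCpuTimeEnabled()) {
            return result; // CPU time measurement is unavailable on this JVM
        }
        for (long id : mx.getAllThreadIds()) {
            long cpu = mx.getThreadCpuTime(id);     // -1 if the thread is no longer alive
            ThreadInfo info = mx.getThreadInfo(id); // null if the thread is no longer alive
            if (cpu >= 0 && info != null) {
                result.add(new Usage(info.getThreadName(), cpu));
            }
        }
        Collections.sort(result, new Comparator<Usage>() {
            @Override
            public int compare(Usage a, Usage b) {
                return Long.compare(b.cpuNanos, a.cpuNanos); // descending by CPU time
            }
        });
        int end = Math.min(Math.max(limit, 0), result.size());
        return new ArrayList<Usage>(result.subList(0, end));
    }

    public static void main(String[] args) {
        for (Usage u : topByCpu(5)) {
            System.out.printf("%-60s %d ms%n", u.name, u.cpuNanos / 1000000L);
        }
    }
}
```

Calling it twice a few seconds apart and comparing the numbers shows which threads are actually accumulating CPU time between the two samples.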

This is one of the threads that seem to be stuck:

"ExecuteThread: '3' for queue: 'weblogic.socket.Muxer'" daemon prio=10 tid=0x00007fb52801c800 nid=0x6bf0 runnable [0x00007fb58a0ad000]
   java.lang.Thread.State: RUNNABLE
        at sun.nio.ch.EPollArrayWrapper.epollWait(Native Method)
        at sun.nio.ch.EPollArrayWrapper.poll(EPollArrayWrapper.java:269)
        at sun.nio.ch.EPollSelectorImpl.doSelect(EPollSelectorImpl.java:79)
        at sun.nio.ch.SelectorImpl.lockAndDoSelect(SelectorImpl.java:87)
        - locked <0x00000000e18c66d0> (a sun.nio.ch.Util$2)
        - locked <0x00000000e18c66c0> (a java.util.Collections$UnmodifiableSet)
        - locked <0x00000000e18c6598> (a sun.nio.ch.EPollSelectorImpl)
        at sun.nio.ch.SelectorImpl.select(SelectorImpl.java:98)
        at sun.nio.ch.SelectorImpl.select(SelectorImpl.java:102)
        at weblogic.socket.NIOSocketMuxer.selectFrom(NIOSocketMuxer.java:541)
        at weblogic.socket.NIOSocketMuxer.processSockets(NIOSocketMuxer.java:470)
        at weblogic.socket.SocketReaderRequest.run(SocketReaderRequest.java:30)
        at weblogic.socket.SocketReaderRequest.execute(SocketReaderRequest.java:43)
        at weblogic.kernel.ExecuteThread.execute(ExecuteThread.java:147)
        at weblogic.kernel.ExecuteThread.run(ExecuteThread.java:119)

I've seen many people with the same problem (not with WebLogic, though):

https://github.com/netty/netty/issues/327

https://issues.jboss.org/browse/XNIO-172

Why does select() consume so much CPU time in my program?
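The reports linked above describe a selector whose select() call keeps returning right away with nothing to do, so the loop around it spins. Whether that is exactly what happens inside WebLogic's muxer is an assumption on my part; the muxer is internal, so the following only illustrates how one's own NIO loop could notice the pattern. `SpinDetector` and its threshold are hypothetical names and values, not taken from any of the linked projects.

```java
// Hypothetical helper: flags a select() loop that keeps returning early with no ready keys.
public class SpinDetector {
    private final int threshold;
    private int consecutiveSpins;

    public SpinDetector(int threshold) {
        if (threshold < 1) {
            throw new IllegalArgumentException("threshold must be >= 1");
        }
        this.threshold = threshold;
    }

    /**
     * Records the outcome of one select(timeout) call.
     *
     * @param readyKeys    value returned by select()
     * @param elapsedNanos how long the call actually blocked
     * @param timeoutNanos the timeout that was passed to select()
     * @return true once {@code threshold} consecutive calls returned no keys
     *         in less than half of their timeout
     */
    public boolean record(int readyKeys, long elapsedNanos, long timeoutNanos) {
        boolean premature = readyKeys == 0 && elapsedNanos < timeoutNanos / 2;
        consecutiveSpins = premature ? consecutiveSpins + 1 : 0;
        return consecutiveSpins >= threshold;
    }

    /** Clears the counter, e.g. after the selector has been reopened. */
    public void reset() {
        consecutiveSpins = 0;
    }

    // Intended use inside a selector loop (sketch):
    //   long start = System.nanoTime();
    //   int ready = selector.select(timeoutMillis);
    //   long timeoutNanos = java.util.concurrent.TimeUnit.MILLISECONDS.toNanos(timeoutMillis);
    //   if (detector.record(ready, System.nanoTime() - start, timeoutNanos)) {
    //       // log a warning, or close the selector and register the channels on a new one
    //       detector.reset();
    //   }
}
```

A legitimate `wakeup()` also makes select() return early with zero keys, which is why the sketch only reacts to a run of consecutive early returns rather than a single one.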

I don't think this is happening because of an old JDK version. java -version says:

java version "1.7.0_67"
Java(TM) SE Runtime Environment (build 1.7.0_67-b01)
Java HotSpot(TM) 64-Bit Server VM (build 24.65-b04, mixed mode)

I googled a bit but did not find anything about this. Do any WebLogic experts know what could be causing this problem?

Thanks a lot!

Gustavo Ramos

3 Answers


I faced the same issue. I managed to solve it by using the following settings:

1. Use the POSIX muxer:

set('MuxerClass', 'weblogic.socket.PosixSocketMuxer')

See the WebLogic tuning documentation.

2. Add startup arguments:

-Djava.nio.channels.spi.SelectorProvider=sun.nio.ch.PollSelectorProvider -DUseSunHttpHandler=true
  • sun.nio.ch.PollSelectorProvider uses Linux poll instead of epoll_wait

  • -DUseSunHttpHandler=true bypasses WebLogic's HTTP socket implementation
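To check that the selector setting was actually picked up, a small sketch like the following prints which SelectorProvider the JVM ended up with. `ShowSelectorProvider` is a name I made up; the only API used is the standard `SelectorProvider.provider()`. Run it with the same `-Djava.nio.channels.spi.SelectorProvider=...` argument as the server, and note that it is an assumption that the provider class named above exists in the JDK you are running.

```java
import java.nio.channels.spi.SelectorProvider;

// Hypothetical check: shows the requested and the effective NIO selector provider.
public class ShowSelectorProvider {

    /** Class name of the system-wide default selector provider of this JVM. */
    public static String activeProvider() {
        return SelectorProvider.provider().getClass().getName();
    }

    public static void main(String[] args) {
        // The property is null when the startup argument was not passed.
        String requested = System.getProperty("java.nio.channels.spi.SelectorProvider");
        System.out.println("requested: " + requested);
        System.out.println("active:    " + activeProvider());
    }
}
```

If "active" does not match "requested", the argument did not reach the JVM that prints it, for example because it was added to the wrong start script.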

Omar MEBARKI

After much tinkering, an almost sleepless night and googling till I bled, I'm almost sure I got it solved.

This solution is heavily based on another thread: https://stackoverflow.com/a/7827952/1484232

To summarize the whole shebang: colliding GC threads were (most likely) causing the issues here. After I applied the following parameters to my VM, the problem was magically solved.

-XX:+UseConcMarkSweepGC 
-XX:+UseParNewGC 
-XX:ParallelCMSThreads=2 
-XX:+CMSParallelRemarkEnabled 
-XX:+CMSIncrementalMode 
-XX:+CMSIncrementalPacing 
-XX:CMSFullGCsBeforeCompaction=1 
-XX:+CMSClassUnloadingEnabled 
-XX:CMSInitiatingOccupancyFraction=80
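Since these flags only help if they really reach the Managed Server's JVM, here is a minimal sketch for confirming that from inside a JVM, using the standard runtime and GC MXBeans. `JvmFlagCheck` and its methods are hypothetical names; run it (or equivalent code) in a JVM started with the same arguments.

```java
import java.lang.management.GarbageCollectorMXBean;
import java.lang.management.ManagementFactory;
import java.util.ArrayList;
import java.util.List;

// Hypothetical check: lists the JVM arguments and garbage collectors of the current JVM.
public class JvmFlagCheck {

    /** JVM input arguments (such as -XX and -D options) that start with the given prefix. */
    public static List<String> argumentsWithPrefix(String prefix) {
        List<String> matches = new ArrayList<String>();
        for (String arg : ManagementFactory.getRuntimeMXBean().getInputArguments()) {
            if (arg.startsWith(prefix)) {
                matches.add(arg);
            }
        }
        return matches;
    }

    /** Names of the garbage collectors this JVM is running with. */
    public static List<String> collectorNames() {
        List<String> names = new ArrayList<String>();
        for (GarbageCollectorMXBean gc : ManagementFactory.getGarbageCollectorMXBeans()) {
            names.add(gc.getName());
        }
        return names;
    }

    public static void main(String[] args) {
        System.out.println("-XX arguments: " + argumentsWithPrefix("-XX:"));
        System.out.println("collectors:    " + collectorNames());
    }
}
```

If the `-XX:` list comes back without the CMS options, the server was started without them and any change in behaviour has another cause.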

If anyone ever runs into the same trouble, these settings may be worth a try to get things working again.

Cheers.

Gustavo Ramos
  • I'm having the same problem, but these settings make no difference (my feeling is they make it even worse). – NeplatnyUdaj Feb 12 '16 at 12:39
  • Hi @NeplatnyUdaj... mileage seems to vary from case to case. It seems NIO is not well implemented in WebLogic 12c. We didn't see these problems on version 12.1.3 though, so it could be a good idea to give it a try. Maybe even 12.1.4, which is newer. Good luck. – Gustavo Ramos Feb 14 '16 at 23:01
  • I've changed the JDK WebLogic starts with from 1.7.0_25 to 1.8.0_60 and so far so good. – NeplatnyUdaj Feb 15 '16 at 14:09
  • @NeplatnyUdaj great! I use 1.7.0_62. 1.8 is on my schedule. Hope it solves for you. – Gustavo Ramos Feb 16 '16 at 17:21

This is a known issue with WebLogic 12c, and it is documented in the following Oracle Support document:

Performance Issue Due To weblogic.socket.NIOSocketMuxer Usage In WLS 12.1.2+ (Doc ID 2128032.1)

The workaround provided is to switch to a native muxer class, as described in the answer from Omar MEBARKI.

The article does not address any of the other workarounds mentioned in the other answers here.

Mike