0

Project:

I use Sflow + Ganglia to monitor JVM metrics of Websphere Application Server (WAS). WAS is instrumented using AspectJ aspects. I have added an aspect to measure all application method runtimes.

I use Hsflowd as a JVM metrics collector. Hsflowd internally uses the JMX-SflowAgent javaagent to hook into the JVM to collect metrics using MXBeans (RuntimeMXBean, GarbageCollectorMXBean, CompilationMXBean and ThreadMXBean).

Issue:

When I run WAS without aspectjweaver hook I can see all metrics (CPU, desk, memory, process etc.) in Ganglia web continuously. But when aspectjweaver is added to JVM args and after restarting the server I can see the metrics for 10 mins, but after that it doesn't report JVM metrics in Ganglia web.

In the Aspectj weaving logs I can see that AspectJ is weaving the JMXsflowAgent code. Even though it is exclued via !call(* com.sflow.JMX.SFlowAgent(..)).

Aspect:

package com.foo.main;

import java.io.*;
import java.lang.reflect.Method;
import java.security.Signature;
import java.util.*;

import org.aspectj.lang.ProceedingJoinPoint;
import org.aspectj.lang.annotation.*;
import org.osgi.service.application.ApplicationAdminPermission;

@Aspect
public class ResponseTimeAspect {
    @Pointcut(
        "execution(* com.foo.*(..)) && " +
        "!within(com.foo.main.ResponseTimeAspect) && " + 
        "!within(ThreadLocal+) && " + 
        "!within(&& !within(*..*Aspect)) && " + 
        "!within(com.foo.main.AppInformationReader) && " + 
        "!within(@org.aspectj.lang.annotation.Aspect *) && " + 
        "!within(com.sflow.jmx.SFlowAgent) && " + 
        "!(call( * com.sflow.jmx.SFlowAgent(..)))"
    )
    public void loggingResponseTime() {}

    private static ThreadLocal<String> uuidContainer = new ThreadLocal<String>() {
        @Override
        protected String initialValue(){
            return UUID.randomUUID().toString();
        } 
    };

    AppInformationReader logWriter = AppInformationReader.getInstance();

    @Around("loggingResponseTime()")
    public Object tracing(ProceedingJoinPoint thisJoinPoint) throws Throwable {

        Long startTime= System.currentTimeMillis();
        Long startTotalMemory = Runtime.getRuntime().totalMemory();
        Long startFreeMemory = Runtime.getRuntime().freeMemory();

        Object ret = thisJoinPoint.proceed();

        Long elapsedTime=System.currentTimeMillis() - startTime;
        Long endTotalMemory = Runtime.getRuntime().totalMemory();
        Long endFreeMemory = Runtime.getRuntime().freeMemory(); 
        String methodSignature=thisJoinPoint.getSignature().toString();
        String classname=methodSignature.split("\\.")[thisJoinPoint.getSignature().toString().split("\\.").length-1];
        String methodName =thisJoinPoint.getSignature().getDeclaringType().getCanonicalName();
        logWriter.writeLog(uuidContainer.get().toString(), startTime, System.currentTimeMillis(), elapsedTime, classname, methodName);
        return ret;
     }
}

The JMX packages are under com.sflow.jmx.SFlowAgent.

kriegaex
  • 63,017
  • 15
  • 111
  • 202
Vimlesh Yadav
  • 109
  • 1
  • 14
  • It is a bit hard to say anything intelligent about AspectJ or Java code we cannot see and about configurations we cannot see either. Maybe you want to consider providing more details as nobody here has a magic crystal globe. – kriegaex Sep 13 '14 at 09:27
  • @Kriegaex- further configuration is not required.. i think JMX and AspectJ will not work together on our desktops. – Vimlesh Yadav Sep 16 '14 at 08:24
  • aspectj is weaving the JMX code eventhough it is not in execution( * com.foo..*) pointcut.. i have enabled debugs for aspectj weaving and saw that aspectj is weaving the run() method also of the JMX agent. I am not sure but for this could be the issue here.. my pointcut is http://stackoverflow.com/questions/25663050/where-to-code-threadlocal-remove-in-aspectj-class – Vimlesh Yadav Sep 16 '14 at 12:48
  • Have you ever read and understood my answer and comments in the other topic?! Your pointcut intercepts EVERYTHING, try to limit it to what you are interested in. If you do not listen to answers, do not ask. If you think you know the answer even though it is obviously wrong, why are you here? And if you are not interested in learning AspectJ basics, why do you risk using it? – kriegaex Sep 16 '14 at 13:34
  • I Completely understood your answer and even accepted your answer. Apologies for not explaining it clearly. i have edited my question to make it more clear. – Vimlesh Yadav Sep 17 '14 at 08:53
  • Issue here is how Aspectj and JMXSflow agent are interlinked. How LTW of JMXSflowAgent is corrupting/Stopping SflowAgent to report metrics. SFLowAgent has 1 deamon thread which collects metrics after every 20 secs. – Vimlesh Yadav Sep 17 '14 at 09:08
  • Why do you even aspects into the other agent? Can you not just exclude it from weaving by defining a proper pointcut? I assume you want to intercept your own code, not third party libs, right? Edit: Ah, okay, I just saw your pointcut snippet. I had overlooked it because it was not formated as code. Please post the full pointcut, maybe even the full aspect. I think I know what is going wrong, but I want to double-check. What you posted is not an [SSCCE](http://sscce.org/), so it is hard to tell. – kriegaex Sep 17 '14 at 10:12
  • kriegaex.. i have posted the complete aspect code. actually JMX-Agent thread is running only once with aspectJ but without aspectj it runnuign after every 10 secs. and i more info with aspectj it is printing system out logs in native_stdout and not in SystemOut but without aspectj it is printing debugs in SystemOut.log of web Sphere – Vimlesh Yadav Sep 17 '14 at 12:19
  • I have reformatted and rephrased your question so as to make it more human-readable. Maybe I can look at the problem after dinner, it may take a while though. What I can already say is that your pointcut is way too complex and contains things which contradict each other as well as a real syntax error. It should not even work as it is now. – kriegaex Sep 17 '14 at 16:41

1 Answers1

0

Disclaimer: This is an answer, but not yet a solution. It does not make sense to write even more remarks , so I will rather refine my answer here as I gather more information from Vimlesh.

Okay, it is impossible to reproduce your problem with just the aspect and not a real SSCCE displaying the problematic behaviour. There are a lot of open questions:

  • I do not know how many classes the aspect is being applied to,
  • how many threads there are in the app server and
  • what happens with memory consumption in the 10 minutes before and the time after the JMX results are no longer displayed.
  • You said that the SFlow agent was running only once instead of every 10 seconds. How do you know? Can you please provide information explaining how you found out and how to reproduce the behaviour, optimally without an app server but with a plain Java SE VM?
  • I also wonder why the aspect gathers information about free memory. Is that not what the other Java agent is expected to do? Why do it twice?
  • It strikes me as odd that a variable named logWriter is an instance of AppInformationReader. So what is it, a reader or a writer? And what does the class do? The aspect uses it, but it is not shown.
  • Why on earth do you create UUIDs per thread in the aspect and what are they used for? They seem to add no value, as I already said in the other question you posted earlier. You did not answer the question then, can you do it now? It looks like useless overhead.
  • The pointcut is overkill. For example
    • execution(* com.foo.*(..)) only captures method executions in classes directly under package com.foo, but not in any sub-packages. So it is useless to exclude classes from sub-packages. Probably what you really want is execution(* com.foo..*(..)) - note the two dots in foo..* signifying subpackages.
    • You have misunderstood my answer in the other question because you did not choose one of my solutions to exclude the aspect and its internally used anonymous ThreadLocal subclass, but concatenated all of them with &&. That does not make it any better or more readable.
    • You try to exclude call( * com.sflow.jmx.SFlowAgent(..)), but for two reasons it is not necessary: Firstly, SFlowAgent is not in the targetted package com.foo. Secondly, an execution() joinpoint can never be a call() joinpoint at the same time, so the intersection set must be empty - no need to exclude the calls from the executions.
    • This syntax is invalid: !within(&& !within(*..*Aspect)) - maybe a copy & paste error concerning the nested within() clauses.

Having said that, you probably want this pointcut:

execution(* com.foo..*(..)) &&
!within(@org.aspectj.lang.annotation.Aspect *) &&
!within(com.foo.main.AppInformationReader)

That should be enough.

After having fixed your pointcut you can try to stop collecting and logging information in the aspect to make it more efficient. As for the other Java agent, there is no need to exclude it from aspect weaving, but maybe you need to exclude the aspect from being targetted by SFlowAgent. Maybe there is a problem in aspect code being instrumented by the SFlow agent, but that is just a guess. Maybe your configuration is wrong, maybe something else. It looks to me as though you are trying to wield two weapons (tools) you have never learned to use well enough. Without an SSCCE it is really hard to diagnose the problem's root cause.

Update: You can also try to list the AspectJ weaver as the first Java agent on the JVM command line, i.e. before the SFlow agent. Test if it makes any difference.

kriegaex
  • 63,017
  • 15
  • 111
  • 202
  • Thanks a lot kriegaex for pointing out mistakes in my pointcut. i will surely learn and correct it. i have made changes in my working copy but as code was big and while copy pasting some mistakes happened. Issue was resolved using -loadersToSkip option, i avoided weaving for the loader which was loading Sflowagent class. now its working fine. thanks alot for help. please update the answer – Vimlesh Yadav Sep 18 '14 at 05:49
  • What should I edit in my answer? You have not answered any of my questions or followed my advice. Instead you tell me that you found a workaround, which is nice. But the root cause must be something different because the pointcut you showed here should not weave the other agent. Maybe you have another one doing that, I have no idea. Please share an SSCCE so we all can reproduce the original problem and learn something. I put quite a lot of time in trying to help you lately, but you seem to ignore most of my hints, never share full information and do not help to make the problem reproduceable. – kriegaex Sep 18 '14 at 09:20
  • let me try to answer your questions - it should weave all classes under com.foo..* packages. As this aspect is for APM tool and we don't have any idea of code, no of threads are in the app server. Aim is to weave all classes under com.foo..* package and get response time of all methods executed for 1 call flow. there is no memory consumption in the 10 minutes before and the time after the JMX results are no longer displayed. i have downloaded the code from added debugs in code and executed with this customized code. i can see that debugs under ru() were called only once. – Vimlesh Yadav Sep 18 '14 at 12:39
  • How to reproduce - run Aspectjweaver and sflowagent.jar together with any application server. AppInformationReader initailly meant only for reading some user defined data and later i used the same class for dumping data(UUID, responseTime, methodName and className) to log files. Once i am able to fix my functionality issue i will correct as per Single Responsibility principle. UUIDs are used for unique identification of distributed transactions. we have distributed transactions and hence i need to collect data for all the Hops for a sinle distributed transaction. – Vimlesh Yadav Sep 18 '14 at 12:46
  • sflowagent http://jmx-sflow-agent.googlecode.com/svn/trunk/src/com/sflow/jmx/SFlowAgent.java has some MXBEANS which were getting corrupted after weaving. Code which is getting corrupted is /* OperatingSystemMXBean osMX = ManagementFactory.getOperatingSystemMXBean(); String className = osMX.getClass().getName(); if("com.sun.management.OperatingSystem".equals(className) || "com.sun.management.UnixOperatingSystem".equals(className)) { */.I still dont know what was going wrong this code. – Vimlesh Yadav Sep 18 '14 at 13:03
  • Kriegaexk - 1 more question - How we can do distributed transaction tracing. like sending same UUId to all JVMs on path of transaaction. How we can achieve this functionality if hops are communicating using web Services(SOAP+Rest).. – Vimlesh Yadav Sep 18 '14 at 13:23
  • I was busy and on the road for almost two weeks. Now I repeat: Provide us with an [SSCCE](http://sscce.org/) (please click the link and read), "try with any app server" is not enough information IMO. Try what? Please create a little Maven sample project on GitHub and tell me which target to call, that would be tremendously helpful. You have written so much here, so this seems to be important to you. Then go the last mile and prepare the SSCCE, otherwise I will not write anything more here. Make it reproduceable! – kriegaex Sep 24 '14 at 09:36