I have a small Java application that does the following:

  • reads messages (JSON strings) from a queue and passes each one to a sender
  • the sender accumulates the received messages and, when the batch reaches a specific size (let's say 5k), makes an HTTP call that posts those messages
  • Apache HttpClient is used to post the messages asynchronously, meaning I don't wait for the response; the respective callback methods are invoked when the post completes

Here is the pseudocode:

Reader class: SomeProcessor.java

public class SomeProcessor extends Processor {
    @Override
    public void process(Messages m) {
        // some processing on m

        String jsonMessage = convertToJSON(m);

        // hand the serialized message to the batching sender
        getSender().send(jsonMessage);
    }
}

Base class: Processor.java

public class Processor {
    // the sender type is HttpSender throughout (the class shown below)
    private HttpSender sender;

    public Processor() {
        setSender(new HttpSender());
    }

    public HttpSender getSender() {
        return sender;
    }

    public void setSender(HttpSender sender) {
        this.sender = sender;
    }
}

Sender class: HttpSender.java

public class HttpSender {
    private List<String> eventsBatch = new ArrayList<String>(5000);

    public synchronized void send(final String message) {
        eventsBatch.add(message);
        if (eventsBatch.size() >= 5000) {
            flush(); // calls http post 
        }
    }

    public synchronized void flush() {
        if (eventsBatch.size() > 0) {
            postEventsAsync(eventsBatch);
        } 

        // Since the above call is asynchronous, I am assuming I should re-initialize the list
        // after the post is issued instead of calling clear(). Is this correct?
        eventsBatch = new ArrayList<String>(5000); 
    }

    public void postEventsAsync(final List<String> events) {
        startHttpClient(); // make sure http client is started
        final String encoding = "utf-8";
        // create http request
        final HttpPost httpPost = new HttpPost(url);
        httpPost.setHeader(AuthorizationHeaderTag, String.format(AuthorizationHeaderScheme, token));
        StringEntity entity = new StringEntity(String.join("", events), encoding);
        entity.setContentType(HttpContentType);
        httpPost.setEntity(entity);
        httpClient.execute(httpPost, new FutureCallback<HttpResponse>() {
            @Override
            public void completed(HttpResponse response) {
                // log to console
            }

            @Override
            public void failed(Exception ex) {
                // just log to console
            }

            @Override
            public void cancelled() {
            }
        });
    }
}
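
On the clear() versus new-list question in flush(): in the code as posted, postEventsAsync copies the batch into a StringEntity via String.join before it returns, so the list itself is no longer needed once the call comes back. The distinction only matters when an asynchronous consumer still holds a reference to the same list. The following is a minimal sketch of that aliasing effect, not the Apache client's behavior; postLater and the latch are hypothetical stand-ins for a consumer that reads the list later on another thread.

```java
import java.util.ArrayList;
import java.util.Arrays;
import java.util.List;
import java.util.concurrent.CompletableFuture;
import java.util.concurrent.CountDownLatch;

class BatchHandoffDemo {

    // Hypothetical async consumer: reads the list only after the gate opens,
    // i.e. after the caller has already reset its batch.
    static CompletableFuture<Integer> postLater(List<String> events, CountDownLatch gate) {
        return CompletableFuture.supplyAsync(() -> {
            try {
                gate.await();
            } catch (InterruptedException e) {
                Thread.currentThread().interrupt();
            }
            return events.size(); // number of events the "request" would contain
        });
    }

    // Reset with clear(): the consumer shares the same list and sees it emptied.
    static int sentAfterClear() {
        List<String> batch = new ArrayList<>(Arrays.asList("a", "b", "c"));
        CountDownLatch gate = new CountDownLatch(1);
        CompletableFuture<Integer> sent = postLater(batch, gate);
        batch.clear();
        gate.countDown();
        return sent.join();
    }

    // Reset by swapping in a new list: the consumer keeps the old, intact list.
    static int sentAfterSwap() {
        List<String> batch = new ArrayList<>(Arrays.asList("a", "b", "c"));
        CountDownLatch gate = new CountDownLatch(1);
        CompletableFuture<Integer> sent = postLater(batch, gate);
        batch = new ArrayList<>();
        gate.countDown();
        return sent.join();
    }

    public static void main(String[] args) {
        System.out.println("clear(): " + sentAfterClear());
        System.out.println("swap:    " + sentAfterSwap());
    }
}
```

Neither variant changes how much memory the messages themselves occupy; the swap only discards the old backing array instead of reusing it.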

Overall, I see very high memory utilization, and I noticed that even after processing has stopped, the heap is not being cleared. I looked into a heap dump and saw my JSON string messages represented as char[]. I suspect that those Strings are not being garbage-collected for some reason.
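
One way to tell "retained" apart from "not yet collected" is to compare current heap usage with the usage recorded right after the most recent collection. The sketch below uses the standard java.lang.management beans; the class and method names are made up for illustration. The post-GC figure is approximate, because each memory pool reports its own last collection and getCollectionUsage() may return null for pools that do not support it.

```java
import java.lang.management.ManagementFactory;
import java.lang.management.MemoryPoolMXBean;
import java.lang.management.MemoryType;
import java.lang.management.MemoryUsage;

class PostGcHeapUsage {

    // Sums, over all heap pools, the usage measured after each pool's most recent GC.
    static long usedAfterLastGcBytes() {
        long total = 0;
        for (MemoryPoolMXBean pool : ManagementFactory.getMemoryPoolMXBeans()) {
            if (pool.getType() != MemoryType.HEAP) {
                continue;
            }
            MemoryUsage afterGc = pool.getCollectionUsage(); // null if unsupported
            if (afterGc != null) {
                total += afterGc.getUsed();
            }
        }
        return total;
    }

    // Heap in use right now, including garbage that has not been collected yet.
    static long usedNowBytes() {
        return ManagementFactory.getMemoryMXBean().getHeapMemoryUsage().getUsed();
    }

    public static void main(String[] args) {
        System.out.println("used now (MB):           " + usedNowBytes() / (1024 * 1024));
        System.out.println("used after last GC (MB): " + usedAfterLastGcBytes() / (1024 * 1024));
    }
}
```

If the post-GC figure stays close to the current figure while processing is paused, the strings are probably still reachable from somewhere; if it is much lower, the heap may simply be holding garbage until the next collection.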

Thoughts?

Update-1: Based on the comments below, I am attaching a heap snapshot taken while processing was paused; the heap usage is still 4 GB. (heap snapshot image)

Update-2: GC Report http://gceasy.io/my-gc-report.jsp?p=c2hhcmVkLzIwMTcvMDcvNS8tLXN0cmVhbWdlc3RfZ2MuemlwLS0xNy03LTMx

jagamot
  • This post should help you !! - https://stackoverflow.com/questions/6748432/java-heap-space-out-of-memory – Rohit Padma Jul 05 '17 at 14:17
  • are you sure they are not GC'ed? Or did probably just no major GC run yet due to the available heap being fairly large? Have you enabled GC logging to see if a GC has happened since the json strings actually were consumed and got freed? – cello Jul 05 '17 at 14:20
  • there is also no need to initialize the list again. a simple `eventsBatch.clear()` should do the job – XtremeBaumer Jul 05 '17 at 14:27
  • @XtremeBaumer - I am always initializing my array with 5k elements. Copying array when the size increases is not an issue here, so what is the advantage of calling clear (I think it just sets the references to elements to null, at least from looking at the code)? – jagamot Jul 05 '17 at 14:39
  • @cello - I am using the following - -XX:+UseG1GC -XX:MaxGCPauseMillis=100 -XX:InitiatingHeapOccupancyPercent=35 -XX:+DisableExplicitGC. I notice that GC has been kicked off since the time I paused the processing. – jagamot Jul 05 '17 at 14:47
  • Do you call httpSender.flush() before processor stops? – Alexander Anikin Jul 05 '17 at 15:06
  • @AlexanderAnikin - Yes, I do call flush before the processor stops. – jagamot Jul 05 '17 at 17:31
  • It's possible httpClient holds some data. – Alexander Anikin Jul 06 '17 at 14:20
  • Hmm... maybe I'll pause the processing and check the references. If you open the link from Update-2, it clearly shows that Full GC is happening several times. I am wondering how I can fix that. – jagamot Jul 06 '17 at 14:57
  • @AlexanderAnikin - If so, how to clear this data ? – jagamot Jul 23 '17 at 12:23
  • @jagamot - It's either the HTTP cache (google for "apache httpclient disable cache") or internal data of the client. To defeat the latter, you have to let httpClient go once you don't need it: by recreating it for every 5k records (AFAIK that is the general Java approach), by a timer in a guard thread, by a special call to your class, or by other means. – Alexander Anikin Jul 31 '17 at 08:51
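
A related possibility, not confirmed anywhere in this thread, is that batches are handed to the async client faster than the posts complete, so every pending request keeps its entity reachable. A minimal sketch of capping the number of in-flight posts with a Semaphore follows; simulateHttpPost is a hypothetical stand-in for the real call, and with the Apache async client the permit would instead be released in the completed, failed, and cancelled callbacks.

```java
import java.util.ArrayList;
import java.util.List;
import java.util.concurrent.CompletableFuture;
import java.util.concurrent.Semaphore;
import java.util.concurrent.atomic.AtomicInteger;

class BoundedInFlight {
    private final Semaphore permits;
    private final AtomicInteger inFlight = new AtomicInteger();
    private final AtomicInteger maxObserved = new AtomicInteger();

    BoundedInFlight(int maxInFlight) {
        this.permits = new Semaphore(maxInFlight);
    }

    // Blocks the caller while maxInFlight posts are pending, which also
    // slows the reader down instead of letting payloads pile up on the heap.
    CompletableFuture<Void> post(String payload) {
        permits.acquireUninterruptibly();
        maxObserved.accumulateAndGet(inFlight.incrementAndGet(), Math::max);
        return CompletableFuture.runAsync(() -> simulateHttpPost(payload))
                .whenComplete((ok, err) -> {
                    inFlight.decrementAndGet();
                    permits.release(); // released on success and on failure
                });
    }

    // Hypothetical stand-in for the real HTTP post.
    private static void simulateHttpPost(String payload) {
        try {
            Thread.sleep(5);
        } catch (InterruptedException e) {
            Thread.currentThread().interrupt();
        }
    }

    // Posts the given number of batches and returns the highest in-flight count seen.
    static int run(int maxInFlight, int batches) {
        BoundedInFlight sender = new BoundedInFlight(maxInFlight);
        List<CompletableFuture<Void>> pending = new ArrayList<>();
        for (int i = 0; i < batches; i++) {
            pending.add(sender.post("batch-" + i));
        }
        CompletableFuture.allOf(pending.toArray(new CompletableFuture[0])).join();
        return sender.maxObserved.get();
    }

    public static void main(String[] args) {
        System.out.println("max in flight: " + run(2, 20));
    }
}
```

The cap trades throughput for a bounded number of payloads held in memory at once; the right limit depends on batch size and available heap.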

0 Answers