I have a small Java application that does the following:
- reads messages (JSON strings) from a queue and sends them to a sender
- the sender accumulates the received messages and, when the batch reaches a specific size (let's say 5k), makes an HTTP call and posts those messages
- Apache HTTP client is used to post the messages asynchronously, meaning I don't wait for the response; the respective callback methods are invoked when the post completes
Here is the pseudocode:
Reader class: SomeProcessor.java
public class SomeProcessor extends Processor {
    @Override
    public void process(Messages m) {
        // some processing on m
        String jsonMessage = convertToJSON(m);
        getSender().send(jsonMessage);
    }
}
Base class: Processor.java
public class Processor {
    private HttpSender sender = null;

    public Processor() {
        setSender(new HttpSender());
    }

    // Getter and setter use the same type as the field (HttpSender)
    public HttpSender getSender() {
        return sender;
    }

    public void setSender(HttpSender sender) {
        this.sender = sender;
    }
}
Sender class: HttpSender.java
public class HttpSender {
    // Fields such as url, token, httpClient and the header constants are omitted here
    private List<String> eventsBatch = new ArrayList<String>(5000);

    public synchronized void send(final String message) {
        eventsBatch.add(message);
        if (eventsBatch.size() >= 5000) {
            flush(); // calls HTTP post
        }
    }

    public synchronized void flush() {
        if (eventsBatch.size() > 0) {
            postEventsAsync(eventsBatch);
        }
        // Since the call above is asynchronous, I assume I should re-initialize the list
        // after the post is issued, instead of clearing it. Is this correct?
        eventsBatch = new ArrayList<String>(5000);
    }

    public void postEventsAsync(final List<String> events) {
        startHttpClient(); // make sure the HTTP client is started
        final String encoding = "utf-8";
        // create the HTTP request
        final HttpPost httpPost = new HttpPost(url);
        httpPost.setHeader(AuthorizationHeaderTag, String.format(AuthorizationHeaderScheme, token));
        StringEntity entity = new StringEntity(String.join("", events), encoding);
        entity.setContentType(HttpContentType);
        httpPost.setEntity(entity);
        httpClient.execute(httpPost, new FutureCallback<HttpResponse>() {
            @Override
            public void completed(HttpResponse response) {
                // log to console
            }

            @Override
            public void failed(Exception ex) {
                // just log to console
            }

            @Override
            public void cancelled() {
            }
        });
    }
}
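To look at the re-init question separately from the HTTP client, the hand-off of a full batch can be sketched on its own. This is only a minimal sketch under assumptions: BatchingSender and the poster callback are hypothetical names standing in for HttpSender and postEventsAsync, and unlike the code above it only replaces the list when there is something to post. It shows that assigning a new list to the field does not touch the list that was already handed to the poster.

```java
import java.util.ArrayList;
import java.util.List;
import java.util.function.Consumer;

// Hypothetical stand-in for HttpSender: the HTTP post is replaced by a callback.
class BatchingSender {
    private final int batchSize;
    private final Consumer<List<String>> poster; // assumed stand-in for postEventsAsync
    private List<String> eventsBatch;

    BatchingSender(int batchSize, Consumer<List<String>> poster) {
        this.batchSize = batchSize;
        this.poster = poster;
        this.eventsBatch = new ArrayList<>(batchSize);
    }

    synchronized void send(String message) {
        eventsBatch.add(message);
        if (eventsBatch.size() >= batchSize) {
            flush();
        }
    }

    synchronized void flush() {
        if (eventsBatch.isEmpty()) {
            return; // nothing to post, keep the current list
        }
        List<String> full = eventsBatch;
        // Point the field at a fresh list; 'full' still holds the old elements.
        eventsBatch = new ArrayList<>(batchSize);
        poster.accept(full);
    }
}

class BatchingSenderDemo {
    public static void main(String[] args) {
        List<List<String>> posted = new ArrayList<>();
        BatchingSender sender = new BatchingSender(3, posted::add);
        for (int i = 0; i < 7; i++) {
            sender.send("{\"id\":" + i + "}");
        }
        sender.flush(); // post the remaining partial batch
        // prints "3 batches, sizes: 3, 3, 1"
        System.out.println(posted.size() + " batches, sizes: "
                + posted.get(0).size() + ", " + posted.get(1).size() + ", " + posted.get(2).size());
    }
}
```

In this sketch the old list stays reachable only for as long as the poster (or anything it passes the list to) keeps a reference to it.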
Overall, I see very high memory utilization, and I noticed that even after processing has stopped the heap is not cleared. I looked into a heap dump and I see my JSON string messages represented as char[]. I suspect that those Strings are not being garbage collected.
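One way to tell whether those Strings are still reachable, or are garbage that simply has not been collected yet, is to request a collection and then read the used heap. The following is a minimal sketch, not something the application currently does: HeapCheck is a hypothetical helper name, and System.gc() is only a request that the JVM is free to ignore, so the number is indicative at best.

```java
import java.lang.management.ManagementFactory;
import java.lang.management.MemoryUsage;

// Hypothetical helper for a rough "what is still in use after a GC request" reading.
class HeapCheck {
    // Returns the heap bytes reported as used after asking the JVM to collect.
    static long usedHeapAfterGc() {
        System.gc(); // only a hint; the JVM may not run a full collection
        MemoryUsage heap = ManagementFactory.getMemoryMXBean().getHeapMemoryUsage();
        return heap.getUsed();
    }

    public static void main(String[] args) {
        long used = usedHeapAfterGc();
        System.out.println("Used heap after GC request: " + (used / (1024 * 1024)) + " MB");
    }
}
```

If the value stays high while processing is paused, the messages are likely still referenced from somewhere; if it drops, the earlier reading mostly reflected uncollected garbage.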
Thoughts?
Update-1: Based on the comments below, I am attaching a heap snapshot taken while processing was paused; the heap usage is still 4 GB.
Update-2: GC report: http://gceasy.io/my-gc-report.jsp?p=c2hhcmVkLzIwMTcvMDcvNS8tLXN0cmVhbWdlc3RfZ2MuemlwLS0xNy03LTMx