Logging with log4j only once every so many calls to logger.info/debug/warn()

Question

I have a scenario where a particular log message might be printed a very large number of times (possibly millions). For example, if we log (using logger.warn()) every record with missing fields, we can end up logging a great deal when the input file has many such records (for example, large files on HDFS). This quickly fills up the disk.

To avoid this situation, I am trying to log once for every (for example) 1000 records with missing fields. I can implement all of this logic outside of the log4j package, but I was wondering if there is a cleaner way to do this. Ideally, all of this logic would go into the log4j code.

This seems like a commonly encountered problem, but there is hardly any info on this. Any thoughts?

So you need to log once every 1000 records. Does that apply to all your log statements? — Ali Helmy, Jul 14 '15 at 07:17
You might want to override the Logger.getLogger(YourClass.class).info()/warn()/debug() functionality of log4j. — Anant Laxmikant Bobde, Jul 14 '15 at 07:24
@Ali Helmy: That is correct, I want to log only once for every 1000 records with missing fields. The log message does not have to be aggregated. I just want to log the 999th record with missing field(s). — ajaymysore, Jul 14 '15 at 07:24
Review [this](http://stackoverflow.com/questions/31337243/avoid-using-if-clause/31337475#31337475) it's similar. — Ali Helmy, Jul 14 '15 at 07:26

score 0 · Answer 1 · answered Jul 14 '15 at 07:28

Log4J can't do that out of the box. However, you could try writing your own listener. If you are willing to switch to Logback as your logging framework, it has a filter called DuplicateMessageFilter that drops messages after a certain number of repetitions. You should seriously consider this, because that much logging will certainly hurt performance. Logback is configured in much the same way as Log4J and supports SLF4J out of the box.

score 0 · Answer 2 · answered Jul 14 '15 at 07:58

You can use a counter and set the log level programmatically. It is not the best software design, but it is good enough if you only need this kind of logging in one place.

import org.apache.log4j.Level;
import org.apache.log4j.Logger;

public class LogExample {

    private static final Logger LOG = Logger.getLogger(LogExample.class);

    // With ERROR as the default level, LOG.info() calls are normally suppressed.
    private static final Level DEFAULT_LOG_LEVEL = Level.ERROR;

    public static void main(final String[] args) {
        int count = 0;
        LOG.setLevel(DEFAULT_LOG_LEVEL);
        for (int i = 1; i < 1000000; i++) {
            count++;
            // True on every 1000th iteration only.
            final boolean logInfo = (count % 1000) == 0;
            if (logInfo) {
                // Temporarily lower the level so the next info() call gets through.
                LOG.setLevel(Level.INFO);
            }
            LOG.info("test: " + i);
            if (logInfo) {
                // Restore the default so the following calls are suppressed again.
                LOG.setLevel(DEFAULT_LOG_LEVEL);
            }
        }
    }
}
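
As a possible variation on the counter idea above, the modulo check can be moved into a small reusable helper so that the logger's level is never toggled, which also avoids affecting other threads that share the same logger. This is only a sketch: `EveryNth`, `shouldLog()` and `seen()` are hypothetical names invented for illustration and are not part of log4j, and the `System.out.println` call stands in for a real `LOG.warn(...)` call.

```java
import java.util.concurrent.atomic.AtomicLong;

/** Hypothetical helper (not part of log4j): lets every n-th call through. */
final class EveryNth {
    private final long n;
    private final AtomicLong calls = new AtomicLong();

    EveryNth(long n) {
        if (n <= 0) {
            throw new IllegalArgumentException("n must be positive");
        }
        this.n = n;
    }

    /** Counts this call and returns true on calls n, 2n, 3n, ... */
    boolean shouldLog() {
        return calls.incrementAndGet() % n == 0;
    }

    /** Total number of calls counted so far, whether logged or not. */
    long seen() {
        return calls.get();
    }

    public static void main(String[] args) {
        EveryNth sampler = new EveryNth(1000);
        for (int i = 1; i <= 5000; i++) {
            if (sampler.shouldLog()) {
                // In real code this would be a call such as LOG.warn(...).
                System.out.println("record " + i + " has missing fields ("
                        + sampler.seen() + " such records so far)");
            }
        }
    }
}
```

The call site still contains one `if`, so this does not fully meet the goal of keeping the logic inside log4j. The same counter could in principle be placed in a subclass of log4j 1.x's `org.apache.log4j.spi.Filter`, returning `Filter.DENY` from `decide(LoggingEvent)` for the calls that should be dropped and `Filter.NEUTRAL` otherwise; that variant is not shown here and is untested.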

This is what I wanted to avoid: having all these "if" conditions in my code. This boilerplate should really live inside the log4j package. — ajaymysore, Jul 14 '15 at 16:02
