I am trying to get some logging out of my mapper jobs, running on Dataproc.
Following the advice here, I simply defined a log4j logger and logged to it at INFO level:
import java.io.IOException;

import org.apache.hadoop.io.LongWritable;
import org.apache.hadoop.io.Text;
import org.apache.hadoop.mapreduce.Mapper;
import org.apache.log4j.Logger;

public class SampleMapper extends Mapper<LongWritable, Text, Text, Text> {
    private static final Logger logger = Logger.getLogger(SampleMapper.class);

    @Override
    protected void setup(Context context) {
        logger.info("Initializing NoSQL Connection.");
        try {
            // logic for connecting to NoSQL - omitted
        } catch (Exception ex) {
            // Pass the exception itself so the stack trace is logged too
            logger.error(ex.getMessage(), ex);
        }
    }

    @Override
    protected void map(LongWritable key, Text value, Context context) throws IOException, InterruptedException {
        // mapper code omitted
    }
}
However, I can't find the logs anywhere: not in the Dataproc user interface, not by calling yarn logs
on the master, and not even by logging in to the worker instances and searching the usual log locations.
Is there any configuration I am missing that should make it work?
Where is the default log4j configuration read from, and how can I aggregate the logs?
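
To narrow down the configuration question, a small probe like the one below could be called from setup() to report which configuration source is visible to the task. This is only a sketch. It assumes log4j 1.x default initialization, which checks the log4j.configuration system property first and otherwise looks for log4j.xml and then log4j.properties on the classpath. The class name Log4jConfigProbe and the method describe are hypothetical names, not part of log4j or Hadoop.

```java
import java.net.URL;

public class Log4jConfigProbe {

    /**
     * Reports the first configuration source that log4j 1.x default
     * initialization would consider, in its documented lookup order.
     * Hypothetical helper: it only inspects the environment and does
     * not configure or call log4j itself.
     */
    static String describe(ClassLoader loader) {
        // 1. An explicit override via -Dlog4j.configuration=...
        String override = System.getProperty("log4j.configuration");
        if (override != null) {
            return "system property log4j.configuration=" + override;
        }
        // 2. Otherwise the first of these resources found on the classpath
        for (String name : new String[] {"log4j.xml", "log4j.properties"}) {
            URL url = loader.getResource(name);
            if (url != null) {
                return "classpath resource " + url;
            }
        }
        return "no log4j configuration found";
    }

    public static void main(String[] args) {
        System.out.println(describe(Log4jConfigProbe.class.getClassLoader()));
    }
}
```

Writing the result to standard error from inside the mapper, rather than through the logger, would show it in the task's stderr output even if the logger itself is misconfigured.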