You are almost there: you were browsing the right log files.
The stored logs generally follow this convention: inside the containers path there are multiple application IDs, and the first one (something like `application_1618292556240_0001`, ending in 0001) belongs to the driver node, while the rest belong to the executors.
I have not found any official documentation that mentions this, but I have seen it on all of my clusters.
So if you browse to the other application IDs, you will be able to see the executor log files.
Having said that, it is very painful to browse through so many executors and search for a log message.
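The naming convention above can be sketched as a tiny helper. Note that the classification rule (suffix `_0001` means driver) is the observed convention described above, not documented behavior, and the IDs below are made up for illustration:

```shell
# Classify an application ID by its numeric suffix, following the
# (undocumented) convention described above.
role_of() {
  case "$1" in
    *_0001) echo "driver" ;;     # first application ID -> driver node
    *)      echo "executor" ;;   # remaining IDs -> executors
  esac
}

role_of application_1618292556240_0001   # prints: driver
role_of application_1618292556240_0002   # prints: executor
```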
How I personally view the logs from an EMR cluster:
Log in to one of the EC2 instances that has enough access to download files from the S3 bucket where the EMR logs are saved.
Create and move into a scratch directory on the instance:
`mkdir -p /tmp/debug-log/ && cd /tmp/debug-log/`
Download all the files from S3 recursively:
`aws s3 cp --recursive s3://your-bucket-name/cluster-id/ .`
In your case, it would be
`aws s3 cp --recursive s3://brand17-logs/j-20H1NGEP519IG/ .`
Uncompress the log files (filtering on `*.gz` so gunzip does not complain about files that are already uncompressed):
`find . -type f -name '*.gz' -exec gunzip {} \;`
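To see what the decompression step does on a small scale, here is a self-contained demo; the paths and log content are made up (the real files live under the directory you downloaded from S3):

```shell
# Self-contained demo of the recursive gunzip step (paths are made up).
rm -rf /tmp/gunzip-demo
mkdir -p /tmp/gunzip-demo/containers && cd /tmp/gunzip-demo
echo "sample log line" > containers/stderr.log
gzip containers/stderr.log                       # S3 log files arrive gzipped
find . -type f -name '*.gz' -exec gunzip {} \;   # decompress everything in place
cat containers/stderr.log                        # prints: sample log line
```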
Now that all the compressed files are uncompressed, we can run a recursive grep like this:
`grep -inR "message-that-i-am-looking-for" .`
The grep flags mean the following:
- `i` -> case-insensitive match
- `n` -> print the line number of each match (the file name is also printed because multiple files are searched)
- `R` -> search recursively
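A small local demo of the recursive, case-insensitive search; the directory and log content are invented for illustration:

```shell
# Demo of grep -inR on a throwaway directory (contents are made up).
rm -rf /tmp/grep-demo
mkdir -p /tmp/grep-demo/containers && cd /tmp/grep-demo
printf 'INFO starting\nERROR Task failed\n' > containers/stderr.log
grep -inR "task failed" .
# prints: ./containers/stderr.log:2:ERROR Task failed
```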
- Browse to the exact file pointed to by the grep output, open it with `vi`, and read the surrounding lines in that file for more context.
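Since `-n` gives you the line number, you can open `vi` directly at the match. The sketch below parses a grep hit of the form `file:line:match` into its parts; the hit string itself is made up:

```shell
# Turn a "file:line:match" grep hit into a vi command (the hit is invented).
hit="containers/application_1618292556240_0002/stderr:42:ERROR something failed"
file="${hit%%:*}"     # everything before the first colon -> the file path
rest="${hit#*:}"      # strip the file path and its colon
line="${rest%%:*}"    # everything before the next colon -> the line number
echo "vi +$line $file"   # prints: vi +42 containers/application_1618292556240_0002/stderr
```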
More reading can be found here:
- View Log Files
- access spark log