I run sacct with the -j switch for a specific job ID. Depending on the other command-line switches, two completely different results are reported for the same job. Here are three examples; the only thing that changes between them is the -S/-E time window. The second one shows a different result from the other two.
attar@lh> sacct -a -s CA,CD,F,NF,PR,TO -S 2020-07-26T00:00:00 -E 2020-07-27T23:59:59 --format=JobId,state,time,start,end,elapsed,MaxRss,MaxVMSize,nnodes,ncpus -j 1401
JobID        State      Timelimit  Start               End                 Elapsed    MaxRSS     MaxVMSize  NNodes   NCPUS
------------ ---------- ---------- ------------------- ------------------- ---------- ---------- ---------- -------- ----------
1401         CANCELLED+ UNLIMITED  2020-07-26T20:45:31 2020-07-27T08:36:10 11:50:39                         1        2
1401.batch   COMPLETED             2020-07-26T20:45:31 2020-07-27T08:36:17 11:50:46   103856K    619812K    1        2
attar@lh> sacct -a -s CA,CD,F,NF,PR,TO -S 2020-07-26T00:00:00 -E 2020-07-26T23:59:59 --format=JobId,state,time,start,end,elapsed,MaxRss,MaxVMSize,nnodes,ncpus -j 1401
JobID        State      Timelimit  Start               End                 Elapsed    MaxRSS     MaxVMSize  NNodes   NCPUS
------------ ---------- ---------- ------------------- ------------------- ---------- ---------- ---------- -------- ----------
1401         NODE_FAIL  UNLIMITED  2020-06-15T09:38:38 2020-07-26T00:17:26 40-14:38:48                        1        2
attar@lh> sacct -a -s CA,CD,F,NF,PR,TO --format=JobId,state,time,start,end,elapsed,MaxRss,MaxVMSize,nnodes,ncpus -j 1401
JobID        State      Timelimit  Start               End                 Elapsed    MaxRSS     MaxVMSize  NNodes   NCPUS
------------ ---------- ---------- ------------------- ------------------- ---------- ---------- ---------- -------- ----------
1401         CANCELLED+ UNLIMITED  2020-07-26T20:45:31 2020-07-27T08:36:10 11:50:39                         1        2
1401.batch   COMPLETED             2020-07-26T20:45:31 2020-07-27T08:36:17 11:50:46   103856K    619812K    1        2
Why are the start/end times different for the same job? One query reports a run time of about 11 hours and the other a run time of about 40 days!
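As a sanity check, here is a small Python sketch (not part of Slurm; the helper name `elapsed` is my own) that recomputes the Elapsed column from the Start and End values shown above. It only confirms that each record is internally consistent, so the two outputs really do describe two different start/end pairs rather than a display glitch.

```python
from datetime import datetime

# Timestamp layout used in the sacct Start/End columns above.
FMT = "%Y-%m-%dT%H:%M:%S"

def elapsed(start: str, end: str) -> str:
    """Return end - start formatted like sacct's Elapsed column ([D-]HH:MM:SS)."""
    delta = datetime.strptime(end, FMT) - datetime.strptime(start, FMT)
    hours, rem = divmod(delta.seconds, 3600)
    minutes, seconds = divmod(rem, 60)
    hms = f"{hours:02d}:{minutes:02d}:{seconds:02d}"
    # Prefix the day count only when the run lasted a day or more.
    return f"{delta.days}-{hms}" if delta.days else hms

# Start/End pairs copied from the first and second queries.
first = elapsed("2020-07-26T20:45:31", "2020-07-27T08:36:10")
second = elapsed("2020-06-15T09:38:38", "2020-07-26T00:17:26")
print(first)   # 11:50:39
print(second)  # 40-14:38:48
```

Both values match what sacct printed, so the difference is in which record is returned for job 1401, not in how the elapsed time is computed.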
Any insight is highly appreciated!