Solaris svcs command shows wrong status

Question

I have freshly installed an application on Solaris 5.10. When I check with `ps -ef | grep hyperic | grep agent`, the processes are up and running. When I check the status with `svcs hyperic-agent`, the output shows that the agent is in the maintenance state. The application is working fine and I don't have any issues with it. Please help.

Maybe your starter doesn't exit with zero (`SMF_EXIT_OK`) status? Please check the logs for the service (their location is shown in the output of `svcs -x hyperic-agent`). — myaut, Apr 15 '15 at 10:18
Thanks a million for your reply! Please find the log snippet: Oracle Corporation SunOS 5.10 Generic Patch January 2005 -n Starting HQ Agent... -n . -n . -n . running (3314). Oracle Corporation SunOS 5.10 Generic Patch January 2005 3671 [ Apr 14 10:18:01 Method "start" exited with status 0 ] ... The start method exited with zero. — manu endla, Apr 15 '15 at 12:01
Well, there are many ways SMF monitors an application: forking and exiting processes, delivered signals. Probably one of those bad events was noticed and SMF marked the service as maintenance. These events should be masked in the app manifest. — myaut, Apr 15 '15 at 12:06
The actual system that provides SMF with such facilities is System Contracts. You may try to `clear` your application status, carefully restart it, and issue `svcs -v hyperic-agent` to get the CTID (contract ID) of your service, then run `ctwatch CTID` to track those events (if the service isn't already marked as maintenance). — myaut, Apr 15 '15 at 12:15
I greatly appreciate the response. After a clear and careful restart, hurrah! I got the CTID. Please find the output below. But after some time the status goes into the maintenance state again. root@rhmwsoss:/opt/hyperic-agent/agent-4.6.6.1-EE/bundles/agent-4.6.6.1/bin# ctwatch 37211 CTID EVID CRIT ACK CTTYPE SUMMARY 37211 28052 crit no process contract empty — manu endla, Apr 15 '15 at 13:17

score 3 · Accepted Answer · answered Apr 15 '15 at 14:41

Several things can lead to this behavior:

The starter (the start/exec property of the service) returned a status other than SMF_EXIT_OK (zero). In that case, check the logs; `svcs -x` shows where they are:
```
 # svcs -x ssh
 ...
 See: /var/svc/log/network-ssh:default.log
```
If you check the logs, you may see a message like the following, which means the starter script failed or is written incorrectly:
```
 [ Aug 11 18:40:30 Method "start" exited with status 96 ]
```
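As a rough sketch of that log check, the snippet below pulls the exit status out of every `Method "start" exited with status N` line. The function name `start_exit_codes` is hypothetical, and the sample text is just the two log lines quoted in this thread; on a real system you would feed it the log file that `svcs -x` names.

```shell
# Print the exit status of every "start" method run found in an SMF log.
# Reads the log text on stdin (hypothetical helper, not part of SMF).
start_exit_codes() {
    sed -n 's/.*Method "start" exited with status \([0-9][0-9]*\).*/\1/p'
}

# Sample input: the two log lines quoted in this thread.
sample_log='[ Apr 14 10:18:01 Method "start" exited with status 0 ]
[ Aug 11 18:40:30 Method "start" exited with status 96 ]'

# Any value other than 0 here means the start method itself failed.
printf '%s\n' "$sample_log" | start_exit_codes
```

If every reported status is 0, as in the question, the start method is not the cause and the second reason below is the more likely one.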
Another reason for such behavior is that the service faults while it is running (i.e. one of its processes dumps core or receives a kill signal, or all of its processes exit), as described here: https://blogs.oracle.com/lianep/entry/smf_5_fault_retry_models

The actual system that provides SMF with these monitoring facilities is System Contracts. You can determine the contract ID of an online service with `svcs -v` (the CTID field):
```
# svcs -vp svc:/network/smtp:sendmail
STATE          NSTATE        STIME    CTID   FMRI
online         -             Apr_14       68 svc:/network/smtp:sendmail
            Apr_14       1679 sendmail
            Apr_14       1681 sendmail
```
Then watch events with `ctwatch`:
```
# ctwatch 68
CTID    EVID    CRIT ACK CTTYPE   SUMMARY
68      28      crit no  process  contract empty
```
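The two steps can be chained. Here is a minimal sketch that extracts the CTID column from `svcs -v` output so it can be handed to `ctwatch`. The helper name `ctid_of` is hypothetical, and the sample text is the sendmail listing above; the commented-out lines at the end assume a Solaris host and the `hyperic-agent` service from the question.

```shell
# Print the CTID column for each service row of `svcs -v` output read on stdin.
# Only rows whose fifth field is an FMRI (svc:/...) are considered, so the
# header line and the per-process lines added by -p are skipped.
ctid_of() {
    awk '$5 ~ /^svc:/ { print $4 }'
}

# Sample input copied from the sendmail listing above.
sample_svcs='STATE          NSTATE        STIME    CTID   FMRI
online         -             Apr_14       68 svc:/network/smtp:sendmail
            Apr_14       1679 sendmail
            Apr_14       1681 sendmail'

printf '%s\n' "$sample_svcs" | ctid_of

# On a Solaris host (assumption: the service is named hyperic-agent):
#   svcadm clear hyperic-agent
#   ctwatch "$(svcs -v hyperic-agent | ctid_of)"
```

`ctwatch` has to be started while the service is still online, since the contract of interest is the one that exists before the fault.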
Then there are two options to handle that:
- There is a real problem with the service, so it eventually faults. Then debug the application.
- It is normal behavior for the service, so you should edit and re-import your service manifest to make SMF less paranoid, i.e. configure the ignore_error and duration properties.
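For the second option, a manifest fragment along these lines is one way to set those properties. This is a sketch, not the Hyperic manifest: the values shown are illustrative assumptions, and whether they are appropriate depends on how the agent actually behaves. Both properties live in the `startd` property group read by `svc.startd`.

```xml
<!-- Goes inside the <service> or <instance> element of the manifest. -->
<property_group name='startd' type='framework'>
    <!-- Assumption: do not treat core dumps or fatal signals of
         contract members as a service fault. -->
    <propval name='ignore_error' type='astring' value='core,signal' />
    <!-- Assumption: 'transient' tells svc.startd not to track the
         processes after the start method exits; the default is 'contract'. -->
    <propval name='duration' type='astring' value='transient' />
</property_group>
```

Note that `ignore_error` only covers core dumps and signals. An empty contract, as in the `ctwatch` output above, means every process in the contract has exited, and that case is governed by `duration` instead.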

Thanks a lot for the info! It really helped me. In my case the start method is exiting with zero, but I see some error statements in the logs that I am trying to fix. Anyway, thanks again! — manu endla, Apr 16 '15 at 09:07
