Scenario
We had a vCenter with an external PSC. We converged them, and the converge job was successful except for a certificate warning.
After a week we tried to decommission the old PSC appliance and found that its status was shown in the Web Client as "Unknown".
On checking, we found applmgmt in a stopped state. We tried to start it, but it failed with the error below.
[ ~ ]# service-control --status
Running:
lwsmd pschealth vmafdd vmcad vmdird vmdnsd vmonapi vmware-analytics vmware-certificatemanagement vmware-cis-license vmware-cm vmware-rhttpproxy vmware-sca vmware-sts-idmd vmware-stsd vmware-vapi-endpoint vmware-vmon
Stopped:
applmgmt vmware-statsmonitor
[ ~ ]# service-control --start applmgmt
Operation not cancellable. Please wait for it to finish...
Performing start operation on service applmgmt...
Error executing start on service applmgmt. Details {
"detail": [
{
"translatable": "An error occurred while starting service '%(0)s'",
"id": "install.ciscommon.service.failstart",
"args": [
"applmgmt"
],
"localized": "An error occurred while starting service 'applmgmt'"
}
],
"componentKey": null,
"resolution": null,
"problemId": null
}
Service-control failed. Error: {
"detail": [
{
"translatable": "An error occurred while starting service '%(0)s'",
"id": "install.ciscommon.service.failstart",
"args": [
"applmgmt"
],
"localized": "An error occurred while starting service 'applmgmt'"
}
],
"componentKey": null,
"resolution": null,
"problemId": null
}
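When several appliances need checking, the stopped list can be pulled out of the status output with a small filter. This is a minimal sketch, assuming the output keeps the "Running:" / "Stopped:" layout shown above; the function name stopped_services is hypothetical and not part of service-control.

```shell
# Hypothetical helper: read `service-control --status` output on stdin and
# print each service listed under "Stopped:", one name per line.
# Assumes section headers are single words followed by a colon, as shown above.
stopped_services() {
    awk '
        /^[[:space:]]*Stopped:/      { in_stopped = 1; next }
        /^[[:space:]]*[A-Za-z]+:/    { in_stopped = 0 }   # any other header, e.g. "Running:"
        in_stopped                   { for (i = 1; i <= NF; i++) print $i }
    '
}

# Example usage on the appliance (not executed here):
#   service-control --status | stopped_services
```

On the PSC above this would print applmgmt and vmware-statsmonitor.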
We had this issue on two infrastructures, and the fix below resolved it on only one of them.
Fix that worked on the first PSC
# List all masked units (unit files symlinked to /dev/null) for removal.
find /etc/systemd/system/ -lname '/dev/null' -exec ls {} \;
# Automatically remove them (or rm each file)
find /etc/systemd/system/ -lname '/dev/null' -exec rm {} \;
# Reload the systemd daemon
systemctl daemon-reload
# Start services or Reboot
service-control --start --all
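A more cautious variant of the fix above is to move the /dev/null symlinks aside instead of deleting them, so they can be put back if needed. This is a sketch, not the procedure we actually ran: the function names and the backup directory are hypothetical. It also assumes GNU find (for -lname), that link names are unique across subdirectories, and that the backup directory is outside the directory being searched.

```shell
# Hypothetical helper: list symlinks that point to /dev/null (masked units).
# The directory defaults to /etc/systemd/system.
list_masked_units() {
    dir="${1:-/etc/systemd/system}"
    find "$dir" -type l -lname '/dev/null' | sort
}

# Hypothetical helper: move the masked-unit symlinks into a backup directory
# rather than removing them. Links with the same name would overwrite each other.
backup_masked_units() {
    dir="$1"
    backup="$2"
    mkdir -p "$backup" || return 1
    find "$dir" -type l -lname '/dev/null' -exec mv -- {} "$backup"/ \;
}

# Example usage on the appliance (not executed here; the backup path is an assumption):
#   list_masked_units
#   backup_masked_units /etc/systemd/system /root/masked-units-backup
#   systemctl daemon-reload
```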
However, the second PSC was still not happy, so we had to remove the replication manually.
Manual removal of the replication
1) Shut down the PSC and the vCenter and take an offline snapshot
2) Power on only the vCenter. Do not start the PSC
3) SSH to the vCenter and run the commands below
a) List all PSCs connected
]# ./vdcrepadmin -f showservers -h localhost -u administrator -w XXXX
cn=oldpscappliance.mydomain.com,cn=Servers,cn=Sites,cn=Configuration,dc=vsphere,dc=local
cn=vcenter.mydomain.com,cn=Servers,cn=Sites,cn=Configuration,dc=vsphere,dc=local
Note: XXXX is the SSO password for administrator@vsphere.local
The output lists two servers: the old PSC appliance and the vCenter with the PSC converged into it.
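If the server list is long, the host names can be extracted from the DNs returned by showservers. This is a minimal sketch, assuming each line has the cn=<host>,cn=Servers,... form shown above; partner_hosts is a hypothetical name, not a vdcrepadmin option.

```shell
# Hypothetical helper: read showservers output on stdin and print only the
# host name taken from the leading cn= component of each server DN.
# Lines that do not match the cn=<host>,cn=Servers,... pattern are ignored.
partner_hosts() {
    sed -n 's/^cn=\([^,]*\),cn=Servers,.*/\1/p'
}

# Example usage (not executed here), piping the showservers command above:
#   ./vdcrepadmin -f showservers -h localhost -u administrator -w XXXX | partner_hosts
```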
We ran the command below to make sure the vCenter was pointing to the converged PSC and not the old appliance.
]# /usr/lib/vmware-vmafd/bin/vmafd-cli get-ls-location --server-name localhost
https://vcenter.mydomain.com:443/lookupservice/sdk
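The same check can be scripted by comparing the host in the lookup service URL with the expected vCenter name before unregistering anything. This is a sketch with hypothetical function names (ls_host, ls_points_to). It assumes a URL of the form shown above and does not handle IPv6 literals.

```shell
# Hypothetical helper: print the host part of a lookup service URL.
# Strips the scheme, then everything from the first ':' or '/'.
ls_host() {
    printf '%s\n' "$1" | sed -e 's#^[a-zA-Z][a-zA-Z0-9+.-]*://##' -e 's#[:/].*$##'
}

# Hypothetical helper: succeed only if the URL ($2) points at the expected host ($1).
ls_points_to() {
    [ "$(ls_host "$2")" = "$1" ]
}

# Example usage on the vCenter (not executed here):
#   url=$(/usr/lib/vmware-vmafd/bin/vmafd-cli get-ls-location --server-name localhost)
#   ls_points_to vcenter.mydomain.com "$url" && echo "pointing to the converged PSC"
```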
The get-ls-location output confirmed that the old PSC appliance was not in use, so we decided to manually remove the association.
# /bin/cmsso-util unregister --node-pnid oldpscappliance.mydomain.com --username administrator --passwd XXXX
Watch the output; it should end like this:
2019-11-12T08:29:24.939Z Running command: ['/usr/lib/vmware-vmafd/bin/dir-cli', 'service', 'list', '--login', 'administrator']
2019-11-12T08:29:25.059Z Done running command
Stopping all the services ...
All services stopped.
Starting all the services ...
Started all the services.
Success
2019-11-12T08:33:13.071Z Running command: ['/usr/bin/sed', '-i', '-e', 's/cmsso-util.*/cmsso-util/g', '/var/log/vmware/procstate']
2019-11-12T08:33:13.829Z Done running command
Log in to the vCenter via the Web Client and, under Administration -> System Configuration, make sure that the old PSC is no longer listed.
You may keep the old PSC appliance for a few days and delete it once it's all good.