The error "The policy change was not implemented due to internal communication failure from the APIC to other APICs/nodes on the fabric, please retry the request" most likely means that the APIC cluster is not "Fully Fit".
Policy updates may not execute until the APIC cluster is "Fully Fit". The APICs may be in the "Waiting for Cluster Convergence" state and will not proceed with policy updates.
The "Waiting for Cluster Convergence" state can be caused by one or more processes that have crashed or cored. If you discover that your cluster is not "Fully Fit", evaluate the processes on each APIC and try to start any process that has crashed or cored.
To evaluate processes of each APIC:
- Access APIC Admin GUI
- Select SYSTEM -> CONTROLLERS
- Expand CONTROLLERS in the left work pane
- Expand each APIC
- Select the PROCESSES folder for each APIC
- Change OBJECTS PER PAGE from 15 to 50 so that you can see and scroll through all listed processes
- Check the PROCESS ID column for any process with a PROCESS ID of 0 (exclude the KERNEL process)
- If a process other than KERNEL has a PROCESS ID of 0, the associated PROCESS NAME is the process that needs to be started.
- Open a terminal window and SSH to each APIC that has a failed process.
- Use the CLI command "acidiag start <process name>" to try to start each failed process.
For example:
Starting a failed "dhcpd" process on APIC1
deadbeef@fab1_apic1:~> acidiag start
usage: acidiag start [-h]
{xinetd,mgmt,ae,lldpad,observer,dbgr,idmgr,dhcpd,eventmgr,policymgr,reader,bootmgr,topomgr,nginx,vmmmgr,appliancedirector,scripthandler}
deadbeef@fab1_apic1:~> acidiag start dhcpd
{u'dme': {u'output': u'dhcpd start/running, process 4264\n', u'error_code': 0, u'error_string': u''}}
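The check above (PROCESS ID of 0, excluding KERNEL) and the result line printed by "acidiag start" can be sketched in Python. This is only an illustration under stated assumptions: the process list is assumed to be copied by hand from the GUI as (name, pid) pairs, the function names are hypothetical, and the only output format assumed is the dict-style line shown in the example above.

```python
import ast


def find_stopped_processes(processes):
    """Return names of processes whose PID is 0, skipping the KERNEL entry.

    `processes` is a list of (name, pid) pairs transcribed from the GUI
    (hypothetical input format, not an APIC API).
    """
    return [name for name, pid in processes
            if pid == 0 and name.lower() != "kernel"]


def start_succeeded(output_line):
    """Check the dict-style line printed by 'acidiag start <process>'.

    Assumes the format shown in the example above; returns True only if
    every entry reports error_code 0.
    """
    result = ast.literal_eval(output_line.strip())
    return bool(result) and all(
        entry.get("error_code") == 0 for entry in result.values()
    )


if __name__ == "__main__":
    # Hypothetical process list; only the dhcpd failure comes from the example.
    sample_processes = [("kernel", 0), ("dhcpd", 0), ("nginx", 1234)]
    print(find_stopped_processes(sample_processes))

    sample_output = (
        r"{u'dme': {u'output': u'dhcpd start/running, process 4264\n', "
        r"u'error_code': 0, u'error_string': u''}}"
    )
    print(start_succeeded(sample_output))
```

A non-zero error_code, or a non-empty error_string, would suggest the process did not start and that the steps below apply.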
NOTE: If you have tried starting the failed processes and the APIC cluster is still not "Fully Fit", and you are still seeing errors and cannot make configuration or policy changes, please open a Service Request with the Cisco TAC.
Before opening a Service Request with the Cisco TAC, please gather the following:
(SSH to an APIC in the cluster and run the following commands)
1. version
deadbeef@fab2_apic1:~> version
node type   node id  node name    version
----------  -------  -----------  --------------
controller  1        fab2_apic1   1.0(1k)
leaf        101      fab2_leaf1   n9000-11.0(1d)
leaf        102      fab2_leaf2   n9000-11.0(1d)
leaf        103      fab2_leaf3   n9000-11.0(1d)
leaf        104      fab2_leaf4   n9000-11.0(1d)
spine       201      fab2_spine1  n9000-11.0(1d)
spine       202      fab2_spine2  n9000-11.0(1d)
2. show cores
deadbeef@fab2_apic1:~> show cores
# Executing command: 'cat /aci/fabric/inventory/pod-1/troubleshooting/summary; cd /aci/system/controllers/; find . -name troubleshooting -exec echo ';' -exec cat '{}'/summary ';' '
troubleshooting:
node module creation-time file-size service-name process original-location exit-code death-reason last-heartbeat
---- ------ --------------------- --------- ------------- ------- ----------------------------------------------- --------- ------------ --------------
202 27 2014-08-26T14:45:57.000-04:00 35002939 policy_mgr 4177 /var/sysmgr/logs/1409078757_0x1b01_policy_mgr_log.4177.tar.gz 11 2 0.000000
104 1 2014-09-04T16:07:05.000-04:00 53567750 event_manager 4017 /var/sysmgr/logs/1409861225_0x101_event_manager_log.4017.tar.gz 6 2 0.000000
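To summarize the "show cores" output for a Service Request, a small Python sketch like the following may help. It assumes each core entry sits on a single whitespace-separated line with the ten columns from the header above (wrapped cells such as the timestamp and file path rejoined first). The function and field names are hypothetical.

```python
from collections import namedtuple

# Column names taken from the "show cores" header, with dashes replaced
# by underscores so they are valid Python identifiers.
FIELDS = [
    "node", "module", "creation_time", "file_size", "service_name",
    "process", "original_location", "exit_code", "death_reason",
    "last_heartbeat",
]
CoreRecord = namedtuple("CoreRecord", FIELDS)


def parse_core_rows(text):
    """Return a CoreRecord for every line that looks like a data row.

    Header, separator, and command-echo lines are skipped because they
    do not start with a numeric node ID or do not have ten fields.
    """
    records = []
    for line in text.splitlines():
        parts = line.split()
        if len(parts) == len(FIELDS) and parts[0].isdigit():
            records.append(CoreRecord(*parts))
    return records


def services_with_cores(records):
    """Return the sorted, de-duplicated service names that left a core."""
    return sorted({r.service_name for r in records})


if __name__ == "__main__":
    sample = """\
node module creation-time file-size service-name process original-location exit-code death-reason last-heartbeat
202 27 2014-08-26T14:45:57.000-04:00 35002939 policy_mgr 4177 /var/sysmgr/logs/1409078757_0x1b01_policy_mgr_log.4177.tar.gz 11 2 0.000000
104 1 2014-09-04T16:07:05.000-04:00 53567750 event_manager 4017 /var/sysmgr/logs/1409861225_0x101_event_manager_log.4017.tar.gz 6 2 0.000000
"""
    for record in parse_core_rows(sample):
        print(record.node, record.service_name, record.original_location)
```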
3. techsupport all
deadbeef@fab2_apic1:~> techsupport all
Triggering techsupport for Switch 201 using policy supNode201
Triggered on demand tech support successfully for node 201, will be available at: /data/techsupport on the controller.
Triggering techsupport for Switch 202 using policy supNode202
Triggered on demand tech support successfully for node 202, will be available at: /data/techsupport on the controller.
Triggering techsupport for Switch 102 using policy supNode102
Triggered on demand tech support successfully for node 102, will be available at: /data/techsupport on the controller.
Triggering techsupport for Switch 103 using policy supNode103
Triggered on demand tech support successfully for node 103, will be available at: /data/techsupport on the controller.
Triggering techsupport for Switch 101 using policy supNode101
Triggered on demand tech support successfully for node 101, will be available at: /data/techsupport on the controller.
Triggering techsupport for Switch 104 using policy supNode104
Triggered on demand tech support successfully for node 104, will be available at: /data/techsupport on the controller.
Triggering techsupport for APIC using policy ts_exp_pol
Triggered on demand tech support successfully for controllers, will be available at: /data/techsupport on the controller.
Use 'status' option with your command to check techsupport status
(Note: "techsupport all" may have issues depending on which process has cored or crashed. A "techsupport local" command may need to be run instead.)
The tech support files will be located on the APIC(s) in the following directory: "/data/techsupport"
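As a rough sanity check on the "techsupport all" output, the following Python sketch lists which node IDs reported a successful trigger and which expected nodes are missing. It assumes the message wording shown in the transcript above. The function names and the idea of passing in an expected node list (for example, taken from the "version" output) are hypothetical.

```python
import re

# Matches the success line for a switch node, as worded in the transcript.
NODE_OK = re.compile(
    r"Triggered on demand tech support successfully for node (\d+)"
)
# Matches the success line for the controllers.
CONTROLLERS_OK = re.compile(
    r"Triggered on demand tech support successfully for controllers"
)


def triggered_nodes(output):
    """Return the sorted node IDs that reported a successful trigger."""
    return sorted(int(node_id) for node_id in NODE_OK.findall(output))


def controllers_triggered(output):
    """Return True if the controller tech support was triggered."""
    return CONTROLLERS_OK.search(output) is not None


def missing_nodes(output, expected):
    """Return expected node IDs with no success line in the output."""
    return sorted(set(expected) - set(triggered_nodes(output)))


if __name__ == "__main__":
    sample = (
        "Triggering techsupport for Switch 201 using policy supNode201\n"
        "Triggered on demand tech support successfully for node 201, "
        "will be available at: /data/techsupport on the controller.\n"
        "Triggering techsupport for APIC using policy ts_exp_pol\n"
        "Triggered on demand tech support successfully for controllers, "
        "will be available at: /data/techsupport on the controller.\n"
    )
    print(triggered_nodes(sample))
    print(controllers_triggered(sample))
    print(missing_nodes(sample, [101, 201]))
```

A successful trigger only means the collection was started; the files still need to appear in "/data/techsupport" before they can be attached to the Service Request.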
Thank you for using the ACI Cisco Support Forum.