gkumark
Cisco Employee

This document describes the steps to follow when replacing a leaf or spine switch in an ACI fabric.

 

When you unbox the new switch, note down its serial number. Power on the switch and connect a console to check whether it is running in ACI mode or NX-OS mode. If it is running in NX-OS mode, follow the steps documented in "Converting the switch from NxOS to ACI mode" to convert it to ACI mode.

 

Note: Customers in the USA can choose the preferred version of ACI software to be pre-loaded when placing the RMA request.

 

Once you confirm the switch is in ACI mode, follow the steps below.

 

From the new switch console, run the command "setup-clean-config.sh" and then reload the switch (run the command "reload") to clean up any existing configuration. This prevents leftover configuration on the new switch from conflicting with the existing fabric, for example if the new switch was previously part of another ACI fabric.

Decommission the current/failed switch

1. In the ACI GUI, navigate to Fabric -> Inventory -> Fabric Membership and identify the switch to be replaced. In this example, I will be replacing leaf 103.

 

1.jpg

2. Right-click the switch to be replaced and select Decommission. A pop-up window opens as shown in the image below. Select "Remove from Controller" and click Submit. A second pop-up window then asks you to confirm the decommission.

2.jpg

Tip: The 'Remove from controller' option completely removes the node from the ACI fabric and disassociates the serial number from the Node ID. The 'Regular' option temporarily removes the node from the ACI fabric, with the expectation that the same node will rejoin the fabric with the same Node ID, for instance when the node needs to be powered down temporarily for maintenance.

 

The switch now disappears from the Fabric Membership page.

 

3. Disconnect the switch to be replaced from the fabric and disconnect its power cable. Unmount the old switch and mount the new switch.

 

Commission the new switch

4. Power on the new switch and connect the new switch to the fabric.

Note: If you are replacing the leaf switch, make sure that the new leaf switch is connected to all the spine switches in the fabric. If you are replacing a spine switch, make sure to connect the new switch to all the leaf switches in the fabric. 

 

Provided the switch is in ACI mode and connected to the fabric, the fabric should now discover the new switch automatically using LLDP.

 

5. Go back to GUI -> Fabric -> Inventory -> Fabric Membership and look for the new switch, which has no IP address assigned (0.0.0.0) and no node ID assigned. Confirm that it is the new switch by verifying the serial number.

3.jpg

 

6. Right-click the new switch and click "Register Switch". A few editable fields appear. It is very important to enter the right information in the fields below. The remaining fields can be left at their defaults.

  • POD ID: Default is 1. Change this to the correct POD ID if you have a multi-pod fabric.
  • Node ID: It is very important to configure the right node ID. Enter the same node ID as the previous switch, because the APIC pushes configuration based on the node ID. Once the node ID is assigned and the switch is registered, you cannot change it without decommissioning the switch.
  • Node Name: Enter the same node name as before.

4.jpg

5.jpg
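The same three fields can also be expressed as a REST payload. The sketch below only builds the JSON body and does not contact an APIC. It assumes the `fabricNodeIdentP` class and the `uni/controller/nodeidentpol` DN from the APIC object model, which you should confirm with the API Inspector in your own APIC GUI before use. The function name and the serial number `FDO00000000` are hypothetical.

```python
import json


def build_node_registration(serial, node_id, name, pod_id=1):
    """Build a JSON body that maps a switch serial number to a node ID and name.

    Assumption: class fabricNodeIdentP under uni/controller/nodeidentpol,
    as observed in the APIC object model; verify against your APIC version.
    """
    if not serial:
        raise ValueError("serial number is required")
    if node_id <= 0 or pod_id <= 0:
        raise ValueError("node_id and pod_id must be positive")
    return {
        "fabricNodeIdentP": {
            "attributes": {
                "dn": f"uni/controller/nodeidentpol/nodep-{serial}",
                "serial": serial,
                "nodeId": str(node_id),  # same node ID as the replaced switch
                "name": name,            # same node name as before
                "podId": str(pod_id),    # change for multi-pod fabrics
            }
        }
    }


if __name__ == "__main__":
    # Hypothetical serial number; leaf 103 matches the example in this document.
    body = build_node_registration("FDO00000000", 103, "leaf103")
    print(json.dumps(body, indent=2))
```

Under these assumptions, the body would be sent as an authenticated POST to the APIC. Reusing the old node ID is what lets the APIC push the existing configuration to the replacement switch.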

 

7. Click "Update" and wait for the APIC to assign a TEP IP to the new switch.

6.jpg

8. You can verify the switch status in GUI -> Fabric -> Inventory -> Topology. The new switch should now appear as part of the topology.

7.jpg

 

9. SSH to the APIC and run the command "acidiag fnvread" to confirm that the new switch shows up as "active".

8.jpg
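If you want to script the check in step 9, a minimal sketch is shown below. It assumes the usual column layout of "acidiag fnvread" (ID, Pod ID, Name, Serial Number, IP Address, Role, State, LastUpdMsgId) and node names without spaces. The sample text, serial numbers, and addresses are made up for illustration and are not taken from a real fabric.

```python
def node_state(fnvread_output, node_id):
    """Return the State column for node_id, or None if the node is not listed."""
    for line in fnvread_output.splitlines():
        fields = line.split()
        # Data rows start with the numeric node ID; header and separator
        # rows never match, so they are skipped implicitly.
        if len(fields) >= 7 and fields[0] == str(node_id):
            return fields[6]
    return None


# Illustrative sample only (assumed layout, invented values).
SAMPLE = """\
      ID   Pod ID       Name    Serial Number     IP Address    Role     State   LastUpdMsgId
----------------------------------------------------------------------------------------------
     101        1    leaf101      FDO00000001   10.0.0.64/32    leaf    active   0
     103        1    leaf103      FDO00000003   10.0.0.66/32    leaf    active   0
"""

if __name__ == "__main__":
    print(node_state(SAMPLE, 103))
```

A return value other than "active", or None, means the replacement has not fully joined the fabric yet.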

 

 

Troubleshooting

Scenario 1: The node is not discovered in the fabric
  • Connect a console and run the command "show version" to make sure that the switch is running in ACI mode. If it is running in NX-OS mode, convert it to ACI mode. The conversion steps are available from the link listed at the beginning of this document.
  • Run the command "show lldp neighbors" and check whether it lists the directly connected switch. If the neighbor is not listed, check and confirm that the cable is good. Otherwise, open a case with TAC for help.

Scenario 2: The newly added switch shows as "not supported"

In the ACI GUI -> Fabric -> Inventory -> Fabric Membership page, if the new switch is listed as "no" under the "Supported Model" column, the APIC catalog firmware may be too old to include the model of the new switch. To solve this, upgrade the APIC to the same version level as the new switch. After that, the new switch should be able to join the fabric.

 

Scenario 3: SSL certificate issue

  • If the switch fails to register with the fabric after you assign a node ID and node name, there could be an SSL certificate issue. You can verify this using the method below.
    • From the console, run the command "netstat -an | grep <TEP ip of APIC>" and check for an "ESTABLISHED" session on port 12215 with the APIC. This session could be established with any of the APICs in your fabric, so re-run the command with each APIC's IP address.
    • Below is an example of the above step.

netstat.jpg

       

       

    • An established session with any of the APICs on port 12215 means the new switch is able to communicate with the APIC policy manager. If you don't see this session with any of the APICs, it could be an SSL certificate issue. Open a case with TAC for further assistance.
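The Scenario 3 check can be sketched as a small parser over saved "netstat -an" output. It assumes the common Linux column layout (protocol, Recv-Q, Send-Q, local address, foreign address, state). Because this document does not say which end of the connection holds port 12215, the sketch accepts it on either side. The sample addresses are invented.

```python
APIC_PORT = "12215"


def split_addr(addr):
    """Split 'host:port' on the last colon."""
    host, _, port = addr.rpartition(":")
    return host, port


def has_apic_session(netstat_output, apic_ips):
    """True if an ESTABLISHED TCP session to any APIC IP uses port 12215."""
    wanted = set(apic_ips)
    for line in netstat_output.splitlines():
        fields = line.split()
        if len(fields) < 6 or not fields[0].startswith("tcp"):
            continue  # skip headers and non-TCP rows
        if fields[5] != "ESTABLISHED":
            continue
        _, local_port = split_addr(fields[3])
        foreign_host, foreign_port = split_addr(fields[4])
        if foreign_host in wanted and APIC_PORT in (local_port, foreign_port):
            return True
    return False


# Illustrative sample only (invented addresses).
SAMPLE = """\
tcp        0      0 10.0.0.66:41234     10.0.0.1:12215      ESTABLISHED
tcp        0      0 10.0.0.66:22        10.0.0.1:50000      ESTABLISHED
tcp        0      0 10.0.0.66:41300     10.0.0.3:12215      TIME_WAIT
"""

if __name__ == "__main__":
    print(has_apic_session(SAMPLE, ["10.0.0.1", "10.0.0.2", "10.0.0.3"]))
```

Passing all APIC TEP addresses at once mirrors the advice to re-run the command for each APIC.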

       

Scenario 4: New switch doesn't get a TEP IP assigned
  • If the new switch doesn't get a TEP IP assigned after you register it, there could be an issue with DHCP IP allocation from the APIC. Please open a case with TAC for assistance.

 

 

Comments
Rick1776
Level 5

Thanks for the great write up.

m1xed0s
Spotlight

Nice Article! One question: does the replacement switch have to be the same model/generation as the old switch? Such as 9732PX replacing 9396PX?

rsua
Cisco Employee

Hi m1xed0s,

 

The leaf switch model does not have to be the same model/generation UNLESS the leaf is part of a VPC pair.

 

Leaf Switches in VPC pair must be the same exact model.

Hi everyone.

I replaced a leaf switch that is registered in many NMS systems. Is it possible to keep the IP address that the failed switch had?

Currently, the leaf switches get their IP addresses through DHCP.

 

I look forward to your comments.

 

Thank you.

m1xed0s
Spotlight
@rsua, Thanks for the information and sorry for the long delay on response... I have a scenario related and hope you could share some input. Hypothetically, I have issue on one 9396PX leaf but I got 9732PX as the RMA replacement due to the age or inventory or whatever...But my fabric uses two 9396PX switches as vPC pair, so how can I replace the bad 9396PX with the RMA 9732PX while still need vPC? Do I need to get another 9732PX to replace 9396PX? If so, how can I do the replacement while maintaining the data plane? Also what if I have FEX connected to 9396PX and how to replace the FEX? Thanks! /S
m1xed0s
Spotlight
@rsua, Does "Leaf Switches in VPC pair must be the same exact model" mean that vPC would not work between two different leaf switch models, even of the same generation?
IslamOmar
Level 1

Perfect

JanWillem
Level 1

Hi nice overview, thanks.

 

I have a question regarding a topic I found on the internet that I don't see in yours. Is this needed?

Updating your known_hosts file on the APIC

If you try to SSH to the new leaf switch from the APIC and you have connected to the faulty leaf switch previously via SSH, permission will be denied and you will be presented with warnings about “POSSIBLE DNS SPOOFING” and “REMOTE HOST IDENTIFICATION HAS CHANGED”.

This is because the RSA host key has changed for the replacement leaf switch and does not match the RSA host key in known_hosts. To fix this you need to remove the RSA host key from your known_hosts file on the APIC as follows:

 

apic1# bash

cph-local@apic1:~> ssh-keygen -R <leaf-switch-hostname>

# Host <leaf-switch-hostname> found: line 2

/home/cph-local/.ssh/known_hosts updated.

Original contents retained as /home/cph-local/.ssh/known_hosts.old

cph-local@apic1:~> exit

exit

apic1# ssh <leaf-switch-hostname>
