Eddie Chami, Cisco Employee

1    Glossary

 

  • nV – Network Virtualization
  • nV Edge – Network Virtualization on Edge routers
  • IRL – Inter Rack Links (for data forwarding)
  • Control Plane – the hardware and software infrastructure that deals with messaging / message passing across processes on the same or different nodes (RSPs or LCs)
  • EOBC – Ethernet Out of Band Channel – the ports used to establish the control plane extension between the chassis
  • Data Plane – the hardware and software infrastructure that deals with forwarding, generating and terminating data packets.
  • DSC – Designated Shelf Controller (the Primary RSP for the nV edge system)
  • Backup-DSC – Backup Designated Shelf Controller
  • UDLD – Unidirectional Link Detection protocol. An industry-standard protocol used in Ethernet networks for monitoring link forwarding health.
  • FPD – Field Programmable Device (FPGAs and other programmable devices that can be upgraded)
  • Iron Man - Codename (nickname) for the ASR9001

 

2    Converting Single chassis ASR9K to nV Edge

 

This section assumes that the single chassis boxes are running 4.2.1 or later images, with the latest FPD versions. Check and correct this using the following commands on both chassis:

 

admin show hw-module fpd location all

admin upgrade hw-module fpd all location all

 

 

  • Find the serial numbers of each chassis. The serial number is found in the “SN:” field in the example below (the FOX.. values); the serial number is also printed on the chassis itself.

 

(admin)#show inventory chassis

 

NAME: "chassis ASR-9006-AC", DESCR: "ASR-9006 AC Chassis"

PID: ASR-9006-AC, VID: V01, SN: FOX1435GV1C

 

NAME: "chassis ASR-9006-AC", DESCR: "ASR-9006 AC Chassis"

PID: ASR-9006-AC, VID: V01, SN: FOX1429GJSV

 

 

  • One of the chassis will end up being called “Rack0”, the other will be called “Rack1” – there are only two rack numbers possible.

 

  • Choose either chassis as “Rack0”. On Rack0 only, enter the below config in admin config mode (this example uses the serial numbers above).
    • (admin config) # nv edge control serial FOX1435GV1C rack 0
      (admin config) # nv edge control serial FOX1429GJSV rack 1
      (admin config) # commit

 

The above configuration builds a “database” on Rack0 of the chassis serial numbers and their assigned rack numbers. This acts as a security mechanism: it is used to verify whether a chassis that tries to join this nV Edge system is actually allowed to be part of it.

 

  • Wire up the control plane connections between the chassis (explained in detail in Section 3) and reload the chassis which is designated as “Rack1”
  • Important: Regarding the wiring diagram below – the Control Ethernet cabling should be done only after all the previous steps have been executed and both chassis are ready to “join” an nV Edge system. The control plane network should not be connected before the nV configuration is completed.
  • NOTE: The ASR9001 (a.k.a. Iron Man) wiring is different. If you wish to build an Iron Man cluster, please go to Section A1.1.1 in the Addendum at the end of this document for wiring information.

 

[Image: 1.JPG – nV Edge control Ethernet wiring between the two chassis]

 

 

  • The Rack1 chassis will reboot. Rack0 will “add” Rack1 to the nV Edge system after verifying its serial number, and Rack1, on booting up, will communicate with Rack0. A software versioning check is then carried out: Rack1 will be requested to use the same XR software and/or SMUs as Rack0, rebooting again if required to achieve this consistency. After this reboot completes, Rack1 is part of the nV Edge system.

 

  • At this point the nV Edge system is fully up. Any further reboots of either or both chassis need no user intervention; the chassis will come up and both will “join” the nV Edge system.

 

  • Important: ALL the interfaces on the chassis hosting the backup-DSC RSP will be in SHUTDOWN state until at least one Inter-Rack Data Link is in forwarding state. Please refer to Section 4.3 for more details.

 

At any time in the nV Edge system, one of the RSPs (in either Rack0 or Rack1) will be the “master” for the entire nV edge system. Another RSP in the system (again in Rack0 or Rack1) will be the “backup” for the entire nV edge system. The “master” is called a primary-DSC, using CRS Multi chassis terminology. The “backup” is called a backup-DSC. The primary-DSC will run all the primary protocol stacks (OSPF, BGP etc..) and the backup-DSC will run all the backup protocol stacks.

 

To find out which RSP is primary-DSC and which is backup-DSC, use the below command in admin exec mode.

 

RP/0/RSP0/CPU0:ios(admin)#show dsc

---------------------------------------------------------

   Node (   Seq#)     Role     Serial# State

---------------------------------------------------------

   0/RSP0/CPU0 (   0)   ACTIVE FOX1432GU2Z BACKUP-DSC

   0/RSP1/CPU0 ( 1223769) STANDBY FOX1432GU2Z NON-DSC

   1/RSP0/CPU0 ( 1279475)   ACTIVE FOX1441GPND PRIMARY-DSC

   1/RSP1/CPU0 ( 1279584) STANDBY FOX1441GPND NON-DSC

 

As can be seen above, the Rack1 RSP0 (1/RSP0/CPU0) is the primary-DSC and Rack0 RSP0 (0/RSP0/CPU0) is the backup-DSC. The Primary and Backup DSCs do not have any “affinity” towards any one chassis or any one RSP. Whichever chassis in the nV edge system boots up first will likely select one of its RSPs as the primary-DSC.

 

Another point to note is that the “Active” / “Standby” states of the RSPs, which are familiar concepts in single chassis operation, are superseded by the primary-DSC / backup-DSC functionality in an nV Edge system. For example, in a single chassis system, the primary/backup protocol stacks used to run on the Active and Standby RSPs of that chassis. As discussed in the preceding paragraph, that is no longer the case in an nV Edge system – in nV Edge, the primary-DSC and backup-DSC run the primary and backup protocol stacks.

2.1    Supported hardware and caveats

 

  • Only Enhanced Ethernet line cards (Typhoon) and SIP-700 (Thor) line cards are supported in the chassis. Older Ethernet line cards (Trident) will not work.
  • Only Enhanced Ethernet line cards (Typhoon) can be used for IRL. 100G/40G IRLs are supported from release 5.1.3.
  • Tomahawk hardware (400G, 800G, 1200G line cards) does not support nV cluster, nor can it exist in a cluster.
  • Only chassis of the same type can be connected to form an nV Edge system.
  • The ASR9001 is also supported as an nV Edge system. In terms of High Availability, the 9001 chassis will support full nV Edge HA in a later release, so an outage of about 30 seconds is expected during a chassis shutdown or failover. Also, some of the show commands used in this document might not appear on the 9001 until a 5.1.0 release.
  • nV Edge on the 9922 chassis is supported from 4.3.1.
  • Only Cisco-supported SFPs are allowed for all IRL connections.
  • Important: The RSP front panel control plane SFPs HAVE TO BE 1Gig SFPs. 10Gig SFPs are NOT supported.
  • NOTE: For ASR9001 (a.k.a. Iron Man) specific supported hardware and caveats, please go to Section A1.1.2 in the Addendum at the end of this document.

 

2.2    Booting with different images on each chassis

 

In an nV Edge system, the two chassis may for whatever reason end up with non-identical XR software and/or SMUs installed – for example, if one chassis is forced to boot a particular image by a ROMMON setting. In that case the chassis that boots up later will tell the DSC chassis (normally Rack0) about its version details, and the DSC chassis will “reject” that version if it does not match.

 

2.3    Configuring the Management Ethernet network for nV edge

 

As on a single chassis, the MgmtEth interfaces can be configured on the nV Edge cluster. The question often asked is which subnet to put the four interfaces in and what the available options are. Three options are available:

 

  • Flat management: have all four interfaces on one subnet, and use one virtual IP address to access the nV cluster (see the sketch below).
  • Per-chassis management (global and VRF): have each chassis/rack in its own network, with one virtual address in the global table and another virtual address in a VRF for the second chassis.
  • Per-chassis management (both VRF): have each chassis in its own VRF, with a unique virtual address per VRF.
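
A minimal sketch of the flat-management option (the subnet, addresses and MgmtEth port numbers are illustrative assumptions; the remaining two MgmtEth interfaces would be configured the same way):

interface MgmtEth0/RSP0/CPU0/0
 ipv4 address 192.0.2.11 255.255.255.0
!
interface MgmtEth1/RSP0/CPU0/0
 ipv4 address 192.0.2.13 255.255.255.0
!
ipv4 virtual address 192.0.2.10/24
ipv4 virtual address use-as-src-addr

The idea is that the virtual address gives management stations a single IP to reach the nV cluster.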

 

 

3    nV Edge Control Plane

 

The nV Edge control plane provides software and hardware extensions to create a “unified” control plane for all the RSPs and line cards on both the nv Edge chassis. The control plane packets are forwarded from chassis to chassis “in hardware” as you will see in sections below. Control plane multicast etc.. is done in hardware for both the chassis – so there is no control plane performance impact because there are two chassis instead of one.

 

 

The nV Edge control plane links have to be direct L1 connections; no network or intermediate routing/switching devices are allowed in between. Some details of the control plane connections are provided below to give a better understanding of the reasoning behind our recommendations. The control Ethernet links (front panel SFP+ ports) operate in 1Gig mode.

 

[Image: 2.JPG – nV Edge control Ethernet connections between the two chassis]

(Note: Not applicable to the ASR9001. See Addendum for details.)

 

As seen in the diagram above, each RSP in each chassis has an Ethernet switch to which all the CPUs in the system (line card CPUs, RSP CPUs, any other CPUs) connect. So each CPU connects to two switches – one on each RSP. At any point in time, only one of the switches will be “active” and switching the control plane packets; the other will be “inactive” (regardless of whether the system is nV Edge or single chassis). The “active” switch can be on either of the RSPs in the chassis – whichever switch can ensure the best connectivity across all the CPUs in the system.

 

The two SFP+ front panel ports on the RSP-440 are direct ports plugging into the switch on the RSP. As shown in the diagram of an nV Edge system, the simple goal is to connect each RSP (the switch inside the RSP) to each switch on the remote chassis. In the above case, if any one of the links goes down, there are three possible backup links. At any point in time, only one of the links is used for forwarding control plane data; the other three links are in “standby” state.

 

Connecting the two chassis with just two EOBC links (i.e., RSP0 to RSP0 and RSP1 to RSP1) is NOT recommended, as it does not provide the required resilience.

 

Important: The control Ethernet is the heart of the system – if there is anything wrong with it, it can seriously degrade the nV edge system. So it is HIGHLY recommended to use all four control Ethernet links.

 

 

Here is a view of the RSP440 EOBC ports. These ports cannot be used for anything other than EOBC; they cannot be used or configured as L2 or L3 data ports. (EOBC design and instructions differ for the ASR9001 Iron Man chassis; please go to Section A1.1.1 in the Addendum at the end of this document.)

 

[Image: 4.png – RSP440 front panel EOBC ports]

 

 

In the case of a single-RSP-per-chassis nV Edge topology, the wiring model is shown below. But again, this is not recommended for resiliency reasons: if the only RSP in a chassis goes down, the entire chassis and all the line cards in it also go down.

 

[Image: 5.JPG – wiring for a single RSP per chassis]

3.3    Control Plane UDLD

 

UDLD runs on the control plane links to ensure bi-directional forwarding health of the links. UDLD runs at a 200 msec interval x 5, i.e. an expiry interval of 1 second. This means that if a control link is unidirectional for 1 second, the RSPs will take action to switch the control plane traffic to one of the three standby links.

 

Note that the one second detection is only for unidirectional failures – for a physical link fault (like fiber cut), there will be interrupts triggered with the fault and the link switchover to the standby links will happen in milliseconds.

 

The front panel SFP+ ports are referred to as ports “0” and “1” in the show command below. So each RSP has two of these ports, and the command below shows which port on which RSP is connected to which other port on which other RSP.

 

In the example below:

 

  • Port “0” on 0/RSP0 is connected to port “0” on 1/RSP0.
  • Port “1” on 0/RSP0 is connected to port “1” on 1/RSP1
  • Port “0” on 0/RSP1 is connected to port “0” on 1/RSP1
  • Port “1” on 0/RSP1 is connected to port “1” on 1/RSP0

 

Also, the “port pair” that is “active” and used for forwarding control Ethernet data is the link between port “0” on 0/RSP0 and port “0” on 1/RSP0, shown in the Forwarding state below. All the other links are just backup links.

 

The “CLM Table version” is also a useful number to note: if it keeps changing, the control link UDLD is flapping. In a good, stable condition that number should not change.

 

RP/0/RSP0/CPU0:ios# show nv edge control control-link-protocols location 0/RSP0/CPU0

Priority lPort       Remote_lPort    UDLD  STP

======== =====     ============     ==== ========

0   0/RSP0/CPU0/0   1/RSP0/CPU0/0   UP   Forwarding

1   0/RSP0/CPU0/1   1/RSP1/CPU0/1   UP   Blocking

2   0/RSP1/CPU0/0   1/RSP1/CPU0/0   UP   On Partner RSP

3   0/RSP1/CPU0/1   1/RSP0/CPU0/1   UP   On Partner RSP

Active Priority is 0

Active switch is   RSP0

CLM Table version is 2

 

Each RSP has two front panel EOBC links, numbered 0 and 1. The CLI to shut a link is as below:

 

RP/1/RSP0/CPU0:A9K-Cluster-IPE(admin-config)#nv edge control control-link disable <0-1> location <location>
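
For example, to disable front panel port 0 on 0/RSP0 (the port and location values here are only illustrative):

RP/0/RSP0/CPU0:ios(admin-config)# nv edge control control-link disable 0 location 0/RSP0/CPU0
RP/0/RSP0/CPU0:ios(admin-config)# commit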

On shutting a control port, the CLI will also set a rommon variable on that RSP like “CLUSTER_0_DISABLE = 1” if port 0 is disabled and “CLUSTER_1_DISABLE = 1” if port 1 is disabled. As long as this rommon variable is set, neither rommon nor IOS-XR will ever enable that port.

 

The behavior when ALL the control links are shut is, as expected, that both chassis become DSC. But if the IRL links are active, then one of the chassis will reload, and as soon as it comes back up with the IRL links active it will reboot again (and will keep doing so while the control links stay down).

 

Currently, this is the recommended recovery procedure if all the control links have been shut down:

 

  • 1.  Shut down the IRL links from one of the chassis (whichever chassis doesn’t reboot, remember one chassis comes up and reboots). This will get both chassis to stay UP.
  • 2.  Reload one chassis, keep BOTH RSPs in rommon, and unset the rommon variables as below on BOTH RSPs:
    • a.  rommon> unset CLUSTER_0_DISABLE
    • b.  rommon> unset CLUSTER_1_DISABLE
    • c.  rommon> sync
    • d.  rommon> reset
  • 3.  On the other chassis, which is still in XR, go to admin config and enter “no nv edge control control-link disable <port> <location>” for each port and location that was shut down.
  • 4.  On the RSPs in rommon, say the below               
    • a.  rommon> boot mbi:

 

NOTE: The above is indeed a cumbersome and lengthy procedure (but only if we shut all control links). In 4.2.3 the procedure to unshut would be very simple – on whichever chassis that doesn’t reboot, go to admin config mode and just enter “no nv edge control control-link disable <port> <location>” and that will automatically take care of syncing it with the other chassis also.

 

 

  • 1.  show nv edge control control-link-port-counters – this CLI displays the Rx/Tx packet statistics through the EOBC front panel ports (0 or 1)

 

  • 2.  show nv edge control control-link-sfp – this CLI dumps the SFP EEPROM that’s plugged into the front panel port. In addition it provides the data below

 

SFP Plugged in      : 0x00000001 (1)

SFP Rx LOS       : 0x00000000 (0)

SFP Tx Fault         : 0x00000000 (0)

SFP Tx Enabled       : 0x00000001 (1)

 

The “SFP Plugged in” should be value 1 if there is an SFP present. The “SFP Rx LOS” should be 0 or else there is Rx Loss of Signal (an error !). The “SFP Tx Fault” should be 0 or else there is an SFP Fault (an error !). The “SFP Tx Enabled” should be 1 or else the SFP is not enabled from the control Ethernet driver (also an error !).

 

 

Supported EOBC SFPs at the time this article was written are listed below; the most recent list is on page 71 (search for “EOBC”) of the following link:

 

http://www.cisco.com/c/en/us/td/docs/interfaces_modules/transceiver_modules/compatibility/matrix/OL_6981.pdf

 

In 4.2.1:

SFP-GE-S=          1000BASE-SX SFP (DOM), MMF, 550/220m

In 4.3.0:

SFP-GE-S=          1000BASE-SX SFP (DOM), MMF, 550/220m
GLC-SX-MMD=        1000BASE-SX SFP, MMF, 850nm, 550m/220m, DOM
GLC-LH-SMD=        1000BASE-LX/LH SFP transceiver module for MMF and SMF, 1300-nm wavelength

In 5.2.0:

GLC-T
GLC-ZX-SMD
GLC-EX-SMD

 

 

 

 

 

  • 3.  show nv edge control control-link-debug-counts – this is mostly for Cisco engineering support debugging. Values that might be of interest are as below

 

Admin UP        : 0x00000001 (1)

SFP supported cached     : 0x00000001 (1)

PHY status register     : 0x00000070 (112)

 

An “Admin UP” value of 0 means the “nv edge control control-link disable <port> <location>” CLI has been configured; without that config it should be 1, which is the default. “SFP supported cached” indicates whether a Cisco-supported SFP is plugged in – 1 means the SFP is supported, 0 means it is not. If the control link has an SFP plugged in, a cable connected to a remote end that is also up, the laser is good, the link is good, etc., then the “PHY status register” should read 0x70 – it is an internal PHY register indicating that the link is healthy. If there is no cable, no SFP, a bad cable, a bad link, etc., it will not read 0x70; this can be useful to Cisco support during debugging.

 

4    nV Inter Rack Link (IRL) connections

 

The IRL connections carry forwarded traffic that enters one chassis and exits an interface on the other chassis of the nV Edge system. IRLs have to be 10 Gig links and direct L1 connections – no routed/switched devices are allowed in between. There can be a maximum of 16 such links between the chassis. A minimum of 2 links is recommended for resiliency (Section 4.7 discusses load balancing across links), and the two links should be on two separate line cards, again for resiliency in case one line card goes down due to a fault. The number of IRL links needs to be dimensioned based on the number of cards in the system and the expected traffic over the IRLs during a failure.

 

The configuration of an interface as IRL is simple, as shown below:

 

interface tenGigE 0/1/1/1

nv

edge

   interface

!

 

Add this config to the IRL interfaces on both chassis, of course. We run UDLD over these links to monitor bi-directional forwarding health. Only when UDLD reports that the echo and echo response are fine (standard UDLD state machine) do we place the interface into “Forwarding” state; until then the interface is in “Configured” state. So an IRL interface might be “Configured” but not “Forwarding”; once it is both, it will be used for forwarding data across the chassis.

 

RP/0/RSP0/CPU0:ios#show nv edge data forwarding location 0/RSP0/CPU0

nV Edge Data interfaces in forwarding state: 1

 

tenGigE 0_1_1_1     <--> tenGigE 1_1_0_1

 

nV Edge Data interfaces in configured state: 2

 

tenGigE 1_1_0_1

tenGigE 0_1_1_1

 

The above output shows two IRLs in “Configured” state – one on each rack. It also shows one “pair” of IRLs in “Forwarding” state. The “pair” is made up of one interface from each rack; the UDLD protocol automatically detects which interface is connected to which and forms the “pair”.

 

So if you have configured IRLs, but you don’t see the line “nV Edge Data interfaces in forwarding state:” in your CLI output, then that means that something is wrong. We would recommend going through the standard interface checklist

 

-> Are the cables and SFPs all good ?

-> Are the interfaces unshut and Up/Up ?

-> Are there interface drops or errors ?

-> If you are conversant with the packet path, are there any other packet path drops ?

 

NOTE 1: IRL links can NOT be part of a link bundle.

NOTE 2: For IRL information specific to the ASR9001 (a.k.a. Iron Man), please go to Section A1.1.3 in the Addendum at the end of this document.

 

For more information about IRL dimensioning see the following page:

 

 

The UDLD timers on the IRL links are set to 20 milliseconds times 5 hellos, ie around 100 msecs as the expiry timeout. That means that any uni-directional problem with the IRL links will be detected & corrected in around 150 msecs (100 msecs + delta for processing overheads).

 

If you want to see the UDLD state machine on the line card hosting these links, the CLI below can be used. The Interface [0x...] value in the output is what we call the “ifhandle”. The interface name corresponding to it can be displayed using the CLI “show im database ifhandle <ifhandle> location <line card>” (an example follows the output below).

 

In the example below, the UDLD state is Bidirectional, which is the desired correct state when things are working fine.

------------------------------
#show nv edge data protocol all location all
-----------------node0_0_CPU0------------------
 
Interface [0x40013c0][2048][0]
---
Port configured for UDLD: Enabled
Port operational state for UDLD: Enabled
Current bidirectional state: Bidirectional
Current operational state: Advertisement - Single neighbor detected
Message interval: 20 msec
Time out interval: 10000 msec
Global TTL2MSG: 12
Interface TTL2MSG: 12
 
    Entry 1
    ---
    Expiration time: 240 msec
    Device ID: 1
    Current neighbor state: Bidirectional
    Device name: CLUSTER_RACK_01
    Port ID: [0x44000bc0][2048][0]
    Neighbor echo 1 device: CLUSTER_RACK_00
    Neighbor echo 1 port: [0x40013c0][2048][0]
 
    Message interval: 20 msec
    Time out interval: 100 msec
    CDP Device name: ASR9K CPU
 
 
-----------------node1_RSP0_CPU0------------------
 
-----------------node1_0_CPU0------------------
Interface [0x44000bc0][2048][0]
---
Port configured for UDLD: Enabled
Port operational state for UDLD: Enabled
Current bidirectional state: Bidirectional
Current operational state: Advertisement - Single neighbor detected
Message interval: 20 msec
Time out interval: 10000 msec
Global TTL2MSG: 12
Interface TTL2MSG: 12
 
    Entry 1
    ---
    Expiration time: 220 msec
    Device ID: 1
    Current neighbor state: Bidirectional
    Device name: CLUSTER_RACK_00
    Port ID: [0x40013c0][2048][0]
    Neighbor echo 1 device: CLUSTER_RACK_01
    Neighbor echo 1 port: [0x44000bc0][2048][0]
 
    Message interval: 20 msec
    Time out interval: 100 msec
    CDP Device name: ASR9K CPU
-------------------------------------
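
To map the ifhandle back to an interface name, a hedged example (the ifhandle value is taken from the output above; the line card location is illustrative):

RP/0/RSP0/CPU0:ios# show im database ifhandle 0x40013c0 location 0/0/CPU0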
 

 

The IRL links are used for forwarding packets whose ingress and egress interfaces are on separate racks. They are also used for all protocol Punt packets and protocol Inject packets. As explained in Section 2, the protocol stack “Primary” runs on the primary-DSC RSP in one of the chassis. So if a protocol punt packet comes in on an interface in another chassis, it has to be punted to the primary-DSC RSP in the remote chassis. This punt is done via the IRL. Similarly if the protocol stack on the primary-DSC wants to send a packet out of an interface on another chassis, that is also done via the IRL interfaces.

 

4.3    nV IRL “threshold monitor”

 

If the number of IRL links available for forwarding goes below a certain threshold, that might mean that the remaining IRLs will get congested and more and more inter-rack traffic will get dropped. So the IRL-monitor provides a way of shutting down other ports on the chassis if the number of IRL links goes below a threshold. The commands available are below

 

RP/0/RSP0/CPU0:ios(admin-config)#nv edge data minimum <minimum threshold> ?

backup-rack-interfaces   Disable ALL interfaces on backup-DSC rack

selected-interfaces   Disable only interfaces with nv edge min-disable config

specific-rack-interfaces   Disable ALL interfaces on a specific rack

 

There are three modes of configuration possible.

 

4.3.1   Backup-rack-interfaces config

 

With this configuration, if the number of IRLs goes below the <minimum threshold> configured, ALL interfaces on whichever chassis is hosting the backup-DSC RSP will be shut down. Again, note that the backup-DSC RSP can be on either of the chassis.

 

4.3.2   Specific-rack-interfaces config

 

With this configuration, if the number of IRLs goes below the <minimum threshold> configured, ALL interfaces on the specified rack (0 or 1) will be shut down.

 

4.3.3   selected-interfaces config

 

With this configuration, if the number of IRLs goes below the <minimum threshold> configured, the interfaces (on any rack) that are explicitly configured to be brought down will be shut down. How do we “explicitly” configure an interface (on any rack) to respond to IRL threshold events?

 

RP/0/RSP0/CPU0:ios(config)#interface gigabitEthernet 0/1/1/0

RP/0/RSP0/CPU0:ios(config-if)#nv edge min-disable

RP/0/RSP0/CPU0:ios(config-if)#commit

 

So in the above example, if the number of IRLs goes below the configured minimum threshold, interface Gig0/1/1/0 will be shut down.

 

4.3.4   What is the default config

 

The default config (if the customer does not configure any of the above explicitly) is the equivalent of having configured “nv edge data minimum 1 backup-rack-interfaces”. This means that if the number of IRLs in forwarding state goes below 1 (i.e. there is no longer at least one forwarding IRL), ALL the interfaces on whichever rack hosts the backup-DSC will get shut down, meaning all traffic on that rack stops being forwarded.

 

This might suit some customers and not others. The behavior can be effectively turned off with “nv edge data minimum 0 backup-rack-interfaces” – this says that only if the number of IRLs in forwarding state goes below 0 (which can never happen) should any interface on any rack be shut down. Raising the threshold instead is shown in the example below.
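
For example (a hedged sketch; the threshold value of 2 is only illustrative), to shut the backup-DSC rack's interfaces as soon as fewer than two IRLs are in forwarding state:

RP/0/RSP0/CPU0:ios(admin-config)# nv edge data minimum 2 backup-rack-interfaces
RP/0/RSP0/CPU0:ios(admin-config)# commit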

 

 

4.4    Default QoS on IRL interfaces

 

When an interface is configured as an IRL link, we install 5 absolute priority queues on the port in both the ingress and egress directions. The priorities are below:

 

  • 1.  All protocol punt / inject packets like protocol Hellos etc..
  • 2.  Multicast traffic
  • 3.  Fabric priority  0 traffic
  • 4.  Fabric priority  1 traffic
  • 5.  Fabric priority  2 traffic

 

The IRL links do not allow “user configurable” MQC policies on the IRL interfaces themselves. The classification of “punt / inject” and “multicast” are done “internally” in microcode – that is, other than being a punt/inject or multicast packet, there is no way by which we can “influence/force” a packet to go into the first two queues.

 

What gets into the last three queues can be influenced – simply by having ingress QoS policies that mark packets to a COS value of 0, 1 or 2. There is no other way to influence what gets into these queues. The queue id selected on the ingress chassis's IRL is carried across in the VLAN COS bits; the egress chassis's IRL that receives the packet uses this queue id encoded in the VLAN COS to select the queue it uses on ingress (when it receives the packet from the remote chassis).

 

The CLI to display the nV Edge QoS queues is shown below, using an IRL interface with the config shown. The subslot number 0 in the example is the subslot in which the MPA (the pluggable adaptor) sits on a MOD-80/160 line card in the ASR9K; if the line card does not support pluggable adaptors, just use 0 for the subslot. The port number 1 used in the example is simply the last number in the 1/1/0/1 notation.

 

The drops (if any) in these queues are aggregated and reflected in the “show interface” drops also. The standard interface MIBs can be used for monitoring these drops. Note that the individual queue drops are not exported to MIBs, only the aggregate drops are exported as the interface drops. Also the IRL links are just regular interfaces, so the regular interface MIBs will all work on IRLs also.

 

RP/0/RSP0/CPU0:ios#sh running-config interface gigabitEthernet 1/1/0/1

interface GigabitEthernet1/1/0/1

nv

edge

   interface

!

 

 

RP/0/RSP0/CPU0:ios#show qoshal cluster subslot 0 port 1 location 1/1/CPU0

 

Cluster Interface Queues : Subslot 0, Port 1

===============================================================

Port 1 NP 0 TM Port 17

   Ingress: QID 0xa8 Entity: 0/0/0/4/21/0 Priority: Priority 1 Qdepth: 0

   StatIDs: commit/fast_commit/drop: 0x5f0348/0x0/0x5f0349

    Statistics(Pkts/Bytes):

     Tx_To_TM 681762/140538069

     Total Xmt 681762/140538069 Dropped 0/0

 

   Ingress: QID 0xa9 Entity: 0/0/0/4/21/1 Priority: Priority 2 Qdepth: 0

   StatIDs: commit/fast_commit/drop: 0x5f034d/0x0/0x5f034e

   Statistics(Pkts/Bytes):

     Tx_To_TM 0/0

     Total Xmt 0/0 Dropped 0/0

 

   Ingress: QID 0xab Entity: 0/0/0/4/21/3 Priority: Priority 3 Qdepth: 0

   StatIDs: commit/fast_commit/drop: 0x5f0357/0x0/0x5f0358

    Statistics(Pkts/Bytes):

     Tx_To_TM 0/0

     Total Xmt 0/0 Dropped 0/0

 

   Ingress: QID 0xaa Entity: 0/0/0/4/21/2 Priority: Priority Normal Qdepth: 0

   StatIDs: commit/fast_commit/drop: 0x5f0352/0x0/0x5f0353

    Statistics(Pkts/Bytes):

     Tx_To_TM 0/0

     Total Xmt 0/0 Dropped 0/0

 

   Ingress: QID 0xac Entity: 0/0/0/4/21/4 Priority: Priority Normal Qdepth: 0

   StatIDs: commit/fast_commit/drop: 0x5f035c/0x0/0x5f035d

   Statistics(Pkts/Bytes):

     Tx_To_TM 0/0

     Total Xmt 0/0 Dropped 0/0

 

   Egress: QID 0xc8 Entity: 0/0/0/4/25/0 Priority: Priority 1 Qdepth: 0

   StatIDs: commit/fast_commit/drop: 0x5f03e8/0x0/0x5f03e9

      Statistics(Pkts/Bytes):

     Tx_To_TM 3372382/697778537

     Total Xmt 3372382/697778537 Dropped 0/0

 

   Egress: QID 0xc9 Entity: 0/0/0/4/25/1 Priority: Priority 2 Qdepth: 0

   StatIDs: commit/fast_commit/drop: 0x5f03ed/0x0/0x5f03ee

   Statistics(Pkts/Bytes):

     Tx_To_TM 0/0

     Total Xmt 0/0 Dropped 0/0

 

   Egress: QID 0xcb Entity: 0/0/0/4/25/3 Priority: Priority 3 Qdepth: 0

   StatIDs: commit/fast_commit/drop: 0x5f03f7/0x0/0x5f03f8

      Statistics(Pkts/Bytes):

     Tx_To_TM 0/0

     Total Xmt 0/0 Dropped 0/0

 

   Egress: QID 0xca Entity: 0/0/0/4/25/2 Priority: Priority Normal Qdepth: 0

   StatIDs: commit/fast_commit/drop: 0x5f03f2/0x0/0x5f03f3

    Statistics(Pkts/Bytes):

     Tx_To_TM 0/0

     Total Xmt 0/0 Dropped 0/0

 

   Egress: QID 0xcc Entity: 0/0/0/4/25/4 Priority: Priority Normal Qdepth: 0

   StatIDs: commit/fast_commit/drop: 0x5f03fc/0x0/0x5f03fd

   Statistics(Pkts/Bytes):

     Tx_To_TM 0/0

     Total Xmt 0/0 Dropped 0/0

 

RP/0/RSP0/CPU0:ios#
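
Since the per-queue drops are aggregated into the regular interface counters, a quick hedged check is simply the standard interface counters on the IRL (the interface is the one from the example above; the pipe filter is illustrative):

RP/0/RSP0/CPU0:ios# show interfaces gigabitEthernet 1/1/0/1 | include drops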

 

 

4.5    Configurable QoS on IRL interfaces

 

To support more flexible QoS options for customers who want more than the default QoS described in Section 4.4, we provide an option for configuring regular MQC policies in the EGRESS direction (there is no ingress support), with some limitations. The limitation, in one simple sentence, is that an MQC policy configured on an IRL cannot access the packet contents – there is no way of figuring out whether the packet going out on the IRL is IPv4 or IPv6, etc. So none of the MQC features that need to look into the packet will work. So how exactly is it used?

 

Typical use case is that customer will configure an ingress MQC policy map on any regular (non-IRL) ingress interface. That ingress MQC policy can parse the packet and set a “qos-group” for the packet. The egress IRL policy-map can then match on this qos-group and apply features like queuing and shaping. Random detect can also be applied (not based on dscp though – remember that needs access to packet contents) and of course marking is not supported.

 

The user is not prevented from applying any MQC policy on the IRL regardless of whether that policy has features unsupported on the IRL or not. There is no config level rejection of policies done on the IRL interface yet (this might be enforced in later releases), so user has to take care to configure only supported features or else the behavior is unpredictable. For example if user configures an egress MQC policy on the IRL that does marking, then the packet going out of the IRL will have contents changed in some random location and that might cause those packets to be dropped in the node or at the host!

 

 

The configuration of MQC on IRL and the show commands etc.. are exactly the same as MQC on a regular interface (remember IRL is just a regular interface !).
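
A minimal sketch of the qos-group approach described above (all class-map/policy-map names, match criteria, the bandwidth value and the interface numbers are illustrative assumptions):

class-map match-any VOICE
 match dscp ef
 end-class-map
!
class-map match-any IRL-PRIO
 match qos-group 1
 end-class-map
!
policy-map ACCESS-IN
 class VOICE
  set qos-group 1
 !
 class class-default
 !
 end-policy-map
!
policy-map IRL-OUT
 class IRL-PRIO
  bandwidth percent 40
 !
 class class-default
 !
 end-policy-map
!
interface GigabitEthernet0/1/1/0
 service-policy input ACCESS-IN
!
interface TenGigE0/1/1/1
 service-policy output IRL-OUT
!

The ingress policy on the regular interface classifies the packet and tags it with a qos-group; the egress policy on the IRL matches only on that qos-group, so it never needs to look into the packet contents.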

4.6    IRL packet encapsulation and overhead

 

The packet that goes out on the IRL has a VLAN encapsulation with the VLAN hard-coded to vlan-id 1. The vlan-id itself does not really matter; we just use the VLAN COS bits to carry the packet priority, as mentioned in Section 4.4. That is 18 bytes of overhead. In addition there is around 24 bytes of overhead that depends very much on the kind of packet (L3 / L2 / mcast etc..) being transported. So on average we have around 42 bytes of overhead per packet.
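
As a rough worked example (packet sizes are illustrative): a 1500-byte packet crossing the IRL occupies roughly 1500 + 42 = 1542 bytes, i.e. less than 3% extra, while a 64-byte packet occupies roughly 106 bytes, i.e. about 66% extra. Inter-rack traffic profiles dominated by small packets therefore need proportionally more IRL headroom.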

4.7    IRL load balancing

 

The IRLs load balance packets based on flow. How a “flow” is defined varies from feature to feature. In general, for any given feature, the answer to “how does this feature's traffic get load balanced across link bundle members?” also applies to load balancing across the IRLs: IRL load balancing obeys exactly the same principles as link bundle member load balancing. A 32-bit hash value is calculated for each packet/feature, and that hash (with some bit flips etc.. to avoid polarization) is used for IRL load balancing just as it is for link bundles.

 

Let us examine the different kinds of features briefly. This is by no means meant to be an exhaustive documentation of all the load balancing algorithms on the router, rather just to give an overview of the major classes of load balancing.

4.7.1   Ingress IP packet

 

The standard tuple used for hash calculation for load balancing across link bundle members is used here as well – source IP, destination IP, source port, destination port, protocol type. It does not matter whether the egress is IP or MPLS; the ingress is all that matters.

4.7.2   Ingress MPLS packet

 

If the incoming packet is MPLS, the forwarding engine looks deeper to see if the underlying packet is IP. If it is IP, then the standard IP hash tuple is used for calculating the hash. If the underlying packet is not IP, then just the labels from the label stack are used for calculating the hash. The label allocation mode (per CE or per VRF) has no impact on the hash.

 

4.7.3   L2 Unicast

 

Here load balancing is done based on the src/dst MAC addresses. Again, as explained at the start of this section, this is not an exhaustive answer – there are VPLS scenarios where the VC label hash is used instead.

4.7.4   L2 Flood

 

For L2 flood traffic over link bundles there are multiple, fairly elaborate modes of load balancing; the exhaustive treatment is best left to the L2 link bundle documentation. In general, there are two modes of load balancing, tied to the flooding mode in L2.

4.7.4.1  Flood optimized mode

 

In this mode, to keep L2 floods from reaching too many line cards, the hash is “statically” chosen based on the bridge group. So some bridge groups will be “tied” to one IRL and others to another IRL – the same behaviour as for L2 over link bundles.

 

4.7.4.2  Convergence / Resiliency mode

 

In this mode, the L2 flood is hashed in ucode based on the src/dst mac addresses.

 

4.7.5   L3 Multicast

 

L3 Multicast hashes multicast flows based on (S,G) and uses that hash to distribute packets across the IRLs – again the same technique used for distributing multicast packets across link bundle members.

 

5    nV Edge Redundancy model

 

There are four very simple rules that can always help in determining the primary-DSC and backup-DSC RSPs in an nV edge system.

 

  • Primary-DSC and backup-DSC both are always the “Active” RSP in each chassis. The “Active” here refers to the “Active” we know in the context of a single chassis ASR9K – where one RSP is “Active” and another is “Standby”
  • Primary-DSC and backup-DSC will always be on RSPs in different chassis.
  • If a Primary-DSC goes down, then the backup-DSC becomes primary-DSC. The chassis which hosts the Primary-DSC is the DSC chassis.
  • If any RSP other than the primary-DSC or backup-DSC goes down, there is no change in the state of the primary-DSC or backup-DSC.

 

With these four rules in place, in any given scenario we can figure out what happens if any of the RSPs in either chassis goes down.

 

NOTE: The redundancy model for the ASR9001 (a.k.a. Iron Man) differs due to the lack of standby processors. Please go to Section A1.1.2 in the Addendum at the end of this document for details.

 

5.1    Redundancy switchover: Control Ethernet readiness

 

Before issuing a redundancy switchover, it is good practice to check the control links in the system and verify that there is at least one backup link available to take over. For example, in the output below, if we decide to issue “redundancy switchover” on 0/RSP0/CPU0, we have three more links (shown as either “Blocking” or “On Partner RSP”), and one of them can take over as the link connecting the control planes of both chassis (see Section 3.1 for details).

 

Sometimes it might happen that because of some fault (say fiber cut or bad sfp etc..), a few links are down in which case you won’t see those links (neither as “Blocking” nor “On Partner RSP”). So unless there is at least one backup link, if we issue a switchover, then the only link that is “Forwarding” will go away and there won’t be any more control plane connectivity across the chassis.

 

NOTE: We are enhancing the “redundancy switchover” CLI to automatically check this condition and refuse to proceed if there are no backup links. Until this enhancement is implemented, it is recommended to perform this manual check.

 

RP/0/RSP0/CPU0:ios# show nv edge control control-link-protocols location 0/RSP0/CPU0

Priority lPort     Remote_lPort     UDLD STP

======== =====    ============     ==== ========

0   0/RSP0/CPU0/12   1/RSP0/CPU0/12   UP   Forwarding

1   0/RSP0/CPU0/13   1/RSP1/CPU0/13   UP   Blocking

2   0/RSP1/CPU0/12   1/RSP1/CPU0/12   UP   On Partner RSP

3   0/RSP1/CPU0/13   1/RSP0/CPU0/13   UP   On Partner RSP

 

5.2    RSP/Chassis failure Detection in ASR9k nV Edge

 

In an ASR-9k nV Edge system, on failure of the Primary DSC node the RSP in the Backup DSC role becomes Primary, with the duties of being the system “master” RSP and hosting the active set of control plane processes. In the normal case for nV Edge, the Primary and Backup DSC nodes are hosted on separate racks. This means that the failure detection for the Primary DSC occurs via communication between racks.

 

The following mechanisms are used to detect RSP failures across rack boundaries:

  • 1)  FPGA state information detected by the Peer RSP in the same chassis is broadcast over the control links. This is sent if any state change occurs, and periodically every 200ms.
  • 2)  The UDLD state of the inter-chassis control links to the remote rack, with failures detected at 500ms
  • 3)  The UDLD state of the inter-chassis data links to the remote rack, with failures detected at 500ms.
  • 4)  A keep-alive message sent between RSP cards via the inter-chassis control links, with a failure detection time of 10 seconds.

 

Additionally messages are sent between racks for the purpose of Split Node avoidance / detection. These occur at 200ms intervals across the inter-chassis data links, and optionally can be configured redundantly across the RSP Management LAN interfaces. Refer to section 6.5 below.

 

 

 

Example HA Scenarios:

 

  • 1.  Single RSP Failure of the Primary DSC node

 

The Standby RSP within the same chassis initially detects the failure via the backplane FPGA. On failure detection this RSP will transition to the active state and notify the Backup DSC node of the failure via the inter-chassis control link messaging.

 

  • 2.  Failure of Primary DSC node and its Standby peer RSP.

 

There are multiple cases where this case can occur, such as power-cycle of the Primary DSC rack or simultaneous soft reset of both RSP cards within the Primary rack.

 

The remote rack failure will initially be detected by a UDLD failure on the inter-chassis control link. The Backup DSC node then checks the state of the UDLD on the inter-chassis data link. If the rack failure is confirmed by failure of the data link as well, the Backup DSC node becomes active.

 

UDLD failure detection occurs in 500ms, however the time between control link and data link failure can vary since these are independent failures detected by the RSP and LC cards. A windowing period of up to 2 seconds is needed to correlate the control and data link failures, and to allow for split-brain detection messages to be received.

 

The keep-alive messaging between RSP acts as a redundant detection mechanism, should the UDLD detection fail to detect a stuck or reset RSP card.

 

  • 3.  Failure of Inter-Chassis control links (Split Node)

 

Failure is initially detected by the UDLD protocol on the Inter-Chassis control links. Unlike the rack reload scenario above, the Backup DSC will continue receiving UDLD and keep-alive messages via the inter-chassis data link. Similar to the rack reload case, a 2 second windowing period is allowed to correlate the control/data link failures. If after 2 seconds the data link has not failed, or Split Node packets are being received across the Management LAN then the Backup DSC rack will reload to avoid the Split Node condition.

 

 

6    Split Node

 

There are primarily two sets of links connecting the chassis in the nV edge system.

 

  • 1.  Control links (recommended four of them)
  • 2.  IRL links (minimum one)

So the two sets of links together comprise at least FIVE wires. Let us see what happens when there is a fault and a complete set of control links, IRL links, or both goes away (becomes faulty).

[Image: 7.JPG]

 

 

 

First, consider the case where all the IRL links go down but the control links stay up. Refer to Section 4.3 – both chassis will be up and functioning, but the interfaces on one of the chassis “might” get shut down, depending on what config is present on the box (or whether it is just the default config). Again, consult Section 4.3 to understand what config is appropriate for you.

 

 

Next, consider the case where all the control links go down but the IRL links stay up. The two chassis in the nV Edge system cannot function as “one entity” without control links. Each chassis periodically exchanges beacons over the IRL links, so if the control links go down, each chassis will know via the IRL beacons that the other chassis is UP, and one of the chassis has to take itself down and go back to rommon.

 

Which chassis should go back to rommon? The logical choice is that the chassis hosting the Primary-DSC RSP stays up and the non-primary rack resets, the reason being that the chassis hosting the primary-DSC has all the “primary” protocol stacks and we want to avoid disturbing the protocols as much as possible. So we take the non-primary rack down to rommon, and it tries to boot and join the nV Edge system again – once one or more control links become healthy again, that chassis will boot up and join the nV Edge system.

 

Since IOS-XR cannot stabilize with the control links severed in this way, the non-primary rack will continue to bootup, detect that the control links are down and reset until the connectivity issue is resolved.

 

The CLI command “show nv edge control control-link-protocols” can be used to assess the current status of the control links in the event of a problem.

 

 

Finally, if the control links AND the IRL links all go down, we can “potentially” enter what is called a “Split Brain” – each chassis thinks the other chassis has gone down and each declares itself the master. Protocols like OSPF will then start running two instances, each with the same router-id etc.., and that can be a problem for the network.

 

So to try and mitigate this scenario, we provide one more set of “last gasp” paths via the management LAN network. On EACH RSP in the system, we should connect one of the two management LAN interfaces (any one of them) to an L2 network so that all four of those interfaces (from each RSP) can send L2 packets to each other. Then we can enter the below configuration on each of those management LAN interfaces.

 

interface MgmtEth0/RSP0/CPU0/1

nv

edge

   split-brain

!

 

With this configuration, each RSP sends high-frequency beacons on these interfaces at 200 millisecond intervals. If both chassis are functional, each will receive beacons from the other; if both chassis thereby learn that they are working independently, they know it is a problematic scenario and one of them will take itself down. The chassis to reset will be the one that has been in the primary state for the least amount of time.

 

So this “Split Node” management lan path provides yet another alternate path to provide additional resiliency to try and avoid a nasty “Split Node” scenario.

 

But if the control links AND the IRL links AND the split-brain management LAN links ALL go away, then there is no way to exchange any beacons between the chassis, and we will enter the split-brain scenario where both chassis start functioning independently. If the management networks of the two chassis are not in the same subnet, or not in the same location, an L2 connection should be provided between them to enable this last-gasp path.

 

NOTE: The Split Node interface messages are meant to be “best effort” messages; currently we do not monitor the “health” of those links. Those links are regular Management Ethernet interfaces and will have all the usual UP/DOWN traps etc.. But if, for example, there are intermittent drops of the monitoring messages on those links, we do not raise any alarm. We might enhance this in the future to include some monitoring of packet drops (if any) on these links to alert the user.

 

7    Feature configuration caveats

 

 

The link bundle / BVI configuration on nV Edge requires a manual configuration of the mac-address under the interface. An example for a link bundle is shown below:

 

interface Bundle-Ether15

  mac-address 26.51c5.e602 <== A mac like this needs to be configured explicitly

 

For link bundles, the below global LACP configuration is also required:

 

lacp system mac 0201.debf.0000

 

This caveat / requirement will be fixed in a later release; until then this configuration is needed for link bundles / BVIs / any virtual interfaces to work on an nV Edge system.

 

 

interface Bundle-Ether15

  lacp switchover suppress-flaps 15000

 

 

The “bundle manager” is a process that runs on the primary (DSC) and backup (backup-DSC) RSPs and is responsible for the configuration and state maintenance of the link bundle interfaces. When the primary (DSC) chassis in an nV Edge system is reloaded, the bundle manager on the backup-DSC needs to “go active” and start connections to some external processes that provide other services (ICCP, for example). A chassis reload is a much “heavier” operation than a regular RSP switchover, because a chassis reload involves the restart of all RSPs and all line cards on that chassis, and this causes a lot more control plane churn than a regular RSP switchover, where only one node (one RSP) goes away. For example, the basic infrastructure processes that handle IPC (Inter Process Communication) in the system have to do a lot of cleanup: they have to clean up data structures corresponding to all the nodes that went away and flush packets from/to those nodes, etc. The routing protocols / RIB have to process a lot of interface-down notifications and start NSF / GR, etc. Owing to this additional control plane load, when the bundle manager asks to connect to external “services”, those services take more time to respond because they are already busy processing node-down events.

 

Hence, the bundle manager process might be “blocked” for a longer period of time than in a regular switchover scenario. During this “blocked” time period, the remote end might time out and declare the bundle down. To prevent this, we have the “lacp switchover suppress-flaps <seconds>” command. This needs to be configured on the nV Edge system AND on the remote boxes (if the remote is not an IOS-XR box, use whatever the equivalent of this config is on that box). It basically tells the link bundle to tolerate more control packet losses during this period.

 

In the example here we have configured a 15 second tolerance – note that this DOES NOT mean there will be a 15 second packet drop. The bundle manager will update the data plane to use a newly active link as soon as it gets the event that decides who is active (a notification from the peer in the case of MC-LAG), and data can start flowing. All this does is prevent the bundle from going down if the rest of the bundle manager control plane is busy doing other work (like connecting to services) while the peer is expecting some control packet Rx/Tx.

 

7.3    IGP protocols and LFA-FRR

 

ASR9K nV Edge High Availability mode is unique in that it is probably the only High Availability model where we “expect” topology changes during a Backup to Primary Switchover like during a Rack / Chassis reload. If the Primary (DSC) chassis is reloaded, and if that chassis had IGP interface(s) on its line card(s), then when the Backup-DSC takes over as Primary-DSC, it has to do switchover processing AND at the same time process topology changes due to the loss of interfaces.

 

But as we know, for handling switchover cases gracefully, it is normal that customers configure Non Stop Forwarding (NSF) under IGP protocols like ISIS and OSPF. So now when the DSC Chassis is reloaded, the new DSC (old backup-DSC) will immediately start NSF on IGP (say ISIS) and as we know about regular NSF, it can take many seconds (default 90 seconds, can be changed by the nsf lifetime CLI) for NSF to be completed and the RIB will be informed about topology changes only AFTER NSF is complete.

 

So during this time frame, the new DSC chassis will have stale routes pointing to interfaces that no longer exist (those on the chassis that was reloaded), and this can lead to a long period of traffic loss. So what is the solution? If we think the problem through, what we need is for CEF / FIB to change the forwarding tables even though the routing protocols / RIB have not asked for it – and that is exactly what the LFA-FRR feature provides. Without LFA-FRR, the convergence time during a chassis reload in an nV Edge system will be poor. LFA-FRR is a simple configuration; a basic example is shown below. Note that LFA-FRR can work with ECMP paths – one path in the ECMP list can back up the other path in the ECMP list.

 

router isis Cluster-L3VPN

<snip>

interface Loopback0

address-family ipv4 unicast

!

!

interface TenGigE0/1/0/5

address-family ipv4 unicast

   fast-reroute per-link

 

 

7.4    Multicast convergence during RACK reload or OIR

 

When you do a rack OIR/reload, PIM on the old standby / new active rack starts fresh (PIM is not hot standby) and triggers NSF for the first 3 minutes. Only when NSF ends does it download the routes to MFIB and further to the platform-dependent (PD) layer. Until then, the A flag is not set on the RPF interface and packets are dropped.

 

The difference in the rack OIR case is that the line card also goes through a restart, which results in a topology change. However, since the new change cannot be downloaded to PD, the update does not happen and packets are dropped. Compare this with a regular switchover, where only the RP node undergoes a reload: since the LC remains unaffected, even though MRIB is within its NSF window, the packets continue to be switched using the old route.

 

To mitigate this, configure link bundles on all interfaces that carry multicast flows, and make sure each bundle has member links in both racks; this allows a rack OIR without changing the state of the bundle interfaces. A sketch of such a bundle is shown below.
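
A minimal sketch of such a bundle (the bundle number, member interfaces and MAC address are illustrative; the explicit mac-address and the global LACP system MAC follow the caveats earlier in Section 7):

interface Bundle-Ether20
 mac-address 0026.51c5.e602
!
interface TenGigE0/1/0/3
 bundle id 20 mode active
!
interface TenGigE1/1/0/3
 bundle id 20 mode active
!
lacp system mac 0201.debf.0000

One member sits on Rack0 (0/1/0/3) and one on Rack1 (1/1/0/3), so reloading either rack leaves the bundle interface up.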

 

 

8    Feature Gaps

 

BFD Multihop is one feature that is supported on a single chassis, but not on the nV Edge system.

 

The nV Edge system also doesn’t support clock / syncing features like syncE.

 

nV Edge is only recommended with dual RSPs in each chassis due to the EOBC redundancy design. The EoBC of the ASR9001 is designed without RSP redundancy in mind, so it’s not exactly the same as chassis that support dual RSP.

9    Convergence numbers

 

After applying all the required caveats mentioned in Section 7, at the time of writing (4.2.3 24I early-image timeframe), the convergence number for an L3VPN profile with an access-facing link bundle (one member from each chassis) and core-facing ECMP (two IGP links, one from each chassis), with 3K eBGP sessions and one million routes, is around 8 seconds for a chassis reload (of either chassis) in the nV Edge system. The number will certainly differ for other profiles; each profile needs separate measurement, qualification and tuning. An obvious question is how much lower this can get. The natural comparison is with an RSP failover, but the factors below make a chassis reload (very) different from an RSP failover:

 

  • 1.  Chassis reload is a “software detected” event .. A regular RSP switchover in an ASR9K system is a “hardware detected” event because both RSPs are in the same chassis and one going down will trigger an interrupt for the other. Whereas a chassis going away is detected by loss of keep alive packets from one chassis to the other. And how fast we detect a failure is a fine balance between speed and stability. If we detect keep alive time outs too fast, the margin for errors / packet losses in the system is narrow and we might have false triggers. If we detect too slow, then the convergence suffers

 

  • 2.  Chassis reload involves a heavy amount of control plane churn – line cards go away, hence interfaces go away, so the control plane protocols, control plane infrastructure (like IPC – Inter Process Communication) etc.. has to do work to update this state and make sure that it clears up data structures related to entities that went away. Imagine if the Chassis that went away had like 128K interfaces ! That will trigger quite some control plane activity

 

  • 3.  Chassis reload involves updating data plane on the surviving chassis whereas RSP failover does not touch the data plane. And based on scale, this can be a time consuming activity also.

 

  • 4.  Chassis reload can involve topology change and updates triggered by the neighboring boxes whereas RSP switchover is practically unknown to the peers (especially if NSR is enabled for all protocols).

 

For all these reasons, it is almost impossible to achieve anything better than, say, 3 to 4 seconds (currently 8 seconds) for the L3VPN profile mentioned at the beginning of this section – and even closing that remaining delta of around 5 seconds would require quite a high engineering investment.

 

10  “Debugging mode” CLIs – cisco support only

 

These CLIs are visible only to cisco-support users. There are many more CLIs than explained below; many of them relate purely to tuning the internal control port error-retry logic etc.. inside the driver and are unlikely to be of use to anyone other than the engineers. The ones explained below are more “generic”, related to the UDLD protocol etc..

 

  • 1.  nv edge control control-link udldpriority – this CLI sets the thread priority of the process handling the UDLD packets to higher / lower value. Maximum is 56 and minimum is 10. We sometimes try tweaking this to higher values when we find that the CPU is being loaded by some other high priority activity and hence UDLD flaps. We also tweak it sometimes to be lower in case we find that the UDLD thread itself is hogging too much CPU.

 

  • 2.  nv edge control control-link udldttltomsg – this is a multiplier that affects the UDLD timeout. For some reason (say high CPU utilization or too many link errors etc..) if we want to make UDLD run slower, then this value can be set to a larger value. The UDLD timeout will be 50msecs times this multiplier

 

  • 3.  nv edge control control-link allowunsupsfp – we allow only Cisco supported 1Gig SFPs in the front panel control ports; this CLI allows any SFP (that the PHY on the board supports) to be plugged in.

 

  • 4.  nv edge control control-link noretry – by default, if the front panel control ports encounter an error, a retry algorithm with a backoff timer kicks in to bring the port up again. If we don’t want retries, this CLI disables the retry algorithm.

 

  • 5.  nv edge data allowunsup – by default only 10Gig interfaces are allowed as IRLs. If some other interface type (like 1Gig) has to be enabled as IRL for some debugging / testing, this CLI has to be configured first before the IRL config will be allowed under the unsupported interface.

 

  • 6.  nv edge data stopudld – again for any debugging reasons, if the UDLD protocol has to be stopped on the IRL, this CLI can be used. Any **configured** IRL interface will be declared as available for forwarding regardless of the interface state (UP or DOWN). So be careful while using this CLI.

 

  • 7.  nv edge data udldpriority – this CLI sets the thread priority of the process handling the UDLD packets (on the line card hosting the IRL) to a higher or lower value. The maximum is 56 and the minimum is 10. As with the control-link equivalent, we tweak this higher when other high priority activity loads the CPU and causes UDLD flaps, or lower when the UDLD thread itself is hogging too much CPU.

 

  • 8.  nv edge data udldttltomsg – this is a multiplier that affects the IRL UDLD timeout. If for some reason (say high CPU utilization on the LC hosting the IRL, or too many link errors on the IRL) we want UDLD to run slower, this value can be set larger. The UDLD timeout will be 20 msecs times this multiplier.
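As an illustration of the timeout arithmetic above, here is a minimal, hedged sketch (these are cisco-support CLIs entered in admin config mode; the prompts and exact argument syntax shown are assumptions for illustration, not verified output):

(admin-config)# nv edge control control-link udldttltomsg 20
! control-link UDLD timeout becomes 50 msecs x 20 = 1000 msecs (1 second)
(admin-config)# nv edge data udldttltomsg 50
! IRL UDLD timeout becomes 20 msecs x 50 = 1000 msecs (1 second)
(admin-config)# commit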

 

11  nV Edge MIBs

 

 

The SNMP agent and MIB specific configuration have no differences for the nV Edge scenario.

 

 

With up to four RSPs in an nV Edge system – each chassis having an “Active / Standby” pair of RSPs, and the nV Edge as a whole having a “primary-DSC / backup-DSC” pair – there are multiple redundancy elements that come into the picture. There is “node redundancy”, which describes, within a given chassis, which node is “Active” and which is “Standby”. There is node-group redundancy, which describes, within the nV Edge system, which node is the “primary-DSC” and which is the “backup-DSC”. And there are “process groups” which have their own redundancy characteristics – for example, protocol stacks (say OSPF) have redundancy across the primary-DSC / backup-DSC pair, whereas some other “system” software elements have redundancy across the “Active / Standby” RSPs in each chassis. This relationship is what is meant by “process groups” and their redundancy. The MIBs are summarized below.

 

MIB: CISCO-RF-MIB

Description: Currently provides DSC chassis active/standby node pair info. In the nV Edge scenario it should provide DSC primary/backup RP info. Provides switchover notification.

MIB: ENTITY-STATE-MIB

Node redundancy: Status only; no relationships

Description: Provides redundancy state info for each node. No relationships indicated.

MIB: CISCO-ENTITY-STATE-EXT-MIB

Description: Extension to ENTITY-STATE-MIB which defines notifications (traps) on redundancy status changes.

MIB: CISCO-ENTITY-REDUNDANCY-MIB

Node redundancy: Both status and relationships

Process redundancy: Process group redundancy relationships & node status

Description: Defines two redundancy group types: 1) node redundancy group type, 2) process group redundancy type. Node redundancy pairs would be shown in groups with the node redundancy group type; process groups are represented by the nodes on which their primary and backup processes are placed.

 

11.1.1     Node Redundancy MIBs

 

CISCO-RF-MIB is currently used to monitor the node redundancy of the DSC chassis’ active/standby RPs. The MIB definition is limited to representing redundancy relationships, status, and other info of only 2 nodes.

 

CISCO-ENTITY-REDUNDANCY-MIB is used to model the redundancy relationships of pairs of nodes. The redundant node pairs are defined as redundancy groups with a group type indicating the group is a redundant node pair. The members of the group would be the nodes within the node-redundant pair.

 

11.1.2     Process Redundancy MIBs

 

The CISCO-ENTITY-REDUNDANCY-MIB is also used to model the redundancy relationships of the node pairs pertaining to specific process groups. The redundant process groups are defined as redundancy groups with a group type indicating the group is a redundant process group. The members of the group would be the nodes where the primary and backup processes are placed for that process group.

 

11.1.3     Inventory Management

The inventory information for each chassis and the respective physical entities will be available just as in the single chassis. The difference for ASR9K nV Edge (as in CRS multi-chassis) is the presence of a top-level entity in the hierarchy which acts as a container of the chassis entities. This entity will have entPhysicalClass value of ‘stack’.

 

8.JPG

 

11.2   IRL monitoring MIBs

 

IRL interfaces are in ALL respects regular IOS-XR interfaces. All the standard interface MIBs for reporting errors / alarms / faults on a link apply to the IRL links, as do all the standard MIBs for interface statistics; a polling example is sketched below.
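As a hedged illustration of polling the IRLs through the standard interface MIBs, the following net-snmp queries from a management station walk the usual IF-MIB objects (this assumes SNMP is already configured on the router; the community string and management address are placeholders):

snmpwalk -v2c -c <community> <router-mgmt-ip> IF-MIB::ifOperStatus
snmpwalk -v2c -c <community> <router-mgmt-ip> IF-MIB::ifInErrors
snmpwalk -v2c -c <community> <router-mgmt-ip> IF-MIB::ifHCInOctets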

 

One missing MIB is for the “uni-directional” forwarding state of the IRL. For example, if excessive packet loss on an IRL puts it into the UDLD “uni-directional” state, that is a fault scenario: the IRL is removed from all forwarding tables even though the physical state of the interface remains UP. Getting this event reported via a MIB would be an enhancement. One approach would be to simply shut the link down on a uni-directional fault so that the standard IF-MIB can trap the event.

 

11.3   Control Ethernet monitoring MIBs

 

The CRS Multi Chassis system has implemented some MIBs for the Control Ethernet aspects of the system; they are currently not implemented for the nV Edge system. But since the nV Edge Control Ethernet is very similar to the CRS Multi Chassis Control Ethernet, the same MIBs could be implemented for the nV Edge system as well. That would be an enhancement work item.

 

The Control Ethernet MIB frontend is a collection of MIBs as below.

 

  • 1.  IF-MIB implementation upgraded to support Control Ethernet interfaces
  • 2.  CISCO-CONTEXT-MAPPING MIB implementation.
  • 3.  Context aware implementation of BRIDGE-MIB
  • 4.  Implementation of MAU-MIB
  • 5.  Implementation of CISCO-MAU-EXT-MIB, which will distinguish the MAUs associated with Control Ethernet interfaces from those associated with other data-plane interfaces
  • 6.  ENTITY-MIB upgraded to support Control Ethernet related entities like Control Ethernet Bridges and associated bridge-ports and all Control Ethernet interfaces


11.4   Control Ethernet Syslog / error messages

 

Below we list the most important syslog / error messages that indicate a fault with the Control Ethernet module or links.

 

  • 1.  Front panel nV Edge Control Port <port> has unsupported SFP plugged in. Port is disabled, please plug in Cisco support 1Gig SFP for port to be enabled

 

LOG_INFO message: This message pops up if the user inserts a Cisco-unsupported SFP in the front panel SFP+ port. The user has to replace the SFP with a Cisco supported one, and the port will then automatically be detected / used again.

 

  • 2.  Front Panel port <port>  error disabled because of UDLD uni directional forwarding. There will be automatic retries to try and bring up the port periodically

 

LOG_CRIT message: This message pops up if a particular Control Ethernet link has a fault and keeps “flapping” too frequently. If that happens, the port is disabled and will not be used for control link packet forwarding.

 

  • 3.  ce_switch_srv[53]: %PLATFORM-CE_SWITCH-6-UPDN : Interface 12 (SFP+_00_10GE) is up

ce_switch_srv[53]: %PLATFORM-CE_SWITCH-6-UPDN : Interface 12 (SFP+_00_10GE) is down

 

 

These messages pop up whenever the physical state of a Control Plane link (one of the front panel links) changes up/down – much like a regular interface up/down event notification. “Interface 12” and “Interface 13” (the 12 and 13) are just internal numbers for the two front panel ports. These messages will pop up any time a remote RSP goes down or boots up, because at those instants the remote-end laser goes down/up. But during normal operation of the nV Edge system, when there are no RSP reboots, these messages are not expected and indicate a problem with the link / SFP.

 

 

Here we list the syslog / error messages related to the IRL links that can appear in the logs, and explain what they mean so that the user is aware of what is happening.

 

  • 1.  Interface <interface handle> has been uni directional for 10 seconds, this might be a transient condition if a card bootup / oir etc.. is happening and will get corrected automatically without any action. If it’s a real error, then the IRL will not be available for forwarding inter-rack data and will be missing in the output of show nv edge data forwarding CLI.

 

Here the interface name being referred to can be found by running “show im database ifhandle <interface handle>”. That particular interface has encountered a uni-directional forwarding scenario and will be removed from the forwarding tables – no more data will be forwarded across that IRL. UDLD is restarted on that link after 10 seconds to see if it can become bi-directional again, and this retry keeps happening every 10 seconds until the link goes bi-directional or the user unconfigures “nv edge interface” on that link.
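For example, to map the interface handle from the message to an interface name and then confirm which IRLs are still forwarding (the handle value below is purely hypothetical; use the one printed in the log message):

show im database ifhandle 0x04000200
show nv edge data forwarding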

 

 

  • 2.  <count> Inter Rack Links configured all on one slot. Recommended to spread across at least two slots for better resiliency.

 

All the IRL links are present on the same line card (slot). This is not good for resiliency: if that line card goes down, all the IRL links go down with it. So the message periodically pops up asking the user to spread the IRLs across at least two slots.

 

  • 3.  Inter Rack Links configured on <count> slots. Recommended to spread across maximum 5 slots for better manageability and troubleshooting.

 

The total number of IRLs in the system (maximum 16) is recommended to be spread across NO MORE than 5 line cards (slots). This is purely for debuggability: debugging problems across more than 5 IRL-hosting LCs becomes a complex affair, hence the recommendation to limit the spread to a maximum of 5 slots.

 

  • 4.  Only one Inter Rack Link is configured. For Inter Rack Link resiliency, recommendation is to have at least two links spread across at least two slots.

 

We recommend having at least two IRL links for resiliency reasons.

 

12  Debugs and Traces

 

The output of the show tech command below can be redirected to a file or a TFTP server; a hedged example follows the command. Use it when in doubt as to which module traces to collect.

 

  • 1.  show tech nv edge
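A hedged usage sketch follows; the “file” redirection keyword is the standard IOS-XR show tech convention and the target path is an arbitrary example, so verify the exact options on your release before relying on them:

show tech nv edge file harddisk:/showtech-nv-edge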

 

 

13 Cluster Rack-By-Rack Upgrade

 

13.1   Overview

 

 

ISSU is not supported on a cluster; let that be very clear. However, for a software upgrade from any release to any release, or for a SMU installation, it is highly recommended to follow the steps below to avoid the standard 10 minutes or so of reload time after an upgrade. The method used here upgrades each chassis separately. The assumption is that the network is fully redundant and all links are dual homed to each of the chassis in the cluster, which translates to continuous connectivity while any one of the chassis in the cluster is down. The method is scripted, and an off-system server/PC must be used to execute the script.

Rack-By-Rack reload is a method of upgrading, or installing disruptive software (i.e. reload SMUs) on the cluster one rack at a time, in order to reduce the amount of traffic downtime compared to a full system reload.

At a high level, the upgrade steps are as follows:

  • Rack 1 Shutdown  Phase - Rack 1 is isolated from the Cluster and the external  network, and made into a standalone node.               
    • IRL links are   disabled
    • External LC   interfaces are disabled
    • Control Link   interfaces are disabled
  • Rack 1 Activate  Phase - The target software is activated on Rack 1               
    • Install Activate occurs   on Rack 1 using the parallel reload method.
  • Critical Failover  Phase - Traffic is migrated to Rack 1              
    • All interfaces on   Rack 0 are shut down.
    • All interfaces on   Rack 1 are brought into service.
    • Protocols relearn   routes from neighboring routers and convergence begins.
  • Rack 0 Activate  Phase - The target software is activated on Rack 0               
    • Install Activate   occurs on Rack 0 using the parallel reload method
  • Cleanup Phase
    • Control links are   reactivated
    • IRL Links are   reactivated
    • Rack 0 rejoins the   cluster as Backup
    • Any external links   disabled as part of the upgrade are brought back into service

Due to the complexity of the CLI steps used, it is recommended to use the scripted method below.

 

13.2   Prerequisites

  • Rack By Rack Upgrade is not  compatible with the Management LAN Split Brain detection feature. This  feature should be disabled prior to upgrade.
  • Any Install operations in  progress need to complete (install commit) prior to this upgrade.
  • All Active install packages  must be committed prior to this upgrade procedure.
  • Support for this method is added in 4.3.1. For 4.2.3 it’s part of NV edge SMU 1 CSCue14377
  • The script does only minimal checking for any errors that occur. It is recommended to use "install activate test" on the router prior to script execution to validate the set of images (a hedged example is shown after this list).
  • It is highly recommended to backup your router config  prior to upgrade.
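As a hedged example of the pre-check mentioned above, using the same package names as the script example in section 13.3.1 (the exact package set and admin-mode syntax may differ on your system):

(admin)# install activate disk0:asr9k-mini-px-4.2.3 disk0:asr9k-services-p-px-4.2.3 disk0:asr9k-px-4.2.3.CSCuc40191-0.0.2.i test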

 

 

13.3   Software Upgrade Instructions (Scripted Method) & Cisco Software Manager

The following section discusses software installation and SMU management tools designed to ease repetitive actions:


13.3.1     Script Setup

The upgrade script may be obtained by copying it from the router to a TFTP host via the "copy" command.

RP/0/RSP0/CPU0:ASR9006#dir disk0:/asr9k-base-4.3.2/0x100000/bin | i nv_edge
Mon Mar 31 16:28:07.153 UTC
788835      -rwx  3866        Tue Jan  7 01:03:06 2014  nv_edge_upgrade.exp
RP/0/RSP0/CPU0:ASR9006#copy disk0:/asr9k-base-4.3.2/0x100000/bin/nv_edge_upgrade.exp tftp:

Note: This file will always be <boot-disk>:/asr9k-base-<release>/0x100000/bin/nv_edge_upgrade.exp.

 

 

The required changes are:

    • The management telnet  access for Rack 0 (rack0_addr, rack0_port, rack0_stby_addr,  rack0_stby_port)
    • The management telnet  addresses for Rack 1 (rack1_addr, rack1_port, rack1_stby_addr,  rack1_stby_port)
    • The login credentials for  the router (router_username, router_password)
    • The set of images to  activate (image_list), space delimited.
    • The set of IRL ports  configured (irl_list), TCL list format.

 

An example of the script configuration variables is below:

set rack0_addr "172.27.152.19" 
set rack0_port "2002"
set rack0_stby_addr "172.27.152.19" 
set rack0_stby_port "2004"
set rack1_addr "172.27.152.19" 
set rack1_port "2005"
set rack1_stby_addr "172.27.152.19" 
set rack1_stby_port "2007"
 
set router_username "root" 
set router_password "root"
 
set image_list "disk0:asr9k-mini-px-4.2.3 \
disk0:asr9k-services-p-px-4.2.3 \
disk0:asr9k-px-4.2.3.CSCuc40191-0.0.2.i"
 
set irl_list {{Teng 0/1/1/2} {Teng 1/1/0/2}}

In this example, the console ports of all four RSPs of the cluster are connected to 172.27.152.19 on the ports specified. The router login is root/root, three software packages are to be activated, and two IRL links are specified in irl_list.

 

13.3.2     Script execution

To begin the install activation via the script, exit all consoles completely (exit to the login prompt) and disconnect all serial and telnet connections to the management console of the router. Then execute the script from an external Linux workstation as below:

 

sjc-lds-904:> nv_edge_upgrade.exp
########################
This CLI Script performs a software upgrade on
an ASR9k Nv Edge system, using a rack-by-rack
parallel reload method. This script will modify
the configuration of the router, and will incur
traffic loss.
 
Do you wish to continue [y/n]   y
spawn telnet 172.27.152.19 2002
Trying 172.27.152.19...
Connected to 172.27.152.19.
Escape character is '^]'.
 
 
RP/0/RSP0/CPU0:ios#

In the example here, the script is executed by typing "nv_edge_upgrade.exp". Please ensure that the script has been given executable file permissions. When prompted whether you wish to continue the software activation, enter "y" to continue.

At various points during the upgrade process the script will enter into a waiting period and display a message as below:

--- WAITING FOR INSTALL ACTIVATE RACK 0 60 SECONDS (~~ to abort / + to add time) ---

CLI commands may be entered at this time to check the router status during the upgrade process. This is intended to allow sufficient time for the various steps of the upgrade to complete, and for the router to achieve a stable state before continuing. It is important that no configuration changes are made while the prompt is available.

The script will run to completion in approximately 45 minutes.

 

13.3.3     Verification

Once the script runs to completion, connect to the router and verify that the platform is in working order and that routing and traffic have resumed. Loss of topology and some loss of traffic are expected during the upgrade process. Expected traffic loss is between 30 seconds and 4 minutes on "normal scale" systems, and can be as long as 10 minutes in high scale scenarios. A few example checks are sketched below.
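As a hedged set of post-upgrade checks (these commands appear elsewhere in this guide; exact output and any additional keywords such as location options will vary by release):

admin show dsc
show nv edge data forwarding
show install active summary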

Install commit is included in the script execution. To revert to the prior release after the script completes, a separate install operation is needed; a reload of the system will not cause an install revert.

 

13.3.4     Cisco Software Manager (CSM)

The Cisco Software Manager (CSM) provides SMU recommendations to users and reduces the effort of manually searching for, identifying, and analyzing the SMUs needed for a device, as well as their dependencies. To provide the recommendations, CSM must be connected through the Internet to the cisco.com domain. CSM can connect to multiple devices and provides SMU management for multiple Cisco IOS XR platforms and releases.

 

The Cisco Software Manager (CSM) is a standalone Java application that can be installed on Windows, Mac, and UNIX. CSM supports Cisco CRS and Cisco ASR 9000 devices.

 

CSM Links (Some require Cisco login)

 

 

 

13.4   Upgrade Instructions (Manual Method)

The upgrade process can be executed by entering the CLI commands directly on the console instead of using the provided script. This is not recommended, as the upgrade process is sensitive to the ordering and timing of the various steps. If a CLI command is omitted, or the commands are entered in the incorrect order, it may have a catastrophic effect.

Within the script a variable named "debug_mode" is defined. Set this to "1" and then execute the script from the Linux prompt. This causes the script to output the CLI commands to the terminal window, which can then be used as a basis for the manual upgrade; a minimal sketch is shown below.
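For example, near the top of the script (a minimal sketch; debug_mode sits alongside the variables listed in section 13.3.1):

set debug_mode "1"

Running the script with this setting prints the CLI command sequence it uses, which can then serve as the template for a manual upgrade.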

 

13.5   Install Abort Procedure

Aborting the software installation is allowed at, or any time prior to, the following output message:

 

--- WAITING FOR INSTALL COMMIT 10 SECONDS (~~ to abort / + to add time) ---

The Abort procedure is as follows:

    • Use "ctrl-c" to  terminate the script operation. You may be required to enter  "~~" to terminate a wait period, and then "ctrl-c" to  terminate, depending on the state of the script.
    • Log into the router console  connection on rack 1.               
      • Enter "admin   reload rack 1", and confirm.
      • Halt the RSP bootup   for rack 1 (both active and standby).
      • unset the variables   "CLUSTER_0_DISABLE" and "CLUSTER_1_DISABLE" from both   RSP cards.
    • Log into the router console  connection on rack 0.               
      • Configure the nv-edge   control links to be enabled.
      • Configure the IRL   links to be no-shut.
      • Remove any "nv edge data minimum" link configuration (a hedged CLI sketch of these rack-0 steps is given after this procedure).
    • Boot-up Rack 1.

Rack 1 will automatically sync to the prior software load running on Rack 0.
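A hedged CLI sketch of the rack-0 recovery steps above (the enable form of the control-link command is assumed from the disable form shown in section 13.6, the IRL interface name is only an example, and the config mode of "nv edge data minimum" is an assumption; verify against your own configuration):

(admin-config)# no nv edge control control-link disable
(admin-config)# commit
(config)# interface TenGigE0/1/1/2
(config-if)# no shutdown
(config-if)# commit
(config)# no nv edge data minimum
(config)# commit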

 

13.6   Converting an nV Edge Cluster to single chassis systems

It’s possible to change an nV Edge system back to two separate single chassis systems. The steps are fairly simple, though console access to all RSPs is required.

 

(Note: console access is required to both racks)

  1. Prepare system to stop at the ROMMON prompt after a reload performed in later steps           
    • This can be done by sending a break signal to the RP after it reloads
    • But the preferable method is to configure it in admin mode as follows:
      • (admin)#config-register 0x0
  2. Remove all "nv edge" commands in admin-config mode
  3. Remove cables (in the following order)       
    • Shut down all IRL links
    • (Optional) Remove the working Exec-level configuration using:     
      • (config)# commit replace
      • Reason: A saved configuration will have interface naming/numbering based on the cluster (indicating whether interfaces are on Rack 0 or Rack 1). To save yourself the step of clearing up configuration inconsistencies later, a commit replace will do the trick.

    • Configure the following in admin-config mode       
      • (admin-config)#nv edge control control-link disable
  4. Reload all RPs in the system    
    • (admin)#reload location all
  5. If you configured the registers to 0x0 in Step 1 you should be at the ROMMON prompt        
    • If not, you should send the break signal to each console in order to reach the ROMMON prompt
  6. From the ROMMON prompt type the following on each chassis:     
    • unset CLUSTER_RACK_ID
    • sync
  7. Unplug all EOBC and IRL cables running between the two chassis
  8. Reset the configuration registers to 0x102
  9. Reload the systems from ROMMON by typing:     
    • reset
  10. The systems will now come up as individual systems

 

At this point both chassis are separated. Care needs to be taken if the optional sub-step in step 3 above was skipped, since both chassis will then have the same configuration, hence the same router ID, which could lead to protocol instability and duplicate system IDs.

 

ADDENDUM

This section contains updates to the deployment guide in areas that require further clarification.

 

A1. ASR9001 (Iron Man) Specific Differences in Procedures

Added by Dan Segovia - 25-Feb-2014

 

The majority of the information and instructions in this guide apply to all flavors of nV Edge cluster systems. This section highlights only the differences for the ASR9001 (Iron Man) system; other than the information in this section, the remainder of the deployment guide still applies.

A1.1 Converting Single ASR9001 Units to nV Edge

A1.1.1 ASR9001 EOBC (Control Plane) Wiring

The standard ASR9000 design allows for up to four processors, which are used in classic nV Edge (cluster) solutions. The Iron Man chassis, however, has a single fixed processor design and thus differs in its connection requirements.

 

      • Ports to Use - The EOBC ports for nV Clustering are labeled "Cluster 0" and "Cluster 1"
      • Port Types - These ports are 1Gig SFP ports (not SFP+)
      • How to Connect - These are directly connected as follows:                    
        • Cluster 0 port of Rack 0 is directly wired to Cluster 0 port of Rack 1
        • Cluster 1 port of Rack 0 is directly wired to Cluster 1 port of Rack 1

 


ASR9001-FrontPanel.png

 

A1.1.2 Supported Hardware and Caveats

      • EOBC Ports - Are SFP only (1 Gbit/s). 10 Gbit/s ports are not supported in the EOBC role as of the writing of this document
      • Supported Modular Port Adapters (MPA):                      
        • A9K-MPA-20X1GE
        • A9K-MPA-2X10GE or A9K-MPA-4X10GE
      • Cluster Chassis Restrictions - Only the same types of chassis can be clustered together. Therefore, an Iron Man chassis will not work properly if paired with an ASR9000 chassis

 

The Inter Rack Links (or IRL) are the data plane extension between nV Edge cluster systems.

      • Ports Used - Any available 10 Gbit/s links on the chassis (minimum of 2)                      
        • This  includes any combination of 10 Gbit/s ports either on modular port  adapters or any of the four SFP+ ports built into the front panel of the  chassis
      • Other than these Iron Man clarifications, the original ASR9K nV Edge Deployment Guide covers the remaining instructions/tasks. Please refer to the original document for the remaining IRL instructions.


A1.2 ASR9001 nV Edge Redundancy Model

Non-ASR9001 clusters will typically have a total of four processors (2 in each chassis). The Iron Man, however, has a single built-in processor per chassis.

 

A1.2.1 What do these differences mean for Iron Man?

      • ASR9001 will only have Active RPs (and will not have Standby RP)               
        • One in Rack 0 and the other in Rack 1
      • The lack of standby redundancy can lead to slightly longer traffic outages.

A1.2.2 Processor Roles Explained (for all chassis types)

      • Active - The Active RP in a single chassis
      • Standby - The Standby RP in a single chassis (not applicable to Iron Man)
      • Primary DSC - When clustered together, this is the Active RP in the chassis that is in control of the entire cluster system
      • Backup DSC - When clustered together, this is the Active RP in the chassis that is NOT in charge of the cluster (but waiting to take over if anything goes wrong)

 

ASR9001 (Iron Man) Output

      • RP/0/RSP0/CPU0:im_cluster#admin show dsc
        ---------------------------------------------------------
                   Node  (     Seq)     Role       Serial State
        ---------------------------------------------------------
            0/RSP0/CPU0  (       0)   ACTIVE  FOC1710N0YE PRIMARY-DSC
            1/RSP0/CPU0  ( 1166830)   ACTIVE  FOC1710N0YA BACKUP-DSC
        RP/0/RSP0/CPU0:im_cluster#

 

Non-ASR9001 Output

      • RP/0/RSP0/CPU0:Valkyrie3#admin show dsc
        ---------------------------------------------------------
                   Node  (     Seq)     Role       Serial State
        ---------------------------------------------------------
            0/RSP0/CPU0  (       0)   ACTIVE  FOX1228GOVJ PRIMARY-DSC
            0/RSP1/CPU0  (    9860)  STANDBY  FOX1228GOVJ NON-DSC
            1/RSP0/CPU0  (    9106)   ACTIVE  FOX1438GTQR BACKUP-DSC
            1/RSP1/CPU0  (    9065)  STANDBY  FOX1438GTQR NON-DSC
        RP/0/RSP0/CPU0:Valkyrie3#

 

 

A1.3 ASR9001 Software Installations

Iron Man software installations do NOT vary from any other ASR9K system. Please visit the general information page here for installation instructions (Note: may require cisco.com login):

 

 

 

(End of Addendum)

 

Contributed by:

Sam Milstead - XR TAC

Eddie Chami - ASR9K Escalation

Babu Peddu - ASR9K Escalation

Comments
harindhafdo
Level 1

Hi,

what is the reason that ONLY GLC-SX-MMD supported in EOBC ports ? why it cannot support GLC-LH-SMD in EOBC ? this would limit the deployment of clustering only within 550m isn't it ?

next question is what is the limitation if I want to send these control links over a DWDM link with AR-XP as the transponder ?

Hope there is no restriction for DATA Links between two chassis.

Rgds

Harin

harindhafdo
Level 1

Hi,

As there was no feedback in this thread I went ahead and tested the nV edge with SFP-GE-L in EOBC ports and it works. Only problem was that this document was missing the configurations to be done in rack1 RSPs which was found in configuration guide.

unset CLUSTER_RACK_ID
unset CLUSTER_NO_BOOT
unset BOOT
sync

without those commands in rack1 RSPs clustering did not happen.

dfranjoso
Level 1

Hello Guys,

Did anyone tested the GLC-LH-SMD for the EOBC? Will it work?

David

harindhafdo
Level 1

I've tested with SFP-GE-L and it works. so GLC-LH-SMD should work as it is the replacement PID for SFP-GE-L.

Rgds

Harin

Hi Lenin,

In the upgrade rack-by-rack method Chapter 13 you mention:

Support for this method is added in 4.3.1. For 4.2.3 it’s part of NV edge SMU 1 CSCue14377

The rack I want to upgrade is running 4.3.0 and the target version should be 4.3.4

Any thoughts on this? What are the steps that are different?

Thanks, kind regards,

Edi

Eddie Chami
Cisco Employee

Edi, even though you don't have the upgrade script in 4.3.0, we can manually use the same steps to do the upgrade, it will give you the least downtime. Does that work for you?

Any plans to support extended reach SFPs on EOBC in the future?

Eddie Chami
Cisco Employee

Florian, which SFP part number exactly so i can respond with an accurate answer, we do support GLC-LH-SMD in 4.3.0.

Hi,

Currently there is no official ER or ZR support, therefore ANY module in this range would be helpful, like GLC-EX-SM[D] or

GLC-ZX-SM[D].

Eddie Chami
Cisco Employee

Yes we plan to support those. Just hasn't happened yet.. Do you have an immediate need?

Hi,

I was lucky as I had a customer project where the locations where less than 10km away, but clusters are asked for more and more, so the chance that the location distance is more than 10km is very likely in the future.

Eddie Chami
Cisco Employee

We will address it..

Carlos A. Silva
Level 3

Hi, Lenin:

I'm trying to find the upgrade script within disk0:  in an ASR9000 with 4.3.2, but no luck. Can you point me in the right direction?

Thanks,

c.

Eddie Chami
Cisco Employee

Hi Carlos,

See the following:

13.3.1     Script Setup

The upgrade  script may be obtained by copying it from the router to a tftphost via  the "copy" command. The file is located on the router at: "#run  /pkg/bin/nv_edge_upgrade.exp".

Carlos A. Silva
Level 3

Yup, I saw that, that's why I'm asking. Not so sure about the "#run" part or where within the box it resides(the script). re: how to get to /pkg/bin.
