
Nexus 5548 PFC issue

Hello, I'm having a strange problem with PFC. I have a server with a QLogic 8142 dual-port CNA; one port is connected to NexusA and the other to NexusB. DCBX negotiation succeeded and no-drop is set for CoS 3. The TLVs on both Nexus switches and both CNA ports match, so the two sides appear to be in agreement.

However, on the CNA, PFC appears to be applied to every CoS equally, and it really wrecks throughput. At first we thought it was a NIC issue, because the retransmission rate gets up to 300% when I/O-heavy jobs like backups are running. The odd thing was that no errors or collisions were ever reported. Finally I looked at the receive counters on the CNA, and this is what I see:

Receive Broadcast Packets          141675248

Receive CBFC Pause Frames 0          73150267

Receive CBFC Pause Frames 1          73150267

Receive CBFC Pause Frames 2          73150267

Receive CBFC Pause Frames 3          73150267

Receive CBFC Pause Frames 4          73150267

Receive CBFC Pause Frames 5          73150267

Receive CBFC Pause Frames 6          73150267

Receive CBFC Pause Frames 7          73150267

Receive Control Packets          73150267

Receive FCoE Packets          1128698130

Receive Jabber Packets          0

Receive Mgmt Packets          585890

Receive Multicast Packets          1854877772

Receive Octets          4661675314982

Receive Octets Ok          4656993697750

Receive Oversize Packets          0

Receive Packets          20478644522

Receive Packets 1024to1518Octets          854370673

Receive Packets 128to255Octets          10293526277

Receive Packets 1519toMaxOctets          128131115

Receive Packets 256to511Octets          170181538

Receive Packets 512to1023Octets          154389360

Receive Packets 64Octets          78297449

Receive Packets 65to127Octets          8799748110

Receive Packets Discarded Priority 0          1923

Receive Packets Discarded Priority 1          0

Receive Packets Discarded Priority 2          0

Receive Packets Discarded Priority 3          0

Receive Packets Discarded Priority 4          0

Receive Packets Discarded Priority 5          0

Receive Packets Discarded Priority 6          0

Receive Packets Discarded Priority 7          0

Receive Packets Ok          20405494253

Receive Packets Priority 0          19225451771

Receive Packets Priority 1          0

Receive Packets Priority 2          0

Receive Packets Priority 3          1128698132

Receive Packets Priority 4          584

Receive Packets Priority 5          12099267

Receive Packets Priority 6          12033228

Receive Packets Priority 7          8810261

Receive Pause Packets          0

Receive Undersize FCS error Packets          0

Receive Undersize Packets          0

Receive Unicast Packets          18408941233

Transmit Broadcast Packets          651143

As you can see, the CBFC pause frame counters are identical for all eight priorities, which effectively means my link is pausing as if standard flow control were on. I can't find anything on the switch that explains it, so I'm hoping a second set of eyes can point me in the right direction.
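One way to tell whether the switch is really pausing all eight priorities, or whether the CNA simply bumps every per-priority counter for each PFC frame it receives, would be to capture a few of the pause frames and look at the class-enable vector. The sketch below is only an illustration: it assumes an untagged MAC Control frame without the FCS, laid out as in 802.1Qbb (EtherType 0x8808, opcode 0x0101, a 16-bit class-enable vector, then eight 16-bit timers). The function name and the sample frame are made up for the example.

```python
import struct

MAC_CONTROL_ETHERTYPE = 0x8808  # MAC Control frames
OPCODE_PAUSE = 0x0001           # classic 802.3x link-level PAUSE
OPCODE_PFC = 0x0101             # priority-based flow control (802.1Qbb)


def decode_mac_control(frame: bytes) -> dict:
    """Decode an untagged MAC Control frame (hypothetical helper)."""
    if len(frame) < 18:
        raise ValueError("frame too short for a MAC Control header")
    ethertype, opcode = struct.unpack("!HH", frame[12:16])
    if ethertype != MAC_CONTROL_ETHERTYPE:
        raise ValueError("not a MAC Control frame")
    if opcode == OPCODE_PAUSE:
        (quanta,) = struct.unpack("!H", frame[16:18])
        return {"kind": "pause", "quanta": quanta}
    if opcode == OPCODE_PFC:
        if len(frame) < 34:
            raise ValueError("frame too short for a PFC payload")
        (enable_vector,) = struct.unpack("!H", frame[16:18])
        timers = struct.unpack("!8H", frame[18:34])
        # Bit n of the class-enable vector means the timer for priority n is valid.
        paused = [p for p in range(8) if enable_vector & (1 << p)]
        return {
            "kind": "pfc",
            "priorities": paused,
            "timers": {p: timers[p] for p in paused},
        }
    raise ValueError("unknown MAC Control opcode 0x%04x" % opcode)


# Illustrative frame: PFC pausing priority 3 only, timer at maximum quanta.
sample = bytes.fromhex(
    "0180c2000001"      # destination: MAC Control multicast address
    "00059b720ce4"      # source: placeholder address
    "8808" "0101"       # EtherType, PFC opcode
    "0008"              # class-enable vector: bit 3 set
    "000000000000ffff"  # timers for priorities 0..3
    "0000000000000000"  # timers for priorities 4..7
)
print(decode_mac_control(sample))
```

If captured frames show only bit 3 set, the identical per-priority counters are more likely a reporting quirk on the CNA; if all eight bits are set, the switch really is asking for every class to be paused.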

The policies on both switches are:

class-map type qos class-fcoe

class-map type qos match-any class-default

class-map type queuing class-fcoe

  match qos-group 1

class-map type queuing class-default

  match qos-group 0

policy-map type qos default-in-policy

  class class-fcoe

    set qos-group 1

  class class-default

    set qos-group 0

policy-map type queuing default-in-policy

  class type queuing class-fcoe

  class type queuing class-default

policy-map type queuing default-out-policy

  class type queuing class-fcoe

  class type queuing class-default

class-map type network-qos class-fcoe

  match qos-group 1

class-map type network-qos class-default

  match qos-group 0

policy-map type network-qos jumbo

  class type network-qos class-fcoe

    pause no-drop

    mtu 2158

  class type network-qos class-default

    mtu 9216

system qos

  service-policy type queuing input default-in-policy

  service-policy type queuing output default-out-policy

  service-policy type qos input default-in-policy

  service-policy type network-qos jumbo

  fex queue-limit

Interface configuration (the same on both switches). I suspect that using the same FCoE VLAN (1070) on both switches, together with active LACP, may be part of the problem:

interface Ethernet1/29

  description ORACLE

  switchport mode trunk

  switchport trunk allowed vlan 111,1070

  channel-group 149 mode active

Port Channel info:

SwitchA:

port-channel149 is up

vPC Status: Up, vPC number: 149

  Hardware: Port-Channel, address: 0005.9b72.0ce4 (bia 0005.9b72.0ce4)

  Description: ORACLE

  MTU 1500 bytes, BW 10000000 Kbit, DLY 10 usec,

     reliability 255/255, txload 1/255, rxload 1/255

  Encapsulation ARPA

  Port mode is trunk

  full-duplex, 10 Gb/s

  Beacon is turned off

  Input flow-control is off, output flow-control is off

  Switchport monitor is off

  EtherType is 0x8100

  Members in this channel: Eth1/29

  Last clearing of "show interface" counters never

  30 seconds input rate 3343488 bits/sec, 417936 bytes/sec, 254 packets/sec

  30 seconds output rate 4113880 bits/sec, 514235 bytes/sec, 1993 packets/sec

  Load-Interval #2: 5 minute (300 seconds)

    input rate 3.82 Mbps, 227 pps; output rate 4.43 Mbps, 2.04 Kpps

  RX

    3458452690 unicast packets  1164669 multicast packets  650875 broadcast packets

    3460268234 input packets  5922693717615 bytes

    2696809069 jumbo packets  0 storm suppression packets

    0 runts  0 giants  0 CRC  0 no buffer

    0 input error  0 short frame  0 overrun   0 underrun  0 ignored

    0 watchdog  0 bad etype drop  0 bad proto drop  0 if down drop

    0 input with dribble  0 input discard

    0 Rx pause

  TX

    18407152373 unicast packets  1924699542 multicast packets  141651327 broadcast packets

    20473503242 output packets  4660392023519 bytes

    370065327 jumbo packets

    0 output errors  0 collision  0 deferred  0 late collision

    0 lost carrier  0 no carrier  0 babble

    73139330 Tx pause

  6 interface resets

SwitchB

port-channel149 is up

vPC Status: Up, vPC number: 149

  Hardware: Port-Channel, address: 547f.eedd.9d64 (bia 547f.eedd.9d64)

  Description: ORACLE

  MTU 1500 bytes, BW 10000000 Kbit, DLY 10 usec

  reliability 255/255, txload 1/255, rxload 1/255

  Encapsulation ARPA

  Port mode is trunk

  full-duplex, 10 Gb/s

  Input flow-control is off, output flow-control is off

  Switchport monitor is off

  EtherType is 0x8100

  Members in this channel: Eth1/29

  30 seconds input rate 3702752 bits/sec, 282 packets/sec

  30 seconds output rate 638704 bits/sec, 162 packets/sec

  Load-Interval #2: 5 minute (300 seconds)

    input rate 3.82 Mbps, 241 pps; output rate 163.26 Kbps, 88 pps

  RX

    3501912802 unicast packets  1169034 multicast packets  196916 broadcast packets

    3503278752 input packets  5936848468592 bytes

    2698117037 jumbo packets  0 storm suppression packets

    0 runts  0 giants  0 CRC  0 no buffer

    0 input error  0 short frame  0 overrun   0 underrun  0 ignored

    0 watchdog  0 bad etype drop  0 bad proto drop  0 if down drop

    0 input with dribble  0 input discard

    0 Rx pause

  TX

    1233207065 unicast packets  100406251 multicast packets  16368627 broadcast packets

    1349981943 output packets  435290532701 bytes

    165671680 jumbo packets

    0 output errors  0 collision  0 deferred  0 late collision

    0 lost carrier  0 no carrier  0 babble 0 output discard

    0 Tx pause

  7 interface resets

I think the odd thing here is that only SwitchA seems to transmit any pause frames. On SwitchA, the pause counts for the port-channel, for the interface under priority-flow-control, and for the interface under flowcontrol all increment at the same pace. I was told that flowcontrol incrementing at the same pace as PFC was just a bug, but now I'm not so sure, because it doesn't happen at all on SwitchB. The pause count on SwitchA increments at the same pace as the CBFC frames on the CNA. SwitchB also has PFC pause counters, but they don't match the CNA at all.

SwitchA PFC (columns: Port, Mode, Oper (VL bmap), RxPPP, TxPPP):

Ethernet1/29       Auto On  (8)       0          73139266

SwitchB PFC (I can't account for the different numbers: this output was gathered before the CNA output, and the CNA matches only SwitchA):

Ethernet1/29       Auto On  (8)       0          79050794
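The value in parentheses after `On` in the PFC lines above is, as far as I understand it, the operational VL bitmap, with one bit per priority. A quick sketch for turning that field into a list of priorities follows. It assumes the value is printed in hexadecimal (for a single digit like 8 this makes no difference), and the function name is made up for the example.

```python
import re


def pfc_priorities(oper_field: str) -> list:
    """Return the priorities enabled in a field like 'On  (8)' (hypothetical helper)."""
    match = re.search(r"\(([0-9a-fA-F]+)\)", oper_field)
    if match is None:
        return []
    bitmap = int(match.group(1), 16)  # assumption: bitmap is shown in hex
    return [p for p in range(8) if bitmap & (1 << p)]


print(pfc_priorities("On  (8)"))  # [3]
```

A bitmap of 8 has only bit 3 set, so both switches report PFC as operational for CoS 3 alone, which agrees with the no-drop class in the network-qos policy and not with pause frames landing on every priority at the CNA.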

SwitchA Flowcontrol (increments at the same pace as PFC; columns: Port, Send admin/oper, Receive admin/oper, RxPause, TxPause):

Eth1/29      off      off      off      off         0                 73139296

SwitchB Flowcontrol (Stays at 0)

Eth1/29      off      off      off      off         0                 0 
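To put numbers on "the same pace", here is a small sketch that compares each switch-side counter against the CNA's per-priority CBFC count. The values are copied from the outputs in this post. The labels, and the reading of the last column as the transmit counter, are assumptions for the sake of the example, and the snapshots were taken at slightly different times, so small gaps are expected.

```python
# Counter values copied from the outputs in this post; labels are illustrative.
CNA_CBFC_PER_PRIORITY = 73150267

switch_counters = {
    "SwitchA port-channel149 Tx pause": 73139330,
    "SwitchA Eth1/29 PFC Tx": 73139266,
    "SwitchA Eth1/29 flowcontrol Tx": 73139296,
    "SwitchB port-channel149 Tx pause": 0,
    "SwitchB Eth1/29 PFC Tx": 79050794,
    "SwitchB Eth1/29 flowcontrol Tx": 0,
}


def relative_gap(a: int, b: int) -> float:
    """Absolute difference as a fraction of the larger value."""
    largest = max(a, b)
    return abs(a - b) / largest if largest else 0.0


for name, value in switch_counters.items():
    gap = relative_gap(value, CNA_CBFC_PER_PRIORITY)
    verdict = "tracks CNA" if gap < 0.001 else "does not track CNA"
    print(f"{name:36s} {value:>10d}  gap {gap:8.4%}  {verdict}")
```

All three SwitchA counters sit within about 11,000 frames of the CNA figure (roughly 0.015%), while SwitchB's PFC counter is off by several percent and its other two counters are zero. That fits the observation that the CNA's numbers follow SwitchA only.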

Thanks for taking a look.
