I have two 4500X switches configured in VSS mode, and I use them as my main gateway.
I am trying to ping 192.168.125.247 (VLAN 125) from a machine in VLAN 120 (192.168.120.164 / 255.255.254.0 / gateway 192.168.120.254), and the ping fails.
The 4500X cluster is the gateway for these two VLANs:
ip address 192.168.120.252 255.255.254.0
standby version 2
standby 1 ip 192.168.120.254
standby 1 priority 110
standby 1 preempt
ip address 192.168.125.252 255.255.255.0
standby version 2
standby 1 ip 192.168.125.254
standby 1 priority 110
standby 1 preempt
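As a quick sanity check on the addressing (a minimal sketch using Python's standard ipaddress module, not something taken from the switch): the source host's /23 does not contain 192.168.125.247, so the ping has to be routed by the gateway rather than switched inside one VLAN. The function name needs_routing is only illustrative.

```python
import ipaddress

# Addresses taken from the post above.
SRC_HOST = ipaddress.ip_interface("192.168.120.164/255.255.254.0")
SRC_GATEWAY = ipaddress.ip_address("192.168.120.254")  # HSRP virtual IP, VLAN 120
DST_HOST = ipaddress.ip_address("192.168.125.247")     # target host, VLAN 125


def needs_routing(src, dst):
    """Return True when dst lies outside the source host's own subnet."""
    return dst not in src.network


if __name__ == "__main__":
    print("source subnet:", SRC_HOST.network)
    print("gateway is on-link:", SRC_GATEWAY in SRC_HOST.network)
    print("ping must be routed:", needs_routing(SRC_HOST, DST_HOST))
```

So the echo request goes host -> gateway (VLAN 120 SVI) -> VLAN 125, and the last hop depends on what the switch knows about the destination MAC.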
I can see an ARP entry on the 4500X:
gw01#sh arp 192.168.125.247
Protocol Address Age (min) Hardware Addr Type Interface
Internet 192.168.125.247 32 00e0.8615.8775 ARPA Vlan125
The MAC address is correct, but I cannot see it in the MAC address table:
gw01#sh mac address-table address 00e0.8615.8775
No entries present.
If I run this command on my 4500X, it works:
gw01#ping 192.168.125.247 source 192.168.120.252
Type escape sequence to abort.
Sending 5, 100-byte ICMP Echos to 192.168.125.247, timeout is 2 seconds:
Packet sent with a source address of 192.168.120.252
Success rate is 100 percent (5/5), round-trip min/avg/max = 1/11/36 ms
Then I can see the MAC in the address table:
gw01a#sh mac address-table address 00e0.8615.8775
vlan mac address type protocols port
125 00e0.8615.8775 dynamic ip,ipx,assigned,other Port-channel13
Now I can also ping from my machine in VLAN 120, but after 5 minutes this entry disappears and I can no longer ping the host from that machine.
If I clear the ARP entry, it also works for 5 minutes...
Any ideas?
This is due to a difference between the CAM and ARP aging timers. The default ARP timeout is 4 hours, whereas the CAM aging time is 5 minutes, as you point out.
Try using the 'mac address-table aging-time' command:
...either globally or per VLAN.
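To make the timer mismatch concrete, here is a rough sketch in Python. It assumes the defaults quoted above (ARP 4 hours, CAM 5 minutes) and, as a simplification of my own, that both entries were last refreshed by the same frame and the host has been silent since. The function names are illustrative, not anything from IOS.

```python
# Defaults as quoted in this thread; both values are in seconds.
ARP_TIMEOUT_S = 4 * 60 * 60  # 4 hours
CAM_AGING_S = 5 * 60         # 5 minutes


def entry_state(silent_for_s, arp_timeout=ARP_TIMEOUT_S, cam_aging=CAM_AGING_S):
    """Classify what the switch still knows about a host that has been silent."""
    has_arp = silent_for_s < arp_timeout
    has_cam = silent_for_s < cam_aging
    if has_arp and has_cam:
        return "arp+cam"   # IP-to-MAC and MAC-to-port both known
    if has_arp:
        return "arp-only"  # MAC is resolved, but the egress port is unknown
    return "none"          # the next routed packet triggers a fresh ARP request


def arp_only_window(arp_timeout=ARP_TIMEOUT_S, cam_aging=CAM_AGING_S):
    """Seconds during which ARP is still valid but the CAM entry has aged out."""
    return max(0, arp_timeout - cam_aging)


if __name__ == "__main__":
    for t in (60, 600, 20000):
        print(t, entry_state(t))
    print("arp-only window:", arp_only_window(), "s")
```

With both timers set to 14400 the window drops to zero, which is the effect of raising the CAM aging time to match the ARP timeout.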
Hi, we had a very similar problem a while back, but with vPC and Nexus 7000s. On one VLAN, and one VLAN only, certain hosts could not communicate outside the subnet. I also noticed that clearing their entry from the ARP table fixed the problem for about 5 minutes. One 7000 could not ping them at all. TAC had us take packet captures to see where the packets were being dropped, and it turned out one 7000 was not forwarding the packets over its peer link. At the end of the day, the problem could only be resolved by an IOS upgrade. They never identified the exact cause, but presumably it was a bug. It started out of nowhere, with no config changes or anything.
So try tracing where the packets are getting lost. That might help you track down the problem.
My VSS cluster is connected to two Nexus 5548P switches (vPC) via Port-channel13.
I have now changed the MAC address-table aging time to 14400 for VLAN 125 and everything is OK, but to me this is not normal.
I'll check later when I have more time...
Thank you guys
I think we might have exactly the same problem.
This is my theory:
- The switches are connected via EtherChannels.
- If you could trace which member interface of the EtherChannel is used, I think you would find that the source client comes in on the first 4500X and the destination client comes in on the second 4500X.
- It is most likely a destination with little network traffic. If it talked to many other devices (DHCP, AD, DNS, etc.), it would probably also send a packet over the other interface of the EtherChannel, and that would solve the problem.
A ping from the 4500X also solves the problem temporarily.
Unfortunately I cannot check my theory, because on the VSS 4500 I cannot determine which interface is used within an EtherChannel.
This is important because it may only go wrong when the paths from the 4500X to the source and to the destination are also on different interfaces.
After some tests I had to adjust my theory:
If a device only sends packets via the EtherChannel interface connected to the passive (standby) VSS switch, its MAC address entry is lost from the MAC address table.
If a ping from a source goes via the active 4500X, it works. If a ping from a source goes via the passive 4500X, it does not get through.
To be continued...
That's interesting, Rudi!
I also see this open caveat for Cisco IOS XE Release:
Packets that are routed on the same Layer 3 interface (or SVI) that entered on are dropped if received on the VSS standby switch.
Workaround: None. CSCub63571
What is the default gateway on the PC (the 4500 switch or a firewall)?
You are able to ping VLAN 125 with a VLAN 120 source address from the 4500 switch,
so try this on your PC in a command prompt with administrator privileges and check:
route add 192.168.125.0 mask 255.255.255.0 192.168.120.252 -p
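A note on the destination/mask pair: as far as I know, Windows rejects a route whose destination has host bits set for the given mask, i.e. (destination AND mask) must equal the destination. A small Python sketch of that check, with an illustrative function name of my own:

```python
import ipaddress


def is_valid_route(destination, mask):
    """Return True when the destination has no host bits set for this mask."""
    dest = int(ipaddress.IPv4Address(destination))
    netmask = int(ipaddress.IPv4Address(mask))
    return dest & netmask == dest


if __name__ == "__main__":
    # Network address with a /24 mask.
    print(is_valid_route("192.168.125.0", "255.255.255.0"))
    # Host address with a /24 mask.
    print(is_valid_route("192.168.125.252", "255.255.255.0"))
    # Host route: the same host address with a /32 mask.
    print(is_valid_route("192.168.125.252", "255.255.255.255"))
```

So for the whole VLAN 125 subnet the destination should be the network address 192.168.125.0, while a single host would need the mask 255.255.255.255.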
I have done some additional tests:
- Packets coming in on the passive switch that can be sent directly out of an interface on the passive switch don't update the MAC address table.
- If a packet arriving on the passive switch is destined for the lost MAC address, it is dropped (this prevents flooding).
- If a packet arriving on the active switch is destined for the lost MAC address, it is sent to all interfaces. (This can be seen with a Wireshark PC on the destination VLAN: one ICMP packet is seen, going from the source to the destination IP address.)
The question is whether really no incoming packets on the passive switch update the MAC address table, or whether it depends on more variables in the path from source to destination.
Two points which might be relevant:
- The passive 4500X was completely broken, it was replaced by a new one.
- We have an additional module:
2 8 10GE SFP+ C4KX-NM-8
I don't know whether only these interfaces have problems.
sw4500#sho platform hardware floodset vlan 4
Executing the command on VSS member switch role = VSS Active, id = 1
Po16(848) Po21(853) Po23(855) Po24(856) Po31(863) Po51(883) Po14(846)
Executing the command on VSS member switch role = VSS Standby, id = 2
PROBLEM NO INTERFACES
A solution might be to add a port to the specific VLAN, which triggers a new port being added to the port list above.
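To show why an empty floodset on the standby would match the drops I saw, here is a toy model in Python. It is only my interpretation, not how the platform is actually implemented: I assume a frame for a known MAC goes out of the learned port, and a frame for an unknown MAC is copied to every port in that member's floodset. The port lists are copied from the output above; the MAC address and the function name are made up for the example.

```python
# Floodset for VLAN 4 per VSS member, copied from the command output above.
FLOODSET_VLAN4 = {
    "active": ["Po16", "Po21", "Po23", "Po24", "Po31", "Po51", "Po14"],
    "standby": [],
}


def egress_ports(member, dst_mac, mac_table, floodset=FLOODSET_VLAN4):
    """Return the ports a frame would leave on in this simplified model."""
    port = mac_table.get(dst_mac)
    if port is not None:
        return [port]               # known unicast: single egress port
    return list(floodset[member])   # unknown unicast: flood to the floodset


if __name__ == "__main__":
    mac = "0000.0000.0001"  # hypothetical MAC, not from my network
    print(egress_ports("active", mac, {}))
    print(egress_ports("standby", mac, {}))
    print(egress_ports("standby", mac, {mac: "Po16"}))
```

In this model, an aged-out MAC is flooded on the active switch but goes nowhere on the standby, which looks like a drop from the outside.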
I have corrected this issue by adding two links between my 4500X cluster and my two Nexus switches. Now each 4500X is connected to each Nexus (so I have four links in the EtherChannel instead of two).
I also removed the command mac address-table aging-time 14400 vlan 125, and everything is working.
I think the problem was that some packets had to cross the VSL links between the 4500Xs when I had only two links.
I also upgraded the 4500X cluster to version 03.04.03.SG.