multicast mac-address Nexus 7k

Unanswered Question
Jul 12th, 2009

Hi,

I'm going to deploy a Nexus 7000 in our data center.

While analyzing the configuration, I need to define a static MAC address entry for the multicast MAC address of a Check Point firewall cluster.

The "Layer 2 Switching Configuration Guide, Release 4.1" documentation says, under "Configuring a Static MAC Address":

"[..] You cannot configure broadcast or multicast addresses as static MAC addresses [..]"

Do you have a suggestion for handling this problem, and why is it not possible to configure a static multicast MAC address?

Regards

Dino

ltd Sun, 07/12/2009 - 03:47

Hi Dino,

it's a standards violation and therefore not allowed.

specifically, RFC 1812 "Router Requirements" says in section 3.3.2:

"A router MUST not believe any ARP reply that claims that the Link Layer address of another host or router is a broadcast or multicast address."

having said that, we know that some vendors (Check Point FireWall-1, Nokia firewalls) DO have such a requirement, and as such we are going to allow this in the Ankara (4.2) NX-OS release.

this is due to be released in the next 2-6 weeks, so if you can hang on that long, that is my suggestion.

if you're not in a production network environment, it may be possible for you to participate in the NX-OS 4.2 beta programme if you wish to test this functionality sooner.

feel free to contact me (ltd@cisco.com) off forum if this is something you wish to pursue.

cheers,

lincoln.

dinoantonucci Sun, 07/12/2009 - 10:13

Hi Lincoln,

thank you so much for the careful answer.

unfortunately we cannot wait that long, so I will speak with the customer about evaluating a second option: participating in the NX-OS 4.2 beta program.

But I have another question: in this network infrastructure, the network/VLAN does not have a router that can take on the multicast router role, and the firewalls route these networks. Could a workaround be to implement an IGMP querier only on the Nexus platform for a temporary period?
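For reference, an IGMP snooping querier on NX-OS is enabled per VLAN; a minimal sketch with a hypothetical VLAN number and source address (exact placement varies by release - some images take the command under the VLAN definition rather than `vlan configuration`):

```
! hypothetical VLAN 100 and querier source 192.0.2.1 -- adjust to your network
vlan configuration 100
  ip igmp snooping querier 192.0.2.1
```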

cheers.

dino

tstevens Sat, 08/01/2009 - 07:26

Lincoln, Dino -

Actually, what is introduced in 4.2 is static *ARP* entries, which will now allow mapping unicast IP to multicast MAC (eg, ip arp 1.1.1.1 0100.5e00.0011). It does *not* introduce static multicast mac entries (eg, mac-address 0100.5e00.0011 int e1/1-2).

The former will result in the routed packets hitting such an ARP entry being flooded in the output VLAN, so it makes sense only when the FWs are in an L3-bordered VLAN by themselves.

The latter will also be supported longer term, but that would not be before mid CY10 most likely.
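A minimal sketch of the "former" (static ARP) approach in 4.2, with hypothetical addresses. Given the flooding behavior described above, the SVI should front a VLAN containing only the firewalls:

```
! SVI for the firewall-only VLAN (hypothetical addressing)
interface Vlan100
  ip address 192.0.2.1/24
  ! map the cluster's unicast VIP to its multicast MAC;
  ! routed packets hitting this entry are flooded in the VLAN
  ip arp 192.0.2.10 0100.5e00.0011
```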

Hope that helps.

Tim

e.hoehn Wed, 08/12/2009 - 03:21

Hello,

one of our customers updated to 4.2(1) to use the static ARP entry.

They noticed that every packet is being duplicated.

Is this already a known problem?

Is there a fix / workaround available?

Thanks in advance for a response.

Emanuel

tstevens Wed, 08/12/2009 - 06:32

Hi Emanuel -

Do you mean 2+ copies of each packet *on a single port*? Or do you mean, a single given packet is seen on all ports in the vlan (ie, flooding)?

The latter is expected, the former is not. Please let me know. Thanks,

Tim

e.hoehn Wed, 08/12/2009 - 06:53

Hello Tim,

unfortunately, every packet gets duplicated on a single port.

PING 192.168.0.1 (192.168.0.1) 56(84) bytes of data.

64 bytes from 192.168.0.1: icmp_seq=1 ttl=252 time=0.500 ms

64 bytes from 192.168.0.1: icmp_seq=1 ttl=252 time=1.96 ms (DUP!)

64 bytes from 192.168.0.1: icmp_seq=2 ttl=252 time=0.375 ms

64 bytes from 192.168.0.1: icmp_seq=2 ttl=252 time=1.57 ms (DUP!)

tstevens Wed, 08/12/2009 - 14:56

Hi,

Please describe the exact topology here. How are things connected, and where are you pinging from/to?

Based on the delay, it looks like packets are looping back around somewhere in the network. Clearly with flooding, if there is a loop, this would occur.

Thanks,

Tim

e.hoehn Thu, 08/13/2009 - 13:35

Hello,

We disabled all other ports, so here is the setup:

--------------------------------------------------------------------------------
Port      Name         Status  Vlan    Duplex  Speed  Type
--------------------------------------------------------------------------------
mgmt0     --           up      routed  full    100    --
Eth1/1    --           down    trunk   full    auto   10g
Eth1/2    --           down    trunk   full    auto   10g
Eth1/3    --           down    55      full    auto   10g
Eth1/4    --           down    1       full    auto   10g
Eth1/5    --           down    165     full    auto   10g
Eth1/6    --           down    trunk   full    auto   10g
Eth1/7    --           down    55      full    auto   10g
Eth1/8    --           down    55      full    auto   10g
Eth1/9    --           down    55      full    auto   10g
Eth1/10   --           down    167     full    auto   10g
Eth1/11   --           down    55      full    auto   10g
Eth1/12   --           down    55      full    auto   10g
Eth1/13   --           down    60      full    auto   10g
Eth1/14   --           down    55      full    auto   10g
Eth1/15   --           down    55      full    auto   10g
Eth1/16   --           down    55      full    auto   10g
Eth2/1    --           down    55      full    auto   10g
Eth2/2    --           down    trunk   full    auto   10g
Eth2/3    sniffer      up      60      full    10G    10g
Eth2/4    --           down    55      full    auto   10g
Eth2/5    --           down    55      full    auto   10g
Eth2/6    --           down    55      full    auto   10g
Eth2/7    --           down    55      full    auto   10g
Eth2/8    --           down    55      full    auto   10g
Eth2/9    --           down    55      full    auto   10g
Eth2/10   testclient   up      167     full    10G    10g
Eth2/11   --           down    55      full    auto   10g
Eth2/12   --           down    55      full    auto   10g
Eth2/13   --           down    55      full    auto   10g
Eth2/14   --           down    55      full    auto   10g
Eth2/15   --           down    55      full    auto   10g
Eth2/16   --           down    55      full    auto   10g
Po1       --           down    trunk   full    auto   --
Po2       --           down    trunk   full    auto   --
Lo0       --           down    routed  auto    auto   --
Vlan1     --           down    routed  auto    auto   --
Vlan60    testsniffer  up      routed  auto    auto   --
Vlan131   --           down    routed  auto    auto   --
Vlan165   --           down    routed  auto    auto   --
Vlan167   testclient   up      routed  auto    auto   --

interface Vlan60
  no shutdown
  description testsniffer
  ip address 137.226.78.57/29
  ip arp 137.226.78.60 5142.4242.4242

interface Vlan167
  no shutdown
  description testclients
  ip address 137.226.185.5/24

There is one test client connected to Eth2/10 which is pinging the "multicast IP" 137.226.78.60.

This IP address is not present, but on the sniffer we see the ICMP requests (2 packets per ICMP request).

When a client with the IP 137.226.78.60 is present, it gets the requests (every packet twice) and sends an answer for every packet, but the replies are not duplicated again, so the test client gets 2 replies for every request, not 4.

tstevens Thu, 08/13/2009 - 18:51

Hi,

We are checking into it, thanks for the heads up.

Tim

e.hoehn Tue, 08/18/2009 - 00:17

Hi Tim,

have you guys been able to follow up on this? Any idea when a fix will be available? Or do I have to open a Service Request to get a fix?

tstevens Tue, 08/18/2009 - 05:56

Hello everyone -

Unfortunately, this is a bug, I reproduced it & filed CSCtb39810.

Basically what's happening here is hardware is incorrectly setting the so-called "CAP1" bit in these packets. The routed packet gets hardware switched, but the CAP1 bit causes the packet to also get copied to the inband interface (ie, sent to the sup CPU), where it ends up getting software routed.

That's why the 2nd copy has higher latency & you'll also see that the TTL has been decremented twice (once by hw & then again by sw).

Note that there's a hardware rate limiter in place that throttles the copied packets to 30Kpps toward the sup. However, other packets require use of the CAP1 bit (new multicast sources for example) so it is potentially dangerous to lower this rate limiter too drastically.
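If you suspect the copied packets are pressuring the supervisor, the rate limiters and their drop counters can be inspected from the CLI; a sketch (output format varies by release):

```
! display supervisor-bound hardware rate limiters, including the
! class throttling these copied packets, and their drop statistics
show hardware rate-limiter
```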

Anyway, engineering has a candidate fix, I have a test image I need to load up today to confirm it's working in my testbed.

Assuming it does, and there is no collateral damage/issues with the fix, we would need to release a new image with the fix.

Hope that helps - apologies for the inconvenience this issue may cause you...

Tim

ahoejmark Tue, 05/11/2010 - 10:28

Tim,

Is the possibility of doing static MAC still roadmapped for "mid CY10"?

What release should I be looking out for? 5.0?

TIA,

-A

tstevens Tue, 05/11/2010 - 15:58

This feature is now tracking for more like end of the CY, possibly later.

Tim

Weber88_2 Fri, 12/10/2010 - 13:43

We are about to set up our data center soon with Nexus 7Ks and Check Point firewalls running HA. Has anyone found an answer on how to set up the static ARP entry correctly? We are running NX-OS 5.1(1a). One thing that we need is the ability to have a static ARP entry for a VLAN that does not have an SVI on the Nexus 7k. In IOS 12.2SXI, this was easy; all you had to do was put in a static entry like so in global configuration mode:

arp 10.70.60.10 0100.5e11.1111 ARPA

The switch would automatically answer when an ARP request was sent to 10.70.60.10.

We have tested static ARP entries on SVIs for the Nexus 7Ks and they don't work at all in 5.1(1a). Is there any way around this?

Here is our setup:

interface Vlan10
  no shutdown
  ip address 192.168.100.1/24
  ip arp 192.168.100.10 0100.5e11.1111

I am using a workstation plugged directly into the 7k on Vlan 10:

interface Ethernet1/2
  description test_arp
  switchport access vlan 10
  no shutdown

I am testing by pinging 192.168.100.10 and then looking in the ARP cache of my workstation. No ARP entry ever shows up. It briefly shows an "invalid" entry (all 0s), but that goes away after the ARP request times out.

Any help would be greatly appreciated. I have contacted Cisco TAC but haven't received a workaround or fix. Thanks!

cjosborne Mon, 09/26/2011 - 19:29

Joseph, did you ever get this to work with the latest NX-OS?

Weber88_2 Tue, 09/27/2011 - 05:53

Yes, there is a fix or a workaround for this. What we did was have the Check Point firewall host the multicast MAC and respond using proxy ARP for the VIP. We also turned off IGMP snooping on the Nexus 7k. For a layer 3 interface on the switch, the 'ip arp' command worked to allow the switch to communicate with the VIP of the firewall.

The other option would be to turn on broadcast HA on the Check Points, but that would be a less desirable setup in our situation - or - in later releases of the Check Point software you can actually set up the firewall to join an IGMP group.

I hope this helps.

cjosborne Tue, 09/27/2011 - 22:15

what we are seeing, now that the Check Point engineer has explained it, is that the CCP traffic never makes it to each of the cluster members when on the 7k. If you move the interface to a 3750, the CCP packets work. It doesn't matter if it's in multicast or broadcast mode with IGMP snooping disabled. Any ideas?

Cisco says to upgrade the 7k to 5.2 code so we can create a static MAC entry for the multicast address of the CCP packet source.

Weber88_2 Thu, 09/29/2011 - 06:09

We are only running 5.1(1a).

You must disable IGMP snooping.  It sounds like you did.

If you don't have a layer 3 interface on the 7K for the subnet then you must enable the checkpoint firewall to advertise the arp entry for that subnet.

If you are routing through the 7K for the end host then all you should need is the static MAC entry for the VIP of the checkpoint cluster to allow the traffic to be routed correctly.

This all stems from how Check Point accomplishes its active/active load balancing. In multicast mode (which worked the best in our lab when we first tested Check Point), the Check Point cluster IP is a unicast IP address and is associated with a multicast MAC address. You must disable IGMP snooping for that subnet to allow the multicast packets to be treated as a broadcast by the switch (forwarded out all interfaces in that VLAN except for the interface they were received on). When the end host sends a packet destined for another host on a different network and the Check Point firewall is the next hop/default gateway, an ARP request has to be generated by the end host asking for the MAC of the DG (which in Check Point's case is a multicast MAC). The fix for this, if you have the 7K routing to the firewall, is to statically set the MAC in the layer 3 interface configuration, like so:

interface Vlan10
  no shutdown
  ip address 192.168.100.1/24
  ip arp 192.168.100.10 0100.5e11.1111

As for the CCP packets between the checkpoints, we never had any issues with those packets traversing the 7K.  Make sure that the checkpoint firewalls are running proxy-arp as stated in my previous post.
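For completeness, a sketch of disabling IGMP snooping for the firewall VLAN as described above (hypothetical VLAN 10; on some releases this is configured globally or under the VLAN definition instead):

```
! let the multicast-MAC frames be flooded in the VLAN like broadcasts
vlan configuration 10
  no ip igmp snooping
```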

I hope this helps!

cjosborne Tue, 10/11/2011 - 09:36

Thanks for the reply. So here is what actually happened. We didn't have an active/active Check Point cluster; we had an active/passive. This Check Point was the egress gateway for the Nexus but was also plugged into the Nexus core VDC via 1 Gig. After working with TAC and doing packet captures, we saw the CCP packets in both broadcast mode and multicast mode. The Check Point vendor basically showed us where the cluster was flipping back and forth on the Nexus because it was not receiving the CCP heartbeats on that interface. TAC found us two bugs, and we were able to fix one of them.

This one:

CSCtl67036: basically, prior to NX-OS 5.1(3), the Nexus will discard packets that have a source of 0.0.0.0, which in broadcast mode is exactly what the CCP heartbeat is.

We bypassed this one.

CSCsx47620 is the bug for the static multicast MAC address feature, but it requires 5.2 code on the 7k, and that in turn may require additional RAM.

dhewes Tue, 10/11/2011 - 15:25

Joseph - The ClusterXL A/A configuration is a variation of the StoneSoft or Rainfinity clustering technologies that have been used to cluster Solaris and other *NIX-flavored servers and firewalls for years. (In fact, StoneSoft filed suit against Check Point in Europe 8 or 9 years ago for patent violations, and lost.) These configurations were very common on Check Point clusters running on Solaris from the late 90's forward and, as you describe, have unicast IPs with a multicast MAC for the VIP. Even from the days of installing these on the brand new (at the time) 2900 series switches, you had to do exactly as you state above - static MAC entries (or in some cases port mirrors) - so traffic was directed to both active switch ports.

In Active/Passive mode, Check Point ClusterXL clusters are almost always "plug and play" today - rarely do the switches need anything beyond speed/duplex settings. The VIP assumes the MAC of the physical NIC it is currently bound to, and therefore there are no issues as far as switch config or proxy ARP entries on the gateways. All of these issues have to do with traffic flowing to the VIP and through the firewall, and the ability of the switch to correctly identify which physical switch port(s) the VIP is currently patched into.

This is one of three types of traffic associated with ClusterXL itself. The second is state synchronization, which is accomplished through a crossover cable and is therefore not relevant here. Even when using a switch, state sync is a typical TCP 18181 connection from a unicast IP/unicast MAC on one gateway to the other through a dedicated interface pair.

The challenge described by CJ is not with the traffic flowing to the VIP, however. It is an entirely separate process - Check Point Clustering Protocol (aka CPHA if filtering in Wireshark) is essentially the heartbeat traffic. Every interface pair within a Check Point cluster continually communicates with its "partner" interface on the other cluster members. If any packet takes over 100ms or shows more than a 5% loss, the gateway is forced into "probing" mode, where it falls back to ICMP to determine the state of the other cluster member. Depending on the CPHA timing settings, an active gateway will fail over to the passive in as quickly as 500ms or so. ClusterXL will fail over the entire gateway to the standby to avoid complications with asymmetric routing.

Out of the box, CCP is configured to use multicast, but it supports broadcast as well. To change this in real time (no restart required), simply issue the command:

cphaconf set_ccp {broadcast/multicast}

At the Ethernet level, CCP traffic will always have a source MAC of the Magic MAC of 00:00:00:00:xx:yy, where xx is the "Cluster ID" (something identical on each cluster member but unique from one cluster to another) and yy is the cluster priority (00, 01, etc.) based on the priority levels set on cluster members within Dashboard on the cluster object. The destination MAC will always be the Ethernet broadcast of ff:ff:ff:ff:ff:ff.

At the IP level, the source of CCP will always appear as 0.0.0.0. The destination will always be the network address (ie, x.x.x.0).

Similarly, in multicast mode you will see the same traffic at the IP level, but at the Ethernet level the destination will now be an IPv4 multicast MAC (ie, 01:00:5e:4e:c2:1e).

In a tcpdump with the -w flag opened in Wireshark and a filter applied of just "cpha" (without the quotes), you should see a continual stream of traffic with the same source and destination IPs on all packets (0.0.0.0 and the network IP), the destination of either a bcast or mcast MAC, and the source MAC alternating between 00:00:00:00:xx:00 and 00:00:00:00:xx:01.

Long story short, the problem CJ is describing is a behavior on the 7K where a packet capture taken on the Check Point interface itself (ie, tcpdump -i eth0 -w capture.cap) ONLY shows CPHA traffic from its own source MAC and no packets from its partner. A tcpdump on the 7K itself will show traffic from both.

As CJ mentioned, a simple NX-OS upgrade will fix the issue per:

This one:

CSCtl67036: basically, prior to NX-OS 5.1(3), the Nexus will discard packets that have a source of 0.0.0.0, which in broadcast mode is exactly what the CCP heartbeat is.

We bypassed this one.

CSCsx47620 is the bug for the static multicast MAC address feature, but it requires 5.2 code on the 7k.

(NOTE: Additional RAM may be required for the 5.2 update.)

Also note that Check Point gateways do support IGMP multicast groups, given that you have the correct license. It is a feature of SecurePlatform Professional on the higher-end gateways, or a relatively inexpensive upgrade on the lower-end boxes or open platforms. For lab purposes you can simply type "pro enable" at the CLI (without the quotes). As of the latest build there is no technical limitation (no license check), so you can enable advanced routing features as needed for testing in a lab. For step-by-step details on configuring IGMP on SPLAT Pro, go to the Check Point support site and search for sk32702.

This can be a frustrating issue to troubleshoot, so hopefully this helps someone avoid the headaches I ran into.
