Multicast Image Deployment failing

Unanswered Question
Jun 26th, 2009

Hi,

I am having problems deploying images via multicast using Windows Deployment Services. All imaging sessions fail after a few minutes. While troubleshooting we tried different images and they also fail; unicast works fine. We also moved the server to the same switch as the clients, and in that case imaging completes successfully.

Hardware:

Core: Cisco 4507re Sup 6E (All line cards are classic)

Access: Mix of 2950G, 3550, 3560

Each switch stack is a mix of up to three 2950G and 3550/3560 switches, and each stack connects to our single 4507re.

Setup:

Clients and Deployment Server are in the same VLAN

All access Switches have default IGMP settings (enabled, v2)

Core 4507 has IP Multicast Routing

Int VLAN x

ip pim sparse-dense-mode

Any ideas what I am doing wrong?

Thanks

Chris

Giuseppe Larosa Fri, 06/26/2009 - 04:01

Hello Chris,

>> Clients and Deployment Server are in the same VLAN

You should verify that the 4507 is sending IGMP queries and that the clients answer them.

The issue could be related to IGMP snooping that attempts to optimize multicast forwarding.

You can verify the IGMP snooping settings on the switches with

sh ip igmp snooping
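
A minimal sketch of those checks, assuming VLAN 1 as a placeholder for the VLAN in question and assuming these show commands are available on your platforms and IOS releases (availability and output vary, particularly on older 2950 images):

```
! On the 4507 (L3 interface for the VLAN): is it the IGMP querier, and which groups has it learned?
show ip igmp interface vlan 1
show ip igmp groups

! On each access switch: has snooping learned a multicast router port and group members?
show ip igmp snooping mrouter
show ip igmp snooping groups
```

The mrouter port on each access switch should be the uplink toward the 4507, and the client ports should appear under the deployment group while a session is running.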

Hope to help

Giuseppe

chris.macleod Fri, 06/26/2009 - 05:39

Hi Giuseppe,

Thanks for your reply. Here is the output for the show ip igmp snooping command. The Multicast source is on the 2960 and destination is one client on each 2950.

Core_Switch#show ip igmp snooping

Global IGMP Snooping configuration:

-----------------------------------

IGMP snooping : Enabled

IGMPv3 snooping : Enabled

Report suppression : Enabled

TCN solicit query : Disabled

TCN flood query count : 2

Last Member Query Interval : 1000

Vlan 1:

--------

IGMP snooping : Enabled

IGMPv2 immediate leave : Disabled

Explicit host tracking : Enabled

Multicast router learning mode : pim-dvmrp

Last Member Query Interval : 1000

CGMP interoperability mode : IGMP_ONLY

ESK_2950_5#show ip igmp snooping

Global IGMP Snooping configuration:

-----------------------------------

IGMP snooping : Enabled

IGMPv3 snooping (minimal) : Enabled

Report suppression : Enabled

TCN solicit query : Disabled

TCN flood query count : 2

Last member query interval : 1000

Vlan 1:

--------

IGMP snooping : Enabled

Immediate leave : Disabled

Multicast router learning mode : pim-dvmrp

Source only learning age timer : 10

Last member query interval : 1000

CGMP interoperability mode : IGMP_ONLY

ESK_2950_6#show ip igmp snooping

Global IGMP Snooping configuration:

-----------------------------------

IGMP snooping : Enabled

IGMPv3 snooping (minimal) : Enabled

Report suppression : Enabled

TCN solicit query : Disabled

TCN flood query count : 2

Last member query interval : 1000

Vlan 1:

--------

IGMP snooping : Enabled

Immediate leave : Disabled

Multicast router learning mode : pim-dvmrp

Source only learning age timer : 10

Last member query interval : 1000

CGMP interoperability mode : IGMP_ONLY

CALC1_2960_2#show ip ig snooping

Global IGMP Snooping configuration:

-------------------------------------------

IGMP snooping : Enabled

IGMPv3 snooping (minimal) : Enabled

Report suppression : Enabled

TCN solicit query : Disabled

TCN flood query count : 2

Robustness variable : 2

Last member query count : 2

Last member query interval : 1000

Vlan 1:

--------

IGMP snooping : Enabled

IGMPv2 immediate leave : Disabled

Multicast router learning mode : pim-dvmrp

CGMP interoperability mode : IGMP_ONLY

Robustness variable : 2

Last member query count : 2

Last member query interval : 1000

I will enable debugging to verify that queries are being sent/received.

chris.macleod Fri, 06/26/2009 - 06:22

Hi,

I have attached some of the debug output for debug ip igmp snooping groups.

Source is connected on 2960

Destination is on a separate stack, on the 2950.

Multicast address begins 232.

The 2950 and 2960 look to be sending/receiving queries but I can't see the 4507 sending and receiving any. Are reports the same thing?

Thanks for your help.

Giuseppe Larosa Fri, 06/26/2009 - 06:55

Hello Chris,

IGMP snooping activity looks correct: there are queries and replies.

Core activity also looks fine. The output is different, but messages like

003737: Jun 26 14:51:03: IGMPSN: group: Received V2 report for group 239.192.83.80 received on Vlan 1, port Gi1/3

003738: Jun 26 14:51:03: IGMPSN: group: Adding client ip 172.16.62.5, port_id Gi1/3, on vlan 1

003739: Jun 26 14:51:03: IGMPSN: group: Added port Gi1/3 to group 239.192.83.80

are a sign of correct behaviour for an L3 device.

Check the error counters on the trunk ports between the switches.
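
A sketch of where to look, assuming Catalyst IOS on both ends of each link (the interface name below is a placeholder, not taken from your configuration):

```
! Which ports are trunking, and which VLANs they carry
show interfaces trunk

! Per-port error counters (FCS, alignment, and so on)
show interfaces counters errors

! Detail for one uplink, including input and output drops
show interfaces GigabitEthernet0/1
```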

What is the bit rate of the stream?

Are these high-definition images in a healthcare environment?

Hope to help

Giuseppe

chris.macleod Fri, 06/26/2009 - 07:11

Hi,

I have checked both sides of each link from the source to the clients - no errors at all.

The deployment server shows a transfer rate of 11000-12000 KBps. The rate remains pretty stable for around 3 minutes, then drops to 0 KBps.

The images are just for a classroom and are just under 10GB in size.

Thanks again.

Giuseppe Larosa Fri, 06/26/2009 - 07:25

Hello Chris,

>> The deployment server shows a transfer rate of 11000-12000 KBps. The rate remains pretty stable for around 3 minutes, then drops to 0 KBps.

Sorry for the basic question, but do you mean that you see the rate drop to zero on the switch port connected to the server?

Hope to help

Giuseppe

chris.macleod Fri, 06/26/2009 - 07:29

Sorry I should have said.

I am getting the transfer rate from the Windows Deployment Services console on our Win2008 Server.

Thanks

Chris

Giuseppe Larosa Fri, 06/26/2009 - 08:12

Hello Chris,

Another thought about IGMP snooping:

IGMP reports are suppressed by the L2 switches.

I wonder whether the server expects to see IGMP reports as a form of feedback from the receivers.

If so, it could stop sending when it no longer hears the reports, or it may expect some different form of feedback.

To check this, use MRTG to graph the link usage on the switch port during an attempt.

Hope to help

Giuseppe

chris.macleod Fri, 06/26/2009 - 10:49

Hi Giuseppe,

We have actually been using Windows Deployment Services successfully until now. Is it still worth checking? We don't use MRTG, but we have some SolarWinds software that we could get usage graphs from. Will this be okay?

Thanks for your time.

Chris

Giuseppe Larosa Fri, 06/26/2009 - 11:34

Hello Chris,

You could use SolarWinds as well.

These are the most difficult problems to investigate.

You say it worked well up to now.

Verify what you have changed in the network recently.

Hope to help

Giuseppe

chris.macleod Mon, 06/29/2009 - 01:56

Hi,

I have attached a graph that plots rx+tx on the switch ports of the source and two receivers. The 2960 port is the source (1 Gb interface) and the two 2950 ports are the receivers (100 Mb).

As for changes, I recently added some VLANs and added PIM config to them, as well as a helper address pointing to the deployment server. I have removed this and am now struggling to find anything else that may be causing this problem.

I have checked the switches and they have found a querier and mrouter (4507re).

Thanks

Chris

nate-miller Sat, 06/27/2009 - 07:16

3 minutes is the default group membership time.

http://www.cisco.com/application/pdf/paws/68131/cat_multicast_prob.pdf

There are headaches when you've got multicast across multiple switches: either disable IGMP snooping (and turn the thing into a flooding scenario!) or enable the IGMP querier feature. This may help. I've been having the same problem in the same VLAN on the SAME 6500, for what it's worth.
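
A rough sketch of the two options, assuming VLAN 1 and a switch and IOS release that support the snooping querier (not every platform in this thread necessarily does):

```
! Option 1: disable IGMP snooping for the VLAN, so multicast is flooded within it
configure terminal
 no ip igmp snooping vlan 1
end

! Option 2: keep snooping and enable the snooping querier on one switch
configure terminal
 ip igmp snooping querier
end

! Check which querier the switch has detected
show ip igmp snooping querier
```

Where a router already runs PIM on the VLAN, as the 4507 does in this thread, it normally sends the IGMP queries itself, so option 2 matters mainly when no multicast router is present.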

Jake Gillen Fri, 06/26/2009 - 14:14

Hi Chris,

You say that the sessions fail after a few minutes. Do the clients ever get any data, or is it like they are hanging and waiting?

If there is more than a single router hop between the clients and the server, I would see if there is a TTL setting and set it to a higher value.

Hope that's not too dumb!

Jake

chris.macleod Mon, 06/29/2009 - 07:52

I have found that when the source is in the same stack as the receiving devices the deployment is slower but works.

Giuseppe Larosa Mon, 06/29/2009 - 10:53

Hello Chris,

I've seen the traffic graphs and the traffic stays up for 6 minutes so this is not related to the usual timers.

The debug log files clearly showed that the 4507 is acting correctly as the IGMP querier for the VLAN.

So we can say the issue shouldn't be related to IGMP snooping activity.

Now, you have added an interesting note that says that you see multicast delivery working when source and receivers are on the same stack.

This would point to a stack issue when dealing with multicast traffic that originates from, or is destined for, somewhere outside the stack itself.

This is strange but possible.

Hope to help

Giuseppe

chris.macleod Wed, 11/25/2009 - 04:12

Hi,

Thought I would update this thread with some new info.

After some playing around I found that only certain images would fail. The images that multicast okay could be taken anywhere in the campus and deployed successfully. Images that failed did so at the exact same percentage each time: the transfer rate would immediately drop to zero and never recover.

I found that disabling DHCP Snooping on our core switch solved the problem. I can even disable DHCP Snooping on the core switch after a multicast session has failed (transfer rate dropped to zero) and the session recovers.

DHCP Snooping is enabled on all our access switches, so I am happy to leave DHCP Snooping off on the core. Maybe that is even advisable? Can anyone shed any light on why DHCP Snooping would cause this issue?

Thanks

Chris

Giuseppe Larosa Thu, 11/26/2009 - 13:17

Hello Chris,

It was very kind of you to provide feedback on this open issue.

The relationship between multicast and DHCP snooping is not obvious.

Generally speaking, DHCP snooping should be enabled only at the access layer if the core switches have no end users connected to them.
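
As a sketch, assuming VLAN 1 and standard Catalyst IOS syntax, checking and removing DHCP snooping on the core might look like this:

```
! See whether DHCP snooping is enabled, for which VLANs, and which ports are trusted
show ip dhcp snooping

configure terminal
 ! Either stop snooping for the affected VLAN only...
 no ip dhcp snooping vlan 1
 ! ...or disable the feature globally on the core
 no ip dhcp snooping
end
```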

In other cases, colleagues have reported high CPU usage on C6500 switches as well after enabling DHCP snooping, probably caused by a bug.

Hope to help

Giuseppe

chris.macleod Fri, 11/27/2009 - 02:44

Hi Giuseppe,

The CPU usage was always low. All the access switches have DHCP Snooping enabled, so I will leave it off on the core.

Thanks for taking time to help with this issue!

Thanks

Chris
