Maximum throughput on Cisco 2621 router

Mar 8th, 2008

I have two Dell 2950-III servers, each with dual quad-core 3.0GHz processors, 8GB RAM and a 1TB SATA drive. Dell_1 has an IP address of 192.168.1.10/24 and Dell_2 has an IP address of 192.168.1.20/24. Both servers are connected over copper Gigabit to a Cisco Catalyst 2960 switch. I am running Redhat Linux ES 3 on these servers, and I hard-coded the interfaces to 1000/full.

When I perform FTP between the servers, I get about 800Mbps throughput. That's the good part.

Now, I have a Cisco 2621 (64MB RAM/16MB flash). I connected both F0/0 and F0/1 to the Catalyst 2960. The router is running IOS version 12.3(24). I set the interfaces on both the router and the Catalyst to 100/full duplex. I gave F0/0 192.168.1.1/24 and F0/1 192.168.2.1/24. I gave Dell_2 192.168.2.10/24 with a default gateway of 192.168.2.1. Dell_1's default gateway is 192.168.1.1.

My FTP transfer peaks at 5Mbps between Dell_1 and Dell_2 across the 2621, and the CPU on the 2621 peaks at 99% utilization. I see no errors on either the Catalyst switchports or the router interfaces. I thought I could get much better than 5Mbps out of the Cisco 2621. With either SecureFTP (sftp) or SecureCopy (scp), the throughput drops to 2Mbps. In other words, it gets worse.

Does anyone know what the maximum throughput of a Cisco 2621 router is?

The IOS image on the router is c2600-ik9o3s3-mz.123-24a.bin.

Thanks.

Richard Burts Sat, 03/08/2008 - 08:52

David

According to a router performance doc from Cisco the 262x may get up to 12 Mbps. These figures are based on 64 byte packets when no services are configured and all traffic is CEF switched. The throughput drops to less than 1 Mbps when traffic is process switched. Depending on your traffic (and FTP should be sending maximum size packets) and how the router is configured your performance would be different. And given your observation that router CPU goes to 99% it strongly suggests that something is configured that is making the router do something that is impacting performance.

I am not clear in your post whether the switch ports where the servers are connected are in the same VLAN or whether you configured separate VLANs for them.

HTH

Rick

cisco24x7 Sat, 03/08/2008 - 09:09

Rick,

- When both servers are in the same VLAN, I get 800Mbps throughput.

- There are no other services configured on the router. This is a test router.

- I have CEF enabled.

- Dell_1 and F0/0 are in VLAN 10, and Dell_2 and F0/1 are in VLAN 20. I get 5Mbps throughput with 99% CPU.

- I repeated the test with a Cisco 7204VXR and I can only push 40Mbps at 99% CPU. I have CEF enabled in this case as well.

Any ideas why?

David

Richard Burts Sat, 03/08/2008 - 09:40

David

If all it were doing were forwarding the traffic and only using CEF I would certainly not expect the CPU to go that high. The high CPU suggests that the router is doing something other than just CEF forwarding packets. Is there any possibility that it is logging anything about the traffic? Is there any possibility that it is having to fragment packets?

HTH

Rick

cisco24x7 Sat, 03/08/2008 - 10:05

This is what I did to the router:

1. write erase

2. reload

3. Entered the following:

conf t
hostname C2621
no logging on
no logging console
no logging buffered
no logging monitor
ip cef
int f0/0
speed 100
dup full
ip address 192.168.1.1 255.255.255.0
no shut
ip cef
int f0/1
speed 100
dup full
ip address 192.168.2.1 255.255.255.0
no shut
ip cef

I'm still getting 5Mbps throughput at 99% CPU. On the 7204VXR, repeating the same process, throughput is 40Mbps with CPU at 99%.

In both cases, with sftp and scp, throughput is around 20Mbps on the 7204VXR and between 1 and 2Mbps on the 2621.

If I replace the Cisco router with a pair of Checkpoint SPLAT Active/Active ClusterXL firewalls, I get wire-speed file transfer at about 90Mbps.

Richard Burts Sat, 03/08/2008 - 10:24

David

Thanks for posting config information. That certainly seems to indicate that logging is not an issue. I do not know why it would need to fragment or do anything to the packets but I am wondering what would drive CPU so high. I wonder if you would clear the counters, run a transfer and check the number of packets in on each interface against the number of packets out on the other interface?

HTH

Rick

cisco24x7 Sat, 03/08/2008 - 10:53

Rick,

I already tried what you suggested, and the number of packets in is almost the same as the number of packets out on the other side.

I don't think packet fragmentation is an issue, because I ran tcpdump on the servers on both sides and did not see any fragmented packets at either end.

glen.grant Sat, 03/08/2008 - 11:06

When you are doing these transfers and you look at those 100 meg router uplinks, are they pretty much buried? You also have to remember you have gigabit NICs and you are trying to shove all that information up a 100 meg pipe, so I calculate at best you would get maybe 12 megabytes per second transfer with overhead. In your first post are you talking bits or bytes?

cisco24x7 Sat, 03/08/2008 - 11:15

12 megabytes per second = 96 megabits per sec.

I get about 5 megabits per second which

is not a lot.

I even changed the speed on the Dell servers from 1G to 100M, and did the same on the Catalyst 2960. Same result.

Edison Ortiz Sat, 03/08/2008 - 11:33

Please post what you are seeing

1) Clear the counters from all interfaces

2) Initiate the transfer once again

3) Post the show interface, this would give us a bit more information.

You may also post the show processes while the transfer is taking place.

Thanks

__

Edison.

JosephDoherty Sat, 03/08/2008 - 12:34

http://www.cisco.com/en/US/products/hw/routers/ps359/products_tech_note09186a00801c2af3.shtml

Your high IP Input is of concern, the above might assist troubleshooting it.

PS:

Process switching could account for your low file transfer rates.

[edit]

PPS:

The packets within the input queues could be of concern too. See Input Queue Drops in: http://www.cisco.com/en/US/products/hw/routers/ps133/products_tech_note09186a0080094791.shtml#topic2

[edit2]

Regarding Input Queue Drop, above, none noted as being dropped, I think we still might be concerned about number shown within input queues.

JosephDoherty Sat, 03/08/2008 - 12:21

When both the 2621 and 7204VXR max out their CPU, is interrupt CPU also about maxed out?

What NPE is within the 7204VXR?

When you jump across either router, are the packets 1500 bytes?

PS:

The performance numbers we have for a 262x are: 1.5 Kpps process switching and 25 Kpps fast switching.

Ethernet packet size to bandwidth table (for gig):

Packet Size (Bytes)         64    128   256   512   1024  1518
Theoretical Maximum Kpps    1488  845   453   235   120   81

Remember for file transfer, traffic is bidirectional, need some pps for return ACKs.
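
The table figures can be reproduced from first principles. Below is a minimal sketch in Python, assuming each frame costs an extra 20 bytes on the wire (8-byte preamble plus 12-byte inter-frame gap) on top of the frame size shown. The function name `max_pps` is illustrative only and does not come from any Cisco tool.

```python
# Per-frame wire overhead on Ethernet (assumption stated above):
# 8-byte preamble/SFD + 12-byte inter-frame gap.
WIRE_OVERHEAD_BYTES = 20

def max_pps(line_rate_bps, frame_bytes):
    """Theoretical maximum frames per second at a given line rate."""
    bits_on_wire = (frame_bytes + WIRE_OVERHEAD_BYTES) * 8
    return line_rate_bps / bits_on_wire

# Print the gig column from the table plus the Fast Ethernet equivalent.
for size in (64, 128, 256, 512, 1024, 1518):
    gig_kpps = max_pps(1_000_000_000, size) / 1000
    fast_kpps = max_pps(100_000_000, size) / 1000
    print(f"{size:5d} bytes: {gig_kpps:7.0f} Kpps at 1 Gbps, {fast_kpps:6.1f} Kpps at 100 Mbps")
```

Because the overhead is per frame, the Fast Ethernet numbers are exactly one tenth of the gig numbers.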

cisco24x7 Sun, 03/09/2008 - 10:17

I actually read all of those links prior to

posting my question in the forum.

Any more ideas anyone? Thanks.

JosephDoherty Sun, 03/09/2008 - 11:36

What do you think might account for all the broadcasts seen on fastE 1/0? They look to be about 21% of the inbound packet count on that interface.

I'm wondering whether the router, as a host processing broadcasts, is what might be consuming the CPU.

cisco24x7 Sun, 03/09/2008 - 14:36

The Linux servers are running Samba (aka Microsoft file-sharing services), which sends broadcast WINS and other chatty Microsoft protocols. That should NOT have any effect on the router.

If what you said were true, the same thing would apply when I replace the router with the Checkpoint firewalls. When I do that, I get wire-speed file transfer at 90 Mbps. I would have thought I would get better throughput on the router than on the firewalls, since firewalls are stateful by nature while the router is just forwarding packets. That was my assumption. However, the opposite is true. Weird.

Edison Ortiz Sun, 03/09/2008 - 14:52

Ok, let's review what you are seeing.

1) You mentioned you are peaking at 5Mbps

2) The router's CPU is pegged at 100%

3) Based on links provided by other members, it seems IP Input is the cause of the high CPU utilization

4) The link lists oversubscription of the router as one of the causes

5) The router is rated at 12Mbps when using IP only and 64-byte packets

Now... Based on the output you posted from one of the interfaces.

5 minute input rate 4170000 bits/sec, 390 packets/sec

5 minute output rate 123000 bits/sec, 217 packets/sec

We do the math to see what the packet size is, taking the input rate values.

4170000 / 390 = 10692.31 bits per packet, which we convert to bytes:

10692.31 / 8 = 1336 Bytes average for each packet.

You are sending packets over 20 times the size used for the spec rate, and you should reduce the rated figure accordingly.

We come to the conclusion that a 2621 maxes out at 5Mbps when the average packet size is 1336 Bytes.

HTH,

__

Edison.
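
The average-packet-size arithmetic in the post above can be checked with a few lines of Python. This is only a sketch: the helper name `avg_packet_bytes` is made up for illustration, and the inputs are the two 5-minute rate lines quoted in the post.

```python
def avg_packet_bytes(bits_per_sec, packets_per_sec):
    """Average packet size in bytes implied by an interface's rate counters."""
    return bits_per_sec / packets_per_sec / 8

# Figures quoted in the thread from one router interface.
inbound = avg_packet_bytes(4_170_000, 390)   # 5 minute input rate
outbound = avg_packet_bytes(123_000, 217)    # 5 minute output rate

print(f"inbound average:  {inbound:.1f} bytes/packet")
print(f"outbound average: {outbound:.1f} bytes/packet")
```

The inbound direction works out to roughly 1336 bytes per packet, matching the post. The outbound direction averages about 71 bytes per packet, which would be consistent with mostly small return packets such as TCP ACKs, although the thread does not confirm what that traffic is.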

JosephDoherty Sun, 03/09/2008 - 15:01

My understanding is, as packet size increases, and with a constant PPS rate, effective transfer bandwidth should increase. The increase isn't exactly linear, but the table I provided in a prior post (from Cisco) has the required PPS rates to obtain Ethernet gig line rates for different packet sizes. (You can scale it by dividing by 10 for Fast Ethernet.)

Edison Ortiz Sun, 03/09/2008 - 15:10

Is that based on tests you've done ?

The PPS rated for a 2621 is 25,000 based on 64Byte packet size. That's how they get the 12.80Mbps figure.

Math as follows:

25,000 * 64 = 1,600,000 Bytes Per Second

Convert to Bits

1600000 * 8 = 12,800,000 Bits Per Second

You are saying, if you have a packet 20 times bigger, you still maintain the same PPS ? I doubt it.

I suggest the OP lower the packet size at the application layer and see if it improves the throughput.
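
The disagreement in this part of the thread turns on whether the packets-per-second rate holds as packets grow. The sketch below works through the arithmetic under the constant-PPS assumption, which nobody in the thread had verified on a 2621. The function name `throughput_mbps` and the 100 Mbps cap (the router's Fast Ethernet line rate) are illustrative choices, not figures from a Cisco document.

```python
def throughput_mbps(pps, packet_bytes, line_rate_mbps=100.0):
    """Throughput if the router sustained `pps` at any packet size, capped at line rate."""
    raw_mbps = pps * packet_bytes * 8 / 1_000_000
    return min(raw_mbps, line_rate_mbps)

# The data-sheet case quoted in the thread: 25,000 pps of 64-byte packets.
print(throughput_mbps(25_000, 64))

# The same pps with the 1336-byte average seen during the FTP transfer.
# The raw figure exceeds 100 Mbps, so the interface speed becomes the limit.
print(throughput_mbps(25_000, 1336))
```

If the rate really were constant, large packets would saturate the 100 Mbps interfaces long before the forwarding rate mattered. If the rate falls as packets grow, the result lands somewhere below that, which is the open question in the posts above.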

JosephDoherty Sun, 03/09/2008 - 15:31

Personally tested, no. Nor could I guarantee, nor do I actually know, a 2621 will maintain its PPS rate regardless of actual packet size. However, effective bandwidth usually jumps with increased packet size.

PPS for 64 byte packets are often quoted since for IP it normally represents the worst case to provide line rate. (100 Mbps Ethernet with 64 byte packets requires about 148,809 PPS; 1500 byte packets require only about 8,100 PPS.)

The table I drew the gig Ethernet PPS rates from can be found: http://www.cisco.com/en/US/products/hw/modules/ps2643/products_white_paper09186a0080091db8.shtml

What's interesting in table 1 and table 2, we do see the actual PPS rate fall with increased packet sizes, but the necessary PPS often decreases even more. So, the graphs show a higher percentage of theoretical line rate being achieved as packet size increases. (Your mileage, err bandwidth, might vary.)

Edison Ortiz Sun, 03/09/2008 - 16:07

You are basing this argument on the 7500 with a VIP? LOL - talk about comparing apples and oranges.

The bottom line is this, we've circulated the routerperformance document many times. No need to repost it. There are numbers published on such document and the values that were used to come up with those numbers.

On a regular router 2621 (No VIP, No Layer3 Distributed Switching), you get a maximum of 12.80Mbps based on 25k PPS * 64Byte Size.

JosephDoherty Sun, 03/09/2008 - 16:59

Edison,

I don't disagree about the math on how the reference sheet you refer to arrives at 12 Mbps for the 2621 for fast path switching. Nor do I know anything beyond the sheet's 25 Kpps rating for a 2621's fast path switching; except for the much, much lower rating for process switching.

I wasn't directly comparing a 2621 against a 7500, either. What I was arguing was that effective bandwidth normally improves as packet size increases (if for no other reason than the improved ratio of actual payload to packet/frame overhead). What the 7500 VIP document showed was both the reduced PPS required to obtain line rate as packet size increases and one example of the impact this has on a particular piece of hardware.

Bottom line: 12.8 Mbps for 64 byte packets should represent the worst case, not necessarily the best case.

I see David did try your suggestion to decrease packet size to 64 bytes and saw a 5:1 performance reduction. Although I wouldn't have been able to predict from my argument whether there would be any reduction, I'm not surprised there was. I would have been surprised if there was an improvement.

[edit]

PS:

BTW: I am surprised from time to time. ;)

JosephDoherty Sun, 03/09/2008 - 15:12

"It is sending broadcast WINS and other chatty Microsoft protocols. That should NOT have any effect on the router."

Well, the link that both Glen and I provided contains, under:

IP Input

Traffic that cannot be interrupt-switched arrives

Broadcast traffic

Check the number of broadcast packets in the show interfaces command output. If you compare the amount of broadcasts to the total amount of packets that were received on the interface, you can gain an idea of whether there is an overhead of broadcasts. If there is a LAN with several switches connected to the router, then this can indicate a problem with Spanning Tree.

So, I'm not certain it wouldn't have any effect on the router. (What's the CPU load on the router when there is no transfer?)

With regard to router vs. your Checkpoint firewall, perhaps apples to oranges comparison of hardware capabilities?

One thing you might also try, if your IOS supports it, is using CoPP.
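
The broadcast check quoted from the Cisco note above is a simple ratio of two counters from show interfaces. A minimal sketch follows; the helper name `broadcast_share` is made up for illustration, and the counter values are hypothetical, chosen only to reproduce the roughly 21% share mentioned in the thread (the actual counters are not reproduced here).

```python
def broadcast_share(total_packets_in, broadcasts_in):
    """Fraction of received packets that were broadcasts."""
    if total_packets_in == 0:
        return 0.0
    return broadcasts_in / total_packets_in

# Hypothetical counter values, NOT taken from the router's actual output.
share = broadcast_share(total_packets_in=100_000, broadcasts_in=21_000)
print(f"broadcasts: {share:.0%} of inbound packets")
```

Clearing the counters before a transfer and taking the ratio afterwards shows how much of the inbound load is traffic the router must inspect itself rather than simply forward.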

cisco24x7 Sun, 03/09/2008 - 16:05

Very interesting discussion.

"If there is a LAN with several switches connected to the router,

then this can indicate a problem with Spanning Tree."

There are NO other switches. The test was done on a single Catalyst 2960 switch.

"(What's the CPU load on the router when there is no transfer?)"

The load is 0% when there is no transfer

"The PPS rated for a 2621 is 25,000 based on 64Byte

packet size. That's how they get the 12.80Mbps figure."

Well, I used a program called "Iperf" to test the throughput. When I lower the packet size to 64 bytes, the throughput goes from 5Mbps down to 1Mbps.

Any more thoughts or ideas folks?

Edison Ortiz Sun, 03/09/2008 - 16:21

Ok, clear the counters, run an FTP session, then post the show buffers output from this router.

Thanks

cisco24x7 Sun, 03/09/2008 - 16:45

C2621#sh buffers
Buffer elements:
     1118 in free list (1000 max allowed)
     4698929 hits, 0 misses, 1119 created

Public buffer pools:
Small buffers, 104 bytes (total 84, permanent 50, peak 114 @ 09:38:29):
     73 in free list (20 min, 150 max allowed)
     5951510 hits, 925 misses, 196 trims, 230 created
     131 failures (0 no memory)
Middle buffers, 600 bytes (total 81, permanent 25, peak 104 @ 23:34:47):
     79 in free list (10 min, 150 max allowed)
     768179 hits, 310 misses, 158 trims, 214 created
     168 failures (0 no memory)
Big buffers, 1536 bytes (total 63, permanent 50, peak 63 @ 00:00:31):
     29 in free list (5 min, 150 max allowed)
     2230804 hits, 301 misses, 6 trims, 19 created
     152 failures (0 no memory)
VeryBig buffers, 4520 bytes (total 11, permanent 10, peak 11 @ 23:35:43):
     11 in free list (0 min, 20 max allowed)
     127 hits, 25 misses, 1 trims, 2 created
     25 failures (0 no memory)
Large buffers, 5024 bytes (total 1, permanent 0, peak 1 @ 23:35:43):
     1 in free list (0 min, 10 max allowed)
     1 hits, 24 misses, 8 trims, 9 created
     24 failures (0 no memory)
Huge buffers, 18024 bytes (total 1, permanent 0, peak 1 @ 23:35:43):
     1 in free list (0 min, 4 max allowed)
     0 hits, 24 misses, 8 trims, 9 created
     24 failures (0 no memory)

Interface buffer pools:
CD2430 I/O buffers, 1536 bytes (total 0, permanent 0):
     0 in free list (0 min, 0 max allowed)
     0 hits, 0 fallbacks

Header pools:
Header buffers, 0 bytes (total 137, permanent 128, peak 137 @ 23:35:59):
     9 in free list (10 min, 512 max allowed)
     125 hits, 3 misses, 0 trims, 9 created
     0 failures (0 no memory)
     128 max cache size, 128 in cache
     73 hits in cache, 0 misses in cache

Particle Clones:
     1024 clones, 0 hits, 0 misses

Public particle pools:
F/S buffers, 256 bytes (total 384, permanent 384):
     128 in free list (128 min, 1024 max allowed)
     256 hits, 0 misses, 0 trims, 0 created
     0 failures (0 no memory)
     256 max cache size, 256 in cache
     0 hits in cache, 0 misses in cache
Normal buffers, 1548 bytes (total 512, permanent 512):
     384 in free list (128 min, 1024 max allowed)
     320 hits, 0 misses, 0 trims, 0 created
     0 failures (0 no memory)
     128 max cache size, 128 in cache
     0 hits in cache, 0 misses in cache

Private particle pools:
FastEthernet0/0 buffers, 1548 bytes (total 192, permanent 192):
     0 in free list (0 min, 192 max allowed)
     192 hits, 0 fallbacks
     192 max cache size, 128 in cache
     677145 hits in cache, 0 misses in cache
FastEthernet0/1 buffers, 1548 bytes (total 192, permanent 192):
     0 in free list (0 min, 192 max allowed)
     192 hits, 0 fallbacks
     192 max cache size, 128 in cache
     6021707 hits in cache, 0 misses in cache

C2621#
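
To put the miss counts in the output above into proportion, the sketch below computes a miss rate for the busier public pools. It is a simplification that treats hits plus misses as the total number of buffer requests; the helper name `miss_rate` is made up for illustration, and the numbers are copied from the output above.

```python
# (hits, misses, failures) per public buffer pool, copied from the output above.
pools = {
    "Small":   (5_951_510, 925, 131),
    "Middle":  (768_179, 310, 168),
    "Big":     (2_230_804, 301, 152),
    "VeryBig": (127, 25, 25),
}

def miss_rate(hits, misses):
    """Misses as a fraction of all buffer requests (assumed to be hits + misses)."""
    total = hits + misses
    return misses / total if total else 0.0

for name, (hits, misses, failures) in pools.items():
    print(f"{name:8s} miss rate {miss_rate(hits, misses):.4%}, failures {failures}")
```

For the Small, Middle and Big pools, which carry nearly all of the requests, misses are a small fraction of a percent of the total.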

Edison Ortiz Sun, 03/09/2008 - 17:48

You have some misses but nothing to be concerned about.

I find it interesting that once you changed the packet size from a 1336 Byte average to 64 Bytes, the throughput dropped from 5Mbps to 1Mbps.

The 1Mbps figure is listed as the maximum throughput for a 2621 when using process switching.

You mentioned you have CEF enabled on the interfaces. Can we quickly verify that with the show ip cef output?

Thanks
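
One way to compare the two test results with the ratings quoted earlier in the thread (1.5 Kpps process switching, 25 Kpps fast switching) is to convert each observed throughput back into an implied packet rate. This is a rough sketch: the helper name `implied_pps` is made up, and it assumes the packet sizes are whole on-wire packets, which may not hold for the Iperf 64-byte setting if that refers to payload only.

```python
def implied_pps(throughput_bps, packet_bytes):
    """Packets per second needed to carry a given throughput at a given packet size."""
    return throughput_bps / (packet_bytes * 8)

large = implied_pps(5_000_000, 1336)  # FTP test: about 5 Mbps, 1336-byte average
small = implied_pps(1_000_000, 64)    # Iperf test: about 1 Mbps, 64-byte packets

print(f"FTP test:   about {large:.0f} pps")
print(f"Iperf test: about {small:.0f} pps")
```

Under these assumptions both rates are far below the 25 Kpps fast-switching rating, and the 64-byte result is in the neighborhood of the 1.5 Kpps process-switching rating.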

JosephDoherty Sun, 03/09/2008 - 17:08

David,

The point about broadcasts was their possible impact to the router's performance. The Cisco mention of Spanning Tree concerns a possible source.

Other ideas? Yep, one interesting one, I think.

Can you place your 2621 behind the Checkpoint firewall so that the 2621 doesn't see any broadcasts from the destination network?

[edit]

PS:

There is one possible, very simple explanation: the PPS rating we're using doesn't accurately reflect the performance of the box.

However, I still suspect the high process switching ratio is causing the box to peak its CPU too soon. The issue is to find the cause and, if we identify it, a workaround.

cisco24x7 Sun, 03/09/2008 - 18:17

"Can you place your 2621 behind the Checkpoint firewall so that the 2621 doesn't see any broadcasts from the destination network?"

Yes, I can do that, but I don't think it will help, because Checkpoint ClusterXL itself also uses multicast/broadcast between cluster members to maintain state, via the Cluster Control Protocol (CCP). It is similar to VRRP, but not quite the same. Therefore, the router would see this traffic as well.

Edison,

I just logged off from my test network, but I can assure you that I have CEF enabled as follows:

config t

ip cef

interface F0/0

ip cef

interface F0/1

ip cef

end

wr mem

Regarding the 1Mbps throughput: unless I am mistaken, the smaller the packet size (64 bytes, say), the higher the CPU usage. The larger the packet size (let's say 1300 bytes), the lower the CPU usage on the router, unless you exceed 1460 or 1480 bytes.

Therefore, what I am seeing, 1Mbps throughput at 64 bytes, is normal, isn't it?

JosephDoherty Sun, 03/09/2008 - 18:30

Okay then, how about behind the 7204VXR? (I recall you got 40 Mbps through it; still trying to see whether we can increase speed of 2621 if broadcasts not hitting its destination interface.)

This Discussion

Posted March 8, 2008 at 7:04 AM