10mbps throughput on 1gbps GRE...huh?

Feb 8th, 2010

WAN connection is via 1gbps connections to a 20+ gbps (yes, 20 gig) layer 3 network; latency point to point is less than 10ms. GRE performance over the entire setup is measuring around 10mbps. From memory, measured performance to the test system used to be 6-10 times the current 10mbps.


I can find nothing wrong. The devices outside the layer 3 network (both ends of the tunnel) are well below 50% memory and CPU, and all interfaces are below 25% utilization.


What else can I check? Where else can I look?


Keep in mind the connection does work....just really low performance for the hardware used, and I can't see where the limit on the performance is.




Basic diagram via ASCII


Client <-> 3000 or higher Cisco switch that is a GRE end point <-> really fast layer 3 network <-> another 3000 or higher Cisco switch that is the other GRE end point <-> Speed measurement server

brent.lund Mon, 02/08/2010 - 18:46

Forgot to add.....already doing "IP TCP ADJUST-MSS" in order to resolve those issues. And, just for fun, reducing that size resulted in no improvement.
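For reference, the arithmetic behind the usual adjust-mss value on a GRE tunnel can be sketched roughly as below. This is only a sketch under stated assumptions: a 1500-byte physical MTU, a basic 4-byte GRE header (no key, checksum, or sequence number), and 20-byte IP and TCP headers without options. The function names are made up for illustration and are not anything from IOS.

```python
# Assumed header sizes in bytes (no IP/TCP options, no optional GRE fields).
OUTER_IP_HEADER = 20
GRE_HEADER = 4
INNER_IP_HEADER = 20
TCP_HEADER = 20


def gre_tunnel_mtu(physical_mtu: int = 1500) -> int:
    """Largest inner IP packet that fits in one GRE-encapsulated frame."""
    return physical_mtu - OUTER_IP_HEADER - GRE_HEADER


def max_tcp_mss(tunnel_mtu: int) -> int:
    """Largest TCP payload that fits in an inner packet of tunnel_mtu bytes."""
    return tunnel_mtu - INNER_IP_HEADER - TCP_HEADER


if __name__ == "__main__":
    mtu = gre_tunnel_mtu(1500)
    print("tunnel IP MTU:", mtu)
    print("max TCP MSS:", max_tcp_mss(mtu))
```

Under those assumptions the tunnel MTU comes out to 1476 bytes and the MSS to 1436 bytes. Going lower than that only shrinks segments further, which fits the observation that reducing the size did not help.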

Richard Burts Mon, 02/08/2010 - 21:37

Brent


If the GRE throughput was 10mbps and before GRE it was 10 times that rate, then it was doing 100 mbps on a 1 gbps connection, so it sounds like throughput was impaired before. You do not provide much detail on the connections, but I wonder if it is possible that there is something like a duplex mismatch between end device and switch.


My other question would be about the end points of the GRE. You seem to indicate that the GRE terminates on some switch, described only as 3000 or higher. So we cannot know whether the switch supports GRE in hardware or does it in software (processed by CPU = process switched = slow).


HTH


Rick

brent.lund Tue, 02/09/2010 - 08:00

Thanks for the reply,

Sorry for the confusion.


The connection measured 100mb upon implementing GRE; there was no solution in place prior. Speed has DROPPED, hence my thinking there is something wrong.


Speed measured 100mb on a 1gb system, after overhead, latency, limitations of the speed testing system...and so on. We all know a 1gb system will never measure at 1.0gb.

If speed measured 100mb, I'd be happy!


One GRE tunnel runs from a 6500 endpoint to a 4500 Sup V endpoint.

Correct, this functionality is implemented in software.



I will check speed/duplex, but it should all be auto/auto on Cisco-to-Cisco 1gb connections. I thought those issues usually showed up in the logs... Good idea to check though!




MY QUESTION IS.....

If it is implemented in software and that is the limit, shouldn't I see fairly high CPU utilization, since in that case the CPU would be the limit? Or is there another limit?

Right now a "Sh proc cpu hist" shows the CPU on my 4500 at 20% utilized....I don't believe that this issue is due to the software implementation of GRE on that product line. Or am I missing something important?




    2222211111222222222222222111112222211111222222222222222111
    1111199999555550000011111999991111199999222224444422222999
100
90
80
70
60
50
40
30           *****
20 **********************************************************
10 **********************************************************
   0....5....1....1....2....2....3....3....4....4....5....5....
             0    5    0    5    0    5    0    5    0    5
               CPU% per second (last 60 seconds)


    6663222222223233333333444533322232222532423533433422223222222222233333
    3328897788860704122124001316099716688279283103933197870757798698800015
100
90
80
70
60 ***
50 ***                      *           *     *  *
40 ****                  **** *         ** *  *  *  *                   *
30 **********************************************************************
20 ######################################################################
10 ######################################################################
   0....5....1....1....2....2....3....3....4....4....5....5....6....6....7.
             0    5    0    5    0    5    0    5    0    5    0    5    0
                   CPU% per hour (last 72 hours)
                  * = maximum CPU%   # = average CPU%
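
For anyone reading the graphs above: the digit rows on top of each graph are read vertically, one column per time slot (so a "2" over a "1" means 21%). A rough sketch of decoding them, assuming the rows have been copied out with their common leading indentation removed; the helper name is hypothetical and not part of any Cisco tool:

```python
def decode_cpu_history(digit_rows):
    """Turn vertically stacked digit rows into one integer per column.

    digit_rows: strings, top row first, with any common leading
    indentation already stripped (an assumption of this sketch).
    """
    width = max(len(row) for row in digit_rows)
    # Pad short rows so every column has one character per row.
    padded = [row.ljust(width) for row in digit_rows]
    values = []
    for column in zip(*padded):
        digits = "".join(column).replace(" ", "")
        values.append(int(digits) if digits else 0)
    return values


if __name__ == "__main__":
    # First ten columns of the per-second graph above.
    per_second = decode_cpu_history(["2222211111", "1111199999"])
    print(per_second)
    print("peak:", max(per_second))
```

Decoded this way, the per-second rows sit at roughly 19-25%, and the per-hour rows (the "*" maximums) peak in the low 60s.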

Giuseppe Larosa Mon, 02/08/2010 - 23:19

Hello Brent,

>> 3000 or higher cisco switch that is a GRE end point

Multilayer switches perform very poorly with GRE because they do it in software.

The only exception should be the C6500 with Sup720.

Software-based ISR routers do not have very high performance, but they do all their forwarding in software, so they don't suffer the dramatic performance penalty with GRE tunnels that multilayer switches do.


Hope to help

Giuseppe

brent.lund Tue, 02/09/2010 - 08:03

Thanks for the reply,



As in my response to the other poster, I understand that it's a software implementation, and possibly not supported.


BUT....


If it is a limit of being done in software, should I not see HIGH CPU utilization?  At 20%, I don't believe that is the issue.


What can I do to check the health of a GRE tunnel?

paolo bevilacqua Tue, 02/09/2010 - 08:14

Giuseppe is correct: you should not enable GRE on switches.


The CPU utilization is not relevant; some internal mechanism may prevent GRE from monopolizing the CPU.

brent.lund Tue, 02/09/2010 - 08:16

I've seen the note about it being unsupported on 3550s; however, I didn't think that was the case on a 4500?

paolo bevilacqua Tue, 02/09/2010 - 08:18

The point is not whether it is supported. The point is that anything that runs in software on switches will give poor performance, because they use slow CPUs.

brent.lund Tue, 02/09/2010 - 08:38

Thanks for the input.



I fully agree that a software implementation will often be a bottleneck.


However, I'm too much of a geek to accept the thought that there is some magical-random-hidden-mystical force that makes it slow.


If the CPU or RAM were at high utilization, I'd say fine, it's a limit of the device.


If there is some other choke point, there must be a way to "see" or monitor that point.



I can accept that it's a limit.....I can't accept not being able to "see" that limit.



Thanks again!

paolo bevilacqua Tue, 02/09/2010 - 08:43

As mentioned above, there could be some scheduling mechanism to prevent a single process from starving others. That is a good thing.


The bottom line doesn't change: use routers for tunnelling (and many other things).
