I am trying to set up a solution using GRE tunnels between a Cisco 3825 and a Cisco 2811.
This solution will use about 40 tunnels on the same physical interface (100 Mbps), and I would like to get a bandwidth of 2 Mbps per tunnel.
First, I ran a test with only one GRE tunnel between the two Cisco routers and got a really low bandwidth.
Here are the test results on a Cisco 3750 using an IXIA tool.
GRE encapsulation performance: about 1565 fps, corresponding to:
- 0.8 Mbps for a 64-byte frame size
- 1.6 Mbps for a 128-byte frame size
- 3.2 Mbps for a 256-byte frame size
- 6.4 Mbps for a 512-byte frame size
- 12.8 Mbps for a 1024-byte frame size
- 16 Mbps for a 1280-byte frame size
We ran the same bandwidth tests without GRE encapsulation and got the best results the IXIA tool can deliver, i.e.:
- 76 Mbps for a 64-byte frame size
- 86 Mbps for a 128-byte frame size
- 92 Mbps for a 256-byte frame size
- 96 Mbps for a 512-byte frame size
- 98 Mbps for a 1024-byte frame size
- 99 Mbps for a 1280-byte frame size
Is it normal to get such bad results when using GRE encapsulation? Is performance always reduced this much with GRE?
Is there a way to improve these results?
Thanks for your help
Try adding:

int tunnel x
 ip route-cache cef

You can check the switching mode using:

sh ip int tunnel x | inc flags

Such a big difference makes me think the GRE packets are being process switched.
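Putting that together, a minimal tunnel configuration could look like this (the tunnel number, IP addresses and source interface here are only placeholders for your own values):

```
interface Tunnel0
 ip address 10.0.0.1 255.255.255.252
 ip route-cache cef
 tunnel source FastEthernet0/0
 tunnel destination 192.0.2.2
```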
Hope to help
What was the size of the packets you sent? If you sent packets larger than 1476 B, the GRE packet sent on the physical interface will exceed 1500 B because of the (at least) 24 B of GRE/IP overhead.
In that case the router has to fragment it, and fragmentation is a CPU killer.
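As a sketch (assuming the default 1500 B Ethernet MTU and 24 B of GRE/IP overhead), you can lower the IP MTU on the tunnel interface so the router fragments the payload before encapsulation, which is much cheaper than fragmenting the GRE packet afterwards:

```
interface Tunnel0
 ip mtu 1476
```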
"Here it is, the tests' results on a Cisco 3750 using an IXIA tool. "
Could you explain the test topology further?
Non-GRE was 76 Mbps or better on the 2811?
Thanks for all your responses. Here are some details on the tests I made:
I used an IXIA 400T (http://www.ixiacom.com/products/chassis/display?skey=ch_400t) for my tests.
This product determines the available bandwidth on the link under test using a dichotomy (binary search) process: it first sends at 100% of the line rate; if it notices packet loss, it restarts the test at 50% of the line rate; if it then detects no loss, it tries 75% of the line rate, and so on.
Using this process, the tool can calculate an accurate figure for the available bandwidth.
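That dichotomy process is essentially a binary search on the offered rate. A rough sketch in Python of the idea (my own illustration, not IXIA's actual algorithm; the loss callback and resolution are placeholders):

```python
def find_throughput(line_rate, has_loss, resolution=0.5):
    """Binary-search the highest rate (Mbps) with zero packet loss.

    line_rate  -- the link's maximum rate in Mbps (first trial is 100% of it)
    has_loss   -- callback: has_loss(rate) -> True if frames were dropped
    resolution -- stop when the search window is narrower than this (Mbps)
    """
    low, high = 0.0, line_rate
    best = 0.0
    rate = line_rate                 # first trial at 100% of line rate
    while high - low > resolution:
        if has_loss(rate):
            high = rate              # loss seen: search the lower half
        else:
            best = rate              # no loss: this rate is achievable
            low = rate
        rate = (low + high) / 2.0
    return best

# Example: a hypothetical device that drops frames above 16 Mbps
print(find_throughput(100.0, lambda r: r > 16.0))  # → 15.625
```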
The tool ran this test for the following frame sizes (bytes): 64, 128, 256, 512, 1024, 1280.
And I obtained the results you can see above.
The traffic sent is connectionless, i.e. UDP.
I ran this test on a Cisco 3750 to get an idea of the results I could hope for with GRE tunnels, but the solution will be deployed on Cisco 3825 and 2811 routers (currently on order at my company).
Giuseppe, ip route-cache cef was already in place on the Cisco.
But as I said above, the results are very bad and make me reconsider the GRE tunnel solution we have to set up.
I've used IXIA instruments in the past.
I would suggest you run some tests manually, setting a fixed packet rate on interface 1 and checking that you receive all frames.
My only doubt is how the built-in test treats out-of-order packets received on the second port:
I mean that if out-of-order packets are counted, they can influence the test results.
I imagine your lab setup is:
IXIA:port1 --- R1 -- R2 -- IXIA port2
Hope to help
As Laurent noted in his post, a 3750 would not be representative of GRE performance for either the 3825 or the 2811. (A 3750 should forward transit GRE traffic as it would non-GRE traffic.)
Both the 3825 and 2811 should show some performance loss when pushing traffic in/out of a GRE tunnel. It's generally minor on later IOS releases unless GRE fragments the packet. (For TCP traffic, this can be avoided on later IOS releases by spoofing the MSS during TCP setup.)
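The MSS spoofing mentioned above is usually done with the ip tcp adjust-mss interface command. A sketch, assuming a 1500 B physical MTU minus 24 B of GRE/IP overhead and 40 B of TCP/IP headers (adjust the value to your own overhead):

```
interface Tunnel0
 ip tcp adjust-mss 1436
```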
Although 100 Mbps (or gig) Ethernet interfaces can be found on the 2811 or 3825, neither router might deliver 100 Mbps (duplex) performance (especially the 2811), even if non-GRE.
It seems like the test environment must have an issue. You shouldn't see such degradation using GRE. It is effectively IP routing as normal, just with an added IP header plus 4 bytes of GRE header.
It doesn't take the router much extra time at all (compared to straight IP traffic) to strip that extra header.
GRE tunnels are not supported on 3750 switches because they are handled 100% in software, so your results are expected. You will get better results with the ISR routers.
"(Catalyst 3750 or 3560 switches and Cisco EtherSwitch service modules) The switch does not support tunnel interfaces for unicast routed traffic. Only Distance Vector Multicast Routing Protocol (DVMRP) tunnel interfaces are supported for multicast routing. "
>> I am trying to set up a solution using GRE Tunnels between Cisco 3825 and 2811.
OP Julien ran the tests on the C3750 just to validate the instrument and method he's using.
Hope to help
Joseph and Laurent are both right. The performance of a layer-3 switch cannot be compared to router performance. So I feel reassured to get such results; I will receive the 3825 and 2811 routers in the next few weeks, so my tests should be much more meaningful.
I received a document from my Cisco contact which gives some information on the performance we should get with the 3845 router using GRE tunnels (see enclosed file).
Thanks a lot for all your help.
I have another question about GRE performance:
I have to set up several nodes located at different sites. Each node has the same architecture, with a Cisco router and a NETASQ firewall:
the Cisco sets up GRE tunnels with its neighbours, and the NETASQ sets up IPsec tunnels with the same neighbours.
So traffic from a node A that wants to talk to a neighbouring node will have to be encapsulated twice (GRE + IPsec) and decapsulated twice.
Here is the problem :
Traffic from one node can transit through 2 intermediate nodes (3 hops) to reach the destination node, with as many decapsulations and re-encapsulations as that implies.
So there will be 3 GRE encapsulations + 3 IPsec encapsulations + 3 GRE decapsulations + 3 IPsec decapsulations.
Is there any significant impact on jitter and latency?
Could this process impact real-time traffic such as voice or video?
Anyway, I am going to try to evaluate this in the next few weeks.
But could someone tell me whether this process could really affect the quality of service of real-time traffic?
I just want to be sure you're aware that, before sharing any information Cisco gives you, you should check that the information is not covered by an NDA between you and Cisco.
Regarding your question, it will impact delay and jitter since you add more processing steps in the path, but it's difficult to say whether the impact will be acceptable or not. Wherever you can, prioritize your voice traffic.
There will be some impact every time a packet is encapsulated or decapsulated.
Real-time traffic for voice and video has a certain amount of tolerance, a time budget. If this isn't exceeded, even though the traffic has been impacted, it won't be noticeable to the users of the application.
The biggest impact used to be IPsec encryption/decryption, but both the 3825 and 2811 (I believe) have hardware encryption modules that minimize the impact. (They both, I also believe, can accept optional hardware encryption modules for additional performance.) Also note that the impact depends on the encryption algorithm being used.
I don't have the numbers to predict your on-order ISRs' performance, but in general, to guarantee real-time performance you'll want to assure timely processing for your real-time traffic. You may need to "size" the ISRs much larger to ensure an ample cushion of reserve performance (which guarantees the necessary performance for less traffic).
For instance, if you find the 2811 tops out at 20 Mbps in your test, you may also find you shouldn't exceed 5 Mbps to ensure voice quality. (Again, these numbers probably bear no relationship to how each ISR actually performs.)
One last note: normally you try to configure tunnels end-to-end. Tunnel traffic is forwarded across transit nodes much like any other traffic; only the end points normally need to perform encapsulation/decapsulation. It's unclear why you might need to encapsulate/decapsulate on multiple nodes; you might investigate whether that's really required.