BGP on OC3

dong-lee
Level 1

We have a couple of OC3s from different providers. A few days ago, one of our OC3 interfaces started to flap (bounce) due to BGP instability, which eventually affected the other OC3 as well. We talked with our OC3 service providers and were told that it could be a routing process memory issue, since we are receiving both full and default routes from them.

We have two Cisco 7500 routers, each with 256 MB of memory. CPU utilization is only 4% and only 70 MB of memory is in use. How can it be a memory issue when we have plenty of resources on the router?

The strange thing we found was that we could ping our next hop (point-to-point IP) without any problem when the BGP session on the interface was turned off, but we saw heavy packet loss when pinging the same next hop after we started the BGP session.

Also, with BGP on, we saw heavy packet loss on only one side at a time. One of our OC3s (ISP1) had heavy packet loss while the other OC3 (ISP2) was clean (no packets lost), and when ISP1 got better, ISP2 started to lose packets.

Does anyone have any ideas or suggestions?

Thank you in advance.

Dong

11 Replies

donewald
Level 6

Dong,

Just a shot in the dark here, but if you've got distributed CEF on and not enough memory on the line cards, you could be having memory issues on the VIP (assuming your OC3 is PA-A3/VIP based). If this is the case, you might try disabling distributed CEF (no ip cef dist, globally)... There are many things that could potentially be going on here (IOS defect, configuration error, etc.), but this might help out.

Hope it helps,

Don

We've got distributed CEF on. Is 126mb on a VIP2 too small to handle default and full BGP routes?

We are running IOS 12.1(7)E on the routers, and we had a consultant check our router configuration (he didn't find anything wrong with it).

Thanks

126? I will assume you mean 128. If you have a full BGP routing table, the CEF table will be very large. To see how large, you might look at the following:

show proc mem | inc CEF . In this table one of the columns is "Holding"; this is the amount of memory held by the process. Other memory commands will show you whether you are having issues. Are you seeing "MALLOC" messages in your log on your RSP or VIP? Normally, if a process tries to get more memory than is available, a memory allocation message (MALLOC) is sent to your logging buffer.

Disabling dCEF might be a workaround if this (CEF memory consumption) is your issue. It will have a performance impact on packet switching.

Hope this helps,

Don

Don,

First of all, I would like to thank you for taking the time to look into my problem. This website is really helpful to a novice network admin like me.

I meant to say 128mb of RAM on the VIP2. Sorry, it was a typo. :)

We got the following result when we ran the "show proc mem | i CEF" command, and it looks like CEF is holding about 15mb of memory. Is the CEF process holding too much process memory on our router?

Router2#sho proc mem | i CEF

65 0 14854096 4601096 14846564 0 0 CEF process

71 0 284 0 7092 0 0 CEF Scanner

90 0 328 82336984 7136 0 0 CEF IPCBackgrou
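As a rough cross-check of the "about 15mb" figure, the Holding column can be totalled from the output above. This is a minimal Python sketch, assuming the usual column order of show processes memory (PID, TTY, Allocated, Freed, Holding, Getbufs, Retbufs, Process); the function name holding_by_process is hypothetical and only for illustration.

```python
# Column order is an assumption based on typical `show processes memory` output:
# PID TTY Allocated Freed Holding Getbufs Retbufs Process
SAMPLE = """\
Router2#sho proc mem | i CEF
65 0 14854096 4601096 14846564 0 0 CEF process
71 0 284 0 7092 0 0 CEF Scanner
90 0 328 82336984 7136 0 0 CEF IPCBackgrou
"""


def holding_by_process(output):
    """Return {process name: bytes held} for each data row in the output."""
    held = {}
    for line in output.splitlines():
        # Split into at most 8 fields so multi-word process names stay intact.
        fields = line.split(None, 7)
        # Data rows have 8 fields and start with a numeric PID;
        # prompts, headers, and blank lines are skipped.
        if len(fields) < 8 or not fields[0].isdigit() or not fields[4].isdigit():
            continue
        held[fields[7].strip()] = int(fields[4])
    return held


if __name__ == "__main__":
    held = holding_by_process(SAMPLE)
    for name, size in held.items():
        print(f"{name:<20} {size / 2**20:8.2f} MiB")
    print(f"{'total':<20} {sum(held.values()) / 2**20:8.2f} MiB")
```

Note that this output comes from the RSP. The VIP keeps its own copy of the CEF table in its own memory, so the RSP total alone does not show how much headroom the VIP has.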

Yes, we got a "Mallocfail" message, but only once, after the OC3 had flapped more than a couple of times. So we assumed we got this message because the IP routing table was being updated so often that it drove up CPU utilization and process memory usage.

We took your advice and tried to disable distributed CEF, but CPU utilization shot up as soon as we disabled it, so we had to turn it back on.

Is this normal behavior when you disable distributed CEF?

Thank you

Dong

Dong,

Disabling distributed CEF can cause a CPU spike, because the CEF table is removed from the VIP. Essentially, it means that all packets are switched at interrupt level by the RSP rather than being switched locally by the VIP...

CEF holding 15 MB should not be causing your MALLOC conditions. This might be due to your BGP scanner or some other process. Without more information (number of neighbors, router config, etc.) it is hard to say for sure what the issue is.

When you disabled distributed CEF, did you leave it off for a while? CPU should spike initially but, I would think, should begin to come down within a minute or less. Just curious. Also, your MALLOC messages will reference the process that was trying to get the memory. You might send that information to your support personnel, or use it with them, to help you further.

Hope this helps,

Don

We left it for a little more than a minute when we disabled distributed CEF. I guess we should have waited a little longer. :)

These are the error messages we got that day.

Jul 21 10:41:37 bcr2 17: .Jul 21 10:42:11: %FIB-3-FIBDISABLE: Fatal error, slot 1: no memory

Jul 21 10:41:39 bcr2 18: .Jul 21 10:42:13: %IPC-5-SLAVELOG: VIP-SLOT1:

Jul 21 10:41:39 bcr2 19: 00:01:52: %SYS-2-MALLOCFAIL: Memory allocation of 65556 bytes failed from 0x600C5E8C, pool Processor, alignment 16

Jul 21 10:41:39 bcr2 20: -Process= "CEF IPC Background", ipl= 2, pid= 26

Jul 21 10:41:39 bcr2 21: -Traceback= 600C9300 600CAB80 600C5E94 600C67A0 602DECDC 602DF5DC 602DF88C 602C553C 602C91F4 602C98EC 602CF220 602D7828 602D59CC 602D5C3C 602D5DE4 602D63CC

Jul 21 15:24:48 bcr2 34: Jul 21 15:25:21: %FIB-3-FIBDISABLE: Fatal error, slot 1: no memory

Jul 21 15:24:50 bcr2 35: Jul 21 15:25:24: %IPC-5-SLAVELOG: VIP-SLOT1:

Jul 21 15:24:50 bcr2 36: 00:13:01: %SYS-2-MALLOCFAIL: Memory allocation of 2068 bytes failed from 0x602DCFD8, pool Processor, alignment 32

Jul 21 15:24:50 bcr2 37: -Process= "CEF LC Stats", ipl= 0, pid= 27

Jul 21 15:24:50 bcr2 38: -Traceback= 600C9300 600CAF58 602DCFE0 602D6674 600BFCC4 600BFCB0
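Each MALLOCFAIL entry names the pool and, on the following "-Process=" line, the process that asked for the memory. A minimal Python sketch for pulling those fields out of a syslog capture follows; the regular expressions are written against the lines shown above and are an assumption about the general message layout, and malloc_failures is a hypothetical helper name.

```python
import re

# Lines copied from the log above (timestamps and sequence numbers included).
SAMPLE = """\
Jul 21 10:41:39 bcr2 19: 00:01:52: %SYS-2-MALLOCFAIL: Memory allocation of 65556 bytes failed from 0x600C5E8C, pool Processor, alignment 16
Jul 21 10:41:39 bcr2 20: -Process= "CEF IPC Background", ipl= 2, pid= 26
Jul 21 15:24:50 bcr2 36: 00:13:01: %SYS-2-MALLOCFAIL: Memory allocation of 2068 bytes failed from 0x602DCFD8, pool Processor, alignment 32
Jul 21 15:24:50 bcr2 37: -Process= "CEF LC Stats", ipl= 0, pid= 27
"""

MALLOCFAIL = re.compile(
    r"%SYS-2-MALLOCFAIL: Memory allocation of (\d+) bytes failed "
    r"from (0x[0-9A-Fa-f]+), pool ([^,]+)"
)
PROCESS = re.compile(r'-Process= "([^"]+)"')


def malloc_failures(log):
    """Return one dict per MALLOCFAIL, paired with the next -Process= line."""
    failures = []
    for line in log.splitlines():
        fail = MALLOCFAIL.search(line)
        if fail:
            failures.append({
                "bytes": int(fail.group(1)),
                "caller": fail.group(2),
                "pool": fail.group(3).strip(),
                "process": None,  # filled in by the following -Process= line
            })
            continue
        proc = PROCESS.search(line)
        if proc and failures and failures[-1]["process"] is None:
            failures[-1]["process"] = proc.group(1)
    return failures


if __name__ == "__main__":
    for failure in malloc_failures(SAMPLE):
        print(failure)
```

In this log both failures follow a %IPC-5-SLAVELOG: VIP-SLOT1 line, so they were relayed from the VIP in slot 1, and both requesting processes are CEF processes.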

Thank you

Dong

Dong,

CEF is your issue with regard to memory, as you can see from your MALLOCs.

How much of an issue depends on the memory on your RSP, what you're doing with CEF (do you have accounting enabled?), and your configuration. If you are still having problems, I would suggest either reducing the number of routes you allow into this router or upgrading the memory (I would always max out memory on both the RSP and the VIPs if you're doing full BGP).

Using CEF has its drawbacks, and memory consumption is one of the big ones. Disabling CEF might get you running better with regard to memory until you can get this upgrade done. It all really depends on what you're doing (config).

Hope this helps,

Don

I see. We have already put the maximum amount of memory in our routers: 256mb on the RSP4 and 128mb on each VIP2 card.

We have the default CEF settings in the router configs, and I don't think accounting is enabled.

The problem only comes up once every two to three months. I guess disabling distributed CEF is the only option we have. :)

I have one silly question, if you don't mind. :)

Do Cisco and Juniper routers work well with each other when running BGP? Maybe it was a coincidence, but the OC3 interface connected to the Juniper router is always the first one to start flapping.

Thank you

Dong

Cisco and Juniper have had issues in the past, from things I've worked on, but those had more to do with advanced configurations (MP-BGP, etc.). With a basic BGP configuration you should have no issues with this inter-vendor connection.

Regards,

Don

Thanks again for your help. :)

Dong
