Netflow table on CAT6500 SUP720

Oct 17th, 2007

Hi,

We have a rather strange problem on a SUP720. NetFlow statistics are configured as follows:

mls flow ip interface-destination-source

mls nde sender version 5

mls sampling packet-based 1024 8192

interface TenGigabitEthernet7/1

ip nat outside

mls netflow sampling

ip flow-export source Loopback1

ip flow-export version 5 origin-as

ip flow-export destination SERVER1 PORT1

The NetFlow table utilisation reaches 100%. I run the command "sh mls netflow table-contention summary" about once per second (actually as fast as the previous command returns) and see that after 8-10 seconds the NetFlow table is full, and then it is emptied:

########################################

edge1#sh mls netflow table-contention summary

Earl in Module 5

Summary of Netflow CAM Utilization (as a percentage)

====================================================

TCAM Utilization : 95%

ICAM Utilization : 0%

Netflow Creation Failures : 155991

Netflow CAM aliases : 0

edge1#sh mls netflow table-contention summary

Earl in Module 5

Summary of Netflow CAM Utilization (as a percentage)

====================================================

TCAM Utilization : 39%

ICAM Utilization : 0%

Netflow Creation Failures : 112038

Netflow CAM aliases : 0

edge1#sh mls netflow table-contention summary

Earl in Module 5

Summary of Netflow CAM Utilization (as a percentage)

====================================================

TCAM Utilization : 55%

ICAM Utilization : 0%

Netflow Creation Failures : 0

Netflow CAM aliases : 0

edge1#sh mls netflow table-contention summary

Earl in Module 5

Summary of Netflow CAM Utilization (as a percentage)

====================================================

TCAM Utilization : 71%

ICAM Utilization : 0%

Netflow Creation Failures : 0

Netflow CAM aliases : 0

edge1#sh mls netflow table-contention summary

Earl in Module 5

Summary of Netflow CAM Utilization (as a percentage)

====================================================

TCAM Utilization : 83%

ICAM Utilization : 0%

Netflow Creation Failures : 0

Netflow CAM aliases : 0

edge1#sh mls netflow table-contention summary

Earl in Module 5

Summary of Netflow CAM Utilization (as a percentage)

====================================================

TCAM Utilization : 99%

ICAM Utilization : 0%

Netflow Creation Failures : 55

Netflow CAM aliases : 0

edge1#sh mls netflow table-contention summary

Earl in Module 5

Summary of Netflow CAM Utilization (as a percentage)

====================================================

TCAM Utilization : 100%

ICAM Utilization : 0%

Netflow Creation Failures : 80510

Netflow CAM aliases : 0

edge1#sh mls netflow table-contention summary

Earl in Module 5

Summary of Netflow CAM Utilization (as a percentage)

====================================================

TCAM Utilization : 40%

ICAM Utilization : 0%

Netflow Creation Failures : 0

Netflow CAM aliases : 0

########################################
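To follow the fill-and-flush cycle without reading every output by eye, the summaries can be parsed with a short script. The following is only a sketch in Python: the field names are taken from the output above, while the function names and the 30-point drop threshold used to flag a flush are illustrative assumptions, not anything defined by IOS.

```python
import re

# Two of the outputs shown above, without the blank lines.
SAMPLE = """\
edge1#sh mls netflow table-contention summary
Earl in Module 5
Summary of Netflow CAM Utilization (as a percentage)
====================================================
TCAM Utilization : 100%
ICAM Utilization : 0%
Netflow Creation Failures : 80510
Netflow CAM aliases : 0
edge1#sh mls netflow table-contention summary
Earl in Module 5
Summary of Netflow CAM Utilization (as a percentage)
====================================================
TCAM Utilization : 40%
ICAM Utilization : 0%
Netflow Creation Failures : 0
Netflow CAM aliases : 0
"""

# Matches "<field name> : <number>" with an optional trailing percent sign.
FIELD = re.compile(
    r"^\s*(TCAM Utilization|ICAM Utilization|"
    r"Netflow Creation Failures|Netflow CAM aliases)\s*:\s*(\d+)%?\s*$"
)


def parse_summaries(text):
    """Return one dict of counters per command output found in text."""
    samples = []
    current = None
    for line in text.splitlines():
        if "table-contention summary" in line:
            # A new prompt line starts a new sample.
            current = {}
            samples.append(current)
            continue
        match = FIELD.match(line)
        if match and current is not None:
            current[match.group(1)] = int(match.group(2))
    return samples


def find_flushes(samples, drop=30):
    """Indices of samples where TCAM utilization fell by at least drop points.

    The default of 30 points is an arbitrary illustrative threshold.
    """
    return [
        i
        for i in range(1, len(samples))
        if samples[i - 1]["TCAM Utilization"] - samples[i]["TCAM Utilization"] >= drop
    ]


if __name__ == "__main__":
    parsed = parse_summaries(SAMPLE)
    for sample in parsed:
        print(sample)
    print("flush detected at sample index:", find_flushes(parsed))
```

Feeding it the complete capture would make it easy to see how many seconds pass between two flushes and how many creation failures pile up in between.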

I tried very small values for the aging and fast aging parameters but saw no difference. According to the documentation, the SUP720 can handle up to 115K NetFlow entries. I don't think we are getting 115K NetFlow entries in 10 seconds: we currently have ~300 Mbit/s of traffic, most of it goes to the load balancers, and they report ~5000 connections per second.

And I don't understand why the NetFlow table is emptied after a few seconds. (As you can see, after utilisation reaches 100% it drops to something less than 100% in the next output and then rises again.)
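A rough back-of-the-envelope check of these numbers, as a Python sketch. The 115K capacity and the ~5000 connections per second come from this post; the number of table entries created per connection is an assumption (1 or 2 are tried here). With the interface-destination-source flow mask, entries are keyed on address pairs rather than on individual connections, so the real creation rate could be lower than this estimate.

```python
def seconds_to_fill(capacity_entries, new_entries_per_second, start_fraction=0.0):
    """Time until the table is full if nothing is aged out in the meantime.

    start_fraction is how full the table already is (0.0 to 1.0).
    """
    if new_entries_per_second <= 0:
        raise ValueError("new_entries_per_second must be positive")
    free_entries = capacity_entries * (1.0 - start_fraction)
    return free_entries / new_entries_per_second


CAPACITY = 115_000              # entries, figure quoted from the documentation
CONNECTIONS_PER_SECOND = 5_000  # reported by the load balancers

if __name__ == "__main__":
    # Assumption: each connection creates 1 or 2 entries (one per direction).
    for entries_per_connection in (1, 2):
        rate = CONNECTIONS_PER_SECOND * entries_per_connection
        from_empty = seconds_to_fill(CAPACITY, rate)
        from_40 = seconds_to_fill(CAPACITY, rate, start_fraction=0.40)
        print(
            f"{entries_per_connection} entry/connection: "
            f"{from_empty:.1f} s from empty, {from_40:.1f} s from 40% full"
        )
```

Under the two-entries assumption an empty table would fill in about 11.5 seconds, which is the same order of magnitude as the 8-10 second cycle seen in the outputs, so the fill rate may be less implausible than it first looks.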

Jan Nejman Wed, 10/17/2007 - 07:15

Hello, did you set mls aging or ip flow-cache timeouts? Try configuring mls aging instead of flow-cache timeouts... I prefer fast aging of 16 sec with a threshold of 30 packets, normal aging of 64 sec and long aging of 120 sec.
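To get a feeling for what these timers mean for table occupancy, here is a minimal Python sketch. It assumes a deliberately simplified model in which every entry stays in the table for exactly one aging period after it is created, so the steady-state occupancy is the creation rate multiplied by the lifetime. The real timers behave differently (fast aging, for example, only applies to flows below the packet threshold), so treat the results as rough orders of magnitude. The 115K capacity and the 5000 flows per second are the figures from the original question; the function names are made up for illustration.

```python
def steady_state_entries(new_flows_per_second, lifetime_seconds):
    """Entries held at steady state if each entry lives lifetime_seconds."""
    return new_flows_per_second * lifetime_seconds


def max_sustainable_rate(capacity_entries, lifetime_seconds):
    """Highest creation rate that keeps occupancy at or below capacity."""
    return capacity_entries / lifetime_seconds


CAPACITY = 115_000  # entries, figure from the original question
FLOW_RATE = 5_000   # new flows per second, assumed equal to the connection rate

if __name__ == "__main__":
    for name, seconds in (("fast", 16), ("normal", 64), ("long", 120)):
        needed = steady_state_entries(FLOW_RATE, seconds)
        limit = max_sustainable_rate(CAPACITY, seconds)
        print(
            f"{name} aging {seconds} s: about {needed} entries needed, "
            f"table sustains about {limit:.0f} new flows/s"
        )
```

In this model only flows that qualify for fast aging would fit in the table at that rate, which is why the share of short flows matters so much.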

Best regards

Jan Nejman

Caligare, co.

http://www.caligare.com

Konstantin Dunaev Wed, 10/17/2007 - 08:10

hi,

Yes, I tried "mls aging" and reduced all 3 types (normal, fast and long) without any improvement. I'm starting to think that maybe NetFlow has to be restarted somehow in order to activate the new parameters?

Jan Nejman Wed, 10/17/2007 - 10:50

Which SUP720 do you have?

From Cisco data sheets:

WS-SUP720: 128k netflow entries

WS-SUP720-3B: 128k netflow entries

WS-SUP720-3BXL: 256k netflow entries

So if you don't have a 3BXL, one solution is to upgrade... but in our company we have the 3BXL and the NetFlow tables still overflow... Do you account for bridged VLAN traffic (ip flow ingress layer2-switched vlan ...)? If yes, try disabling this feature. Please let me know which PFC version you are using.

Kind regards,

Jan Nejman

Caligare, Co.

http://www.caligare.com/

Konstantin Dunaev Fri, 10/19/2007 - 00:55

Hi,

It's a SUP720-3B with PFC3/MSFC3.

We don't use L2 accounting (at least I didn't configure it explicitly).

Jan Nejman Fri, 10/19/2007 - 01:27

It seems that you have only one solution: upgrade to the SUP720-3BXL, which has memory to store 256k flows. I saw that you have configured sampling. Sampled NetFlow exports data for a subset of traffic in a flow, which can greatly reduce the volume of statistics exported. Sampled NetFlow does not reduce the volume of statistics collected! Sampled NetFlow can help only if you have a slow collector, not with an overloaded NetFlow TCAM.

Kind regards

Jan Nejman

Caligare, Co.

http://www.caligare.com/

Konstantin Dunaev Fri, 10/19/2007 - 01:38

hi,

Yes, sampled NetFlow is used only to reduce the load on the accounting server.

I don't think an upgrade is an option for us because, as far as I can see, NetFlow needs much more than 115K entries, and we currently carry only ~20% of the traffic we will eventually have (we're migrating the DC to the new location). I mean the SUP720-3BXL will not help us either.

Jan Nejman Fri, 10/19/2007 - 01:52

Yes, I know what you are talking about. I hope that Cisco will develop new hardware with a bigger flow cache (for 1M flows), but the question is when that will be...

Jan

PS: Did you notice that the TCP flags in the NetFlow export are not valid on the SUP720? Cisco said that TCP flags are not supported for this supervisor at the moment ;-(

Konstantin Dunaev Fri, 10/19/2007 - 05:16

hi,

What is the difference between the command

"ip route-cache flow"

and

"mls netflow"?

According to the documentation, "ip route-cache flow" exports the data processed in software by the MSFC (which packets are those?). I don't use this command on my interface.

"mls netflow" exports the data processed in hardware by the PFC (all packets that are processed by CEF?).

What happens if I disable "mls netflow" and configure only "ip route-cache flow"? Which packets will be exported? The only (big) problem is that "ip route-cache flow" doesn't support sampled NetFlow.

Jan Nejman Fri, 10/19/2007 - 05:26

Hi.

I think that you will see only the first packet of the flow. The first received packet goes to the routing process, which creates a flow. Subsequent packets are switched in the PFC. I think that 'ip route-cache' or 'ip flow ingress' is required for NetFlow collection. If you don't enable it and configure only 'mls netflow', no NetFlow statistics will be exported.

You can also try lowering your fast aging timer.

Try the following:

mls aging fast time 5 threshold 3

Kind regards,

Jan Nejman

Caligare, Co.

http://www.caligare.com/

Konstantin Dunaev Fri, 10/19/2007 - 05:41

Hi.

>I think, that you will see only the first

>packet of the flow. The first received

>packet goes to the routing process and it

>creates a flow. Next packets are switched

>in PFC.

But doesn't the SUP720 use CEF for switching?

I asked because "ip flow-cache entries" can be configured up to 500K, and I don't really understand why the table that holds only first packets is so large and flexible while the table for MLS switching is so small and fixed. They should be at least the same size.

>I think that 'ip route-cache' or 'ip flow

>ingress' are required commands for netflow

It works without 'ip route-cache'; you only need to configure the "ip flow-export" commands.

>mls aging fast time 5 threshold 3

nothing helps :)

Jan Nejman Fri, 10/19/2007 - 06:05

I tried to find some text about what the "flow-cache entries" command does; see below.

But I'm not sure what the purpose of this command is. It seems to be used for the "72xx and 75xx series", but is it also used for the Catalyst 6500 or 7600? Maybe someone from Cisco can help...

Jan

***

After you enable NetFlow on an interface, NetFlow reserves memory to accommodate a number of entries in the NetFlow cache. Normally the size of the NetFlow cache meets the needs of your NetFlow traffic rates. The cache default size is 64K flow cache entries. Each cache entry requires 64 bytes of storage. About 4 MB of DRAM are required for a cache with the default number of entries. You can increase or decrease the number of entries maintained in the cache, if required. For environments with a large amount of flow traffic (such as an internet core router), we recommend a larger value such as 131072 (128K). To obtain information on your flow traffic, use the show ip cache flow command.

A NetFlow cache can be resized depending on the platform and the amount of DRAM on a line card. For example, the NetFlow cache size is configurable for software-based platforms such as Cisco 75xx and 72xx series routers. The amount of memory on a Cisco 12000 line card determines how many flows are possible in the cache.

Using the ip flow-cache entries command, you can configure the size of your NetFlow cache between 1024 entries and 524,288 entries. Using the cache entries command (after you configure NetFlow aggregation), you can configure the size of the NetFlow aggregation cache from 1024 entries to 524,288 entries.

Caution: We recommend that you not change the values for NetFlow cache entries. Improper use of this feature could cause network problems. To return to the default value for NetFlow cache entries, use the no ip flow-cache entries global configuration command.
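The memory figures in the quoted text can be checked with a little arithmetic. This Python sketch uses only the 64 bytes per entry and the entry counts mentioned in the quote; it ignores any overhead beyond the entries themselves, and the labels are just illustrative.

```python
BYTES_PER_ENTRY = 64  # per the quoted documentation


def cache_bytes(entries):
    """DRAM needed for the cache entries alone, in bytes."""
    return entries * BYTES_PER_ENTRY


# Entry counts mentioned in the quoted documentation.
SIZES = [
    ("minimum", 1024),
    ("default (64K)", 64 * 1024),
    ("suggested for high flow rates (128K)", 131072),
    ("maximum", 524288),
]

if __name__ == "__main__":
    for label, entries in SIZES:
        mib = cache_bytes(entries) / 2**20
        print(f"{label}: {entries} entries -> {mib:.2f} MiB")
```

The default of 64K entries comes out at 4 MiB, matching the "about 4 MB" in the text, and the configurable maximum of 524,288 entries would need 32 MiB for the entries alone.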
