65xx line cards - proactive health checks?

patrick.guerin
Level 1

Hi Folks,

Besides the occasional s/w bugs we are challenged with from time to time, the biggest cause of downtime we encounter is failed line cards, especially ws-x6248-rj45 and ws-x6348-rj45 (not always a full HW failure; sometimes a soft reset or reinsertion will recover the card).

There are a multitude of alerts we can receive once a card goes into a failed state (reactive), but I was wondering if anyone has developed a means of proactively checking a line card's health. The aim is to get advance warning and perform a scheduled swap-out, thus mitigating the impact.

I was thinking along the lines of checking the SCP counters and monitoring the number of retries. A rising number of SCP retries is indicative of a pending problem. This check would need to be scripted (sh scp module <#>).
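A minimal Python sketch of the retry-trend check described above, offered as one possible approach rather than a tested tool. It assumes the captured CLI text contains a counter written like "retries: 12"; the real layout of the command output may differ, so the pattern would need adjusting. The names `parse_retries` and `rising_modules` are hypothetical, and how the output is collected from the switch is left out.

```python
import re

# Assumption: the captured CLI text contains a counter written like
# "retries: 12" or "retries = 12"; the real output layout may differ.
RETRY_PATTERN = re.compile(r"retries\s*[:=]?\s*(\d+)", re.IGNORECASE)


def parse_retries(cli_output):
    """Return the sum of all retry counters found in the captured text."""
    return sum(int(n) for n in RETRY_PATTERN.findall(cli_output))


def rising_modules(previous, current, threshold=1):
    """Return modules whose retry count grew by at least `threshold`.

    `previous` and `current` map module number -> retry count from two polls.
    A module missing from `previous` is treated as a new baseline, not an alert.
    A lower count than before (e.g. after a card reset) is not flagged.
    """
    flagged = {}
    for module, count in current.items():
        if module not in previous:
            continue
        delta = count - previous[module]
        if delta >= threshold:
            flagged[module] = delta
    return flagged
```

Run on a schedule, each poll would store the per-module counts and compare them with the previous poll; any module returned by `rising_modules` becomes a candidate for a planned swap-out.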

Others may be measuring the asicreg counters, but these are often engineering commands that only the Cisco TAC can interpret.

I was wondering if anyone has BKMs we could apply to monitor a line card's health?

Thanks

Pat

1 Reply

lamav
Level 8

Patrick:

I've never used them or seen them deployed, but I have read a lot about GOLD and Smart Call Home.

These are two Cisco utilities that run on platforms with modular IOS.

GOLD stands for Generic Online Diagnostics.

Read about it on Cisco's website.

HTH

Victor
