65xx line cards - proactive health checks?

Sep 29th, 2008

Hi Folks,


Besides the occasional s/w bugs we are challenged with from time to time, the biggest cause of downtime we encounter is failed line cards, especially ws-x6248-rj45 and ws-x6348-rj45. These are not always full HW failures; sometimes a soft reset or reinsertion will recover the card.


There are a multitude of alerts we can receive once a card goes into a failed state (reactive), but I was wondering if anyone has developed a means of proactively checking a line card's health. The aim is to have advance warning and perform a scheduled swap-out, thus mitigating the impact.


I was thinking along the lines of checking the SCP counters and monitoring the number of retries, since an increasing number of SCP retries is indicative of a pending problem. This check would need to be scripted (sh scp module <#>).
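
As a rough illustration of the scripted check described above, here is a minimal Python sketch of the trend test only. It assumes the retry counters have already been collected per module at regular intervals; how they are pulled out of the `sh scp module <#>` output is not shown, because that output format is not given here. The function name `rising_retry_modules` and the `min_increase` threshold are hypothetical and chosen only for this example.

```python
from typing import Dict, List


def rising_retry_modules(samples: List[Dict[int, int]],
                         min_increase: int = 1) -> List[int]:
    """Return module numbers whose retry count rose by at least
    min_increase between the oldest and the newest sample.

    Each sample maps module number -> retry counter, as collected by
    some polling job (hypothetical; the collection step is not shown).
    """
    if len(samples) < 2:
        # A trend needs at least two polls.
        return []
    first, last = samples[0], samples[-1]
    flagged = []
    for module, count in sorted(last.items()):
        # Skip modules that were not present in the oldest sample.
        if module in first and count - first[module] >= min_increase:
            flagged.append(module)
    return flagged


if __name__ == "__main__":
    # Made-up counters for modules 3 and 4 over three polls.
    history = [{3: 10, 4: 0}, {3: 14, 4: 0}, {3: 25, 4: 0}]
    print(rising_retry_modules(history, min_increase=5))
```

With the made-up numbers above, only module 3 would be flagged. A counter that goes backwards (for example after a card reset) gives a negative delta and is not flagged, so a real check would probably want to handle resets separately.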

Others may be measuring the asicreg counters, but these are mostly engineering commands that only the Cisco TAC can interpret.


Does anyone have BKMs (best known methods) we could apply to monitor a line card's health?


Thanks

Pat



lamav Mon, 09/29/2008 - 12:45

Patrick:


I've never used them or seen them deployed, but I have read a lot about GOLD and Smart Call Home.


These are two Cisco utilities that run on platforms with modular IOS.


GOLD stands for Generic Online Diagnostics.


Read about it on Cisco's website.


HTH


Victor
