65xx line cards - proactive health checks?

Sep 29th, 2008

Hi Folks,

Besides the occasional software bugs we are challenged with from time to time, our biggest source of downtime is failed line cards, especially the WS-X6248-RJ45 and WS-X6348-RJ45 (not always a full hardware failure; sometimes a soft reset or reinsertion will recover the card).

There are a multitude of alerts we can receive once a card goes into a failed state (reactive), but I was wondering if anyone has developed a means of proactively checking a line card's health. The aim is to have advance warning and perform a scheduled swap-out, thus mitigating the impact.

I was thinking along the lines of checking the SCP counters and monitoring the number of retries. An increasing number of SCP retries is indicative of a pending problem. This check would need to be scripted, etc. (sh scp module <#>)
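
A rough sketch of the trend check described above, in Python. It assumes the cumulative SCP retry count for each module has already been collected on every polling run; how the command output is gathered and parsed is deliberately left out, since its layout is not shown here. The function name `rising_modules` and both thresholds are hypothetical placeholders, not values recommended by Cisco.

```python
from typing import Dict, List


def rising_modules(
    history: Dict[int, List[int]],
    min_increase: int = 5,
    min_rising_intervals: int = 3,
) -> List[int]:
    """Return the module numbers whose retry counter keeps climbing.

    history maps a module number to its cumulative retry counts,
    oldest sample first (one value per polling run). Thresholds are
    illustrative assumptions and would need tuning per site.
    """
    flagged = []
    for module, counts in sorted(history.items()):
        # Change in the counter between consecutive polls.
        deltas = [later - earlier for earlier, later in zip(counts, counts[1:])]
        # A negative delta suggests the counter was reset (e.g. module
        # reload), so only positive deltas count toward the trend.
        rising = sum(1 for d in deltas if d > 0)
        total = sum(d for d in deltas if d > 0)
        if rising >= min_rising_intervals and total >= min_increase:
            flagged.append(module)
    return flagged


if __name__ == "__main__":
    # Made-up sample data: four polls for three modules.
    samples = {1: [0, 0, 0, 0], 3: [2, 4, 9, 15], 5: [10, 10, 11, 11]}
    print(rising_modules(samples))
```

With the made-up samples above, only module 3 is flagged, because its counter rose on every interval; a single isolated retry (module 5) is ignored.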

Others may be measuring the asicreg counters, but a lot of the time these are engineering commands that only the Cisco TAC can interpret.

I was wondering if anyone has BKMs we could apply to monitor a line card's health?

Thanks

Pat

lamav Mon, 09/29/2008 - 12:45

Patrick:

I've never used them or seen them deployed, but I have read a lot about GOLD and Smart Call Home.

These are two Cisco utilities that run on platforms with modular IOS.

GOLD stands for Generic Online Diagnostics.

Read about it on Cisco's website.

HTH

Victor
