MCS I4 Servers getting stuck

Sep 14th, 2010

Hi all,


During the last month, we've found a problem in some CM servers...


What happens is that the customer reports that something is wrong with one of the CM servers in the cluster, and we then verify that:

- there's no web access (sometimes, when we browse to the CM IP address, instead of the usual web page we get one showing just the word "Platform"; if we follow that link, we get a Tomcat error: HTTP Status 404 - /iptplatform - Apache Tomcat/5.5.28)

- there's no SSH access

- using a keyboard and monitor, we see nothing displayed

- it replies to ping requests (this is the only thing that works)

- telephones register to the next CM server in the CM list

- SIP traffic sent to this server is lost... (it seems that none of the services is up and running)
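
The symptom pattern above (ping replies, but no SSH and no web access) can be turned into a rough remote check. This is only a minimal Python sketch, not a Cisco tool. The function names, the port numbers (22 for SSH, 443 for the web interface) and the status labels are assumptions made for illustration. The ping result is passed in by the operator rather than measured here.

```python
import socket


def tcp_port_open(host: str, port: int, timeout: float = 3.0) -> bool:
    """Return True if a TCP connection to host:port succeeds within timeout."""
    try:
        with socket.create_connection((host, port), timeout=timeout):
            return True
    except OSError:
        # Covers refused connections, timeouts and name-resolution failures.
        return False


def classify(ping_ok: bool, ssh_ok: bool, web_ok: bool) -> str:
    """Map the three observations to a coarse, hypothetical status label."""
    if not ping_ok:
        return "unreachable"
    if not ssh_ok and not web_ok:
        # Matches the pattern in this thread: only ping still works.
        return "suspect-hung"
    if ssh_ok and web_ok:
        return "healthy"
    return "degraded"


def check_server(host: str, ping_ok: bool,
                 ssh_port: int = 22, web_port: int = 443) -> str:
    """Probe the assumed SSH and web ports, then classify the server."""
    ssh_ok = tcp_port_open(host, ssh_port)
    web_ok = tcp_port_open(host, web_port)
    return classify(ping_ok, ssh_ok, web_ok)
```

For example, `check_server("10.0.0.5", ping_ok=True)` (a made-up address) would return "suspect-hung" if neither port accepts a connection. Note that this is a coarse test: in the case where the server still returns the bare "Platform" page, the web port would accept a connection even though the services behind it are down.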


So far, we've seen this on 7 servers:

Cluster     Affected servers             CM version      Server
Cluster 1   Pub                          6.1.5.1000      7816I4
Cluster 2   Pub                          7.1.5.10000     7825I4
Cluster 3   Pub and, after that, Subs    6.1.5.1000      7825I4
Cluster 4   Pub and, after that, Subs    6.1.4.2000      7825I4
Cluster 5   Pub                          7.1.5.10000     7825I4


As you can see, the CM version is not always the same, but all of them are IBM I4 servers.

This is the only common factor we've found among them: we have installed many other clusters on other server types, and none of those shows this issue.


Usually we recover the server by rebooting it or, if that doesn't work, by using the recovery disk, but we are afraid this is a hardware bug that could recur after some time or appear in other new deployments.


Has anyone faced something similar?

Does anyone know about any problem with these platforms?


Thanks in advance

Best regards,

Carmen

mcibanez Wed, 09/15/2010 - 03:35

Hi all,


Just in case it helps anyone else...


We've found that there are two bugs already open for this issue:

- CSCti52867 IBM 782x-I4 READONLY file system - tracking defect. This occurs on MCS-7825-I4 and MCS-7828-I4 servers only.

- See CSCti58651 if you encounter this issue on an MCS-7816-I4.


Neither of them is resolved yet, and the workaround provided is the same one I described earlier (reboot the server or, if that fails, use the recovery disk).


See: https://supportforums.cisco.com/docs/DOC-12955


Best regards,

Carmen

Phillip Ratliff (Cisco Employee) Mon, 10/04/2010 - 09:05

Carmen is right on with those two defects.  Lots of customers are seeing this with the 7816, 7825, and 7828 I4 servers.


A new firmware patch for the hard drives on these servers was posted over the weekend. While we aren't yet calling it fixed, we strongly believe the firmware will help. See the document linked in Carmen's post for details.

mloraditch Thu, 11/18/2010 - 14:02

OK, so do I need to update if any of my servers have 3b04 or 3b05 firmware on hard drives with the referenced model numbers? The example output shows 3b04, but I could swear it used to show 3b05.
