cancel
Showing results for 
Search instead for 
Did you mean: 
cancel
2159
Views
0
Helpful
8
Replies

CP7941G/61G's continuously rebooting

npritchett
Level 1
Level 1

I have a growing number of CP7961G and 41G phones in one of my buildings that go into a state where they continuously reboot and my only recourse is to swap them out with a spare.  My telecomm specialist is concerned that if this continues, we will run out of spares.

Particulars: We are running UCM v7.1.5, non-clustered, on a publisher and a subscriber. The phones are about 6 years old. The building is served by three Catalyst 3560-48PS which are not approaching their PoE limits.  We have verified that the problem is not related to a particular switch, port, or patch cord.  The phone will continue rebooting when we bring it back to the test bench. Wiping out its config in UCM and putting a different config on it has no effect.

It is something of a challenge to grab any logs off the phone between reboots, but I managed to retrieve the following messages right before it went down. Any advice?

2662: NOT 14:15:52.387021 JVM: Startup Module Loader|cip.midp.midletsuite.InstallerModule:? - propertyChanged - device.settings.fullyregistered value=true

2663: NOT 14:15:52.389162 JVM: Startup Module Loader|cip.midp.midletsuite.InstallerModule:? - FULLY_REGISTERED - Resetting retry installer interval

2664: NOT 14:15:52.391383 JVM: Startup Module Loader|cip.midp.pushregistry.e:? - setAcceptConnections - ENABLED

2665: NOT 14:15:52.393446 JVM: Startup Module Loader|cip.midp.midletsuite.InstallerModule:? - propertyChanged - device.settings.fullyregistered value=true

2666: NOT 14:15:52.395528 JVM: Startup Module Loader|cip.midp.midletsuite.InstallerModule:? - FULLY_REGISTERED - Resetting retry installer interval

2667: NOT 14:15:52.397532 JVM: Startup Module Loader|cip.midp.pushregistry.e:? - setAcceptConnections - ENABLED

2668: NOT 14:15:52.399608 JVM: Startup Module Loader|cip.midp.midletsuite.InstallerModule:? - propertyChanged - device.callagent.messages.0 value=0

2669: WRN 14:15:52.402999 JVM: Startup Module Loader|cip.mmgr.dt:? - [MediaMgrSM]: Unhandled Event, State = StateOnHook Event = EventServicesTxStop

2670: ERR 14:15:52.405170 JVM: Startup Module Loader|cip.midp.rms.RecordStoreImpl:midp.rms.IOExceptionInCompactRS - Exception while loading(262160) a RecordHeader() in compactRecords(): java.io.IOException: load invalid offset 262160

2671: ERR 14:15:52.419248 JVM: 2:15:52p|java.io.IOException: load invalid offset 262160

at cip.midp.rms.b.a(I)V(Unknown Source)

at cip.midp.rms.RecordStoreImpl.compactRecords()V(Unknown Source)

at cip.midp.rms.RecordStoreImpl.flush()V(Unknown Source)

at cip.props.a.b()V(Unknown Source)

at cip.props.f.a()V(Unknown Source)

at cip.props.PropertyManager.timerFired(Lcip/sys/k;)V(Unknown Source)

at cip.sys.ar.a(Lcip/sys/ab;Lcip/sys/x;)V(Unknown Source)

at cip.sys.ar.sinkMessage(Lcip/sys/x;)V(Unknown Source)

at cip.sys.n.sinkMessage(Lcip/sys/x;)V(Unknown Source)

at cip.sys.aa.a(Lcip/sys/ad;)Ljava/lang/Throwable;(Unknown Source)

at cip.sys.aa.e()V(Unknown Source)

at cip.sys.l.run()V(Unknown Source)

at java.lang.Thread.startup(Z)V(Unknown Source)

2672: ERR 14:15:52.423225 JVM: Startup Module Loader|PropertyManager:timerFired - Exception flushing RMS.javax.microedition.rms.RecordStoreException: Exception while loading(262160) a RecordHeader() in compactRecords(): java.io.IOException: load invalid offset 262160deleting primary RMS and requesting hardreset to load backup RMS.

2673: ERR 14:15:52.425558 JVM: Startup Module Loader|RmsPropertyStore:? - javax.microedition.rms.RecordStoreException: deleteRecordStore error: record store is still open

2674: NOT 14:15:52.472344 JVM: Startup Module Loader|cip.cfg.ConfigManager:? - --->ConfigManager PropertyChanged: device.system.systemstate

2675: NOT 14:15:52.474498 JVM: Startup Module Loader|cip.cfg.ConfigManager:? - <---ConfigManager PropertyChanged: device.system.systemstate

2676: ERR 14:15:52.476791 JVM: SystemManager n/a          SYSTEM_RESTARTING

2677: ERR 14:15:52.617343 JVM: SystemManager n/a          *systemUnload -->*

2678: ERR 14:15:52.619287 JVM: SystemManager n/a          unloadAllModules ->

2679: ERR 14:15:52.621685 JVM: SystemManager n/a          Unload in this order:

2680: ERR 14:15:52.623620 JVM: SystemManager n/a          SCCP Call Control

2681: ERR 14:15:52.625579 JVM: SystemManager n/a          InstallerModule

2682: ERR 14:15:52.627558 JVM: SystemManager n/a          NotificationModule

2683: ERR 14:15:52.629416 JVM: SystemManager n/a          SubscriptionManager

2684: ERR 14:15:52.631503 JVM: SystemManager n/a          Starts Automation Feature

2685: ERR 14:15:52.633381 JVM: SystemManager n/a          Sidecar

2686: ERR 14:15:52.635212 JVM: SystemManager n/a          Vieo

2687: ERR 14:15:52.637111 JVM: SystemManager n/a          AppStatus

2688: ERR 14:15:52.638977 JVM: SystemManager n/a          Help

2689: ERR 14:15:52.640997 JVM: SystemManager n/a          Go4

2690: ERR 14:15:52.642971 JVM: SystemManager n/a          Messages

2691: ERR 14:15:52.644974 JVM: SystemManager n/a          Directories

2692: ERR 14:15:52.646820 JVM: SystemManager n/a          Services

2693: ERR 14:15:52.648743 JVM: SystemManager n/a          Settings

2694: ERR 14:15:52.650702 JVM: SystemManager n/a          Push Service

2695: ERR 14:15:52.652569 JVM: SystemManager n/a          HTTP Server/Microservlet Container

2696: ERR 14:15:52.654502 JVM: SystemManager n/a          HTTP

2697: ERR 14:15:52.656352 JVM: SystemManager n/a          Config Manager

2698: ERR 14:15:52.658176 JVM: SystemManager n/a          Security

2699: ERR 14:15:52.660245 JVM: SystemManager n/a          TFTP Client

2700: ERR 14:15:52.662092 JVM: SystemManager n/a          Call UI

2701: ERR 14:15:52.663910 JVM: SystemManager n/a          Media Manager

2702: ERR 14:15:52.665803 JVM: SystemManager n/a          Call Agent

2703: ERR 14:15:52.667676 JVM: SystemManager n/a          Application Manager

2704: ERR 14:15:52.669492 JVM: SystemManager n/a          Display

2705: ERR 14:15:52.671540 JVM: SystemManager n/a          Input Manager

2706: ERR 14:15:52.673500 JVM: SystemManager n/a          Localization Manager

2707: ERR 14:15:52.675439 JVM: SystemManager n/a          Reset Service

2708: ERR 14:15:52.677400 JVM: SystemManager n/a          Property Manager

2709: ERR 14:15:52.679351 JVM: SystemManager n/a          Timer Manager

2710: ERR 14:15:52.681306 JVM: SystemManager n/a          Trace Service

2711: ERR 14:15:52.683319 JVM: SystemManager n/a          Trace Manager

2712: ERR 14:15:52.685942 JVM: SystemManager n/a          module "SCCP Call Control"/sccpccapi(31) is UNLOADING

2713: ERR 14:15:52.814693 JVM: SystemManager n/a          module "SCCP Call Control"/sccpccapi(31) is UNLOADED

2714: ERR 14:15:52.819241 JVM: SystemManager n/a          module "InstallerModule"/installer(30) is UNLOADING

2715: WRN 14:15:52.892189 JVM: Startup Module Loader|cip.sccp.cn:? - Read exception: java.net.SocketException: Error 0

Close(d) Connection ..., errno=135

2716: ERR 14:15:52.907053 JVM: 2:15:52p|java.net.SocketException: Error 0

at cip.io.SecureInputStream.socketRead([BII)I(Native Method)

at cip.io.SecureInputStream.read([BII)I(Unknown Source)

at java.io.BufferedInputStream.fill()V(Unknown Source)

at java.io.BufferedInputStream.read()I(Unknown Source)

at java.io.DataInputStream.readInt()I(Unknown Source)

at cip.io.i.readInt()I(Unknown Source)

at cip.sccp.ax.a()Lcip/sccp/bu;(Unknown Source)

at cip.sccp.cn.e()V(Unknown Source)

at cip.sys.l.run()V(Unknown Source)

at java.lang.Thread.startup(Z)V(Unknown Source)

2717: NOT 14:15:53.293053 JVM: Startup Module Loader|cip.midp.midletsuite.g:onRun - FinalizeObjectThread INTERRUPTED

2718: ERR 14:15:53.297661 JVM: SystemManager n/a          module "InstallerModule"/installer(30) is UNLOADED

2719: ERR 14:15:53.300806 JVM: SystemManager n/a          module "NotificationModule"/notification(29) is UNLOADING

2720: ERR 14:15:53.303783 JVM: SystemManager n/a          module "NotificationModule"/notification(29) is UNLOADED

2721: ERR 14:15:53.306581 JVM: SystemManager n/a          module "SubscriptionManager"/subscriptionman(28) is UNLOADING

2722: ERR 14:15:53.318673 JVM: SystemManager n/a          module "SubscriptionManager"/subscriptionman(28) is UNLOADED

... ... ...

2819: NOT 14:16:00.563892 SYSMSG: pid 16 (/ubin/dsp) Normal Exit, status = 1

2820: INF 14:16:00.563965           runtime = 135.140 secs

2821: INF 14:16:00.564011          user cpu = 0.136747533 secs

2822: INF 14:16:00.564054        system cpu = 0.489695153 secs

2823: INF 14:16:00.564091    child user cpu = 0.000000000 secs

2824: INF 14:16:00.564125     child sys cpu = 0.000000000 secs

2825: INF 14:16:00.564173    sys interrupts = 1.283613038 secs for 20499 interrupts

2826: INF 14:16:00.564219 total cpu = 0.626692686 secs ( 0% utilization )

2827: NOT 14:16:01.250272 DHCP: SIGTERM received...DHCP is exiting

2828: NOT 14:16:01.500314 DNS: SIGTERM received...DNS is exiting

2829: ERR 14:16:01.750246 login: [3]:loginSIGTERM signo:23

2830: ERR 14:16:01.750682 login: [3]:tcgetattr failed errno:25

2831: NOT 14:16:02.021689 SECD: exitHandler: ** EXITING, Ver: 1.4(1.2) [May 11 2011, 08:03:23] **

2832: NOT 14:16:06.120490 init: Starting /sbin/inetd

2833: NOT 14:16:06.124411 init: /sbin/inetd started as pid=7

2834: NOT 14:16:06.175078 INETD: pid = 7

2835: NOT 14:16:06.175901 INETD: vvlan = 0 will be initiazed to 4096

2836: NOT 14:16:06.179804 INETD: sshAccessReadConf: sshAccess = 0

2837: NOT 14:16:06.505331 init: Starting /sbin/imgauthd

2838: NOT 14:16:06.509235 init: /sbin/imgauthd started as pid=20

2839: NOT 14:16:06.510234 init: Starting /bin/tftpClient

2840: NOT 14:16:06.514066 init: /bin/tftpClient started as pid=23

2841: NOT 14:16:06.514896 init: Starting /bin/secd

2842: NOT 14:16:06.518723 init: /bin/secd started as pid=14

2843: NOT 14:16:06.519551 init: Starting /bin/ntp

2844: NOT 14:16:06.523520 init: /bin/ntp started as pid=12

2845: NOT 14:16:06.524351 init: Starting /ubin/vieo

2846: NOT 14:16:06.528190 init: /ubin/vieo started as pid=10

2847: NOT 14:16:06.529020 init: Starting /ubin/dsp

2848: NOT 14:16:06.533004 init: /ubin/dsp started as pid=31

2849: NOT 14:16:06.533834 init: Starting /sbin/ewcl

2850: NOT 14:16:06.537658 init: /sbin/ewcl started as pid=25

2851: NOT 14:16:06.538484 init: Starting /bin/mkdir

2852: NOT 14:16:06.547443 init: /bin/mkdir started as pid=21

2853: NOT 14:16:06.565041 CDP-D: pid = 30

2854: NOT 14:16:06.582629 DHCP6: COLDBOOT - wait 4 seconds...

2855: NOT 14:16:06.589560 PAE: pid = 13

2856: NOT 14:16:06.592548 PAE: phone is not single port

2857: WRN 14:16:06.599864 PAE: paeReadPCStatus: empty file

2858: NOT 14:16:06.600744 PAE: link-0 is UP

2859: NOT 14:16:06.601574 PAE: starting supplicant

2860: NOT 14:16:06.602389 PAE: EV_DISABLED, ST_DISCONNECTED->ST_DISABLED

2861: NOT 14:16:06.603180 PAE: supplicant is disabled

2862: NOT 14:16:06.603987 PAE: paeSetLastSupStatus: LastSupStatus 0

2863: NOT 14:16:07.808304 ESP: send ADMIN, logging = 1, shell = 0, ipconfig = 1

2864: NOT 14:16:07.820307 IMG_AUTH: Version: 3.6(1.0): TNP REL

2865: NOT 14:16:07.821288 IMG_AUTH: model: CP-7961G, check for built-in key

2866: NOT 14:16:07.822133 IMG_AUTH: built-in platform RELEASE key is present

2867: NOT 14:16:07.822938 IMG_AUTH: built-in common RELEASE key is present

2868: NOT 14:16:07.824329 IMG_AUTH: TNP, built-in keys  : Platform (curr), Common (curr)

2869: NOT 14:16:07.830481 tftpClient: pid = 23

2870: NOT 14:16:07.847867 SECD: main: Starting... Ver: 1.4(1.2) [May 11 2011, 08:03:23]

2871: NOT 14:16:07.853471 SECD: initSecMode: resuming sec mode NONE

2872: WRN 14:16:07.875848 SECD: WARN:loadCert: phone has no LSC cert

2873: NOT 14:16:07.876880 SECD: initResources: have cert+pkey:  MIC only, no LSC

2874: NOT 14:16:07.877924 SECD: perform_selftests: FIPS RSA SHA1 Self-Tests Passed

2875: NOT 14:16:07.879697 SECD: perform_selftests: FIPS RSA HMAC-SHA1 Self-Tests Passed

2876: NOT 14:16:07.881400 SECD: perform_selftests: FIPS RSA AES Self-Tests Passed

2877: NOT 14:16:07.882752 SECD: perform_selftests: FIPS RSA 3DES Self-Tests Passed

2878: NOT 14:16:08.232243 SECD: perform_selftests: FIPS RSA Sign/Verify Self-Tests Passed

2879: NOT 14:16:08.233274 SECD: perform_selftests: FIPS RSA RNG Self-Tests Passed

2880: NOT 14:16:08.234162 SECD: perform_selftests: FIPS RSA Library Self-Tests Passed

2881: NOT 14:16:08.235031 SECD: runFipsPostTests: FIPS Security Information: RSA Self-Tests Passed

2882: NOT 14:16:08.236899 SECD: TL_initFromCTLFile: No TL file: /flash0/sec/ctl/CTLFile.tlv

2883: WRN 14:16:08.238165 SECD: WARN:initCTL: ** phone has NO CTL file **

2884: NOT 14:16:08.239709 SECD: TL_initFromITLFile: No TL file: /flash0/sec/ctl/ITLFile.tlv

2885: WRN 14:16:08.241073 SECD: WARN:initCTL: ** phone has NO ITL  file**

2886: WRN 14:16:08.241978 SECD: WARN:resetSecMode: ** WARN ** resetting sec mode to NONE (non-secure)

2887: NOT 14:16:08.244938 SECD: clearCapfList: CAPF table cleared

2888: NOT 14:16:08.246183 SECD: initCapfClnt: CAPF clnt initialized

2889: ERR 14:16:08.251313 SECD: EROR:initEntropy: Calling R_rand_seed with bytes 128

... ... ...


8 Replies 8

Leo Laohoo
Hall of Fame
Hall of Fame

If your switch is Cisco, can you please specify the exact model of the switch?

I believe you can find that in my original post, unless I misunderstand your question, in which case I apologize.

--Norm

Ok, you got 3560.  Just want to make sure you have the switches with FastEthernet access ports, right?

Do you have any switches with copper GigabitEthernet?

Yes, these are FastE ports on the 3560s.

I have a 3750 with GigEthernet PoE ports in another building.

Too bad.  I was hoping you had one where the faulty phone is so you could do a TDR.

If this is happening across multiple 3560s in a building, and with different patch cords, and when directly plugged into a test switch (a 3560-8PC), wouldn't you say most of the problems a TDR can detect would be ruled out?  Unless perhaps some flaky office connections are corrupting TFTP transfers to the phones.  In that case I would image the replacement phones eventually having problems as well.

I was wondering if the Record Store Exception messages in lines 2670-2672 might be some clue.  I'm not familiar with what to expect in normal logs, but those strike me as suspicious.

--Norm

aaroncoffman
Level 1
Level 1

- Have you attempted the factory reset procedure on the phones? What's the result?
- Do you have 7941/61s elsewhere on the same cluster not displaying the issue?
- What firmware version are the phones running?
- What do the cucm traces look like leading up to the issue?
- What if anything is different about this site than another?

Regards,
Aaron

Aaron, thank you for the troubleshooting tips! I am more of a routing/switching guy rather than IP telephony.

My availability to troubleshoot the issue has dwindled temporarily for a few days due to an upcoming audit so I haven't had a chance to follow through on everything you listed, but I did find the factory reset procedure and gave that a whirl on the defunct phones and that successfully resucitated them.  

What remains to be solved then is determining what put them into this state.  I hope your additional suggestions will lead me to an answer.

Let me throw in here that upon examining my three switch logs again, I believe the switches may be removing power from the phones.  I am getting a number of "%ILPOWER-5-IEEE_DISCONNECT: Interface Fa0/xx: PD removed " messages which I originally thought were triggered by the phone restarts but am now of the belief that they lead ahead of the restarts.  So that would point to a PoE or power problem of some sort.  Since this occurs across three different switches in the building, I don't believe I am looking at a malfunctioning switch.  Though in considering what is different about this building from all the others I maintain, something recently occurred to me, this building has solar panels!  So now I'm beginning to wonder if something about its power inverters is disagreeing with my PoE?

Getting Started

Find answers to your questions by entering keywords or phrases in the Search bar above. New here? Use these resources to familiarize yourself with the community: