Hi Experts, I am writing about this issue because our WRP400s keep freezing.
We are an Internet and Telephony Service Provider, and we chose Cisco-Linksys CPEs expecting the same stability as well-known Cisco equipment.
Unfortunately, we have found that the Linksys devices are not Cisco devices.
We have been using WRP400s since firmware version 1.00.04c.
Over the years we have tested this list of firmware versions:
2.00.26 (currently in use)
We can say that there have been improvements, but we are far from having a stable device.
The reason for this post is a freeze common to the latest version 1 firmware and all version 2 firmware.
After several days of normal activity, the voice module of the WRP400 freezes.
This is the state of the WRP400 during the freeze:
- PING is OK
- remote access to the router web menu is OK
- Internet surfing from LAN is OK
- phone line is down
- remote access to the voice web module fails
- every operation from the router web menu has no effect (change LAN IP, diagnostic PING or diagnostic LAN search, reboot button, reboot URL of version 2.00.26, etc.)
The only way to restore normal operation is to power the device off and on.
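The symptom list above could be turned into a simple classification for an external monitoring host. This is only a minimal sketch: the `ProbeResult` fields and the state names are hypothetical, and how each probe is actually performed (ICMP, HTTP requests to the router and voice pages) is left out because the exact URLs are not given in this thread.

```python
from dataclasses import dataclass


@dataclass
class ProbeResult:
    """Outcome of three external checks against one WRP400 (hypothetical fields)."""
    ping_ok: bool         # device answers ICMP echo
    router_web_ok: bool   # router web menu loads
    voice_web_ok: bool    # voice web module loads


def classify(p: ProbeResult) -> str:
    """Map probe results to the states described in this thread."""
    if not p.ping_ok:
        return "unreachable"
    if not p.router_web_ok:
        # Ping still works but the whole web GUI is gone.
        return "web_gui_frozen"
    if not p.voice_web_ok:
        # Router side looks fine, voice module does not respond.
        return "voice_frozen"
    return "healthy"
```

A device reported as `voice_frozen` matches the state described above, where only a power cycle helps.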
I am now trying to reproduce the cause of these freezes, so far without success.
I think the problem is in the router module, which then blocks the voice module.
I have a feeling that the DHCP module is causing the problem (for now these are only hypotheses).
I suspect a memory leak, so the first question is:
Is there a tool to monitor memory and CPU usage?
Can you suggest a workaround, e.g. a configurable hardware reboot via cron?
I'll update this post with new information as soon as possible.
Thanks in advance.
Your problem has been known for years. Have a look at this thread:
In my case, setting "restrict source ip" reduced the hangs from 10-15 per month to only 1-2 per month. Currently you can only fix the issue completely by buying a different device.
We are waiting for feedback.
Just a clarification. Our 2000 WRP400s are in a private network protected by an ASA firewall and a TippingPoint IDS. The "restrict source ip" option is enabled and the SIP port used is not the standard 5060 port.
We observed that the voice module apparently freezes after the provisioning resync procedure. I've attached our config.
Can you investigate?
Our XML file for the WRP400 also contains a router configuration section, but the WRP400 doesn't load this part. Could this be a problem?
We look forward to your news.
I see from your configuration file that you are running the latest firmware on your WRP400s. Was the configuration file created with the SPC tool for this firmware version?
If it was, have you set up a syslog server, set one of the WRP400s to debug log level, and captured any logging that could tell us more about what is happening?
I see you commented in another post, and I am assuming you have configured your network to block inbound 5060. Does it lock up every time it registers, or at a certain time?
Were any of your devices purchased in the last year?
Cisco Small Business Support Center
CCNA, CCNA - Security
First of all, thank you for your interest.
Below are my answers:
1) The XML file used to provision the WRP400 routers was generated with a previous version of the SPC tool. The exact version of SPC used was wrp400-1-01-00-spc-win32-i386.exe.
Over time, the XML file was manually updated with new XML tag options.
This is the same procedure we use with other Cisco Linksys devices, e.g. SPA2102, SPA8000, SPA30x, SRP520, without the problems we encounter with the WRP400.
Just one observation. Version 2.00.20 introduced a new XML tag:
Added two new configurable and provisionable parameters, “Soft IRQ polling
count” and “Voice Active,” to address an issue with voice cutoff. Users may
adjust these settings to fit their deployment.
URL to configure these parameters: http://192.168.15.1/VoiceDebug.asp
Use the following syntax for XML provisioning:
I updated my XML file putting this new option under
What is the right position?
I ask this because our WRPs do not load the
2) We have more than 2000 WRPs. The freeze seems random. We tried to monitor some devices via syslog, but we had no luck.
We will try again and give you feedback.
3) The freeze happens on every WRP400, old and new.
It's very difficult to capture the problem or replicate it.
When logging is on, we observe that the WRP400 sends this message every 2 seconds: ---- eval_prov_logic 1 ----
Jan 16 14:40:23 10.95.102.96 RegOK. NextReg in 296 (1)
Jan 16 14:40:24 10.95.102.96 ---- eval_prov_logic 1 ---- 42040 -- 8409139
Jan 16 14:40:26 10.95.102.96 ---- eval_prov_logic 1 ---- 42041 -- 8409337
Jan 16 14:40:28 10.95.102.96 ---- eval_prov_logic 1 ---- 42042 -- 8409544
Jan 16 14:40:30 10.95.102.96 ---- eval_prov_logic 1 ---- 42043 -- 8409742
Jan 16 14:40:32 10.95.102.96 ---- eval_prov_logic 1 ---- 42044 -- 8409940
Jan 16 14:40:34 10.95.102.96 ---- eval_prov_logic 1 ---- 42045 -- 8410138
Jan 16 14:40:36 10.95.102.96 ---- eval_prov_logic 1 ---- 42046 -- 8410345
Jan 16 14:40:38 10.95.102.96 ---- eval_prov_logic 1 ---- 42047 -- 8410543
Jan 16 14:40:40 10.95.102.96 ---- eval_prov_logic 1 ---- 42048 -- 8410741
Jan 16 14:40:42 10.95.102.96 ---- eval_prov_logic 1 ---- 42049 -- 8410939
Jan 16 14:40:44 10.95.102.96 ---- eval_prov_logic 1 ---- 42050 -- 8411137
Jan 16 14:40:46 10.95.102.96 ---- eval_prov_logic 1 ---- 42051 -- 8411344
Jan 16 14:40:48 10.95.102.96 ---- eval_prov_logic 1 ---- 42052 -- 8411542
Jan 16 14:40:50 10.95.102.96 ---- eval_prov_logic 1 ---- 42053 -- 8411740
Jan 16 14:40:52 10.95.102.96 ---- eval_prov_logic 1 ---- 42054 -- 8411938
Jan 16 14:40:54 10.95.102.96 ---- eval_prov_logic 1 ---- 42055 -- 8412145
Jan 16 14:40:56 10.95.102.96 ---- eval_prov_logic 1 ---- 42056 -- 8412343
Jan 16 14:40:58 10.95.102.96 ---- eval_prov_logic 1 ---- 42057 -- 8412541
Jan 16 14:41:00 10.95.102.96 ---- eval_prov_logic 1 ---- 42058 -- 8412739
Jan 16 14:41:02 10.95.102.96 ---- eval_prov_logic 1 ---- 42059 -- 8412937
Jan 16 14:41:04 10.95.102.96 ---- eval_prov_logic 1 ---- 42060 -- 8413144
Jan 16 14:41:06 10.95.102.96 ---- eval_prov_logic 1 ---- 42061 -- 8413342
Jan 16 15:28:34 10.95.102.96 ---- eval_prov_logic 1 ---- 43479 -- 8696943
Jan 16 15:28:36 10.95.102.96 ---- eval_prov_logic 1 ---- 43480 -- 8697141
Jan 16 15:28:38 10.95.102.96 ---- eval_prov_logic 1 ---- 43481 -- 8697339
Jan 16 15:28:40 10.95.102.96 ---- eval_prov_logic 1 ---- 43482 -- 8697537
Jan 16 15:28:42 10.95.102.96 ---- eval_prov_logic 1 ---- 43483 -- 8697744
Jan 16 15:28:44 10.95.102.96 ---- eval_prov_logic 1 ---- 43484 -- 8697942
Jan 16 15:28:46 10.95.102.96 ---- eval_prov_logic 1 ---- 43485 -- 8698140
Jan 16 15:28:48 10.95.102.96 ---- eval_prov_logic 1 ---- 43486 -- 8698338
Jan 16 15:28:50 10.95.102.96 ---- eval_prov_logic 1 ---- 43487 -- 8698545
Jan 16 15:28:52 10.95.102.96 ---- eval_prov_logic 1 ---- 43488 -- 8698743
Jan 16 15:28:54 10.95.102.96 ---- eval_prov_logic 1 ---- 43489 -- 8698941
Jan 16 15:28:56 10.95.102.96 ---- eval_prov_logic 1 ---- 43490 -- 8699139
When the voice module freezes, these ---- eval_prov_logic 1 ---- messages simply stop being sent.
But the logs do not show abnormal SIP activity.
When this happens, SIP REGISTER messages are correctly sent from the WRP to the SIP server, but incoming INVITEs to the WRP400 are not shown in the logs. It seems that the "Restrict Source IP" filter blocks all incoming INVITEs.
After a reboot, the WRP400 returns to normal activity.
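Since the freeze shows up as the eval_prov_logic message going silent, a syslog capture could be checked for this automatically. The sketch below is an assumption-laden illustration, not a Cisco tool: it assumes the first number on each line is a counter that increases by one per message (which the excerpt above suggests), the function names are made up, and the year is a placeholder because the syslog lines carry none. It separates gaps that the counter accounts for (lines simply missing from the capture) from gaps it does not, and reports when the last message is too old.

```python
import re
from datetime import datetime

# Matches e.g. "Jan 16 14:40:24 10.95.102.96 ---- eval_prov_logic 1 ---- 42040 -- 8409139"
LINE_RE = re.compile(
    r"^(\w{3}\s+\d{1,2} \d{2}:\d{2}:\d{2}) \S+ "
    r"---- eval_prov_logic 1 ---- (\d+) -- \d+$"
)


def parse_heartbeats(lines, year=2012):
    """Return [(timestamp, counter)] for every eval_prov_logic line.

    `year` is a placeholder: the syslog timestamps do not include one.
    """
    beats = []
    for line in lines:
        m = LINE_RE.match(line.strip())
        if m:
            ts = datetime.strptime(f"{year} {m.group(1)}", "%Y %b %d %H:%M:%S")
            beats.append((ts, int(m.group(2))))
    return beats


def unexplained_gaps(beats, period=2.0, slack=0.1):
    """Gaps in time that the counter advance does not account for."""
    gaps = []
    for (t0, c0), (t1, c1) in zip(beats, beats[1:]):
        elapsed = (t1 - t0).total_seconds()
        if elapsed <= 2 * period:
            continue  # normal spacing
        ticks = c1 - c0
        if ticks > 0 and abs(elapsed / ticks - period) <= slack * period:
            continue  # messages kept flowing, the capture just lacks them
        gaps.append((t0, t1))
    return gaps


def is_stalled(beats, now, max_silence=10.0):
    """True if the last heartbeat is older than max_silence seconds."""
    return bool(beats) and (now - beats[-1][0]).total_seconds() > max_silence
```

Applied to the excerpt above, the jump from 14:41:06 to 15:28:34 is matched by the counter going from 42061 to 43479, i.e. roughly one message every 2 seconds throughout, so under these assumptions that jump is not a freeze.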
Sometimes the whole web menu freezes, not only the voice web menu, simply during web GUI navigation.
For example, you run the Administrator-Diagnostics-Detect Active LAN Client(s) tool. After this, nothing happens: ping to the WRP400 is OK, but the web GUI becomes unreachable.
Another strange issue:
in some logs there is this message: (root) CMD (/sbin/check_ps)
in others there is not!
But I have the same firmware loaded on all my WRPs.
Can you give me an explanation?
Probably the best thing to do is to find a unit that is still under support, then open a case with the Cisco STAC and reference this post.
I reviewed your config, and it seems to be a template with no voice lines enabled. It doesn't look like it was created with the current SPC tool.
You may have to download the SPC tool and create a sample config to use for provisioning.
The Cisco engineers will need the actual config from a router, and probably a full syslog from the failure.
Once the case is opened, it can be escalated to engineering for (potentially) a firmware fix/update.
I have new logs and new questions for you.
Can you describe what the "eval_prov_logic" process is?
As you can see in the file WRP400_eval_prov_logic.log, immediately after I enabled logging on this WRP, the "eval_prov_logic" process stopped.
In my experience, when this happens the WRP doesn't accept incoming calls even if SIP REGISTER messages are correctly processed.
Only after a reboot does "eval_prov_logic" start again.
What is the "check_ps" process? Why do some logs report it every 2 minutes and others not?
Other logs report "kernel: ++++++++ tdu_restart".
All our 2000 WRPs have the same software and hardware properties (Software Version: 2.00.26 Hardware Version: 1.00.01) and are configured with the same XML provisioning file, but the behaviour is not consistent; it differs far too much from device to device.
Can you forward this information to the developers to investigate the issue?
I know that there is a new beta firmware 2.00.27.
Do you have unofficial release notes?
Here are the answers to your questions,
and a couple questions of my own ...
The router has a Linux or Unix core; I'm not privy to the modules or what they do.
After we escalate your case, you will have the direct attention of the 2nd level engineering, and developers from the business unit.
Same here; those do sound like Unix system processes.
<<2000 WRPs have same software and hardware properties ... same XML provisioning file but the behaviour is not linear, is different, too much different. Can you forward these info to developers to investigate the issue?>>
Yes, I can. Please forward me the serial number of a unit that is still under support, and we can open a case.
My questions are :
Did you download the SPC tool for this firmware version and create the template config to use for provisioning?
Or...Is it coming from a service or script that creates it on the fly?
I looked at your file wrpconfig_b.xml, and it looked good. There were some things missing (line enable) and some things filled with 'placeholders' x.x.x.x, so we would need an actual config from the device.
The syslog files look good. After we create the case, the escalation engineers will probably ask you for more details (packet captures, topology, etc.).
The best way to get this moving forward is to:
get the serial number of a unit that is still under support,
open a case and ask it to be escalated.
Chat and other support numbers are here
The XML file used to provision the WRP400 routers was generated with a previous version of the SPC tool. The exact version of SPC used was wrp400-1-01-00-spc-win32-i386.exe.
Over time, the XML file was manually updated with new XML tag options.
This is the same procedure used with other Cisco Linksys devices without problems, e.g. SPA2102, SPA8000, SPA30x, SRP520.
The wrpconfig_b.xml file is actually used in our production environment.
Some parts are missing to allow manual changes (e.g. disable a line).
Some parts are replaced by x.x.x.x to hide the real value.
These are some serial numbers:
Device Serial No:CR301J900723
Device Serial No:CR301J900458
Device Serial No:CR301J706514
Device Serial No:CR301K500386
I'm back with new feedback.
We updated up to 2k WRP400s with the beta firmware 2.00.30 and activated the automatic reset under "administration-system-automatic maintenance", set to weekly on Saturday at 3 AM.
After a weekend, many WRP400s have the same problems:
- the voice module freezes and the web menu is inaccessible; the browser responds with "Error 324 (net::ERR_EMPTY_RESPONSE): The server has closed the connection without send data"
- the reboot button, the reboot URL string and other operations have no effect
- after heavy use of the router web menu, the device locks up its web menu
- customer surfing works and we get ping responses
- only a hard reset restores correct operation
So we configured a daily restart. In this case most of the WRPs restarted as expected.
In our opinion the weekly scheduled reboot doesn't work properly: we have some devices with this function enabled but with uptime greater than 2 weeks:
Product Name: WRP400
Serial Number: CR301J605494
Software Version: 2.00.30
Hardware Version: 1.00.01
Voice Module Version: 1.0.18(20101206a)
MAC Address: 0023697CE18D
Client Certificate: Installed
Customization: Open
Current Time: 6/13/2012 14:54:40
Elapsed Time: 13 days and 19:55:39
RTP Packets Sent: 826578
RTP Bytes Sent: 18994020
RTP Packets Recv: 827516
RTP Bytes Recv: 19014115
SIP Messages Sent: 9308
SIP Bytes Sent: 6861861
SIP Messages Recv: 16834
SIP Bytes Recv: 6404342
Probably the freeze condition sets in within 7 days and stops the cron function.
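The "Current Time" and "Elapsed Time" fields from the status page above can be used to count how many scheduled resets a device has skipped. This is a rough sketch under several assumptions: the date is in month/day/year order, "Elapsed Time" is the time since the last boot, the device clock is correct, and the schedule is the Saturday 3 AM weekly reset described earlier. The function names are hypothetical.

```python
import re
from datetime import datetime, timedelta


def parse_status(current_time: str, elapsed_time: str):
    """Return (now, boot) from the 'Current Time' and 'Elapsed Time' fields."""
    now = datetime.strptime(current_time, "%m/%d/%Y %H:%M:%S")
    m = re.fullmatch(r"(\d+) days and (\d+):(\d+):(\d+)", elapsed_time.strip())
    if m is None:
        raise ValueError(f"unexpected elapsed time format: {elapsed_time!r}")
    days, hours, minutes, seconds = map(int, m.groups())
    uptime = timedelta(days=days, hours=hours, minutes=minutes, seconds=seconds)
    return now, now - uptime


def missed_resets(current_time: str, elapsed_time: str, weekday=5, hour=3) -> int:
    """Count weekly reset slots (default Saturday 03:00) between boot and now.

    weekday follows datetime.weekday(): Monday is 0, Saturday is 5.
    """
    now, boot = parse_status(current_time, elapsed_time)
    # Find the first scheduled slot strictly after the boot time.
    slot = boot.replace(hour=hour, minute=0, second=0, microsecond=0)
    while slot <= boot or slot.weekday() != weekday:
        slot += timedelta(days=1)
    count = 0
    while slot <= now:
        count += 1
        slot += timedelta(days=7)
    return count
```

For the device above, `missed_resets("6/13/2012 14:54:40", "13 days and 19:55:39")` gives 2: under these assumptions the unit booted on 30 May and ran through the Saturday slots of 2 June and 9 June without restarting.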
Another observation: the voice module version in firmware 2.00.30 is 1.0.18(20101206a), while in firmware version 2.00.27 the voice module is more recent: 1.0.19(2011110Test).
Can you clarify why?
You can find a WRP400 log attached.
I'll wait for your feedback.
I'm about to test v.2.00.32 but.....
What is the difference between
Nothing is mentioned in