04-13-2007 12:46 AM - edited 03-14-2019 12:45 AM
Hi, I'm seeing a strange problem. I have two IPCC servers, one primary and one secondary. Server A is the primary, but it keeps failing over, with Server B becoming the primary for no apparent reason. We have to reboot Server B to force Server A to become the primary again.
I'm no expert on IPCC, so if anybody can give me any ideas that would be great.
04-13-2007 01:17 AM
Hi
You'll have to go through the logs in c:\program files\wfavvid\logs to see the cause of the failover.
Search the logs for instances of the word 'exception' around the time of the failover. Some will be cryptic, but some will clue you in to the cause.
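If there are a lot of log files, a small script can do the searching. This is only a rough sketch: the function names are made up for illustration, and the "*.log" file pattern is an assumption, so adjust it to whatever file names you actually see in the logs folder.

```python
from pathlib import Path


def find_keyword_lines(lines, keyword="exception"):
    """Return (line_number, text) pairs for lines containing keyword (case-insensitive)."""
    needle = keyword.lower()
    return [
        (number, line.rstrip("\n"))
        for number, line in enumerate(lines, start=1)
        if needle in line.lower()
    ]


def scan_log_dir(log_dir, keyword="exception", pattern="*.log"):
    """Scan every file matching pattern in log_dir; return {path: hits} for files with hits.

    The "*.log" default is an assumption, not a documented naming scheme.
    """
    results = {}
    for path in sorted(Path(log_dir).glob(pattern)):
        # errors="replace" keeps the scan going if a file has odd bytes in it.
        with open(path, "r", errors="replace") as handle:
            hits = find_keyword_lines(handle, keyword)
        if hits:
            results[str(path)] = hits
    return results
```

Calling scan_log_dir(r"c:\program files\wfavvid\logs") returns the matching lines per file. You still have to compare them by eye against the time of the failover, since the sketch does not try to parse timestamps.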
Regards
Aaron
Please rate helpful posts...
04-13-2007 08:44 AM
We had a similar problem with one of our installations. Actually, it was due to failing DNS lookups (between the two IPCCX nodes).
Double-check that NetBIOS and DNS lookups work: can you ping server1 from server2 by (1) hostname and (2) FQDN?
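A quick way to repeat the lookup part of that check is a few lines of Python run on each node. This is a minimal sketch, not a Cisco tool: check_resolution is a made-up helper name, and the names you pass in (for example a short hostname and an FQDN such as "ipcc-b" and "ipcc-b.example.com") are placeholders for your own servers. It asks the operating system's resolver, so it tells you whether a name resolves but not which mechanism answered, and it does not replace an actual ping.

```python
import socket


def check_resolution(names, resolver=socket.gethostbyname):
    """Map each name to the IPv4 address it resolves to, or None if the lookup fails."""
    results = {}
    for name in names:
        try:
            results[name] = resolver(name)
        except OSError:
            # socket.gaierror and socket.herror are both subclasses of OSError.
            results[name] = None
    return results
```

Run it on server1 with server2's short name and FQDN, then the other way around. Any None in the result is a name that node cannot resolve.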
04-16-2007 12:07 PM
I am encountering the same issue with a customer. There is no apparent reason for the failover, but if you look at the logs you can see the Master and Standby talking and working out which one is going to be the master, and for which services.
In my case not all of the services fail over, just the majority. It is never the same set, but most of the time it's all but 2-3 of the SQL services.
When TAC was contacted we searched through the logs and found that it was a "network error." So we monitored the switch ports, and when the next failover occurred we found that the network connections were fine. We believe that the "network error" is a general error and is due to the server being too taxed to read the information off the wire.
Right now we are running 2 MCS-7825-H1s with 2GB of RAM, so we are going to max them out to 4GB. I will let you know how this works out for us. It might be a few weeks before we find out if it's going to help or not.
Hope this helps.
Travis
04-16-2007 08:27 PM
Hi,
Make sure that the speed and duplex settings match on the server side and the switch side. They should be either Auto on both ends or hard-coded to 100Full on both ends.
All the best.
Regards,
Venkat
04-16-2007 10:16 PM
In my experience TAC are quick to diagnose a 'network error'... IPCCX is susceptible to transient network failures that might go unnoticed with other applications, but you would be advised to look at the logs yourself and make a judgement.
Aaron
04-17-2007 06:58 AM
Along with the speed and duplex, make sure the bindings are in the correct order. Also make sure all server entries are in the hosts file of each server.
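To check the hosts file part without reading it line by line, something like the following could be run on each server. It is a sketch under assumptions: parse_hosts and missing_hosts are hypothetical helper names, and the list of required names is whatever you decide every node should know about. On Windows the file normally lives at %SystemRoot%\System32\drivers\etc\hosts.

```python
def parse_hosts(text):
    """Return {lowercased_hostname: ip} from the text of a hosts file."""
    entries = {}
    for raw_line in text.splitlines():
        # Drop comments, then skip blank lines.
        line = raw_line.split("#", 1)[0].strip()
        if not line:
            continue
        fields = line.split()
        ip, names = fields[0], fields[1:]
        for name in names:
            # Keep the first address seen for a name.
            entries.setdefault(name.lower(), ip)
    return entries


def missing_hosts(text, required):
    """Return the names from required that have no entry in the hosts-file text."""
    entries = parse_hosts(text)
    return [name for name in required if name.lower() not in entries]
```

Read the hosts file into a string, pass it in with the short names and FQDNs of both nodes, and an empty list back means every name has an entry. It only checks that an entry exists, not that the address is the right one.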
04-18-2007 12:31 AM
Thanks for the info, guys. I'm not an IPCC guy, but here is an event I think might be related. Any more ideas would be great.
Event Type: Warning
Event Source: Cisco AVVID Alarm Service
Event Category: None
Event ID: 82
Date: 26/03/2007
Time: 13:14:18
User: N/A
Computer: GBLPLIPCC01
Description:
The description for Event ID ( 82 ) in Source ( Cisco AVVID Alarm Service ) cannot be found. The local computer may not have the necessary registry information or message DLL files to display messages from a remote computer. You may be able to use the /AUXSOURCE= flag to retrieve this description; see Help and Support for details. The following information is part of the event:
05-09-2007 08:48 PM
When you say "network error", are you referring to the following message in Event Viewer?
"the server has encountered a network error"
05-10-2007 05:08 AM
I can't say for sure if the pagefile fixed our problem, but we have not had a failover since I upped it to 4GB. Again, Cisco told me to make the pagefile 200% of physical memory.
Give this a shot and see how it works for you, it can't hurt upping it anyway. Hope this helps.
Travis
06-04-2007 10:27 AM
Just set this, will monitor the logs and post if I see anything change.
06-19-2007 11:37 AM
Well, it looks like my problem is back. After initially setting the page file to 200% of the physical memory we didn't have any issues for 2 1/2 months. All of a sudden the primary failed over twice in the same week. I opened a TAC case and we can see where it loses heartbeats, but we do not know why.
I'll keep you posted.
06-19-2007 12:50 PM
Thanks for the note. Just FYI, upping my paging size didn't seem to do anything.
07-07-2008 09:36 AM
Travis,
Any word from Cisco TAC on this? I am getting the same issue with one of my customers. Please post your comments if TAC was able to provide you with a solution.