We are implementing IM and Presence (2 nodes v. 22.214.171.124900-1) and CUCM (2 nodes v. 126.96.36.19900-28), and everything was working great until we started testing high availability.
Jabber clients are registered on the CUCM Sub, and when we disconnect this server, the clients, configured as softphones, cannot register with the CUCM Pub and also lose connectivity with Unity Connection.
The only things still working are Presence status and Directory Search.
On CUCM, under UC Services and SIP Trunk, I have both IM and Presence servers, and on the Presence side I have both CUCM servers configured under Legacy Client Settings and CCMCIP Profile.
Any help will be appreciated.
Actually, CSF clients are able to register on the Pub node when the Sub is down. Nonetheless, if we close Jabber and open it again afterwards, it doesn't register again. I have already checked cachedPresenceConfigStore.xml and it shows both the CUCM Pub and Sub information.
Does anyone know how to address this issue?
I haven't solved this issue yet.
I tried TAC, with no success so far.
Does anyone have a similar scenario? Is CUCM HA working properly for CSF clients?
I suspect it may be a bug.
Which version of the Jabber client are you using? Have you tried the latest version?
When you hit this issue, run the Jabber Problem Reporting Tool, pull the latest log file from the report, and upload it here. Also mention the time at which the issue occurred.
Hi mkchandak, thank you very much for your reply.
Yes, I'm trying the latest version of the Jabber client. I also tried all the other versions and got the same behavior.
Please find attached the Problem Report I've just collected, and below the timeline of the tests:
14:23 Disconnected CUCM Sub.
14:24 Registered with CUCM Pub. Can make calls normally.
14:26 Logged out of Jabber.
14:27 Logged in. Jabber tried to register with the Sub and of course could not, since this is the server which is down. Afterwards it tried to register with the CUCM Pub.
14:27:30 Under the Connection Status option, Jabber shows the message: "Connection error. Verify if the server information on the Phone Services tab at the Options Window is correct. Please contact your service administrator for assistance". It stays like this indefinitely.
CUCM Pub: 10.150.160.13
CUCM Sub: 10.150.160.14
CUP Pub: 10.150.160.19
Do you have an HA node of CUP as well? The logs show another server, 10.150.160.20. Is this server up and running? What have you configured in Jabber under Presence?
2013-10-16 14:27:04,055 INFO [0x00000794] [sets\adapters\imp\components\Log.cpp(33)] [JabberWerx] [IMPStackCap::Log::log] - [LoginMgr.dll]: login, jabber, serv:10.150.160.20
2013-10-16 14:27:04,055 INFO [0x00000794] [sets\adapters\imp\components\Log.cpp(33)] [JabberWerx] [IMPStackCap::Log::log] - [XmppSDK.dll]: #0, CXmppClient::SignOn
If this server is no longer valid, then:
1) Go to Connection Settings and confirm it is pointing to the right server.
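To see quickly which servers Jabber is actually trying to sign in to, the log from the Problem Report can be scanned for the LoginMgr lines quoted above. This is only a minimal sketch: the regular expression is modelled on those two quoted lines, and the function names (`login_targets`, `summarize`) are made up for illustration. Other Jabber builds may format these lines differently.

```python
import re
from collections import Counter

# Pattern modelled on the LoginMgr line quoted in this thread (an assumption:
# other Jabber versions may log the sign-in target in a different format).
LOGIN_RE = re.compile(
    r"^(?P<ts>\d{4}-\d{2}-\d{2} \d{2}:\d{2}:\d{2}),\d{3} .*"
    r"\[LoginMgr\.dll\]: login, (?P<service>\w+), serv:(?P<server>\S+)"
)


def login_targets(lines):
    """Return a list of (timestamp, server) for every LoginMgr login line."""
    hits = []
    for line in lines:
        match = LOGIN_RE.search(line)
        if match:
            hits.append((match.group("ts"), match.group("server")))
    return hits


def summarize(lines):
    """Count how many login attempts were made against each server."""
    return Counter(server for _, server in login_targets(lines))
```

Feeding it the lines of the log file (for example `summarize(open(path, errors="replace"))`, where `path` is wherever you extracted the report) gives a per-server count, which makes an unexpected address such as 10.150.160.20 easy to spot.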
Oh, yes, we have a CUP HA node. Its IP is 10.150.160.20 and it's up and running fine. I had configured it in Jabber only for testing purposes, and the same problem occurs.
Right now I'm downgrading our CUCM cluster to an earlier version to see if that solves the issue.
Still facing the same problem.
I have been running some additional tests during the day and found a weird situation.
We configured Jabber 9.2.6 on 3 laptops running Windows 7 SP1 in Portuguese-BR and everything works just fine. The same laptop that was failing before is now working. It also worked well running Jabber on a Windows 2008 R2 server and on another server with Windows 2003.
Afterwards, we configured Jabber 9.2.6 on 3 desktops running exactly the same version of Windows 7, with the same language and the same user logged in to Jabber, and, surprisingly, it does not work.
We tried disabling the Windows firewall and the anti-virus, and updated the BIOS and NIC firmware, but it shows the same behaviour we have been struggling with for the past few days.
We also connected a laptop to the same network port the PC was connected to, and Jabber works fine there as well.
Do you have any idea of what could cause this weird behaviour?
To give you further insight, we will need packet captures from both tests so that we can look into the network connectivity to both CMs.
So far the only difference you have been able to identify is laptop versus desktop.
Thank you for your answer.
I took packet captures on the working and the non-working PC running Jabber, and at the same time I collected captures on the CUCM Pub and Sub.
I left a ping to the CUCM Sub running during the tests, so it's easy to see when I killed the connection to it.
Please find the files attached. I've been trying to analyse them but I can't see where the issue is.
Any help will be appreciated! I have no idea of what's going on.
I'm also getting the same behaviour with Jabber version 9.2.4 on a two-node CUCM cluster, version 188.8.131.5200-5, with a setup similar to yours.
TAC did confirm that CTI failover with J4W is currently not supported (documented in enhancement request CSCud18123), so controlling the deskphone during failover is not an option.
I found that when the Sub is down Jabber can't use the softphone until I change its device pool to one that has the Pub listed as the first option (the typical device pool for the devices lists the Sub first).
I'm also having issues with Extension Mobility being very slow to log in/log out, and with conference phones (without EM) not registering at all when the Sub is down, unless I change their device pool as mentioned.
Neither I nor TAC can see anything wrong with the configuration, so it's just a case of collecting logs while simulating another CUCM Sub failover.
If I find anything useful I'll post it here.
Thank you guys for your help.
The problem is solved!
For future reference:
After struggling with this odd issue, I ended up finding that our customer's anti-virus was scanning HTTP traffic and blocking the connection to the second CUCM node. I had already tried disabling it, but even when disabled it could still block traffic. The only thing that worked was uninstalling it completely.
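For anyone chasing a similar laptop-versus-desktop difference, a quick reachability check run from each PC can help narrow things down before digging into captures. The sketch below only tests whether a TCP connection opens. The helper names are made up for illustration, and the port list is an assumption to verify against your own deployment (8443 for HTTPS services such as CCMCIP, 6970 for TFTP over HTTP, 5060 for SIP).

```python
import socket

# Node addresses taken from this thread; ports are assumptions to adjust.
EXAMPLE_NODES = {"CUCM Pub": "10.150.160.13", "CUCM Sub": "10.150.160.14"}
EXAMPLE_PORTS = [8443, 6970, 5060]


def tcp_reachable(host, port, timeout=3.0):
    """Return True if a TCP connection to host:port opens within timeout."""
    try:
        with socket.create_connection((host, port), timeout=timeout):
            return True
    except OSError:  # refused, timed out, unreachable, name not resolved
        return False


def check_nodes(nodes, ports, timeout=3.0):
    """Return {(node_name, port): reachable} for every node/port pair."""
    return {
        (name, port): tcp_reachable(host, port, timeout)
        for name, host in nodes.items()
        for port in ports
    }
```

Running `check_nodes(EXAMPLE_NODES, EXAMPLE_PORTS)` on a working and a non-working machine and comparing the results shows whether one node is being blocked at the connection level. Note that a successful connect does not rule out a product that inspects and drops the HTTP traffic itself after the connection is open, so a clean result here is not conclusive.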
Interesting! I played around with the Station KeepAlive timers in Service Parameters, reducing them to 10 s each, and got Jabber to register correctly when the Sub was down. It also helped with Extension Mobility login/logout times.