Does anyone experience issues with their Unity servers constantly losing communication with global catalog and domain controller servers? Seven or eight times a day, the application event log shows that my Unity server lost communication with DC1 and is now trying to talk to DC2, for example. The GCs and DCs never appear to be unavailable to any other server. If an external call comes in while the Unity server is having a hard time communicating, the message gets tossed into the failed folder. We have no bandwidth issues during these times. Thanks,
Thanks for the response. Below is the event viewer message. This one in particular is kind of odd. You can see it couldn't find cdc-dcp1.companyname.inc, so instead it found cdc-dcp1.companyname.inc (the same one it couldn't find???). It's not always like this. Typically, you see that it couldn't talk with cdc-dcp1.companyname.inc and then it found cdc-dcp3.companyname.inc, for example. I also enabled the trace like you mentioned. I assume you want to see the diag_AvDSGlobalCatalog_20060307_212745 log. However, this log wasn't created until 11:27pm. I did see the error (the one I sent below) happen at 8:42pm. I don't see the 8:42 time frame in either of the 2 diag_AvDSGlobalCatalog log files I see. Is there another log you'd be interested in?
Event Type: Warning
Event Source: CiscoUnity_MALEx
Event Category: Warning
Event ID: 30019
Time: 8:42:59 PM
The MAPI subsystem has indicated that the Global Catalog Server
cdc-dcp1.companyname.inc which is used to resolve addresses for message submission cannot be reached, and that it has switched to using Global Catalog server cdc-dcp1.companyname.inc. Unity will continue to function using this newly selected Global Catalog server and will not automatically switch back to the original one. If Unity does not have a dedicated connection with sufficient bandwidth to the newly selected server, then there may be significant delays in Exchange access by Unity. Please verify that Unity has a good connection to the new Global Catalog for proper functioning.
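For anyone sifting through a lot of these warnings, here is a minimal Python sketch that pulls the "old" and "new" Global Catalog names out of the 30019 message text and flags the odd case where both are the same server. The pattern is based only on the wording quoted above; the function name `parse_failover` is made up for illustration, and other Unity versions may word the message differently.

```python
import re

# Pattern for the body of a CiscoUnity_MALEx event 30019 as quoted above.
# \s+ is used between words because the event text may wrap across lines.
FAILOVER_RE = re.compile(
    r"Global\s+Catalog\s+Server\s+(?P<old>\S+)\s+which\s+is\s+used"
    r".*?switched\s+to\s+using\s+Global\s+Catalog\s+server\s+(?P<new>[^\s,]+)",
    re.IGNORECASE | re.DOTALL,
)


def parse_failover(message):
    """Return (old_gc, new_gc, same_server), or None if the text does not match.

    Hypothetical helper: it only understands the message wording quoted
    in this thread.
    """
    match = FAILOVER_RE.search(message)
    if match is None:
        return None
    # Drop the sentence-ending period and normalize case before comparing.
    old = match.group("old").rstrip(".").lower()
    new = match.group("new").rstrip(".").lower()
    return old, new, old == new
```

Feeding it the text of each exported warning gives a quick count of how often the "failover" lands on the very server that was reported unreachable.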
Yes, I get these too. Seems like out of the blue Unity can't communicate with one of my global catalogs and then it comes up with the same message about not being able to connect to DC1 and has switched to DC1!
I'll be watching this thread!
Do you get voice mails placed in the failed folder? I have been out of the office and unable to provide any more info, but I will be getting those logs out ASAP.
Not in the failed folder, but I do occasionally get them in the unitymta folder. All are from outside callers (non-subs). When I restart the AvUMRSyncSvr they are delivered.
It's Sunday evening and I just checked the Unity server's application log. I have 4 MALEx warnings in the past 7 hours. There is no one on campus today and the network is not even being used, yet I get these errors.
The ONLY thing I can think of is that the domain controllers are on a different network, separated by a Cisco PIX firewall. I haven't received a straight answer from the firewall guy about what ports are open. The document in a message above describing what ports are needed will be helpful.
Interesting thought... I know our DCs are on a separate network and we have Cisco ASA appliances in between the Unity servers and the DCs. The ASA devices are fairly new and they are supposed to be in a "monitor" mode only. The IPS functionality is not turned on yet. I will see if I can gather some logs that might help. I know the ASAs caused issues for our mail flow when they were in that "monitor" mode as well. Maybe there's more to it.
One other thought that crossed my mind, though I don't know how much this would impact Unity and AD: I seem to get most if not all of these errors on the weekends.
Our enterprise backup runs weekly backups on the weekends pretty much all day. During the week we have nightly runs on the backup. Everything is connected with fiber and our network is very underutilized even with the backups running.
Just seems odd that I get these errors during the big backup runs. FWIW - Unity is not being backed up (don't ask why) so there would be no slowness there. The backups are all behind the firewall so if it were a bottleneck it wouldn't be there and I doubt our core switch would be the bottleneck either.
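To check the weekend hunch with numbers rather than by eye, something like the following rough Python sketch could tally warning timestamps by weekday. It assumes the timestamps have already been exported from the application log as `datetime` values; the function name and the sample dates in the usage note are made up for illustration and are not taken from a real log.

```python
from collections import Counter
from datetime import datetime

WEEKDAY_NAMES = ["Monday", "Tuesday", "Wednesday", "Thursday",
                 "Friday", "Saturday", "Sunday"]


def warnings_by_weekday(timestamps):
    """Count how many warning timestamps fall on each weekday.

    `timestamps` is any iterable of datetime objects (hypothetical input,
    e.g. exported MALEx warning times).
    """
    # datetime.weekday() returns 0 for Monday through 6 for Sunday.
    return Counter(WEEKDAY_NAMES[ts.weekday()] for ts in timestamps)
```

For example, `warnings_by_weekday([datetime(2006, 3, 4, 14, 0), datetime(2006, 3, 5, 20, 42)])` counts one Saturday and one Sunday entry. Comparing the tally against the backup schedule would show whether the pattern really follows the weekly backup runs.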
Can you confirm that there is nothing blocking communication between Unity and Exchange/GC? Please review this doc:
Securing TCP/UDP Ports
I will need to look at the rest of the app and sys logs along with the DsGC logs.
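As a rough first check on the firewall question, a small Python sketch like the one below could be run from a machine on the Unity side to see whether TCP connections to a GC/DC succeed. The port list is an assumption based on commonly documented Active Directory ports, not copied from the Cisco document, so treat the Cisco port list as authoritative. It does not cover UDP or the dynamic RPC port range, and a successful connect does not prove the firewall keeps long-lived sessions open.

```python
import socket

# Assumed list of commonly documented Active Directory TCP ports.
# This is NOT taken from the Cisco "Securing TCP/UDP Ports" document.
AD_TCP_PORTS = {
    "DNS": 53,
    "Kerberos": 88,
    "RPC endpoint mapper": 135,
    "LDAP": 389,
    "SMB": 445,
    "Global Catalog LDAP": 3268,
}


def check_tcp_ports(host, ports, timeout=3.0):
    """Try a TCP connect to each port and return {name: True/False}."""
    results = {}
    for name, port in ports.items():
        try:
            # create_connection raises OSError on refusal, timeout, or DNS failure.
            with socket.create_connection((host, port), timeout=timeout):
                results[name] = True
        except OSError:
            results[name] = False
    return results
```

Calling `check_tcp_ports("cdc-dcp1.companyname.inc", AD_TCP_PORTS)` would return a dictionary showing which of the assumed ports accepted a connection.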
Let me explain how this works. For the MALEx messages getting logged, Unity is actually not involved in the reconnection. We are basically passing the message on. It's purely implemented in Microsoft Exchange code. This blog talks about it a bit:
So what happens is that Unity goes to submit a message and MAPI returns the error code MAPI_E_END_OF_SESSION or MAPI_E_USER_CANCEL back to Unity. If Unity gets MAPI_E_USER_CANCEL we know a reconnect is in progress and it will log it to the event log. Once the referral is complete Unity should get MAPI_E_END_OF_SESSION from Exchange. At that time Unity will refresh its MAPI profile, log that a reconnect has occurred and start working again.
There have been a few customers lately where Unity was unable to recover in either the UMR or AvCsMgr after a reconnect. In each case one process would recover while the other would not. Mind you, this is different than just losing the connection; it's what happens after the reconnect. It's documented in CSCsd17381:
The release note has a lot of detail, but basically the Microsoft code starts returning an incorrect error message back to Unity. Since Unity isn't told a reconnect happened, it doesn't hit its reconnect logic and the profile refresh is never triggered. At this point the problematic Unity service needs to be restarted.
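To make the sequence concrete, here is a simplified Python model of the handling described above. It is only an illustration of the logic as explained in this thread, not Unity or Exchange source code; the class name, the `OTHER_ERROR` value, and the log strings are all hypothetical, and only the two MAPI error names come from the explanation.

```python
from enum import Enum, auto


class MapiResult(Enum):
    OK = auto()
    MAPI_E_USER_CANCEL = auto()      # reconnect (referral) in progress
    MAPI_E_END_OF_SESSION = auto()   # referral complete
    OTHER_ERROR = auto()             # hypothetical stand-in for an incorrect code


class SubmitterSketch:
    """Hypothetical model of the reconnect handling; not real Unity code."""

    def __init__(self):
        self.reconnect_in_progress = False
        self.profile_refreshes = 0
        self.log = []

    def handle(self, result):
        """Process one MAPI result; return True while a reconnect is pending."""
        if result is MapiResult.MAPI_E_USER_CANCEL:
            # A reconnect is in progress, so note it in the log.
            self.reconnect_in_progress = True
            self.log.append("reconnect in progress")
        elif result is MapiResult.MAPI_E_END_OF_SESSION:
            # Referral finished: refresh the profile and resume work.
            self.reconnect_in_progress = False
            self.profile_refreshes += 1
            self.log.append("reconnect occurred, profile refreshed")
        elif result is MapiResult.OTHER_ERROR:
            # An unexpected code never reaches the reconnect logic.
            self.log.append("unexpected error, reconnect logic not run")
        return self.reconnect_in_progress
```

In this model, `MAPI_E_USER_CANCEL` followed by `MAPI_E_END_OF_SESSION` ends with one profile refresh. If an unexpected code arrives in place of `MAPI_E_END_OF_SESSION`, the refresh count stays at zero, which mirrors the stuck state where the service has to be restarted.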
If you have seen CSCsd17381, where Unity doesn't completely recover after a reconnect, please provide the version of the Exchange System Manager you have installed on Unity. You can check the c:\winnt(windows)\emsmdb32.dll file.
But keep in mind this is all after a reconnect. I am not aware of any problem with Microsoft's code where it loses connection for no good reason. It might be possible but I haven't seen it.