Cisco Support Community
Ask the Expert: Troubleshooting Unified Contact Center Enterprise

With Goran Selthofer

Welcome to the Cisco Support Community Ask the Expert conversation. This is an opportunity to learn and ask questions about integrating Unified Contact Center Enterprise into your environment and troubleshooting the many features that are available with the Unified Contact Center Enterprise solution.

Cisco Unified Contact Center Enterprise delivers intelligent contact routing, call treatment, network-to-desktop computer telephony integration (CTI), and multichannel contact management over an IP infrastructure. It combines multichannel automatic call distributor (ACD) functionality with IP telephony in a unified solution. This makes it easier for your company to rapidly deploy a distributed contact center infrastructure.

Goran Selthofer is a team lead for the Cisco TAC EMEAR Contact Center team based in Brussels. He has supported UCCE, UCCX, CVP, and UCCE applications for the past seven years within the Cisco TAC. He has more than 13 years of overall experience in the industry, with broad experience in Cisco Unified Communications infrastructure solutions, having also worked for a Cisco Gold Partner prior to joining Cisco TAC. Goran also provides internal training to TAC engineers on Contact Center topics. He graduated with a master's degree from the Technical Military Academy - Belgrade University. He also holds a CCIE certification (number 27211) in voice as well as VMware Certified Professional certifications.

Remember to use the rating system to let Goran know if you have received an adequate response. 

Goran might not be able to answer each question due to the volume expected during this event. Remember that you can continue the conversation in Collaboration, Voice and Video community,  sub-community, Contact Center discussion forum shortly after the event. This event lasts through February 14, 2014. Visit this forum often to view responses to your questions and the questions of other community members.

46 REPLIES
New Member

Ask the Expert: Troubleshooting Unified Contact Center Enterpri

Hi Goran,

Thank you for covering this topic. My question is which logs do I need to check if I have issues with UCCE calls routing?

Thanks.

Jackson

Re: Ask the Expert: Troubleshooting Unified Contact Center Ente

Hi Jackson!

First of all I want to thank you for participating!

The most important thing to know first is THE CALL FLOW!

Knowing your call flow in detail will reveal all the nodes and processes which you should or can troubleshoot in the logs.

Now, basically, there are different types of nodes: Central Control, Peripheral Gateways, different peripherals and CTI services (server and desktops). Each of those has its own specifics for setting and collecting traces.

Therefore, we have published the following Tech Note to help partners/customers with setting and collecting logs:

http://www.cisco.com/en/US/products/sw/custcosw/ps1844/products_tech_note09186a0080c177d7.shtml

More details on that, and much more on serviceability, are given in the following guides:

http://www.cisco.com/en/US/products/sw/custcosw/ps1844/products_installation_and_configuration_guides_list.html

Please let me know if this is sufficient for you!

Once again, thanks for participating!

Goran

New Member

Ask the Expert: Troubleshooting Unified Contact Center Enterpri

Hi Goran,

What is the best/proper way to troubleshoot replication issue between Rogger/Logger A & B?

Is there an easier way to monitor the communication between A & B?

Thanks!

-JT-

Re: Ask the Expert: Troubleshooting Unified Contact Center Ente

Hi JT!

Thank you for the question!

…and it is a very good one, so it requires a very long answer

OK, so the usual confusion on this topic comes from the fact that users often think this should be similar to Microsoft SQL Replication. Thus, users expect to see something like a GUI or a visual presentation of that replication.

However, MSSQL REPL is not used here. Therefore, we need to understand architecture before we can think of ‘monitoring’ it. Also, to be very clear from the beginning, there is no ‘easy way’ of ‘monitoring’ it as there is no ‘tool’ for that.

Now, first, we need to separate Router from Logger, because they get their data in different ways. They therefore sync it in different ways too, and that is why they have to be observed separately.

Routers have MDS (Message Delivery Service process). Loggers do not have that process. However, Loggers use MDS of ‘same side’ Router. Logger on one side will never talk with Router on another side.

MDS is a sync zone, meaning every bit of data which comes to the Router on one side is replicated through MDS to the Router on the other side. UCCE architecture uses two types of networks, and MDS uses the PRIVATE network for that communication. It is a very active process, since the Routers sync their MEMORY. Therefore, it needs to be a perfect sync.

However, the Router also commits the data it gets to the local DB, and since the Router doesn’t have a DB, that means it commits the data to the same-side Logger’s DB. That is how the Logger gets data. So, a data bit which came from PGA to RTR (not relevant how at this point) ends up first in MDS on the Router A side (assuming PGA has an active link with RTRA), which replicates it to Router B, so it ends up in both Routers’ memory. Now, EACH Router commits that to its own respective Logger.

Bottom line here is the following:

  • Routers sync their memory, and that cannot be easily ‘monitored’. However, the RTR processes are designed so that if there is a difference they will certainly complain, up to the point where one side's process will not even start, or will restart, if it cannot get in sync. So:
    • MDS is a good starting point to check if something goes wrong, as it reports process or peer disconnects.
    • The RTTEST tool can also be used to check whether any failover happened, when it happened, and from which side the sync was done.
  • Loggers get their data from their respective Routers, but Loggers can also ‘sync directly’. This kind of sync is done over a socket connection by the RECOVERY (RCV) process, and it can be monitored through the RCV logs (in a basic way: are there any errors or unusual behavior, or not). So:
    • Check the RCV process logs to see whether everything is healthy on that side.
    • Use the ICMDBA tool to quickly see whether replication of new data is happening, by monitoring Max date (the Space Used Summary option in the Data menu when the Logger DB is selected).

Maybe not the answer you hoped for, but I have tried to give an overall perspective for other users reading this later as well…

Thanks,

Goran

New Member

Ask the Expert: Troubleshooting Unified Contact Center Enterpri

Hi Goran,

Great explanation!

In the same line, I'm trying to get some clarification in regards to the automated truncation process in both UCCE and CVP database

a) there are default retention days for certain UCCE tables (some 14, 100, 1095, etc)

b) if not mistaken CVP is also 1095 days

My questions

a) Will ICMDBA start to auto truncate the tables once the threshold has been passed? (80%). How will it select which tables/data need to be truncated first?

b) If it's compulsory for me to keep all data for at least 1096 days, can those retention periods be changed to reflect that? Depending on SQL DB & disk space, of course

c) Can we disable this automated truncation process?

d) How does it work in the CVP report server?

Thanks!

-JT-

Re: Ask the Expert: Troubleshooting Unified Contact Center Ente

Hi JT,

Thanks!

Again very interesting questions!

Ok, so I will have to limit to UCCE side in this answer and leave CVP for other people in other sessions.

But I think CVP part is already well described in the CVP SRND/Guides.

Here is the story about PURGING in UCCE.

There are 2 categories with total of 3 types of PURGE which can happen from UCCE point of view:

Category 1: Scheduled Purge
--------------------------------
1. Daily scheduled purge

based on these RETENTION parameters:
HKEY_LOCAL_MACHINE\SOFTWARE\Cisco Systems, Inc.\ICM\<instance>\Distributor\RealTimeDistributor\CurrentVersion\Recovery\CurrentVersion\Purge\Retain

Tables are usually purged at 00:30 every day, controlled by this parameter:
HKEY_LOCAL_MACHINE\SOFTWARE\Cisco Systems, Inc.\ICM\<instance>\Distributor\RealTimeDistributor\CurrentVersion\Recovery\CurrentVersion\Purge\Schedule\Schedule


Category 2: Emergency Purge
--------------------------------
There are 2 parameters to control this under this path:
HKEY_LOCAL_MACHINE\SOFTWARE\Cisco Systems, Inc.\ICM\cim\Distributor\RealTimeDistributor\CurrentVersion\Recovery\CurrentVersion\Configuration\Purge\Automatic


1. AdjustmentPercentage Purge on 80%
2. PercentFull on 90%

Both are set to purge 1% when DB reaches respectively 80% or 90%.

WARNING: ABOVE REGISTRY KEYS SHOULD NOT BE CHANGED!!! DOING THIS WILL MAKE YOUR SYSTEM UNSUPPORTED!

The reason for this is very simple: ICM processes are in charge of filling data into the DB, hence ICM needs to keep the DB under 80% in order to absorb data bursts while at the same time ensuring proper performance of the processes interacting with the DB.


Now, how the PURGE is done is very simple: purge the oldest data first, but start from the tables whose names begin with the letter A.
So, usually it will be the oldest data in the Agent tables that is purged first.

Here is the drawback of that approach:
Since it is set to purge 1% of the data, just enough to bring the DB under 80% usage (so to 79%), the purge stops as soon as removing a certain number of rows from the Agent tables has dropped DB usage to 79%. However, if data is still coming into the DB and pushes it to 80% again, the PURGE is triggered again with the same logic: start from A and purge 1%. So, if your DB is bouncing between 80% and 79%, it can easily happen that your Agent_ tables are purged completely, making reporting that depends on those tables impossible.

The 90% purge works the same way; however, when the DB reaches 90%, no new data will be allowed into it.
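
To make the effect described above concrete, here is a minimal Python sketch. It is a toy model and not the actual ICM code: the function name `emergency_purge`, the row counts and the capacity are hypothetical, every row is assumed to take the same space, and row age is not tracked (the real purge removes the oldest rows of a table first).

```python
def emergency_purge(tables, capacity, threshold_pct=80, step_pct=1):
    """Toy model of the emergency purge: walk tables in alphabetical order
    and remove rows until usage is back at (threshold - step) percent.

    tables:   dict of table name -> row count (hypothetical sizes)
    capacity: total number of rows the DB can hold (hypothetical)
    Returns a dict of table name -> rows removed in this purge.
    """
    used = sum(tables.values())
    removed = {}
    if used < capacity * threshold_pct // 100:
        return removed  # below the threshold, nothing is purged
    target = capacity * (threshold_pct - step_pct) // 100
    for name in sorted(tables):  # tables starting with 'A' come first
        if used <= target:
            break
        take = min(tables[name], used - target)
        tables[name] -= take
        used -= take
        removed[name] = take
    return removed


# Hypothetical DB sitting exactly at 80% of a 1000-row capacity.
tables = {
    "Agent_Skill_Group_Interval": 30,
    "Route_Call_Detail": 400,
    "Termination_Call_Detail": 370,
}
purges = []
for _ in range(3):
    purges.append(emergency_purge(tables, capacity=1000))
    tables["Termination_Call_Detail"] += 10  # new data pushes the DB back to 80%

print(tables)
```

In this toy run every purge takes its rows from the Agent table only, so after three rounds that table is empty while the call detail tables have not been touched.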

So, you can argue with this but you have to keep in mind one SIMPLE RULE:

This is an EMERGENCY action.

Your task as system admin or system architect is to design the system in such a way as to AVOID reaching an 80% full DB at any time.


So, the answer to your a) question is above. Keep in mind that it is not the ICMDBA tool that is doing this but the code itself.
The answer to your b) question is also above (the keys for retention). Of course, you should use the ICMDBA tool here: use its Estimate option to size your DB based on the required retention periods, and then ensure you already have that disk space before increasing retention times.
Answer for c) - NO. Definitely NO!

Thanks,
Goran


New Member

Re: Ask the Expert: Troubleshooting Unified Contact Center Ente

Hi Goran,

That clarifies several doubts

To confirm

a) There is no dependency between scheduled & emergency purge, i.e. tables with data past the retention period will still be purged regardless of how full/empty the database is?

b) Is there a reference/link/doc that states all the current default retention period?

c) What is the trend seen for financial customers in relation to the retention period? Higher retention for interval/halfhour tables & lower retention for detail/event based tables?

d) If data grows faster than the initial calculation, I would still be able to expand the database (subject to disk space availability)? Is this link also applicable for the current version?

http://www.cisco.com/en/US/products/sw/custcosw/ps1001/products_tech_note09186a0080094927.shtml

Thanks!

-JT-

Re: Ask the Expert: Troubleshooting Unified Contact Center Ente

Glad to hear that JT!

to answer your questions:

a) There is no dependency between scheduled & emergency purge, i.e. tables with data past the retention period will still be purged regardless of how full/empty the database is?

Answer: CORRECT

b) Is there a reference/link/doc that states all the current default retention period?

You can find it in Admin Guide:

http://www.cisco.com/en/US/docs/voice_ip_comm/cust_contact/contact_center/ipcc_enterprise/ippcenterprise10_0_1/maintenance/UCCE_BK_AD83C810_00_administration-guide-cce-and-hosted.pdf

I don't think HDS is mentioned there but for 'All Other Historical Tables' in HDS I believe it is 1095 days.

c) What is the trend seen for financial customers in relation to the retention period? Higher retention for interval/halfhour tables & lower retention for detail/event based tables?

Answer: CORRECT. However, mind that there are also some different rules forced by law in some countries telling how long data should be kept.

d) If data grows faster than the initial calculation, I would still be able to expand the database (subject to disk space availability)? Is this link also applicable for the current version?

http://www.cisco.com/en/US/products/sw/custcosw/ps1001/products_tech_note09186a0080094927.shtml

Answer: CORRECT and CORRECT.

However, PLEASE do not take that as 'a primary line of success', meaning: 'I will just set what I think is good for now, as we can expand it later anyhow'. That decision might cost your customer some data loss, since by the time you are engaged again to expand it, there has almost certainly been a problem already and data has started to drop.

As it is, we probably have at least one TAC CASE opened every day asking 'where is my data'. This is because the retention periods were not properly estimated against the DB size during deployment. So, the customer wanted to retain 3 years of data and the retention periods were set according to that WISH, and that is nothing more than a WISH. For it to become reality, the DB size also needs to follow that WISH. Well, the DB size was left at 40 GB, and then 'suddenly' everyone is wondering 'why am I losing data when I have configured a retention period of 3 years?'
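
As a rough illustration of the gap between the WISH and the DB size, here is a back-of-the-envelope Python sketch. The row volume and row size are purely hypothetical numbers, the function `required_db_gb` is made up for this example, and it ignores indexes and other overhead; real sizing should come from the Estimate option in ICMDBA.

```python
def required_db_gb(rows_per_day, bytes_per_row, retention_days):
    """Very rough size of one table's data for a given retention period, in GB."""
    return rows_per_day * bytes_per_row * retention_days / 1024 ** 3


# Hypothetical load: 200,000 detail rows per day at about 500 bytes each,
# kept for 3 years (1095 days), compared with a 40 GB database.
need = required_db_gb(rows_per_day=200_000, bytes_per_row=500, retention_days=1095)
fits = need <= 40
print(f"about {need:.0f} GB needed, fits in 40 GB: {fits}")
```

With these assumed numbers the data alone needs roughly 100 GB, so a 40 GB DB would start purging long before the 3-year retention period is reached.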

I hope I have given you a clue - why is that

Also, if reporting is so important to customer, we do recommend HDS on both sides and regular backups and DB maintenance.

Cheers!

Goran

New Member

Ask the Expert: Troubleshooting Unified Contact Center Enterpri

Hi Goran,

In relation to log reading that Cisco TAC does, is there a reference/list of the common errors that will appear in the respective processes.

For example

a) Connection to Central Controller side A failed

b) Connection to Central Controller side B failed

c) Connectivity with duplexed partner has been lost due to a failure of the private network, or duplexed partner is out of service

d) others

Other logs typically have certain key identifier if that particular log is just info, warning, error, fatal, etc Something like this will definitely speed up the troubleshooting process.

Thanks!

-JT-

Re: Ask the Expert: Troubleshooting Unified Contact Center Ente

Hi JT!

Good questions!

However, the last thing Cisco TAC does when reading logs is use some ‘magic cheat sheet’ to decode all traces.

No, we simply work from experience and read traces knowing or getting ‘good’ examples, or simply looking for the ‘error’, ‘exception’, ‘fail’, and ‘timeout’ keywords and taking it from there.

It is a long and hard process to read logs, and the more you do it the more it starts to make sense, like in The Matrix.
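
The keyword approach mentioned above can be sketched in a few lines of Python. This is only a first-pass filter and not a TAC tool: the function name `suspicious_lines` is made up for this example, and the sample lines are taken from elsewhere in this thread.

```python
import re

# The four keywords named above, matched case-insensitively.
KEYWORDS = re.compile(r"error|exception|fail|timeout", re.IGNORECASE)


def suspicious_lines(lines):
    """Yield (line_number, line) for every line containing one of the keywords."""
    for number, line in enumerate(lines, start=1):
        if KEYWORDS.search(line):
            yield number, line.rstrip("\n")


sample = [
    "08:56:30 pim1 Trace: Application->PG:",
    "Failed to update the database.",
    "09:56:30 pim1 Trace: MR_Peripheral::On_Router_DialogFail:",
]
hits = list(suspicious_lines(sample))
for number, line in hits:
    print(number, line)
```

For a real log you would pass an open file object instead of the `sample` list. The hits only tell you where to start reading; you still need the call flow and a 'good' example to judge them.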

So, bottom line: No, there is no reference/list of common ‘process’ errors except for what is already published for maybe Router here:

http://docwiki.cisco.com/wiki/Router_Error_Codes

Also, as described in another post in this thread, you can use the checke tool to map a peripheral error code to its description:

  • Peripheral Error Code Descriptions

A quick way to obtain the description for UCCE Peripheral Error Codes is to log onto a UCCE system, open a command prompt, navigate to the C:\icm\bin directory and run "checke <error code>", where <error code> is the peripheral error code that you have identified. In this example we would use c:\icm\bin>checke 12005

Now, although most of the processes are not complete from a serviceability point of view when it comes to documenting or listing all possible errors, the intention of the BU is to write as much detail as possible directly into the logs, to give more clues about what is happening.

Examples of Error messages in logs:

Failed to update the database.

The Update succeeded at the controller but was not propagated back to the Distributor.

Check the status of UpdateAW on the Distributor.

Or:

Failed to update the database.

Another user has changed the configuration data. Re-retrieve the data and try save again.

If the problem persists, you need to reload your local database. You can do this using

the Initialize Local Database tool.

Thanks,

Goran

Cisco TAC

Ask the Expert: Troubleshooting Unified Contact Center Enterpri

Goran,

Troubleshooting Finesse issues seems to be a huge pain, do you have any tips for that?  For example, sudden logout errors, sudden failover messages, etc.

Thank you.

Re: Ask the Expert: Troubleshooting Unified Contact Center Ente

Hi David!

Thank you for being part of the event!

Indeed! Finesse can be a bit tricky as it is a fairly new product...compared to CTIOS or CAD.

However, luckily we have our internal engineers in CAP, BU and TAC, who are creating more and more use cases about Finesse in internal and external knowledge databases.

As a result, the following pages have been made externally visible in the same way as we have them internally:

http://docwiki.cisco.com/wiki/Troubleshooting_Cisco_Finesse

I definitely advise that you check those!

Examples:

Problem Solving process:

http://docwiki.cisco.com/wiki/Additional_troubleshooting_information_for_Cisco_Finesse_8.5

Client Error: Client requests constantly result in "503 Service Unavailable" Error:

http://docwiki.cisco.com/wiki/Client_Error:_Client_requests_constantly_result_in_%22503_Service_Unavailable%22_Error

Replication issues:

http://docwiki.cisco.com/wiki/Replication:_Check_status_and_fix_replication_errors

Or, here is a useful tip which you might not find there yet:

How to check the Health of your Finesse Server

The SystemInfo API doesn't require authentication and will provide you with either an "IN_SERVICE" or "OUT_OF_SERVICE" status

Point your browser to the following URL, substituting your Finesse server IP address:

     http://XXX.XXX.XXX.XXX/finesse/api/SystemInfo

The status will only show IN_SERVICE when all Finesse components are online.
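
If you want to check this from a script rather than a browser, here is a minimal Python sketch. It assumes only what is stated above: no authentication is needed and the response contains either IN_SERVICE or OUT_OF_SERVICE. The function names are made up for this example, and the rest of the response format is deliberately not parsed.

```python
from urllib.request import urlopen


def parse_status(body):
    """Return the status keyword found in a SystemInfo response body, or None."""
    for status in ("OUT_OF_SERVICE", "IN_SERVICE"):
        if status in body:
            return status
    return None


def finesse_status(host, timeout=5):
    """Fetch http://<host>/finesse/api/SystemInfo and return the status keyword."""
    url = f"http://{host}/finesse/api/SystemInfo"
    with urlopen(url, timeout=timeout) as response:
        return parse_status(response.read().decode("utf-8", errors="replace"))
```

Calling `finesse_status("XXX.XXX.XXX.XXX")` with your own server address should return one of the two keywords; any connection error is raised to the caller, which for a health check is usually what you want to see.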

Not to repeat links, I will also post in the next reply below the link for troubleshooting Finesse Agent Login Trace with the Use of Logs since below question is more specific to that part...

I hope I have given you a clue but if anything else is needed, you know where to find us

Thanks,

Goran

Ask the Expert: Troubleshooting Unified Contact Center Enterpri

Goran,

Going back to the Finesse issue(s). I'm experiencing an issue where phone book changes aren't reflected on the agent desktop. Any thoughts on how to troubleshoot this?

Thank you.

Re: Ask the Expert: Troubleshooting Unified Contact Center Ente

David,

Unfortunately, the Finesse version was not shared, nor were other details, e.g. whether the phone book changes fail to show up consistently on both servers (in a duplex server deployment) or only on one, whether it affects all changes from a certain time, all the time, or only intermittently, etc.

In the post above I shared the Problem Solving process, which contains lots of questions that might help to isolate the issue.

Being asked to answer them is probably not very popular, as some think it is a waste of time, but this is how TAC resolves more than 65% of cases, believe it or not.

Those questions actually come from the well-known Kepner-Tregoe Problem Analysis methodology and are used in troubleshooting different issues, not only in IT. Every Cisco TAC engineer is required to pass KT training in order to be able to use it.

OK, so back to the issue: I will assume you are not on the 10.0 release, so it might be that you are hitting a known issue:

CSCul20619    CCE and CCX: PhoneBook update not shown on desktop after DB restart

There are currently some issues with seeing this defect from outside, but it is marked as external so it will be visible in the future. Anyway, the workaround is to restart Cisco Tomcat.

Please check if that resolves it for you and let me know. (Note: restart Cisco Tomcat outside production hours.)

Now, if you want to troubleshoot Finesse for that issue then here is what usually you do for logs:

Substitute your primary Finesse server IP Address in this url for collecting the logs.

http://XXX.XXX.XXX.XXX/finesse/logs/webservices/


Capture the Web Services logs, but in this sequence:

All on the primary Finesse server:

1)      Agent logs out of Finesse

2)      Stop Tomcat Service

3)      Start Tomcat Service

4)      Make some changes to the phone book (note what exactly)

5)      Agent logs into Finesse

6)      Agent attempts to make a call, and the options window opens showing the available phone books.

7)      Collect the Web Services logs from the time Tomcat is restarted until just after the attempt to make a call, when the missing or incomplete phone books are observed.


Be careful, this is service impacting, so do it after hours. Also note, Tomcat restart might resolve the issue as well as mentioned above so you might not be able to reproduce it.


How to collect the Error and Desktop logs for review:

1.  When the agent sees the issue on the desktop, have the agent hit "Send Error Report" on the desktop. This will send the client side logs to the Finesse server.
2.  Use the CLI command to collect all Finesse logs - file get activelog desktop recurs compress
3.  Collect CTI server logs from the time of the Finesse Tomcat restart to the time the agent sees the issue on the desktop. (Healthcheck)

I hope this helps!

Have a great weekend!

Thanks,

Goran

New Member

Ask the Expert: Troubleshooting Unified Contact Center Enterpri

Goran,

Thank you for initiating this discussion. I would like some help with advanced troubleshooting, such as how to read the logs once they are collected. Is there any tool available for partners to do that?

Also, when working with Agent State Trace, what exactly happens at the logging level?

Regards

Chitrangad Pathak

Re: Ask the Expert: Troubleshooting Unified Contact Center Ente

Hello Chitrangad!

Nice to see you here, thanks for posting the question!

Well, honestly, there is no 'tool' which is used by TAC to read UCCE traces. Far from it. We use text editors with coloring schemes when reading logs, and that is a long manual process.

One exception is a basic Call Flow tool which is distributed with CVP software. That one is promising, but currently it is still not widely used as it requires pre-created log templates and there are only a few so far. However, it works very well for CVP SIP tracing.

However, back to UCCE logs: we are not using any special tools, and that is why TAC would generally ask you to provide as much info as possible (like ANI, time stamps, Agent ID, Extension, etc.) in order to analyze logs, as it is not enough just to send logs to TAC. So, unfortunately, there are no magic buttons or crystal balls (YET!) which TAC or partners and customers can use when reading logs.

With that being said, foundation for reading logs is to really understand processes and tasks, to know the exact call flow and expected behavior and to gather as much info as possible about BAD but also GOOD examples.

Now, regarding the second part of your question: to make it more current by bringing Finesse into the same story, and also as I promised David in the post above, I invite you to read a great example written by my good friend and colleague Linda, who has created the following:

Finesse Agent Login Trace with the Use of Logs

http://www.cisco.com/en/US/products/ps11324/products_tech_note09186a0080c14a55.shtml

I think this should give you a pretty good overview of what is happening there...

Otherwise, there is also another example from her which is yet to be published and it is about:

How to Identify CTI Server errors in Finesse Logs

Introduction

This document will show you how to quickly identify CTI Server peripheral errors in the Finesse Logs.

Prerequisites

Requirements

Cisco recommends that you have knowledge of Cisco Finesse,  Voice Operating System (VOS) CLI command prompt and UCCE CTI Server messages.

Components Used

The information in this document is based on Cisco Finesse Version 9.1(1).

The information in this document was created from the devices in a specific lab environment. All of the devices used in this document started with a cleared (default) configuration. If your network is live, make sure that you understand the potential impact of any command.

Finesse Error-Desktop-webservices Log File

First, locate any CTI Server errors in the Error-Desktop-webservices log.  This log will help you identify whether the Finesse Server is receiving error messages from CTI Server.  In this log we see the following message at 14:43:50.838.

0000063104: 10.10.10.38: Jun 19 2013 14:43:50.838 -0400: %CCBU_pool-6-thread-57-3-CMD_FAILED: %[ENTITY_ID=2005][ERROR_DESCRIPTION=errorCode 70][command_name=LOGIN]: Received failed command response

The error message indicates ENTITY_ID=2005 or agentID 2005 encountered an error when trying to login.

Finesse Desktop-webservices Log File

Open the Desktop-webservices log for the same timestamp that the error was found in the Error-Desktop-webservices log.

Search the log file to locate the error message.  Search on the timestamp of the error (14:43:50) or the failure code, i.e. failureCode=70, to locate the error message.  The webservices log will provide the "peripheralErrorCode"; in this example the peripheral error code received from the backend CTI server is 12005.

0000063103: 10.10.10.38: Jun 19 2013 14:43:50.836 -0400: %CCBU_CTIMessageEventExecutor-0-6-DECODED_MESSAGE_FROM_CTI_SERVER: %[cti_message=CTIControlFailureConf [failureCode=70, peripheralErrorCode=12005, text=null]CTIMessageBean [invokeID=19976, msgID=35, timeTracker={"id":"ControlFailureConf","CTI_MSG_NOTIFIED":1371667430836,"CTI_MSG_RECEIVED":1371667430836}, msgName=ControlFailureConf, deploymentType=CCE]][cti_response_time=0]: Decoded Message to Finesse from backend cti server
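
The fields of interest can also be pulled out of such a line with a short Python sketch. The function name `control_failure_fields` is made up for this example, and the log line is the one shown above. Reading the timeTracker values as Unix epoch milliseconds is an assumption, although it is consistent with the line's own timestamp (18:43:50 UTC is 14:43:50 -0400).

```python
import re
from datetime import datetime, timezone

LINE = (
    '0000063103: 10.10.10.38: Jun 19 2013 14:43:50.836 -0400: '
    '%CCBU_CTIMessageEventExecutor-0-6-DECODED_MESSAGE_FROM_CTI_SERVER: '
    '%[cti_message=CTIControlFailureConf [failureCode=70, '
    'peripheralErrorCode=12005, text=null]CTIMessageBean [invokeID=19976, '
    'msgID=35, timeTracker={"id":"ControlFailureConf",'
    '"CTI_MSG_NOTIFIED":1371667430836,"CTI_MSG_RECEIVED":1371667430836}, '
    'msgName=ControlFailureConf, deploymentType=CCE]][cti_response_time=0]: '
    'Decoded Message to Finesse from backend cti server'
)


def control_failure_fields(line):
    """Return failureCode, peripheralErrorCode and invokeID as ints, or None."""
    fields = {}
    for name in ("failureCode", "peripheralErrorCode", "invokeID"):
        match = re.search(rf"\b{name}=(\d+)", line)
        if match is None:
            return None  # not a control failure line
        fields[name] = int(match.group(1))
    return fields


fields = control_failure_fields(LINE)

# Assumption: timeTracker values are Unix epoch milliseconds.
received_ms = int(re.search(r'"CTI_MSG_RECEIVED":(\d+)', LINE).group(1))
received_utc = datetime.fromtimestamp(received_ms // 1000, tz=timezone.utc)

print(fields, received_utc.isoformat())
```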

CTI Server Log

Open the CTI Server log for the same timestamp that the error was found in the Error-Desktop-webservices log and locate the peripheral error code 12005 with approximately the same time stamp.  In this example search on 14:43.

14:43:48:898 cg1A-ctisvr SESSION 5: MsgType:SET_AGENT_STATE_REQ (InvokeID:0x4e08 PeripheralID:5001 AgentState:LOGIN

14:43:48:898 cg1A-ctisvr SESSION 5:         AgentWorkMode:AWM_UNSPECIFIED NumSkillGroups:0 EventReasonCode:50004 ForcedFlag:1

14:43:48:898 cg1A-ctisvr SESSION 5:         AgentServiceReq:0 AgentInstrument:"2005" AgentID:"2005" )

14:43:48:898 cg1A-ctisvr Trace: ProcessSetAgentStateRequest - sessionID 5

14:43:48:898 cg1A-ctisvr Trace: *** AddToAssociateAgentList();           ADDED: SessionID=5 AgentID=2005 PeripheralID=5001

14:43:48:898 cg1A-ctisvr Trace: CSTASetAgentState: InvokeID=0x2f0b1596 Dev=2005 AgentMode=LOG_IN AGID=2005 SG=-1(0xffffffff))

14:43:48:898 cg1A-ctisvr Trace: PrivateData: EventReasonCode=50004 WorkMode=0 NumAdditionalGroups=0 PositionID= SupervisorID= ClientAddress=

14:43:48:900 cg1A-ctisvr Trace:

14:43:48:900 cg1A-ctisvr Trace: CSTAUniversalFailureConfEvent: InvokeID=0x2f0b1596 Error=GENERIC_UNSPECIFIED_REJECTION

14:43:48:900 cg1A-ctisvr Trace: PRIVATE_DATA: PeripheralErrorCode=0x2ee5(12005)

14:43:48:900 cg1A-ctisvr SESSION 5: MsgType:CONTROL_FAILURE_CONF (InvokeID:0x4e08 FailureCode:CF_GENERIC_UNSPECIFIED_REJECTION

14:43:48:900 cg1A-ctisvr SESSION 5:         PeripheralErrorCode:12005 )

14:44:06:483 cg1A-ctisvr Trace:

The CTI server log will show that CTI Server received a SET_AGENT_STATE_REQ for agent 2005 using AgentInstrument 2005. This message was sent to CTI server from Session 5 or the Finesse Server.  CTI server responded to the request with a PeripheralErrorCode:12005.
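
One detail that helps when matching the two logs: in this example the invokeID=19976 seen in the Finesse webservices log is the decimal form of InvokeID:0x4e08 in the CTI Server SESSION lines, just as 0x2ee5 is 12005. Whether that correspondence always holds is an assumption here; the Python sketch below only shows the conversion on the lines from this example, and the function name `cti_invoke_ids` is made up.

```python
import re


def cti_invoke_ids(log_text):
    """Return the InvokeID values (as ints) found in CTI Server SESSION lines."""
    return {int(h, 16) for h in re.findall(r"\(InvokeID:0x([0-9a-fA-F]+)", log_text)}


cti_log = """\
14:43:48:898 cg1A-ctisvr SESSION 5: MsgType:SET_AGENT_STATE_REQ (InvokeID:0x4e08 PeripheralID:5001 AgentState:LOGIN
14:43:48:900 cg1A-ctisvr Trace: PRIVATE_DATA: PeripheralErrorCode=0x2ee5(12005)
14:43:48:900 cg1A-ctisvr SESSION 5: MsgType:CONTROL_FAILURE_CONF (InvokeID:0x4e08 FailureCode:CF_GENERIC_UNSPECIFIED_REJECTION
"""

finesse_invoke_id = 19976  # invokeID from the Finesse webservices log line
matches = finesse_invoke_id in cti_invoke_ids(cti_log)
print(matches, hex(finesse_invoke_id), int("2ee5", 16))
```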

To determine which session belongs to your Finesse server, you can use procmon.  Using procmon we can verify that Session 5 is 10.10.10.38 with a client ID of Finesse, where 10.10.10.38 is the IP address of our primary Finesse Server.

C:\icm>procmon ucce cg1a ctisvr

>>>>clients

Session   Time Ver Flags   ClientID         AgentID AgentExt Signature   Host

       5 67:44:42 16   AUX R Finesse     Finesse  (10.10.10.38:32990)

       7 66:53:16 16   AUX R Finesse     Finesse   (10.10.10.138:32995)

>>>>

Note: Detailed instructions on how to use Procmon can be found here.
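
If you save the output of the clients command to a text file, mapping session numbers to hosts can be sketched in Python as below. The regular expression is an assumption based only on the two rows shown in this example (session number first, host:port in parentheses at the end), and the function name `sessions_by_host` is made up.

```python
import re

# Session number at the start of the row, "(ip:port)" at the end.
CLIENT_ROW = re.compile(r"^\s*(\d+)\s+\S+.*\((\d{1,3}(?:\.\d{1,3}){3}):(\d+)\)\s*$")


def sessions_by_host(output):
    """Return a dict of session number -> client IP address."""
    result = {}
    for line in output.splitlines():
        match = CLIENT_ROW.match(line)
        if match:
            result[int(match.group(1))] = match.group(2)
    return result


output = """\
Session   Time Ver Flags   ClientID         AgentID AgentExt Signature   Host
       5 67:44:42 16   AUX R Finesse     Finesse  (10.10.10.38:32990)
       7 66:53:16 16   AUX R Finesse     Finesse   (10.10.10.138:32995)
"""
sessions = sessions_by_host(output)
print(sessions)
```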

Peripheral Error Code Descriptions

A quick way to obtain the description for UCCE Peripheral Error Codes is to log onto a UCCE system, open a command prompt, navigate to the C:\icm\bin directory and run "checke <error code>", where <error code> is the peripheral error code that you have identified.  In this example we would use c:\icm\bin>checke 12005

C:\icm\bin>checke 12005

Error 12005

   Symbol = PERERR_JTAPPLAY_ADDCALLOBSERVERFAILURE

   Level 1 = Login could not be performed - Possible causes are Invalid Instrument; Media Termination Problem or other CM issue

   Level 2 = AddCallObserver failed - Please see PIM log for more details

   Level 3 =

C:\icm\bin>

This command will provide a description of the error code and potential causes for the error.  In our example agent 2005 attempted to log-in with an Invalid Instrument.

Thanks,

Goran

New Member

Ask the Expert: Troubleshooting Unified Contact Center Enterpri

Goran, thanks for the great detailed answer!  One more question for you. If my CIM (Cisco Interaction Manager) is integrated with UCCE, what is the best point to start troubleshooting activity routing issues? Thanks again!

Jackson

Re: Ask the Expert: Troubleshooting Unified Contact Center Ente

Hi Jackson,

You are more than welcome! Feel free to use rating system below each answer so that others can benefit from useful answers...

Now, my favorite topic - "routing issues in CIM-ICM environment" - thanks for asking!!!

OK, this one is very simple, let me explain:

- CIM can work standalone but also can work integrated with ICM

- in case CIM is standalone, CIM will make all routing decisions

- in case CIM is integrated with ICM, then CIM will extend routing decisions to ICM. That is why the service on the CIM side which makes this possible is called EAAS - EXTERNAL Agent Assignment Service.

- So, EAAS talks with the PIM on the MRPG side of ICM. This makes life much easier, as they use MRI (MR Interface) and therefore have some standardized behavior.

- Now, exactly that PIM on your MRPG is THE Border Line from which to start troubleshooting routing issues.

- AND it is VERY SIMPLE - here is how you do it:

* First enable MR tracing on that MR PIM. As an example, let's say the ICM instance name is ACME and the MRPG node is PG2A, where PIM1 is the MR PIM talking to CIM.

So, open a command line on the MRPG box, procmon to that PIM and enable all MR tracing:

> procmon acme pg2a pim1

>>>>ltrace (this will list traces currently enabled)

>>>>trace *.* /off (first you want to disable everything which is enabled currently to avoid noise in logs)

>>>>trace mr* /on (this enables ALL MR traces)

>>>>ltrace
...
...
mr_msg_config                         1       On
mr_msg_comm_session            2       On
mr_heartbeat_messages            3       On
mr_msg_incoming_mr                4       On
mr_msg_outgoing_mr                 5       On
mr_msg_incoming_inrc               6       On
mr_msg_outgoing_inrc                7       On
mr_msg_outgoing_csta               8       On
mr_function_call                         9       On
mr_ECC_variables                      10      On
>>>>

>>>>trace *heart* /off (this DISABLES HEARTBEATS, as they would be too noisy in the logs)


....
mr_heartbeat_messages            3       Off
....

>>>>


So in a few seconds, with the above commands, you have enabled MR tracing.

To make a long story short:

- When a new email comes in, RX pulls it into CIM. After going through the DB to generate an ActivityID, it eventually hits the CIM Workflow, where it reaches the integrated queue. That makes EAAS send the activity to ICM for routing.

What exactly happens is that EAAS sends a NEW_TASK message to ICM via the MR PIM:

08:56:30 pim1     Trace: Application->PG:
Message = NEW_TASK; Length = 73 bytes
   DialogueID = (1) Hex 00000001
   SendSeqNo = (1) Hex 00000001
   MRDomainID = (5002) Hex 0000138a
   PreviousTask = -1:-1:-1
   PreferredAgent = Undefined
   Service = (0) Hex 00000000
   CiscoReserved = (0) Hex 00000000
   ScriptSelector: EIM_SS
ECC Variable Name: user.cim.activity.id
Value: 1041
...

NEW_TASK needs to provide the MRD ID and the ActivityID.


If there are no available agents for this activity, ICM will fail to route it and will send NEW_TASK_FAILURE_EVENT:


09:56:30 pim1     Trace: MR_Peripheral::On_Router_DialogFail:
DIALOG_FAIL  RCID=5001 PID=5001 FailureType=2 NumOfEvents=1 DID=1 DIDRelSeqNo=0 ReasonCode=11
09:56:30 pim1     Trace: Function==>MR_Peripheral::Send_ToApp_NewTaskFailureEvent
09:56:30 pim1     Trace: PG->Application:
Message = NEW_TASK_FAILURE_EVENT; Length = 12 bytes
   DialogueID = (1) Hex 00000001
   SendSeqNo = (1) Hex 00000001
   ReasonCode = (209) Hex 000000d1


However, if all is good, the expected behavior is that ICM sends a DO_THIS_WITH_TASK message with the exact Agent ID back to the application.
After this, CIM is in charge of assigning this activity to that agent.

Here is an example: activity 1057 is routed to agent 5010 (SkillTargetID from the t_Agent table on the ICM side), which is 1004 on the CIM side (USER ID from the EGPL_USER table).


13:51:15 pim1     Trace: Application->PG:
Message = NEW_TASK; Length = 73 bytes
   DialogueID = (2) Hex 00000002
   SendSeqNo = (1) Hex 00000001
   MRDomainID = (5002) Hex 0000138a
   PreviousTask = -1:-1:-1
   PreferredAgent = Undefined
   Service = (0) Hex 00000000
   CiscoReserved = (0) Hex 00000000
   ScriptSelector: EIM_SS
ECC Variable Name: user.cim.activity.id
Value: 1057


...
...
...

13:51:15 pim1     Trace: PG->Application:
Message = DO_THIS_WITH_TASK; Length = 90 bytes
   DialogueID = (2) Hex 00000002
   SendSeqNo = (1) Hex 00000001
   IcmTaskID = 150881:202: 1
   SkillGroup = (5013) Hex 00001395
   Service = Undefined
   Agent = (5010) Hex 00001392
   AgentInfo: 1004
   Label:
   ApplicationString2:
   Call Variable 1:
   Call Variable 2:
   Call Variable 3:
   Call Variable 4:
   Call Variable 5:
   Call Variable 6:
   Call Variable 7:
   Call Variable 8:
   Call Variable 9:
   Call Variable 10:
ECC Variable Name: user.cim.activity.id
Value: 1057


So, bottom line: start from the MR logs and check whether NEW_TASK and DO_THIS_WITH_TASK are both present for the same activity. If yes, then ICM's job is done and the issue is on the CIM side. If DO_THIS_WITH_TASK is not there, then it fails on the ICM side.
This will point you to the side you need to investigate further.
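The check described above can be sketched as a small script. This is only an illustration, not a Cisco tool: it assumes the MR PIM log has been exported to plain text in the format shown in the traces above, that each "Message = ..." line is followed by its DialogueID line, and that a DialogueID is not reused within the excerpt being examined. The function names are made up for this example.

```python
import re

MESSAGE_RE = re.compile(r"Message = (\w+);")
DIALOGUE_RE = re.compile(r"DialogueID = \((\d+)\)")


def messages_by_dialogue(log_text):
    """Return {DialogueID: set of MR message names seen for that dialogue}."""
    seen = {}
    pending = None  # message name still waiting for its DialogueID line
    for line in log_text.splitlines():
        msg = MESSAGE_RE.search(line)
        if msg:
            pending = msg.group(1)
            continue
        dlg = DIALOGUE_RE.search(line)
        if dlg and pending:
            seen.setdefault(int(dlg.group(1)), set()).add(pending)
            pending = None
    return seen


def side_to_investigate(messages):
    """Apply the rule of thumb from the post to one dialogue's messages."""
    if "NEW_TASK" not in messages:
        return "no NEW_TASK in this log"
    if "DO_THIS_WITH_TASK" in messages:
        return "CIM"  # ICM answered, so look at the CIM side
    return "ICM"      # no DO_THIS_WITH_TASK, so look at the ICM side


# Trimmed sample in the same layout as the traces in this post.
SAMPLE = """\
09:56:30 pim1     Trace: Application->PG:
Message = NEW_TASK; Length = 73 bytes
   DialogueID = (1) Hex 00000001
09:56:30 pim1     Trace: PG->Application:
Message = NEW_TASK_FAILURE_EVENT; Length = 12 bytes
   DialogueID = (1) Hex 00000001
13:51:15 pim1     Trace: Application->PG:
Message = NEW_TASK; Length = 73 bytes
   DialogueID = (2) Hex 00000002
13:51:15 pim1     Trace: PG->Application:
Message = DO_THIS_WITH_TASK; Length = 90 bytes
   DialogueID = (2) Hex 00000002
"""

for dialogue_id, messages in sorted(messages_by_dialogue(SAMPLE).items()):
    print(dialogue_id, sorted(messages), "->", side_to_investigate(messages))
```

To tie a dialogue back to a specific activity, the same approach could be extended to read the "Value:" line that follows "ECC Variable Name: user.cim.activity.id".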


I hope this helps!

Thanks,
Goran


Re:Ask the Expert: Troubleshooting Unified Contact Center Enter

Hi Goran,

I have a question about redundancy and failures... how does UCCE handle it if the connection between side A and side B is lost (both public and private WAN links)? And how can one recover the data after the sides reconnect?

Also, how is the "RecoveryKey" calculated, and is it possible to reset it manually?



Re: Ask the Expert: Troubleshooting Unified Contact Center Ente

Hi!!!

Thanks for your questions!

I hope you find this event useful!

OK, since more and more questions are coming in, I will have to keep my answers shorter.

So, your first question is answered in the SRND guide, in this chapter:

http://www.cisco.com/en/US/docs/voice_ip_comm/cust_contact/contact_center/icm_enterprise/icm_enterprise_10_0_1/Design/design_guide/UCCE_BK_UEA1023D_00_unified-cce-design-guide_chapter_011.html#UCCE_CN_UBD4EEF9_00

Check under "Response to failures of both networks".

However, I also invite you to read all the other scenarios there, as the guide describes what happens when only the private network fails, or only the public network fails, or when only one Logger or a PG fails.

For example, it tells you that the PG can buffer some data, and that a Logger can be down for up to 12 hours; if it is down longer, you will need to do a MANUAL sync of the DBs, etc.

I am sure you will get very useful information from that document!

Now, your question about RecoveryKey calculation. I will use explanation I learned from BU while working on some case:

RecoveryKeys are numbers automatically generated by the CallRouter and once the data is replicated to the HDS are no longer used.
So, the system automatically generates the Recovery Key in all of the ICM historical tables as they are written from the "temp" tables into the real ICM tables by the Recovery process. This process generates the key based on the date/time of the Logger (down to the nanosecond) and works out the number of seconds that have passed since the starting date of 1/1/1995. This gives us a Julian date in seconds from our starting point, so that dates/times can be compared.

This seconds value is multiplied by 1,000 to create the actual key, based on the fact that we don't expect to be able to write more than 1,000 records per second into any one table. Each time a table is written to, the key is incremented for the next record within the same table. This key is set when the HistLogger process starts on the Logger; it even prints a message telling you that this is the new recovery key for all historical data.

The recovery keys are kept unique by this concept on a table-by-table basis -- every table starts with the same key value --  then it is incremented in that table by one on every record written in that table.
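As a rough illustration of the arithmetic described above (not the actual Logger code), the base key can be sketched like this. The starting date and the factor of 1,000 come from the explanation; truncating to whole seconds, ignoring time zones, and the function names are assumptions made for this example.

```python
from datetime import datetime, timedelta

# Starting date quoted in the explanation above.
RECOVERY_EPOCH = datetime(1995, 1, 1)
KEYS_PER_SECOND = 1000


def base_recovery_key(process_start):
    """Whole seconds since 1/1/1995, multiplied by 1,000."""
    elapsed = process_start - RECOVERY_EPOCH
    return int(elapsed.total_seconds()) * KEYS_PER_SECOND


def approximate_start_time(recovery_key):
    """Invert the scheme: roughly when the base for this key was set."""
    return RECOVERY_EPOCH + timedelta(seconds=recovery_key // KEYS_PER_SECOND)


start = datetime(2014, 2, 14, 9, 0, 0)  # hypothetical process start time
base = base_recovery_key(start)
# Every table starts from the same base and counts up by one per record.
keys_in_one_table = [base + n for n in range(3)]
print(keys_in_one_table)
print(approximate_start_time(keys_in_one_table[-1]))
```

The inverse function only recovers the time the base was set, not the time an individual record was written, since later keys in a table are produced by incrementing rather than by reading the clock again.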

When the data is then replicated from the Logger to the HDS, it is not written based on the recovery key, it is written based on the table itself--in alphabetical order taking each new item in their recovery key order within the table and writing to the HDS.

So, answer to your question - can you manually reset those keys - is ABSOLUTELY NO.

I hope this helps!

Thanks,

Goran


Ask the Expert: Troubleshooting Unified Contact Center Enterpri

Hi Goran,


I've been trying to get Finesse up and running, but I'm getting the invalid user ID/password error when I try to log in. I've entered valid parameters in the Finesse admin page, and I have also verified the user mapping of the awdb in SQL Server Management Studio (I'm using Windows authentication).


Is this related to awdb connection?


Thanks

Kishore B

Re: Ask the Expert: Troubleshooting Unified Contact Center Ente

Hi Kishore!

Thanks for the question!

I had my colleague Zaid Salama prepare this while he was still very busy with his own work, so I want to thank him for that!

Ok, so here we go:

Regarding this question, I believe we will need more information on the Finesse version, UCCE version, etc. However, from a first look at the description, I would expect that the issue is related to the AW. If you are sure that the username and password are correct, the user has the correct privileges, and the connectivity between both sides is good, it might be that you are using NTLMv2 authentication on the AW.

The documentation defect "CSCuj95347 Document Finesse JDBC driver cannot authenticate using NTLMv2" confirms that NTLMv2 is not supported by Finesse, because Finesse uses a third-party driver that doesn't support NTLMv2. The resolution is to disable NTLMv2 on the AW, which can be done as follows:

1) Disable NTLMv2 on the AW server hosting the AWDB and reboot the AW server.

2) Administrative Tools > Local Security Policies > Security Settings > Local Policies > Security Options

3) Network Security: LAN Manager authentication Level has "Send NTLMv2 response only" - Choose "Send LM & NTLM responses" or any option that does not require NTLMv2-only responses.

4) Network Security: Minimum session security for NTLM SSP based (including secure RPC) clients has "Require NTLMv2 session security" - UNCHECK this

5) Network Security: Minimum session security for NTLM SSP based (including secure RPC) servers has "Require NTLMv2 session security" - UNCHECK this

6) Reboot AW server.

Hope this helps!


Re: Ask the Expert: Troubleshooting Unified Contact Center Ente

Hi Goran,

Thank you for doing this

I've always wondered... why doesn't TCD, or any other table within an ICM database, hold the information as to who disconnected the call? I find it cumbersome to have to pull CDR logs and try to correlate them to TCD records just to find out who disconnected the call. I'm sure there's a good reason for this and I'd certainly like to hear your input on it.

Additionally, is there an easier way to correlate TCD and CDR records? Can RTMT play a role here?

Similar to what was asked here: https://supportforums.cisco.com/thread/2208349

Thanks

Re: Ask the Expert: Troubleshooting Unified Contact Center Ente

Hi Omar!

Welcome to the party!

Yeah, the "who disconnected the call" investigation was always part of the call control side of things. Historically, ICM worked with ACDs and never had to worry about call control, so that part was always left out: from the ICM point of view it is irrelevant for ICM calculations.

Well, times are changing, so if there is real demand for this, I believe that the pressure will and should come from the customer side, by contacting local Cisco Account Managers and asking them to open a Product Enhancement Request (PER). Of course, the BU will need them to provide some business case / justification, as this is less a technical issue than a capacity and priority issue. Keep in mind that it will require developer hours and possibly protocol changes. That is why there is no action on that side: the risks are higher than the gain (considering the call control part already has that info).

OK, but now, your real question about how to correlate TCD and CDR.

Not sure if you have seen this, but I already answered that question more than 3 years ago here:

https://supportforums.cisco.com/thread/2060220

I hope that will give the complete answer as I have outlined Cradle to Grave mapping there.

In case someone is not able to access that, I will paste it here:

Mapping CDR to TCD

-----------------------------

The UCManager CDR does have a mapping to ICM Termination Call Detail record, but you need to do some conversion.

The CDR globalCallID_CallManagerID and globalCallID_CallId are combined to create the TCD PeripheralCallKey.

The globalCallID_CallManagerID is moved into the high-order byte. To shift it over properly, multiply it by the hexadecimal value 1000000 and add it to the globalCallID_CallId.

So:

(globalCallID_CallManagerID * 0x1000000) + globalCallID_CallId = PeripheralCallKey

These IDs are not unique because the same PeripheralCallKey and CallID are re-used in redirect, transfer and conference scenarios.

Also, this only works within a single cluster. So in a multiple cluster environment, you need to map Cluster CDRs to a specific PeripheralID.
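A minimal sketch of this conversion, following the description above (the CallManager ID goes into the high-order byte, the call ID fills the rest). The function names are made up for this example, and the inverse mapping assumes the call ID is smaller than 0x1000000.

```python
CALL_MANAGER_SHIFT = 0x1000000  # 2**24, the multiplier given above


def peripheral_call_key(call_manager_id, call_id):
    """Combine the two CDR fields into the TCD PeripheralCallKey."""
    return call_manager_id * CALL_MANAGER_SHIFT + call_id


def split_peripheral_call_key(key):
    """Inverse mapping; assumes call_id is below 0x1000000."""
    return divmod(key, CALL_MANAGER_SHIFT)


# Hypothetical CDR values, for illustration only.
key = peripheral_call_key(1, 123456)
print(key, split_peripheral_call_key(key))
```

Because the key is not unique across redirects, transfers and conferences, a match on this value narrows the search but may still return several TCD rows.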

Cradle to Grave Call Tracking in ICM

----------------------------------------------------------

The RouterCallKeyDay, RouterCallKey, and RouterCallKeySequenceNumber will track a call from its first route until its final call leg.

The RouterCallKeyDay and RouterCallKey combine to provide common attribute across the calls.

The RouterCallKeySequenceNumber gives you some sense of order of when calls were created. (gselthof: so note, 'some sense' is not guaranteed order!!!)

In a multi-peripheral environment, this requires routing between peripherals. This means calls to the IVR need to be translation routed, and calls to other agent clusters need to be routed as well.
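As a sketch of how these three columns could be used once TCD rows have been pulled out of the database, the rows below are plain dictionaries with made-up values, and the function name is hypothetical. Sorting by RouterCallKeySequenceNumber gives only an approximate order, as noted above.

```python
from collections import defaultdict


def group_call_legs(tcds):
    """Group TCD rows by (RouterCallKeyDay, RouterCallKey)."""
    calls = defaultdict(list)
    for tcd in tcds:
        calls[(tcd["RouterCallKeyDay"], tcd["RouterCallKey"])].append(tcd)
    for legs in calls.values():
        # Sequence number gives "some sense" of order, not a guarantee.
        legs.sort(key=lambda row: row["RouterCallKeySequenceNumber"])
    return dict(calls)


# Made-up rows, for illustration only.
rows = [
    {"RouterCallKeyDay": 150881, "RouterCallKey": 202, "RouterCallKeySequenceNumber": 1},
    {"RouterCallKeyDay": 150881, "RouterCallKey": 202, "RouterCallKeySequenceNumber": 0},
    {"RouterCallKeyDay": 150881, "RouterCallKey": 203, "RouterCallKeySequenceNumber": 0},
]

for call, legs in group_call_legs(rows).items():
    print(call, [leg["RouterCallKeySequenceNumber"] for leg in legs])
```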

Identifying Routed Agent TCDs

----------------------------------------------------------

You will want to filter out the TCDs created for the CVP call legs and the TCDs generated for internal agent-to-agent calls.

Use the AgentSkillTargetID to identify the agent, SkillGroupSkillTargetID to identify the SkillGroup, and CallTypeID to identify the Call Type / program.

If all three of these values are filled in, you know you have a call that was routed to an agent.

Sometimes more than one TCD will meet these three criteria for the same PeripheralCallKey. In those cases, the one with the lowest RouterCallKeySequenceNumber identifies the first call answered by the agent.
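The filter and tie-break described above can be sketched as follows. The rows are plain dictionaries with made-up values, the function names are hypothetical, and "filled in" is assumed here to mean "not NULL" (None in Python); check how unset values actually appear in your data before relying on that.

```python
REQUIRED_COLUMNS = ("AgentSkillTargetID", "SkillGroupSkillTargetID", "CallTypeID")


def is_routed_agent_tcd(tcd):
    """True when all three identifying columns are filled in."""
    return all(tcd.get(column) is not None for column in REQUIRED_COLUMNS)


def first_agent_tcds(tcds):
    """Per PeripheralCallKey, keep the routed TCD with the lowest sequence number."""
    first = {}
    for tcd in tcds:
        if not is_routed_agent_tcd(tcd):
            continue
        key = tcd["PeripheralCallKey"]
        best = first.get(key)
        if best is None or (
            tcd["RouterCallKeySequenceNumber"] < best["RouterCallKeySequenceNumber"]
        ):
            first[key] = tcd
    return first


# Made-up rows, for illustration only.
rows = [
    {"PeripheralCallKey": 16900672, "RouterCallKeySequenceNumber": 2,
     "AgentSkillTargetID": 5010, "SkillGroupSkillTargetID": 5013, "CallTypeID": 5000},
    {"PeripheralCallKey": 16900672, "RouterCallKeySequenceNumber": 1,
     "AgentSkillTargetID": 5011, "SkillGroupSkillTargetID": 5013, "CallTypeID": 5000},
    {"PeripheralCallKey": 16900673, "RouterCallKeySequenceNumber": 0,
     "AgentSkillTargetID": None, "SkillGroupSkillTargetID": None, "CallTypeID": 5000},
]

for key, tcd in first_agent_tcds(rows).items():
    print(key, tcd["AgentSkillTargetID"], tcd["RouterCallKeySequenceNumber"])
```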

CallDisposition

-----------------------------

The CallDispositionFlag is the best indicator to find out if a call was handled or not. There are a bunch of CallDispositions. The CallDispositionFlag distills the results down to 7 categories.

You can find details on what the CallDispositionFlags are in the schema help or schema guide.

I hope this helps!

Goran


Ask the Expert: Troubleshooting Unified Contact Center Enterpri

Hi Goran, thanks for the details. Can you please confirm whether SIP REFER is supported for call transfer with UCCE, considering we have CUBE as well as a 3rd-party SBC?

My second question is about blending support: inbound and outbound voice calls for the same agent. Do we need a specific license for that?

Thanks

Re: Ask the Expert: Troubleshooting Unified Contact Center Ente

Hi Maheshwar!

Thanks for participating!

Let me try to address this by pointing to this link:

http://www.cisco.com/en/US/docs/voice_ip_comm/cust_contact/contact_center/customer_voice_portal/srnd/9_0/CCVP_BK_C7053373_00_cvp-srnd_chapter_01010.html#CCVP_TP_SAAE8C0F_00

That is where we confirm when SIP refer transfer is supported. Please check.

For the second question, if it is about Outbound Option (Dialer) then I believe yes, you need special licenses for that. You can check ordering guide here:

http://www.cisco.com/web/partners/downloads/partner/WWChannels/technology/ipc/downloads/CCBU_ordering_guide.pdf

I am not from Pre-Sales or Sales side but I can see this is being mentioned there:

5.1.1.2.2 Unified CCE Agent Licenses for Voice Applications with Outbound Option

Unified CCE Outbound Option requires purchase of at least Unified CCE Enhanced or Premium Agent voice application licenses and the appropriate number of Dialer Port Licenses.

...etc...

Have a great weekend!

Cheers,

Goran


Ask the Expert: Troubleshooting Unified Contact Center Enterpri

Thanks for the in-depth reply!


Re: Ask the Expert: Troubleshooting Unified Contact Center Ente

Thanks Goran for your clarification and links for failover and RecoveryKey

A couple of things related here:

- From my understanding about your explanation of how RecoveryKey is generated, does this mean that the RecoveryKey of ANY UCCE system is similar? As it depends on time, this means that today's RecoveryKey is always bigger than yesterday's, even if those were 2 different clusters at 2 different customers?

--> if that's the case, then why in a technology refresh system I can't migrate the HDS table AFTER the logger? This is based on the upgrade guide of UCCE, where all scenarios of upgrade have the HDS being upgraded first (or at the same time as the logger)

- In case side A and side B historical data (from loggers and also hds) are not equal due to some failure happening at some point (we checked icmdba and number of records is different), what is the best way to fix that?

Re: Ask the Expert: Troubleshooting Unified Contact Center Ente

Hi!

Ok, let me get to the answers quickly:

- Yes. But the keys are independent between systems: one system will not generate exactly the same keys as another. It is true that the RecoveryKey (RC) can only increase, but relative to that system's own initial base. Comparing RecoveryKeys between different customers should not even be discussed, as it is not something which can or should be used anyway.

- Install/Upgrade Guide:

If you complete the upgrade of the main Administration & Data Server within the Logger purge window (usually 14 days), you can replace the temporary Administration & Data Servers with the upgraded Administration & Data Servers for reporting. The data replication process fills in any missing data.

So that means that, since you would probably need your AW for some tasks during the upgrade, you migrate it before or at the same time. However, if you don't want to, you can set up a NEW TEMPORARY one for that and then migrate the real AW/HDS later, as said above.

- The recommendation is that you don't bother with data holes, as that is why you have two HDSs, one on each side. If one side has a data hole, you can just point to the side which has the data and take reports from there. That is only needed when you want reports for that particular missing period: you collect the reports and you are done. Not sure why you would really need to keep them in total sync, as they are there to compensate for those data holes. If you do want to keep them in total sync:

1) Stop ICM services on HDS1, take a full backup of the HDS1 database, and start services on HDS1 again.

2) Copy the backup file over to HDS2.

3) Stop services on HDS2 and delete the HDS DB on HDS2.

4) Recreate the DB with the same DATA and LOG sizes as on HDS1, so that you can restore the HDS1 backup on the HDS2 box (ICMDBA has a 32 GB limit, so once you create the DB with ICMDBA, use SQL Studio to expand the files to match the HDS1 settings).

5) Restore the backup, truncate the recovery table on HDS2, and start services on HDS2.

Now both have the same data.

Cheers,

Goran
