
CAS not connecting and out of sync

Khurram Noor
Level 1

Dear all,

I changed the root passwords on all the CAMs and CASs in my environment. Since then, the CASs show as out of sync in the CAM and do not respond to ping requests. Please advise what should be done to correct this.

13 Replies

Tarik Admani
VIP Alumni

When you reset the passwords, did you accidentally issue a new self-signed certificate on the manager or on any of the CAS appliances?


No, I didn't do anything like that. I only rebooted all of the CAMs and CASs, and it has been happening since then.

Were you able to get this working? I see in the other thread that you weren't tagging your management traffic on eth0.

Thanks,

Tarik Admani
*Please rate helpful posts*

Well, it has been working on and off in the past, but the tagging was missing from the very beginning. Afterwards I still wasn't able to get the CASs connected to the CAM; they were shown as not connected and out of sync. So I removed the CASs from the CAM and re-added them. The CASs are added back now, but all of the previous configuration is missing. Unfortunately, the engineer who deployed this has left, and I am very new to NAC and don't know many things.

My deployment is OOB Real-IP Gateway. The documentation says that in this mode we don't need an SVI on the switch for the CAS managed networks, but I found SVIs for all the untrusted VLANs both on the switch and on the CAS. I also couldn't see any static routes on the CAS, whereas the documentation says the switch should point to the trusted interface of the CAS for all managed networks. So I think the CAS should have static routes, in its static route section, for reachability to the entire network. But I am confused about which interface should be used for reachability: the trusted interface or the untrusted interface? Can you suggest?

Khurram,

If the routing is set to use the CAS untrusted interface as the next hop, or if there is an ACL set to divert all traffic to the CAS, then you are OK in this case. If the untrusted interface is on the same subnet, or if there is a managed subnet entry on the CAS, then this is an L2 deployment (no need for static routes).

Do you have a backup of the CAM configuration on hand? If not, there should be a daily backup that runs on the CAM that you can restore to.

Follow this guide to see if you can restore a daily backup from before you removed the CAS.

http://www.cisco.com/en/US/docs/security/nac/appliance/configuration_guide/49/cam/m_admin.html#wp1053310

Thanks,

Tarik Admani
*Please rate helpful posts*

Thanks Tarik,

So I will try to restore a backup. I can see some snapshots in the backup area of the CAM. I will let you know how it goes.

Well, I did get the snapshot restored, but when I rebooted the CASs afterwards, neither came back up on the network; both are now unreachable. One of them has booted properly and is showing a login prompt. The other one is stuck on this line:

type=1404 audit(1344925451.361:2) selinux=0 auid=4294967295 ses=4294967295

I have attached a snapshot along with this message; please have a look.

Since they are in HA, can you check whether one of the CAS appliances is up and able to connect to the network? Also, can you issue the following command on the CLI:

find / -name '*manager.log'

Then copy the path it returns and run tail -f on it.

Then see what error messages are appearing from the CAS.
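
The two steps above can be sketched as a small shell helper. This is only an illustration: the function name `find_manager_logs` is made up for this example, and it assumes a standard `find` is available on the appliance. The pattern is quoted so the shell hands it to `find` unexpanded instead of matching it against files in the current directory.

```shell
# Hypothetical helper (not part of the NAC CLI): list regular files whose
# names end in "manager.log" under a starting directory.
find_manager_logs() {
    # $1: directory to search; defaults to / when omitted.
    # Errors from unreadable directories are discarded.
    find "${1:-/}" -type f -name '*manager.log' 2>/dev/null
}

# Example use: follow the first match as new lines are written.
#   tail -f "$(find_manager_logs / | head -n 1)"
```

If more than one file matches, pick the path you want by hand rather than relying on the first result.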

Thanks,

Tarik Admani
*Please rate helpful posts*

I tried to ping the gateway for both the trusted and untrusted interfaces from the CAS but could not reach either. On the switch, I see that both CAS interfaces are connected.

When I issued find / -name *manager.log on the CAS that was displaying the login prompt, I got the following message:

"find: WARNING: Hard link count is wrong for /click: this may be a bug in your filesystem driver. Automatically turning on find's -noleaf option. Earlier results may have failed to include directories that should have been searched."

For the other CAS, I think there is some corruption in the Linux boot. Please suggest.

Sorry for the confusion; please run the debugs on the active manager.

Thanks,

Tarik Admani
*Please rate helpful posts*

OK, I have now run this on the active CAM. Here is the output:

[root@rgotc-dc-naccam1 ~]# tail -f /perfigo/control/tomcat/logs/nac_manager.log
        at org.apache.jk.server.JkCoyoteHandler.invoke(JkCoyoteHandler.java:200) [tomcat-ajp.jar:na]
        at org.apache.jk.common.HandlerRequest.invoke(HandlerRequest.java:291) [tomcat-ajp.jar:na]
        at org.apache.jk.common.ChannelSocket.invoke(ChannelSocket.java:775) [tomcat-ajp.jar:na]
        at org.apache.jk.common.ChannelSocket.processConnection(ChannelSocket.java:704) [tomcat-ajp.jar:na]
        at org.apache.jk.common.ChannelSocket$SocketConnection.runIt(ChannelSocket.java:897) [tomcat-ajp.jar:na]
        at org.apache.tomcat.util.threads.ThreadPool$ControlRunnable.run(ThreadPool.java:689) [tomcat-util.jar:5.1]
        at java.lang.Thread.run(Thread.java:662) [na:1.6.0_26]
2012-08-14 10:39:40.488 +0400   pool-4-thread-1 INFO  com.cisco.nac.core.reporting.DashBoardDataProvider - No Data request for 30 minutes.
2012-08-14 10:39:40.489 +0400   pool-4-thread-1 INFO  com.cisco.nac.core.reporting.DashBoardDataProvider - Destroying tasks
2012-08-14 10:39:40.490 +0400   pool-4-thread-1 ERROR com.perfigo.wlan.web.Util                          - Util - IntShell : null

2012-08-14 11:44:22.881 +0400   TP-Processor23 INFO  com.perfigo.wlan.web.Util                          - eth0 MAC:E4_1F_13_F1_F4_91
2012-08-14 11:44:26.558 +0400   TP-Processor23 INFO  com.cisco.nac.core.reporting.DashBoardDataProvider - Creating reporting tasks
2012-08-14 11:44:26.558 +0400   TP-Processor23 INFO  com.cisco.nac.core.reporting.DashBoardDataProvider - Finished creating new future tasks

2012-08-14 11:46:33.190 +0400   TP-Processor23 ERROR com.perfigo.wlan.web.admin.ConnectorClient         - Communication Exception : Could not connect to the Clean Access Server Exception creating connection to: 172.16.16.2; nested exception is:
        java.net.SocketTimeoutException: connect timed out
2012-08-14 11:46:33.194 +0400   TP-Processor23 ERROR com.perfigo.wlan.web.admin.SecureSmartManager      - Could not connect to 172.16.16.2

172.16.16.2 and 172.16.16.4 are the IPs of the two CASs.
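
To see at a glance which CAS addresses the manager is failing to reach, the log can be filtered for the "Could not connect to <IP>" lines shown in the output above. The following is a rough sketch, not a Cisco tool: the function name `failed_cas_ips` is invented for this example, and it assumes the error lines keep the same wording as in this log and that `grep` supports `-o` (GNU and BSD grep do).

```shell
# Hypothetical helper: print each distinct IPv4 address that appears in a
# "Could not connect to <ip>" message in the given log file.
failed_cas_ips() {
    # $1: path to the manager log
    grep -oE 'Could not connect to [0-9]{1,3}(\.[0-9]{1,3}){3}' "$1" \
        | awk '{ print $NF }' \
        | sort -u
}

# Example use with the log path from the output above:
#   failed_cas_ips /perfigo/control/tomcat/logs/nac_manager.log
```

Lines such as "Could not connect to the Clean Access Server ..." are skipped because the pattern requires an address immediately after "to".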

Khurram,

Could you please open a TAC case so they can look at this issue? There seem to be multiple issues with your current solution, and they can help you get them ironed out.

thanks,

Tarik Admani
*Please rate helpful posts*

OK, I am opening a TAC case. Thanks for your kind support.