New Member

ACS redundancy

Greetings all,

I am trying to configure ACS redundancy on our routers. We have two ACS servers running on W2K. In the router configuration I've created an "aaa group server tacacs+ TEST" and listed our two ACS servers, which are also defined in the global config.

However, when I shut down the first ACS server, the whole thing stops working and I get a strange error in the debug output (see below).

Below is a snapshot of the config, just in case I left something out.

Has anyone implemented this?

Thanks in advance,

Kostas

-----------------------------------------------------------------------------------------------------------

aaa new-model
!
!
aaa group server tacacs+ TEST
 server 10.10.10.1
 server 10.10.10.2
!
aaa authentication login telnet group tacacs+ local
aaa authentication login aux local
aaa authentication login console local
aaa authorization exec default group tacacs+ local
aaa accounting exec default start-stop group tacacs+
aaa session-id common
.
.
.
!
tacacs-server host 10.10.10.1 key tes1t
tacacs-server host 10.10.10.2 key tes2t
tacacs-server directed-request
.
.
.
line con 0
 login authentication console
line aux 0
 login authentication aux
line vty 0 4
 login authentication telnet
 transport input telnet
!
end

-----------------------------------------------------------------------------------------------------------

debug-error:

Jun 2 13:04:31.611: TPLUS: Queuing AAA Authentication request 12951 for processing
Jun 2 13:04:31.611: TPLUS: processing authentication start request id 12951
Jun 2 13:04:31.611: TPLUS: Authentication start packet created for 12951()
Jun 2 13:04:31.611: TPLUS: Using server 10.10.10.1
Jun 2 13:04:31.615: TPLUS(00003297): Select released but nopeername.. Failover
Jun 2 13:04:31.615: TPLUS: Choosing next server: 10.10.10.2
Jun 2 13:04:36.616: TPLUS(00003297): Select Timed out

15 REPLIES
New Member

Re: ACS redundancy

Since you have a TACACS+ server group defined, you should change your aaa authentication statement to reference that group name rather than the tacacs+ keyword, i.e.:

aaa authentication login telnet group TEST local
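
The same change would presumably apply to the other method lists in the posted config, which also still point at the tacacs+ keyword. A minimal sketch, assuming the group name TEST and the list names from the original post (extending the change to authorization and accounting is an extrapolation, not something confirmed in this thread):

```
! Sketch only: point the remaining method lists at the named group
aaa authorization exec default group TEST local
aaa accounting exec default start-stop group TEST
```

With all three lists referencing TEST, the router should only try the servers listed under "aaa group server tacacs+ TEST", in the order they appear there.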

Hope this helps...

Marcus

Cisco Employee

Re: ACS redundancy

Hi,

A couple of things to check:

1. On the router, run the test command below to check TACACS+ operation:

test aaa group tacacs username password

e.g. test aaa group tacacs cisco cisco

2. Try bumping the TACACS+ timeout from the default 5 seconds to 10 seconds.

3. What is the IOS version? There could be an associated bug:

CSCdx41454

4. Are you using the following command?

ip tacacs source-interface
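
As a rough sketch of points 2 and 4 in global configuration mode (the interface name FastEthernet0 is only a placeholder assumption; use whichever interface address is registered as the AAA client on both ACS servers):

```
! Sketch only: raise the global TACACS+ timeout from 5 to 10 seconds
tacacs-server timeout 10
! Hypothetical interface name; TACACS+ packets are then sourced from its address
ip tacacs source-interface FastEthernet0
```

If the source address seen by an ACS server does not match its AAA client entry, that server will not answer the router's requests, so it is worth checking this on both servers.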

Thanks,

yatin

New Member

Re: ACS redundancy

Hi,

Well, I did all these things but no luck.

I checked the bug ID. The thing is that I am not using "ip tacacs source-interface Loopback0" on the router I use for testing. It has a single FastEthernet interface, and that is the address I have configured in the ACS server.

Any more ideas?

Thanks in advance,

Kostas

Cisco Employee

Re: ACS redundancy

OK, the next thing to check is whether the secondary TACACS+ server is really set up correctly. You have authorization configured, so check on the ACS that EXEC is selected. Try to match the settings with those of the working server.

Another thing to check is that all the ACS services are indeed running on that server.

If this still doesn't resolve the issue, please send the 'sh ver' for the router and take a look at the details in the package.cab file.

Thanks,

yatin

New Member

Re: ACS redundancy

Hi Yatin,

Well, the secondary TACACS+ server is working properly. I configured the router first with only the main TACACS+ server, then with only the secondary, and both worked well.

Only the redundancy setup doesn't seem to work.

The IOS is 12.1(3)T1.

What I don't understand is the part about the package.cab file.

Could you please explain?

Thanks in advance,

/kostas

Cisco Employee

Re: ACS redundancy

Hi Kostas,

What if you reverse the order of the TACACS+ servers, from

tacacs-server host 10.10.10.1 key tes1t

tacacs-server host 10.10.10.2 key tes2t

to

tacacs-server host 10.10.10.2 key tes2t

tacacs-server host 10.10.10.1 key tes1t

As for the package.cab file, here's the procedure; looks lengthy but it is simple.

Follow these instructions even if your server is already running in detailed logging mode. This will ensure that all the proper service startup information is included in the package.cab file. If these instructions are not followed properly, we will need to request the information again.

- Log onto the ACS server itself as the local administrator.

- Browse to the UTILS directory in the ACS program directory.

- Run the program there called CSSupport.

- Select "Set Log Levels Only" and click Next.

- Select "Set Diagnostic Log Verbosity to Maximum."

- Check "Keep TACACS+ Packet Capture."

- Check "Keep RADIUS Packet Capture."

- Click Next, then click Finish.

At this point we need to duplicate the issue. Do whatever is causing the problem, or wait for the problem to occur again if it's not triggered by a direct sequence of events. Once that's done, we need to gather the verbose logs created. To do so, follow the instructions below AFTER the problem has been recreated and recorded:

- Log onto the ACS server itself as the local administrator.

- Browse to the UTILS directory in the ACS program directory.

- Run the program there called CSSupport.

- Select "Run Wizard" and click Next.

- If we need more than today's logs:

-- Check the "Previous Logs" checkbox.

-- Select the number of days to go back.

- Click Next four times.

- When the Finish button appears, click it.

The package.cab will be found in the UTILS\Support directory under the ACS program directory. This file contains all of the log information from ACS and limited information about the computer that ACS is running on. All collected information is essential for proper troubleshooting.

New Member

Re: ACS redundancy

Hello again,

So, I did reverse the TACACS+ servers on my routers, but that didn't solve anything.

Once again, let me give a short description of what I am doing.

I configure two ACS servers on my routers.

After a successful login via the primary server, I shut it down (the primary ACS) and try to log in via my secondary ACS. That is where my problem is.

I also tried two methods. The first is to simply add the ACS servers in the global config. The other is, after putting them in the global config, to also put them in "aaa group server tacacs+ TEST" and change the aaa authentication and authorization statements accordingly. Neither of these worked.

As for the package.cab, I have produced it. Which of the files do you need, and where can I send them?

Kind regards,

Kostas

Cisco Employee

Re: ACS redundancy

Hi Kostas,

The Failed Attempts csv and the tcs.log would be a good starting point. How about the ACS services on this server? Are they all running fine? Has this server even once authenticated a login properly? What you need to confirm is that the server is functioning properly as a primary server. That's why I asked you to put this server as the first entry.

What was the result of the "test aaa ......" command?

Thanks,

yatin

New Member

Re: ACS redundancy

Hi Yatin,

Well, I am in the middle of a strange situation.

After checking the logs that you pointed out, I didn't find anything strange.

So I reversed the configuration one more time. I mean I put the active ACS (10.10.10.1) as the backup and the backup (10.10.10.2) as the active in the router configuration.

tacacs-server host 10.10.10.2

tacacs-server host 10.10.10.1

Then I unplugged the network cable from the now-active ACS (10.10.10.2) and tried to log in to my router, and out of nowhere everything worked just fine!

Then I reconfigured my router as it was (put the ACS servers back in the previous order) and it didn't work.

tacacs-server host 10.10.10.1

tacacs-server host 10.10.10.2

The strange thing in all this is that my active ACS (10.10.10.1) does a FULL replication to the backup (10.10.10.2) so that both ACS servers have the same configuration. At first I thought there was something wrong with my active ACS (10.10.10.1), but I ended up concluding that nothing could be wrong with it, since it is doing the replication. So, since the backup ACS has exactly the same configuration as the active one (I triple-checked it!), it shouldn't have worked when I did the reverse. Correct?

I know it sounds a bit confusing, but this is the true story. :-)

Any more good ideas?

Thanks in advance,

/kostas

PS1: If you still want the package.cab, you can send me an e-mail and I will reply to it. I can't post the files here since they contain sensitive information.

PS2: Is it possible that the problem occurred because of a RADIUS distribution table? But then again, the backup ACS has the same distribution table....

Cisco Employee

Re: ACS redundancy

Hi Kostas,

Let me take a look at the package.cab.

Thanks,

yatin@cisco.com

Silver

Re: ACS redundancy

Hi,

Did you make the change suggested in the first reply?

From:

aaa authentication login telnet group tacacs+ local

To:

aaa authentication login telnet group TEST local

If this doesn't resolve the problem, the problem seems to be with the IOS code.

Thanks,

Mynul

New Member

Re: ACS redundancy

Hi Mynul,

Yes, I did that.

It was my first change, and to tell you the truth I felt like a complete idiot when I saw my obvious mistake.

Nevertheless, that didn't solve my problem.

Kind regards,

/kostas

Cisco Employee

Re: ACS redundancy

Hi Kostas,

The standby/backup ACS server doesn't seem to be in the domain NOC, i.e. a member of this domain. Please check that. If it is in a different domain, then there needs to be a proper trust relationship between those two domains.

Error in the log file ---------------

We are NOT a member of a domain => we cannot authenticate accounts on other trusted domains.

---------------------------------------

Because of this, it seems that there was no replication happening between the primary and secondary servers. The primary ACS is installed on the PDC of domain NOC.

Thanks,

yatin

Silver

Re: ACS redundancy

Hi,

This shouldn't cause any problem with ACS replication. But it is a problem if you want to authenticate users against the domain controller, as the minimum requirement is to install ACS on a member server. Have you integrated ACS with the domain controller, i.e. are you trying to authenticate users with domain accounts through ACS? If that's the case, maybe the primary ACS is sending malformed packets when it cannot authenticate users against the domain controller. To rule out ACS as the cause, please stop the primary ACS services altogether, then see whether the router falls back to the secondary server. If that doesn't happen, then the problem is in IOS. If you can share the version info of the router, I can suggest whether this is a bug in the code.

Thanks,

Mynul

New Member

Re: ACS redundancy

Hi Mynul,

Well no, that isn't the case. We are not trying to authenticate users with domain accounts. Actually, we don't have domain accounts at all. :-)

We use ACS only for our NOC people, strictly for telnet access, for some custom scripts and for some VoIP testing. But the main reason ACS exists is telnet access.

The case is the second scenario that you describe.

I stop all the services on the primary ACS (I even unplugged the network cable) and then try to log in to the router via the backup ACS, and that doesn't work.

The replication issue had crossed my mind, so what I did was delete ALL the entries in the backup ACS and do a manual replication, and all worked well.

As for the IOS bug, I read about the one (CSCdx41454) affecting routers that have loopbacks, but the router I am experimenting with doesn't have a loopback and has only one active FastEthernet interface with a default gateway.

As for the version, to tell you the truth we have many routers (GSR 12000, 7200, 3640, AS5300) in different PoPs, so it's kind of difficult to get the IOS versions from all of them. As a statistical experiment, I tried to check the redundancy on various boxes in various PoPs (after stopping the active ACS), but none of them worked, so I thought it couldn't be an IOS bug. Nevertheless, if you think there is a bug, I can send you a full list of the various IOS versions plus the package.cab files, as I did with Yatin. Just for the record, I will copy and paste the sh ver output of the router I am experimenting with.

Finally, the strange thing is that when the backup ACS takes the place of the active ACS and the active ACS becomes the backup, everything seems to work well. At least on a couple of routers that I've tested.

Thanks in advance,

/kostas

----------------------------------------------------------------------------------------------------

#sh ver
Cisco Internetwork Operating System Software
IOS (tm) 5300 Software (C5300-JK8S-M), Version 12.2(11)T, RELEASE SOFTWARE (fc1)
TAC Support: http://www.cisco.com/tac
Copyright (c) 1986-2002 by cisco Systems, Inc.
Compiled Wed 31-Jul-02 20:11 by ccai
Image text-base: 0x60008938, data-base: 0x61730000

ROM: System Bootstrap, Version 12.0(2)XD1, EARLY DEPLOYMENT RELEASE SOFTWARE (fc1)
BOOTLDR: 5300 Software (C5300-BOOT-M), Version 12.0(4)T1, RELEASE SOFTWARE (fc1)

uptime is 6 weeks, 6 days, 2 hours, 20 minutes
System returned to ROM by reload at 11:20:41 EDT Tue Apr 22 2003
System restarted at 11:21:31 EDT Tue Apr 22 2003
System image file is "flash:c5300-jk8s-mz.122-11.T.bin"

cisco AS5300 (R4K) processor (revision A.32) with 131072K/16384K bytes of memory.
Processor board ID 24710123
R4700 CPU at 150Mhz, Implementation 33, Rev 1.0, 512KB L2 Cache
Channelized E1, Version 1.0.
Bridging software.
X.25 software, Version 3.0.0.
SuperLAT software (copyright 1990 by Meridian Technology Corp).
TN3270 Emulation software.
Primary Rate ISDN software, Version 1.1.
Backplane revision 2
Manufacture Cookie Info:
EEPROM Type 0x0001, EEPROM Version 0x01, Board ID 0x30,
Board Hardware Version 3.2, Item Number 800-2544-04,
Board Revision B0, Serial Number 24710123,
PLD/ISP Version 0.0, Manufacture Date 25-Feb-2001.
1 Ethernet/IEEE 802.3 interface(s)
1 FastEthernet/IEEE 802.3 interface(s)
128 Serial network interface(s)
4 Channelized E1/PRI port(s)
60 DSP(s), 120 Voice resource(s)
128K bytes of non-volatile configuration memory.
32768K bytes of processor board System flash (Read/Write)
8192K bytes of processor board Boot flash (Read/Write)

-----------------------------------------------------------------------------------------------------------
