How to fix Fault Tolerance on ACE 20?

netternewbie
Level 1

Hi,

 

I have two ACE20 modules in two 6500s. I have an Admin context with a number of other contexts. Unfortunately, one context shows the backup as FSM_FT_STATE_STANDBY_COLD rather than FSM_FT_STATE_STANDBY_HOT.

 

I have checked its configuration and noticed that a number of lines from the correct, active context are missing. I now know I have to force the config across by taking the ft-group out of service.

 

My question is: do I do this on the working ACE, or on both ACEs' admin groups?

 

I think this is what I do.

config ft group X

no inservice

Then give it a few minutes to sync across to the broken ACE and put it back in service?

 

Or do I need to do this on the broken ACE? I don't want to lose the good config by having the broken config copied across.

1 Accepted Solution

Hi Netter,

I thought you had already applied it, but I am glad the issue is resolved.

Regards,

Kanwal

32 Replies

Kanwaljeet Singh
Cisco Employee

Hi Netter,

You can run "no ft auto-sync running-config" and then "ft auto-sync running-config". Do the same for the startup config too. This should take care of the issue. Also note that the standby-cold state can be caused by certificates or scripts that are missing on the standby. Since the other contexts are OK, it cannot be a software or license mismatch.

This is done on the ACTIVE ACE, in configuration mode, from the affected context itself.

Regards,

Kanwal

Thanks Kanwalsi,

 

I will give this a go. I take it this will cause an outage for services in this context?

 

 

Hi Netter,

There will be no impact on the services. This just disables and then re-enables configuration sync.

Regards,

Kanwal

Thanks Kanwalsi,

 

Unfortunately, this did not fix the problem. I have looked at the crypto files and all of them are the same.

Hi Netter,

That should have fixed the problem. Send me the output of "show ft group detail".

Regards,

Kanwal

FT Group                     : 7
No. of Contexts              : 1
Context Name                 : *************************
Context Id                   : 7
Configured Status            : in-service
Maintenance mode             : MAINT_MODE_OFF
My State                     : FSM_FT_STATE_ACTIVE
My Config Priority           : 200
My Net Priority              : 200
My Preempt                   : Enabled
Peer State                   : FSM_FT_STATE_STANDBY_COLD
Peer Config Priority         : 100
Peer Net Priority            : 100
Peer Preempt                 : Enabled
Peer Id                      : 1
Last State Change time       : Wed Oct 30 15:33:06 2013
Running cfg sync enabled     : Enabled
Running cfg sync status      : Peer in Cold State. Error on Standby device when applying configuration file replicated from active
Startup cfg sync enabled     : Enabled
Startup cfg sync status      : Peer in Cold State.  
Bulk sync done for ARP: 0
Bulk sync done for LB: 0
Bulk sync done for ICM: 0
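
When this output has to be checked for several FT groups, the fields that matter here (peer state and the two sync status lines) can be picked out with a short script. The following is only a minimal Python sketch, not a Cisco tool: it assumes the "show ft group detail" text has already been captured into a string, the function names parse_ft_detail and ft_problems are made up for illustration, and it simply treats any peer state other than FSM_FT_STATE_STANDBY_HOT as worth a look.

```python
def parse_ft_detail(text):
    """Split 'Key : Value' lines from captured 'show ft group detail' text into a dict."""
    fields = {}
    for line in text.splitlines():
        # Split on the first colon only, so timestamps such as 15:33:06 stay intact.
        key, sep, value = line.partition(":")
        if sep:
            fields[key.strip()] = value.strip()
    return fields


def ft_problems(fields):
    """Return human-readable warnings when the peer is not in standby-hot (assumed healthy state)."""
    problems = []
    peer = fields.get("Peer State", "")
    if peer != "FSM_FT_STATE_STANDBY_HOT":
        problems.append("peer state is " + (peer or "unknown"))
    for key in ("Running cfg sync status", "Startup cfg sync status"):
        status = fields.get(key, "")
        if "Cold State" in status or "Error" in status:
            problems.append(key + ": " + status)
    return problems


# A few lines copied from the output posted in this thread.
sample = """\
My State                     : FSM_FT_STATE_ACTIVE
Peer State                   : FSM_FT_STATE_STANDBY_COLD
Last State Change time       : Wed Oct 30 15:33:06 2013
Running cfg sync status      : Peer in Cold State. Error on Standby device when applying configuration file replicated from active
Startup cfg sync status      : Peer in Cold State.
"""

for problem in ft_problems(parse_ft_detail(sample)):
    print(problem)
```

For the output above, the sketch reports the peer state plus both sync status lines, which points at the standby rejecting part of the replicated configuration rather than at sync being disabled.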

Hi Netter,

Can you try the same commands as suggested above from the Admin context? Please do it for both the running and startup config. Also, do you know whether there are any probe scripts on the ACTIVE that are missing on the standby?

Regards,

Kanwal

Thanks Kanwalsi, will check now. How do I check for active scripts?

 

Also, if I try from the Admin context, will that sync across all configs, or do I go into this group 7 context and do it from there?

Hi Netter,

You can run "dir disk0:" to see whether there are any scripts in there. Ensure that they are present on the standby as well. Scripts are not replicated, so you will need to FTP them to the standby. And yes, the sync from Admin will be for all contexts. There should be no problem, but if you have any apprehensions about doing this in Admin, try to do it during your downtime.

Also, let me know if you get any error while disabling and then enabling the auto-sync.

Regards,

Kanwal

Thanks Kanwalsi,

What I have noticed is that there is a script on the active context dated 2009, and the same script is on the standby context dated April 2011. Strange. Is there a quick way to copy the 2009 one from the active load balancer to the standby load balancer?

Hi Netter,

The date should not matter. If the script is the same, it shouldn't be a problem. Check the name and size to see whether the file is the same. Otherwise you would need to upload the script to the standby using FTP (there is no other way), or remove the script from the ACTIVE if you don't use or need it.

Regards,

Kanwal
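
The name-and-size check described above can be sketched in Python. This is an illustrative helper only: it assumes the file names and sizes have already been copied by hand from "dir disk0:" on each unit into dictionaries, and compare_listings, the sample file names, and the sizes are all hypothetical.

```python
def compare_listings(active, standby):
    """Compare {filename: size_in_bytes} listings taken from the two units.

    Returns (missing, size_mismatch, possible_renames), where possible_renames
    pairs a file found only on the active unit with a same-sized file found
    only on the standby, for example names differing by '_' versus '-'.
    """
    only_active = sorted(set(active) - set(standby))
    only_standby = sorted(set(standby) - set(active))

    # Files present on both units under the same name but with different sizes.
    size_mismatch = sorted(
        name for name in set(active) & set(standby) if active[name] != standby[name]
    )

    missing = []
    possible_renames = []
    unmatched_standby = list(only_standby)
    for name in only_active:
        # Look for a standby-only file of identical size: likely the same file, renamed.
        match = next((s for s in unmatched_standby if standby[s] == active[name]), None)
        if match is None:
            missing.append(name)
        else:
            possible_renames.append((name, match))
            unmatched_standby.remove(match)
    return missing, size_mismatch, possible_renames


# Hypothetical listings; the real file names in this thread are masked.
active = {"example_probe": 1200, "health.tcl": 800}
standby = {"example-probe": 1200, "health.tcl": 800}

missing, size_mismatch, possible_renames = compare_listings(active, standby)
print("missing on standby:", missing)
print("size differs:", size_mismatch)
print("same size, different name:", possible_renames)
```

Matching on size alone is only a hint that two files are the same; it is meant to surface candidates for a closer look, not to prove the contents are identical.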

Hi Kanwalsi,

One thing I have noticed: the scripted probes are the same size, but on the active, in disk0, the file is called ****_probe, while on the standby it is ****-probe.

Strangely, the correct name is actually ****-probe, the way it is on the non-working config.

 

Hi Netter,

Try uploading it with the same name and see if that makes a difference. Ensure that the crypto files are also the same. Other than that, I cannot think of anything that should cause the issue. If the issue still persists, I would suggest opening a TAC case and letting them have a look at it.

Regards,

Kanwal

Hi Kanwalsi.

From the admin context by doing 

ft group x

no inservice

inservice

It fixes fault tolerance between the ACEs, but it wipes some config from a sticky serverfarm I was working on in that context. When I put the config back in, fault tolerance breaks.

Very strange. Any ideas? Or might it still be the probe name?

 
