Infinite reload of STANDBY FWSM unit due to incorrect configuration replication from ACTIVE FWSM unit.

Unanswered Question
Aug 1st, 2010

Dear Cisco experts and experienced users,

This is a failover problem. We have an Active/Standby (A/S) FWSM failover configuration (version 4.0(6) on each unit).

The problem appeared suddenly when the configuration on our ACTIVE unit reached about 125K in size. The STANDBY unit cannot load this configuration because some ACL entries arrive garbled and are rejected by the parser, and the failing line is in a different place each time. After the error the STANDBY unit reloads, and the cycle repeats:

1st reload:

Aug  x 02:08:01.172 MSK: %DIAG-SP-6-RUN_MINIMUM: Module 4: Running Minimal Diagnostics...
SW2#
Aug  x 02:08:04.260 MSK: %MFIB_CONST_RP-6-REPLICATION_MODE_CHANGE: Replication Mode Change Detected. Current system replication mode is Ingress
Aug x 02:08:04.300 MSK: %SNMP-5-MODULETRAP: Module 4 [Up] Trap
SW2#
Aug  x 02:08:03.952 MSK: %DIAG-SP-6-DIAG_OK: Module 4: Passed Online Diagnostics
SW2#
Aug x 02:08:07.975 MSK: %OIR-SP-6-INSCARD: Card inserted in slot 4, interfaces are now online

SW2#session slot 4 processor 1
The default escape character is Ctrl-^, then x.
You can also type 'exit' at the remote prompt to end the session
Trying 127.0.0.41 ... Open

FWSM# Beginning configuration replication from mate.
Access Rules Download Complete: Memory Utilization: < 1%

Config Sync Error: Following command could not be executed on standby
         access-list XXX commit-status commfttgl ,hne 93 extended permit object-group DM_INLINE_SERVICE_19 host me host you 
Context: single_vf

******REPLICATION OF CONFIGURATION FROM ACTIVE TO STANDBY UNIT IS INCOMPLETE,
TO PREVENT THE STANDBY UNIT TAKING OVER AS ACTIVE WITH A PARTIAL CONFIGURATION,
THE STANDBY UNIT WILL NOW REBOOT*******

[Connection to 127.0.0.41 closed by foreign host]
SW2#
Aug  x 02:08:59.764 MSK: %SNMP-5-MODULETRAP: Module 4 [Down] Trap
SW2#
Aug  x 02:08:59.764 MSK: %MFIB_CONST_RP-6-REPLICATION_MODE_CHANGE: Replication Mode Change Detected. Current system replication mode is Egress
SW2#
Aug  x 02:08:59.739 MSK: SP: The PC in slot 4 is shutting down. Please wait ...
SW2#
Aug  x 02:09:14.749 MSK: SP: shutdown_pc_process:No response from module 4
RD2rt16#
Aug  2 02:09:24.761 MSK: %C6KPWR-SP-4-DISABLED: power to module in slot 4 set off (Reset)
SW2#

2nd reload:

Aug  x 02:10:01.172 MSK: %DIAG-SP-6-RUN_MINIMUM: Module 4: Running Minimal Diagnostics...
SW2#
Aug  x 02:10:04.260 MSK: %MFIB_CONST_RP-6-REPLICATION_MODE_CHANGE: Replication Mode Change Detected. Current system replication mode is Ingress
Aug x 02:10:04.300 MSK: %SNMP-5-MODULETRAP: Module 4 [Up] Trap
SW2#
Aug  x 02:10:03.952 MSK: %DIAG-SP-6-DIAG_OK: Module 4: Passed Online Diagnostics
SW2#
Aug x 02:10:07.975 MSK: %OIR-SP-6-INSCARD: Card inserted in slot 4, interfaces are now online

SW2#session slot 4 processor 1
The default escape character is Ctrl-^, then x.
You can also type 'exit' at the remote prompt to end the session
Trying 127.0.0.41 ... Open

FWSM# Beginning configuration replication from mate.
Access Rules Download Complete: Memory Utilization: < 1%

Config Sync Error: Following command could not be executed on standby
               access-list MMM commit-status committed line 61 extendek pgzm)u udp Network 255.255.224.0 host she eq ntp
Context: single_vf

******REPLICATION OF CONFIGURATION FROM ACTIVE TO STANDBY UNIT IS INCOMPLETE,
TO PREVENT THE STANDBY UNIT TAKING OVER AS ACTIVE WITH A PARTIAL CONFIGURATION,
THE STANDBY UNIT WILL NOW REBOOT*******

[Connection to 127.0.0.41 closed by foreign host]
SW2#
Aug  x 02:10:59.764 MSK: %SNMP-5-MODULETRAP: Module 4 [Down] Trap
SW2#
Aug  x 02:10:59.764 MSK: %MFIB_CONST_RP-6-REPLICATION_MODE_CHANGE: Replication Mode Change Detected. Current system replication mode is Egress
SW2#
Aug  x 02:10:59.739 MSK: SP: The PC in slot 4 is shutting down. Please wait ...
SW2#
Aug  x 02:11:14.749 MSK: SP: shutdown_pc_process:No response from module 4
RD2rt16#
Aug  2 02:11:24.761 MSK: %C6KPWR-SP-4-DISABLED: power to module in slot 4 set off (Reset)
SW2#

And so on, indefinitely.
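
For anyone tracking the same loop, here is a minimal Python sketch for measuring how long the module stays up in each cycle, based on the MODULETRAP lines in the logs above. The function name `uptime_per_cycle` is made up for illustration, the regex assumes the timestamp format shown in these logs, and the sketch does not handle a cycle that crosses midnight.

```python
import re
from datetime import datetime

# Matches e.g. "02:08:04.300 MSK: %SNMP-5-MODULETRAP: Module 4 [Up] Trap"
TRAP = re.compile(
    r"(\d{2}:\d{2}:\d{2}\.\d{3}) MSK: %SNMP-5-MODULETRAP: "
    r"Module (\d+) \[(Up|Down)\] Trap"
)

def uptime_per_cycle(log_text, module=4):
    """Return the seconds between each [Up] trap and the next [Down] trap."""
    durations = []
    up_time = None
    for stamp, mod, state in TRAP.findall(log_text):
        if int(mod) != module:
            continue
        t = datetime.strptime(stamp, "%H:%M:%S.%f")
        if state == "Up":
            up_time = t
        elif up_time is not None:
            durations.append((t - up_time).total_seconds())
            up_time = None
    return durations

# Trap lines copied from the two reload cycles above.
sample = """\
Aug x 02:08:04.300 MSK: %SNMP-5-MODULETRAP: Module 4 [Up] Trap
Aug  x 02:08:59.764 MSK: %SNMP-5-MODULETRAP: Module 4 [Down] Trap
Aug x 02:10:04.300 MSK: %SNMP-5-MODULETRAP: Module 4 [Up] Trap
Aug  x 02:10:59.764 MSK: %SNMP-5-MODULETRAP: Module 4 [Down] Trap
"""
print(uptime_per_cycle(sample))
```

With the timestamps quoted here, the module is up for roughly 55 seconds in each cycle before the next reload.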

Any ideas on how to resolve this?

With best regards,

Max

Magnus Mortensen Sun, 08/01/2010 - 22:08

Max,

I see you sanitized the output. Can you please provide the exact lines that are failing? There are some bugs that may affect the code you are running, but we need to see exactly which config line fails. What changes were made recently that may have kicked off this issue?

- Magnus

scytmax_2 Mon, 08/02/2010 - 02:12

Magnus,

Thank you for asking for additional information. More details are below.

We have two identical WS-C6506 switches, each with a WS-SUP720-3B (s72033_rp-ADVENTERPRISEK9_WAN-M, Version 12.2(33)SXH4) and one WS-SVC-FWM-1 (FWSM Firewall Version 4.0(6)) installed.

The configuration of WS-C6506 (firewall related) is identical:

firewall multiple-vlan-interfaces
firewall module 4 vlan-group 570,800,801,809,850,860,900,982,984,986,990,992,993,998,
firewall vlan-group 570  570
firewall vlan-group 800  800
firewall vlan-group 801  801
firewall vlan-group 809  809
firewall vlan-group 850  850
firewall vlan-group 860  860
firewall vlan-group 900  900
firewall vlan-group 982  982
firewall vlan-group 984  984
firewall vlan-group 986  986
firewall vlan-group 990  990
firewall vlan-group 992  992
firewall vlan-group 993  993
firewall vlan-group 998  998,999

Note: all VLANs exist and are active.

The configuration of SVC-FWM-1 (A/S failover, ACTIVE) (failover related):

!
interface Vlan998
description LAN Failover Interface
!
interface Vlan999
description STATE Failover Interface
!

failover
failover lan unit primary
failover lan interface FWSM_failover Vlan998
failover link FWSM_Statefull_failover Vlan999
failover interface ip FWSM_failover 1.1.1.1 255.255.255.252 standby 1.1.1.2
failover interface ip FWSM_Statefull_failover 1.1.2.1 255.255.255.252 standby 1.1.2.2

The configuration of SVC-FWM-1 (A/S failover, STANDBY) (failover related):

!
interface Vlan998
description LAN Failover Interface
!
interface Vlan999
description STATE Failover Interface
!

failover
failover lan unit secondary
failover lan interface FWSM_failover Vlan998
failover link FWSM_Statefull_failover Vlan999
failover interface ip FWSM_failover 1.1.1.1 255.255.255.252 standby 1.1.1.2
failover interface ip FWSM_Statefull_failover 1.1.2.1 255.255.255.252 standby 1.1.2.2

!

Earlier, the ACTIVE and STANDBY units would lose connectivity with each other from time to time.

At the beginning:

ACTIVE:

Failover On
Failover unit Primary
Failover LAN Interface: FWSM_failover Vlan 998 (up)

............................................................................

        This host: Primary - Active

............................................................................

        Other host: Secondary - Standby Ready

............................................................................

Stateful Failover Logical Update Statistics
        Link : FWSM_Statefull_failover Vlan 999 (up)

............................................................................

STANDBY:

Failover On
Failover unit Secondary
Failover LAN Interface: FWSM_failover Vlan 998 (up)

............................................................................

        This host: Secondary - Standby Ready

............................................................................

        Other host: Primary - Active

............................................................................

Stateful Failover Logical Update Statistics
        Link : FWSM_Statefull_failover Vlan 999 (up)

............................................................................

Then, after a period of many configuration changes made via Device Manager Version 6.1(5)F during working hours, we end up in this situation:

ACTIVE:

Failover On
Failover unit Primary
Failover LAN Interface: FWSM_failover Vlan 998 (up)

............................................................................

        This host: Primary - Active

............................................................................

        Other host: Secondary - Failed

............................................................................

Stateful Failover Logical Update Statistics
        Link : FWSM_Statefull_failover Vlan 999 (Failed)

............................................................................

STANDBY:

Failover On
Failover unit Secondary
Failover LAN Interface: FWSM_failover Vlan 998 (up)

............................................................................

        This host: Secondary - Standby Ready

............................................................................

        Other host: Primary - Active

............................................................................

Stateful Failover Logical Update Statistics
        Link : FWSM_Statefull_failover Vlan 999 (up)

............................................................................
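
The mismatch above (ACTIVE reports the peer as Failed while STANDBY still reports itself as Standby Ready) can be checked mechanically. The following is a rough Python sketch; `parse_failover` and `views_disagree` are hypothetical helper names, and the parsing covers only the fields in the abbreviated output quoted here, not the full `show failover` output.

```python
import re

def parse_failover(text):
    """Extract this-host state, other-host state and stateful link state."""
    this_host = re.search(r"This host:\s*(\w+) - (.+)", text)
    other_host = re.search(r"Other host:\s*(\w+) - (.+)", text)
    link = re.search(r"Link\s*:\s*\S+ Vlan \d+ \((\w+)\)", text)
    return {
        "this": (this_host.group(1), this_host.group(2).strip()),
        "other": (other_host.group(1), other_host.group(2).strip()),
        "link": link.group(1),
    }

def views_disagree(active_view, standby_view):
    """True when the two units describe each other inconsistently."""
    return (
        active_view["other"] != standby_view["this"]
        or standby_view["other"] != active_view["this"]
        or active_view["link"] != standby_view["link"]
    )

# Abbreviated outputs copied from the failed state shown above.
active_out = """
        This host: Primary - Active
        Other host: Secondary - Failed
        Link : FWSM_Statefull_failover Vlan 999 (Failed)
"""
standby_out = """
        This host: Secondary - Standby Ready
        Other host: Primary - Active
        Link : FWSM_Statefull_failover Vlan 999 (up)
"""
print(views_disagree(parse_failover(active_out), parse_failover(standby_out)))
```

Run against the healthy outputs from "At the beginning", the same check would report no disagreement, since both units then see each other as Active / Standby Ready with the link up.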

This is how we resolved the situation earlier: on the WS-C6506 where the STANDBY unit is installed in slot 4, we entered the command:

no power enable module 4

and then, two minutes later:

power enable module 4

This worked fine for us: after some time the STANDBY unit reloaded and copied its running configuration from the ACTIVE unit, and communication between ACTIVE and STANDBY was restored.

But since the configuration on the ACTIVE unit became large enough (about 125K), this method no longer works.

When the STANDBY unit reloads, it begins configuration replication from the ACTIVE unit, but then stops and reloads again because of this error:

Config Sync Error: Following command could not be executed on standby
               access-list MMM commit-status committed line 61 extendek pgzm)u udp Network 255.255.224.0 host she eq ntp
Context: single_vf

******REPLICATION OF CONFIGURATION FROM ACTIVE TO STANDBY UNIT IS INCOMPLETE,
TO PREVENT THE STANDBY UNIT TAKING OVER AS ACTIVE WITH A PARTIAL CONFIGURATION,
THE STANDBY UNIT WILL NOW REBOOT*******

The correct line is:

access-list MMM line 61 extended permit udp MMM_Network 255.255.224.0 host 10.10.100.1 eq ntp

The next time, the parser of the replicated configuration on the STANDBY unit stopped at a different line:

Config Sync Error: Following command could not be executed on standby
         access-list XXX commit-status commfttgl ,hne 93 extended permit object-group DM_INLINE_SERVICE_19 host me host you 
Context: single_vf

******REPLICATION OF CONFIGURATION FROM ACTIVE TO STANDBY UNIT IS INCOMPLETE,
TO PREVENT THE STANDBY UNIT TAKING OVER AS ACTIVE WITH A PARTIAL CONFIGURATION,
THE STANDBY UNIT WILL NOW REBOOT*******

The correct line is:

access-list XXX line 93 extended permit object-group DM_INLINE_SERVICE_19 host me host you

and so on...
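
One thing that stands out when the two garbled fragments are compared byte by byte: the corruption does not look random. The Python sketch below XORs each garbled fragment with the text it presumably should have been. The expected strings are an assumption: "committed line" is taken from the other error message, where those tokens arrived intact, and "extended permit" from the correct ACL line. The helper name `xor_diff` is made up for illustration.

```python
def xor_diff(garbled, expected):
    """Return (offset, expected_char, garbled_char, xor) for each differing byte."""
    if len(garbled) != len(expected):
        raise ValueError("fragments must have the same length")
    return [
        (i, e, g, ord(e) ^ ord(g))
        for i, (g, e) in enumerate(zip(garbled, expected))
        if g != e
    ]

# (garbled fragment from the error output, assumed intended text)
pairs = [
    ("commfttgl ,hne", "committed line"),
    ("extendek pgzm)u", "extended permit"),
]

for garbled, expected in pairs:
    diffs = xor_diff(garbled, expected)
    first = diffs[0][0]  # offset of the first corrupted byte
    print(repr(garbled))
    for offset, e, g, x in diffs:
        # Show each corrupted byte relative to the first one.
        print(f"  +{offset - first}: {e!r} -> {g!r}  xor=0x{x:02x}")
```

For both fragments the differing bytes fall at the same relative offsets (+0, +3, +4, +6, +7) with the same XOR values (0x0f, 0x02, 0x08, 0x40, 0x01). That is only two samples, so it proves nothing by itself, but a repeating bit pattern like this may be worth including when reporting the problem.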

I also want to point out that the entire configuration on the ACTIVE unit is correct, without any errors.

We also actively use objects like these in our configuration:

name 10.10.10.10 me

name 10.10.10.0 MMM_Network

as well as other network and service object groups.

We also make wide use of remarks in our access lists and actively use Device Manager for FWSM configuration during business hours (to speed up configuration work).

All of this is recognized correctly on the ACTIVE unit but cannot be fully replicated to the STANDBY unit.

How can we resolve this problem without manually loading the configuration from the ACTIVE unit onto the STANDBY unit?

With best regards,

Max

marat.ishmakov Mon, 11/30/2015 - 22:13

Hello. I am facing the same problem that Max described. Has anybody found a solution?

Thank you.
