Solved: Strange error message

Walter Dey · ‎08-09-2013

We have two UCS domains running 2.1.2a, which were powered down and moved to a new datacenter.

Now, on both domains and on each FI, I see the error message "management interface is down" (see file).

However, I can connect to UCS and use UCSM, so something must be wrong here. The error cannot be cleared by acknowledging it.

Any advice is appreciated.

Walter Dey · ‎08-09-2013

Here is the CLI output showing that the management interface is up and running:

FI-BAL16-1-B(local-mgmt)# show mgmt-ip-debug
Ifconfig Info
-------------

eth0      Link encap:Ethernet HWaddr 00:2A:6A:15:E9:20
          inet addr:192.168.246.62 Bcast:192.168.246.255 Mask:255.255.254.0
          UP BROADCAST RUNNING MULTICAST MTU:1500 Metric:1
          RX packets:571856 errors:0 dropped:0 overruns:0 frame:0
          TX packets:223429 errors:0 dropped:0 overruns:0 carrier:0
          collisions:0 txqueuelen:1000
          RX bytes:44835327 (42.7 MiB) TX bytes:18630272 (17.7 MiB)

Robert Burns · ‎08-09-2013

Walter,

What does the cluster status show? (show clust ext).

Just because UCSM is reachable doesn't mean there isn't a problem with one of the FIs.

Robert

Walter Dey · ‎08-09-2013

Hi Robert

FI-BAL16-1-A# sho cluster extended-state
Cluster Id: 0x63470844da5411e2-0x9d4b002a6a1d63e4

Start time: Wed Jul 31 13:34:28 2013
Last election time: Wed Jul 31 13:39:57 2013

A: UP, PRIMARY
B: UP, SUBORDINATE

A: memb state UP, lead state PRIMARY, mgmt services state: UP
B: memb state UP, lead state SUBORDINATE, mgmt services state: UP
heartbeat state PRIMARY_OK

INTERNAL NETWORK INTERFACES:
eth1, UP
eth2, UP

HA READY
Detailed state of the device selected for HA storage:
Chassis 1, serial: FOX1712HD7U, state: active
FI-BAL16-1-A#

Robert Burns · ‎08-09-2013

Walter, how long has the error been present? It may just be a matter of a soaking period (24 hrs) for the error to self-clear. This could have been a result of the FIs' Mgmt interfaces being disconnected while still powered up before the move, or disconnected during power-up after the move.

If it's been longer than 24 hrs, we'll need to look at a UCSM show tech to advise further.

Regards,

Robert

Walter Dey · ‎08-09-2013

The move of the UCS domains happened one week ago!

I have attached the show tech, in case someone would like to take a closer look.

Robert Burns · ‎08-09-2013

Is the show tech you attached from the same system as the screenshot above?

The Fault ID from the screenshot doesn't appear in the logs and the faults in your screenshot have been acknowledged, whereas in the TechSupport bundle they have not. The timestamps are also different.

Please confirm.

Severity: Major

Code: F0736

Last Transition Time: 2013-07-31T13:22:43.322

ID: 590273 <<<<<<

Status: None

Description: Management interface on Fabric Interconnect B is down

Affected Object: sys/switch-B/extmgmt-intf

Name: Extmgmt If Mgmtifdown

Cause: Mgmtif Down

Type: Management

Acknowledged: No <<<<<<

Occurrences: 1

Creation Time: 2013-07-31T13:22:43.322

Original Severity: Major

Previous Severity: Major

Highest Severity: Major

Severity: Major

Code: F0736

Last Transition Time: 2013-07-31T13:21:13.333

ID: 590267 <<<<<

Status: None

Description: Management interface on Fabric Interconnect A is down

Affected Object: sys/switch-A/extmgmt-intf

Name: Extmgmt If Mgmtifdown

Cause: Mgmtif Down

Type: Management

Acknowledged: No <<<<<<<

Occurrences: 1

Creation Time: 2013-07-31T13:21:13.333

Original Severity: Major

Previous Severity: Major

Highest Severity: Major

Regards,

Robert

Walter Dey · ‎08-09-2013

Robert, you are correct! These are two UCS domains, and both show exactly the same error. Therefore I have also uploaded the other show tech (see above).

Thanks Walter.

Robert Burns · ‎08-09-2013

Walter,

Given that you just moved these domains between data centers, is there any chance you don't yet have a default gateway that's reachable from your FIs? The default Management Interface Monitoring policy tries to ping the default gateway; if it's unable to do so, that fault will remain.

From your show tech, it's indeed enabled:

`show mgmt-if-mon-policy`

Admin Status: Enabled

Polling Interval: 90

Max Failure Reports: 3

Monitoring mechanism: Ping Gateway

MII Status Settings:

-------------------

Mii Status Retry Interval: 5

Mii Status Retry Count: 3

Solution: disable this policy or configure the gateway.
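
To illustrate why the fault persists, here is a minimal Python sketch of the monitoring logic. It is a simplified model, not Cisco's implementation: the class `MonitorPolicy` and the function `fault_states` are hypothetical names, the defaults are copied from the policy output above, and it assumes the fault is raised after `max_failure_reports` consecutive failed probes and cleared on the first successful one.

```python
from dataclasses import dataclass
from typing import Iterable, List


@dataclass
class MonitorPolicy:
    # Defaults copied from the `show mgmt-if-mon-policy` output in this thread.
    admin_enabled: bool = True
    polling_interval_s: int = 90   # informational only in this sketch
    max_failure_reports: int = 3


def fault_states(policy: MonitorPolicy, probe_results: Iterable[bool]) -> List[bool]:
    """Return, for each polling cycle, whether the fault would be active.

    Assumed model (hypothetical): the fault is active once the number of
    consecutive failed probes reaches max_failure_reports, and it clears
    on the first successful probe. A disabled policy never raises it.
    """
    states: List[bool] = []
    consecutive_failures = 0
    for probe_ok in probe_results:
        if not policy.admin_enabled:
            states.append(False)
            continue
        consecutive_failures = 0 if probe_ok else consecutive_failures + 1
        states.append(consecutive_failures >= policy.max_failure_reports)
    return states


# Gateway never answers the ping: the fault appears on the third cycle and stays.
print(fault_states(MonitorPolicy(), [False] * 5))
# A probe that always succeeds (for example a link-state check on a healthy port).
print(fault_states(MonitorPolicy(), [True] * 5))
```

Under this model, an unreachable gateway keeps the fault active indefinitely even though the interface itself is up, which matches the symptom described in this thread.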

Regards,

Robert

Walter Dey · ‎08-09-2013

Hi Rob

Indeed, this was the problem: the default gateway cannot be pinged! I changed the policy's monitoring mechanism to MII Status, and now the error is gone!

Thanks for your outstanding support (as usual)

Enjoy the weekend

Kind Regards

Walter.

rosaho · ‎07-13-2015

The attachment has been removed to comply with the CSC terms of use.