07-24-2012 03:32 AM - edited 03-01-2019 10:31 AM
Hi All.
I have a very strange situation. A new UCS running 2.0(3a) recently arrived at our site.
After about a week of running without problems, fabric interconnect B went down (this has now happened twice).
If I run:
porfic03-B# show cluster extended-state
Start time: Thu Jul 12 17:38:02 2012
Last election time: Thu Jul 12 18:29:59 2012
B: UP, PRIMARY
A: UP, SUBORDINATE
B: memb state UP, lead state PRIMARY, mgmt services state: UP
A: memb state UP, lead state SUBORDINATE, mgmt services state: UP
heartbeat state PRIMARY_OK
INTERNAL NETWORK INTERFACES:
eth1, DOWN
eth2, UP
HA READY
On porfic03-A I get:
porfic03-A# show cluster extended-state
Start time: Thu Jul 12 18:29:51 2012
Last election time: Thu Jul 12 18:29:54 2012
A: UP, SUBORDINATE
B: UP, PRIMARY
A: memb state UP, lead state SUBORDINATE, mgmt services state: UP
B: memb state UP, lead state PRIMARY, mgmt services state: UP
heartbeat state PRIMARY_OK
INTERNAL NETWORK INTERFACES:
eth1, UP
eth2, UP
HA READY
So as you can see, everything looks fine from inside UCS. From outside UCS, however, I cannot ping porfic03-B or the cluster virtual IP (because it is attached to porfic03-B, which is the primary node).
Physically, I can see that the management network port on porfic03-B has link but no activity.
Can anyone point me in the right direction to solve this issue?
Rebooting porfic03-B solves the problem, but it comes back after about a week.
Any ideas?
Regards
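When comparing the two `show cluster extended-state` outputs above by eye, it is easy to miss that FI B lists eth1 as DOWN while still reporting HA READY. The following is a minimal Python sketch for pulling out those fields from pasted output. The function names are hypothetical, the sample text is trimmed from the FI B output above, and what eth1 maps to physically is not stated in this thread, so treat a flagged interface only as something to check further.

```python
import re

def parse_cluster_state(text):
    """Extract HA readiness, heartbeat state and internal interface states."""
    state = {"ha_ready": False, "heartbeat": None, "interfaces": {}}
    for raw in text.splitlines():
        line = raw.strip()
        if line == "HA READY":
            state["ha_ready"] = True
        elif line.startswith("heartbeat state"):
            # e.g. "heartbeat state PRIMARY_OK" -> "PRIMARY_OK"
            state["heartbeat"] = line.split()[-1]
        else:
            # e.g. "eth1, DOWN"
            match = re.fullmatch(r"(eth\d+),\s*(UP|DOWN)", line)
            if match:
                state["interfaces"][match.group(1)] = match.group(2)
    return state

def down_interfaces(state):
    """Return the names of internal interfaces reported as DOWN."""
    return sorted(name for name, status in state["interfaces"].items()
                  if status == "DOWN")

# Trimmed copy of the FI B output posted above.
FI_B_OUTPUT = """\
B: UP, PRIMARY
A: UP, SUBORDINATE
heartbeat state PRIMARY_OK
INTERNAL NETWORK INTERFACES:
eth1, DOWN
eth2, UP
HA READY
"""

state_b = parse_cluster_state(FI_B_OUTPUT)
print("HA ready:", state_b["ha_ready"])
print("Interfaces down:", down_interfaces(state_b))
```

Running the same function on the FI A output would give an empty list of down interfaces, which makes the difference between the two nodes explicit.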
07-24-2012 05:07 AM
Hello,
Please provide the following information from FI B:
scope monitoring
scope sysdebug
show cores detail
connect nxos b
show version
show system reset-reason
show int mgmt0
------------------
Regarding network connectivity for FI B mgmt interface, start with verifying the cabling and upstream switch port configuration.
Padma
07-24-2012 05:54 AM
I have already checked the FI B mgmt interface cabling and the upstream switch port (no errors); the port is up on the switch. I also moved the cable from mgmt A to mgmt B, and the port still showed no activity.
The output of:
porfic03-B# scope monitoring
porfic03-B /monitoring #
porfic03-B# scope sysdebug
^
% Invalid Command at '^' marker
porfic03-B# show cores detail
^
% Invalid Command at '^' marker
connect nxos b
show version:
Software
BIOS: version 3.5.0
loader: version N/A
kickstart: version 5.0(3)N2(2.03a)
system: version 5.0(3)N2(2.03a)
power-seq: Module 1: version v1.0
Module 3: version v2.0
uC: version v1.2.0.1
SFP uC: Module 1: v1.0.0.0
BIOS compile time: 02/03/2011
kickstart image file is: bootflash:/installables/switch/ucs-6100-k9-kickstart.5.0.3.N2.2.03a.bin
kickstart compile time: 6/19/2012 7:00:00 [06/19/2012 15:21:08]
system image file is: bootflash:/installables/switch/ucs-6100-k9-system.5.0.3.N2.2.03a.bin
system compile time: 6/19/2012 7:00:00 [06/19/2012 17:04:19]
Hardware
cisco UCS 6248 Series Fabric Interconnect ("O2 32X10GE/Modular Universal Platform Supervisor")
Intel(R) Xeon(R) CPU with 16622556 kB of memory.
Processor Board ID FOC161117SU
Device name: porfic03-B
bootflash: 31266648 kB
Kernel uptime is 11 day(s), 20 hour(s), 13 minute(s), 4 second(s)
Last reset
Reason: Unknown
System version: 5.0(3)N2(2.03a)
Service:
plugin
Core Plugin, Ethernet Plugin, Fc Plugin, Virtualization Plugin
show system reset-reason:
----- reset reason for Supervisor-module 1 (from Supervisor in slot 1) ---
1) No time
Reason: Unknown
Service:
Version: 5.0(3)N2(2.03a)
2) At 462964 usecs after Wed Jul 4 16:04:11 2012
Reason: Reset Requested by CLI command reload
Service:
Version: 5.0(3)N2(2.03a)
3) At 493083 usecs after Wed Jul 4 10:40:51 2012
Reason: Reset Requested by CLI command reload
Service:
Version: 5.0(3)N2(2.03a)
4) At 902919 usecs after Tue Jul 3 15:29:48 2012
Reason: Reset Requested by CLI command reload
Service:
Version: 5.0(3)N2(2.02q)
show int mgmt0:
mgmt0 is down (Administratively down)
Hardware: GigabitEthernet, address: 547f.ee8b.c060 (bia 547f.ee8b.c060)
Internet Address is xxx.xx.xx.xx/24
MTU 1500 bytes, BW 1000000 Kbit, DLY 10 usec,
reliability 64808/255, txload 1/255, rxload 1/255
Encapsulation ARPA
auto-duplex, 1000 Mb/s
EtherType is 0x0000
1 minute input rate 0 bits/sec, 0 packets/sec
1 minute output rate 0 bits/sec, 0 packets/sec
Rx
472140 input packets 0 unicast packets 472140 multicast packets
0 broadcast packets 39500853 bytes
Tx
0 output packets 0 unicast packets 0 multicast packets
0 broadcast packets 0 bytes
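The counters above show 472140 input packets (all multicast) and zero output packets. As a rough sketch, the Rx/Tx block can be parsed in Python to flag a port that only receives. The helper names are hypothetical, the sample is copied from the output above, and what a receive-only pattern means for this particular fault is not established in this thread.

```python
import re

def parse_counters(text):
    """Parse the Rx/Tx counter block of 'show int mgmt0' into a dict."""
    counters = {}
    section = None
    for raw in text.splitlines():
        line = raw.strip()
        if line in ("Rx", "Tx"):
            section = line.lower()
            continue
        if section is None:
            continue
        # Each counter is "<number> <label>", e.g. "472140 input packets".
        for value, label in re.findall(r"(\d+)\s+(\w+ packets|bytes)", line):
            counters[f"{section}_{label.replace(' ', '_')}"] = int(value)
    return counters

def receive_only(counters):
    """True when the port has received packets but never sent any."""
    return (counters.get("rx_input_packets", 0) > 0
            and counters.get("tx_output_packets", 0) == 0)

# Counter block copied from the mgmt0 output above.
MGMT0_COUNTERS = """\
Rx
472140 input packets 0 unicast packets 472140 multicast packets
0 broadcast packets 39500853 bytes
Tx
0 output packets 0 unicast packets 0 multicast packets
0 broadcast packets 0 bytes
"""

counters = parse_counters(MGMT0_COUNTERS)
print("Rx packets:", counters["rx_input_packets"])
print("Tx packets:", counters["tx_output_packets"])
print("Receive only:", receive_only(counters))
```

Capturing the same counters shortly after a reboot, while the port still works, would give a baseline to compare against.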
Thanks for your reply.
Regards
07-24-2012 06:18 AM
Hello,
Can you please check whether there are any core dumps on the FI by running:
scope monitoring
scope sysdebug
show cores detail
The mgmt0 status being displayed as down is a known issue, so we cannot rely on it in this scenario.
Is the MAC address of the FI B mgmt interface learned on the upstream switch port?
Padma
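When searching the upstream switch's MAC address table, the notation may differ from the dotted form shown in `show int mgmt0` (547f.ee8b.c060). The small Python helper below is a sketch that converts between common notations. The function names are hypothetical, and which notation the upstream switch uses is an assumption to verify on that device.

```python
import re

def normalize_mac(mac):
    """Strip common separators and return 12 lowercase hex digits."""
    digits = re.sub(r"[.:\-]", "", mac.strip()).lower()
    if not re.fullmatch(r"[0-9a-f]{12}", digits):
        raise ValueError(f"not a valid MAC address: {mac!r}")
    return digits

def mac_formats(mac):
    """Return the same MAC address in dotted, colon and hyphen notation."""
    d = normalize_mac(mac)
    return {
        "dotted": ".".join(d[i:i + 4] for i in range(0, 12, 4)),
        "colon": ":".join(d[i:i + 2] for i in range(0, 12, 2)),
        "hyphen": "-".join(d[i:i + 2] for i in range(0, 12, 2)),
    }

# MAC address of mgmt0 on FI B, taken from the output earlier in the thread.
formats = mac_formats("547f.ee8b.c060")
for name, value in formats.items():
    print(f"{name}: {value}")
```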
07-24-2012 08:39 AM
Hi
No MAC address is learned on the upstream port.
porfic03-B /monitoring/sysdebug # scope monitoring
porfic03-B /monitoring # show cores detail
^
% Invalid Command at '^' marker
porfic03-B /monitoring # scope sysdebug
porfic03-B /monitoring/sysdebug # show cores detail
porfic03-B /monitoring/sysdebug #
Thanks for the reply.
Regards
07-25-2012 04:05 AM
Hi all
When I run:
- porfic03-B(nxos)# show hardware internal cpu-mac mgmt stats
I get a lot of errors on the mgmt port. I will swap the module and check over the next few days whether the problem is solved.
Regards