The following procedure is not supported by TAC, the Wireless
Networking Business Unit or any other entity at Cisco.
The issue seems to be related to a problem identified by VMware:
Linux based file systems become read-only
VMware has identified a problem where file systems may become read-only after encountering busy
I/O retries or SAN/iSCSI path failover errors. NCS users have also encountered this issue after the
storage has been removed uncleanly, usually as the result of a power outage.
If you can get to the shell prior to a reboot, you can try issuing the following command. If you do not have
access to a CLI because the VM is in a boot loop, proceed to the next section:
mount -o remount /
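If the plain remount does not clear the read-only state, you can request read-write explicitly and then verify the result. This is a minimal sketch that assumes the root filesystem is the one that went read-only; adjust the mount point if a different filesystem is affected:
mount -o remount,rw /
mount | grep ' on / '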
Recovering an NCS Virtual Machine stuck in a boot loop:
1. Download a live Linux distribution ISO locally to your machine. Users have reported success with Fedora.
2. In vSphere, left click the NCS VM -> Summary tab -> Storage -> right click the storage -> browse datastore -> click the icon to upload a file -> browse to the ISO.
3. Exit out of the datastore browser.
4. Right click the NCS VM -> edit settings -> CD/DVD drive -> enable 'Connected' and 'Connect at power on' -> select the radio button 'Datastore ISO File' -> browse to the ISO you just uploaded -> save.
5. Reload the VM and boot to the ISO.
6. Get to the CLI (the exact steps depend entirely on the Linux distribution).
7. Determine which device designation has been given to the volumes that need to be repaired. In the output below, this
particular Linux distribution has given the volumes the 'sdb' designation; this can vary. There will be three partitions (sdb1, sdb2, sdb3). An optional lsblk check is sketched after the fdisk output below:
# fdisk -l
Disk /dev/sdb: 209.7 GB, 209715200000 bytes
255 heads, 63 sectors/track, 25496 cylinders
Units = cylinders of 16065 * 512 = 8225280 bytes
   Device Boot      Start         End      Blocks   Id  System
/dev/sdb1   *           1          64      512000   83  Linux
Partition 1 does not end on cylinder boundary.
/dev/sdb2              64          77      102400   83  Linux
Partition 2 does not end on cylinder boundary.
/dev/sdb3              77       25497   204184576   8e  Linux LVM
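If the live distribution you booted ships the lsblk utility, it offers a more compact view of the same disks and partitions. This check is optional and assumes lsblk is present, which is not true of every live image; the fdisk -l output above is sufficient on its own:
# lsblk -o NAME,SIZE,TYPE,FSTYPE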
8. Scan for volume groups:
# lvm vgscan -v
Wiping cache of LVM-capable devices
Wiping internal VG cache
Reading all physical volumes. This may take a while...
Finding all volume groups
Finding volume group "smosvg"
Found volume group "smosvg" using metadata type lvm2
Archiving volume group "smosvg" metadata (seqno 12).
Creating volume group backup "/etc/lvm/backup/smosvg" (seqno 12).
9. Activate all volume groups:
# lvm vgchange -a y
11 logical volume(s) in volume group "smosvg" now active
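Optionally, before moving on to step 10, you can confirm that the volumes actually came up; lvscan should report each logical volume in the group as ACTIVE:
# lvm lvscan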
10. List logical volumes:
# lvm lvs -a
LV VG Attr LSize
altrootvol smosvg -wi-a- 96.00M
home smosvg -wi-a- 96.00M
localdiskvol smosvg -wi-a- 29.28G
optvol smosvg -wi-a- 123.22G
recvol smosvg -wi-a- 96.00M
rootvol smosvg -wi-a- 3.91G
storeddatavol smosvg -wi-a- 9.75G
swapvol smosvg -wi-a- 15.62G
tmpvol smosvg -wi-a- 1.94G
usrvol smosvg -wi-a- 6.81G
varvol smosvg -wi-a- 3.91G
11. Use fsck to check each of the partitions on the drive. It is OK if you receive errors for one of these; sdb3 is an LVM physical volume rather than an ext3 filesystem, so fsck cannot check it directly.
# fsck -t ext3 -y /dev/sdb1
# fsck -t ext3 -y /dev/sdb2
# fsck -t ext3 -y /dev/sdb3
12. Perform the same check for each of the logical volumes listed in step 10 (remember to use the -y flag in all cases). A shell loop that covers the whole group is sketched after these commands.
# fsck -t ext3 -y /dev/smosvg/altrootvol
# fsck -t ext3 -y /dev/smosvg/home
# fsck (repeat for all the others from step 10)
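If you would rather not type each volume by hand, a simple shell loop can run the same check across the whole group. This sketch assumes the volume group is named smosvg, as in the output above, and skips swapvol, which is swap space rather than an ext3 filesystem:
# for lv in /dev/smosvg/*; do [ "$lv" = /dev/smosvg/swapvol ] && continue; fsck -t ext3 -y "$lv"; done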
You are looking for similar output to this:
fsck 1.39 (29-May-2006)
e2fsck 1.39 (29-May-2006)
/home: clean, 34/128016 files, 33751/512000 blocks
13. Cleanly shut down the VM, remove the ISO configuration from the CD/DVD drive settings, and restart the server. It should now boot successfully.
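A clean power-off can usually be issued directly from the live environment's shell, assuming the live distribution provides the standard shutdown command:
# shutdown -h now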
If the VM still does not boot, you can use this information, with the volumes activated, to mount the partitions and volumes and recover the data:
a. mount /dev/sdb1 /media/boot (not the Linux LVM)
b. mount /dev/sdb2 /media/storedconfig (not the Linux LVM)
c. mount /dev/smosvg/localdiskvol /media/NCSbackup
d. Move the most current backup file from the localdiskvol volume, as well as the startup config from the
storedconfig volume, redeploy the VM using the OVA file, and then restore from the backup.
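Note that the /media mount points used in steps a through c may not already exist on the live distribution; they can be created first. This assumes the same directory names shown above:
# mkdir -p /media/boot /media/storedconfig /media/NCSbackup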