/opt/CSCOpx/bin/dbstop.pl

yjdabear
VIP Alumni

Is the following error of any concern, for LMS 2.2?

Can't open perl script "/opt/CSCOpx/bin/dbstop.pl": No such file or directory

I can't find this file anywhere.

19 Replies

Joe Clarke
Cisco Employee

Yes, this is a big problem. Without this script, stopping dmgtd can lead to database corruption. The fact that it is missing indicates someone accidentally deleted it, or you have a bad installation.

The box kernel-panicked while the LMS 2.2 DST patch was being installed. After the reboot, the web GUI is prompting to install JRE 1.4.2_12. Can I simply copy dbstop.pl from another box? Should I run reinitdb even if I don't see any sign of db corruption yet?

Copying the file may work, but there may be other file system problems. I wouldn't reinitialize anything at this point, but if you start to see other strange problems, especially with missing files, you might consider restoring application components from a file system backup, or doing a reinstallation.

I did a test run of dmgtd stop/start after fetching dbstop.pl from the test box. Upon dmgtd stop, the following processes never went down:

root 17069 1 0 14:52:46 ? 0:00 /opt/CSCOpx/objects/dmgt/dmgtd.sol

casuser 14747 1 0 14:23:31 ? 1:12 /product/CSCO/CSCOpx/objects/db/bin/dbsrv7 -x tcpip{HOST=localhost;DOBROADCAST=

casuser 14730 1 0 14:23:15 ? 0:05 /product/CSCO/CSCOpx/objects/db/bin/dbsrv7 -x tcpip{HOST=localhost;DOBROADCAST=

casuser 17070 17069 1 14:52:46 ? 0:00 /opt/CSCOpx/objects/ess/bin/rvrd -store /opt/CSCOpx/objects/ess/conf/rvrd.conf

Without running "dbstop.pl all", daemons.log keeps spitting out "A database server with that name has already started" upon "dmgtd start". Analyze ANI Server fails as well.

Will a reinstall of the patches below address the db issue?

cw-common-services-2.2-sp3-sol-K9.tar.gz

cwcs2.2-sol-CSCsc30604-K9.tar.gz

cwcs2.2-sol-CSCse127781-k9.tar

cwcs2.2-sol-CSCsg58592-K9.tar.gz

cwdst2_2_3_sol_k9.zip

Right. The databases are stopped using that script to prevent a dirty shutdown leading to corruption. If you kill off those processes, you risk damaging the data. So, you WILL need that script. Either you copy it from another good system, restore it from a file system backup, or do a reinstallation.

Obviously, I'm being very paranoid in this case. A reinstallation may not be needed. It may be just this file that is missing. However, my experience tells me to be safe rather than sorry.

Just to clarify, I did the "dmgtd stop" after copying dbstop.pl over. It seems "dmgtd stop" did not call upon "dbstop.pl stop" as if it's not aware of the latter's presence, although it no longer complained of missing "dbstop.pl" in daemons.log. Is there something I can do to make this awareness happen?

Here's the output of dbstop.pl all

/product/CSCO/CSCOpx/bin/dbstop.pl all

Stopping database engine aniEng

Stopping database engine rmeEng

There are still active connections. Stop anyway? (Y/N) Y

Stopping database engine cmfEng

Stopping database engine SqlCoreDBServer

Could not find DSN=mcp.

Prior to running dbstop.pl all, what is the output from the following command:

/usr/bin/ps -ef | /usr/bin/fgrep /product/CSCO/CSCOpx | /usr/bin/fgrep -v fgrep | /usr/bin/fgrep dbsrv | /usr/bin/awk '{ print $2; }'

It might also be useful to get the output of:

sh -x /etc/init.d/dmgtd stop

In a pinch I went ahead with /opt/CSCOpx/campus/bin/reinitdb.pl -restore. That seems to keep ANI Server up, whereas before Analyze ANI Server was erroring out/testing ok/erroring out again.

The first command returns three PIDs, the same ones before or after "dmgtd stop".

21967

22046

22032

Attached is the screen capture of sh -x dmgtd stop, after a server reboot.

Interesting. You may have found a bug, or something more is messed up on your system. Since you have installed into a non-default location, NMSROOT should point to that location. Instead, it's being found to be /opt/CSCOpx. This isn't necessarily bad (unless the /opt/CSCOpx symlink is missing), but since the processes are preceded with the real root, dmgtd cannot find them, and thus it doesn't know to stop them.
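One way to see the mismatch described above is to compare NMSROOT as configured with the physical path it resolves to, since the ps listing earlier in the thread shows the database engines under the real root. This is a minimal sketch, assuming a shell whose pwd supports -P; the helper names physical_root and roots_differ are hypothetical and not part of LMS.

```shell
# Hypothetical helper: print the physical directory behind a (possibly symlinked) root.
physical_root() {
    # Run in a subshell so the caller's working directory is left untouched.
    ( cd "$1" 2>/dev/null && pwd -P )
}

# Return 0 if the configured root differs from its physical path,
# 1 if they are identical, 2 if the root is missing or unreadable.
roots_differ() {
    real=$(physical_root "$1") || return 2
    [ "$real" != "$1" ]
}
```

For example, `roots_differ /opt/CSCOpx && physical_root /opt/CSCOpx` would print the real installation path only when it differs from the configured one.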

I assume this wasn't happening before the panic? Do you have a good file system backup of this server?

The symlink /opt/CSCOpx pointing to /product/CSCO/CSCOpx appears fine.

lrwxrwxrwx 1 casuser casusers 20 Jul 9 2004 CSCOpx -> /product/CSCO/CSCOpx

I'm sure the server can be restored if needed. What file systems would need to be restored?

/product/CSCO/CSCOpx (/opt/CSCOpx) and /var/Logs/CSCO/adm/CSCOpx (/var/adm/CSCOpx) are on SAN, while /etc/init.d/dmgtd is on local disk. /var/adm/CSCOpx (local) was a symlink to /var/Logs/CSCO/adm/CSCOpx (SAN), but now LMS is writing to it as a real directory, which probably started when the 108993-61 OS patch called for booting into single-user mode. However, I'm still trying to wrap my mind around how dmgtd could have started in single-user mode, found no /var/Logs/CSCO/adm/CSCOpx, and therefore overwritten the symlink /var/adm/CSCOpx with a real directory.

I suspect that sequence may also have something to do with the following, which just trickled in. Apparently there were some errors when installing cw-common-services-2.2-sp3-sol-K9 after 108993-61 got installed:

Hmmm, now I wonder if the whole db mess arises because some of the files the db expects are in /var/Logs/CSCO/adm/CSCOpx/files, which it can no longer find, while the most recent ones are in /var/adm/CSCOpx/files.

I was more interested in differences between /opt/CSCOpx/objects/dmgt/dmgtd.conf from the backup and from the active system. Also, what are the differences in the /var/sadm/pkg/CSCOmd/pkginfo in the backup and on the active system?

The fact that SP3 failed to apply in the CSCOmd package could be very bad. What does pkgchk CSCOmd report? The database problem right now doesn't have to do with logs, but with the fact that the dbsrv7 processes are not being stopped when dmgtd is brought down. Since they continue to run, dmgtd cannot spawn new engines.

diff pkginfo-backup pkginfo

< LD_LIBRARY_PATH=/product/CSCO/CSCOpx/objects/db/lib:/product/CSCO/CSCOpx/lib:/opt/CSCOpx/MDC/lib:/opt/CSCOpx/campus/lib:/opt/CSCOpx/MDC/lib:/opt/CSCOpx/MDC/lib

---

prod box (not behaving)

> LD_LIBRARY_PATH=/product/CSCO/CSCOpx/objects/db/lib:/product/CSCO/CSCOpx/lib:/opt/CSCOpx/MDC/lib:/opt/CSCOpx/campus/lib:/opt/CSCOpx/MDC/lib

diff dmgtd.conf-backup dmgtd.conf

23c23

< CampusOGSServer y CmfDbMonitor,ESS /product/CSCO/CSCOpx/bin/cwjava -cp:a /product/CSCO/CSCOpx/MDC/tomcat/webapps/campus/WEB-INF/classes:/product/CSCO/CSCOpx/MDC/tomcat/webapps/campus/WEB-INF/lib/ctm.jar:/product/CSCO/CSCOpx/MDC/tomcat/webapps/campus/WEB-INF/lib/ogs-server1.0.jar:/product/CSCO/CSCOpx/MDC/tomcat/webapps/campus/WEB-INF/lib/ogs-sharedasa1.0.jar:/product/CSCO/CSCOpx/MDC/tomcat/webapps/campus/WEB-INF/lib/ogs-sqlasa1.0.jar:/product/CSCO/CSCOpx/MDC/tomcat/webapps/campus/WEB-INF/lib/ogs-util1.0.jar:/product/CSCO/CSCOpx/MDC/tomcat/webapps/campus/WEB-INF/lib/ogs-client1.0.jar:/product/CSCO/CSCOpx/MDC/tomcat/webapps/campus/WEB-INF/lib/ogs-cmasa1.0.jar:/product/CSCO/CSCOpx/MDC/tomcat/webapps/campus/WEB-INF/lib/ogs-cmclient1.0.jar:/product/CSCO/CSCOpx/MDC/tomcat/lib/apps/xerces.jar:/product/CSCO/CSCOpx/MDC/tomcat/lib/apps/log4j.jar:/product/CSCO/CSCOpx/MDC/tomcat/webapps/campus/WEB-INF/lib/jconnect5.2.jar com.cisco.nm.xms.ogs.server.OGSServer

---

> CampusOGSServer y CmfDbMonitor,ESS /opt/CSCOpx/bin/cwjava -cp:a /opt/CSCOpx/MDC/tomcat/webapps/campus/WEB-INF/classes:/opt/CSCOpx/MDC/tomcat/webapps/campus/WEB-INF/lib/ctm.jar:/opt/CSCOpx/MDC/tomcat/webapps/campus/WEB-INF/lib/ogs-server1.0.jar:/opt/CSCOpx/MDC/tomcat/webapps/campus/WEB-INF/lib/ogs-sharedasa1.0.jar:/opt/CSCOpx/MDC/tomcat/webapps/campus/WEB-INF/lib/ogs-sqlasa1.0.jar:/opt/CSCOpx/MDC/tomcat/webapps/campus/WEB-INF/lib/ogs-util1.0.jar:/opt/CSCOpx/MDC/tomcat/webapps/campus/WEB-INF/lib/ogs-client1.0.jar:/opt/CSCOpx/MDC/tomcat/webapps/campus/WEB-INF/lib/ogs-cmasa1.0.jar:/opt/CSCOpx/MDC/tomcat/webapps/campus/WEB-INF/lib/ogs-cmclient1.0.jar:/opt/CSCOpx/MDC/tomcat/lib/apps/xerces.jar:/opt/CSCOpx/MDC/tomcat/lib/apps/log4j.jar:/opt/CSCOpx/MDC/tomcat/webapps/campus/WEB-INF/lib/jconnect5.2.jar com.cisco.nm.xms.ogs.server.OGSServer

34d33

< ANIDbEngine y - /product/CSCO/CSCOpx/objects/db/bin/dbsrv7 -x tcpip{HOST=localhost;DOBROADCAST=NO;ServerPort=43443} -q -s local0 -m -ti 0 -gm 100 -gc 5 -c 8M -ht -gss 9900 -n aniEng /opt/CSCOpx/databases/ani/ani.db -n aniDb

42a42

> ANIDbEngine y - /product/CSCO/CSCOpx/objects/db/bin/dbsrv7 -x tcpip{HOST=localhost;DOBROADCAST=NO;ServerPort=43443} -q -s local0 -m -ti 0 -gm 100 -gc 5 -c 8M -ht -gss 9900 -n aniEng /opt/CSCOpx/databases/ani/ani.db -n aniDb

lms-backup# pkgchk CSCOmd

ERROR: /tmp/.SQLAnywhere

permissions <0770> expected <0777> actual

ERROR: /var/adm/CSCOpx/log

permissions <0750> expected <0755> actual

group name expected <(null)> actual

ERROR: /var/adm/CSCOpx

file type expected actual

lms# pkgchk CSCOmd

ERROR: /product/CSCO/CSCOpx/etc/install.cshrc

file size <268> expected <232> actual

file cksum <22005> expected <18913> actual

ERROR: /product/CSCO/CSCOpx/etc/install.profile

file size <209> expected <182> actual

file cksum <17013> expected <14694> actual

ERROR: /tmp/.SQLAnywhere

permissions <0770> expected <0777> actual

ERROR: /var/adm/CSCOpx

group name expected actual

owner name expected actual

What I mean is this: since it was mentioned before (by Nadim?) that it's not a good idea to delete files directly from /var/adm/CSCOpx/files, the fact that /var/Logs/CSCO/adm/CSCOpx/files is currently not symbolically linked with /var/adm/CSCOpx/files might be an issue, since the db can't find these files.

You should fix the pkgchk errors on /product/CSCO/CSCOpx/etc/install.profile and install.cshrc (this will most likely mean restoring good versions of these files). As for the symbolic link, this will cause problems, but not THIS problem. Consequently, we do not support symbolic links in this fashion. It will generally work, but patch installations may overwrite the symbolic link with an actual directory.
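Because a patch installation may replace the symbolic link with a real directory, it can be worth checking the link before and after patching. This is a minimal sketch using the standard test operators; the function name link_status is hypothetical.

```shell
# Hypothetical helper: classify a path as symlink, directory, other, or missing.
link_status() {
    if [ -h "$1" ]; then
        # Checked first, because -d follows symlinks and would also be true.
        echo symlink
    elif [ -d "$1" ]; then
        echo directory
    elif [ -e "$1" ]; then
        echo other
    else
        echo missing
    fi
}
```

Running `link_status /var/adm/CSCOpx` and getting `directory` where `symlink` is expected would suggest the link has been overwritten.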

It does seem like you are hitting a bug in dmgtd. This bug has been fixed in LMS 2.6. For now, you could change your dmgtd script on lines 202, 221, and 227 so that the fgrep for NMSROOT is actually:

/usr/bin/egrep '('\$NMSROOT'|'/product/CSCO/CSCOpx')'

Or just be sure to run dbstop.pl all as well as a kill -TERM on all casuser processes after shutting down dmgtd.
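The idea behind that change, matching the database engines under either the NMSROOT path or the real installation path, can be tried outside the dmgtd script. This is a minimal sketch and not the actual dmgtd code: the function name find_db_pids is hypothetical, and it reads ps -ef style lines on standard input so it can be run against saved output. On Solaris, nawk may be needed in place of awk for the -v option.

```shell
# Hypothetical helper: print PIDs of dbsrv processes found under either root.
# $1 = NMSROOT as configured, $2 = real installation root.
find_db_pids() {
    awk -v a="$1" -v b="$2" '
        # Literal substring match on either root, restricted to dbsrv engines.
        # Lines mentioning awk are skipped so this filter does not match itself.
        (index($0, a) || index($0, b)) && index($0, "dbsrv") && !index($0, "awk") { print $2 }
    '
}
```

A possible invocation is `/usr/bin/ps -ef | find_db_pids /opt/CSCOpx /product/CSCO/CSCOpx`. Passing only the symlinked root for both arguments should return nothing for engines listed under the real root, which is the behavior described earlier in the thread.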

Hello Joe,

Could you please tell me which application DSN=mcp belongs to? I don't think I've ever seen this DSN on an LMS installation.

from a previous post in this thread:

####

Here's the output of dbstop.pl all

/product/CSCO/CSCOpx/bin/dbstop.pl all

Stopping database engine aniEng

Stopping database engine rmeEng

There are still active connections. Stop anyway? (Y/N) Y

Stopping database engine cmfEng

Stopping database engine SqlCoreDBServer

Could not find DSN=mcp.

#####

Also, do you know whether a symlink on /var/adm/CSCOpx will be supported in the future (in which version, or from what date)?

I also know of some installations with this problem.

Regards,

Martin