I've been having some problems with the Linux version of WCS. Sometimes, and I still don't know why, the database server stops. To activate it again I need to restart the service using the StartWCS script. Is anyone having similar problems? The service can work for two weeks or a month... then it just stops.
I am having the same problem. WCS 22.214.171.124 on RedHat 5.x 32-bit and a fast machine. Everything works fine for quite some time, until the DB stops.
This happens in the middle of the night, right when some background tasks are scheduled. I have now spaced them a few minutes apart to see which one causes this. Maybe this will shed some light on the matter.
But nice to see that I am not the only one with that problem...
A few questions:
Is SELinux enabled?
Anything in /var/log/messages?
The fact that it happens in the middle of the night makes me suspect a cronjob that is kicking off which may cause a conflict.
Check the following dirs & files:
under root login: crontab -l
If I were to venture a guess, it dies @ 4am
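A rough sketch of that check, since "crontab -l" only shows the per-user crontab and not the system-wide jobs. The paths below are the standard cron locations on a Red Hat style system, not necessarily the exact list meant above, and the optional root prefix argument is a hypothetical convenience so the function can be tried against a scratch directory. On a stock RHEL 5 install, /etc/crontab typically runs /etc/cron.daily shortly after 4am, which would fit the 4am guess.

```shell
#!/bin/sh
# List the system-wide cron locations that exist on this machine.
# "crontab -l" does not show any of these.
# $1 is an optional root prefix (hypothetical, for testing); leave it
# empty on a real system.
list_cron_sources() {
    root="${1:-}"
    for path in /etc/crontab /etc/cron.d /etc/cron.hourly /etc/cron.daily \
                /etc/cron.weekly /etc/cron.monthly /var/spool/cron; do
        if [ -e "$root$path" ]; then
            echo "$path"
        fi
    done
}
```

Running list_cron_sources with no argument and then reading /etc/crontab and the scripts in each listed directory should show what is scheduled around the time the database stops.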
I'm using RedHat Linux. The server stops randomly. I never realized it was during the night. I only found out because I stopped getting the email alerts.
Issuing the command "crontab -l" results in "no crontab for root".
I have SELinux enabled and it's actually blocking write access to a file:
I checked this file and it's an empty file.
No other problem found so far.
Perhaps it is empty because selinux does not allow it to be written to.
You may want to alter your SELinux policies, or disable it and use other means of securing your server.
I don't have the expertise to do this. I wouldn't disable SELinux, but I would add a policy to allow the write... I don't even know if the file needs to be written, since everything works just fine for several days, even with lots of data arriving at the WCS...
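For anyone in the same position, here is a minimal sketch for pulling the SELinux denials out of the audit log before deciding on a policy change. It assumes auditd is running and logging to /var/log/audit/audit.log (without auditd the denials usually land in /var/log/messages instead); the function name is made up for illustration.

```shell
#!/bin/sh
# Print SELinux AVC denial records from an audit log.
# Defaults to the usual auditd log path; pass another file to override.
show_denials() {
    logfile="${1:-/var/log/audit/audit.log}"
    # AVC records look like: "type=AVC msg=audit(...): avc:  denied  { write } ..."
    grep 'avc: *denied' "$logfile"
}
```

If the denials turn out to be something WCS legitimately needs, the usual route is to pipe them into audit2allow, for example "show_denials | audit2allow -M wcslocal" followed by "semodule -i wcslocal.pp". The module name wcslocal is only a placeholder, and the generated rules are worth reading before loading them.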
Tiago, let's run a database check to fix any corrupted tables in your WCS database.
That should help you.
Here is how:
2. /opt/WCSx.x/bin/DBAdmin checkschema
OK, I did as you posted. Apparently everything went well:
"[root@XXXXX ~]# /opt/WCS126.96.36.199/bin/StopWCS
Stopping Health Monitor...
Waiting for shutdown to complete.
/opt/WCS188.8.131.52/bin/Health Monitor successfully shutdown.
WCS is stopped.
The database server is stopped.
Apache server is stopped.
WCS successfully shutdown.
[root@XXXXX ~]# /opt/WCS184.108.40.206/bin/DBAdmin checkschema
This may modify your database. Do you wish to continue? (y/n) y
Starting database server ...
Database server is running.
Updating schema. This may take a while.
Loading schema. This may take a minute.
Schema is loaded.
Shutting down database server ...
Database server successfully shutdown.
[root@XXXXX ~]# /opt/WCS220.127.116.11/bin/StartWCS
WCS started successfully."
I'll have to wait to see if this solved the issues...
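The stop, check, start sequence from the transcript above could be wrapped in a small function if it needs repeating. This is only a sketch: the install directory is passed in as an argument because the thread only shows it as /opt/WCSx.x, and it assumes each script exits non-zero on failure, which I have not verified.

```shell
#!/bin/sh
# Run the WCS schema check: stop WCS, run DBAdmin checkschema, start WCS.
# $1 is the WCS install directory (shown in this thread only as /opt/WCSx.x).
# Assumption: each script returns a non-zero exit status on failure.
wcs_check_schema() {
    wcs_home="${1:?usage: wcs_check_schema /opt/WCSx.x}"
    "$wcs_home/bin/StopWCS" || return 1
    # DBAdmin asks for confirmation interactively (y/n) before continuing.
    "$wcs_home/bin/DBAdmin" checkschema || return 1
    "$wcs_home/bin/StartWCS"
}
```

Stopping on the first failure keeps WCS from being restarted on top of a schema check that did not finish.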
I have SELinux disabled, nothing in /var/log/messages, and no cronjob at the time it dies, only the WCS background tasks.
I have now tried the DBAdmin command. Maybe this will help; we'll know in a few weeks :)
Well, it's been a week since I issued the DBAdmin command, and so far it hasn't stopped. I'll wait another week to be sure it solved the problem.
My database stopped working again... It happened last night.
I got this output from the WCSStatus script.
"Health Monitor is running.
WCS is initialized, but is not serving web pages.
The Server seems to be unavailable. The possible reasons are
1. The server is not running yet.
2. The server is hanging.
3 The server has crashed.
If the server has crashed or is hanging
Please report the issue and follow the instructions to restart it.
1. Run StopWCS to stop all WCS processes.
2. Run StartWCS to start the WCS server.
3. If the server still fails to start, or you still get this
message, then reboot the machine.
Database server is stopped
Apache server is running"
So I think the DBAdmin didn't solve the problem. Any next steps?
Thanks for your help.
Pull the logs from WCS. Set the WCS logging to the highest level and try to see if something triggers within WCS.
Question to Cisco: are there any debug flags that can be set on the CLI and/or called from the init script?
I still think this is an underlying config issue with the OS. The Solid DB may not like it if another backup job tries to back up open files.
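To line up the stop time with whatever else runs on the box (cron, backups, WCS background tasks), something like the following could record when the database goes down. It is a sketch that keys on the "Database server is stopped" line from the WCSStatus output quoted earlier; the function names and the log file path are made up, and I am assuming WCSStatus lives in the same bin directory as StartWCS.

```shell
#!/bin/sh
# Read WCSStatus output on stdin; succeed if it reports the database as stopped.
db_is_stopped() {
    grep -q 'Database server is stopped'
}

# Read WCSStatus output on stdin and append a timestamped line to the
# log file given as $1 when the database is reported as stopped.
log_if_stopped() {
    logfile="$1"
    if db_is_stopped; then
        echo "$(date '+%Y-%m-%d %H:%M:%S') database server is stopped" >> "$logfile"
    fi
}
```

Run every few minutes, for example as "/opt/WCSx.x/bin/WCSStatus | log_if_stopped /var/tmp/wcs-db-watch.log", the first entry in the log gives the stop time to within the polling interval.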