WCS 6.0.132.0 Database

Unanswered Question
Oct 13th, 2009

Hello,

I've been having some problems with the Linux version of WCS. Sometimes, and I still don't know why, the database server stops. To bring it back I need to restart the service using the StartWCS script. Is anyone having similar problems? The service can run for two weeks or a month... then it just stops.

Best regards

ericgarnel Tue, 10/13/2009 - 04:29

It may have something to do with the underlying OS that you are running WCS on.

Are you running Linux or Windows?

skronawithleitner Tue, 10/13/2009 - 05:04

Hi,

I am having the same problem: WCS 6.0.132.0 on Red Hat 5.x, 32-bit, on a fast machine. Everything works fine for quite some time, until the DB stops.

This happens in the middle of the night, right when some background tasks are scheduled. I have now staggered them by a few minutes to see which one causes it. Maybe this will shed some light on the matter.

But nice to see that I am not the only one with that problem...

ericgarnel Tue, 10/13/2009 - 05:12

A few questions:

Is selinux enabled?

Anything in /var/log/messages?

The fact that it happens in the middle of the night makes me suspect a cronjob that is kicking off which may cause a conflict.

Check the following dirs and files:

/etc/cron.daily

/etc/cron.weekly

/etc/cron.monthly

/etc/crontab

under root login: crontab -l

If I were to venture a guess, it dies @ 4am
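
The checks listed above can be gathered into one pass. This is a minimal sketch, not a Cisco-provided tool: the function name list_cron_sources is made up for illustration, and it should be run as root so that crontab -l reports root's table. On stock Red Hat installs of that era, /etc/crontab typically starts the cron.daily jobs shortly after 4 am, so the times in that file are worth comparing against the time the database stops.

```shell
#!/bin/sh
# Print every cron source named in the post so scheduled jobs can be
# compared against the time the database server stops.
# list_cron_sources is a hypothetical helper name.
list_cron_sources() {
    for dir in /etc/cron.daily /etc/cron.weekly /etc/cron.monthly; do
        echo "== $dir =="
        ls -l "$dir" 2>/dev/null || echo "(not present)"
    done

    echo "== /etc/crontab =="
    cat /etc/crontab 2>/dev/null || echo "(not present)"

    echo "== crontab -l (current user) =="
    crontab -l 2>/dev/null || echo "(no crontab for this user)"
}

list_cron_sources
```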

tiago.molinos Tue, 10/13/2009 - 06:06

I'm using RedHat Linux. The server stops randomly. I never realized it was during the night. I only found out because I stopped getting the email alerts.

Issuing the command "crontab -l" results in "no crontab for root".

I have SELinux enabled and it's actually blocking write access to a file:

/opt/WCS6.0.132.0/webnms/ifconfig.txt

I checked this file and it's an empty file.

No other problem found so far.

ericgarnel Tue, 10/13/2009 - 06:19

Perhaps it is empty because selinux does not allow it to be written to.

You may want to alter your SELinux policies, or disable SELinux and use other means of securing your server.
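
Before changing any policy, it may help to see exactly which denials are logged for that file. The sketch below assumes auditd is running and writes to /var/log/audit/audit.log (without auditd, denials usually end up in /var/log/messages instead). The function name avc_denials_for and the AUDIT_LOG variable are made up for illustration.

```shell
#!/bin/sh
# Filter SELinux AVC denials that mention a given file name.
# Reads audit log text on stdin; the file name is the first argument.
# avc_denials_for is a hypothetical helper name.
avc_denials_for() {
    grep 'avc:' | grep 'denied' | grep -F "name=\"$1\""
}

# Assumed log location; reading it normally requires root.
AUDIT_LOG="${AUDIT_LOG:-/var/log/audit/audit.log}"

if [ -r "$AUDIT_LOG" ]; then
    avc_denials_for ifconfig.txt < "$AUDIT_LOG" || echo "no matching denials found"
fi
```

If the denials turn out to matter, the audit2allow tool can turn logged denials into a local policy module, which is a narrower change than disabling SELinux altogether.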

tiago.molinos Tue, 10/13/2009 - 10:16

I don't have the expertise to do this. I wouldn't disable SELinux, but I would add a policy to allow the write... I don't even know if the file needs to be written, since everything works just fine for several days, even with lots of data arriving at the WCS...

Best regards,

Lucien Avramov Tue, 10/13/2009 - 09:09

Tiago, let's run a database check to fix any corrupted tables in your WCS database.

That should help you.

Here is how:

1. /opt/WCSX.X/bin/StopWCS

2. /opt/WCSX.X/bin/DBAdmin checkschema

3. /opt/WCSX.X/bin/StartWCS
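
The three steps above could be wrapped as follows. This is only a sketch: WCS_HOME, DRY_RUN, run and wcs_check_schema are names invented here, the install path must be adjusted to the installed version, and the script assumes each WCS command returns a non-zero exit status on failure, which this thread does not confirm. DBAdmin checkschema asks for a y/n confirmation, so it still needs an interactive terminal.

```shell
#!/bin/sh
# WCS_HOME is an assumed install path; adjust it to the installed version.
WCS_HOME="${WCS_HOME:-/opt/WCS6.0.132.0}"
# DRY_RUN=1 (the default in this sketch) only prints the commands.
DRY_RUN="${DRY_RUN:-1}"

# Print the command in dry-run mode, otherwise execute it.
run() {
    if [ "$DRY_RUN" = "1" ]; then
        echo "would run: $*"
    else
        "$@"
    fi
}

# Stop WCS, check the schema, start WCS again.
# Stops early if a step reports failure through its exit status.
wcs_check_schema() {
    run "$WCS_HOME/bin/StopWCS" || return 1
    run "$WCS_HOME/bin/DBAdmin" checkschema || return 1
    run "$WCS_HOME/bin/StartWCS"
}

wcs_check_schema
```

Run it once as is to review the commands, then as root with DRY_RUN=0 to execute them.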

tiago.molinos Tue, 10/13/2009 - 09:45

OK, I did as you posted. Apparently everything went well:

"[[email protected] ~]# /opt/WCS6.0.132.0/bin/StopWCS

Stopping WCS

Stopping Health Monitor...

Waiting for shutdown to complete.

/opt/WCS6.0.132.0/bin/Health Monitor successfully shutdown.

WCS is stopped.

The database server is stopped.

Apache server is stopped.

WCS successfully shutdown.

[[email protected] ~]# /opt/WCS6.0.132.0/bin/DBAdmin checkschema

This may modify your database. Do you wish to continue? (y/n) y

Starting database server ...

Database server is running.

Updating schema. This may take a while.

Loading schema. This may take a minute.

Schema is loaded.

Shutting down database server ...

Database server successfully shutdown.

[[email protected] ~]# /opt/WCS6.0.132.0/bin/StartWCS

Starting WCS

WCS started successfully."

I'll have to wait to see if this solved the issues...

Thanks

skronawithleitner Tue, 10/13/2009 - 21:03

I have SELinux disabled, nothing in /var/log/messages, and no cron job at the time it dies, only the WCS background tasks.

I have now tried the DBAdmin command. Maybe this will help; we'll know in a few weeks :)

tiago.molinos Tue, 10/20/2009 - 02:50

Well, it's been a week since I ran the DBAdmin command, and so far it hasn't stopped. I'll wait another week to be sure it solved the problem.

Thank you.

tiago.molinos Fri, 10/23/2009 - 02:59

Hello,

My database stopped working again... It happened last night.

I got this output from the WCSStatus script.

"Health Monitor is running.

WCS is initialized, but is not serving web pages.

*********************************

The Server seems to be unavailable. The possible reasons are

1. The server is not running yet.

2. The server is hanging.

3 The server has crashed.

If the server has crashed or is hanging

Please report the issue and follow the instructions to restart it.

1. Run StopWCS to stop all WCS processes.

2. Run StartWCS to start the WCS server.

3. If the server still fails to start, or you still get this

message, then reboot the machine.

*********************************

Database server is stopped

Apache server is running"

So I think the DBAdmin didn't solve the problem. Any next steps?

Thanks for your help.

Tiago
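
Since the status output quoted above contains a distinct "Database server is stopped" line, a scheduled check could flag the outage shortly after it happens instead of waiting for the email alerts to go quiet. This is a rough sketch: the WCSStatus path is an assumption based on the bin directory used elsewhere in this thread, and db_is_stopped and WCS_STATUS are names made up for illustration.

```shell
#!/bin/sh
# Succeeds (exit status 0) when the status text on stdin reports
# a stopped database server. db_is_stopped is a hypothetical helper name.
db_is_stopped() {
    grep -q 'Database server is stopped'
}

# Assumed path to the status script; adjust it to the installed version.
WCS_STATUS="${WCS_STATUS:-/opt/WCS6.0.132.0/bin/WCSStatus}"

if [ -x "$WCS_STATUS" ]; then
    if "$WCS_STATUS" 2>&1 | db_is_stopped; then
        # The timestamp narrows down when the database went away.
        echo "$(date '+%Y-%m-%d %H:%M:%S') WCS database server is stopped"
    fi
fi
```

Run from cron every few minutes, the printed line would be mailed to the crontab owner wherever cron mail is configured.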

ericgarnel Fri, 10/23/2009 - 04:11

Pull the logs from WCS. Set the WCS logging to the highest level and see whether something triggers within WCS.

Question to Cisco: are there any debug flags that can be set on the CLI and/or called from the init script?

I still think this is an underlying config issue with the OS. The Solid DB may not like it if another backup job tries to back up open files.

tiago.molinos Fri, 10/23/2009 - 06:35

I had the logs at trace level. I found something that could lead someone to a conclusion.

The logs started to look bad at this timestamp: 10/23/09 01:00:03.731

It can be found in the attached file wcs-6-0.log. I've included other two files for a more complete followup.

Tiago
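
For anyone reading the same kind of log, a small helper can cut it down to the part that starts at the suspicious timestamp. This is a sketch that assumes each log line carries its timestamp in the format quoted above; log_from is a name invented here.

```shell
#!/bin/sh
# Print a file from the first line containing the given text to the end.
# log_from is a hypothetical helper name.
# Usage: log_from "10/23/09 01:00:0" wcs-6-0.log
log_from() {
    awk -v start="$1" 'found || index($0, start) { found = 1; print }' "$2"
}

# Show the first 200 lines from 01:00:0x onward, if the log is present.
if [ -r wcs-6-0.log ]; then
    log_from "10/23/09 01:00:0" wcs-6-0.log | head -n 200
fi
```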

Lucien Avramov Tue, 10/27/2009 - 15:42

Your database crashed.

Do you have other things running on your server?

On Windows, antivirus software scanning the WCS folder can cause this; on Linux I have not seen it.

You can re-run DBAdmin checkschema, but I suggest you open a TAC case at this point. This needs to be investigated in more detail, and you can upload your DB to TAC so they can look at it.

tiago.molinos Mon, 01/25/2010 - 09:02

Hello,

I've been absent for a while and now I can see the problem I had remains. I noticed there was a new version, 6.0.170.0, so I upgraded the WCS.

Anyway it didn't solve my issues, so I'll just open a TAC ticket.

Tiago

skronawithleitner Thu, 02/18/2010 - 23:48

Have you heard back from TAC yet?

My DB crashed again... It would be nice to see this resolved...

tiago.molinos Fri, 02/19/2010 - 01:47

Hello,

I have opened a TAC ticket. They asked me to increase the logging level and to send them the logs when it happens again. Since I did that, the database hasn't crashed. I'm waiting for the next failure. Curiously, I don't think it has ever run this long without crashing before...

Best regards,

Tiago

sschmidt Fri, 02/19/2010 - 05:54

Hey all,

I just wanted to let you know that the DB issue has been seen before and is actively being worked on in TAC and beyond. The difficulty is capturing the information when it happens in order to determine the root cause. Please keep those debugs turned up, with at least the Object Manager, General, Async and Database modules checked. What is needed is the transition from an "awake" state to the "crashed" state, so capturing the logs during that window would be ideal.
