RME 4.0.4: Can't purge config archive

anywebsbb
Level 1

Hi,

The customer has LMS 2.5.1 SP4 running on two Solaris 9 servers, in a master/slave setup for DCR. The master server can no longer purge the config archive. The purge job fails after 16 hours, without giving any details.

On the slave server, the archive purge job runs properly.

The /var directory is nearly full, and CiscoWorks is not able to run all of its daily jobs properly. Many of them fail. A Solaris engineer found that CiscoWorks creates a lot of small files on the Solaris server, which uses up inodes. It looks like RME is not able to delete the old config files.

Has anybody else had the same experience?

Thanks

HR

11 Replies

Joe Clarke
Cisco Employee

Config Purge can take a long time. Sixteen hours is not uncommon for large networks, but it should make some progress. The best way to troubleshoot this problem is to enable ConfigJob debugging under RME > Admin > System Preferences > Loglevel Settings, then run a new purge job. Under /var/adm/CSCOpx/files/rme/jobs/ArchivePurge, there will be a directory that corresponds to the purge job ID. In there, there will be numbered directories starting from 1. Each of those directories represents one instance of the job. Inside will be a log file. Scanning that log file should reveal why the configs are not being successfully purged.
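As a rough sketch of that last step, the following shell functions pick the highest-numbered instance directory for a job and search its files for failure-looking lines. The job ID, the search pattern, and the helper names are illustrative assumptions, not anything RME defines, and the sketch assumes a grep that supports -E (on Solaris 9 that would mean /usr/xpg4/bin/grep rather than /usr/bin/grep).

```shell
# Sketch only. JOB_ROOT is the path mentioned above; everything else
# (function names, search pattern) is a hypothetical illustration.
JOB_ROOT=${JOB_ROOT:-/var/adm/CSCOpx/files/rme/jobs/ArchivePurge}

# Print the highest-numbered instance directory for the given job ID.
latest_instance() {
    n=`ls "$JOB_ROOT/$1" 2>/dev/null | grep '^[0-9][0-9]*$' | sort -n | sed -n '$p'`
    [ -n "$n" ] && echo "$JOB_ROOT/$1/$n"
}

# Search every file in that instance directory for lines that look
# like failures. The pattern is a guess at what is worth reading first.
scan_instance() {
    inst=`latest_instance "$1"`
    if [ -z "$inst" ]; then
        echo "no instance directory found for job $1" >&2
        return 1
    fi
    grep -i -n -E 'error|exception|fail' "$inst"/*
}

# Example (1001 is a made-up job ID):
#   scan_instance 1001
```

The pattern is deliberately broad, so with debugging enabled it is probably still worth reading the log around each hit rather than relying on the matches alone.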

Hi Joe

Thanks for your investigation.

I did as you said and sent the log to the TAC.

The case is SR605610155, if you are interested.

I would appreciate it.

Kind regards

Hans

I do not see the requested logs attached to this SR.

Hi Joe

I put the file on ftp-sj.cisco.com in the incoming directory. The file name is log_SR605610155_2.tar.

Regards

Hans

This tar archive is corrupt.

Hi Joe

I put a new file on the server ftp-sj.cisco.com.

The file is under /incoming and is named archivepurgejob_SR605610155.tar.

I checked this file beforehand. It is a tarred directory. I was able to see the content with the Unix command strings.

Regards

Hans

Sorry, this archive is also corrupt. I cannot simply use strings on it to get what I need. I need to be able to properly untar it. When you upload this file to the FTP server, make sure you are using a BINARY transfer. You might also try compressing the tar file with zip or bzip2.
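One way to catch this before uploading is to confirm that the archive can actually be listed end to end, which strings cannot tell you. The sketch below is a minimal check under assumptions: check_archive is a hypothetical helper name, and it only handles plain tar and gzip-compressed tar files.

```shell
# Sketch only: return 0 if the archive can be listed from start to end,
# non-zero otherwise. The function name is a hypothetical illustration.
check_archive() {
    case "$1" in
        *.tar.gz|*.tgz)
            # Test the gzip layer first, then list the tar inside it.
            gzip -t "$1" 2>/dev/null &&
                gzip -dc "$1" | tar tf - >/dev/null 2>&1
            ;;
        *.tar)
            tar tf "$1" >/dev/null 2>&1
            ;;
        *)
            echo "unrecognized archive type: $1" >&2
            return 2
            ;;
    esac
}

# Example:
#   check_archive archive.tar.gz && echo "archive lists cleanly"
#
# In the command-line ftp client, switch to binary mode before the
# transfer:
#   ftp> binary
#   ftp> put archive.tar.gz
```

Listing a plain tar file mostly validates its headers, so compressing it first gives a stronger check, because gzip -t verifies a checksum over the whole content. Running the same check on a copy downloaded back from the server would show whether the transfer itself damaged the file.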

Sorry for the trouble.

I put a file on the server ftp-sj.cisco.com.

It is in the directory /incoming and is named archivepurgeSR605610155.tar.gz.

I compressed the file with gzip and transferred it in binary mode.

Hope it works.

Regards

Hans

I was able to extract this file, but unfortunately, the error messages do not give me enough of a clue to know exactly what is wrong. At first glance, it seems your RME database is out of sync with the file system in terms of the config archive. Since this is Solaris, a two-minute truss of the running job may provide more clues. Once the job is running, use /usr/ucb/ps -auwwx and grep for the job ID. Then, when you have the PID, run:

truss -a -f -vall -rall -wall -o /tmp/purge.truss -p <PID>

Kill the truss after about two minutes, then compress and post the resulting purge.truss.
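The steps above could be scripted roughly as follows. This is a sketch under assumptions: pid_for_job and trace_for are hypothetical helper names, the PID is taken from the second column of the /usr/ucb/ps output, and the job ID is assumed to appear in the process command line. A short numeric job ID can also match unrelated lines, so check the PID by eye before attaching.

```shell
# Sketch only: helper names are hypothetical illustrations.

# Read ps output on stdin and print the PID (second column) of the
# first line that mentions the job ID, ignoring the grep itself.
pid_for_job() {
    grep "$1" | grep -v grep | awk '{ print $2 }' | sed -n '1p'
}

# Attach truss to a PID for a fixed time (default 120 seconds),
# then stop truss and compress the output file.
trace_for() {
    pid=$1
    secs=${2:-120}
    truss -a -f -vall -rall -wall -o /tmp/purge.truss -p "$pid" &
    truss_pid=$!
    sleep "$secs"
    kill "$truss_pid"            # default signal, as in "kill the truss"
    wait "$truss_pid" 2>/dev/null
    gzip -f /tmp/purge.truss
}

# Example (1234 is a made-up job ID):
#   pid=`/usr/ucb/ps -auwwx | pid_for_job 1234`
#   [ -n "$pid" ] && trace_for "$pid"
```

Running the truss in the background with a fixed sleep keeps the trace close to the two minutes asked for, which also keeps purge.truss small enough to upload.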

Hi

In the meantime, the TAC engineer told me to clear the devices from the DCR and reimport them. After that and a reboot, the purge job is working again. Anyway, I put the file on the server again: ftp-sj.cisco.com/incoming/purge.truss.gz. Maybe you can see something.

Thank you very much.

Best regards

Hans

If you've removed and re-added the devices, anything obtained from the truss will no longer be relevant. Hopefully with regular purging from the start, you won't see the problem again.
