Backups in a cluster environment?

Jason Meyer
Level 1

What do the rest of you do for backups in clustered environments? You can back up the cluster-level configuration very easily, but if you have a failure you need a machine-level backup. Do you take the time to remove the machines from the cluster to get a good machine-level backup?

Does anyone know what the best practice is for this, or what is your procedure?

Long live the NATION!

6 Replies

Andrew Wurster
Level 1

excellent questions.

i think the goal of a "cluster" concept is to have a no-backups-necessary kind of virtual configuration floating around in the ether. of course if you are making lots of machine level changes, then it's not practical to forgo individual backups.

temporarily removing machines and rejoining them shouldn't be too difficult, although you would probably want to suspend listeners and take the machine offline for a few minutes:


1. temporarily remove the machine:
> clusterconfig removemachine

2. save configs via either email or local storage:
> saveconfig yes
> mailconfig yes

3. rejoin machine
> clusterconfig join [--port=xx]

important to note is that using 'mailconfig' or 'saveconfig' on a clustered box does give you everything you need, you just can't manually import it again as-is with 'loadconfig'. maybe it'd be easier to just do mailconfig from all appliances and parse out the machine level stuff in the XML config files???
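
The "parse out the machine level stuff" idea could be sketched roughly as below, in Python. This is only an illustration: the element names (`hostname`, `interfaces`) and the sample XML are made up, not taken from a real saved config, so check an actual file for the names the appliance uses before relying on anything like this. `extract_sections` is a hypothetical helper name.

```python
import xml.etree.ElementTree as ET


def extract_sections(xml_text, wanted_tags):
    """Return {tag: [serialized elements]} for every element whose tag is wanted."""
    root = ET.fromstring(xml_text)
    found = {tag: [] for tag in wanted_tags}
    for elem in root.iter():
        if elem.tag in found:
            # tostring() also emits trailing whitespace after the element; strip it.
            found[elem.tag].append(ET.tostring(elem, encoding="unicode").strip())
    return found


# Hypothetical sample, NOT the real config schema.
SAMPLE = """<config>
  <hostname>esa1.example.com</hostname>
  <policies><policy name="default"/></policies>
  <interfaces><interface><ip>192.0.2.10</ip></interface></interfaces>
</config>"""

if __name__ == "__main__":
    sections = extract_sections(SAMPLE, ["hostname", "interfaces"])
    for tag, blobs in sections.items():
        print(tag, len(blobs))
```

Run against the config mailed from each appliance, something like this would leave one small per-machine file to keep beside the cluster-level backup. It does not make the result loadable with 'loadconfig'.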

i'd like to hear any other suggestions on how to pull this stuff out though! i'm sure there are some good ones!

andrew

martinc8306
Level 1

I like to use an expect script, run daily from crontab, to mail the config to me:

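#!/usr/bin/expect -f
# Usage: ironbackup.exp <ESA IP> <CLI command> <command argument>
set ipaddr [lindex $argv 0]
set scriptname [lindex $argv 1]
set arg1 [lindex $argv 2]
#set timeout -1
# Replace (SSH PORT ON ESA) with the SSH port of the appliance
spawn ssh -p (SSH PORT ON ESA) admin@$ipaddr $scriptname $arg1
match_max 100000
# send blank line (\r)
#send -- "\r"
# Wait until the remote command finishes and the session closes
expect eof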



0 1 * * * /usr/bin/expect /root/scripts/ironbackup.exp x.x.x.x mailconfig (your email address)

Once I have this in mail format my Entourage rules send the XML file to a network storage location based on my inbound rules for subject line.
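
Where mail-client rules are not an option, the same filing step could be done by a small script on the receiving side. Below is a minimal Python sketch, assuming the mailed config arrives as an `.xml` attachment (how mailconfig packages the file is not confirmed in this thread). `save_xml_attachments` and the destination directory are hypothetical names.

```python
import tempfile
from email import message_from_bytes, policy
from email.message import EmailMessage
from pathlib import Path


def save_xml_attachments(raw_message: bytes, dest_dir: Path) -> list:
    """Write every .xml attachment of one raw mail message into dest_dir."""
    msg = message_from_bytes(raw_message, policy=policy.default)
    dest_dir.mkdir(parents=True, exist_ok=True)
    saved = []
    for part in msg.iter_attachments():
        name = part.get_filename() or ""
        if name.lower().endswith(".xml"):
            target = dest_dir / Path(name).name  # drop any path components
            target.write_bytes(part.get_payload(decode=True))
            saved.append(target)
    return saved


if __name__ == "__main__":
    # Stand-in message; a real one would come from a mailbox or a delivery pipe.
    demo = EmailMessage()
    demo["Subject"] = "config backup"
    demo.set_content("Configuration attached.")
    demo.add_attachment(b"<config/>", maintype="application",
                        subtype="xml", filename="esa1.xml")
    out = save_xml_attachments(demo.as_bytes(), Path(tempfile.mkdtemp()))
    print([p.name for p in out])
```

Pointing `dest_dir` at the network storage location would give roughly the same result as the subject-line rules.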

steven_geerts
Level 1

Most of the time, disaster recovery is needed because of human error. The way IronPort handles cluster configurations does not account for that at all. IronPort's clustering is a perfect solution for host outages, but it offers no remedy against a (stupid) human mistake.

At the moment we take a daily copy of the configuration with a Unix shell script that triggers the IronPort to mail the configuration.

Another improvement I have planned is based on syslog-ng functionality.
We are forwarding our IronPort logs to syslog-ng, and that brings in some interesting features.
One idea I have in mind is to (ab)use syslog-ng to trigger the "mail-me-the-config" shell script every time someone commits a change. This way you get a good repository of IronPort config files. It allows you to investigate when a certain change was made and gives you a (poor man's) way of rolling back.
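
The repository side of this idea could look something like the Python sketch below. It keeps a timestamped snapshot only when the config differs from the newest one, and it can show what changed between two snapshots. The function names, the `config-*.xml` naming scheme, and the sample XML are all assumptions for illustration. If the saved file embeds a timestamp or similar, every snapshot would look changed and the comparison would need to ignore that part.

```python
import difflib
import tempfile
from datetime import datetime, timezone
from pathlib import Path


def archive_if_changed(config_text: str, repo: Path, now=None):
    """Store config_text as a timestamped file unless it equals the newest snapshot.

    Returns the new path, or None when nothing changed.
    """
    repo.mkdir(parents=True, exist_ok=True)
    snapshots = sorted(repo.glob("config-*.xml"))  # names sort chronologically
    if snapshots and snapshots[-1].read_text(encoding="utf-8") == config_text:
        return None
    now = now or datetime.now(timezone.utc)
    target = repo / f"config-{now:%Y%m%dT%H%M%S}.xml"
    target.write_text(config_text, encoding="utf-8")
    return target


def diff_snapshots(old: Path, new: Path) -> str:
    """Return a unified diff between two stored snapshots."""
    a = old.read_text(encoding="utf-8").splitlines(keepends=True)
    b = new.read_text(encoding="utf-8").splitlines(keepends=True)
    return "".join(difflib.unified_diff(a, b, fromfile=old.name, tofile=new.name))


if __name__ == "__main__":
    repo = Path(tempfile.mkdtemp())
    first = archive_if_changed("<config>\n<a>1</a>\n</config>\n", repo,
                               now=datetime(2024, 1, 1, tzinfo=timezone.utc))
    second = archive_if_changed("<config>\n<a>2</a>\n</config>\n", repo,
                                now=datetime(2024, 1, 2, tzinfo=timezone.utc))
    print(diff_snapshots(first, second))
```

Rolling back would still be a manual job, but the diff at least shows when a setting changed and what it was before.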

As soon as I have finished it, I might publish it (if anyone is interested). Ideally, IronPort would implement a way of reverting to previous policy versions.

Steven

Eisenhafen
Level 1

We had this problem at a customer's site: the A/C failed for the whole datacenter, and through a long chain of events both IronPorts on site were rebooted several times, with power failures occurring during bootup. In the morning both cluster devices had a broken filesystem...

There is an open feature request with IronPort to allow cluster backup. Just open a ticket as a feature request to give the topic more weight.

Jason Meyer
Level 1

Just submitted the feature request, thanks Eisenhafen.

meyd45_ironport
Level 1

People have been complaining to IronPort for years about the lack of a working loadconfig in clusters.

In addition to the issue of recovering from a catastrophic cluster failure, it makes change control procedures messier to deal with, because one is unable to quickly revert to a known good saved configuration.

James
