[Users] openvz-diff-backups - survival guide - part three

tranxene50 tranxene50 at openvz-diff-backups.fr
Fri Sep 18 02:54:42 MSK 2020


Hello!

In part three, we will see what can (and what cannot) be done when 
restoring backups.

I suppose that you have followed part two and then have a "dummy" 
running container named "www.kick-me.com" with CTID 666.

# ---------------------

Before restoring this "evil" container, let's do a "cold" backup using 
"quiet" option, "turbo" option and "optimize" option (nocache):

In production, you will probably always want to use these options 
because mail reports will be more readable.

# CTID="666"

# openvz-diff-backups backup $CTID cold -q -t -o 16

=> Backup   - ctid:   666 - cold_plop - 2020-09-17_20-57-21 - time:    
1m32s - size:       8 MB ( 1%) - speed:       7 MB/s - restart:     13.4s

It took 1 minute and 32 seconds (remember, commands are executed on a 
very low-end server), its size is ridiculously small (because almost 
nothing changed inside) and - bad news - the CT had a downtime of 13 
seconds.

However, "cold" backups are the most portable choice and can/should/must 
be used when migrating from one system to another.

For instance, when switching containers from Proxmox 3 to OpenVZ Legacy 
(I have done it in the past) or from OpenVZ Legacy to OpenVZ 7.

The opposite is also possible (although it is quite unlikely): you can 
migrate from OpenVZ 7 to OpenVZ Legacy.

Warning: OVZDB has not been thoroughly tested with OpenVZ 7 so, if you 
find a bug or an issue, please report it and I will try to fix it ASAP.

Also, be aware that I could not try/verify/validate every possible 
scenario (ex: Proxmox 2/3 to Virtuozzo 7 or from Virtuozzo 6 to OpenVZ 
7): any feedback will be appreciated!

# ---------------------

Now, it is time to have a quick look at the available backups using the 
"list" mode with "log-level" set to 6 (notice) and the "quiet" option 
(used twice):

# openvz-diff-backups restore $CTID list -l 6 -q -q

  1 - *Latest* backup - mode: cold_plop - date: 2020-09-17_20-57-21 - 
task: backup - exit: success - keep: until it is deleted - host: xxx
  2 - Previous backup - mode: live_plop - date: 2020-09-16_02-20-15 - 
task: backup - exit: success - keep: until it is deleted - host: xxx
  3 - Previous backup - mode: live_plop - date: 2020-09-16_02-19-39 - 
task: backup - exit: success - keep: until it is deleted - host: xxx
  4 - Previous backup - mode: live_plop - date: 2020-09-16_01-00-30 - 
task: backup - exit: success - keep: until it is deleted - host: xxx
  5 - Previous backup - mode: live_plop - date: 2020-09-16_00-17-22 - 
task: backup - exit: success - keep: until it is deleted - host: xxx

You can restore backups in three ways:

a) using an exact date_time, ex: 2020-09-17_20-57-21 (it refers to the 
"cold" backup)

b) using "auto" mode, to restore the latest successful backup, no matter 
what backup mode was used (cold, hold or live)

c) using "[cold|hold|live]" mode, to restore the latest successful 
backup but, this time, only if backup mode matches

By default, "target" CTID is the same as "backup" CTID (666 in this 
example) so "restore" task will create/overwrite a container having CTID 
666.
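
For example, with the list above, the three forms should look like 
this (a quick sketch: the date_time or the mode goes right after the 
backup CTID, like in the other commands of this guide):

# openvz-diff-backups restore $CTID 2020-09-17_20-57-21 -q -t

# openvz-diff-backups restore $CTID auto -q -t

# openvz-diff-backups restore $CTID cold -q -t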

# ---------------------

Let's try to restore a "live" backup:

# openvz-diff-backups restore $CTID live -q -t

=> *Error* - Container is running (ctid: 666 - status: "VEID 666 exist 
mounted running") - aborting (use "vzctl stop 666" to stop it manually)

This is normal: OVZDB will *never* overwrite a running CT.

We need to stop it first and, for the sake of completeness, we will do 
it without unmounting it:

# vzctl stop $CTID --skip-umount

=> Container was stopped

Now, the CT is stopped but still mounted: you can check that files 
still exist in the "/vz/root/666/" directory.

This can happen - when creating backups - if a container is "hooked" to 
another one.

The most common scenario is the use of bindfs (FUSE) to mount a 
directory from CT A into CT B.
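
For example (CTID 700 and the paths are purely hypothetical), the 
mount would be done on the host with something like:

# bindfs /vz/root/666/var/www /vz/root/700/mnt/www-from-666

As long as this bindfs mount is active, CT 666 cannot be fully 
unmounted.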

When doing a "cold" backup of container A, OVZDB will complain because 
"vzctl stop" won't be able to unmount it (backup will succeed nevertheless)

# ---------------------

Let's try again to restore a "live" backup:

# openvz-diff-backups restore $CTID live -q -t

=> *Error* - Container is mounted (ctid: 666 - status: "VEID 666 exist 
mounted down") - aborting (use "vzctl umount 666" to unmount it manually)

Same result: OVZDB will *never* overwrite a mounted container.

So we unmount it and proceed again:

# vzctl umount $CTID

=> Container is unmounted

# openvz-diff-backups restore $CTID live -q -t

=> *Error* - Container already exists (status: "VEID 666 exist unmounted 
down") - aborting (use force option "-f" to overwrite data)

As said by OVZDB, to overwrite an existing container, you must use the 
"force" option.

# ---------------------

Finally, we can restore our "evil" CT using "force" option:

# openvz-diff-backups restore $CTID live -f -q -t

=> *Success: 16s* -    1 ctid - 2020-09-17 22:18:28 - restore 666 live 
-f -q -t

And voilà! Our container was "resuscitated" in 16 seconds.

Note: if for any reason it did not work on your 
"server/container/whatever", please send me a message. ;)

# ---------------------

Now, let's clone this "evil" container:

# openvz-diff-backups restore $CTID cold auto -t

=>  Info    | Creating container (ctid: 667 - diskspace: 
"10485760:10485760" - diskinodes: "655360:655360" - private: 
"/vz/private/667" - root: "/vz/root/667" - ostemplate "centos-6-x86_64" 
- layout: "ploop")
=> *Notice* - IP address already in use (ip: "10.6.6.6") - starting 
container will fail (please, do not panic)
=> *Success: 1m39s* - 1 ctid - 2020-09-17 22:36:22 - restore 666 cold 
auto -t

When using "auto" mode (instead of specifying a target CTID) OVZDB will 
use the next available number (666 => 667) or a random UUID if backup 
CTID had one.

Before starting this brand new CT, we need to update its IP address:

# vzctl set 667 --hostname www.kick-me2.com --ipdel all --ipadd 10.6.6.7 
--save

# iptables -t nat -I POSTROUTING -s 10.6.6.7 -j MASQUERADE

# vzctl start 667

Wait ten seconds and verify that it pings:

# vzctl exec 667 "ping -a -c 3 www.acme.com"

Then, launch a full update (consider 667 as your *staging* container 
and 666 as your *prod* container):

# vzctl exec 667 "yum update -y"

=> Complete!

# ---------------------

Now that our staging container (667) is up to date, we want to clone it 
into the prod container (666).

First, do a backup of CT 667 with "log-level" 9 (time) in order to see 
the time taken by each operation:

# openvz-diff-backups backup 667 auto -l 9 -o 16 -t

=> Backup - ctid:   667 - live_plop - 2020-09-17_23-12-36 - time:    
1m40s - size:     686 MB (84%) - speed:       8 MB/s - suspend:      
2.1s - memory:       8 MB (16%)

Second, stop CT 666 (because OVZDB will refuse to overwrite data if it 
is running or mounted):

# vzctl stop 666

=> Container was stopped

Third and final step, transfer CT 667 (staging) into CT 666 (prod):

# openvz-diff-backups restore 667 auto 666 -f -l 9 -o 16 -t

=> *Notice* - IP address already in use (ip: "10.6.6.7") - resuming 
container will fail (please, do not panic)
=> *Warning*- Unable to restore memory dump (path: 
"/var/tmp/openvz-diff-backups_restore_666_dump.delete-me" - error: "6") 
- please send a bug report if this happens again
=> *Notice* - Memory dump not found - switching to "hold_plop" (please, 
do not panic)

And in /var/log/vzctl.log:

=> vzctl : CT 666 : Unable to add ip 10.6.6.7: Address already in use
=> vzctl : CT 666 : (01.281371) Error (criu/cr-restore.c:2938): 
Restoring FAILED.

# ---------------------

At this point - and it was done on purpose - we are at a dead end:

1) memory of CT 666 could not be restored because CT 667 is still 
running and using 10.6.6.7 IP address
2) you cannot start CT 666 because it will try to resume with its memory 
(and will fail again)
3) let's pretend you cannot shut down CT 667 (to release the IP address) 
because [boss|client|devs|psycho] will kill you
4) and, most importantly, you cannot stop CT 666 (vzctl stop 666) 
because this *CENSORED* "Container is not running"

OpenVZ 7 is obviously right - CT 666 is not running - it is in 
"hibernation":

# vzlist 666

=> CTID NPROC STATUS  CFG_IP_ADDR  HOSTNAME
=> 666  suspended     10.6.6.7 www.kick-me2.com

Logically, you may think about destroying CT 666 (vzctl destroy 666), 
making a "cold" backup of 667 and then restoring it into CT 666.

But (again) let's say this is not an option:

- CT 666/667 are big fat containers with terabytes of data (it took 
you 3 days to copy all the files over the network)
- within this period, CT 667 was heavily modified and 
[boss|client|devs|psycho] do not give a *CENSORED* about your little 
"preoccupation"

# ---------------------

So, as a last option, you need to proceed with a "hold" restore.

*Warning*: this is the worst case scenario because either the memory 
was not saved ("hold" backup mode) or the memory was discarded while 
starting the container.

In both cases, you are accepting the risk of potential filesystem 
corruption (fsck will be your best friend) or database** corruption 
(MySQL/MariaDB/PostgreSQL/etc).

** if you are hosting CMS like 
Wordpress/Joomla/ModX/Prestashop/Magento/etc, data could be left in a 
"pending" state because most of these CMS do not use transactions.

On the contrary, if the container is a big Varnish cache using 128 GB 
of RAM, you really do not want to save/restore its memory (because it 
is "expendable").

Long story short:

a) delete the OpenVZ "Dump" directory of the prod container (666):

# rm -rv /vz/private/666/dump/Dump/

b) check that prod CT (666) is now in "stopped" state

# vzlist 666

=> CTID  NPROC STATUS  CFG_IP_ADDR  HOSTNAME
=> 666   stopped       10.6.6.6 www.kick-me2.com

c) change its IP address and hostname back:

# vzctl set 666 --hostname www.kick-me.com --ipdel all --ipadd 10.6.6.6 
--save

=> Saved parameters for Container 666

d) start the container and, if you are a believer (I am not: computers 
only do what they are told to), light a candle:

# vzctl start 666

=> Container start in progress...

OK, sounds good, but we want to verify:

# vzctl exec 666 "yum update -y"

=> No packages marked for update

# ---------------------

Part four will describe how to "preload" containers on spare/backup 
servers in order to improve recovery time.

Good night!

On 16/09/2020 at 02:46, tranxene50 wrote:

> Hello!
>
> This is part two of the "survival" guide that briefly describes how 
> openvz-diff-backups (OVZDB for short) works and what you can expect 
> from it on a daily basis.
>
> I repeat: English is not my native language. So, if you see something 
> weird, please quote the sentence and report it.
>
> # ---------------------
>
> Before digging into configuration parameters 
> (openvz-diff-backups.conf), let's have a look at the most used task 
> (ie. backup) and some useful options.
>
> # ---------------------
>
> First, create a "dummy" container (all examples below will rely on 
> this CTID):
>
> # CTID="666"
>
> # vzctl create $CTID
>
> # vzctl set $CTID --cpus 2 --ram 512M --swap 256M --hostname 
> www.kick-me.com --ipadd 10.6.6.6 --nameserver 9.9.9.9 --searchdomain "" 
> --save
>
> # iptables -t nat -I POSTROUTING -s 10.6.6.6 -j MASQUERADE
>
> # vzctl start $CTID
>
> Now, you should have a working container with network access (please 
> wait 10 seconds before it fully starts):
>
> # vzctl exec $CTID "ping -a -c 3 www.acme.com"
>
> If it pings 3 times, you are done (if not, wait and try again).
>
> # ---------------------
>
> Second, install OVZDB (this is not the latest release but this is on 
> purpose):
>
> # OVZDB_RELEASE="v1.0.1.11-stable"
>
> # cd /usr/local/sbin/
>
> # wget 
> "https://download.openvz-diff-backups.fr/releases/openvz-diff-backups_${OVZDB_RELEASE}.tar.gz"
>
> # tar xvzf openvz-diff-backups_${OVZDB_RELEASE}.tar.gz
>
> # mv openvz-diff-backups_${OVZDB_RELEASE} openvz-diff-backups_stable
>
> # ln -s openvz-diff-backups_stable/openvz-diff-backups 
> openvz-diff-backups
>
> # rm openvz-diff-backups_${OVZDB_RELEASE}.tar.gz
>
> # ---------------------
>
> After that, when simply typing "openvz-diff-backups", it should run 
> and complain about missing tools: most of the time it is "bc", "dig", 
> "bzip2" or "rsync".
>
> Debian: apt-get install openssh-client rsync bc bzip2 dnsutils
>
> CentOS: yum install openssh-clients rsync bc bzip2 bind-utils
>
> # ---
>
> If you can, and this is *highly recommended*, please install pbzip2 
> and nocache:
>
> pbzip2 will speed up OVZDB "live" backups (ie. compressing the memory 
> dump) and nocache will avoid unnecessarily filling the kernel page cache.
>
> Debian:
>
> # apt-get install nocache pbzip2
>
> CentOS:
>
> # cd /home
>
> # wget 
> https://download-ib01.fedoraproject.org/pub/epel/7/x86_64/Packages/p/pbzip2-1.1.12-1.el7.x86_64.rpm
>
> # yum install pbzip2-1.1.12-1.el7.x86_64.rpm
>
> # rm pbzip2-1.1.12-1.el7.x86_64.rpm
>
> # wget 
> https://ftp.nluug.nl/pub/os/Linux/distr/pclinuxos/pclinuxos/apt/pclinuxos/64bit/RPMS.x86_64/nocache-1.1-1pclos2019.x86_64.rpm
>
> # yum install nocache-1.1-1pclos2019.x86_64.rpm
>
> # rm nocache-1.1-1pclos2019.x86_64.rpm
>
> # ---------------------
>
> At this point, it is time to check if there are updates for OVZDB:
>
> # openvz-diff-backups update all check
>
> As expected, there is a new release so let's install it:
>
> # openvz-diff-backups update all install
>
> That is all: if it succeeds, you are good to go.
>
> # ---
>
> Once a day/week, you should use a cron job to run this command:
>
> # openvz-diff-backups update all auto -l 6 -q -q
>
> It will check for updates and automatically install them.
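>
> For instance, a crontab entry could look like this (the schedule is 
> just an example, adjust it to your needs):
>
> 30 4 * * * /usr/local/sbin/openvz-diff-backups update all auto -l 6 -q -q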
>
> Note: if you are using a very old version, run this command as many 
> times as necessary until you see: "your release is up to date. Yay!"
>
> # ---------------------
>
> To create a backup, you need to use the "backup" task but - because it 
> is the first time and because you are quite "anxious" - please 
> activate the "dry-run" option (-d):
>
> # CTID="666"
>
> # openvz-diff-backups backup $CTID auto -d
>
> Normally, OVZDB will complain about missing SSH keys:"SSH error: 
> connection failure (host: "127.0.0.1" - error: "255") - please check 
> SSH private/public keys"
>
> By default, backups are stored on "localhost" in directory 
> "/home/backup/openvz-diff-backups" so you need to have a full root SSH 
> access, even if it is on "localhost"
>
> # ---
>
> To solve this, add your public key to "/root/.ssh/authorized_keys":
>
> 1) (optional) create a public/private key pair (if you already have 
> private/public keys, skip this step)
>
> # ssh-keygen -t rsa -b 4096
>
> 2) (required) add your public key to "/root/.ssh/authorized_keys":
>
> # ssh-copy-id -p 22 root at 127.0.0.1
>
> That's all: you are now ready to create your first "fake" backup.
>
> # ---------------------
>
> # openvz-diff-backups backup $CTID auto -d
>
> Unfortunately, because there is a bug in "v1.0.1.12-stable", a dry-run 
> backup won't succeed:
>
> => *Error* - Unable to save memory dump (path: 
> "/vz/private/666/dump/openvz-diff-backups_backup_666_dump.delete-me.bz2" 
> - error: "3") - aborting
>
> However, this is the expected and correct behavior of OVZDB: if 
> anything goes wrong (or is unexpected), it cleanly stops and reports 
> the error.
>
> The bug was fixed yesterday in my private git repo but I have not yet 
> published a new release: I want to be sure that there are no side 
> effects, even if the patch contains less than 50 characters.
>
> # ---
>
> Ok, instead of using "auto" mode, let's try the "sync" mode:
>
> # openvz-diff-backups backup $CTID sync -d
>
> So far so good, it works but - as said by OVZDB - "nothing will happen 
> for real, I promise!".
>
> Note: before creating a backup, OVZDB will always sync the container's 
> data with the OVZDB "cache".
>
> Long story short: the "cache" is useful every time you want to backup 
> a container because it will speed up the task.
>
> # ---
>
> So let's sync the "cache" for real with the use of "pause" option (-p) 
> and "verbose" option (-v).
>
> "pause" option will wait 3 seconds between every step and "verbose" 
> option (used once) will show you modified files.
>
> # openvz-diff-backups backup $CTID sync -p -v
>
> As you can see, every file/path/other being copied/updated/deleted is 
> displayed and you have the time to read every step before it runs.
>
> # ---
>
> But, do not lie to yourself: you have noticed that it was slow... 
> (although this is the expected behavior)
>
> => Syncing  - ctid:   666 - sync_data - 2020-09-16_00-04-39 - time:    
> 1m13s - speed:     *10 MB/s*
>
> By default, OVZDB will always copy data at 100 Mbits/s.
>
> This was a (very bad) design mistake made 5 years ago when I was 
> struggling with Proxmox 3, simfs layout and LVM2 snapshots over very 
> slow HDD drives.
>
> At this time, Proxmox was using OpenVZ "Legacy" kernel before dropping 
> it in order to use LXC/LXD.
>
> I do not use LXC/LXD because it lacks some functionalities I need. 
> Nevertheless, this techno is very promising so I check their progress 
> once in a while.
>
> Back in the past: my goal was to be able to make backups without 
> stressing the host. It worked great but, nowadays, most of middle-end 
> dedicated servers have SSD/NVME.
>
> To correct that mistake, the "turbo" option (-t) was implemented: this 
> name is simply a (stupid) joke because it only allows OVZDB to run at 
> its normal speed.
>
> # ---
>
> Ok, let's run a backup in "auto" mode with "dry-run" option, "pause" 
> option, "turbo" option and "verbose" option.
>
> # openvz-diff-backups backup $CTID auto -d -p -t -v
>
> As you can see, you are waiting before each step and no files are 
> modified (this is normal: the OVZDB "cache" is up to date).
>
> # ---
>
> Finally, this is it: now we want a "real" backup (only the "turbo" 
> option is required to bypass all speed limitations).
>
> # openvz-diff-backups backup $CTID auto -t
>
> => Backup   - ctid:   666 - live_plop - 2020-09-16_00-17-22 - 
> time:      21s - size:     570 MB (83%) - speed:      31 MB/s - 
> suspend:      2.9s - memory:       6 MB (14%)
>
> The backup succeeded: it took 21 seconds to run, the backup size is 
> 83% of the total data of the CT, we got a "brute force" speed of 31 
> MB/s, the CT was suspended for almost 3 seconds and the memory dump 
> size is 14% of the total size of the CRIU dump.
>
> You may wonder why these "metrics" are so low, the reason is simple: 
> all examples/tests are done - on purpose - on very low-end hardware 
> (Atom D525 1.8 GHz) and an old 5400 rpm hard drive (Western Digital 
> WD10JFCX).
>
> https://ark.intel.com/content/www/us/en/ark/products/49490/intel-atom-processor-d525-1m-cache-1-80-ghz.html
>
> https://shop.westerndigital.com/products/internal-drives/wd-red-plus-sata-2-5-hdd#WD10JFCX
>
> However, here is some info in order to better understand the status line:
>
> 1) backup size of the first backup will always be "huge" because it 
> needs to save all files
> 2) speed indicates bandwidth speed needed in order to compete with 
> OVZDB (incremental/differential backup vs brute force copy)
> 3) suspend time, including memory dump extraction, is very dependent 
> on the apps running in the CT (CRIU tries to do its best but 
> sometimes, it is just badly slow: any report will be appreciated!)
>
> # ---
>
> Now, let's do a second backup using "log level" set to 6 (notice), 
> "quiet" option and "turbo" option.
>
> # openvz-diff-backups backup $CTID auto -l 6 -q -t
>
> => Backup   - ctid:   666 - live_plop - 2020-09-16_01-00-30 - 
> time:      20s - size:       8 MB ( 1%) - speed:      34 MB/s - 
> suspend:      2.0s - memory:       6 MB (14%)
>
> Because this is the second backup, it will now only store differences, 
> hence the backup size of 1% (8 MB) of the total CT data.
>
> "log level" parameter let you decide how much detail you want to see 
> (but for log files, level 9 (time) is always used in order to have a 
> full view of operations)
>
> # ---
>
> We have tried "sync" mode (to fill OVZDB "cache"), "auto" mode (it 
> selects the appropriate mode according CT status), but you can choose 
> more precisely.
>
> OVZDB provides three backup modes:
>
> - "cold": if the container is running, it will be stopped, 
> snapshotted, restarted and finally saved. This is most the most 
> portable choice because only data matters
>
> - "hold": if the container is running, it will save its data 
> (snapshot) but without saving its RAM. It can be useful if you have, 
> for instance, a big fat Varnish cache.
>
> - "live": if the container is running, it will save both data and 
> memory: this mode is very reliable if need to restore a CT on the 
> same/similar hardware.
>
> In short, "live" mode should be you preferred choice for every CT.
>
> for instance, MySQL/MariaDB/PostgreSQL will need their memory - when 
> restored - in order to avoid corruption or database repair.
>
> # ---
>
> In the beginning, we briefly saw that "nocache" command could help: 
> when doing a backup, files are copied but, most of the time, it is 
> useless to store them in kernel cache.
>
> In order to avoid that, you can use an "optimize" option, "-o 16" to 
> be more precise.
>
> It will detect and use the "nocache" command in order to preserve the 
> legit kernel page cache when creating a backup.
>
> Let's run a final backup with "log level" to "notice", "quiet" (used 
> twice), "turbo" and "optimize":
>
> # openvz-diff-backups backup $CTID auto -l 6 -q -q -t -o 16
>
> => Backup   - ctid:   666 - live_plop - 2020-09-16_02-20-15 - 
> time:      21s - size:       8 MB ( 1%) - speed:      31 MB/s - 
> suspend:      3.0s - memory:       6 MB (14%)
>
> # ---
>
> All options are displayed when running "openvz-diff-backups --help"
>
> I will try to enhance the inline "documentation" ASAP.
>
> # ---
>
> From now on, you are able to store your backups anywhere.
>
> You just need to adjust "MASTER_SSH_PATH" in the config file (copy 
> openvz-diff-backups.conf.sample to openvz-diff-backups.conf and modify 
> it)
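>
> For example, a minimal sketch of the line to adjust (adapt the host 
> and the path to your setup, as explained in part one):
>
> MASTER_SSH_PATH="root at backup.my-server.net:/home/backup/openvz-diff-backups"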
>
> In part three, we will see how to restore/clone/duplicate containers.
>
> If you have any question, feel free to ask: not sure I will be able to 
> answer it but I will do my best.
>
> Good night! :-)
>
> On 12/09/2020 at 02:20, tranxene50 wrote:
>
>> Hello!
>>
>> Here is the first part of a quick "survival" guide in order to start 
>> off on the right foot with openvz-diff-backups (OVZDB for short).
>>
>> Please, be aware that English is not my native language. So, if you 
>> see something weird, please quote the sentence and correct it.
>>
>> Equally, if something is not clear, quote and ask: I will try to 
>> answer as best as I can.
>>
>> # ---------------------
>>
>> Firstly, you need to be aware that OVZDB uses three 
>> "hosts/locations/storages" and "navigates" through them:
>>
>> # ---------------------
>>
>> - SOURCE : "host" where OVZDB is installed
>>
>> Most of the time, this is the server on which OpenVZ is running the 
>> containers you want to backup.
>>
>> But it can be any *nix system (with Bash/OpenSSH/rsync) in order to 
>> replicate (upload or download) backups between REMOTE and MASTER.
>>
>> Everything works over SSH as follows: SOURCE -> SSH key 1 -> MASTER -> 
>> SSH key 2 -> REMOTE
>>
>> # ---------------------
>>
>> - MASTER : *mandatory* "host" where backups are stored (copy A)
>>
>> Ideally, MASTER is a dedicated server/VPS/other because OVZDB relies 
>> on IOPS and, the more RAM you have to cache dentries and inodes, the 
>> faster OVZDB will be.
>>
>> However, by default, backups are stored on the same server 
>> (MASTER_SSH_PATH="root at localhost:/home/backup/openvz-diff-backups").
>>
>> This is useful if you want to test ASAP or if you have a secondary 
>> drive where backups can be stored (ex: sda for OpenVZ, sdb for backups).
>>
>> In this case, SOURCE will communicate with MASTER (both being on the 
>> same server) using SSH through localhost: as soon as "ssh -p 22 
>> root at 127.0.0.1" gives you a shell without asking for a password, you 
>> are done.
>>
>> On the contrary, if MASTER is a distant host (recommended), you need 
>> to adjust MASTER_SSH_PATH parameter.
>>
>> Ex: 
>> MASTER_SSH_PATH="root at backup.my-server.net:/any-absolute-path-you-want" 
>> (trailing slash is not needed and "backup.my-server.net" will always 
>> be resolved to its IPV4 or IPV6 address)
>>
>> If you need to use a SSH port different from 22, please see 
>> MASTER_SSH_OPTIONS parameter in config file (openvz-diff-backups.conf).
>>
>> # ---------------------
>>
>> - REMOTE : *optional* host where backups are replicated (copy B)
>>
>> In order to secure backups, you may want to replicate them, if 
>> possible, in a different geographical location.
>>
>> MASTER/REMOTE "hosts" can be anything as long as a *nix system is 
>> present with a shell, OpenSSH (other SSH servers have not been tested 
>> yet) and, the most important, rsync.
>>
>> This can be a big fat dedicated server, a large VPS, a medium 
>> instance in the Cloud, a NAS at home or even - if someone is willing 
>> to test (I didn't because mine is too old) - an Android smartphone...
>>
>> SOURCE "host" always requires a Bash shell but MASTER/REMOTE "hosts" 
>> only need a shell (sh/dash/ash/etc) and OVZDB can also deal with 
>> "Busybox" instead of using standard Unix tools.
>>
>> In short, OVZDB does not care and will run as long as the "host" can 
>> handle it (which can take hours/days on very low-end hardware).
>>
>> # ---------------------
>>
>> From SOURCE, you can launch any task (more details in part 2):
>>
>> - backup task will "convert" containers present on SOURCE into 
>> backups on MASTER
>>
>> - restore task will "convert" backups present on MASTER into 
>> containers on SOURCE
>>
>> - upload task will replicate backups present on MASTER to REMOTE (push)
>>
>> - download task will replicate backups present on REMOTE to MASTER 
>> (pull)
>>
>> - delete task will remove backups present on MASTER and/or REMOTE 
>> (you choose)
>>
>> - destroy task will wipe "cache" present on MASTER and/or REMOTE 
>> (more in part 2 because it is not intuitive)
>>
>> - update task will check and/or update OVZDB to its latest version 
>> ("one-click" upgrade)
>>
>> # ---------------------
>>
>> Before going into details about each command, here are some use case 
>> scenarios about backups:
>>
>> (to be shorter, I will not talk about migrating IP addresses, 
>> adjusting firewalls, replacing a dedicated server and other things)
>>
>> - 1 server
>>
>> Your only choice is to store backups on the same server, if possible 
>> on a secondary hard drive or, better, on an external hard drive.
>>
>> Long story short, if you are a believer, pray! ^^
>>
>> - 2 servers (one for prod, one for backup)
>>
>> If you have enough space, store backups on prod server (copy A) and 
>> replicate them (push) on backup server (copy B).
>>
>> (or, better, on backup server, replicate backups using "pull" mode: 
>> this is safer because it would require that both servers are 
>> compromised to lose all your backups)
>>
>> Then, use OVZDB on backup server and restore every container on a 
>> daily basis to speed things up in the event of an emergency "switch".
>>
>> This way, if the prod server crashes, you can restore containers on 
>> the backup server and, because most files are already synced, you 
>> will be online again quickly.
>>
>> - 2 servers (both prod)
>>
>> If you have enough space (bis), store backups - of containers of each 
>> prod server - locally (copy A) and replicate them on the other prod 
>> server (copy B).
>>
>> (since both servers have root access to each other, using "pull" or 
>> "push" mode is equivalent: if one server is compromised, you are screwed).
>>
>> Or, you can create OpenVZ containers on both servers to restrict 
>> access to backups.
>>
>> This requires that prod A has no SSH keys to access prod B, and vice 
>> versa.
>>
>> Prod A will use container A to store its backups (same for prod B 
>> with its container B) and then, you can use "pull" mode.
>>
>> Prod B will download backups from "restricted" container A and Prod A 
>> will download backups from "restricted" container B (this way, if a 
>> server is compromised, you still have backups).
>>
>> *WARNING: never, ever, store OVZDB backups in a container using Ploop 
>> layout: it will get insanely fat and "ploop balloon discard" won't 
>> help much*
>>
>> Instead, use bindfs to mount a directory from the host into the 
>> container.
>>
>> Then, again on a regular basis, restore containers from prod A on 
>> prod B and - you have guessed - restore containers from prod B on 
>> prod A.
>>
>> If one server crashes, containers from the other server will be 
>> almost ready to start: just one final restore and you are "done".
>>
>> - 3 servers (one for prod, one for backup, one for rescue)
>>
>> Ideal but may be costly.
>>
>> Store backups on backup server (in a different data center) and 
>> replicate them on rescue server (in a different geographical location).
>>
>> If the backup server can handle the load of the prod server, restore 
>> containers on it regularly in order to be ready to "switch" to it 
>> ASAP if prod crashes.
>>
>> Rescue server can use "pull" mode to replicate backups (download): 
>> this way, if prod and backup servers are compromised, you still have 
>> backups.
>>
>> - 3 servers (two for prod, one for backup)
>>
>> If possible, store backups - of containers of each prod server - 
>> locally (copy A) and replicate them on the other server (copy B).
>>
>> Then use backup server to "pull" backups (if prod A and B are 
>> compromised, you still have backups).
>>
>> Or, but this is highly dangerous, store all backups from prod servers 
>> on backup server (push).
>>
>> - 3 servers (all prod)
>>
>> See "2 servers (both prod)"
>>
>> - 4 servers (3 for prod, one for backup)
>>
>> See "3 servers (two for prod, one for backup)"
>>
>> - more than 4 servers
>>
>> At this point, I assume that you are using XFS or a distributed 
>> filesystem (Ceph?).
>>
>> - more than 10 servers
>>
>> You know the drill, the only thing to know is that OVZDB needs IOPS 
>> and RAM in order for the kernel to cache inodes/dentries.
>>
>> And, if you have 10 Gbits network cards, consider syncing and 
>> de-duplicating "root.hdd" using brute force. ^^
>>
>> # ---------------------
>>
>> This is all for today!
>>
>> Tomorrow, or later, I will explain each task: 
>> backup/restore/delete/upload/download in more details.
>>
-- 
tranxene50
tranxene50 at openvz-diff-backups.fr


