Chapter 12. Duplicate an Operating System

12.1. Three ways to duplicate a computer over the network

The goal of duplication is to deploy a system to computers over the network easily, however many computers are involved. The methods described here use parallel technology from clustering products (ka tools, dolly and dolly+). They can duplicate SCSI or IDE hard drives and other storage devices, and support multiple filesystems (reiserfs, ext2, ext3, xfs, fat...). In this documentation, the golden node is the node we want to clone.

12.1.1. KA method

With the KA method you can quickly duplicate a computer using a desc file which describes your partition table. The KA method only duplicates the data on the partitions, so if you have an 80 GB HDD with only 10 GB of data on it, KA duplicates only those 10 GB, not the whole disk. The KA method can clone various Linux filesystems (ext2, ext3, reiserfs, xfs, jfs), and is able to regenerate the modprobe.conf file, so you can duplicate to computers which don't have the same hardware.

Drawbacks:

the KA method does not support software RAID
you can only clone Linux filesystems
you must use the lilo bootloader
if a computer has trouble, the duplication process stops
you can only duplicate to the same kind of HDD (IDE or SCSI)

12.1.2. Dolly method

Dolly is used to clone the installation of one machine to (possibly many) other machines. It can distribute image files (even gzipped), partitions or whole hard disk drives to other partitions or hard disk drives. As it forms a "virtual TCP ring" to distribute data, it works best with fast switched networks. For example, you can duplicate a software RAID 0 setup to other computers.

Drawbacks:

when you duplicate an HDD, Dolly duplicates the whole HDD, not only the data on it. This is also why Dolly can duplicate any filesystem (FAT, LVM, NTFS...)
duplicating a whole HDD can take a while
there is no simple way to generate the configuration, so you have to learn how to write it (it's quite easy)
like KA, if a computer has trouble, the duplication process stops
you can NOT clone an OS which is currently in use
if you clone an HDD or a partition, the target must have the same HDD size or the same partition size
you can only duplicate one file/HDD/device per transfer

12.1.3. Dolly+ method

Dolly+ is based on the dolly program, but it includes more features.

Improvements in Dolly+:

speed-up by using multithreading (net->memory, memory->disk, memory->net threads)
multi-file transfer, so you can clone more than one partition
fail-safe mechanism (a node in trouble is bypassed)
separate server (dollyS), which runs on the host holding the original image file, and client (dollyC), which runs on the hosts where images are cloned

Drawbacks: the same as dolly, except for the behaviour when a computer has trouble.

12.2. How it works

12.2.1. 3 steps

The clone process works in three steps:

PXE boot to retrieve stage1: the computer boots in PXE mode and retrieves vmlinuz and an initrd. The computer is now in stage1 mode, and is able to get stage2 through various installation methods (nfs, ftp, http, dolly, ka). The network is up.
Get stage2: the computer gets stage2 with the method you have chosen. If you want to clone many computers, you should use the dolly or ka method to speed up the retrieval of stage2. Stage2 contains all the tools needed to recognize your hardware (the most important thing is to detect your HDD), and all the tools needed to finalise the clone process.
Duplication process: the computer auto-probes the modules needed to access the HDD. Now you can choose dolly, ka or dollyC to join the duplication ring.

To summarize, you can get stage2 from:

normal network: nfs, http
parallel mode: dolly, ka

You can duplicate a node in stage2 with:

ka
dolly
dolly+

12.2.2. Needed files

All needed files are on the IGGI cdrom:

install/stage2/rescue.clp : the stage2 file, with all the files needed to detect hardware, probe modules, and launch the third step of the duplication process. If you want to send stage2 via a parallel method (dolly or ka), this file is used on the golden node after a few modifications made by the setup_ka_deploy.pl script.
isolinux/alt0/vmlinuz : Linux kernel, needed in the /var/lib/tftpboot/X86PC/linux/images/ directory of the PXE server
isolinux/alt0/all.rdz : stage1 and all needed modules, must be in the same directory as vmlinuz

12.3. Step 1: PXE, TFTP, DHCPD services

To easily clone a computer node, we use PXE technology to boot a kernel and an initrd image which contains all the modules needed for network and storage media. Documentation about PXE can be found here: PXE doc. Please keep in mind that setting up such services can DISTURB your current network architecture.

We need a TFTP server to share files over the network (in practice, the kernel and the initrd), and a DHCPD server which supports PXE (various options in its configuration file).

12.3.1. PXE parameters on server

The Mandriva Linux installer supports various methods to install a computer. With the PXE configuration file you can specify which method you want to use to install your node, or add a specific option at the boot prompt. Edit your default PXE configuration file (/var/lib/tftpboot/X86PC/linux/pxelinux.cfg/default) to add your custom entry.

PROMPT 1
DEFAULT local
DISPLAY messages
TIMEOUT 50
F1 help.txt

label local
    LOCALBOOT 0

label kamethod
    KERNEL images/vmlinuz
    APPEND initrd=images/all.rdz ramdisk_size=64000 vga=788 \
            automatic=method:ka,interface:eth0,network:dhcp root=/dev/ram3 rw rescue kamethod

label dolly
    KERNEL images/vmlinuz
    APPEND initrd=images/all.rdz ramdisk_size=64000 vga=788 \
            automatic=method:dolly,dolly_timeout:100,interface:eth0,network:dhcp rescue dollymethod

label nfs
    KERNEL images/vmlinuz
    APPEND initrd=images/all.rdz ramdisk_size=64000 vga=788 \
            automatic=method:nfs,interface:eth0,network:dhcp,server:10.0.1.253,directory:/cs4 root=/dev/ram3 rescue rw

label http
    KERNEL images/vmlinuz
    APPEND initrd=images/all.rdz ramdisk_size=64000 vga=788 \
            automatic=method:http,interface:auto,network:dhcp,server:10.0.1.253,directory:/cs4 root=/dev/ram3 rescue rw

At the boot prompt you can now boot:

DEFAULT local : the default boot is local; change it to the name of another LABEL
local : boot locally
dolly : automatic mode, get stage2 through dolly. The timeout to get stage2 is set to 100 seconds; if it fails, dolly tries 3 times and then reboots. The computer uses the eth0 network interface. The words rescue dollymethod at the end of the line tell the computer to get the rescue.clp file and use the dolly replication method
kamethod : automatic mode, get stage2 through ka. The network interface is set to eth0 and configured automatically with DHCP, and the ka technology is used to launch the replication method
nfs : get stage2 through NFS
http : get stage2 through HTTP

KA method: at PXE boot you can add kaopt=ka_session_name to use a ka session name. This lets you assign computers to a ka session, and restrict access to a private KA session.

If you use the nfs or http method, put the rescue.clp file in the /SERVER/install/stage2/ directory, or the client nodes will be unable to find it.
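
The entries above differ mainly in the method used to fetch stage2 and in the trailing keyword that selects the replication method. As a rough sketch, a small Python helper can assemble such an entry. The function name pxe_entry and its parameters are hypothetical, introduced only for illustration; the kernel, initrd and option values are copied from the kamethod entry above.

```python
def pxe_entry(label, stage2_method, clone_keyword, interface="eth0"):
    """Return a pxelinux entry modelled on the kamethod entry of this chapter.

    stage2_method -> value of automatic=method:... (for example ka or dolly)
    clone_keyword -> word placed after "rescue" (for example kamethod)
    All names here are illustrative, not part of any IGGI tool.
    """
    automatic = (
        f"automatic=method:{stage2_method},interface:{interface},network:dhcp"
    )
    return "\n".join([
        f"label {label}",
        "    KERNEL images/vmlinuz",
        "    APPEND initrd=images/all.rdz ramdisk_size=64000 vga=788 \\",
        f"            {automatic} root=/dev/ram3 rw rescue {clone_keyword}",
    ])


if __name__ == "__main__":
    # Reproduces the shape of the kamethod entry shown above.
    print(pxe_entry("kamethod", "ka", "kamethod"))
```

Passing "dolly" as stage2_method and "kamethod" as clone_keyword would give an entry that fetches stage2 with dolly and then clones with ka.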

12.3.2. Configuration files

12.4. Various scripts

12.4.1. clone_script

Use this script to auto-configure your computer as a PXE, DHCPD and TFTP server. You should NOT use it on the IGGI server, because its PXE configuration is already in place. You can use it on another node if you want that node to be a self-contained PXE+tftp+dhcpd+clone server. clone_script help:

script to auto-configure PXE and DHCPD server
ie:
./clone_script -I -p 10.0.1.21 -w 10.0.1.253 -n 10.0.1 -s 10.0.1.253 -t 10.0.1.21

-I: install needed software
-p: IP address of the PXE server (should be this computer)
-w: IP address of the gateway
-n: NET base address for dhcpd conf (ie: 10.0.1)
-s: IP address of the DNS server
-t: IP address of the tftpserver (with vmlinuz and all.rdz)

ie:
./clone_script -c path/rescue.clp

You need mount right.
-c: prepare chroot with rescue.clp

12.4.2. ka-d-session.sh script

ka-d-session.sh is a quick script to deploy a chroot via the KA method. It can be useful if you want to send a specific chroot, such as a mini-distribution, into the memory of the nodes.

Usage:
DATA_PATH=/mnt/ka ./ka-d-session.sh -n nb_nodes

12.5. Step 2: parallel methods to get stage2

As explained before, you can use two parallel methods to get stage2 onto the client computers: KA and Dolly.

12.5.1. Get stage2 via KA method

First, get the rescue.clp file from the IGGI cdrom (/install/stage2/rescue.clp). It has to be converted to an ISO9660 file and loop-mounted on the /mnt/ka directory. You can do this manually or use clone_script, but you need mount rights:

clone_script -c rescue.clp

Now that stage2 is ready, we can send it to all nodes. On the PXE server, choose a PXE entry with an automatic=method:ka line, and boot all the nodes to be cloned. Log on to your golden node and use the ka-d-session.sh script. For example, to send stage2 to 4 nodes, just run:

DATA_PATH=/mnt/ka ./ka-d-session.sh -n 4

Now, you boot all your nodes. The replication process will start once all nodes are up and waiting on the ka screen.

Figure 12.1. KA stage1

If the nodes cannot reach the golden node running the KA server, the message "Can't reach a valid KA server" appears. Each node tries five times to reach the KA server, then reboots. As the node boots on kamethod, it retries until it finds the server. Nodes are now ready for step 3.

12.5.2. Get stage2 via Dolly

Dolly is a little different. With dolly you can send a partition or an image. Moreover, you need to write a configuration file which describes which computers you want to send stage2 to, and what kind of stage2 you use (an image or a partition).

12.5.2.1. Create a stage2.img

Get the rescue file from your IGGI cdrom (/install/stage2/rescue.clp). Convert rescue.clp to an ISO9660 file and loop-mount it on a directory. Then create a file called stage2.img, format it as ext2, loop-mount it too, and copy all data from the mounted rescue directory to the mounted stage2.img directory. clone_script will do it for you, but you need mount rights:

clone_script -c rescue.clp

12.5.3. dolly.cfg configuration file

Example of a dolly server configuration file:

infile stage2.img
outfile /dev/ram3
server servernode
firstclient 12.12.12.1
lastclient 12.12.12.3
clients 3
12.12.12.1
12.12.12.2
12.12.12.3
endconfig

infile stage2.img : input file on the server is stage2.img
outfile /dev/ram3 : output file on the clients
server servernode : specifies which node is the dolly server
firstclient 12.12.12.1 : which node is the first client
lastclient 12.12.12.3 : which node is the last one
clients 3 : how many nodes
12.12.12.1 ... 12.12.12.3 : names (here, IP addresses) of the clients
endconfig : needed, end of configuration file
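
Because the configuration follows a fixed line order, it can be generated from a list of clients instead of being typed by hand. The following Python sketch only reproduces the layout of the example above; the function name dolly_config is hypothetical, and the assumption that firstclient and lastclient are simply the first and last entries of the client list is taken from that example, not from the dolly documentation.

```python
def dolly_config(infile, outfile, server, clients):
    """Build a dolly server configuration in the layout shown above.

    clients is an ordered list of host names or IP addresses.
    """
    if not clients:
        raise ValueError("at least one client is required")
    lines = [
        f"infile {infile}",
        f"outfile {outfile}",
        f"server {server}",
        f"firstclient {clients[0]}",   # assumed: first entry of the list
        f"lastclient {clients[-1]}",   # assumed: last entry of the list
        f"clients {len(clients)}",
        *clients,
        "endconfig",
    ]
    return "\n".join(lines) + "\n"


if __name__ == "__main__":
    nodes = ["12.12.12.1", "12.12.12.2", "12.12.12.3"]
    print(dolly_config("stage2.img", "/dev/ram3", "servernode", nodes), end="")
```

Deriving the clients count from the list avoids a mismatch between the number announced and the names actually listed.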

On the PXE server, choose a PXE entry with an automatic=method:dolly line, and boot all the nodes to be cloned. Log on to your golden node, prepare your dolly configuration file, and launch:

dolly -s -v -f dolly.cfg

Now boot all your nodes in PXE mode. The replication process starts once all nodes are up and waiting on the dolly screen, and stage2 is then sent to all of them with the dolly method.

Figure 12.2. Dolly stage1

If the nodes cannot reach the golden node running the dolly server, the message "Can't reach a valid Dolly server" appears. Each node tries three times to reach the Dolly server, with a default timeout set to 120, then reboots. As the node boots on the dolly method, it retries until it finds the server. Nodes are now ready for step 3.

12.6. Step 3, the duplication process

12.6.1. Duplicate a golden node with KA

Log on to the node you want to duplicate. You need to create a file which describes your partition table. Run fdisk_to_desc as root to get a desc file. Your desc file should look like this one:

linux 3500
extended fill
logical swap 500
logical linux fill

This file describes your partition table, and the sample above can be considered a default for a recommended installation: a 3.5 GB "/" partition, a 500 MB swap partition, and "/var" filling the rest. Of course, you can adjust the sizes according to your system. Please refer to the man page for more information (man ka-d).

Set the default entry of your PXE server to a PXE entry which contains rescue kamethod, and boot all nodes. If you want to use ka to get stage2 onto the computers and ka to duplicate your golden node, choose a PXE entry like this one:

APPEND initrd=images/all.rdz ramdisk_size=64000 vga=788 \
          automatic=method:ka,interface:eth0,network:dhcp root=/dev/ram3 rw rescue kamethod

If you want to use dolly to get stage2 on computers, and ka to duplicate your golden node, choose a PXE entry like this one:

APPEND initrd=images/all.rdz ramdisk_size=64000 vga=788 \
          automatic=method:dolly,interface:eth0,network:dhcp root=/dev/ram3 rw rescue kamethod

Now all is ready, launch your ka server:

ka-d.sh -r lilo -n nb_nodes -p sda/hda desc -x /tmp

-r lilo : run lilo in the chroot after the ka deployment
-n nb_nodes : specify how many nodes to use
-p sda/hda desc : specify whether you want to duplicate SCSI or IDE storage, and the name of the HDD
-x /tmp : exclude the /tmp directory

See manpage of ka-d.sh for more help.

The duplication process clones your drives following the description you have made. Nodes rewrite their partition table, then format their filesystems (ReiserFS, XFS, ext2/3). Then the drive duplication begins. On a Fast Ethernet switch you can reach speeds of 10 MB/s.

At the end of the duplication process, each node chroots into its partitions and rebuilds its /boot/initrd.img and /etc/modprobe.conf. This step ensures that your node will reboot using its own SCSI drives, if any, and the right network card driver. Before rebooting, each node reinstalls lilo on the MBR. All your nodes are now ready, and are clones of the master node.

Don't forget to change the default PXE boot to local, so that nodes boot locally after replication.

12.6.1.1. Known bugs with desc file

fdisk_to_desc only works with MDK::Common (available in the Mandriva Linux distribution). If you have a large-capacity HDD, fdisk_to_desc can create a wrong desc file. To fix it, follow this procedure.

Here is the result of fdisk_to_desc:

swap 509
linux 5992
extended 71657
logical linux 52281
logical linux 19375
logical linux 78167
logical linux 156327

Just change it to:

swap 509
linux 5992
extended fill
logical linux 52281
logical linux 19375
logical linux 78167
logical linux fill
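
The manual fix above can be sketched in a few lines of Python. This is only an illustration of the edit shown in this section, under the assumption that the right correction is always to set the extended partition and the last logical partition to fill; the function name fix_desc is hypothetical and is not part of the ka tools.

```python
def fix_desc(desc_text):
    """Replace the size of the extended partition and of the last logical
    partition with "fill", mirroring the manual fix described above."""
    rows = [line.split() for line in desc_text.splitlines() if line.strip()]
    for row in rows:
        if row[0] == "extended":
            row[-1] = "fill"
    # The row lists are shared with rows, so editing one here edits rows too.
    logical_rows = [row for row in rows if row[0] == "logical"]
    if logical_rows:
        logical_rows[-1][-1] = "fill"
    return "\n".join(" ".join(row) for row in rows) + "\n"


if __name__ == "__main__":
    broken = """swap 509
linux 5992
extended 71657
logical linux 52281
logical linux 19375
logical linux 78167
logical linux 156327
"""
    print(fix_desc(broken), end="")
```

Always review the result by hand before launching a deployment, since a wrong desc file leads to a wrong partition table on every node.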

12.6.2. Duplicate a computer with dolly

More documentation about dolly can be found here: Dolly website.

Set the default entry of your PXE server to a PXE entry which contains rescue dollymethod, and boot all nodes. If you want to use ka to get stage2 onto the computers and dolly to duplicate your golden node, choose a PXE entry like this one:

APPEND initrd=images/all.rdz ramdisk_size=64000 vga=788 \
          automatic=method:ka,interface:eth0,network:dhcp root=/dev/ram3 rw rescue dollymethod

Typical dolly configuration file:

infile /dev/sda
outfile /dev/sda
server node1
firstclient node2
lastclient node5
clients 4
node2
node3
node4
node5
endconfig

infile /dev/sda : input file on the server is /dev/sda
outfile /dev/sda : output file on the clients
server node1 : specifies which node is the dolly server
firstclient node2 : which node is the first client
lastclient node5 : which node is the last one
clients 4 : how many nodes
node2 ... node5 : names of the clients
endconfig : needed, end of configuration file

Now just launch:

dolly -s -v -f dolly.cfg

Client computers are in dolly method, with a timeout of 300 seconds. At the end of the timeout, the node automatically reboots. If all is OK, the duplication process starts; at the end, the node reboots, so don't forget to set the PXE boot to localboot.

12.6.3. Duplicate a computer with dolly+

Dolly+ is based on dolly but has several improvements. You can find more of the original documentation at Dolly+.

Set the default entry of your PXE server to a PXE entry which contains rescue dollyCmethod, and boot all nodes. If you want to use dolly to get stage2 onto the computers and dolly+ to duplicate your golden node, choose a PXE entry like this one:

APPEND initrd=images/all.rdz ramdisk_size=64000 vga=788 \
          automatic=method:dolly,interface:eth0,network:dhcp root=/dev/ram3 rw rescue dollyCmethod

Now let's look at a typical configuration file. Be careful: Dolly+ doesn't support unneeded spaces or tabs, and each parameter must be on a new line.

iofiles 3
/dev/hda1 > /tmp/dev/hda1
/data/file.gz >> /data/file
boot.tar.Z >> /boot
server max5.paris
firstclient max6.paris
lastclient max8.paris
clients 3
max6.paris
max7.paris
max8.paris
endconfig

iofiles 3 : 3 images to transfer
/dev/hda1 > /tmp/dev/hda1 : input file on the server is /dev/hda1 and output file on the clients is /tmp/dev/hda1. '>' means dolly+ does not modify the image
/data/file.gz >> /data/file : input file is /data/file.gz and output file is /data/file. '>>' indicates that dolly+ should cook (decompress or unpack) the file according to its name. Currently '.tar.', '.gz', 'tar.gz', 'tar.Z', 'cpio', 'cpio.gz' and 'cpio.Z' are supported
boot.tar.Z >> /boot : input file is ./boot.tar.Z and output working directory is /boot. When the input name is a 'tar' or 'cpio' archive, the right argument of '>>' should be a directory name
server max5.paris : dolly+ does not care, but the line must exist
firstclient max6.paris : dolly+ does not care, but the line must exist
max6.paris ... max8.paris : names of the clients
endconfig : needed, end of configuration file
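
Since Dolly+ is strict about whitespace, it can help to check a configuration file before starting dollyS. The Python sketch below is one possible interpretation of the warning above: it treats empty lines, tabs, leading or trailing spaces and repeated spaces as problems. The exact rules Dolly+ applies are not documented here, so these checks are an assumption, and the function name whitespace_problems is hypothetical.

```python
def whitespace_problems(config_text):
    """Return (line_number, reason) pairs for lines with suspicious
    whitespace. The rules are assumptions, not the Dolly+ parser's rules."""
    problems = []
    for number, line in enumerate(config_text.splitlines(), start=1):
        if line == "":
            problems.append((number, "empty line"))
        elif "\t" in line:
            problems.append((number, "tab character"))
        elif line != line.strip():
            problems.append((number, "leading or trailing space"))
        elif "  " in line:
            problems.append((number, "repeated spaces"))
    return problems


if __name__ == "__main__":
    sample = "iofiles 1\n/dev/hda1 > /tmp/dev/hda1\nserver max5.paris \nendconfig\n"
    for number, reason in whitespace_problems(sample):
        print(f"line {number}: {reason}")
```

An empty result only means that none of these assumed problems were found; it does not guarantee that Dolly+ accepts the file.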

Now just launch:

dollyS -v -f dolly.cfg

Client computers are in dollyC method. If all is OK, the duplication process starts; at the end, the node reboots, so don't forget to set the PXE boot to localboot.

Typical node side output (thanks Erwan):

installing driver piix (for "Intel Corporation|I/O Controller Hub PATA")
Can't find piix.ko in archive
can't find module piix
        failed
Installing driver ata_piix (for "Intel Corporation|I/O Controller Hub SATA cc=IDE")
Installing driver i2c-i801 (for "Intel Corporation|I/O Controller Hub SMBus")
Installing driver tg3 (for "Broadcom Corp.|NetXtreme BCM5721 Gigabit Ethernet PCI Express")
Installing driver tg3 (for "Broadcom Corp.|NetXtreme BCM5721 Gigabit Ethernet PCI Express")
I am a dolly+ client
Trying to build ring... 14:13:07.
Accepting port(9998).....cannot find myhost name in RING packet, trying to continue...(Client.cpp)
 Server name is max5.paris, my name is max8.paris.(14:13:23)
Server(max5.paris) was selected for the next adjacent host.
RING packet recieved/sent.
HOST packet recieved/sent
---Packet contents printing  ---------------------
No of Bytes = 48
No.    flag   name
  0    1      'max5.paris'
  1    1      'max6.paris'
  2    1      'max7.paris'
  3    1      'max8.paris'
--------------------------------------------------
File packet recieved/sent.
---Packet contents printing  ---------------------
No of Bytes = 10
No.    flag   name
  0    0      '/dev/sda'
--------------------------------------------------
readnet() thread started
writenet() thread started. ALL are ready !
open before(14:13:23)-opening file pathname=/dev/sda flag=00
(14:13:23)after
writing to '/dev/sda'
exception is registrated in readnet()
start readnet process 1 files. 14:13:23
Next is server, writenet() exit.
file No.=0 processing...14:13:23

Typical server side output:

dollyS -v -f dolly
Read config file(dolly)...
server name is 'max5.paris' (10)
HOSTs
-------------List---------------------
Items = 4
No.    flag   name
  1    1      max5.paris
  2    2      max6.paris
  3    3      max7.paris
  4    3      max8.paris
--------------------------------------
FILEs
-------------List-------------------------------
Items = 1
No.    flag   name                 name
  1    0      /dev/sda             /dev/sda
------------------------------------------------
Trying to build ring...
Start sending RING Packet
use hostname 'max6.paris'
Connecting to max6.paris.....(host=max6.paris::9998)
setting NOBLOCK mode in open_connect()
Sending Ring packet succeeded. try to recieve the return.
Start recieving RING Packet back
Accepting port(9998).....Sent/Recieved Ring Packet
Start sending Host Packet
Start recieving  Host Packet back
sent/recieved HOST Packet.
-------------List---------------------
Items = 4
No.    flag   name
  1    1      max5.paris
  2    1      max6.paris
  3    1      max7.paris
  4    1      max8.paris
--------------------------------------
******************************************
* Host marked with flag=3 has a trouble. *
******************************************
starting threads
readdisk() thread started.
writenet() thread started. ALL are ready !
fileno=0 '/dev/sda' opened.
filenopacket(fileno=0) sending
10MB  44.05MB/s 10MB=44.1MB/s (0.2s 10MB=0.2s)
20MB  54.50MB/s 10MB=71.4MB/s (0.4s 10MB=0.1s)
....
10470MB  58.16MB/s 10MB=44.8MB/s (180.0s 10MB=0.2s)
...
33560MB  55.93MB/s 10MB=41.3MB/s (600.0s 10MB=0.2s)
...
57650MB  52.41MB/s 10MB=52.6MB/s (1100.0s 10MB=0.2s)
...
76420MB  47.76MB/s 10MB=38.0MB/s (1600.1s 10MB=0.3s)
...
80000MB  46.61MB/s 10MB=48.1MB/s (1716.4s 10MB=0.2s)
fileno=0 closed
TotalTime=1716.38(sec) DiskRead=1702.00(sec) SemaphoreWait=0.05(sec)
readdisk() ended..
Totaltime=1716.41(sec) WriteNetTime=718.33(sec) WaitSemaphoreTime=997.76(sec) miscTime=1716.33(sec)
writenet() ended..
Transfer Bytes=80000MB. Transfer Speed=46.61MB/s

Cloning 80 GB to 3 nodes takes only 1716.38 seconds (TotalTime in the output above).
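
The figures in the server output can be cross-checked with a little arithmetic. The short Python sketch below recomputes the average transfer speed from the total size and TotalTime reported above; the function name transfer_rate_mb_per_s is hypothetical, and the two input values are copied from the log.

```python
def transfer_rate_mb_per_s(total_mb, seconds):
    """Average throughput in MB/s for a transfer of total_mb megabytes."""
    return total_mb / seconds


total_mb = 80000        # "Transfer Bytes=80000MB" in the server output
total_time = 1716.38    # "TotalTime=1716.38(sec)" in the server output

rate = transfer_rate_mb_per_s(total_mb, total_time)
minutes = total_time / 60

print(f"{rate:.2f} MB/s over {minutes:.1f} minutes")
# prints "46.61 MB/s over 28.6 minutes"
```

The result agrees with the Transfer Speed=46.61MB/s line printed by dollyS.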
