Chapter 11. Diskless node

Table of Contents

11.1. Create diskless image
11.2. Use dolly to copy the image into the nodes' RAM

11.1. Create diskless image

If you want to use a node without installing the IGGI distribution on it, you can run the node in diskless mode. The system is copied into RAM and configured automatically; ganglia, OAR, and taktuk2 are ready to use. Because the whole system lives in RAM, the nodes need a large amount of memory: the minimal diskless image is about 150 MB. If a swap partition is available on the node, the diskless system detects it and uses it automatically.

Use the prepare_diskless_image script to generate the diskless image. The image is created in your current directory.

[root@iggi ~]# prepare_diskless_image
 Usage:
 /usr/bin/prepare_diskless_image create package_name package_name2

Auto installed RPMS are:
net-tools passwd clusterscripts-client kernel-smp-2.6.12.26mdk-1-1mdk lam mpich openssh-server ganglia-core
      

To create an image that also includes the tcsh package:

[root@iggi ~]# prepare_diskless_image create tcsh
 - cleaning old chroot
 - needed a basesystem based on rescue
642 blocks of size 65536. Preamble:
#!/bin/sh
#V1.0 Format
insmod cloop.o file=$0 
mount -r -t iso9660 /dev/cloop $1
exit $?

Block 0 length 3781 => 65536
Block 1 length 6629 => 65536
Block 2 length 8061 => 65536
........
 - Copy existing /dev
 - install wanted RPMS
mandatory RPMS: net-tools passwd clusterscripts-client kernel-smp-2.6.12.26mdk-1-1mdk lam mpich openssh-server ganglia-core
user choice: tcsh
installing libbzip2_1-1.0.3-1.2.20060mdk.i586.rpm rpm-helper-0.17-0. ...........
  161/162: libgtk+2.0_0          #############################################
  162/162: clusterscripts-client #############################################
.........
 - special rc.sysinit and go scripts
`/etc/rc.sysinit_diskless' -> `/root/chroot_test/etc/rc.sysinit'
`/etc/profile' -> `/root/chroot_test/etc/profile'
`/root/.bashrc' -> `/root/chroot_test/root/.bashrc'
useradd: unknown GID 100
 - try to reduce the size of the chroot
 - remove unwanted RPM
error: failed to open /etc/mtab: No such file or directory
warning: /etc/sysconfig/bootsplash saved as /etc/sysconfig/bootsplash.rpmsave
 - now we can clean the RPM database

++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++
 If you want to change something in the chroot, do it now
/root/chroot_test
Else just press [ENTER]
++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++

The script now pauses, and you are free to make any modification in the chroot used to build the diskless image. By default, the chroot is the /root/chroot_test directory.

The chroot contains a special script called go, in the chroot_path/ka directory. It is the first script launched after the node has retrieved the diskless image and probed all the modules it needs. If you want to start services or mount specific NFS directories, do it there.

#!/bin/sh
service portmap restart
setup_client_cluster.pl doall
service ypbind restart
service autofs restart
service gmond restart
service xinetd restart
service sshd restart
SWAP=$(lsparts | grep swap | cut -d : -f 1)
swapon -v /dev/$SWAP
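
The last two lines of the go script assume that exactly one swap partition exists. The sketch below shows one way to handle nodes with no swap partition, or with several. It assumes that lsparts prints one "name: description" line per partition, which is only inferred from the grep and cut pipeline above; swap_devices and enable_swap are hypothetical helper names, not part of IGGI.

```sh
#!/bin/sh
# Sketch only: assumes lsparts prints lines such as "hda5: ... swap ...",
# as implied by the "grep swap | cut -d : -f 1" pipeline in the go script.

# swap_devices (hypothetical): read lsparts-style lines on stdin and
# print the /dev path of every partition whose line mentions swap.
swap_devices() {
    grep swap | cut -d : -f 1 | while read -r part; do
        if [ -n "$part" ]; then
            printf '/dev/%s\n' "$part"
        fi
    done
}

# enable_swap (hypothetical): activate each swap partition found.
# When no swap partition exists, the loop body never runs.
enable_swap() {
    lsparts | swap_devices | while read -r dev; do
        swapon -v "$dev"
    done
}
```

Calling enable_swap at the end of the go script would replace the SWAP= and swapon lines.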

 - size of chroot:
215M    /root/chroot_test
 - create image diskeless_node.img
215900+0 records in
215900+0 records out
- format image diskeless_node.img
mke2fs 1.38 (30-Jun-2005)
diskeless_node.img is not a block special device.
Proceed anyway? (y,n) y
Filesystem label=cluster diskless
OS type: Linux
Block size=1024 (log=0)
Fragment size=1024 (log=0)
54000 inodes, 215900 blocks
0 blocks (0.00%) reserved for the super user
First data block=1
27 block groups
8192 blocks per group, 8192 fragments per group
2000 inodes per group
Superblock backups stored on blocks: 
        8193, 24577, 40961, 57345, 73729, 204801

Writing inode tables: done                            
Writing superblocks and filesystem accounting information: done

This filesystem will be automatically checked every 24 mounts or
180 days, whichever comes first.  Use tune2fs -c or -i to override.
 - mount loop diskeless_node.img
 - copy all data to image
 - umount diskeless_node.img
 - size of diskeless_node.img
212M    diskeless_node.img
 - if you want to edit your diskless image, just mount it loop:
mount -o loop diskeless_node.img /tmp/diskless_node
 - create dolly conf like this one:
----------------------------
infile /root/diskeless_node.img
outfile /dev/ram3
server iggi.guibland.com
firstclient 12.12.12.1
lastclient 12.12.12.3
clients 3
12.12.12.1
12.12.12.2
12.12.12.3
endconfig
----------------------------

 - Set your PXE server default boot to dolly
 ! DONT forget to adjust the ramsize parameter !
 - now launch dolly to copy diskless image on nodes:
----------------------------
dolly -v -s -f dolly.cfg
----------------------------

11.2. Use dolly to copy the image into the nodes' RAM

The diskless image has been created as /root/diskeless_node.img. To deploy it to the nodes we use dolly replication, the easiest way to send one image to all nodes. The script prints a basic example of a dolly configuration file. See Chapter 12, Duplicate an operating System, for more information about dolly.
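
A configuration file in the format printed by the script can also be generated from a list of nodes. The following is a minimal sketch: make_dolly_cfg is a hypothetical helper, not part of IGGI or dolly. It emits only the keywords that appear in the example printed by prepare_diskless_image, and it hard-codes the /dev/ram3 target used there.

```sh
#!/bin/sh
# make_dolly_cfg IMAGE SERVER CLIENT...   (hypothetical helper)
# Prints a dolly configuration with the same layout as the example
# printed by prepare_diskless_image.
make_dolly_cfg() {
    if [ "$#" -lt 3 ]; then
        echo "usage: make_dolly_cfg IMAGE SERVER CLIENT..." >&2
        return 1
    fi
    image=$1
    server=$2
    shift 2                          # "$@" now holds only the clients
    first=$1
    for last in "$@"; do :; done     # leaves the final client in $last

    printf 'infile %s\n' "$image"
    printf 'outfile /dev/ram3\n'     # assumption: same ramdisk as the example
    printf 'server %s\n' "$server"
    printf 'firstclient %s\n' "$first"
    printf 'lastclient %s\n' "$last"
    printf 'clients %s\n' "$#"
    printf '%s\n' "$@"               # one client per line
    printf 'endconfig\n'
}

# Example: reproduce the configuration shown by the script.
# make_dolly_cfg /root/diskeless_node.img iggi.guibland.com \
#     12.12.12.1 12.12.12.2 12.12.12.3 > dolly.cfg
```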

Now check that the ramdisk size in the dolly PXE entry is greater than the size of your diskless image, and set your default PXE entry to dolly. The ramdisk_size parameter is in /var/lib/tftpboot/X86PC/linux/pxelinux.cfg/default, in the entry labelled dolly.

label dolly
MENU LABEL Install a node with dolly 
       KERNEL images/vmlinuz
       APPEND initrd=images/all.rdz automatic=method:dolly,dolly_timeout:100,interface:eth0,network:dhcp ramdisk_size=270000 vga=text root=/dev/ram3 rw rescue dollymethod

[root@iggi ~]# setup_pxe_server.pl boot dolly

 - Entering setup mode
        - Patching /var/lib/tftpboot/X86PC/linux/pxelinux.cfg/default
PXE Server is now set to boot on 'dolly' Entry
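
The ramdisk size check described above can be scripted. The sketch below is only an illustration: dolly_ramdisk_kb and image_fits are hypothetical helpers, not part of IGGI. It relies on the kernel's ramdisk_size parameter being expressed in kilobytes (1024-byte blocks), so the image size in bytes is converted before the comparison.

```sh
#!/bin/sh
# dolly_ramdisk_kb FILE (hypothetical): print the ramdisk_size value
# found in the "label dolly" entry of a pxelinux configuration file.
dolly_ramdisk_kb() {
    awk '
        tolower($1) == "label" { in_dolly = ($2 == "dolly") }
        in_dolly {
            for (i = 1; i <= NF; i++)
                if ($i ~ /^ramdisk_size=[0-9]+$/) {
                    sub(/^ramdisk_size=/, "", $i)
                    print $i
                    exit
                }
        }
    ' "$1"
}

# image_fits IMAGE_BYTES RAMDISK_KB (hypothetical): succeed only when
# the ramdisk is strictly larger than the image, rounded up to 1 KiB.
image_fits() {
    [ -n "$2" ] || return 2          # no ramdisk_size found
    image_kb=$(( ($1 + 1023) / 1024 ))
    [ "$image_kb" -lt "$2" ]
}

# Example with the paths used in this chapter:
# cfg=/var/lib/tftpboot/X86PC/linux/pxelinux.cfg/default
# if image_fits "$(wc -c < /root/diskeless_node.img)" "$(dolly_ramdisk_kb "$cfg")"; then
#     echo "ramdisk is large enough"
# else
#     echo "increase ramdisk_size in the dolly entry" >&2
# fi
```

With the values from this chapter, the image is 221081600 bytes, that is 215900 KiB, which fits in the 270000 KiB ramdisk.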

Everything is ready, so boot all the nodes. Each node starts in dolly mode and waits for a dolly server. Then start the dolly server:

[root@iggi ~]# dolly -v -s -f dolly.cfg
done.
I'm number -2
Parameter file: 
infile = '/root/diskeless_node.img'
outfile = '/dev/ram3'
using data port 9998
using ctrl port 9997
myhostname = 'iggi.guibland.com'
fanout = 1
nr_childs = 1
server = 'iggi.guibland.com'
I'm the server.
I'm not the last host.
There are 3 hosts in the ring (excluding server):
        '12.12.12.1'
        '12.12.12.2'
        '12.12.12.3'
Next hosts in ring:
        12.12.12.1 (0)
All parameters read successfully.
No compression used.
Using transfer size 4096 bytes.

Trying to build ring...
Connecting to host 12.12.12.1... data control.
Waiting for ring to build...
Host got parameters '12.12.12.1'.
Machines left to wait for: 3
Host ready '12.12.12.1'.
Machines left to wait for: 2
Host got parameters '12.12.12.2'.
Machines left to wait for: 2
Host ready '12.12.12.2'.
Machines left to wait for: 1
Host got parameters '12.12.12.3'.
Machines left to wait for: 1
Host ready '12.12.12.3'.
Machines left to wait for: 0
Accepted.
Sending...
Sent MB: 70, MB/s: 11.331, Current MB/s: 11.357      
Read 221081600 bytes from file(s).
Writing maxbytes = 221081600 to ctrlout
Sent MB: 221.       
Synced.
Clients done.
Time: 19.495386
MBytes/s: 11.340
Aggregate MBytes/s: 34.021
Transmitted.
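
The throughput figures at the end of this log can be checked by hand. They are consistent with a per-client rate of bytes sent divided by elapsed time, counted in units of 10^6 bytes, and an aggregate rate equal to the per-client rate multiplied by the number of clients. This reading is inferred from the numbers above, not taken from the dolly documentation; dolly_rates is a hypothetical helper.

```sh
#!/bin/sh
# dolly_rates BYTES SECONDS CLIENTS (hypothetical helper)
# Prints the per-client and aggregate rates in MB/s (1 MB = 10^6 bytes).
dolly_rates() {
    LC_ALL=C awk -v bytes="$1" -v secs="$2" -v clients="$3" 'BEGIN {
        rate = bytes / secs / 1000000
        printf "%.3f %.3f\n", rate, rate * clients
    }'
}

# Values taken from the log above: prints "11.340 34.021"
dolly_rates 221081600 19.495386 3
```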

The nodes are now in diskless mode. You can run commands on them, copy data to them (only taktuk2 is available), log in as an existing cluster user, and so on.

[root@iggi ~]# gstat -al1
iggi.guibland.com     2 (    0/  158) [  0.28,  0.09,  0.55] [   0.4,   0.0,   0.7,  98.6,   0.3] OFF
node1.guibland.com     1 (    3/   39) [  0.36,  0.09,  0.03] [   5.0,   0.0,  20.1,  74.9,   0.0] OFF
node3.guibland.com     1 (    0/   39) [  0.39,  0.10,  0.03] [   5.6,   0.0,  18.1,  76.3,   0.0] OFF
node2.guibland.com     1 (    1/   39) [  0.43,  0.12,  0.04] [   6.2,   0.0,  24.2,  69.6,   0.0] OFF

[root@iggi ~]# rshp2 $NKA -- uptime
 12:45:06 up 1 min,  1 user,  load average: 0.23, 0.08, 0.03
 12:45:06 up 1 min,  1 user,  load average: 0.28, 0.11, 0.04
 12:45:06 up 1 min,  1 user,  load average: 0.25, 0.09, 0.03

[root@iggi tmp]# mput2 $NKA -- 0005.png /tmp/
[root@iggi tmp]# rshp2 $NKA -- md5sum /tmp/0005.png
14fe4daff28f73c56560b5cbd8978056  /tmp/0005.png
14fe4daff28f73c56560b5cbd8978056  /tmp/0005.png
14fe4daff28f73c56560b5cbd8978056  /tmp/0005.png

[root@iggi tmp]# oarnodes -s
node1.guibland.com --> Alive
node2.guibland.com --> Alive
node3.guibland.com --> Alive

[root@iggi tmp]# ssh n1
-- ---
-- CLUSTER diskless mode ---
-- ---

[root@node1 /]$ free
             total       used       free     shared    buffers     cached
Mem:        513928     266184     247744          0     233288      13972
-/+ buffers/cache:      18924     495004
Swap:      1036152          0    1036152
[root@node1 /]$ 

[iggi@node1 ~]$ id
uid=12385(iggi) gid=100(all) groups=100(all),500(oar),12385(pvm)
[iggi@node1 ~]$ df
Filesystem            Size  Used Avail Use% Mounted on
/tmp/stage2/dev/ram3  205M  200M  5.2M  98% /
iggi.guibland.com:/home/nis/iggi
                       13G  3.6G  8.3G  31% /home/nis/iggi