Chapter 22. Benchmark

Table of Contents

22.1. Povray
22.2. Povray MPI
22.3. How to benchmark?
22.4. Step by step example
22.5. Blender rendering farm
22.5.1. Server side
22.5.2. Node side
22.5.3. It's time to render!

22.1. Povray

The Persistence of Vision Ray-Tracer creates three-dimensional, photo-realistic images using a rendering technique called ray-tracing. It reads in a text file containing information describing the objects and lighting in a scene and generates an image of that scene from the viewpoint of a camera also described in the text file. Ray-tracing is not a fast process by any means, but it produces very high quality images with realistic reflections, shading, perspective and other effects.
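For illustration, a minimal scene file could look like this (the file name minimal.pov is hypothetical; the syntax is standard POV-Ray scene description language):

[iggi@iggi ~]$ cat minimal.pov
// a single sphere lit by one light source, seen from a simple camera
camera { location <0, 1, -4> look_at <0, 0, 0> }
light_source { <5, 5, -5> color rgb <1, 1, 1> }
sphere { <0, 0, 0>, 1 pigment { color rgb <0.2, 0.4, 1> } }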

22.2. Povray MPI

There is an unofficial version of povray, called mpipovray, that lets users render povray images on an MPI cluster. This is useful for testing a cluster.

22.3. How to benchmark?

Copy the sample scripts from /usr/share/doc/povray-mpi/sample to the home directory of a user. Then edit the chessItMpi script to choose a resolution and the number of processes per node. You can also choose the X display that will receive the result of the computation.
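The exact content of chessItMpi depends on the package version; as a purely illustrative sketch (none of the names below are taken from the real script), the values you would typically adjust look like this:

# illustrative settings only, not the real chessItMpi content
WIDTH=1024                      # horizontal rendering resolution
HEIGHT=768                      # vertical rendering resolution
NP=3                            # number of MPI processes to start
DISPLAY=iggi.guibland.com:0     # X display that shows the result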

Then run the chessItMpi script; it launches the “chess” computation across your whole cluster. You can compare your results on the tabsnet web site.

22.4. Step by step example

First we need to install povray-common, povray-official and povray-mpich on the server and on all nodes. Use urpmi, and urpmi's parallel mode for the nodes, to do that.
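For example, from the server (this sketch assumes a urpmi parallel alias, here called nodes, has been declared in /etc/urpmi/parallel.cfg; the alias name is an assumption):

[root@iggi ~]# urpmi povray-common povray-official povray-mpich
[root@iggi ~]# urpmi --parallel nodes povray-common povray-official povray-mpich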

[iggi@iggi ~]$ cp -v /usr/share/doc/povray-mpich-3.1g/sample/chess* .
`/usr/share/doc/povray-mpich-3.1g/sample/chess2.pov' -> `./chess2.pov'
`/usr/share/doc/povray-mpich-3.1g/sample/chessItMpi' -> `./chessItMpi'

Now edit chessItMpi to check it. We need to know which OAR nodes are available:

[iggi@iggi ~]$ oarnodes -s
node1.guibland.com --> Alive
node2.guibland.com --> Alive
node3.guibland.com --> Alive

We want to use all those nodes, so check the /usr/share/mpich/machines.LINUX file and make sure all the nodes are present (the :1 suffix tells MPICH how many processes to run on each host):

[iggi@iggi ~]$ cat /usr/share/mpich/machines.LINUX
node1.guibland.com:1
node2.guibland.com:1
node3.guibland.com:1

We use MPI through rsh, so the rsh server must be enabled on all nodes, and must work.

[iggi@iggi ~]$ rshp2 $NKA -- cat /etc/xinetd.d/rsh | grep disable
        disable = no
        disable = no
        disable = no
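If a node had answered disable = yes, rsh would first have to be enabled on it; on a Mandriva-style node a minimal way to do that is:

[root@node1 ~]# chkconfig rsh on
[root@node1 ~]# service xinetd reload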
[iggi@iggi ~]$ su -
Password: 
[root@iggi ~]# rshp2 $NKA -- service xinetd status
xinetd (pid 2112) is running...
xinetd (pid 2060) is running...
xinetd (pid 2058) is running...

[root@iggi ~]# exit

[iggi@iggi ~]$ rsh n1 w 
 15:01:07 up  1:25,  2 users,  load average: 2.07, 1.66, 1.42
USER     TTY        LOGIN@   IDLE   JCPU   PCPU WHAT
root     tty1      14:13   38:53   0.07s  0.07s -bash

All nodes are ready, so we can submit a job with oarsub, declaring that the job will take 1 hour and 20 minutes and needs 3 nodes.

[iggi@iggi ~]$ oarsub -l nodes=3,walltime=1:20 ./chessItMpi
IdJob = 20
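
While the job runs, you can follow it with OAR's oarstat command:

[iggi@iggi ~]$ oarstat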

22.5. Blender rendering farm

Blender is the open source software for 3D modeling, animation, rendering, post-production, interactive creation and playback. Blender 2.42 is included in the IGGI distribution for testing purposes. Why? Because it's a good way to benchmark our cluster and check that everything is OK.

This documentation explains, step by step, how to do rendering on your cluster with blender and drqueue. To set up our rendering farm, we will use our server (iggi.guibland.com) as the master drqueue server, with the 3 nodes (node1, node2, node3) and the server itself as slave renderers. I strongly recommend using nodes with a swap partition available, because blender can use a large amount of memory.

22.5.1. Server side

First we need to install drqueue and blender on our server.

[root@iggi ~]# urpmi blender drqueue

Now drqueue must be configured to use our server as the master drqueue server. Just edit the /etc/profile.d/drqueue.sh file.

[root@iggi ~]# cat /etc/profile.d/drqueue.sh
#!/bin/sh
export DRQUEUE_DB=/var/db/drqueue
export DRQUEUE_LOGS=/home/nis/renderuser/log
export DRQUEUE_MASTER=iggi.guibland.com
export DRQUEUE_BIN=/usr/bin
export DRQUEUE_ETC=/etc
export DRQUEUE_ROOT=/

Now start the drqueue master and slave daemons on the server.

[root@iggi ~]# master.Linux.i686 
Could not open config file: '/etc/drqueue/master.conf'
Parsing config at: /etc/master.conf
Waiting for connections...

[root@iggi ~]# slave.Linux.i686 
Could not open config file: '/etc/drqueue/slave.conf'
Parsing config at: /etc/slave.conf
HWINFO Report
Name:                   iggi.guibland.com
Architecture:           Intel
OS:                     Linux
Processor type:         Pentium 4
64bit-32bit cpu:        32bit
Processor speed:        3001 MHz
Number of processors:   2
Memory:                 1010 Mbytes
Pools: 
        0 - Default
DEBUG: pool list that was sent:
Pools: 
        0 - Default
Highest file descriptor after initialization 7
Waiting for connections...
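
Note that both daemons stay attached to the terminal. For day-to-day use you will probably want to background them and keep their output in log files, for example (the log file locations below are arbitrary):

[root@iggi ~]# master.Linux.i686 > /var/log/drqueue-master.log 2>&1 &
[root@iggi ~]# slave.Linux.i686 > /var/log/drqueue-slave.log 2>&1 &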

All is OK. Our cluster server is now both a master and a slave node for blender rendering. We will now create a dedicated renderuser user to launch renderings and store logs. This user will be a NIS user, part of the NIS + Autofs + NFS services available on our cluster.

[root@iggi ~]# adduserNis.pl 
-----------------------------------------------------------
Add New user in NIS environnement on iggi.guibland.com
user with an uid > 500 are NIS user
-----------------------------------------------------------
Login : 
renderuser
Group(s) [users] (You are member of mpi, oar, pvm by default) : 

 - Backup of /etc/group configuration
Adding renderuser in mpi group.
mpi group not found!
Exiting
Adding renderuser in pvm group.
Adding renderuser in oar group.
----------------------------------------------------------
Login: renderuser
Group: users
Comment: 
passwd renderuser:
Changing password for user renderuser.
New UNIX password: 
Retype new UNIX password: 
passwd: all authentication tokens updated successfully.
gmake[1]: Entering directory `/var/yp/guibland.com'
Updating passwd.byname...
Updating passwd.byuid...
Updating group.byname...
Updating group.bygid...
Updating netid.byname...
# mail netgrp publickey networks ethers bootparams printcap \
# amd.home auto.master auto.home auto.local passwd.adjunct \
# timezone locale netmasks
gmake[1]: Leaving directory `/var/yp/guibland.com'
 - Creating ssh key for user renderuser
Generating public/private dsa key pair.
ssh_askpass: exec(/usr/lib/ssh/ssh-askpass): No such file or directory
ssh_askpass: exec(/usr/lib/ssh/ssh-askpass): No such file or directory
Your identification has been saved in /home/nis/renderuser/.ssh/id_dsa.
Your public key has been saved in /home/nis/renderuser/.ssh/id_dsa.pub.
The key fingerprint is:
58:54:82:29:7f:da:12:5a:37:1d:83:10:c2:2a:d7:90 [email protected]

 - Authorize user to ssh himself
 - Setting .rhosts file for renderuser
 - Setting default .xinitrc for user
 - Create mutt config
 - Setting permission on file
 - Adjust chmod to 0644 on .rhost key

User renderuser has been successfully created. To be sure that the drqueue environment will be set at each login, you can add source /etc/profile.d/drqueue.sh to renderuser's ~/.bashrc file.
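For example:

[root@iggi ~]# echo 'source /etc/profile.d/drqueue.sh' >> /home/nis/renderuser/.bashrc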

We can use a diskless system image containing blender and drqueue to be able to do the rendering on the nodes. Use the prepare_diskless_image script to create this system image.

[root@iggi ~]# prepare_diskless_image create blender drqueue tcsh
.....

++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++
 If you want to change something in the chroot, do it now
/root/chroot_test
Else just press [ENTER]
++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++

Now it's time to make some modifications to our diskless system image. Copy the server's /etc/profile.d/drqueue.sh file into the chroot. This is not mandatory; you can also do it later, once the nodes are up, with a parallel copy command (mput2 $NKA -- /etc/profile.d/drqueue.sh /etc/profile.d/drqueue.sh).

[root@iggi ~]# cp -v /etc/profile.d/drqueue.sh /root/chroot_test/etc/profile.d/drqueue.sh
cp: overwrite `/root/chroot_test/etc/profile.d/drqueue.sh'? y
`/etc/profile.d/drqueue.sh' -> `/root/chroot_test/etc/profile.d/drqueue.sh'

Now press [ENTER] and finish the diskless image creation process. Set your PXE server to dolly mode.

[root@iggi ~]# setup_pxe_server.pl boot dolly
- Entering setup mode
        - Patching /var/lib/tftpboot/X86PC/linux/pxelinux.cfg/default
PXE Server is now set to boot on 'dolly' Entry

Log in to the server as the renderuser user. Launch blender, open your blend file, and set the rendering path to a directory under your home, or the nodes will not be able to write their files. Use something like /home/nis/renderuser/render_tmp/. Save your blend file. For testing purposes, you can use blend files from the Elephants Dream project.
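You can verify from a shell that a frame really lands in that path, using blender's standard -b (background), -o (output path) and -f (render frame) options (the blend file name is just an example):

[renderuser@iggi ~]$ blender -b myscene.blend -o /home/nis/renderuser/render_tmp/frame_ -f 1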

Figure 22.1. Blender rendering path


22.5.2. Node side

Boot all the nodes; once they are up, start the dolly server on your cluster server.

[root@iggi ~]# dolly -v -s -f dolly.cfg
done.
I'm number -2
Parameter file: 
infile = '/root/diskeless_node.img'
outfile = '/dev/ram3'
using data port 9998
using ctrl port 9997
myhostname = 'iggi.guibland.com'
fanout = 1
nr_childs = 1
server = 'iggi.guibland.com'
I'm the server.
I'm not the last host.
There are 3 hosts in the ring (excluding server):
        '12.12.12.1'
        '12.12.12.2'
        '12.12.12.3'
Next hosts in ring:
        12.12.12.1 (0)
All parameters read successfully.
No compression used.
Using transfer size 4096 bytes.

Trying to build ring...
Connecting to host 12.12.12.1... data control.
Waiting for ring to build...
Host got parameters '12.12.12.1'.
Machines left to wait for: 3
Host ready '12.12.12.1'.
Machines left to wait for: 2
Host got parameters '12.12.12.2'.
Machines left to wait for: 2
Host ready '12.12.12.2'.
Machines left to wait for: 1
Host got parameters '12.12.12.3'.
Machines left to wait for: 1
Host ready '12.12.12.3'.
Machines left to wait for: 0
Accepted.
Sending...
Sent MB: 270, MB/s: 11.310, Current MB/s: 11.353      
Read 272281600 bytes from file(s).
Writing maxbytes = 272281600 to ctrlout
Sent MB: 272.       
Synced.
Clients done.
Time: 24.73430
MBytes/s: 11.310
Aggregate MBytes/s: 33.931
Transmitted.

The nodes now probe all the kernel modules they need and go into auto-configuration mode. They are now available to do the blender rendering.

[root@iggi ~]# rshp2 -v $NKA -- blender -v
<node1.guibland.com> [rank:1]:Blender 2.42
<node2.guibland.com> [rank:2]:Blender 2.42
<node3.guibland.com> [rank:3]:Blender 2.42

The drqueue slave daemon must be started on all nodes.

[root@iggi ~]# rshp2 -v $NKA -- "source /etc/profile.d/drqueue.sh && slave.Linux.i686 "

22.5.3. It's time to render!

Log in as renderuser and launch drqman.Linux.i686.

Figure 22.2. drqman


Check that all nodes are available, and create a new job (in the Jobs page).

Figure 22.3. New blender job


Check that the scene file, the script directory and the command all use paths under renderuser's home directory, or the nodes will not be able to access them. Now click the Submit button.

Figure 22.4. Running job


Figure 22.5. Drqman computer
