The Persistence of Vision Ray-Tracer (POV-Ray) creates three-dimensional, photo-realistic images using a rendering technique called ray-tracing. It reads a text file describing the objects and lighting in a scene and generates an image of that scene from the viewpoint of a camera, also described in the text file. Ray-tracing is not a fast process by any means, but it produces very high quality images with realistic reflections, shading, perspective and other effects.
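For reference, a plain (non-MPI) POV-Ray render of a single scene looks like this; a minimal sketch using standard POV-Ray 3.1 options, assuming a scene file named scene.pov:

[iggi@iggi ~]$ povray +Iscene.pov +Oscene.png +W800 +H600 +FN +A

Here +I and +O select the input and output files, +W/+H set the resolution, +FN requests PNG output and +A enables anti-aliasing.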
There is an unofficial version of POV-Ray, mpipovray, that lets users compute POV-Ray images on an MPI cluster. This is useful for testing a cluster.
Copy the sample scripts from /usr/share/doc/povray-mpich-3.1g/sample to the home directory of a user. Then edit the chessItMpi script to choose a resolution and the number of processes per node. You can also choose the X display that will receive the result of the computation.
Then run the chessItMpi script; it will launch the computation of the “chess” scene across your whole cluster. You can compare your results using the tabsnet web site.
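Internally, a chessItMpi-style script boils down to an mpirun invocation of the MPI-patched POV-Ray binary. The following is only a hypothetical sketch: the binary name (mpi-x-povray here) and the exact options used in the packaged script may differ.

NP=3    # total number of MPI processes
mpirun -np $NP -machinefile /usr/share/mpich/machines.LINUX \
    mpi-x-povray +Ichess2.pov +Ochess2.png +W1024 +H768 +FN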
First we need to install povray-common, povray-official and povray-mpich on the server and on all nodes, using urpmi and urpmi --parallel.
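On the IGGI server this amounts to something like the following; the nodes alias used with urpmi --parallel is an assumption and must match a parallel media alias configured on your system:

[root@iggi ~]# urpmi povray-common povray-official povray-mpich
[root@iggi ~]# urpmi --parallel nodes povray-common povray-official povray-mpich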
[iggi@iggi ~]$ cp -v /usr/share/doc/povray-mpich-3.1g/sample/chess* .
`/usr/share/doc/povray-mpich-3.1g/sample/chess2.pov' -> `./chess2.pov'
`/usr/share/doc/povray-mpich-3.1g/sample/chessItMpi' -> `./chessItMpi'
Now edit chessItMpi and check its settings. We need to know which OAR nodes are available:
[iggi@iggi ~]$ oarnodes -s
node1.guibland.com --> Alive
node2.guibland.com --> Alive
node3.guibland.com --> Alive
I want to use all of those nodes, so check the /usr/share/mpich/machines.LINUX file and make sure all nodes are listed:
[iggi@iggi ~]$ cat /usr/share/mpich/machines.LINUX
node1.guibland.com:1
node2.guibland.com:1
node3.guibland.com:1
We use MPI through rsh, so the rsh server must be enabled on all nodes and must work.
[iggi@iggi ~]$ rshp2 $NKA -- cat /etc/xinetd.d/rsh | grep disable
        disable = no
        disable = no
        disable = no
[iggi@iggi ~]$ su -
Password:
[root@iggi ~]# rshp2 $NKA -- service xinetd status
xinetd (pid 2112) is running...
xinetd (pid 2060) is running...
xinetd (pid 2058) is running...
[root@iggi ~]# exit
[iggi@iggi ~]$ rsh n1 w
 15:01:07 up 1:25, 2 users, load average: 2.07, 1.66, 1.42
USER     TTY      LOGIN@   IDLE   JCPU   PCPU WHAT
root     tty1     14:13    38:53  0.07s  0.07s -bash
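If any node had reported disable = yes, rsh could be enabled through xinetd with the standard service tools; a sketch, to be run on the node in question:

[root@node1 ~]# chkconfig rsh on
[root@node1 ~]# service xinetd restart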
All nodes are ready, so I can submit a job with oarsub, declaring that my job will take 1 hour and 20 minutes and needs 3 nodes.
[iggi@iggi ~]$ oarsub -l nodes=3,walltime=1:20 ./chessItMpi
IdJob = 20
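You can then follow the state of the job with oarstat (the exact output format depends on your OAR version):

[iggi@iggi ~]$ oarstat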
Blender is the open source software for 3D modeling, animation, rendering, post-production, interactive creation and playback. Blender 2.42 is included in the IGGI distribution for testing purposes. Why? Because it is a good way to benchmark our cluster and check that everything is OK.
This documentation explains, step by step, how to do rendering on your cluster with Blender and drqueue. To set up our rendering farm, we will use our server (iggi.guibland.com) as the master drqueue server, with the 3 nodes (node1, node2, node3) and the server itself as slave renderers. I strongly recommend using nodes with a swap partition available, because Blender can use a large amount of memory.
First we need to install drqueue and blender on our server.
[root@iggi ~]# urpmi blender drqueue
Now drqueue must be configured to use our server as the master drqueue server. Just edit the /etc/profile.d/drqueue.sh file.
[root@iggi ~]# cat /etc/profile.d/drqueue.sh
#!/bin/sh
export DRQUEUE_DB=/var/db/drqueue
export DRQUEUE_LOGS=/home/nis/renderuser/log
export DRQUEUE_MASTER=iggi.guibland.com
export DRQUEUE_BIN=/usr/bin
export DRQUEUE_ETC=/etc
export DRQUEUE_ROOT=/
Now start the drqueue master and slave daemons on the server.
[root@iggi ~]# master.Linux.i686
Could not open config file: '/etc/drqueue/master.conf'
Parsing config at: /etc/master.conf
Waiting for connections...
[root@iggi ~]# slave.Linux.i686
Could not open config file: '/etc/drqueue/slave.conf'
Parsing config at: /etc/slave.conf
HWINFO Report
Name: iggi.guibland.com
Architecture: Intel
OS: Linux
Processor type: Pentium 4
64bit-32bit cpu: 32bit
Processor speed: 3001 MHz
Number of processors: 2
Memory: 1010 Mbytes
Pools:
  0 - Default
DEBUG: pool list that was sent:
Pools:
  0 - Default
Highest file descriptor after initialization 7
Waiting for connections...
All is OK. Our cluster server is now both a master and a slave node for Blender rendering. We will now create a dedicated renderuser account to launch renderings and store logs. This user will be a NIS user, served by the NIS + Autofs + NFS services available on our cluster.
[root@iggi ~]# adduserNis.pl
-----------------------------------------------------------
 Add New user in NIS environnement on iggi.guibland.com
 user with an uid > 500 are NIS user
-----------------------------------------------------------
Login : renderuser
Group(s) [users] (You are member of mpi, oar, pvm by default) :
- Backup of /etc/group configuration
Adding renderuser in mpi group.
mpi group not found! Exiting
Adding renderuser in pvm group.
Adding renderuser in oar group.
----------------------------------------------------------
Login:   renderuser
Group:   users
Comment:
passwd renderuser:
Changing password for user renderuser.
New UNIX password:
Retype new UNIX password:
passwd: all authentication tokens updated successfully.
gmake[1]: Entering directory `/var/yp/guibland.com'
Updating passwd.byname...
Updating passwd.byuid...
Updating group.byname...
Updating group.bygid...
Updating netid.byname...
# mail netgrp publickey networks ethers bootparams printcap \
# amd.home auto.master auto.home auto.local passwd.adjunct \
# timezone locale netmasks
gmake[1]: Leaving directory `/var/yp/guibland.com'
- Creating ssh key for user renderuser
Generating public/private dsa key pair.
ssh_askpass: exec(/usr/lib/ssh/ssh-askpass): No such file or directory
ssh_askpass: exec(/usr/lib/ssh/ssh-askpass): No such file or directory
Your identification has been saved in /home/nis/renderuser/.ssh/id_dsa.
Your public key has been saved in /home/nis/renderuser/.ssh/id_dsa.pub.
The key fingerprint is:
58:54:82:29:7f:da:12:5a:37:1d:83:10:c2:2a:d7:90 [email protected]
- Authorize user to ssh himself
- Setting .rhosts file for renderuser
- Setting default .xinitrc for user
- Create mutt config
- Setting permission on file
- Adjust chmod to 0644 on .rhost key
User renderuser has been successfully created. To be sure that the drqueue environment variables are set at each login, you can add source /etc/profile.d/drqueue.sh to renderuser's ~/.bashrc file.
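For example, from the server (using the home directory created above):

[root@iggi ~]# echo "source /etc/profile.d/drqueue.sh" >> /home/nis/renderuser/.bashrc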
We can use a diskless system image containing blender and drqueue to do rendering on the nodes. Use the prepare_diskless_image script to create this system image.
[root@iggi ~]# prepare_diskless_image create blender drqueue tcsh
.....
++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++
If you want to change something in the chroot, do it now
/root/chroot_test
Else just press [ENTER]
++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++
Now it is time to make some modifications to our diskless system image. Copy the server's /etc/profile.d/drqueue.sh file into the chroot. This is not mandatory; you can also do it later, with a parallel copy command (mput2 $NKA -- /etc/profile.d/drqueue.sh /etc/profile.d/drqueue.sh).
[root@iggi ~]# cp -v /etc/profile.d/drqueue.sh /root/chroot_test/etc/profile.d/drqueue.sh
cp: overwrite `/root/chroot_test/etc/profile.d/drqueue.sh'? y
`/etc/profile.d/drqueue.sh' -> `/root/chroot_test/etc/profile.d/drqueue.sh'
Now press [ENTER] and finish the diskless image creation process. Set your PXE server to dolly mode.
[root@iggi ~]# setup_pxe_server.pl boot dolly
- Entering setup mode
- Patching /var/lib/tftpboot/X86PC/linux/pxelinux.cfg/default
PXE Server is now set to boot on 'dolly' Entry
Log on to the server as the renderuser user. Launch Blender, open your blend file, and set the rendering path to a directory under your home directory, or the nodes will not be able to write their output files. Use something like /home/nis/renderuser/render_tmp/. Save your blend file. For testing purposes, you can use the blend files from the Elephants Dream project.
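Make sure the target directory exists before submitting the job; since the home directory is shared over NFS, creating it once on the server is enough:

[renderuser@iggi ~]$ mkdir -p /home/nis/renderuser/render_tmp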
Boot all the nodes. Once they are up, start the dolly server on your cluster server.
[root@iggi ~]# dolly -v -s -f dolly.cfg
done.
I'm number -2
Parameter file:
infile = '/root/diskeless_node.img'
outfile = '/dev/ram3'
using data port 9998
using ctrl port 9997
myhostname = 'iggi.guibland.com'
fanout = 1
nr_childs = 1
server = 'iggi.guibland.com'
I'm the server.
I'm not the last host.
There are 3 hosts in the ring (excluding server):
'12.12.12.1'
'12.12.12.2'
'12.12.12.3'
Next hosts in ring:
12.12.12.1 (0)
All parameters read successfully.
No compression used.
Using transfer size 4096 bytes.
Trying to build ring...
Connecting to host 12.12.12.1... data control.
Waiting for ring to build...
Host got parameters '12.12.12.1'. Machines left to wait for: 3
Host ready '12.12.12.1'. Machines left to wait for: 2
Host got parameters '12.12.12.2'. Machines left to wait for: 2
Host ready '12.12.12.2'. Machines left to wait for: 1
Host got parameters '12.12.12.3'. Machines left to wait for: 1
Host ready '12.12.12.3'. Machines left to wait for: 0
Accepted.
Sending...
Sent MB: 270, MB/s: 11.310, Current MB/s: 11.353
Read 272281600 bytes from file(s).
Writing maxbytes = 272281600 to ctrlout
Sent MB: 272.
Synced.
Clients done.
Time: 24.73430
MBytes/s: 11.310
Aggregate MBytes/s: 33.931
Transmitted.
The nodes now probe for the kernel modules they need and go into auto-configuration mode. They are now available to do Blender rendering.
[root@iggi ~]# rshp2 -v $NKA -- blender -v
<node1.guibland.com> [rank:1]:Blender 2.42
<node2.guibland.com> [rank:2]:Blender 2.42
<node3.guibland.com> [rank:3]:Blender 2.42
The drqueue slave daemon must be started on all nodes.
[root@iggi ~]# rshp2 -v $NKA -- "source /etc/profile.d/drqueue.sh && slave.Linux.i686 "
Log in as renderuser and launch drqman.Linux.i686.
Check that all nodes are available, then create a new job (on the job page).
Check that the scene file, script directory and command all use paths under renderuser's home directory, or the nodes will not be able to access them. Now click the Submit button.
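Once the job is running, the rendered frames should appear under the rendering path chosen earlier; you can check the progress from the server:

[renderuser@iggi ~]$ ls -l /home/nis/renderuser/render_tmp/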