UNCC ChargerNet and MPI

From RCSWiki

Revision as of 20:20, 20 March 2009 by Klhammon (Talk | contribs)

About

Summary

  • The NAS Parallel Benchmarks (NPB) are a small set of programs designed to help evaluate the performance of parallel supercomputers. The benchmarks, which are derived from computational fluid dynamics (CFD) applications, consist of five kernels and three pseudo-applications. The NPB come in several "flavors."


ChargerNet


Installing the NAS Benchmarks in your account

 # NOTE:  You will also need to register with the NASA site to download the software.
  • I downloaded NPB 3.3 because it was the latest version at the time. Prior versions are available as well.
  • I downloaded it to my local PC. Because I was working from home, ChargerNet would not accept an scp directly from my home machine, so I copied the archive to my home directory on Homer and then copied it to ChargerNet from there.
  • Log in to your ChargerNet account, navigate to the directory where you want to install NPB, and type:
 tar -xzvf NPB3.3.tar.gz
  • This extracts all of the source code files into the proper directory tree.


Setting EXPORT Variables

  • When I started to make my programs, mpicc was not in the path. I did some searching and found an MPI installation on the filesystem, then edited my .bash_profile to include the bin directory of that mpicc. I believe it belonged to an earlier version of MPICH.
  • That was a bad idea.
  • What I should have done was simply source the environment variables for Open MPI. ChargerNet provides a different settings file for each compiler you might want to use.
 . /apps/usr/env/openmpi-1.2.4-gnu.sh                   # To use the GNU compiler.
 . /apps/usr/env/openmpi-1.2.4-intel.sh                 # To use the Intel compiler.
 . /apps/usr/env/openmpi-1.2.4-pgi.sh                   # To use the PGI compiler.
  • This puts the proper mpicc and mpif77 in the PATH and sets the proper library locations as well.
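
A quick way to confirm that the environment script did its job is to check that the compiler wrappers are now on the PATH. The following is only a minimal sketch; the function name missing_tools is my own and is not part of ChargerNet or Open MPI.

```shell
# Print the name of every requested tool that is not on the PATH.
# Prints nothing when all of them are found.
# (missing_tools is a hypothetical helper name, not a ChargerNet command.)
missing_tools() {
    for tool in "$@"; do
        command -v "$tool" >/dev/null 2>&1 || echo "$tool"
    done
}
```

After sourcing one of the openmpi-1.2.4 scripts above, running missing_tools mpicc mpif77 should print nothing. Any name it does print is a wrapper the shell still cannot find.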


Setting the compiler options

  • I was interested only in the MPI-enabled versions of these benchmark programs, so I concentrated on them.
  • Assuming you unpacked the .tar file directly in your home directory:
 cd ~/NPB3.3/NPB3.3-MPI/config
 cp make.def.template make.def
 nano make.def     (or vi, or emacs, or whatever you use to edit text files)
  • Change the line:
 MPIF77 = f77

to

 MPIF77 = mpif77
  • and
 MPICC = cc

to

 MPICC = mpicc
  • You should now be able to compile each of the individual programs.
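
If you would rather not edit make.def by hand, the same two changes can be scripted with sed. This is a sketch under the assumption that each variable is defined on a single line beginning with its name, as in the lines quoted above. The function name set_mpi_compilers is hypothetical.

```shell
# Point MPIF77 and MPICC in a make.def file at the MPI compiler wrappers.
# $1 is the path to make.def; the original is kept as make.def.orig.
# (set_mpi_compilers is a hypothetical helper name, not part of NPB.)
set_mpi_compilers() (
    def_file=$1
    cp "$def_file" "$def_file.orig" || exit 1
    # Only lines that start with the variable name are rewritten, so
    # lines that merely reference $(MPIF77) or $(MPICC) are left alone.
    sed -e 's/^MPIF77[[:space:]]*=.*/MPIF77 = mpif77/' \
        -e 's/^MPICC[[:space:]]*=.*/MPICC = mpicc/' \
        "$def_file.orig" > "$def_file"
)
```

From the config directory, set_mpi_compilers make.def applies both edits and leaves the untouched copy in make.def.orig for comparison.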


A brief description of the actual benchmark programs

The benchmarks are divided into two types of programs: kernel and CFD programs.

  • KERNELS
    • CG: A conjugate gradient method is used to compute an approximation to the smallest eigenvalue of a large, sparse, symmetric positive definite matrix. This kernel is typical of unstructured grid computations in that it tests irregular long distance communication, employing unstructured matrix vector multiplication.
    • EP: An "embarrassingly parallel" kernel. It provides an estimate of the upper achievable limits for floating point performance, i.e., the performance without significant interprocessor communication.
    • FT: A 3-D partial differential equation solution using FFTs. This kernel performs the essence of many "spectral" codes. It is a rigorous test of long-distance communication performance.
    • IS: A large integer sort. This kernel performs a sorting operation that is important in "particle method" codes. It tests both integer computation speed and communication performance.
    • MG: A simplified multigrid kernel. It requires highly structured long distance communication and tests both short and long distance data communication.
  • APPLICATIONS
    • BT: A simulated CFD application that uses an implicit algorithm to solve 3-dimensional (3D) compressible Navier-Stokes equations. The finite differences solution to the problem is based on an Alternating Direction Implicit (ADI) approximate factorization that decouples the x, y and z dimensions. The resulting systems are Block-Tridiagonal of 5×5 blocks and are solved sequentially along each dimension.
    • LU: A simulated CFD application that uses the symmetric successive over-relaxation (SSOR) method to solve a seven-block-diagonal system resulting from finite-difference discretization of the Navier-Stokes equations in 3-D by splitting it into block Lower and Upper triangular systems.
    • SP: A simulated CFD application that has a similar structure to BT. The finite differences solution to the problem is based on a Beam-Warming approximate factorization that decouples the x, y and z dimensions. The resulting system has Scalar Pentadiagonal bands of linear equations that are solved sequentially along each dimension.


Making the programs

  • In your ~/NPB3.3/NPB3.3-MPI directory, type:
 make

The following will be displayed:

  =========================================
  =      NAS Parallel Benchmarks 3.3      =
  =      MPI/F77/C                        =
  =========================================
  To make a NAS benchmark type 
        make <benchmark-name> NPROCS=<number> CLASS=<class> [SUBTYPE=<type>]
  where <benchmark-name>  is "bt", "cg", "ep", "ft", "is", "lu",
                             "mg", or "sp"
        <number>          is the number of processors
        <class>           is "S", "W", "A", "B", "C", or "D"
  Only when making the I/O benchmark:
        <benchmark-name>  is "bt"
        <number>, <class> as above
        <type>            is "full", "simple", "fortran", or "epio"
  To make a set of benchmarks, create the file config/suite.def
  according to the instructions in config/suite.def.template and type
        make suite
***************************************************************
* Remember to edit the file config/make.def for site specific *
* information as described in the README file                 *
***************************************************************
  • The <benchmark-name> is one of the benchmark types described above.
  • The <class> of a benchmark sets the problem size. The classes, listed from smallest to largest, are:
# S
# W
# A
# B
# C
# D
# E
  • Each type of program is only valid for certain <number> values of NPROCS:
    • BT, SP, & EP must be compiled with a square (n^2) number of processes, i.e., 1, 4, 9, 16, 25, 36, 49, 64.
      • BT also includes 4 subtypes: full, simple, fortran and epio.
    • FT, IS, CG, LU & MG must be compiled using a power of 2 (2^n) number of processes, i.e., 1, 2, 4, 8, 16, 32, 64.
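
The two rules above are easy to check in the shell before calling make. The following is a minimal sketch that follows the grouping given on this page. The function names is_square, is_power_of_two and valid_nprocs are my own and are not part of NPB.

```shell
# Succeed (exit status 0) when $1 is a perfect square: 1, 4, 9, 16, ...
is_square() (
    n=$1
    i=1
    while [ $((i * i)) -lt "$n" ]; do
        i=$((i + 1))
    done
    [ $((i * i)) -eq "$n" ]
)

# Succeed when $1 is a power of two: 1, 2, 4, 8, ...
is_power_of_two() (
    n=$1
    [ "$n" -ge 1 ] || exit 1
    while [ $((n % 2)) -eq 0 ]; do
        n=$((n / 2))
    done
    [ "$n" -eq 1 ]
)

# Succeed when process count $2 is valid for benchmark $1,
# using the grouping described on this page (an assumption, not
# something read from the NPB sources).
valid_nprocs() {
    case $1 in
        bt|sp|ep)       is_square "$2" ;;
        ft|is|cg|lu|mg) is_power_of_two "$2" ;;
        *)              return 1 ;;
    esac
}
```

For example, valid_nprocs bt 16 succeeds, while valid_nprocs bt 8 fails because 8 is not a square.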

A bash script to compile all the programs

This script should walk through all the possible combinations of application, data class and processor number, 'make'-ing all of them.

#!/bin/bash
#
# Script to run "make" for every NPB benchmark, class and processor-count combination
#
#
for APP in bt
do
   echo "#$APP submit files"
   for CLASS in S W A B C D
   do
       for NODE in 1 4 9 16 25 36 49 64
       do
           make $APP NPROCS=$NODE CLASS=$CLASS
           for SUBTYPE in full simple fortran epio
           do
               make $APP NPROCS=$NODE CLASS=$CLASS SUBTYPE=$SUBTYPE
           done
       done
   done
done
for APP in ep sp
do
   echo "#$APP submit files"
   for CLASS in S W A B C D E
   do
       for NODE in 1 4 9 16 25 36 49 64
       do
           make $APP NPROCS=$NODE CLASS=$CLASS
       done
   done
done
for APP in ft is cg lu mg
do
   echo "#$APP submit files"
   for CLASS in S W A B C D E
   do
       for NODE in 1 2 4 8 16 32 64
       do
           make $APP NPROCS=$NODE CLASS=$CLASS
       done
   done
done
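
Before starting that many builds, it can help to print the make commands first and see how many there are. The following is a dry-run sketch of the same nested loops. The function name list_makes is hypothetical.

```shell
# Print one "make" command per combination instead of running it.
# $1: benchmark names, $2: classes, $3: process counts,
# each given as a single space-separated string.
# (list_makes is a hypothetical helper name, not part of NPB.)
list_makes() {
    for app in $1; do
        for class in $2; do
            for nprocs in $3; do
                echo "make $app NPROCS=$nprocs CLASS=$class"
            done
        done
    done
}
```

Piping list_makes "ft is cg lu mg" "S W A B C D E" "1 2 4 8 16 32 64" through wc -l shows that the last loop of the script alone accounts for 245 builds (5 benchmarks × 7 classes × 7 process counts).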

ChargerNet and Condor

  • ChargerNet uses Condor to handle cluster load balancing and job scheduling. Refer to http://www.cs.wisc.edu/condor/ for more information about Condor. The key point is that you need to generate a submit file for each of the program combinations built above. Quite a daunting task!
  • If you type:
 which condor_submit

you should get

/apps/condor/bin/condor_submit

This is the program you will use to submit jobs to the cluster.

  • Some other useful Condor commands:
condor_q - prints the current Condor queue.
condor_rm <job_no> - removes the selected job from the queue.

Building a Submit File

  • Already we are talking about lots of different programs that need to be run. Scripts are here to help. I use the following to generate all the job submission files.
  • Copy the make script and add the following lines:

NPB=NPB3.3/NPB3.3-MPI
OUT_DIR=NPB.Output
SUB_DIR=NPB.Submit
UNIVERSE=parallel
EXECUTABLE=/apps/condor/scripts/openmpi-64
CONDOR=/apps/condor/bin/condor_submit

function build_submit {
   S_APP=$2.$3.$4
   echo "Building Submit File for $S_APP"
   echo "Universe = $UNIVERSE" > $1
   echo "Executable = $EXECUTABLE" >> $1
   echo "arguments = ~/$NPB/bin/$S_APP" >> $1
   echo "machine_count = $4" >> $1
   echo "output = $OUT_DIR/out.$S_APP" >> $1
   echo "error = $OUT_DIR/err.$S_APP" >> $1
   echo "log = $OUT_DIR/log.$S_APP" >> $1
   echo "getenv = true" >> $1
   echo "queue" >> $1
   echo "" >> $1
}


Then, inside each loop, replace

 make $APP NPROCS=$NODE CLASS=$CLASS

with

 SUBFILE=$SUB_DIR/submit.$APP.$CLASS.$NODE
 build_submit $SUBFILE $APP $CLASS $NODE

To submit each job as soon as its submission file is generated, also add

 $CONDOR -n mpi $SUBFILE

after the build_submit line.
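
The build_submit function above writes the submit file one echo at a time, and the redirection fails if the directory that should hold the submit file does not exist yet. As an alternative sketch, the same file can be written with a single here-document after creating the directories. The function name write_submit is my own. It assumes NPB, OUT_DIR, UNIVERSE and EXECUTABLE are set as shown earlier, and it creates OUT_DIR as well on the assumption that the output, error and log files also need an existing directory.

```shell
# Write a Condor submit file for one benchmark binary using a here-document.
# $1: submit file path, $2: benchmark, $3: class, $4: process count.
# NPB, OUT_DIR, UNIVERSE and EXECUTABLE must already be set by the caller.
# (write_submit is a hypothetical helper name, not a Condor command.)
write_submit() (
    s_app=$2.$3.$4
    # Create the directories first; the redirection below fails otherwise.
    mkdir -p "$(dirname "$1")" "$OUT_DIR" || exit 1
    # The "~" is written literally, exactly as build_submit does.
    cat > "$1" <<EOF
Universe = $UNIVERSE
Executable = $EXECUTABLE
arguments = ~/$NPB/bin/$s_app
machine_count = $4
output = $OUT_DIR/out.$s_app
error = $OUT_DIR/err.$s_app
log = $OUT_DIR/log.$s_app
getenv = true
queue
EOF
)
```

It takes the same four arguments as build_submit, so write_submit $SUBFILE $APP $CLASS $NODE can stand in for the build_submit call inside the loops.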

Sit Back & Watch

  • ChargerNet is quite busy these days, so even a small program can take some time to get through the queue. Submitting MPI jobs with condor_submit -n mpi greatly increases the chance that your job will run.
  • Check the status of your jobs at
https://chargernet.uncc.edu/8001/jobstatus/
