====== Array Jobs ======

Array jobs are essentially a mechanism for executing the very same script several times. Say, for instance, that you need to run a certain job script N times (for example, to apply a certain action to each of N images). You would typically call the same script N times, changing just one parameter (the image index i). With an array job, you specify the index range you want to execute and SGE takes care of the rest. This has plenty of advantages over submitting N separate jobs by hand.

In order to execute an array job, simply add the following to a qsub call or a script header:

  qsub -t 1-N ...

Here 1-N is the range you want to cover (note that a range 0-N is invalid: task indices must start at 1 or higher).

On the script side, we control the i-th call of our script through the variable SGE_TASK_ID, which SGE sets to the index of the current task. Take this as an example:

  #!/bin/bash
  #
  # MatchDist.sh
  # Match a list of key files in a distributed fashion (one array task per index)
  
  export PATH=~/software/bundler_sfm/bin:$PATH
  export LD_LIBRARY_PATH=~/software/bundler_sfm/bin:$LD_LIBRARY_PATH
  
  # Positional arguments: $1 list, $2 output name suffix, $3 ratio,
  # $4 working directory, $5 temporary directory, $6 output directory
  WORKDIR=$4
  LIST=$1
  TMPDIR=$5
  OUTDIR=$6
  mkdir -p "$TMPDIR"
  mkdir -p "$OUTDIR"
  RATIO=$3
  # SGE task IDs start at 1; shift to a zero-based index
  I=$((SGE_TASK_ID - 1))
  # Output name is <index>_<suffix>; change the printf format (e.g. %04d) if zero padding is needed
  OUT=$(printf "%d" "$I")_${2}
  
  cd "$WORKDIR" || exit 1
  
  echo "KeyMatchSingle $LIST $OUTDIR/$OUT $RATIO $I"
  # Write to the temporary directory first, then copy the result to the output directory
  KeyMatchSingle "$LIST" "$TMPDIR/$OUT" "$RATIO" "$I"
  cp -f "$TMPDIR/$OUT" "$OUTDIR/$OUT"

In the example above, SGE_TASK_ID identifies the i-th task of our array job (shifted by one to obtain the zero-based index I), so we can submit the whole array by typing:

  qsub -q ttn.q -t 1-1000 MatchDist.sh

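A common way to use the task index is to let each task pick one entry from a list of inputs. The following is only a minimal sketch of that idea: the helper ''pick_input'', the three-task range and the one-path-per-line list file are assumptions made for illustration and are not part of MatchDist.sh.

```shell
#!/bin/bash
#$ -t 1-3
# Sketch only: pick_input and the list file layout are hypothetical.

# Print line number $2 (1-based) of the text file $1.
pick_input() {
    sed -n "${2}p" "$1"
}

# SGE sets SGE_TASK_ID for each task of an array job; fall back to 1
# so the script can also be tried outside the scheduler.
TASK_ID="${SGE_TASK_ID:-1}"

# Do the work only when a list file is given as the first argument.
if [ -n "${1:-}" ]; then
    INPUT=$(pick_input "$1" "$TASK_ID")
    echo "task $TASK_ID processes $INPUT"
fi
```

Because task indices start at 1, they line up directly with line numbers in the list, so no index shift is needed here.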
==== Preventing too many tasks from running simultaneously ====

If we know that our tasks are resource-hungry and might stall a node when too many of them run at once, we can limit the number of concurrent tasks for that specific job. Just add the following parameter to the qsub call or to the script header:

  qsub -q ttn.q -t 1-1000 -tc 100 MatchDist.sh

This will allow at most 100 tasks to be executed simultaneously.
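The same options can also live in the script header instead of the command line, as lines starting with ''#$''. The sketch below assumes that ''ttn.q'' is the queue name; the function ''task_index'' is a hypothetical helper that only mirrors the index shift used in MatchDist.sh.

```shell
#!/bin/bash
#$ -q ttn.q
#$ -t 1-1000
#$ -tc 100
# Sketch only: the header lines above mirror the qsub options,
# so the script can be submitted with a plain "qsub <script>".

# Hypothetical helper: zero-based index of the current task.
# Falls back to task 1 when SGE_TASK_ID is not set (run outside SGE).
task_index() {
    echo $(( ${SGE_TASK_ID:-1} - 1 ))
}

echo "zero-based task index: $(task_index)"
```

Options given on the qsub command line take precedence over those embedded in the script, so the limit can still be changed for a single submission.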