Research: BioCluster

Gained access to the BioCluster

Login: ssh username@hpc.oit.uci.edu

Help: cat /data/help/cheat-sheet.txt

Guidebook on how to use the BioCluster created by Kevin Thornton:

Example to run jobs on the cluster:

Batch jobs are jobs whose script contains all of the information and instructions needed to run them. You create the script with your favorite editor (emacs, for example) and then submit it to the scheduler to run.

Some jobs can run for days, weeks, or longer, so batch is the way to go for such work. Once you submit a job to the scheduler, you can log off, come back later, and check on the results.

Serial batch jobs are usually the simplest to use. A serial job runs on a single core, so it is also the slowest kind of job.

Consider the following serial job script available from the HPC demo account.

cat ~demo/serial.sh

#!/bin/bash
#$ -N TEST
#$ -q free64
#$ -m beas

date  > out

Grid Engine Directive	What It Does
#!/bin/bash	Shell used to run the script (the bash shell)
#$ -N TEST	Our job name is TEST. If output is written to standard out, you will see a file named TEST.o<jobid>, plus TEST.e<jobid> for errors (if any occurred)
#$ -q free64	Request the free64 queue
#$ -m beas	Email you the job status: (b)egin, (e)nd, (a)bort, (s)uspend

The first line #!/bin/bash is the shell to use. Grid Engine (GE) directives start with #$. GE directives are needed in order to tell the scheduler what queue to use, how many cores to use, whether to send email or not, etc.
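To show how a few more directives fit the same pattern, here is a hedged variant of serial.sh. The job name, queue, and mail flags come from the demo script. The -cwd, -j y, and -o options are standard Grid Engine options added here as an assumption about what you might want, and the log file name test.log is hypothetical. Because bash treats every #$ line as an ordinary comment, the script also runs unchanged outside the scheduler.

```shell
#!/bin/bash
# Job name, queue, and mail settings (taken from the demo script).
#$ -N TEST
#$ -q free64
#$ -m beas
# Assumption: run the job in the directory where qsub was called.
#$ -cwd
# Assumption: merge standard error into standard output.
#$ -j y
# Hypothetical name for the merged log file.
#$ -o test.log

# The actual work: write the current date and time to the file "out".
date > out
```

Keeping each directive on its own line, with its comment on the line above, makes it easy to switch one option on or off without touching the rest.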

The last line in our serial.sh script is the program to run. In this example it is the date command, with its output redirected to a file named out.

date > out

Now that we have a basic understanding let’s run our first serial batch job on the HPC Cluster. First create a test directory, change to the test directory, copy the demo serial.sh script to our new directory and submit the job.

From your HPC account, do the following:

$ mkdir serial-test
$ cd serial-test
$ cp ~demo/serial.sh .
$ qsub serial.sh
$ qstat -u $USER

After you submit the job (qsub), GE will respond with a job ID:

Your job 1961 ("TEST") has been submitted

and qstat will display something similar to this:

job-ID  prior   name   user     state submit/start  queue       slots

  1961 0.00000  TEST  jfarran   qw    08/16/2012                 1

The state of our job is qw, or queue wait, meaning the job is sitting in the queue waiting for a compute node. The slots column shows a core count of 1, which is the default.
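The job ID in the submission message is what you use to track the job afterwards, so it can be handy to capture it in a variable. The following is a minimal sketch, assuming the message always has the form shown above. The function name job_id_from_qsub is made up for this example, and the demonstration uses the sample message rather than a real qsub call.

```shell
#!/bin/bash
# Extract the numeric job ID from Grid Engine's submission message.
# Reads the message on standard input and prints the ID.
job_id_from_qsub() {
    # In 'Your job 1961 ("TEST") has been submitted' the third
    # whitespace-separated field is the job ID.
    awk '$1 == "Your" && $2 == "job" { print $3 }'
}

# Demonstration with the sample message from the text (no scheduler needed).
sample='Your job 1961 ("TEST") has been submitted'
job_id=$(printf '%s\n' "$sample" | job_id_from_qsub)
echo "Job ID: $job_id"
```

On the cluster the same function could be fed directly, as in qsub serial.sh | job_id_from_qsub. Some Grid Engine versions also accept qsub -terse, which prints only the ID; check man qsub on your system before relying on it.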

When we run qstat -u $USER again a few seconds later, we see:

job-ID  prior   name   user    state submit/start  queue               slots

  1961 0.50659  TEST  jfarran   r   08/16/2012    free64@compute-7-11   1

The scheduler found a free core (slot) on compute-7-11 in the free64 queue and started our job 1961 there. The job state changed from qw (queue wait) to r (running).
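The two qstat listings above differ mainly in the state column. As a rough sketch of how to read that column from a script, the helpers below pick out the fifth field for a given job ID and translate the two states described in the text. The function names are hypothetical, and any other state code is simply passed through rather than guessed at.

```shell
#!/bin/bash
# Translate a qstat state code into a short description.
# Only the two states discussed in the text are spelled out.
describe_state() {
    case "$1" in
        qw) echo "queued, waiting for a compute node" ;;
        r)  echo "running" ;;
        *)  echo "in state $1" ;;
    esac
}

# Print the state column (field 5) for one job.
# $1 = job ID; qstat-style output is read from standard input.
job_state() {
    awk -v id="$1" '$1 == id { print $5 }'
}

# Demonstration with the sample line from the text (no scheduler needed).
sample='  1961 0.50659  TEST  jfarran   r   08/16/2012    free64@compute-7-11   1'
state=$(printf '%s\n' "$sample" | job_state 1961)
echo "Job 1961 is $(describe_state "$state")"
```

On the cluster you would pipe real output into the helper instead, for example qstat -u "$USER" | job_state 1961.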

Once you submit your job (qsub), things happen rather quickly so you may need to type qstat repeatedly and fast to see your job. Or open a new window and run: watch -d "qstat -u $USER"

Once the job completes you will get an email notification and the qstat output will be empty.
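Since an empty qstat listing signals that the job is done, a script can wait for that instead of you retyping qstat. The function below is one possible sketch, not part of the HPC tooling: the name wait_for_job and the 5-second default interval are assumptions, and it relies only on the qstat -u behavior described above.

```shell
#!/bin/bash
# Block until a job no longer appears in the qstat listing.
# $1 = job ID, $2 = seconds between checks (default 5, an arbitrary choice)
wait_for_job() {
    local job_id=$1 interval=${2:-5}
    # awk exits with status 0 while some line starts with the job ID,
    # so the loop keeps polling until the job drops out of the listing.
    while qstat -u "$USER" |
          awk -v id="$job_id" '$1 == id { found = 1 } END { exit !found }'; do
        sleep "$interval"
    done
    echo "Job $job_id is no longer listed"
}

# Usage on the cluster (commented out so the block runs anywhere):
# wait_for_job 1961 && cat out
```

A job that is absent from the listing has either finished or been removed, so check the out file or the email notification to confirm that it completed normally.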

Now do an ls and you will see the following files:

out  serial.sh

serial.sh is the batch script we submitted, and out holds the output of the date command. To see it, type:

$ cat out

Wednesday, January 28, 2015
