OpenPBS Script Samples: Well Worth Referencing

Here is a list of frequently asked questions that may answer your questions before you even have to ask them.

  • "Using your account"
  • "Login without using password"
  • "Using a job queuing system"
  • "File backup"
  • "Using the cluster in a courteous way"


"Using your account"

  • Login and file transfer

    TELNET, RLOGIN, and FTP have been disabled on ABACUS for security reasons. Users can use SSH or SLOGIN to log into their accounts on ABACUS and use SCP to transfer files between different machines. Windows users can use "Secure Shell Client" for login and "Secure File Transfer Client" for file transfer. Users who have no access to "Secure Shell Client" on Windows machines can download a free SSH client called putty.exe from the PuTTY web page. Those who prefer an FTP-style workflow can use SFTP instead of FTP.

    The head node is the login node; its address is abacus.uwaterloo.ca. The following examples show how users can log into the head node. Logging into compute nodes is not recommended, but users can do so when necessary. Users get a uniform view of their home directories no matter which node they log into.

    Example of logging into ABACUS from another UNIX/Linux machine: suppose you are a user on a UNIX machine named monolith, you want to log into ABACUS, and you have a user name of "foobar" on ABACUS and a password of "tricky". Do the following:

    monolith:~% ssh -l foobar abacus.uwaterloo.ca
    foobar@abacus's password: tricky
    [foobar@head ~]$

    Example of transferring files between ABACUS and another UNIX/Linux machine: suppose you are a user on the UNIX machine monolith and you want to transfer a file named file.txt, located in your home directory on monolith, to ABACUS. Your user name on ABACUS is "foobar" and your password is "tricky". Do the following:

    monolith:~% scp file.txt foobar@abacus.uwaterloo.ca:
    foobar@abacus's password: tricky

    Example of using SFTP: suppose you are a user on the UNIX machine monolith and you want to transfer files between monolith and ABACUS. Your user name on ABACUS is "foobar" and your password is "tricky". Do the following:

    monolith:~% sftp foobar@abacus.uwaterloo.ca
    foobar@abacus's password: tricky
    sftp>
  • Changing the password

    First, log into ABACUS, then issue the command 'passwd'. The system will prompt you for the old (existing) password and ask you to choose a new one.

    [foobar@head ~]$ passwd    

  • Login to compute nodes from a head node

    Suppose you have logged into head and now want to log into node035 (i.e., quad32g001). Do the following:

    [foobar@head ~]$ ssh node035    

"Login without using password"


Users can generate an authentication key to log into ABACUS from another UNIX machine without using a password. The authentication key is different for each machine; each pair of machines needs to be set up individually. Suppose a user named "foobar" wants to log into ABACUS from another UNIX machine, monolith. Follow these steps:
monolith:~% ssh-keygen -t rsa
Generating public/private rsa key pair.
Enter file in which to save the key (/home/foobar/.ssh/id_rsa):
Enter passphrase (empty for no passphrase):
Enter same passphrase again:
Your identification has been saved in /home/foobar/.ssh/id_rsa.
Your public key has been saved in /home/foobar/.ssh/id_rsa.pub.
The key fingerprint is:
0c:44:8c:3e:b9:b4:20:e3:83:4b:19:d9:54:cf:65:35 foobar@monolith
Please note: when the system prompts for a passphrase, just press Enter; do not type any passphrase.
monolith:~% cd .ssh
monolith:~/.ssh% scp id_rsa.pub abacus:
On ABACUS,
[foobar@head ~]$ cd .ssh
If the file authorized_keys does not already exist,
[foobar@head .ssh]$ touch authorized_keys
[foobar@head .ssh]$ cat ~/id_rsa.pub >> authorized_keys
Now, user foobar can login to ABACUS from monolith without typing the password,
monolith:~% ssh foobar@abacus
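The key-installation steps above can be wrapped in a small helper for reuse. This is a minimal sketch: install_pubkey is a hypothetical name, not a command on ABACUS, and the permission tightening is an addition based on standard sshd behavior.

```shell
# Sketch of the authorized_keys installation steps as a helper function.
# install_pubkey is a hypothetical name, not a command on ABACUS.
install_pubkey() {
    pubkey_file=$1                # e.g. the id_rsa.pub copied over with scp
    ssh_dir=$2                    # normally ~/.ssh on the remote machine
    mkdir -p "$ssh_dir"
    touch "$ssh_dir/authorized_keys"
    cat "$pubkey_file" >> "$ssh_dir/authorized_keys"
    # sshd ignores keys in group/world-accessible files, so tighten permissions.
    chmod 700 "$ssh_dir"
    chmod 600 "$ssh_dir/authorized_keys"
}
```

If passwordless login still fails after these steps, overly loose permissions on ~/.ssh or authorized_keys are the usual culprit.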

"Using a job queuing system"


TORQUE/PBS and Maui were installed on ABACUS for batch processing.

The Portable Batch System, PBS, is a workload management system for Linux clusters. It supplies commands to submit, monitor, and delete jobs. It has the following components.

Job Server - also called pbs_server; provides the basic batch services such as receiving/creating a batch job, modifying the job, protecting the job against system crashes, and running the job.

Job Executor - a daemon (pbs_mom) that actually places the job into execution when it receives a copy of the job from the Job Server, and returns the job's output to the user.

Job Scheduler - a daemon that contains the site's policy controlling which job is run and where and when it is run. PBS allows each site to create its own Scheduler. The Maui Scheduler is used on ABACUS.

Below are the steps needed to run a user job:

  • Create a job script containing the PBS options.
  • Submit the job script file to PBS.
  • Monitor the job.

PBS Options

Below are some of the commonly used PBS options in a job script file. Each option starts with "#PBS".

Option                     Description
======                     ===========
#PBS -N MyJob              Assigns a job name. The default is the name of the PBS job script.
#PBS -l nodes=4:ppn=2      The number of nodes and processors per node.
#PBS -q queuename          Assigns the queue your job will use.
#PBS -l walltime=01:00:00  The maximum wall-clock time during which this job can run.
#PBS -o mypath/my.out      The path and file name for standard output.
#PBS -e mypath/my.err      The path and file name for standard error.
#PBS -j oe                 Join option that merges the standard error stream with the standard output stream of the job.
#PBS -W stagein=file_list  Copies the file onto the execution host before the job starts.
#PBS -W stageout=file_list Copies the file from the execution host after the job completes.
#PBS -m b                  Sends mail to the user when the job begins.
#PBS -m e                  Sends mail to the user when the job ends.
#PBS -m a                  Sends mail to the user when the job aborts (with an error).
#PBS -m ba                 Allows a user to have more than 1 command with the same flag by grouping the messages together on 1 line, else only the last command gets executed.
#PBS -r n                  Indicates that a job should not rerun if it fails.
#PBS -V                    Exports all environment variables to the job.
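To see how several of these options combine, the sketch below writes a hypothetical job script, myjob.bash. The job name, resource counts, queue behavior, and walltime are illustrative values, not site defaults.

```shell
# Illustrative job script header combining several of the options above.
# The job name and resource values are examples only, not site defaults.
cat > myjob.bash <<'EOF'
#!/bin/bash
#PBS -N MyJob
#PBS -l nodes=2:ppn=2
#PBS -l walltime=01:00:00
#PBS -j oe
#PBS -m ea
#PBS -r n
#PBS -V
cd $PBS_O_WORKDIR
echo "running on: $(hostname)"
EOF
```

Note that -m ea groups the "ends" and "aborts" mail events into one directive, as described in the table.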
Job Script Example

A job script may consist of PBS directives, comments, and executable statements. A PBS directive provides a way of specifying job attributes in addition to the command-line options.

For example, a simple job script, named geo1.bash, contains the following lines:

  #!/bin/bash
  #PBS -l nodes=1:ppn=1
  #PBS -V
  PBS_O_WORKDIR=/home/huang/temp
  myPROG='/home/huang/software/nwchem-4.7/bin/LINUX64_x86_64/nwchem'
  myARGS='/home/huang/software/tce-test/geo-0.98.nw'
  cd $PBS_O_WORKDIR
  $myPROG $myARGS >& out1
An example that runs a job on a specific node contains the following lines:
  #!/bin/bash
  #PBS -l nodes=node035:ppn=1
  #PBS -V
  PBS_O_WORKDIR=/home/huang/temp
  myPROG='/home/huang/software/nwchem-4.7/bin/LINUX64_x86_64/nwchem'
  myARGS='/home/huang/software/tce-test/geo-0.98.nw'
  cd $PBS_O_WORKDIR
  $myPROG $myARGS >& out1
Another example, an MPI job script named geo2.bash, contains the following lines:
  #!/bin/bash
  #PBS -l nodes=4:ppn=4
  #PBS -V
  NCPUS=16
  PBS_O_WORKDIR=/home/huang/temp
  cd $PBS_O_WORKDIR
  cat $PBS_NODEFILE > .machinefile
  myPROG='/home/huang/software/nwchem-4.7/bin/LINUX64_x86_64/nwchem_mpi'
  myARGS='/home/huang/software/tce-test/geo-0.98.nw'
  MPIRUN='/opt/mpich.pgi/bin/mpirun'
  $MPIRUN -np $NCPUS -machinefile .machinefile $myPROG $myARGS >& out2

The above job script templates should be modified to the needs of the job. You only need to change the contents of the variables PBS_O_WORKDIR, myPROG, and myARGS.
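Since only those three variables change between jobs, one workable approach (a sketch of my own, not a site tool) is to keep a template with placeholders and stamp the values in with sed. The @WORKDIR@-style markers, paths, and the use of wc as a stand-in program are all illustrative.

```shell
# Hypothetical template with placeholder markers for the three variables
# that change between jobs (paths below are illustrative, not real accounts).
cat > template.bash <<'EOF'
#!/bin/bash
#PBS -l nodes=1:ppn=1
#PBS -V
PBS_O_WORKDIR=@WORKDIR@
myPROG=@PROG@
myARGS=@ARGS@
cd $PBS_O_WORKDIR
$myPROG $myARGS >& out1
EOF

# Stamp in concrete values; wc stands in for the real program here.
sed -e 's|@WORKDIR@|/home/foobar/temp|' \
    -e "s|@PROG@|'/usr/bin/wc'|" \
    -e "s|@ARGS@|'-w input.txt'|" template.bash > myrun.bash
```

The generated myrun.bash can then be submitted with qsub as usual.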


Submitting a Job

Use the qsub command to submit the job,

qsub geo2.bash
PBS assigns a job a unique job identifier once it is submitted (e.g. 70.head). This job identifier will be used to monitor the status of the job later. After a job has been queued, it is selected for execution based on the time it has been in the queue, the wall-clock time limit, and the number of processors.
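Because qsub prints the new identifier on stdout, it can be captured in a shell variable for later monitoring. In this sketch a stub function stands in for the real qsub, which is only available on the cluster; 70.head is the example identifier from the text.

```shell
# qsub prints the job identifier (e.g. 70.head) on stdout, so it can be
# captured in a shell variable. The stub below stands in for the real
# qsub, which is only available on the cluster.
qsub() { echo "70.head"; }        # stand-in for the real qsub command

jobid=$(qsub geo2.bash)
echo "submitted as $jobid"        # prints: submitted as 70.head
# later: qstat -f "$jobid"   or   checkjob "$jobid"
```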


Monitoring a Job

Below are the PBS commands for monitoring a job:

Command       Function
=======       ========
qstat -a      check status of jobs, queues, and the PBS server
qstat -f      get all the information about a job, i.e. resources requested, resource limits, owner, source, destination, queue, etc.
qdel JobID    delete a job from the queue
qhold JobID   hold a job if it is in the queue
qrls JobID    release a job from hold

There are some quite useful Maui commands for monitoring a job, too:

Command            Description
=======            ===========
showq              Show a detailed list of submitted jobs
showbf             Show the free resources (time and processors) available at the moment
checkjob JobID     Show a detailed description of the job JobID
showstart JobID    Give an estimate of the expected start time of the job JobID

For example, to check the status of a job,

qstat -f 70.head

or

checkjob 70.head
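A common pattern is to poll qstat in a loop until the job leaves the queue. This is a sketch under the assumption that qstat exits non-zero once the job is no longer known to the server; the stub function stands in for the real qstat so the sketch can run off-cluster.

```shell
# Sketch of polling until a job leaves the queue. On the cluster, delete
# the stub and keep the loop; qstat exits non-zero once the job is gone.
qstat() { return 1; }             # stand-in: pretend the job has already finished

jobid=70.head
while qstat "$jobid" >/dev/null 2>&1; do
    sleep 30                      # check every 30 seconds
done
echo "job $jobid is no longer queued or running"
```

Keep the polling interval generous; hammering the server every second adds load for no benefit.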

"File backup"

File systems on the head node are backed up to tape drives once a week. An incremental backup of the /home file system to another Linux machine is done daily. Users are also encouraged to back up their files to another system or any removable media themselves for safety. For example, to copy files over to another UNIX/Linux machine, users can use the rsync or scp commands. To copy files over to their PCs, users can use 'SSH Secure File Transfer Client'.


"Using the cluster in a courteous way"

You might be wondering why your jobs sometimes run slowly. There are numerous possible explanations for abacus's performance. However, high system load and the NFS file system are the two common causes of the problem.
  • High system load.
    ABACUS has 37 nodes; 33 of them are dual-CPU systems and 4 of them are quad-CPU systems. On each individual node, if the number of running jobs is more than 2 on the dual systems or more than 4 on the quad systems, each job is effectively assigned only part of a CPU for computation. Therefore, users are recommended to submit jobs through the job queuing system rather than logging into a compute node to run a job there directly. The queuing system balances the load among the nodes automatically.
  • I/O intensive jobs.
    User home directories are mounted using the NFS file system. No matter which node a user's job is running on, file reading and writing to the /home file system take place on the head node via the NFS mount. Running jobs can be slowed down significantly if many of them are I/O intensive, since these jobs need to access files on the head node simultaneously. Therefore, users are required to use the scratch space local to the compute nodes for the intermediate files created by the running programs.
In short, users should use the cluster in a courteous way and shouldn't run too many jobs at one time.
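The local-scratch pattern described in the second bullet can be sketched as below: copy inputs to node-local scratch, compute there, and write back to the NFS home directory only once at the end. mktemp -d stands in for the node's real scratch path, and wc stands in for the actual program, so the sketch runs anywhere.

```shell
# Sketch of the local-scratch pattern: stage in, compute locally, stage out.
workdir=$(mktemp -d)              # stands in for a job directory under /home
echo "input data" > "$workdir/geo.nw"

scratch=$(mktemp -d)              # on ABACUS: a directory on the node's local disk
cp "$workdir/geo.nw" "$scratch/"  # stage input onto local scratch
cd "$scratch"
wc -w geo.nw > out1               # the real job would run its program here
cp out1 "$workdir/"               # write back to /home once, at the end
cd "$workdir"
rm -rf "$scratch"                 # clean up the scratch space
```

Keeping all intermediate files on local scratch means the NFS server on the head node sees only the initial and final copies, not every read and write the program makes.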