Difference: CemCluster (1 vs. 18)

Revision 18 - 24 Mar 2009 - Main.FabianaRenzi

General guidelines

The home directory has 20 GB of space allocated to all of CEM computing
/usr/data has 100 GB of space allocated to all of CEM computing
Both disks operate at the same speed
In general, do most of your processing on /usr/data, since there is more space available
Please remove data after processing so that others will have space to work
The cluster consists of 16 nodes with 4 CPUs per node. Nodes are numbered 0-15
To use the cluster properly, submit jobs through the batch submission system
In general, do not run jobs on the head node (node 0)

Batch submission system

We use PBS (commercial version) to handle jobs
mpirun is also installed for MPI-aware applications (SPIDER is the only MPI-aware EM software)
In general, PBS submission scripts should start with the following 3 lines:

#!/bin/bash
#PBS -l nodes=4:ncpus=4 
cd $PBS_O_WORKDIR

If the script is called samplescript.sh, then the following unix command will submit it to the cluster:

> qsub samplescript.sh

When the script is finished, it will leave 2 output files:
- samplescript.e88888 -- standard error from the program
- samplescript.o88888 -- standard output from the program (may be a very large file with EMAN or SPIDER)
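
A quick way to look at these two files after a job ends is sketched below. The function name report_job_output is hypothetical, and the sketch assumes the NAME.eJOBID / NAME.oJOBID naming shown in the list above.

```shell
#!/bin/bash
# report_job_output NAME JOBID: summarise the two files PBS leaves behind,
# assuming they are called NAME.eJOBID and NAME.oJOBID as described above.
report_job_output() {
    local err="$1.e$2" out="$1.o$2"
    if [ ! -e "$err" ] || [ ! -e "$out" ]; then
        echo "job $2 has not written its output files yet"
        return 1
    fi
    if [ -s "$err" ]; then
        echo "standard error is not empty, check $err"
    else
        echo "no messages on standard error"
    fi
    # Standard output can be very large, so show only its size and last lines.
    echo "$out: $(wc -c < "$out") bytes"
    tail -n 5 "$out"
}

# Typical use from the directory the job was submitted from:
#   report_job_output samplescript 88888
```

Checking the error file first is usually cheaper than paging through a large output file.
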
To monitor cluster use, run the qstat command:

> qstat
Job id            Name             User              Time Use S Queue
----------------  ---------------- ----------------  -------- - -----
75291.node0       PROTOMO          asiebert          07:00:12 R workq
75292.node0       PROTOMO          asiebert          06:51:08 R workq
75294.node0       PROTOMO          asiebert          06:28:27 R workq
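
When the queue is long, the listing above can be condensed to one line per user and job state. This is a minimal sketch that assumes the six-column layout shown above; the function name summarize_qstat is hypothetical.

```shell
#!/bin/bash
# summarize_qstat: read plain qstat output on standard input and print
# "user state count" for each user/state pair, skipping the two header lines.
summarize_qstat() {
    awk 'NR > 2 && NF >= 6 { count[$3 " " $5]++ }
         END { for (key in count) print key, count[key] }' | sort
}

# Typical use on the cluster:
#   qstat | summarize_qstat
```

For the three jobs listed above this would print a single line, "asiebert R 3".
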

Another way is to run xpbsmon, which will give a graphical view of cluster use.

* Click on a node to get a detailed view of how busy it is

EMAN

EMAN works with the batch submission software (PBS) installed on the cluster
You need to write a batch submission script for cluster use
Sample refinement script:

#!/bin/bash

#PBS -N EMAN
#PBS -l nodes=8:ncpus=4
cd $PBS_O_WORKDIR

refine 8 ang=8.5 mask=38 pad=120 hard=25 classkeep=0.8 classiter=8 sym=c1 phasecls median proc=32

Notes on script:
- #!/bin/bash --- use the bash shell
- #PBS -N EMAN -- name the job EMAN (you can name it what you like)
- #PBS -l nodes=8:ncpus=4 --- use 8 nodes on the cluster, with 4 processors per node
- cd $PBS_O_WORKDIR --- needed so that the script runs in the directory you submitted it from, rather than your home directory
- refine 8 ang=8.5 ... proc=32 -- the proc= option means use 32 processors (8 nodes * 4 CPUs per node)
To submit the script, if it is named emanrefine.sh:

   > qsub emanrefine.sh
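
Because proc= has to match nodes times ncpus, one option is to generate the script so that the two cannot drift apart. The sketch below is an illustration only: the function name make_eman_job is hypothetical, and the refine arguments are copied unchanged from the sample script.

```shell
#!/bin/bash
# make_eman_job NODES NCPUS: print an EMAN refinement job script whose
# proc= value always equals NODES * NCPUS.
make_eman_job() {
    local nodes=$1 ncpus=$2
    local procs=$((nodes * ncpus))
    cat <<EOF
#!/bin/bash
#PBS -N EMAN
#PBS -l nodes=${nodes}:ncpus=${ncpus}
cd \$PBS_O_WORKDIR

refine 8 ang=8.5 mask=38 pad=120 hard=25 classkeep=0.8 classiter=8 sym=c1 phasecls median proc=${procs}
EOF
}

# Write the 8-node, 4-CPU version; submit it afterwards with qsub.
make_eman_job 8 4 > emanrefine.sh
```
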

SPIDER

There are two versions of SPIDER on the cluster:
- spidermp is the "normal" version of SPIDER, multiprocessor-aware but unable to work across nodes (i.e. it uses at most 4 CPUs at once)
- spidermpi is the MPI-aware version of SPIDER, which can use the entire cluster
  - only a small subset of commands are MPI-aware and working, so only use this version for the following commands:
    - ap nq
    - ap mq
    - ap sh
    - ap ref -- THIS COMMAND DOES NOT WORK IN MPI MODE, EVEN THOUGH IT SHOULD
Usually, the whole cluster is only used in the refinement process
* A (modified) set of refinement scripts is attached to this page; these can be used on the cluster for refinement according to the SPIDER documentation

pubsub does not work on the cluster, but it can be replaced with qsub, as follows:

Using PBS with SPIDER to replace pubsub:

PBS can be used to speed up loops over processes that are not multiprocessor-aware
Example of a SPIDER script that speeds up a non-linear filter
Note that this script is designed so that each node runs one process. Each process uses up to 6 GB, so the place=scatter directive is used

filter.spi

;========================input
fr l
[tomo]tomo1_crop.spd
;=========output=================
fr l
[filt]filt
;=========params==============
x90=60          ; cycles
x91=0.1         ; delta t
x92=0           ; sigma
;x93=0.05       ; lambda

;=============================
;---------- following set of commands sets the main qsub script for the cluster, one script per loop event
vm
echo "#!/bin/sh" > qsub-header.txt
vm
echo "#PBS -l select=ncpus=1:mem=6GB " >> qsub-header.txt
vm
echo "#PBS -l place=scatter" >> qsub-header.txt
vm
echo "cd /usr/data/asiebert/30_04_08_Rbs/tomo1" >> qsub-header.txt

;-----------------------------------------------

do lb1 x10=1,10                 ; this loop only makes the scripts using a "slave" script as template

  RR x93             ; adjust lambda according to the following sequence
  0.01,0.02,0.05,0.1,0.2,0.5,1,2,5,10                               ;  a nice log sequence

   vm
   echo "x93={**X93}" > filtscript{**x10}.$PRJEXT
   vm
   echo "x10={**X10}" >> filtscript{**x10}.$PRJEXT
   vm
   cat spider_script_filt1_clu_slave.$PRJEXT >> filtscript{**x10}.$PRJEXT
   vm
   cp qsub-header.txt qsub-script{**x10}.txt
   vm
   echo "/usr/apps/spidermp $PRJEXT/$DATEXT @filtscript{**x10}" >> qsub-script{**x10}.txt
   vm
   qsub qsub-script{**x10}.txt
lb1

en      ; end the main script

The script which actually does the work:

spider_script_filt1_clu_slave.spi

;========================input
fr l
[tomo]tomo1_crop.spd
;=========output=================
fr l
[filt]filt
;=========params==============
x90=60          ; cycles
x91=0.1         ; delta t
x92=0           ; sigma
;x93=0.05       ; lambda

;=============================

   ce ad
   [tomo]
   [filt]{***x10}
   HEG
   x90
   x91
   x92,x93
en d

To run the script:
- spider spi/spd @filter
Even better, make a qsub submission script for filter.spi that uses one CPU on one node, and submit it
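
A minimal sketch of such a submission script follows. The file name filter.qsub and the job name FILTER are hypothetical, and the sketch assumes the /usr/apps/spidermp binary that filter.spi itself writes into the scripts it generates.

```shell
#!/bin/bash
# Write a one-node, one-CPU job script that runs filter.spi.
# filter.qsub and the job name FILTER are illustrative choices.
cat > filter.qsub <<'EOF'
#!/bin/sh
#PBS -N FILTER
#PBS -l nodes=1:ncpus=1
cd $PBS_O_WORKDIR

/usr/apps/spidermp spi/spd @filter
EOF

# Submit it from the directory that holds filter.spi:
#   qsub filter.qsub
```

The job only creates and submits the ten filtscript jobs, so one CPU is enough for it.
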

Using the MPI version of spider

In the case of MPI SPIDER, run it using mpirun
The PBS job file should have the following format:

#!/bin/sh
#PBS -l nodes=14:ncpus=4

/opt/hpmpi/bin/mpirun -np 56 -hostfile $PBS_NODEFILE /usr/apps/spider_mpi spi/spd @apsh

Notes on command:
- -np 56 : number of processors = 56. This MUST equal the number of nodes times the number of CPUs per node (here 14 * 4 = 56)
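
A mismatch between -np and the resource request is easy to introduce when editing a job file by hand. The check below is a sketch under stated assumptions: the function name check_np is hypothetical, and it only understands the nodes=N:ncpus=M form used on this page.

```shell
#!/bin/bash
# check_np JOBFILE: verify that the -np value given to mpirun equals
# nodes * ncpus from the "#PBS -l nodes=N:ncpus=M" line of the job file.
check_np() {
    local file=$1 nodes ncpus np
    nodes=$(sed -n 's/^#PBS -l nodes=\([0-9]*\):ncpus=\([0-9]*\).*/\1/p' "$file" | head -n 1)
    ncpus=$(sed -n 's/^#PBS -l nodes=\([0-9]*\):ncpus=\([0-9]*\).*/\2/p' "$file" | head -n 1)
    np=$(sed -n 's/.*mpirun.* -np \([0-9]*\).*/\1/p' "$file" | head -n 1)
    if [ -z "$nodes" ] || [ -z "$ncpus" ] || [ -z "$np" ]; then
        echo "could not find nodes, ncpus and -np in $file" >&2
        return 2
    fi
    if [ "$np" -eq $((nodes * ncpus)) ]; then
        echo "ok: -np $np matches $nodes nodes x $ncpus CPUs"
    else
        echo "mismatch: -np $np but $nodes nodes x $ncpus CPUs = $((nodes * ncpus))"
        return 1
    fi
}

# Typical use before submitting (the file name is only an example):
#   check_np apsh.qsub && qsub apsh.qsub
```
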

Spider Refinement on the cluster

Scripts for SPIDER refinement are provided in the following zip file: cluster_refinement_scripts.zip
After unzipping, edit the file refine_setting.pam
- Change the number of nodes to the number you want. It is currently set for 8 nodes
Follow the regular instructions for SPIDER refinement
To start the refinement, type

      >qsub refine_clu.qsub

Protomo

Protomo will only use one CPU, but you can process multiple tomograms at once using the cluster
Use the qsub submission process, and make a qsub script for each command
Sample refinement script:

#!/bin/bash
#PBS -l nodes=1:ncpus=1
#PBS -N PROTOMO
cd $PBS_O_WORKDIR

cat gr668pt50t.param | sed 's/guess=false/guess=true/' > gr668pt50t-01.param
cp gr668pt50t-ali.tlt gr668pt50t-01-itr.tlt
tomo-refine.sh gr668pt50t-01.param >& refine-01.log
tomo-fit.sh gr668pt50t-01.param
cp gr668pt50t-01.param gr668pt50t-02.param
cp gr668pt50t-01-fitted.tlt gr668pt50t-02-itr.tlt
tomo-refine.sh gr668pt50t-02.param >& refine-02.log
tomo-fit.sh gr668pt50t-02.param
cat gr668pt50t-02.param | sed 's/guess=true/guess=false/' > gr668pt50t-03.param
cp gr668pt50t-02-fitted.tlt gr668pt50t-03-itr.tlt
tomo-refine.sh gr668pt50t-03.param >& refine-03.log
tomo-fit.sh gr668pt50t-03.param
cp gr668pt50t-03.param gr668pt50t-04.param
cp gr668pt50t-03-fitted.tlt gr668pt50t-04-itr.tlt
tomo-refine.sh gr668pt50t-04.param >& refine-04.log
tomo-fit.sh gr668pt50t-04.param
cp gr668pt50t-04.param gr668pt50t-05.param
cp gr668pt50t-04-fitted.tlt gr668pt50t-05-itr.tlt
tomo-refine.sh gr668pt50t-05.param >& refine-05.log
tomo-fit.sh gr668pt50t-05.param
cat gr668pt50t-05.param | sed 's/cormod=xcf/cormod=mcf/' > gr668pt50t-06.param
cp gr668pt50t-05-fitted.tlt gr668pt50t-06-itr.tlt
tomo-refine.sh gr668pt50t-06.param >& refine-06.log
tomo-fit.sh gr668pt50t-06.param
cp gr668pt50t-06.param gr668pt50t-07.param
cp gr668pt50t-06-fitted.tlt gr668pt50t-07-itr.tlt
tomo-refine.sh gr668pt50t-07.param >& refine-07.log
tomo-fit.sh gr668pt50t-07.param
cp gr668pt50t-07.param gr668pt50t-08.param
cp gr668pt50t-07-fitted.tlt gr668pt50t-08-itr.tlt
tomo-refine.sh gr668pt50t-08.param >& refine-08.log
tomo-fit.sh gr668pt50t-08.param
cp gr668pt50t-08.param gr668pt50t-09.param
cp gr668pt50t-08-fitted.tlt gr668pt50t-09-itr.tlt
tomo-refine.sh gr668pt50t-09.param >& refine-09.log
tomo-fit.sh gr668pt50t-09.param
cp gr668pt50t-09.param gr668pt50t-10.param
cp gr668pt50t-09-fitted.tlt gr668pt50t-10-itr.tlt
tomo-refine.sh gr668pt50t-10.param >& refine-10.log
tomo-fit.sh gr668pt50t-10.param
cat gr668pt50t-10.param | sed 's/cormod=mcf/cormod=pcf/' > gr668pt50t-11.param
cp gr668pt50t-10-fitted.tlt gr668pt50t-11-itr.tlt
tomo-refine.sh gr668pt50t-11.param >& refine-11.log
tomo-fit.sh gr668pt50t-11.param
cp gr668pt50t-11.param gr668pt50t-12.param
cp gr668pt50t-11-fitted.tlt gr668pt50t-12-itr.tlt
tomo-refine.sh gr668pt50t-12.param >& refine-12.log
tomo-fit.sh gr668pt50t-12.param
cp gr668pt50t-12.param gr668pt50t-13.param
cp gr668pt50t-12-fitted.tlt gr668pt50t-13-itr.tlt
tomo-refine.sh gr668pt50t-13.param >& refine-13.log
tomo-fit.sh gr668pt50t-13.param
cp gr668pt50t-13.param gr668pt50t-14.param
cp gr668pt50t-13-fitted.tlt gr668pt50t-14-itr.tlt
tomo-refine.sh gr668pt50t-14.param >& refine-14.log
tomo-fit.sh gr668pt50t-14.param
cp gr668pt50t-14.param gr668pt50t-15.param
cp gr668pt50t-14-fitted.tlt gr668pt50t-15-itr.tlt
tomo-refine.sh gr668pt50t-15.param >& refine-15.log
tomo-fit.sh gr668pt50t-15.param
cp gr668pt50t-15.param gr668pt50t-16.param
cp gr668pt50t-15-fitted.tlt gr668pt50t-16-itr.tlt
tomo-refine.sh gr668pt50t-16.param >& refine-16.log
tomo-fit.sh gr668pt50t-16.param

Notes on script:
- #PBS -l nodes=1:ncpus=1 --- Only use one CPU on one node
- #PBS -N PROTOMO --- name the job PROTOMO
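
The 16 iterations in the sample script follow a fixed pattern, so the command list can also be generated. The sketch below is a dry run that only prints the commands. The function name protomo_commands is hypothetical, the iterations at which a parameter changes (1, 3, 6 and 11) are copied from the sample script, and `sed ... file` stands in for the equivalent `cat file | sed ...`.

```shell
#!/bin/bash
# protomo_commands BASE LAST: print the Protomo commands for iterations
# 01..LAST without running them.
protomo_commands() {
    local base=$1 last=$2 i cur prev
    for ((i = 1; i <= last; i++)); do
        cur=$(printf '%02d' "$i")
        prev=$(printf '%02d' "$((i - 1))")
        # Parameter file: edited at iterations 1, 3, 6 and 11, copied otherwise.
        case $i in
            1)  echo "sed 's/guess=false/guess=true/' $base.param > $base-$cur.param" ;;
            3)  echo "sed 's/guess=true/guess=false/' $base-$prev.param > $base-$cur.param" ;;
            6)  echo "sed 's/cormod=xcf/cormod=mcf/' $base-$prev.param > $base-$cur.param" ;;
            11) echo "sed 's/cormod=mcf/cormod=pcf/' $base-$prev.param > $base-$cur.param" ;;
            *)  echo "cp $base-$prev.param $base-$cur.param" ;;
        esac
        # Tilt file: start from the aligned file, then from the previous fit.
        if [ "$i" -eq 1 ]; then
            echo "cp $base-ali.tlt $base-$cur-itr.tlt"
        else
            echo "cp $base-$prev-fitted.tlt $base-$cur-itr.tlt"
        fi
        echo "tomo-refine.sh $base-$cur.param >& refine-$cur.log"
        echo "tomo-fit.sh $base-$cur.param"
    done
}

protomo_commands gr668pt50t 16
```

Redirecting the printed commands into a file, below the #PBS header lines, gives a job script for another tomogram by changing only the base name.
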
Sample script for calculating a tomogram:

#! /bin/sh
#PBS -l nodes=1:ncpus=1

cd $PBS_O_WORKDIR
tomo-map.sh gr668pt50t-02.param gr668pt50t-02-fitted.tlt

cd $PBS_O_WORKDIR --- required to change to the working directory

Limiting memory on a job

To limit total memory used per node:

#PBS -l nodes=13:ncpus=3,mem=6gb

-- BillRice - 15 Aug 2008

* xpbsmon screenshot1

* cluster_refinement_scripts.zip: spider refinement scripts for cluster


Revision 17 - 21 Aug 2008 - Main.BillRice


Revision 16 - 18 Aug 2008 - Main.BillRice


Revision 15 - 18 Aug 2008 - Main.BillRice

META TOPICPARENT	name="CemIT"

Contents

General guidelines

home directory has 20 GB space allocated to all of CEM computing
/usr/data has 100 GB of space allocated to all of CEM computing
both disks operate at the same speed
In general, do most of your processing on /usr/data, since there is more space available
Please remove data after processing so that others will have space to work
The cluster consists of 16 nodes with 4 CPUs per node. Nodes are numbered 0-15
To use the cluster properly, jobs should be submitted through a batch submission system
In general, do not run jobs on the head node (node 0)

Batch submission system

We have PBS (commercial version) for handling of jobs
mpirun is also installed for MPI-aware applications (Spider is the only MPI-aware EM software)
In general, pbs submission scripts should start with the following 3 lines:

#!/bin/bash
#PBS -l nodes=4:ncpus=4 
cd $PBS_O_WORKDIR

If the script is called samplescript.sh, then the following unix command will submit it to the cluster:

> qsub samplescript.sh

To monitor cluster use, use the qstat command:

> qstat
Job id            Name             User              Time Use S Queue
----------------  ---------------- ----------------  -------- - -----
75291.node0       PROTOMO          asiebert          07:00:12 R workq
75292.node0       PROTOMO          asiebert          06:51:08 R workq
75294.node0       PROTOMO          asiebert          06:28:27 R workq
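
For a quick per-user summary, the default qstat listing can be piped through awk. This is a minimal sketch, assuming the column layout shown above (job id, name, user, time, state, queue); count_running is a hypothetical helper name, not part of PBS.

```shell
# count_running: read default qstat output on stdin and print
# "user count" for every user who has jobs in state R (running).
# Assumes two header lines and the state in column 5, as in the listing above.
count_running() {
    awk 'NR > 2 && $5 == "R" { n[$3]++ } END { for (u in n) print u, n[u] }' | sort
}
```

Typical use would be: qstat | count_running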

Another way is to run xpbsmon, which will give a graphical view of cluster use.

* Click on a node to get a detailed view of how busy it is

EMAN

EMAN will work with the batch submission software (PBS) installed on the cluster
Need to make a batch submission script for cluster use
sample refinement script:

#!/bin/bash

#PBS -N EMAN
#PBS -l nodes=8:ncpus=4
cd $PBS_O_WORKDIR

refine 8 ang=8.5 mask=38 pad=120 hard=25 classkeep=0.8 classiter=8 sym=c1 phasecls median proc=32

Notes on script:
- #!/bin/bash --- use the bash shell
- #PBS -N EMAN -- name the job EMAN (you can name it what you like)
- #PBS -l nodes=8:ncpus=4 --- Use 8 nodes on the cluster, with 4 processors per node
- cd $PBS_O_WORKDIR --- This is needed so that script will be run in the current directory, rather than your home directory
- refine 8 ang=8.5 ... proc=32 -- proc= command means use 32 processors (8 nodes * 4 CPU per node)
To submit the script, if it is named emanrefine.sh

   > qsub emanrefine.sh
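
Because proc= has to match nodes times CPUs per node, one way to avoid a mismatch is to generate the script from those two numbers. The sketch below only prints the job script; make_eman_job is a hypothetical helper, and the refine options are simply copied from the sample above.

```shell
# make_eman_job NODES NCPUS: print an EMAN PBS script whose proc= value
# is derived from the resource request, so the two cannot drift apart.
make_eman_job() {
    nodes=$1
    ncpus=$2
    proc=$((nodes * ncpus))   # total processors = nodes * CPUs per node
    cat <<EOF
#!/bin/bash
#PBS -N EMAN
#PBS -l nodes=${nodes}:ncpus=${ncpus}
cd \$PBS_O_WORKDIR

refine 8 ang=8.5 mask=38 pad=120 hard=25 classkeep=0.8 classiter=8 sym=c1 phasecls median proc=${proc}
EOF
}
```

For example, make_eman_job 8 4 > emanrefine.sh writes a script equivalent to the sample, which can then be submitted with qsub.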

SPIDER

There are two versions of spider on the cluster:
- spidermp is the "normal" version of spider: multiprocessor-aware, but it cannot work across nodes (i.e. it can use at most 4 CPUs at once)
- spidermpi is the MPI-aware version of spider and can use the entire cluster
  - only a small subset of commands is MPI-aware and working, so use this version only for the following commands:
    - ap nq
    - ap mq
    - ap sh
    - ap ref -- THIS COMMAND DOES NOT WORK IN MPI MODE, EVEN THOUGH IT SHOULD
Usually, the whole cluster is only used in the refinement process
* A (modified) set of refinement scripts is attached to this page; these can be used on the cluster for refinement according to the spider documentation

pubsub does not work on the cluster, but it can be replaced with qsub, as follows:

Using PBS with SPIDER to replace pubsub:

Can use pbs to speed up loops on non-mp aware processes
example of a spider script to speed up a non-linear filter
Note that this script is designed to run such that each node runs one process. This process uses up to 6 GB, so the place=scatter command is used

filter.spi

;========================input
fr l
[tomo]tomo1_crop.spd
;=========output=================
fr l
[filt]filt
;=========params==============
x90=60          ; cycles
x91=0.1         ; delta t
x92=0           ; sigma
;x93=0.05       ; lambda

;=============================
;---------- following set of commands sets the main qsub script for the cluster, one script per loop event
vm
echo "#!/bin/sh" > qsub-header.txt
vm
echo "#PBS -l select=ncpus=1:mem=6GB " >> qsub-header.txt
vm
echo "#PBS -l place=scatter" >> qsub-header.txt
vm
echo "cd /usr/data/asiebert/30_04_08_Rbs/tomo1" >> qsub-header.txt

;-----------------------------------------------

do lb1 x10=1,10                 ; this loop only makes the scripts using a "slave" script as template

  RR x93             ; adjust lambda according to the following sequence
  0.01,0.02,0.05,0.1,0.2,0.5,1,2,5,10                               ;  a nice log sequence

   vm
   echo "x93={**X93}" > filtscript{**x10}.$PRJEXT
   vm
   echo "x10={**X10}" >> filtscript{**x10}.$PRJEXT
   vm
   cat spider_script_filt1_clu_slave.$PRJEXT >> filtscript{**x10}.$PRJEXT
   vm
   cp qsub-header.txt qsub-script{**x10}.txt
   vm
   echo "/usr/apps/spidermp $PRJEXT/$DATEXT @filtscript{**x10}" >> qsub-script{**x10}.txt
   vm
   qsub qsub-script{**x10}.txt
lb1

en      ; end the main script

The script which actually does the work:

spider_script_filt1_clu_slave.spi

;========================input
fr l
[tomo]tomo1_crop.spd
;=========output=================
fr l
[filt]filt
;=========params==============
x90=60          ; cycles
x91=0.1         ; delta t
x92=0           ; sigma
;x93=0.05       ; lambda

;=============================

   ce ad
   [tomo]
   [filt]{***x10}
   HEG
   x90
   x91
   x92,x93
en d

To run the script:
- spider spi/spd @filter
even better, make a qsub submission script for filter.spi, use one CPU and one node, and submit it
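
The same fan-out can be sketched directly in shell, which may be easier to debug than the vm calls inside filter.spi. This is an illustration only: make_filter_jobs is a hypothetical helper, the .spi extension and spi/spd are assumed from the run command above, and the function writes the files without calling qsub.

```shell
# make_filter_jobs WORKDIR OUTDIR SLAVE
# Writes one SPIDER script and one PBS script per lambda value into OUTDIR.
# WORKDIR is the directory each job changes into; SLAVE is the template
# script that does the real work (spider_script_filt1_clu_slave.spi above).
make_filter_jobs() {
    workdir=$1
    outdir=$2
    slave=$3
    n=1
    for lambda in 0.01 0.02 0.05 0.1 0.2 0.5 1 2 5 10; do
        id=$(printf '%02d' "$n")
        # Per-job SPIDER script: set lambda and the loop index, then append the template.
        {
            echo "x93=$lambda"
            echo "x10=$n"
            cat "$slave"
        } > "$outdir/filtscript$id.spi"
        # Per-job PBS script: one CPU, 6 GB, one job per node.
        {
            echo "#!/bin/sh"
            echo "#PBS -l select=ncpus=1:mem=6GB"
            echo "#PBS -l place=scatter"
            echo "cd $workdir"
            echo "/usr/apps/spidermp spi/spd @filtscript$id"
        } > "$outdir/qsub-script$id.txt"
        n=$((n + 1))
    done
}
```

After checking the generated files, each qsub-scriptNN.txt would be submitted with qsub as in the SPIDER version.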

Using the MPI version of spider

In the case of mpi spider, run it using mpirun
The PBS job file should be in the format -

#!/bin/sh
#PBS -l nodes=14:ncpus=4

/opt/hpmpi/bin/mpirun -np 56 -hostfile $PBS_NODEFILE /usr/apps/spider_mpi spi/spd @apsh

Notes on command
- -np 56 : means number of processors = 56. MUST be equal to number of nodes times number of CPU's
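
The -np arithmetic can be derived rather than typed by hand. A minimal sketch, assuming the paths from the job file above; mpirun_cmd is a hypothetical helper that only prints the command line.

```shell
# mpirun_cmd NODES NCPUS SCRIPT: print the mpirun line with -np computed
# as NODES * NCPUS, matching the "#PBS -l nodes=N:ncpus=M" request.
mpirun_cmd() {
    nodes=$1
    ncpus=$2
    script=$3
    np=$((nodes * ncpus))
    echo "/opt/hpmpi/bin/mpirun -np $np -hostfile \$PBS_NODEFILE /usr/apps/spider_mpi spi/spd @$script"
}
```

With nodes=14 and ncpus=4 this prints the same -np 56 command as in the job file above.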

Protomo

Protomo software will only use one CPU, but you can process multiple tomograms at once using the cluster
Use the qsub submission process, and make a qsub script for each command
sample refinement script:

#! /bin/sh
#PBS -l nodes=1:ncpus=1
#PBS -N PROTOMO
cd $PBS_O_WORKDIR

cat gr668pt50t.param | sed 's/guess=false/guess=true/' > gr668pt50t-01.param
cp gr668pt50t-ali.tlt gr668pt50t-01-itr.tlt
tomo-refine.sh gr668pt50t-01.param >& refine-01.log
tomo-fit.sh gr668pt50t-01.param
cp gr668pt50t-01.param gr668pt50t-02.param
cp gr668pt50t-01-fitted.tlt gr668pt50t-02-itr.tlt
tomo-refine.sh gr668pt50t-02.param >& refine-02.log
tomo-fit.sh gr668pt50t-02.param
cat gr668pt50t-02.param | sed 's/guess=true/guess=false/' > gr668pt50t-03.param
cp gr668pt50t-02-fitted.tlt gr668pt50t-03-itr.tlt
tomo-refine.sh gr668pt50t-03.param >& refine-03.log
tomo-fit.sh gr668pt50t-03.param
cp gr668pt50t-03.param gr668pt50t-04.param
cp gr668pt50t-03-fitted.tlt gr668pt50t-04-itr.tlt
tomo-refine.sh gr668pt50t-04.param >& refine-04.log
tomo-fit.sh gr668pt50t-04.param
cp gr668pt50t-04.param gr668pt50t-05.param
cp gr668pt50t-04-fitted.tlt gr668pt50t-05-itr.tlt
tomo-refine.sh gr668pt50t-05.param >& refine-05.log
tomo-fit.sh gr668pt50t-05.param
cat gr668pt50t-05.param | sed 's/cormod=xcf/cormod=mcf/' > gr668pt50t-06.param
cp gr668pt50t-05-fitted.tlt gr668pt50t-06-itr.tlt
tomo-refine.sh gr668pt50t-06.param >& refine-06.log
tomo-fit.sh gr668pt50t-06.param
cp gr668pt50t-06.param gr668pt50t-07.param
cp gr668pt50t-06-fitted.tlt gr668pt50t-07-itr.tlt
tomo-refine.sh gr668pt50t-07.param >& refine-07.log
tomo-fit.sh gr668pt50t-07.param
cp gr668pt50t-07.param gr668pt50t-08.param
cp gr668pt50t-07-fitted.tlt gr668pt50t-08-itr.tlt
tomo-refine.sh gr668pt50t-08.param >& refine-08.log
tomo-fit.sh gr668pt50t-08.param
cp gr668pt50t-08.param gr668pt50t-09.param
cp gr668pt50t-08-fitted.tlt gr668pt50t-09-itr.tlt
tomo-refine.sh gr668pt50t-09.param >& refine-09.log
tomo-fit.sh gr668pt50t-09.param
cp gr668pt50t-09.param gr668pt50t-10.param
cp gr668pt50t-09-fitted.tlt gr668pt50t-10-itr.tlt
tomo-refine.sh gr668pt50t-10.param >& refine-10.log
tomo-fit.sh gr668pt50t-10.param
cat gr668pt50t-10.param | sed 's/cormod=mcf/cormod=pcf/' > gr668pt50t-11.param
cp gr668pt50t-10-fitted.tlt gr668pt50t-11-itr.tlt
tomo-refine.sh gr668pt50t-11.param >& refine-11.log
tomo-fit.sh gr668pt50t-11.param
cp gr668pt50t-11.param gr668pt50t-12.param
cp gr668pt50t-11-fitted.tlt gr668pt50t-12-itr.tlt
tomo-refine.sh gr668pt50t-12.param >& refine-12.log
tomo-fit.sh gr668pt50t-12.param
cp gr668pt50t-12.param gr668pt50t-13.param
cp gr668pt50t-12-fitted.tlt gr668pt50t-13-itr.tlt
tomo-refine.sh gr668pt50t-13.param >& refine-13.log
tomo-fit.sh gr668pt50t-13.param
cp gr668pt50t-13.param gr668pt50t-14.param
cp gr668pt50t-13-fitted.tlt gr668pt50t-14-itr.tlt
tomo-refine.sh gr668pt50t-14.param >& refine-14.log
tomo-fit.sh gr668pt50t-14.param
cp gr668pt50t-14.param gr668pt50t-15.param
cp gr668pt50t-14-fitted.tlt gr668pt50t-15-itr.tlt
tomo-refine.sh gr668pt50t-15.param >& refine-15.log
tomo-fit.sh gr668pt50t-15.param
cp gr668pt50t-15.param gr668pt50t-16.param
cp gr668pt50t-15-fitted.tlt gr668pt50t-16-itr.tlt
tomo-refine.sh gr668pt50t-16.param >& refine-16.log
tomo-fit.sh gr668pt50t-16.param

Notes on script:
- #PBS -l nodes=1:ncpus=1 --- Only use one CPU on one node
- #PBS -N PROTOMO --- name the job PROTOMO
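
The 16 rounds in the sample script follow a fixed pattern, with parameter changes only at rounds 1, 3, 6 and 11, so they could be produced by a loop. The sketch below is a dry run that prints equivalent commands; gen_protomo_rounds is a hypothetical helper, and it writes the log redirection as "> file 2>&1", the portable spelling of ">& file".

```shell
# gen_protomo_rounds BASE ROUNDS: print the refinement commands for
# rounds 1..ROUNDS of dataset BASE (e.g. gr668pt50t). Nothing is executed.
gen_protomo_rounds() {
    base=$1
    rounds=$2
    i=1
    while [ "$i" -le "$rounds" ]; do
        cur=$(printf '%02d' "$i")
        prev=$(printf '%02d' $((i - 1)))
        # Rounds that change a parameter in the sample script; all others copy the previous file.
        case $i in
            1)  echo "sed 's/guess=false/guess=true/' ${base}.param > ${base}-${cur}.param" ;;
            3)  echo "sed 's/guess=true/guess=false/' ${base}-${prev}.param > ${base}-${cur}.param" ;;
            6)  echo "sed 's/cormod=xcf/cormod=mcf/' ${base}-${prev}.param > ${base}-${cur}.param" ;;
            11) echo "sed 's/cormod=mcf/cormod=pcf/' ${base}-${prev}.param > ${base}-${cur}.param" ;;
            *)  echo "cp ${base}-${prev}.param ${base}-${cur}.param" ;;
        esac
        # Round 1 starts from the initial alignment, later rounds from the previous fit.
        if [ "$i" -eq 1 ]; then
            echo "cp ${base}-ali.tlt ${base}-${cur}-itr.tlt"
        else
            echo "cp ${base}-${prev}-fitted.tlt ${base}-${cur}-itr.tlt"
        fi
        echo "tomo-refine.sh ${base}-${cur}.param > refine-${cur}.log 2>&1"
        echo "tomo-fit.sh ${base}-${cur}.param"
        i=$((i + 1))
    done
}
```

The output can be appended to a job file that already has the PBS header, for example gen_protomo_rounds gr668pt50t 16 >> refine-job.sh.
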
Sample script for calculating tomogram:

#! /bin/sh
#PBS -l nodes=1:ncpus=1

cd $PBS_O_WORKDIR
tomo-map.sh gr668pt50t-02.param gr668pt50t-02-fitted.tlt

cd $PBS_O_WORKDIR --- required to change to the working directory
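
Since each Protomo job uses a single CPU, several tomograms can be mapped at once by writing one such script per dataset. A minimal sketch, assuming the file naming used above; make_map_job is a hypothetical helper and any dataset prefix other than gr668pt50t is a placeholder.

```shell
# make_map_job PREFIX ROUND: print a one-CPU PBS script that computes the
# tomogram for dataset PREFIX from the parameters of refinement round ROUND.
make_map_job() {
    prefix=$1
    round=$2
    cat <<EOF
#!/bin/sh
#PBS -l nodes=1:ncpus=1
#PBS -N PROTOMO
cd \$PBS_O_WORKDIR
tomo-map.sh ${prefix}-${round}.param ${prefix}-${round}-fitted.tlt
EOF
}
```

For example, make_map_job gr668pt50t 02 > map-gr668pt50t.sh reproduces the sample script; repeating this for each dataset and submitting each file with qsub lets the jobs run side by side.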

Set ALLOWTOPICVIEW =

-- BillRice - 15 Aug 2008

* xpbsmon screenshot1:

Added:

>
>

* cluster_refinement_scripts.zip: spider refinement scripts for cluster

META FILEATTACHMENT	attr="" autoattached="1" comment="xpbsmon screenshot1" date="1218836224" name="xpbsmon1.png" path="xpbsmon1.png" size="10985" user="Main.BillRice" version=""
META FILEATTACHMENT	attr="" autoattached="1" comment="xpbsmon screenshot2" date="1218836257" name="xpbsmon2.png" path="xpbsmon2.png" size="6977" user="Main.BillRice" version=""

Added:

>
>

META FILEATTACHMENT	attachment="cluster_refinement_scripts.zip" attr="" comment="spider refinement scripts for cluster" date="1219084946" name="cluster_refinement_scripts.zip" path="cluster_refinement_scripts.zip" size="46308" stream="cluster_refinement_scripts.zip" user="Main.BillRice" version="0"

Revision 14 - 16 Aug 2008 - Main.BillRice

META TOPICPARENT	name="CemIT"

Contents

General guidelines

home directory has 20 GB space allocated to all of CEM computing
/usr/data has 100 GB of space allocated to all of CEM computing
both disks operate at the same speed
In general, do most of your processing on /usr/data, since there is more space available
Please remove data after processing so that others will have space to work

Added:

>
>

The cluster consists of 16 nodes with 4 CPUs per node. Nodes are numbered 0-15
To use the cluster properly, jobs should be submitted through a batch submission system
In general, do not run jobs on the head node (node 0)

Batch submission system

We have PBS (commercial version) for handling of jobs
mpirun is also installed for MPI-aware applications (Spider is the only MPI-aware EM software)
In general, pbs submission scripts should start with the following 3 lines:

#!/bin/bash
#PBS -l nodes=4:ncpus=4 
cd $PBS_O_WORKDIR

If the script is called samplescript.sh, then the following unix command will submit it to the cluster:

> qsub samplescript.sh

To monitor cluster use, use the qstat command:

> qstat
Job id            Name             User              Time Use S Queue
----------------  ---------------- ----------------  -------- - -----
75291.node0       PROTOMO          asiebert          07:00:12 R workq
75292.node0       PROTOMO          asiebert          06:51:08 R workq
75294.node0       PROTOMO          asiebert          06:28:27 R workq

Another way is to run xpbsmon, which will give a graphical view of cluster use.

* Click on a node to get a detailed view of how busy it is

EMAN

EMAN will work with the batch submission software (PBS) installed on the cluster
Need to make a batch submission script for cluster use
sample refinement script:

#!/bin/bash

#PBS -N EMAN
#PBS -l nodes=8:ncpus=4
cd $PBS_O_WORKDIR

refine 8 ang=8.5 mask=38 pad=120 hard=25 classkeep=0.8 classiter=8 sym=c1 phasecls median proc=32

Notes on script:
- #!/bin/bash --- use the bash shell
- #PBS -N EMAN -- name the job EMAN (you can name it what you like)
- #PBS -l nodes=8:ncpus=4 --- Use 8 nodes on the cluster, with 4 processors per node
- cd $PBS_O_WORKDIR --- This is needed so that script will be run in the current directory, rather than your home directory
- refine 8 ang=8.5 ... proc=32 -- proc= command means use 32 processors (8 nodes * 4 CPU per node)
To submit the script, if it is named emanrefine.sh

   > qsub emanrefine.sh

SPIDER

There are two versions of spider on the cluster:
- spidermp is the "normal" version of spider: multiprocessor-aware, but it cannot work across nodes (i.e. it can use at most 4 CPUs at once)
- spidermpi is the MPI-aware version of spider and can use the entire cluster
  - only a small subset of commands is MPI-aware and working, so use this version only for the following commands:
    - ap nq
    - ap mq
    - ap sh
    - ap ref -- THIS COMMAND DOES NOT WORK IN MPI MODE, EVEN THOUGH IT SHOULD
Usually, the whole cluster is only used in the refinement process
* A (modified) set of refinement scripts is attached to this page; these can be used on the cluster for refinement according to the spider documentation

pubsub does not work on the cluster, but it can be replaced with qsub, as follows:

Using PBS with SPIDER to replace pubsub:

Can use pbs to speed up loops on non-mp aware processes
example of a spider script to speed up a non-linear filter
Note that this script is designed to run such that each node runs one process. This process uses up to 6 GB, so the place=scatter command is used

filter.spi

;========================input
fr l
[tomo]tomo1_crop.spd
;=========output=================
fr l
[filt]filt
;=========params==============
x90=60          ; cycles
x91=0.1         ; delta t
x92=0           ; sigma
;x93=0.05       ; lambda

;=============================
;---------- following set of commands sets the main qsub script for the cluster, one script per loop event
vm
echo "#!/bin/sh" > qsub-header.txt
vm
echo "#PBS -l select=ncpus=1:mem=6GB " >> qsub-header.txt
vm
echo "#PBS -l place=scatter" >> qsub-header.txt
vm
echo "cd /usr/data/asiebert/30_04_08_Rbs/tomo1" >> qsub-header.txt

;-----------------------------------------------

do lb1 x10=1,10                 ; this loop only makes the scripts using a "slave" script as template

  RR x93             ; adjust lambda according to the following sequence
  0.01,0.02,0.05,0.1,0.2,0.5,1,2,5,10                               ;  a nice log sequence

   vm
   echo "x93={**X93}" > filtscript{**x10}.$PRJEXT
   vm
   echo "x10={**X10}" >> filtscript{**x10}.$PRJEXT
   vm
   cat spider_script_filt1_clu_slave.$PRJEXT >> filtscript{**x10}.$PRJEXT
   vm
   cp qsub-header.txt qsub-script{**x10}.txt
   vm
   echo "/usr/apps/spidermp $PRJEXT/$DATEXT @filtscript{**x10}" >> qsub-script{**x10}.txt
   vm
   qsub qsub-script{**x10}.txt
lb1

en      ; end the main script

The script which actually does the work:

spider_script_filt1_clu_slave.spi

;========================input
fr l
[tomo]tomo1_crop.spd
;=========output=================
fr l
[filt]filt
;=========params==============
x90=60          ; cycles
x91=0.1         ; delta t
x92=0           ; sigma
;x93=0.05       ; lambda

;=============================

   ce ad
   [tomo]
   [filt]{***x10}
   HEG
   x90
   x91
   x92,x93
en d

To run the script:
- spider spi/spd @filter
even better, make a qsub submission script for filter.spi, use one CPU and one node, and submit it

Using the MPI version of spider

In the case of mpi spider, run it using mpirun
The PBS job file should be in the format -

#!/bin/sh
#PBS -l nodes=14:ncpus=4

/opt/hpmpi/bin/mpirun -np 56 -hostfile $PBS_NODEFILE /usr/apps/spider_mpi spi/spd @apsh

Notes on command
- -np 56 : means number of processors = 56. MUST be equal to number of nodes times number of CPU's

Protomo

Protomo software will only use one CPU, but you can process multiple tomograms at once using the cluster
Use the qsub submission process, and make a qsub script for each command
sample refinement script:

#! /bin/sh
#PBS -l nodes=1:ncpus=1
#PBS -N PROTOMO
cd $PBS_O_WORKDIR

cat gr668pt50t.param | sed 's/guess=false/guess=true/' > gr668pt50t-01.param
cp gr668pt50t-ali.tlt gr668pt50t-01-itr.tlt
tomo-refine.sh gr668pt50t-01.param >& refine-01.log
tomo-fit.sh gr668pt50t-01.param
cp gr668pt50t-01.param gr668pt50t-02.param
cp gr668pt50t-01-fitted.tlt gr668pt50t-02-itr.tlt
tomo-refine.sh gr668pt50t-02.param >& refine-02.log
tomo-fit.sh gr668pt50t-02.param
cat gr668pt50t-02.param | sed 's/guess=true/guess=false/' > gr668pt50t-03.param
cp gr668pt50t-02-fitted.tlt gr668pt50t-03-itr.tlt
tomo-refine.sh gr668pt50t-03.param >& refine-03.log
tomo-fit.sh gr668pt50t-03.param
cp gr668pt50t-03.param gr668pt50t-04.param
cp gr668pt50t-03-fitted.tlt gr668pt50t-04-itr.tlt
tomo-refine.sh gr668pt50t-04.param >& refine-04.log
tomo-fit.sh gr668pt50t-04.param
cp gr668pt50t-04.param gr668pt50t-05.param
cp gr668pt50t-04-fitted.tlt gr668pt50t-05-itr.tlt
tomo-refine.sh gr668pt50t-05.param >& refine-05.log
tomo-fit.sh gr668pt50t-05.param
cat gr668pt50t-05.param | sed 's/cormod=xcf/cormod=mcf/' > gr668pt50t-06.param
cp gr668pt50t-05-fitted.tlt gr668pt50t-06-itr.tlt
tomo-refine.sh gr668pt50t-06.param >& refine-06.log
tomo-fit.sh gr668pt50t-06.param
cp gr668pt50t-06.param gr668pt50t-07.param
cp gr668pt50t-06-fitted.tlt gr668pt50t-07-itr.tlt
tomo-refine.sh gr668pt50t-07.param >& refine-07.log
tomo-fit.sh gr668pt50t-07.param
cp gr668pt50t-07.param gr668pt50t-08.param
cp gr668pt50t-07-fitted.tlt gr668pt50t-08-itr.tlt
tomo-refine.sh gr668pt50t-08.param >& refine-08.log
tomo-fit.sh gr668pt50t-08.param
cp gr668pt50t-08.param gr668pt50t-09.param
cp gr668pt50t-08-fitted.tlt gr668pt50t-09-itr.tlt
tomo-refine.sh gr668pt50t-09.param >& refine-09.log
tomo-fit.sh gr668pt50t-09.param
cp gr668pt50t-09.param gr668pt50t-10.param
cp gr668pt50t-09-fitted.tlt gr668pt50t-10-itr.tlt
tomo-refine.sh gr668pt50t-10.param >& refine-10.log
tomo-fit.sh gr668pt50t-10.param
cat gr668pt50t-10.param | sed 's/cormod=mcf/cormod=pcf/' > gr668pt50t-11.param
cp gr668pt50t-10-fitted.tlt gr668pt50t-11-itr.tlt
tomo-refine.sh gr668pt50t-11.param >& refine-11.log
tomo-fit.sh gr668pt50t-11.param
cp gr668pt50t-11.param gr668pt50t-12.param
cp gr668pt50t-11-fitted.tlt gr668pt50t-12-itr.tlt
tomo-refine.sh gr668pt50t-12.param >& refine-12.log
tomo-fit.sh gr668pt50t-12.param
cp gr668pt50t-12.param gr668pt50t-13.param
cp gr668pt50t-12-fitted.tlt gr668pt50t-13-itr.tlt
tomo-refine.sh gr668pt50t-13.param >& refine-13.log
tomo-fit.sh gr668pt50t-13.param
cp gr668pt50t-13.param gr668pt50t-14.param
cp gr668pt50t-13-fitted.tlt gr668pt50t-14-itr.tlt
tomo-refine.sh gr668pt50t-14.param >& refine-14.log
tomo-fit.sh gr668pt50t-14.param
cp gr668pt50t-14.param gr668pt50t-15.param
cp gr668pt50t-14-fitted.tlt gr668pt50t-15-itr.tlt
tomo-refine.sh gr668pt50t-15.param >& refine-15.log
tomo-fit.sh gr668pt50t-15.param
cp gr668pt50t-15.param gr668pt50t-16.param
cp gr668pt50t-15-fitted.tlt gr668pt50t-16-itr.tlt
tomo-refine.sh gr668pt50t-16.param >& refine-16.log
tomo-fit.sh gr668pt50t-16.param

Notes on script:
- #PBS -l nodes=1:ncpus=1 --- Only use one CPU on one node
- #PBS -N PROTOMO --- name the job PROTOMO
Sample script for calculating tomogram:

#! /bin/sh
#PBS -l nodes=1:ncpus=1

cd $PBS_O_WORKDIR
tomo-map.sh gr668pt50t-02.param gr668pt50t-02-fitted.tlt

cd $PBS_O_WORKDIR --- required to change to the working directory

Set ALLOWTOPICVIEW =

-- BillRice - 15 Aug 2008

* xpbsmon screenshot1:

META FILEATTACHMENT	attr="" autoattached="1" comment="xpbsmon screenshot1" date="1218836224" name="xpbsmon1.png" path="xpbsmon1.png" size="10985" user="Main.BillRice" version=""
META FILEATTACHMENT	attr="" autoattached="1" comment="xpbsmon screenshot2" date="1218836257" name="xpbsmon2.png" path="xpbsmon2.png" size="6977" user="Main.BillRice" version=""

Revision 13 - 15 Aug 2008 - Main.BillRice

META TOPICPARENT	name="CemIT"

Contents

General guidelines

home directory has 20 GB space allocated to all of CEM computing
/usr/data has 100 GB of space allocated to all of CEM computing
both disks operate at the same speed
In general, do most of your processing on /usr/data, since there is more space available
Please remove data after processing so that others will have space to work

Batch submission system

We have PBS (commercial version) for handling of jobs
mpirun is also installed for MPI-aware applications (Spider is the only MPI-aware EM software)
In general, pbs submission scripts should start with the following 3 lines:

#!/bin/bash
#PBS -l nodes=4:ncpus=4 
cd $PBS_O_WORKDIR

If the script is called samplescript.sh, then the following unix command will submit it to the cluster:

> qsub samplescript.sh

To monitor cluster use, use the qstat command:

> qstat
Job id            Name             User              Time Use S Queue
----------------  ---------------- ----------------  -------- - -----
75291.node0       PROTOMO          asiebert          07:00:12 R workq
75292.node0       PROTOMO          asiebert          06:51:08 R workq
75294.node0       PROTOMO          asiebert          06:28:27 R workq

Another way is to run xpbsmon, which will give a graphical view of cluster use.

* Click on a node to get a detailed view of how busy it is

EMAN

EMAN will work with the batch submission software (PBS) installed on the cluster
Need to make a batch submission script for cluster use
sample refinement script:

#!/bin/bash

#PBS -N EMAN
#PBS -l nodes=8:ncpus=4
cd $PBS_O_WORKDIR

refine 8 ang=8.5 mask=38 pad=120 hard=25 classkeep=0.8 classiter=8 sym=c1 phasecls median proc=32

Notes on script:
- #!/bin/bash --- use the bash shell
- #PBS -N EMAN -- name the job EMAN (you can name it what you like)
- #PBS -l nodes=8:ncpus=4 --- Use 8 nodes on the cluster, with 4 processors per node
- cd $PBS_O_WORKDIR --- This is needed so that script will be run in the current directory, rather than your home directory
- refine 8 ang=8.5 ... proc=32 -- proc= command means use 32 processors (8 nodes * 4 CPU per node)
To submit the script, if it is named emanrefine.sh

   > qsub emanrefine.sh

SPIDER

There are two versions of spider on the cluster:
- spidermp is the "normal" version of spider: multiprocessor-aware, but it cannot work across nodes (i.e. it can use at most 4 CPUs at once)
- spidermpi is the MPI-aware version of spider and can use the entire cluster
  - only a small subset of commands is MPI-aware and working, so use this version only for the following commands:
    - ap nq
    - ap mq
    - ap sh
    - ap ref -- THIS COMMAND DOES NOT WORK IN MPI MODE, EVEN THOUGH IT SHOULD
Usually, the whole cluster is only used in the refinement process
* A (modified) set of refinement scripts is attached to this page; these can be used on the cluster for refinement according to the spider documentation

pubsub does not work on the cluster, but it can be replaced with qsub, as follows:

Using PBS with SPIDER to replace pubsub:

Can use pbs to speed up loops on non-mp aware processes
example of a spider script to speed up a non-linear filter
Note that this script is designed to run such that each node runs one process. This process uses up to 6 GB, so the place=scatter command is used

filter.spi

;========================input
fr l
[tomo]tomo1_crop.spd
;=========output=================
fr l
[filt]filt
;=========params==============
x90=60          ; cycles
x91=0.1         ; delta t
x92=0           ; sigma
;x93=0.05       ; lambda

;=============================
;---------- following set of commands sets the main qsub script for the cluster, one script per loop event
vm
echo "#!/bin/sh" > qsub-header.txt
vm
echo "#PBS -l select=ncpus=1:mem=6GB " >> qsub-header.txt
vm
echo "#PBS -l place=scatter" >> qsub-header.txt
vm
echo "cd /usr/data/asiebert/30_04_08_Rbs/tomo1" >> qsub-header.txt

;-----------------------------------------------

do lb1 x10=1,10                 ; this loop only makes the scripts using a "slave" script as template

  RR x93             ; adjust lambda according to the following sequence
  0.01,0.02,0.05,0.1,0.2,0.5,1,2,5,10                               ;  a nice log sequence

   vm
   echo "x93={**X93}" > filtscript{**x10}.$PRJEXT
   vm
   echo "x10={**X10}" >> filtscript{**x10}.$PRJEXT
   vm
   cat spider_script_filt1_clu_slave.$PRJEXT >> filtscript{**x10}.$PRJEXT
   vm
   cp qsub-header.txt qsub-script{**x10}.txt
   vm
   echo "/usr/apps/spidermp $PRJEXT/$DATEXT @filtscript{**x10}" >> qsub-script{**x10}.txt
   vm
   qsub qsub-script{**x10}.txt
lb1

en      ; end the main script

The script which actually does the work:

spider_script_filt1_clu_slave.spi

;========================input
fr l
[tomo]tomo1_crop.spd
;=========output=================
fr l
[filt]filt
;=========params==============
x90=60          ; cycles
x91=0.1         ; delta t
x92=0           ; sigma
;x93=0.05       ; lambda

;=============================

   ce ad
   [tomo]
   [filt]{***x10}
   HEG
   x90
   x91
   x92,x93
en d

To run the script:
- spider spi/spd @filter
even better, make a qsub submission script for filter.spi, use one CPU and one node, and submit it

Added:

>
>

Using the MPI version of spider

In the case of mpi spider, run it using mpirun
The PBS job file should be in the format -

#!/bin/sh
#PBS -l nodes=14:ncpus=4

Added:

>
>

/opt/hpmpi/bin/mpirun -np 56 -hostfile $PBS_NODEFILE /usr/apps/spider_mpi spi/spd @apsh

Notes on command
- -np 56 : means number of processors = 56. MUST be equal to number of nodes times number of CPU's

Protomo

Protomo software will only use one CPU, but you can process multiple tomograms at once using the cluster
Use the qsub submission process, and make a qsub script for each command
sample refinement script:

#! /bin/sh
#PBS -l nodes=1:ncpus=1
#PBS -N PROTOMO
cd $PBS_O_WORKDIR

cat gr668pt50t.param | sed 's/guess=false/guess=true/' > gr668pt50t-01.param
cp gr668pt50t-ali.tlt gr668pt50t-01-itr.tlt
tomo-refine.sh gr668pt50t-01.param >& refine-01.log
tomo-fit.sh gr668pt50t-01.param
cp gr668pt50t-01.param gr668pt50t-02.param
cp gr668pt50t-01-fitted.tlt gr668pt50t-02-itr.tlt
tomo-refine.sh gr668pt50t-02.param >& refine-02.log
tomo-fit.sh gr668pt50t-02.param
cat gr668pt50t-02.param | sed 's/guess=true/guess=false/' > gr668pt50t-03.param
cp gr668pt50t-02-fitted.tlt gr668pt50t-03-itr.tlt
tomo-refine.sh gr668pt50t-03.param >& refine-03.log
tomo-fit.sh gr668pt50t-03.param
cp gr668pt50t-03.param gr668pt50t-04.param
cp gr668pt50t-03-fitted.tlt gr668pt50t-04-itr.tlt
tomo-refine.sh gr668pt50t-04.param >& refine-04.log
tomo-fit.sh gr668pt50t-04.param
cp gr668pt50t-04.param gr668pt50t-05.param
cp gr668pt50t-04-fitted.tlt gr668pt50t-05-itr.tlt
tomo-refine.sh gr668pt50t-05.param >& refine-05.log
tomo-fit.sh gr668pt50t-05.param
cat gr668pt50t-05.param | sed 's/cormod=xcf/cormod=mcf/' > gr668pt50t-06.param
cp gr668pt50t-05-fitted.tlt gr668pt50t-06-itr.tlt
tomo-refine.sh gr668pt50t-06.param >& refine-06.log
tomo-fit.sh gr668pt50t-06.param
cp gr668pt50t-06.param gr668pt50t-07.param
cp gr668pt50t-06-fitted.tlt gr668pt50t-07-itr.tlt
tomo-refine.sh gr668pt50t-07.param >& refine-07.log
tomo-fit.sh gr668pt50t-07.param
cp gr668pt50t-07.param gr668pt50t-08.param
cp gr668pt50t-07-fitted.tlt gr668pt50t-08-itr.tlt
tomo-refine.sh gr668pt50t-08.param >& refine-08.log
tomo-fit.sh gr668pt50t-08.param
cp gr668pt50t-08.param gr668pt50t-09.param
cp gr668pt50t-08-fitted.tlt gr668pt50t-09-itr.tlt
tomo-refine.sh gr668pt50t-09.param >& refine-09.log
tomo-fit.sh gr668pt50t-09.param
cp gr668pt50t-09.param gr668pt50t-10.param
cp gr668pt50t-09-fitted.tlt gr668pt50t-10-itr.tlt
tomo-refine.sh gr668pt50t-10.param >& refine-10.log
tomo-fit.sh gr668pt50t-10.param
cat gr668pt50t-10.param | sed 's/cormod=mcf/cormod=pcf/' > gr668pt50t-11.param
cp gr668pt50t-10-fitted.tlt gr668pt50t-11-itr.tlt
tomo-refine.sh gr668pt50t-11.param >& refine-11.log
tomo-fit.sh gr668pt50t-11.param
cp gr668pt50t-11.param gr668pt50t-12.param
cp gr668pt50t-11-fitted.tlt gr668pt50t-12-itr.tlt
tomo-refine.sh gr668pt50t-12.param >& refine-12.log
tomo-fit.sh gr668pt50t-12.param
cp gr668pt50t-12.param gr668pt50t-13.param
cp gr668pt50t-12-fitted.tlt gr668pt50t-13-itr.tlt
tomo-refine.sh gr668pt50t-13.param >& refine-13.log
tomo-fit.sh gr668pt50t-13.param
cp gr668pt50t-13.param gr668pt50t-14.param
cp gr668pt50t-13-fitted.tlt gr668pt50t-14-itr.tlt
tomo-refine.sh gr668pt50t-14.param >& refine-14.log
tomo-fit.sh gr668pt50t-14.param
cp gr668pt50t-14.param gr668pt50t-15.param
cp gr668pt50t-14-fitted.tlt gr668pt50t-15-itr.tlt
tomo-refine.sh gr668pt50t-15.param >& refine-15.log
tomo-fit.sh gr668pt50t-15.param
cp gr668pt50t-15.param gr668pt50t-16.param
cp gr668pt50t-15-fitted.tlt gr668pt50t-16-itr.tlt
tomo-refine.sh gr668pt50t-16.param >& refine-16.log
tomo-fit.sh gr668pt50t-16.param

Notes on script:
- #PBS -l nodes=1:ncpus=1 --- Only use one CPU on one node
- #PBS -N PROTOMO --- name the job PROTOMO
Sample script for calculating tomogram:

#! /bin/sh
#PBS -l nodes=1:ncpus=1

cd $PBS_O_WORKDIR
tomo-map.sh gr668pt50t-02.param gr668pt50t-02-fitted.tlt

cd $PBS_O_WORKDIR --- required to change to the working directory

Set ALLOWTOPICVIEW =

-- BillRice - 15 Aug 2008

* xpbsmon screenshot1:

META FILEATTACHMENT	attr="" autoattached="1" comment="xpbsmon screenshot1" date="1218836224" name="xpbsmon1.png" path="xpbsmon1.png" size="10985" user="Main.BillRice" version=""
META FILEATTACHMENT	attr="" autoattached="1" comment="xpbsmon screenshot2" date="1218836257" name="xpbsmon2.png" path="xpbsmon2.png" size="6977" user="Main.BillRice" version=""

Revision 12 - 15 Aug 2008 - Main.BillRice

META TOPICPARENT	name="CemIT"

Contents

General guidelines

home directory has 20 GB space allocated to all of CEM computing
/usr/data has 100 GB of space allocated to all of CEM computing
both disks operate at the same speed
In general, do most of your processing on /usr/data, since there is more space available
Please remove data after processing so that others will have space to work

Batch submission system

We have PBS (commercial version) for handling of jobs
mpirun is also installed for MPI-aware applications (Spider is the only MPI-aware EM software)
In general, pbs submission scripts should start with the following 3 lines:

#!/bin/bash
#PBS -l nodes=4:ncpus=4 
cd $PBS_O_WORKDIR

If the script is called samplescript.sh, then the following unix command will submit it to the cluster:

> qsub samplescript.sh

To monitor cluster use, use the qstat command:

> qstat
Job id            Name             User              Time Use S Queue
----------------  ---------------- ----------------  -------- - -----
75291.node0       PROTOMO          asiebert          07:00:12 R workq
75292.node0       PROTOMO          asiebert          06:51:08 R workq
75294.node0       PROTOMO          asiebert          06:28:27 R workq

Another way is to run xpbsmon, which will give a graphical view of cluster use.

Added:

>
>

* Click on a node to get a detailed view of how busy it is

EMAN

EMAN will work with the batch submission software (PBS) installed on the cluster
Need to make a batch submission script for cluster use
sample refinement script:

#!/bin/bash

#PBS -N EMAN
#PBS -l nodes=8:ncpus=4
cd $PBS_O_WORKDIR

refine 8 ang=8.5 mask=38 pad=120 hard=25 classkeep=0.8 classiter=8 sym=c1 phasecls median proc=32

Notes on script:
- #!/bin/bash --- use the bash shell
- #PBS -N EMAN -- name the job EMAN (you can name it what you like)
- #PBS -l nodes=8:ncpus=4 --- Use 8 nodes on the cluster, with 4 processors per node
- cd $PBS_O_WORKDIR --- This is needed so that script will be run in the current directory, rather than your home directory
- refine 8 ang=8.5 ... proc=32 -- proc= command means use 32 processors (8 nodes * 4 CPU per node)
To submit the script, if it is named emanrefine.sh

   > qsub emanrefine.sh
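Because proc= has to equal nodes times CPUs per node, one way to keep the two in step is to generate the script from those two numbers. This is a sketch only: `make_eman_script` is a hypothetical helper, and the refine options are copied unchanged from the sample above.

```shell
#!/bin/bash
# Hypothetical generator: print an EMAN refinement script in which proc=
# is always nodes * ncpus, so the two values cannot drift apart.
make_eman_script() {
    local nodes="$1" ncpus="$2"
    local procs=$(( nodes * ncpus ))
    # Unquoted heredoc: ${...} is filled in now, \$ stays literal for PBS.
    cat <<EOF
#!/bin/bash

#PBS -N EMAN
#PBS -l nodes=${nodes}:ncpus=${ncpus}
cd \$PBS_O_WORKDIR

refine 8 ang=8.5 mask=38 pad=120 hard=25 classkeep=0.8 classiter=8 sym=c1 phasecls median proc=${procs}
EOF
}

# 8 nodes with 4 CPUs each gives proc=32, as in the sample above.
make_eman_script 8 4
```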

SPIDER

There are two versions of spider on the cluster:
- spidermp is the "normal" version of SPIDER: multiprocessor-aware, but it cannot work across nodes (i.e. it uses at most 4 CPUs at once)
- spidermpi is the MPI-aware version of SPIDER and can use the entire cluster
  - only a small subset of commands is MPI-aware and working, so use this version only for the following commands:
    - ap nq
    - ap mq
    - ap sh
    - ap ref -- THIS COMMAND DOES NOT WORK IN MPI MODE, EVEN THOUGH IT SHOULD
Usually, the whole cluster is only used in the refinement process
- A (modified) set of refinement scripts is attached to this page; these can be used on the cluster for refinement according to the SPIDER documentation

pubsub does not work on the cluster, but it can be replaced with qsub, as follows:

Using PBS with SPIDER to replace pubsub:

PBS can be used to speed up loops over processes that are not multiprocessor-aware
Example of a SPIDER script that speeds up a non-linear filter
Note that this script is designed so that each node runs one process. Each process uses up to 6 GB of memory, so the place=scatter option is used

filter.spi

;========================input
fr l
[tomo]tomo1_crop.spd
;=========output=================
fr l
[filt]filt
;=========params==============
x90=60          ; cycles
x91=0.1         ; delta t
x92=0           ; sigma
;x93=0.05       ; lambda

;=============================
;---------- following set of commands sets the main qsub script for the cluster, one script per loop event
vm
echo "#!/bin/sh" > qsub-header.txt
vm
echo "#PBS -l select=ncpus=1:mem=6GB " >> qsub-header.txt
vm
echo "#PBS -l place=scatter" >> qsub-header.txt
vm
echo "cd /usr/data/asiebert/30_04_08_Rbs/tomo1" >> qsub-header.txt

;-----------------------------------------------

do lb1 x10=1,10                 ; this loop only makes the scripts using a "slave" script as template

  RR x93             ; adjust lambda according to the following sequence
  0.01,0.02,0.05,0.1,0.2,0.5,1,2,5,10                               ;  a nice log sequence

   vm
   echo "x93={**X93}" > filtscript{**x10}.$PRJEXT
   vm
   echo "x10={**X10}" >> filtscript{**x10}.$PRJEXT
   vm
   cat spider_script_filt1_clu_slave.$PRJEXT >> filtscript{**x10}.$PRJEXT
   vm
   cp qsub-header.txt qsub-script{**x10}.txt
   vm
   echo "/usr/apps/spidermp $PRJEXT/$DATEXT @filtscript{**x10}" >> qsub-script{**x10}.txt
   vm
   qsub qsub-script{**x10}.txt
lb1

en      ; end the main script

The script which actually does the work:

spider_script_filt1_clu_slave.spi

;========================input
fr l
[tomo]tomo1_crop.spd
;=========output=================
fr l
[filt]filt
;=========params==============
x90=60          ; cycles
x91=0.1         ; delta t
x92=0           ; sigma
;x93=0.05       ; lambda

;=============================

   ce ad
   [tomo]
   [filt]{***x10}
   HEG
   x90
   x91
   x92,x93
en d

To run the script:
- spider spi/spd @filter
Even better, make a qsub submission script for filter.spi that uses one CPU on one node, and submit it
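A one-CPU, one-node submission script for filter.spi could look like the output of the sketch below. The helper name `make_filter_job` and the job name FILTER are assumptions; the run line is the `spider spi/spd @filter` command given above, and it may need the full path to the SPIDER binary on the compute nodes.

```shell
#!/bin/bash
# Print a one-CPU, one-node submission script for filter.spi.
# The job name FILTER is an arbitrary choice.
make_filter_job() {
    # Quoted heredoc delimiter: nothing is expanded, so $PBS_O_WORKDIR
    # reaches the job script literally.
    cat <<'EOF'
#!/bin/bash
#PBS -N FILTER
#PBS -l nodes=1:ncpus=1
cd $PBS_O_WORKDIR
spider spi/spd @filter
EOF
}

make_filter_job
```

Saving the output to a file in the directory that holds filter.spi and submitting it with qsub keeps the script-generating loop off the head node.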

Protomo

Protomo software will only use one CPU, but you can process multiple tomograms at once using the cluster
Use the qsub submission process, and make a qsub script for each command
sample refinement script:

#! /bin/sh
#PBS -l nodes=1:ncpus=1
#PBS -N PROTOMO
cd $PBS_O_WORKDIR

cat gr668pt50t.param | sed 's/guess=false/guess=true/' > gr668pt50t-01.param
cp gr668pt50t-ali.tlt gr668pt50t-01-itr.tlt
tomo-refine.sh gr668pt50t-01.param >& refine-01.log
tomo-fit.sh gr668pt50t-01.param
cp gr668pt50t-01.param gr668pt50t-02.param
cp gr668pt50t-01-fitted.tlt gr668pt50t-02-itr.tlt
tomo-refine.sh gr668pt50t-02.param >& refine-02.log
tomo-fit.sh gr668pt50t-02.param
cat gr668pt50t-02.param | sed 's/guess=true/guess=false/' > gr668pt50t-03.param
cp gr668pt50t-02-fitted.tlt gr668pt50t-03-itr.tlt
tomo-refine.sh gr668pt50t-03.param >& refine-03.log
tomo-fit.sh gr668pt50t-03.param
cp gr668pt50t-03.param gr668pt50t-04.param
cp gr668pt50t-03-fitted.tlt gr668pt50t-04-itr.tlt
tomo-refine.sh gr668pt50t-04.param >& refine-04.log
tomo-fit.sh gr668pt50t-04.param
cp gr668pt50t-04.param gr668pt50t-05.param
cp gr668pt50t-04-fitted.tlt gr668pt50t-05-itr.tlt
tomo-refine.sh gr668pt50t-05.param >& refine-05.log
tomo-fit.sh gr668pt50t-05.param
cat gr668pt50t-05.param | sed 's/cormod=xcf/cormod=mcf/' > gr668pt50t-06.param
cp gr668pt50t-05-fitted.tlt gr668pt50t-06-itr.tlt
tomo-refine.sh gr668pt50t-06.param >& refine-06.log
tomo-fit.sh gr668pt50t-06.param
cp gr668pt50t-06.param gr668pt50t-07.param
cp gr668pt50t-06-fitted.tlt gr668pt50t-07-itr.tlt
tomo-refine.sh gr668pt50t-07.param >& refine-07.log
tomo-fit.sh gr668pt50t-07.param
cp gr668pt50t-07.param gr668pt50t-08.param
cp gr668pt50t-07-fitted.tlt gr668pt50t-08-itr.tlt
tomo-refine.sh gr668pt50t-08.param >& refine-08.log
tomo-fit.sh gr668pt50t-08.param
cp gr668pt50t-08.param gr668pt50t-09.param
cp gr668pt50t-08-fitted.tlt gr668pt50t-09-itr.tlt
tomo-refine.sh gr668pt50t-09.param >& refine-09.log
tomo-fit.sh gr668pt50t-09.param
cp gr668pt50t-09.param gr668pt50t-10.param
cp gr668pt50t-09-fitted.tlt gr668pt50t-10-itr.tlt
tomo-refine.sh gr668pt50t-10.param >& refine-10.log
tomo-fit.sh gr668pt50t-10.param
cat gr668pt50t-10.param | sed 's/cormod=mcf/cormod=pcf/' > gr668pt50t-11.param
cp gr668pt50t-10-fitted.tlt gr668pt50t-11-itr.tlt
tomo-refine.sh gr668pt50t-11.param >& refine-11.log
tomo-fit.sh gr668pt50t-11.param
cp gr668pt50t-11.param gr668pt50t-12.param
cp gr668pt50t-11-fitted.tlt gr668pt50t-12-itr.tlt
tomo-refine.sh gr668pt50t-12.param >& refine-12.log
tomo-fit.sh gr668pt50t-12.param
cp gr668pt50t-12.param gr668pt50t-13.param
cp gr668pt50t-12-fitted.tlt gr668pt50t-13-itr.tlt
tomo-refine.sh gr668pt50t-13.param >& refine-13.log
tomo-fit.sh gr668pt50t-13.param
cp gr668pt50t-13.param gr668pt50t-14.param
cp gr668pt50t-13-fitted.tlt gr668pt50t-14-itr.tlt
tomo-refine.sh gr668pt50t-14.param >& refine-14.log
tomo-fit.sh gr668pt50t-14.param
cp gr668pt50t-14.param gr668pt50t-15.param
cp gr668pt50t-14-fitted.tlt gr668pt50t-15-itr.tlt
tomo-refine.sh gr668pt50t-15.param >& refine-15.log
tomo-fit.sh gr668pt50t-15.param
cp gr668pt50t-15.param gr668pt50t-16.param
cp gr668pt50t-15-fitted.tlt gr668pt50t-16-itr.tlt
tomo-refine.sh gr668pt50t-16.param >& refine-16.log
tomo-fit.sh gr668pt50t-16.param

Notes on script:
- #PBS -l nodes=1:ncpus=1 --- Only use one CPU on one node
- #PBS -N PROTOMO --- name the job PROTOMO
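The sixteen iterations in the refinement script follow one pattern, so the body can also be generated by a loop. The sketch below only prints the commands; it runs nothing. `protomo_commands` is a hypothetical helper, the parameter edits at iterations 1, 3, 6 and 11 are taken from the sample above, `cat ... | sed` is written as a plain `sed`, and `>&` is written as `> file 2>&1`, which is the equivalent form that does not depend on /bin/sh being bash.

```shell
#!/bin/bash
# Print the commands for iterations 1..last of the refinement shown above.
# Parameter edits happen at iterations 1, 3, 6 and 11, as in the sample.
protomo_commands() {
    local prefix="$1" last="$2" i cur prev edit
    for (( i = 1; i <= last; i++ )); do
        cur=$(printf '%02d' "$i")
        prev=$(printf '%02d' "$(( i - 1 ))")
        case "$i" in
            1)  edit='s/guess=false/guess=true/' ;;
            3)  edit='s/guess=true/guess=false/' ;;
            6)  edit='s/cormod=xcf/cormod=mcf/' ;;
            11) edit='s/cormod=mcf/cormod=pcf/' ;;
            *)  edit='' ;;
        esac
        if (( i == 1 )); then
            # The first iteration starts from the unnumbered input files.
            echo "sed '$edit' $prefix.param > $prefix-$cur.param"
            echo "cp $prefix-ali.tlt $prefix-$cur-itr.tlt"
        else
            if [ -n "$edit" ]; then
                echo "sed '$edit' $prefix-$prev.param > $prefix-$cur.param"
            else
                echo "cp $prefix-$prev.param $prefix-$cur.param"
            fi
            echo "cp $prefix-$prev-fitted.tlt $prefix-$cur-itr.tlt"
        fi
        echo "tomo-refine.sh $prefix-$cur.param > refine-$cur.log 2>&1"
        echo "tomo-fit.sh $prefix-$cur.param"
    done
}

protomo_commands gr668pt50t 16
```

Appending this output to the #PBS header lines and the cd $PBS_O_WORKDIR line gives a script with the same steps as the hand-written one.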
Sample script for calculating tomogram:

#! /bin/sh
#PBS -l nodes=1:ncpus=1

cd $PBS_O_WORKDIR
tomo-map.sh gr668pt50t-02.param gr668pt50t-02-fitted.tlt

cd $PBS_O_WORKDIR --- required to change to the working directory

Set ALLOWTOPICVIEW =

-- BillRice - 15 Aug 2008

* xpbsmon screenshot1:

Changed:

<
<

>
>

Changed:

<
<

* xpbsmon screenshot2:

>
>

META FILEATTACHMENT	attr="" autoattached="1" comment="xpbsmon screenshot1" date="1218836224" name="xpbsmon1.png" path="xpbsmon1.png" size="10985" user="Main.BillRice" version=""
META FILEATTACHMENT	attr="" autoattached="1" comment="xpbsmon screenshot2" date="1218836257" name="xpbsmon2.png" path="xpbsmon2.png" size="6977" user="Main.BillRice" version=""

Deleted:

<
<

META FILEATTACHMENT	attachment="xpbsmon1.png" attr="" comment="xpbsmon screenshot1" date="1218836224" name="xpbsmon1.png" path="xpbsmon1.png" size="10985" stream="xpbsmon1.png" user="Main.BillRice" version="0"
META FILEATTACHMENT	attachment="xpbsmon2.png" attr="" comment="xpbsmon screenshot2" date="1218836256" name="xpbsmon2.png" path="xpbsmon2.png" size="6977" stream="xpbsmon2.png" user="Main.BillRice" version="0"

Revision 11 - 15 Aug 2008 - Main.BillRice

META TOPICPARENT	name="CemIT"

Contents

General guidelines

The home directory has 20 GB of space allocated to all of CEM computing
/usr/data has 100 GB of space allocated to all of CEM computing
Both disks operate at the same speed
In general, do most of your processing on /usr/data, since there is more space available
Please remove data after processing so that others will have space to work

Batch submission system

We have PBS (commercial version) for handling jobs
mpirun is also installed for MPI-aware applications (SPIDER is the only MPI-aware EM software)
In general, pbs submission scripts should start with the following 3 lines:

#!/bin/bash
#PBS -l nodes=4:ncpus=4 
cd $PBS_O_WORKDIR

If the script is called samplescript.sh, then the following unix command will submit it to the cluster:

> qsub samplescript.sh

To monitor cluster use, use the qstat command:

> qstat
Job id            Name             User              Time Use S Queue
----------------  ---------------- ----------------  -------- - -----
75291.node0       PROTOMO          asiebert          07:00:12 R workq
75292.node0       PROTOMO          asiebert          06:51:08 R workq
75294.node0       PROTOMO          asiebert          06:28:27 R workq

Another way is to run xpbsmon, which will give a graphical view of cluster use.

EMAN

EMAN will work with the batch submission software (PBS) installed on the cluster
Need to make a batch submission script for cluster use
sample refinement script:

#!/bin/bash

#PBS -N EMAN
#PBS -l nodes=8:ncpus=4
cd $PBS_O_WORKDIR

refine 8 ang=8.5 mask=38 pad=120 hard=25 classkeep=0.8 classiter=8 sym=c1 phasecls median proc=32

Notes on script:
- #!/bin/bash --- use the bash shell
- #PBS -N EMAN -- name the job EMAN (you can name it what you like)
- #PBS -l nodes=8:ncpus=4 --- Use 8 nodes on the cluster, with 4 processors per node
- cd $PBS_O_WORKDIR --- This is needed so that script will be run in the current directory, rather than your home directory
- refine 8 ang=8.5 ... proc=32 -- proc= command means use 32 processors (8 nodes * 4 CPU per node)
To submit the script, if it is named emanrefine.sh

   > qsub emanrefine.sh

SPIDER

There are two versions of spider on the cluster:
- spidermp is the "normal" version of SPIDER: multiprocessor-aware, but it cannot work across nodes (i.e. it uses at most 4 CPUs at once)
- spidermpi is the MPI-aware version of SPIDER and can use the entire cluster
  - only a small subset of commands is MPI-aware and working, so use this version only for the following commands:
    - ap nq
    - ap mq
    - ap sh
    - ap ref -- THIS COMMAND DOES NOT WORK IN MPI MODE, EVEN THOUGH IT SHOULD
Usually, the whole cluster is only used in the refinement process
- A (modified) set of refinement scripts is attached to this page; these can be used on the cluster for refinement according to the SPIDER documentation

pubsub does not work on the cluster, but it can be replaced with qsub, as follows:

Using PBS with SPIDER to replace pubsub:

PBS can be used to speed up loops over processes that are not multiprocessor-aware
Example of a SPIDER script that speeds up a non-linear filter
Note that this script is designed so that each node runs one process. Each process uses up to 6 GB of memory, so the place=scatter option is used

filter.spi

;========================input
fr l
[tomo]tomo1_crop.spd
;=========output=================
fr l
[filt]filt
;=========params==============
x90=60          ; cycles
x91=0.1         ; delta t
x92=0           ; sigma
;x93=0.05       ; lambda

;=============================
;---------- following set of commands sets the main qsub script for the cluster, one script per loop event
vm
echo "#!/bin/sh" > qsub-header.txt
vm
echo "#PBS -l select=ncpus=1:mem=6GB " >> qsub-header.txt
vm
echo "#PBS -l place=scatter" >> qsub-header.txt
vm
echo "cd /usr/data/asiebert/30_04_08_Rbs/tomo1" >> qsub-header.txt

;-----------------------------------------------

do lb1 x10=1,10                 ; this loop only makes the scripts using a "slave" script as template

  RR x93             ; adjust lambda according to the following sequence
  0.01,0.02,0.05,0.1,0.2,0.5,1,2,5,10                               ;  a nice log sequence

   vm
   echo "x93={**X93}" > filtscript{**x10}.$PRJEXT
   vm
   echo "x10={**X10}" >> filtscript{**x10}.$PRJEXT
   vm
   cat spider_script_filt1_clu_slave.$PRJEXT >> filtscript{**x10}.$PRJEXT
   vm
   cp qsub-header.txt qsub-script{**x10}.txt
   vm
   echo "/usr/apps/spidermp $PRJEXT/$DATEXT @filtscript{**x10}" >> qsub-script{**x10}.txt
   vm
   qsub qsub-script{**x10}.txt
lb1

en      ; end the main script

The script which actually does the work:

spider_script_filt1_clu_slave.spi

;========================input
fr l
[tomo]tomo1_crop.spd
;=========output=================
fr l
[filt]filt
;=========params==============
x90=60          ; cycles
x91=0.1         ; delta t
x92=0           ; sigma
;x93=0.05       ; lambda

;=============================

   ce ad
   [tomo]
   [filt]{***x10}
   HEG
   x90
   x91
   x92,x93
en d

To run the script:
- spider spi/spd @filter
Even better, make a qsub submission script for filter.spi that uses one CPU on one node, and submit it

Protomo

Protomo software will only use one CPU, but you can process multiple tomograms at once using the cluster
Use the qsub submission process, and make a qsub script for each command
sample refinement script:

#! /bin/sh
#PBS -l nodes=1:ncpus=1
#PBS -N PROTOMO
cd $PBS_O_WORKDIR

cat gr668pt50t.param | sed 's/guess=false/guess=true/' > gr668pt50t-01.param
cp gr668pt50t-ali.tlt gr668pt50t-01-itr.tlt
tomo-refine.sh gr668pt50t-01.param >& refine-01.log
tomo-fit.sh gr668pt50t-01.param
cp gr668pt50t-01.param gr668pt50t-02.param
cp gr668pt50t-01-fitted.tlt gr668pt50t-02-itr.tlt
tomo-refine.sh gr668pt50t-02.param >& refine-02.log
tomo-fit.sh gr668pt50t-02.param
cat gr668pt50t-02.param | sed 's/guess=true/guess=false/' > gr668pt50t-03.param
cp gr668pt50t-02-fitted.tlt gr668pt50t-03-itr.tlt
tomo-refine.sh gr668pt50t-03.param >& refine-03.log
tomo-fit.sh gr668pt50t-03.param
cp gr668pt50t-03.param gr668pt50t-04.param
cp gr668pt50t-03-fitted.tlt gr668pt50t-04-itr.tlt
tomo-refine.sh gr668pt50t-04.param >& refine-04.log
tomo-fit.sh gr668pt50t-04.param
cp gr668pt50t-04.param gr668pt50t-05.param
cp gr668pt50t-04-fitted.tlt gr668pt50t-05-itr.tlt
tomo-refine.sh gr668pt50t-05.param >& refine-05.log
tomo-fit.sh gr668pt50t-05.param
cat gr668pt50t-05.param | sed 's/cormod=xcf/cormod=mcf/' > gr668pt50t-06.param
cp gr668pt50t-05-fitted.tlt gr668pt50t-06-itr.tlt
tomo-refine.sh gr668pt50t-06.param >& refine-06.log
tomo-fit.sh gr668pt50t-06.param
cp gr668pt50t-06.param gr668pt50t-07.param
cp gr668pt50t-06-fitted.tlt gr668pt50t-07-itr.tlt
tomo-refine.sh gr668pt50t-07.param >& refine-07.log
tomo-fit.sh gr668pt50t-07.param
cp gr668pt50t-07.param gr668pt50t-08.param
cp gr668pt50t-07-fitted.tlt gr668pt50t-08-itr.tlt
tomo-refine.sh gr668pt50t-08.param >& refine-08.log
tomo-fit.sh gr668pt50t-08.param
cp gr668pt50t-08.param gr668pt50t-09.param
cp gr668pt50t-08-fitted.tlt gr668pt50t-09-itr.tlt
tomo-refine.sh gr668pt50t-09.param >& refine-09.log
tomo-fit.sh gr668pt50t-09.param
cp gr668pt50t-09.param gr668pt50t-10.param
cp gr668pt50t-09-fitted.tlt gr668pt50t-10-itr.tlt
tomo-refine.sh gr668pt50t-10.param >& refine-10.log
tomo-fit.sh gr668pt50t-10.param
cat gr668pt50t-10.param | sed 's/cormod=mcf/cormod=pcf/' > gr668pt50t-11.param
cp gr668pt50t-10-fitted.tlt gr668pt50t-11-itr.tlt
tomo-refine.sh gr668pt50t-11.param >& refine-11.log
tomo-fit.sh gr668pt50t-11.param
cp gr668pt50t-11.param gr668pt50t-12.param
cp gr668pt50t-11-fitted.tlt gr668pt50t-12-itr.tlt
tomo-refine.sh gr668pt50t-12.param >& refine-12.log
tomo-fit.sh gr668pt50t-12.param
cp gr668pt50t-12.param gr668pt50t-13.param
cp gr668pt50t-12-fitted.tlt gr668pt50t-13-itr.tlt
tomo-refine.sh gr668pt50t-13.param >& refine-13.log
tomo-fit.sh gr668pt50t-13.param
cp gr668pt50t-13.param gr668pt50t-14.param
cp gr668pt50t-13-fitted.tlt gr668pt50t-14-itr.tlt
tomo-refine.sh gr668pt50t-14.param >& refine-14.log
tomo-fit.sh gr668pt50t-14.param
cp gr668pt50t-14.param gr668pt50t-15.param
cp gr668pt50t-14-fitted.tlt gr668pt50t-15-itr.tlt
tomo-refine.sh gr668pt50t-15.param >& refine-15.log
tomo-fit.sh gr668pt50t-15.param
cp gr668pt50t-15.param gr668pt50t-16.param
cp gr668pt50t-15-fitted.tlt gr668pt50t-16-itr.tlt
tomo-refine.sh gr668pt50t-16.param >& refine-16.log
tomo-fit.sh gr668pt50t-16.param

Notes on script:
- #PBS -l nodes=1:ncpus=1 --- Only use one CPU on one node
- #PBS -N PROTOMO --- name the job PROTOMO
Sample script for calculating tomogram:

#! /bin/sh
#PBS -l nodes=1:ncpus=1

cd $PBS_O_WORKDIR
tomo-map.sh gr668pt50t-02.param gr668pt50t-02-fitted.tlt

cd $PBS_O_WORKDIR --- required to change to the working directory

Set ALLOWTOPICVIEW =

-- BillRice - 15 Aug 2008

* xpbsmon screenshot1:

Added:

>
>

* xpbsmon screenshot2:

META FILEATTACHMENT	attachment="xpbsmon1.png" attr="" comment="xpbsmon screenshot1" date="1218836224" name="xpbsmon1.png" path="xpbsmon1.png" size="10985" stream="xpbsmon1.png" user="Main.BillRice" version="0"

Added:

>
>

META FILEATTACHMENT	attachment="xpbsmon2.png" attr="" comment="xpbsmon screenshot2" date="1218836256" name="xpbsmon2.png" path="xpbsmon2.png" size="6977" stream="xpbsmon2.png" user="Main.BillRice" version="0"

Revision 10 - 15 Aug 2008 - Main.BillRice

META TOPICPARENT	name="CemIT"

Contents

General guidelines

The home directory has 20 GB of space allocated to all of CEM computing
/usr/data has 100 GB of space allocated to all of CEM computing
Both disks operate at the same speed
In general, do most of your processing on /usr/data, since there is more space available
Please remove data after processing so that others will have space to work

Batch submission system

We have PBS (commercial version) for handling jobs
mpirun is also installed for MPI-aware applications (SPIDER is the only MPI-aware EM software)
In general, pbs submission scripts should start with the following 3 lines:

#!/bin/bash
#PBS -l nodes=4:ncpus=4 
cd $PBS_O_WORKDIR

If the script is called samplescript.sh, then the following unix command will submit it to the cluster:

> qsub samplescript.sh

To monitor cluster use, use the qstat command:

> qstat
Job id            Name             User              Time Use S Queue
----------------  ---------------- ----------------  -------- - -----
75291.node0       PROTOMO          asiebert          07:00:12 R workq
75292.node0       PROTOMO          asiebert          06:51:08 R workq
75294.node0       PROTOMO          asiebert          06:28:27 R workq

Another way is to run xpbsmon, which will give a graphical view of cluster use.

EMAN

EMAN will work with the batch submission software (PBS) installed on the cluster
Need to make a batch submission script for cluster use
sample refinement script:

#!/bin/bash

#PBS -N EMAN
#PBS -l nodes=8:ncpus=4
cd $PBS_O_WORKDIR

refine 8 ang=8.5 mask=38 pad=120 hard=25 classkeep=0.8 classiter=8 sym=c1 phasecls median proc=32

Notes on script:
- #!/bin/bash --- use the bash shell
- #PBS -N EMAN -- name the job EMAN (you can name it what you like)
- #PBS -l nodes=8:ncpus=4 --- Use 8 nodes on the cluster, with 4 processors per node
- cd $PBS_O_WORKDIR --- This is needed so that script will be run in the current directory, rather than your home directory
- refine 8 ang=8.5 ... proc=32 -- proc= command means use 32 processors (8 nodes * 4 CPU per node)
To submit the script, if it is named emanrefine.sh

   > qsub emanrefine.sh

SPIDER

There are two versions of spider on the cluster:
- spidermp is the "normal" version of SPIDER: multiprocessor-aware, but it cannot work across nodes (i.e. it uses at most 4 CPUs at once)
- spidermpi is the MPI-aware version of SPIDER and can use the entire cluster
  - only a small subset of commands is MPI-aware and working, so use this version only for the following commands:
    - ap nq
    - ap mq
    - ap sh
    - ap ref -- THIS COMMAND DOES NOT WORK IN MPI MODE, EVEN THOUGH IT SHOULD
Usually, the whole cluster is only used in the refinement process
- A (modified) set of refinement scripts is attached to this page; these can be used on the cluster for refinement according to the SPIDER documentation

pubsub does not work on the cluster, but it can be replaced with qsub, as follows:

Using PBS with SPIDER to replace pubsub:

PBS can be used to speed up loops over processes that are not multiprocessor-aware
Example of a SPIDER script that speeds up a non-linear filter
Note that this script is designed so that each node runs one process. Each process uses up to 6 GB of memory, so the place=scatter option is used

filter.spi

;========================input
fr l
[tomo]tomo1_crop.spd
;=========output=================
fr l
[filt]filt
;=========params==============
x90=60          ; cycles
x91=0.1         ; delta t
x92=0           ; sigma
;x93=0.05       ; lambda

;=============================
;---------- following set of commands sets the main qsub script for the cluster, one script per loop event
vm
echo "#!/bin/sh" > qsub-header.txt
vm
echo "#PBS -l select=ncpus=1:mem=6GB " >> qsub-header.txt
vm
echo "#PBS -l place=scatter" >> qsub-header.txt
vm
echo "cd /usr/data/asiebert/30_04_08_Rbs/tomo1" >> qsub-header.txt

;-----------------------------------------------

do lb1 x10=1,10                 ; this loop only makes the scripts using a "slave" script as template

  RR x93             ; adjust lambda according to the following sequence
  0.01,0.02,0.05,0.1,0.2,0.5,1,2,5,10                               ;  a nice log sequence

   vm
   echo "x93={**X93}" > filtscript{**x10}.$PRJEXT
   vm
   echo "x10={**X10}" >> filtscript{**x10}.$PRJEXT
   vm
   cat spider_script_filt1_clu_slave.$PRJEXT >> filtscript{**x10}.$PRJEXT
   vm
   cp qsub-header.txt qsub-script{**x10}.txt
   vm
   echo "/usr/apps/spidermp $PRJEXT/$DATEXT @filtscript{**x10}" >> qsub-script{**x10}.txt
   vm
   qsub qsub-script{**x10}.txt
lb1

en      ; end the main script

The script which actually does the work:

spider_script_filt1_clu_slave.spi

;========================input
fr l
[tomo]tomo1_crop.spd
;=========output=================
fr l
[filt]filt
;=========params==============
x90=60          ; cycles
x91=0.1         ; delta t
x92=0           ; sigma
;x93=0.05       ; lambda

;=============================

   ce ad
   [tomo]
   [filt]{***x10}
   HEG
   x90
   x91
   x92,x93
en d

To run the script:
- spider spi/spd @filter
Even better, make a qsub submission script for filter.spi that uses one CPU on one node, and submit it

Protomo

Protomo software will only use one CPU, but you can process multiple tomograms at once using the cluster
Use the qsub submission process, and make a qsub script for each command
sample refinement script:

#! /bin/sh
#PBS -l nodes=1:ncpus=1
#PBS -N PROTOMO
cd $PBS_O_WORKDIR

cat gr668pt50t.param | sed 's/guess=false/guess=true/' > gr668pt50t-01.param
cp gr668pt50t-ali.tlt gr668pt50t-01-itr.tlt
tomo-refine.sh gr668pt50t-01.param >& refine-01.log
tomo-fit.sh gr668pt50t-01.param
cp gr668pt50t-01.param gr668pt50t-02.param
cp gr668pt50t-01-fitted.tlt gr668pt50t-02-itr.tlt
tomo-refine.sh gr668pt50t-02.param >& refine-02.log
tomo-fit.sh gr668pt50t-02.param
cat gr668pt50t-02.param | sed 's/guess=true/guess=false/' > gr668pt50t-03.param
cp gr668pt50t-02-fitted.tlt gr668pt50t-03-itr.tlt
tomo-refine.sh gr668pt50t-03.param >& refine-03.log
tomo-fit.sh gr668pt50t-03.param
cp gr668pt50t-03.param gr668pt50t-04.param
cp gr668pt50t-03-fitted.tlt gr668pt50t-04-itr.tlt
tomo-refine.sh gr668pt50t-04.param >& refine-04.log
tomo-fit.sh gr668pt50t-04.param
cp gr668pt50t-04.param gr668pt50t-05.param
cp gr668pt50t-04-fitted.tlt gr668pt50t-05-itr.tlt
tomo-refine.sh gr668pt50t-05.param >& refine-05.log
tomo-fit.sh gr668pt50t-05.param
cat gr668pt50t-05.param | sed 's/cormod=xcf/cormod=mcf/' > gr668pt50t-06.param
cp gr668pt50t-05-fitted.tlt gr668pt50t-06-itr.tlt
tomo-refine.sh gr668pt50t-06.param >& refine-06.log
tomo-fit.sh gr668pt50t-06.param
cp gr668pt50t-06.param gr668pt50t-07.param
cp gr668pt50t-06-fitted.tlt gr668pt50t-07-itr.tlt
tomo-refine.sh gr668pt50t-07.param >& refine-07.log
tomo-fit.sh gr668pt50t-07.param
cp gr668pt50t-07.param gr668pt50t-08.param
cp gr668pt50t-07-fitted.tlt gr668pt50t-08-itr.tlt
tomo-refine.sh gr668pt50t-08.param >& refine-08.log
tomo-fit.sh gr668pt50t-08.param
cp gr668pt50t-08.param gr668pt50t-09.param
cp gr668pt50t-08-fitted.tlt gr668pt50t-09-itr.tlt
tomo-refine.sh gr668pt50t-09.param >& refine-09.log
tomo-fit.sh gr668pt50t-09.param
cp gr668pt50t-09.param gr668pt50t-10.param
cp gr668pt50t-09-fitted.tlt gr668pt50t-10-itr.tlt
tomo-refine.sh gr668pt50t-10.param >& refine-10.log
tomo-fit.sh gr668pt50t-10.param
cat gr668pt50t-10.param | sed 's/cormod=mcf/cormod=pcf/' > gr668pt50t-11.param
cp gr668pt50t-10-fitted.tlt gr668pt50t-11-itr.tlt
tomo-refine.sh gr668pt50t-11.param >& refine-11.log
tomo-fit.sh gr668pt50t-11.param
cp gr668pt50t-11.param gr668pt50t-12.param
cp gr668pt50t-11-fitted.tlt gr668pt50t-12-itr.tlt
tomo-refine.sh gr668pt50t-12.param >& refine-12.log
tomo-fit.sh gr668pt50t-12.param
cp gr668pt50t-12.param gr668pt50t-13.param
cp gr668pt50t-12-fitted.tlt gr668pt50t-13-itr.tlt
tomo-refine.sh gr668pt50t-13.param >& refine-13.log
tomo-fit.sh gr668pt50t-13.param
cp gr668pt50t-13.param gr668pt50t-14.param
cp gr668pt50t-13-fitted.tlt gr668pt50t-14-itr.tlt
tomo-refine.sh gr668pt50t-14.param >& refine-14.log
tomo-fit.sh gr668pt50t-14.param
cp gr668pt50t-14.param gr668pt50t-15.param
cp gr668pt50t-14-fitted.tlt gr668pt50t-15-itr.tlt
tomo-refine.sh gr668pt50t-15.param >& refine-15.log
tomo-fit.sh gr668pt50t-15.param
cp gr668pt50t-15.param gr668pt50t-16.param
cp gr668pt50t-15-fitted.tlt gr668pt50t-16-itr.tlt
tomo-refine.sh gr668pt50t-16.param >& refine-16.log
tomo-fit.sh gr668pt50t-16.param

Notes on script:
- #PBS -l nodes=1:ncpus=1 --- Only use one CPU on one node
- #PBS -N PROTOMO --- name the job PROTOMO
Sample script for calculating tomogram:

#! /bin/sh
#PBS -l nodes=1:ncpus=1

cd $PBS_O_WORKDIR
tomo-map.sh gr668pt50t-02.param gr668pt50t-02-fitted.tlt

cd $PBS_O_WORKDIR --- required to change to the working directory

Set ALLOWTOPICVIEW =

-- BillRice - 15 Aug 2008

Added:

>
>

* xpbsmon screenshot1:

META FILEATTACHMENT	attachment="xpbsmon1.png" attr="" comment="xpbsmon screenshot1" date="1218836224" name="xpbsmon1.png" path="xpbsmon1.png" size="10985" stream="xpbsmon1.png" user="Main.BillRice" version="0"

Revision 9 - 15 Aug 2008 - Main.BillRice

META TOPICPARENT	name="CemIT"

Contents

General guidelines

The home directory has 20 GB of space allocated to all of CEM computing
/usr/data has 100 GB of space allocated to all of CEM computing
Both disks operate at the same speed
In general, do most of your processing on /usr/data, since there is more space available
Please remove data after processing so that others will have space to work

Batch submission system

We have PBS (commercial version) for handling jobs
mpirun is also installed for MPI-aware applications (SPIDER is the only MPI-aware EM software)

Added:

>
>

In general, pbs submission scripts should start with the following 3 lines:

#!/bin/bash
#PBS -l nodes=4:ncpus=4 
cd $PBS_O_WORKDIR

Added:

>
>

If the script is called samplescript.sh, then the following unix command will submit it to the cluster:

> qsub samplescript.sh

To monitor cluster use, use the qstat command:

> qstat
Job id            Name             User              Time Use S Queue
----------------  ---------------- ----------------  -------- - -----
75291.node0       PROTOMO          asiebert          07:00:12 R workq
75292.node0       PROTOMO          asiebert          06:51:08 R workq
75294.node0       PROTOMO          asiebert          06:28:27 R workq

Another way is to run xpbsmon, which will give a graphical view of cluster use.

EMAN

EMAN will work with the batch submission software (PBS) installed on the cluster
Need to make a batch submission script for cluster use
sample refinement script:

#!/bin/bash

#PBS -N EMAN
#PBS -l nodes=8:ncpus=4
cd $PBS_O_WORKDIR

refine 8 ang=8.5 mask=38 pad=120 hard=25 classkeep=0.8 classiter=8 sym=c1 phasecls median proc=32

Notes on script:
- #!/bin/bash --- use the bash shell
- #PBS -N EMAN -- name the job EMAN (you can name it what you like)
- #PBS -l nodes=8:ncpus=4 --- Use 8 nodes on the cluster, with 4 processors per node
- cd $PBS_O_WORKDIR --- This is needed so that script will be run in the current directory, rather than your home directory
- refine 8 ang=8.5 ... proc=32 -- proc= command means use 32 processors (8 nodes * 4 CPU per node)
To submit the script, if it is named emanrefine.sh

   > qsub emanrefine.sh

SPIDER

There are two versions of spider on the cluster:
- spidermp is the "normal" version of SPIDER: multiprocessor-aware, but it cannot work across nodes (i.e. it uses at most 4 CPUs at once)
- spidermpi is the MPI-aware version of SPIDER and can use the entire cluster
  - only a small subset of commands is MPI-aware and working, so use this version only for the following commands:
    - ap nq
    - ap mq
    - ap sh
    - ap ref -- THIS COMMAND DOES NOT WORK IN MPI MODE, EVEN THOUGH IT SHOULD
Usually, the whole cluster is only used in the refinement process
- A (modified) set of refinement scripts is attached to this page; these can be used on the cluster for refinement according to the SPIDER documentation

pubsub does not work on the cluster, but it can be replaced with qsub, as follows:

Using PBS with SPIDER to replace pubsub:

PBS can be used to speed up loops over processes that are not multiprocessor-aware
Example of a SPIDER script that speeds up a non-linear filter
Note that this script is designed so that each node runs one process. Each process uses up to 6 GB of memory, so the place=scatter option is used

filter.spi

;========================input
fr l
[tomo]tomo1_crop.spd
;=========output=================
fr l
[filt]filt
;=========params==============
x90=60          ; cycles
x91=0.1         ; delta t
x92=0           ; sigma
;x93=0.05       ; lambda

;=============================
;---------- following set of commands sets the main qsub script for the cluster, one script per loop event
vm
echo "#!/bin/sh" > qsub-header.txt
vm
echo "#PBS -l select=ncpus=1:mem=6GB " >> qsub-header.txt
vm
echo "#PBS -l place=scatter" >> qsub-header.txt
vm
echo "cd /usr/data/asiebert/30_04_08_Rbs/tomo1" >> qsub-header.txt

;-----------------------------------------------

do lb1 x10=1,10                 ; this loop only makes the scripts using a "slave" script as template

  RR x93             ; adjust lambda according to the following sequence
  0.01,0.02,0.05,0.1,0.2,0.5,1,2,5,10                               ;  a nice log sequence

   vm
   echo "x93={**X93}" > filtscript{**x10}.$PRJEXT
   vm
   echo "x10={**X10}" >> filtscript{**x10}.$PRJEXT
   vm
   cat spider_script_filt1_clu_slave.$PRJEXT >> filtscript{**x10}.$PRJEXT
   vm
   cp qsub-header.txt qsub-script{**x10}.txt
   vm
   echo "/usr/apps/spidermp $PRJEXT/$DATEXT @filtscript{**x10}" >> qsub-script{**x10}.txt
   vm
   qsub qsub-script{**x10}.txt
lb1

en      ; end the main script

The script which actually does the work:

spider_script_filt1_clu_slave.spi

;========================input
fr l
[tomo]tomo1_crop.spd
;=========output=================
fr l
[filt]filt
;=========params==============
x90=60          ; cycles
x91=0.1         ; delta t
x92=0           ; sigma
;x93=0.05       ; lambda

;=============================

   ce ad
   [tomo]
   [filt]{***x10}
   HEG
   x90
   x91
   x92,x93
en d

To run the script:
- spider spi/spd @filter
Even better, make a qsub submission script for filter.spi that uses one CPU on one node, and submit it

Protomo

Protomo software will only use one CPU, but you can process multiple tomograms at once using the cluster
Use the qsub submission process, and make a qsub script for each command
sample refinement script:

#! /bin/sh
#PBS -l nodes=1:ncpus=1
#PBS -N PROTOMO
cd $PBS_O_WORKDIR

cat gr668pt50t.param | sed 's/guess=false/guess=true/' > gr668pt50t-01.param
cp gr668pt50t-ali.tlt gr668pt50t-01-itr.tlt
tomo-refine.sh gr668pt50t-01.param >& refine-01.log
tomo-fit.sh gr668pt50t-01.param
cp gr668pt50t-01.param gr668pt50t-02.param
cp gr668pt50t-01-fitted.tlt gr668pt50t-02-itr.tlt
tomo-refine.sh gr668pt50t-02.param >& refine-02.log
tomo-fit.sh gr668pt50t-02.param
cat gr668pt50t-02.param | sed 's/guess=true/guess=false/' > gr668pt50t-03.param
cp gr668pt50t-02-fitted.tlt gr668pt50t-03-itr.tlt
tomo-refine.sh gr668pt50t-03.param >& refine-03.log
tomo-fit.sh gr668pt50t-03.param
cp gr668pt50t-03.param gr668pt50t-04.param
cp gr668pt50t-03-fitted.tlt gr668pt50t-04-itr.tlt
tomo-refine.sh gr668pt50t-04.param >& refine-04.log
tomo-fit.sh gr668pt50t-04.param
cp gr668pt50t-04.param gr668pt50t-05.param
cp gr668pt50t-04-fitted.tlt gr668pt50t-05-itr.tlt
tomo-refine.sh gr668pt50t-05.param >& refine-05.log
tomo-fit.sh gr668pt50t-05.param
cat gr668pt50t-05.param | sed 's/cormod=xcf/cormod=mcf/' > gr668pt50t-06.param
cp gr668pt50t-05-fitted.tlt gr668pt50t-06-itr.tlt
tomo-refine.sh gr668pt50t-06.param >& refine-06.log
tomo-fit.sh gr668pt50t-06.param
cp gr668pt50t-06.param gr668pt50t-07.param
cp gr668pt50t-06-fitted.tlt gr668pt50t-07-itr.tlt
tomo-refine.sh gr668pt50t-07.param >& refine-07.log
tomo-fit.sh gr668pt50t-07.param
cp gr668pt50t-07.param gr668pt50t-08.param
cp gr668pt50t-07-fitted.tlt gr668pt50t-08-itr.tlt
tomo-refine.sh gr668pt50t-08.param >& refine-08.log
tomo-fit.sh gr668pt50t-08.param
cp gr668pt50t-08.param gr668pt50t-09.param
cp gr668pt50t-08-fitted.tlt gr668pt50t-09-itr.tlt
tomo-refine.sh gr668pt50t-09.param >& refine-09.log
tomo-fit.sh gr668pt50t-09.param
cp gr668pt50t-09.param gr668pt50t-10.param
cp gr668pt50t-09-fitted.tlt gr668pt50t-10-itr.tlt
tomo-refine.sh gr668pt50t-10.param >& refine-10.log
tomo-fit.sh gr668pt50t-10.param
cat gr668pt50t-10.param | sed 's/cormod=mcf/cormod=pcf/' > gr668pt50t-11.param
cp gr668pt50t-10-fitted.tlt gr668pt50t-11-itr.tlt
tomo-refine.sh gr668pt50t-11.param >& refine-11.log
tomo-fit.sh gr668pt50t-11.param
cp gr668pt50t-11.param gr668pt50t-12.param
cp gr668pt50t-11-fitted.tlt gr668pt50t-12-itr.tlt
tomo-refine.sh gr668pt50t-12.param >& refine-12.log
tomo-fit.sh gr668pt50t-12.param
cp gr668pt50t-12.param gr668pt50t-13.param
cp gr668pt50t-12-fitted.tlt gr668pt50t-13-itr.tlt
tomo-refine.sh gr668pt50t-13.param >& refine-13.log
tomo-fit.sh gr668pt50t-13.param
cp gr668pt50t-13.param gr668pt50t-14.param
cp gr668pt50t-13-fitted.tlt gr668pt50t-14-itr.tlt
tomo-refine.sh gr668pt50t-14.param >& refine-14.log
tomo-fit.sh gr668pt50t-14.param
cp gr668pt50t-14.param gr668pt50t-15.param
cp gr668pt50t-14-fitted.tlt gr668pt50t-15-itr.tlt
tomo-refine.sh gr668pt50t-15.param >& refine-15.log
tomo-fit.sh gr668pt50t-15.param
cp gr668pt50t-15.param gr668pt50t-16.param
cp gr668pt50t-15-fitted.tlt gr668pt50t-16-itr.tlt
tomo-refine.sh gr668pt50t-16.param >& refine-16.log
tomo-fit.sh gr668pt50t-16.param

Notes on script:
- #PBS -l nodes=1:ncpus=1 --- Only use one CPU on one node
- #PBS -N PROTOMO --- name the job PROTOMO
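
The 16 iterations above follow a fixed pattern, so the command list can be generated rather than typed. The sketch below only prints the commands (a dry run); the function name print_refine_steps is hypothetical, and the iterations at which guess and cormod change (1, 3, 6, 11) are read off the sample script. It writes the log redirection as "> file 2>&1", which is the portable sh spelling of ">&".

```shell
#!/bin/sh
# Print the 16-step Protomo refinement sequence for one tilt series.
# Dry run: commands are printed, not executed.
# print_refine_steps is a hypothetical helper name.
print_refine_steps() {
    base=$1
    i=1
    while [ "$i" -le 16 ]; do
        cur=$(printf '%02d' "$i")
        prev=$(printf '%02d' $((i - 1)))
        # Parameter file: edited at iterations 1, 3, 6 and 11, copied otherwise.
        case $i in
            1)  echo "sed 's/guess=false/guess=true/' ${base}.param > ${base}-${cur}.param" ;;
            3)  echo "sed 's/guess=true/guess=false/' ${base}-${prev}.param > ${base}-${cur}.param" ;;
            6)  echo "sed 's/cormod=xcf/cormod=mcf/' ${base}-${prev}.param > ${base}-${cur}.param" ;;
            11) echo "sed 's/cormod=mcf/cormod=pcf/' ${base}-${prev}.param > ${base}-${cur}.param" ;;
            *)  echo "cp ${base}-${prev}.param ${base}-${cur}.param" ;;
        esac
        # Tilt file: start from the aligned series, then from the previous fit.
        if [ "$i" -eq 1 ]; then
            echo "cp ${base}-ali.tlt ${base}-${cur}-itr.tlt"
        else
            echo "cp ${base}-${prev}-fitted.tlt ${base}-${cur}-itr.tlt"
        fi
        echo "tomo-refine.sh ${base}-${cur}.param > refine-${cur}.log 2>&1"
        echo "tomo-fit.sh ${base}-${cur}.param"
        i=$((i + 1))
    done
}
```

Calling print_refine_steps gr668pt50t and appending the output to the four header lines of the sample script would give an equivalent job file.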
Sample script for calculating tomogram:

#! /bin/sh
#PBS -l nodes=1:ncpus=1

cd $PBS_O_WORKDIR
tomo-map.sh gr668pt50t-02.param gr668pt50t-02-fitted.tlt

cd $PBS_O_WORKDIR --- required to change to the working directory (the directory the job was submitted from)
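
Since Protomo uses one CPU per job, several tomograms can be calculated at once by writing one such script per tilt series. The sketch below prints a job script for a given base name and iteration; tomo_map_job and the job name TOMOMAP are hypothetical, and the file naming follows the sample above.

```shell
#!/bin/sh
# Print a one-CPU PBS script that calculates one tomogram.
# tomo_map_job and the job name TOMOMAP are hypothetical names.
tomo_map_job() {
    base=$1   # tilt-series base name, e.g. gr668pt50t
    iter=$2   # two-digit refinement iteration, e.g. 02
    # \$PBS_O_WORKDIR is escaped so PBS expands it when the job runs.
    cat <<EOF
#!/bin/sh
#PBS -l nodes=1:ncpus=1
#PBS -N TOMOMAP
cd \$PBS_O_WORKDIR
tomo-map.sh ${base}-${iter}.param ${base}-${iter}-fitted.tlt
EOF
}
```

Redirect the output to a file per tilt series and submit each file with qsub; the jobs then run independently on separate CPUs.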

Set ALLOWTOPICVIEW =

-- BillRice - 15 Aug 2008

Revision 815 Aug 2008 - Main.BillRice

META TOPICPARENT	name="CemIT"

Contents

General guidelines

home directory has 20 GB space allocated to all of CEM computing
/usr/data has 100 GB of space allocated to all of CEM computing
both disks operate at the same speed
In general, do most of your processing on /usr/data, since there is more space available
Please remove data after processing so that others will have space to work

Batch submission system

We have PBS (commercial version) for handling jobs
mpirun is also installed for MPI-aware applications (SPIDER is the only MPI-aware EM software)

EMAN

EMAN will work with the batch submission software (PBS) installed on the cluster
Need to make a batch submission script for cluster use
sample refinement script:

#!/bin/bash

#PBS -N EMAN
#PBS -l nodes=8:ncpus=4
cd $PBS_O_WORKDIR

refine 8 ang=8.5 mask=38 pad=120 hard=25 classkeep=0.8 classiter=8 sym=c1 phasecls median proc=32

Notes on script:
- #!/bin/bash --- use the bash shell
- #PBS -N EMAN -- name the job EMAN (you can name it what you like)
- #PBS -l nodes=8:ncpus=4 --- Use 8 nodes on the cluster, with 4 processors per node
- cd $PBS_O_WORKDIR --- This is needed so that script will be run in the current directory, rather than your home directory
- refine 8 ang=8.5 ... proc=32 -- proc= command means use 32 processors (8 nodes * 4 CPU per node)
To submit the script, if it is named emanrefine.sh

   > qsub emanrefine.sh
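
The proc= value must equal nodes times CPUs per node, so it can be computed rather than typed by hand. The sketch below prints the sample script for a given node and CPU count; the function name eman_job is hypothetical, and the refine options are copied unchanged from the sample above.

```shell
#!/bin/bash
# Print an EMAN refinement job whose proc= matches the PBS request.
# eman_job is a hypothetical helper name.
eman_job() {
    nodes=$1
    ncpus=$2
    proc=$((nodes * ncpus))   # e.g. 8 nodes * 4 CPUs = 32
    # \$PBS_O_WORKDIR is escaped so PBS expands it when the job runs.
    cat <<EOF
#!/bin/bash
#PBS -N EMAN
#PBS -l nodes=${nodes}:ncpus=${ncpus}
cd \$PBS_O_WORKDIR

refine 8 ang=8.5 mask=38 pad=120 hard=25 classkeep=0.8 classiter=8 sym=c1 phasecls median proc=${proc}
EOF
}
```

For example, eman_job 8 4 reproduces the sample script, and eman_job 4 4 requests half the cluster with proc=16.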

SPIDER

There are two versions of spider on the cluster:
- spidermp is the "normal" version of SPIDER: multiprocessor-aware, but it cannot work across nodes (i.e. it uses at most 4 CPUs at once)
- spidermpi is the MPI-aware version of SPIDER and can use the entire cluster
  - only a small subset of commands are MPI-aware and working, so use this version only for the following commands:
    - ap nq
    - ap mq
    - ap sh
    - ap ref -- THIS COMMAND DOES NOT WORK IN MPI MODE, EVEN THOUGH IT SHOULD
Usually, the whole cluster is only used in the refinement process
- A (modified) set of refinement scripts is attached to this page; these can be used on the cluster for refinement according to the SPIDER documentation

pubsub does not work on the cluster, but it can be replaced with qsub, as follows:

Using PBS with SPIDER to replace pubsub:

PBS can be used to speed up loops over processes that are not multiprocessor-aware
Example: a SPIDER script to speed up a non-linear filter
Note that this script is designed so that each node runs one process. Each process uses up to 6 GB, so the place=scatter option is used

Changed:

<
<

filter.spi---------------------

>
>

filter.spi

;========================input
fr l
[tomo]tomo1_crop.spd
;=========output=================
fr l
[filt]filt
;=========params==============
x90=60          ; cycles
x91=0.1         ; delta t
x92=0           ; sigma
;x93=0.05       ; lambda

;=============================
;---------- following set of commands sets the main qsub script for the cluster, one script per loop event
vm
echo "#!/bin/sh" > qsub-header.txt
vm
echo "#PBS -l select=ncpus=1:mem=6GB " >> qsub-header.txt
vm
echo "#PBS -l place=scatter" >> qsub-header.txt
vm
echo "cd /usr/data/asiebert/30_04_08_Rbs/tomo1" >> qsub-header.txt

;-----------------------------------------------

do lb1 x10=1,10                 ; this loop only makes the scripts using a "slave" script as template

  RR x93             ; adjust lambda according to the following sequence
  0.01,0.02,0.05,0.1,0.2,0.5,1,2,5,10                               ;  a nice log sequence

   vm
   echo "x93={**X93}" > filtscript{**x10}.$PRJEXT
   vm
   echo "x10={**X10}" >> filtscript{**x10}.$PRJEXT
   vm
   cat spider_script_filt1_clu_slave.$PRJEXT >> filtscript{**x10}.$PRJEXT
   vm
   cp qsub-header.txt qsub-script{**x10}.txt
   vm
   echo "/usr/apps/spidermp $PRJEXT/$DATEXT @filtscript{**x10}" >> qsub-script{**x10}.txt
   vm
   qsub qsub-script{**x10}.txt
lb1

en      ; end the main script

The script which actually does the work:

Changed:

<
<

spider_script_filt1_clu_slave.spi----------------------------------------------

>
>

spider_script_filt1_clu_slave.spi

;========================input
fr l
[tomo]tomo1_crop.spd
;=========output=================
fr l
[filt]filt
;=========params==============
x90=60          ; cycles
x91=0.1         ; delta t
x92=0           ; sigma
;x93=0.05       ; lambda

;=============================

   ce ad
   [tomo]
   [filt]{***x10}
   HEG
   x90
   x91
   x92,x93
en d

Added:

>
>

To run the script:
- spider spi/spd @filter
even better, make a qsub submission script for filter.spi, use one CPU and one node, and submit it

Protomo

Protomo software will only use one CPU, but you can process multiple tomograms at once using the cluster
Use the qsub submission process, and make a qsub script for each command
sample refinement script:

#! /bin/sh
#PBS -l nodes=1:ncpus=1
#PBS -N PROTOMO
cd $PBS_O_WORKDIR

cat gr668pt50t.param | sed 's/guess=false/guess=true/' > gr668pt50t-01.param
cp gr668pt50t-ali.tlt gr668pt50t-01-itr.tlt
tomo-refine.sh gr668pt50t-01.param >& refine-01.log
tomo-fit.sh gr668pt50t-01.param
cp gr668pt50t-01.param gr668pt50t-02.param
cp gr668pt50t-01-fitted.tlt gr668pt50t-02-itr.tlt
tomo-refine.sh gr668pt50t-02.param >& refine-02.log
tomo-fit.sh gr668pt50t-02.param
cat gr668pt50t-02.param | sed 's/guess=true/guess=false/' > gr668pt50t-03.param
cp gr668pt50t-02-fitted.tlt gr668pt50t-03-itr.tlt
tomo-refine.sh gr668pt50t-03.param >& refine-03.log
tomo-fit.sh gr668pt50t-03.param
cp gr668pt50t-03.param gr668pt50t-04.param
cp gr668pt50t-03-fitted.tlt gr668pt50t-04-itr.tlt
tomo-refine.sh gr668pt50t-04.param >& refine-04.log
tomo-fit.sh gr668pt50t-04.param
cp gr668pt50t-04.param gr668pt50t-05.param
cp gr668pt50t-04-fitted.tlt gr668pt50t-05-itr.tlt
tomo-refine.sh gr668pt50t-05.param >& refine-05.log
tomo-fit.sh gr668pt50t-05.param
cat gr668pt50t-05.param | sed 's/cormod=xcf/cormod=mcf/' > gr668pt50t-06.param
cp gr668pt50t-05-fitted.tlt gr668pt50t-06-itr.tlt
tomo-refine.sh gr668pt50t-06.param >& refine-06.log
tomo-fit.sh gr668pt50t-06.param
cp gr668pt50t-06.param gr668pt50t-07.param
cp gr668pt50t-06-fitted.tlt gr668pt50t-07-itr.tlt
tomo-refine.sh gr668pt50t-07.param >& refine-07.log
tomo-fit.sh gr668pt50t-07.param
cp gr668pt50t-07.param gr668pt50t-08.param
cp gr668pt50t-07-fitted.tlt gr668pt50t-08-itr.tlt
tomo-refine.sh gr668pt50t-08.param >& refine-08.log
tomo-fit.sh gr668pt50t-08.param
cp gr668pt50t-08.param gr668pt50t-09.param
cp gr668pt50t-08-fitted.tlt gr668pt50t-09-itr.tlt
tomo-refine.sh gr668pt50t-09.param >& refine-09.log
tomo-fit.sh gr668pt50t-09.param
cp gr668pt50t-09.param gr668pt50t-10.param
cp gr668pt50t-09-fitted.tlt gr668pt50t-10-itr.tlt
tomo-refine.sh gr668pt50t-10.param >& refine-10.log
tomo-fit.sh gr668pt50t-10.param
cat gr668pt50t-10.param | sed 's/cormod=mcf/cormod=pcf/' > gr668pt50t-11.param
cp gr668pt50t-10-fitted.tlt gr668pt50t-11-itr.tlt
tomo-refine.sh gr668pt50t-11.param >& refine-11.log
tomo-fit.sh gr668pt50t-11.param
cp gr668pt50t-11.param gr668pt50t-12.param
cp gr668pt50t-11-fitted.tlt gr668pt50t-12-itr.tlt
tomo-refine.sh gr668pt50t-12.param >& refine-12.log
tomo-fit.sh gr668pt50t-12.param
cp gr668pt50t-12.param gr668pt50t-13.param
cp gr668pt50t-12-fitted.tlt gr668pt50t-13-itr.tlt
tomo-refine.sh gr668pt50t-13.param >& refine-13.log
tomo-fit.sh gr668pt50t-13.param
cp gr668pt50t-13.param gr668pt50t-14.param
cp gr668pt50t-13-fitted.tlt gr668pt50t-14-itr.tlt
tomo-refine.sh gr668pt50t-14.param >& refine-14.log
tomo-fit.sh gr668pt50t-14.param
cp gr668pt50t-14.param gr668pt50t-15.param
cp gr668pt50t-14-fitted.tlt gr668pt50t-15-itr.tlt
tomo-refine.sh gr668pt50t-15.param >& refine-15.log
tomo-fit.sh gr668pt50t-15.param
cp gr668pt50t-15.param gr668pt50t-16.param
cp gr668pt50t-15-fitted.tlt gr668pt50t-16-itr.tlt
tomo-refine.sh gr668pt50t-16.param >& refine-16.log
tomo-fit.sh gr668pt50t-16.param

Notes on script:
- #PBS -l nodes=1:ncpus=1 --- Only use one CPU on one node
- #PBS -N PROTOMO --- name the job PROTOMO
Sample script for calculating tomogram:

#! /bin/sh
#PBS -l nodes=1:ncpus=1

cd $PBS_O_WORKDIR
tomo-map.sh gr668pt50t-02.param gr668pt50t-02-fitted.tlt

cd $PBS_O_WORKDIR --- required to change to the working directory

Set ALLOWTOPICVIEW =

-- BillRice - 15 Aug 2008

Revision 715 Aug 2008 - Main.BillRice

   META TOPICPARENT   name="CemIT" 

Contents
 
 General guidelines
   Batch submission system 
  EMAN
 
 


 General guidelines 
 
 home directory has 20 GB space allocated to all of CEM computing
  /usr/data has 100 GB of space allocated to all of CEM computing
  both disks operate at the same speed
  In general, do most of your processing on /usr/data, since there is more space available
  Please remove data after processing so that others will have space to work
 

 Batch submission system 
 
 We have PBS (commercial version) for handling of jobs
  mpirun is also installed for MPI-aware applications (SPIDER is the only MPI-aware EM software)
 

 EMAN 
 
 EMAN will work with the batch submission software (PBS) installed on the cluster
  Need to make a batch submission script for cluster use
Changed:

<
<

  sample

>
>

  sample refinement script:
 #!/bin/bash

#PBS -N EMAN
#PBS -l nodes=8:ncpus=4
cd $PBS_O_WORKDIR

refine 8 ang=8.5 mask=38 pad=120 hard=25 classkeep=0.8 classiter=8 sym=c1 phasecls median proc=32

 
 Notes on script: 
 #!/bin/bash    --- use the bash shell
  #PBS -N EMAN  -- name the job EMAN (you can name it what you like)
  #PBS -l nodes=8:ncpus=4  --- Use 8 nodes on the cluster, with 4 processors per node
  cd $PBS_O_WORKDIR   ---  This is needed so that script will be run in the current directory, rather than your home directory
  refine 8 ang=8.5 ... proc=32    -- proc= command means use 32 processors (8 nodes * 4 CPU per node)
 
  To submit the script, if it is named emanrefine.sh
 
   > qsub emanrefine.sh


 SPIDER 
 
 There are two versions of spider on the cluster: 
 spidermp is the "normal" version of SPIDER: multiprocessor-aware, but it cannot work across nodes (i.e. it uses at most 4 CPUs at once)
  spidermpi is the MPI-aware version of SPIDER and can use the entire cluster
 only a small subset of commands are MPI-aware and working, so use this version only for the following commands:
 ap nq
  ap mq
  ap sh
  ap ref -- THIS COMMAND DOES NOT WORK IN MPI MODE, EVEN THOUGH IT SHOULD
 
 
 
  Usually, the whole cluster is only used in the refinement process
  A (modified) set of refinement scripts is attached to this page; these can be used on the cluster for refinement according to the SPIDER documentation
Changed:

<
<

 pubsub does not work on the cluster, but it can be replaced with qsub, as follows

>
>

 pubsub does not work on the cluster, but it can be replaced with qsub, as follows:
  Using PBS with SPIDER to replace pubsub: 
 
 Can use pbs to speed up loops on non-mp aware processes
  example of a spider script to speed up a non-linear filter
Changed:

<
<

 Note that this sccript is designed to run such that each node runs one process. This process uses up to 6 GB, so the place=scatter command is used

>
>

 Note that this script is designed to run such that each node runs one process. This process uses up to 6 GB, so the place=scatter command is used

Added:

>
>

filter.spi---------------------
 ;========================input
fr l
[tomo]tomo1_crop.spd
;=========output=================
fr l
[filt]filt
;=========params==============
x90=60          ; cycles
x91=0.1         ; delta t
x92=0           ; sigma
;x93=0.05       ; lambda

;=============================
;---------- following set of commands sets the main qsub script for the cluster, one script per loop event
vm
echo "#!/bin/sh" > qsub-header.txt
vm
echo "#PBS -l select=ncpus=1:mem=6GB " >> qsub-header.txt
vm
echo "#PBS -l place=scatter" >> qsub-header.txt
vm
echo "cd /usr/data/asiebert/30_04_08_Rbs/tomo1" >> qsub-header.txt

;-----------------------------------------------

do lb1 x10=1,10                 ; this loop only makes the scripts using a "slave" script as template

  RR x93             ; adjust lambda according to the following sequence
  0.01,0.02,0.05,0.1,0.2,0.5,1,2,5,10                               ;  a nice log sequence

   vm
   echo "x93={**X93}" > filtscript{**x10}.$PRJEXT
   vm
   echo "x10={**X10}" >> filtscript{**x10}.$PRJEXT
   vm
   cat spider_script_filt1_clu_slave.$PRJEXT >> filtscript{**x10}.$PRJEXT
   vm
   cp qsub-header.txt qsub-script{**x10}.txt
   vm
   echo "/usr/apps/spidermp $PRJEXT/$DATEXT @filtscript{**x10}" >> qsub-script{**x10}.txt
   vm
   qsub qsub-script{**x10}.txt
lb1

en      ; end the main script                                                                  
 
 
 The script which actually does the work:
Added:

>
>

spider_script_filt1_clu_slave.spi----------------------------------------------
 ;========================input
fr l
[tomo]tomo1_crop.spd
;=========output=================
fr l
[filt]filt
;=========params==============
x90=60          ; cycles
x91=0.1         ; delta t
x92=0           ; sigma
;x93=0.05       ; lambda

;=============================

   ce ad
   [tomo]
   [filt]{***x10}
   HEG
   x90
   x91
   x92,x93
en d




 Protomo 
 

 Protomo software will only use one CPU, but you can process multiple tomograms at once using the cluster
  Use the qsub submission process, and make a qsub script for each command
  sample refinement script: 

 
#! /bin/sh
#PBS -l nodes=1:ncpus=1
#PBS -N PROTOMO
cd $PBS_O_WORKDIR

cat gr668pt50t.param | sed 's/guess=false/guess=true/' > gr668pt50t-01.param
cp gr668pt50t-ali.tlt gr668pt50t-01-itr.tlt
tomo-refine.sh gr668pt50t-01.param >& refine-01.log
tomo-fit.sh gr668pt50t-01.param
cp gr668pt50t-01.param gr668pt50t-02.param
cp gr668pt50t-01-fitted.tlt gr668pt50t-02-itr.tlt
tomo-refine.sh gr668pt50t-02.param >& refine-02.log
tomo-fit.sh gr668pt50t-02.param
cat gr668pt50t-02.param | sed 's/guess=true/guess=false/' > gr668pt50t-03.param
cp gr668pt50t-02-fitted.tlt gr668pt50t-03-itr.tlt
tomo-refine.sh gr668pt50t-03.param >& refine-03.log
tomo-fit.sh gr668pt50t-03.param
cp gr668pt50t-03.param gr668pt50t-04.param
cp gr668pt50t-03-fitted.tlt gr668pt50t-04-itr.tlt
tomo-refine.sh gr668pt50t-04.param >& refine-04.log
tomo-fit.sh gr668pt50t-04.param
cp gr668pt50t-04.param gr668pt50t-05.param
cp gr668pt50t-04-fitted.tlt gr668pt50t-05-itr.tlt
tomo-refine.sh gr668pt50t-05.param >& refine-05.log
tomo-fit.sh gr668pt50t-05.param
cat gr668pt50t-05.param | sed 's/cormod=xcf/cormod=mcf/' > gr668pt50t-06.param
cp gr668pt50t-05-fitted.tlt gr668pt50t-06-itr.tlt
tomo-refine.sh gr668pt50t-06.param >& refine-06.log
tomo-fit.sh gr668pt50t-06.param
cp gr668pt50t-06.param gr668pt50t-07.param
cp gr668pt50t-06-fitted.tlt gr668pt50t-07-itr.tlt
tomo-refine.sh gr668pt50t-07.param >& refine-07.log
tomo-fit.sh gr668pt50t-07.param
cp gr668pt50t-07.param gr668pt50t-08.param
cp gr668pt50t-07-fitted.tlt gr668pt50t-08-itr.tlt
tomo-refine.sh gr668pt50t-08.param >& refine-08.log
tomo-fit.sh gr668pt50t-08.param
cp gr668pt50t-08.param gr668pt50t-09.param
cp gr668pt50t-08-fitted.tlt gr668pt50t-09-itr.tlt
tomo-refine.sh gr668pt50t-09.param >& refine-09.log
tomo-fit.sh gr668pt50t-09.param
cp gr668pt50t-09.param gr668pt50t-10.param
cp gr668pt50t-09-fitted.tlt gr668pt50t-10-itr.tlt
tomo-refine.sh gr668pt50t-10.param >& refine-10.log
tomo-fit.sh gr668pt50t-10.param
cat gr668pt50t-10.param | sed 's/cormod=mcf/cormod=pcf/' > gr668pt50t-11.param
cp gr668pt50t-10-fitted.tlt gr668pt50t-11-itr.tlt
tomo-refine.sh gr668pt50t-11.param >& refine-11.log
tomo-fit.sh gr668pt50t-11.param
cp gr668pt50t-11.param gr668pt50t-12.param
cp gr668pt50t-11-fitted.tlt gr668pt50t-12-itr.tlt
tomo-refine.sh gr668pt50t-12.param >& refine-12.log
tomo-fit.sh gr668pt50t-12.param
cp gr668pt50t-12.param gr668pt50t-13.param
cp gr668pt50t-12-fitted.tlt gr668pt50t-13-itr.tlt
tomo-refine.sh gr668pt50t-13.param >& refine-13.log
tomo-fit.sh gr668pt50t-13.param
cp gr668pt50t-13.param gr668pt50t-14.param
cp gr668pt50t-13-fitted.tlt gr668pt50t-14-itr.tlt
tomo-refine.sh gr668pt50t-14.param >& refine-14.log
tomo-fit.sh gr668pt50t-14.param
cp gr668pt50t-14.param gr668pt50t-15.param
cp gr668pt50t-14-fitted.tlt gr668pt50t-15-itr.tlt
tomo-refine.sh gr668pt50t-15.param >& refine-15.log
tomo-fit.sh gr668pt50t-15.param
cp gr668pt50t-15.param gr668pt50t-16.param
cp gr668pt50t-15-fitted.tlt gr668pt50t-16-itr.tlt
tomo-refine.sh gr668pt50t-16.param >& refine-16.log
tomo-fit.sh gr668pt50t-16.param

 
 Notes on script: 
 #PBS -l nodes=1:ncpus=1  --- Only use one CPU on one node
  #PBS -N PROTOMO  --- name the job PROTOMO
 
  Sample script for calculating tomogram: 

 
#! /bin/sh
#PBS -l nodes=1:ncpus=1

cd $PBS_O_WORKDIR
tomo-map.sh gr668pt50t-02.param gr668pt50t-02-fitted.tlt




 

 cd $PBS_O_WORKDIR   --- required to change to the working directory
 
 

 Set ALLOWTOPICVIEW = 
 

-- BillRice - 15 Aug 2008

Revision 615 Aug 2008 - Main.BillRice

   META TOPICPARENT   name="CemIT" 

Contents
 
 General guidelines
   Batch submission system
 


 General guidelines 
 
 home directory has 20 GB space allocated to all of CEM computing
  /usr/data has 100 GB of space allocated to all of CEM computing
  both disks operate at the same speed
  In general, do most of your processing on /usr/data, since there is more space available
  Please remove data after processing so that others will have space to work
 

 Batch submission system 
 
 We have PBS (commercial version) for handling of jobs
  mpirun is also installed for MPI-aware applications (SPIDER is the only MPI-aware EM software)

Changed:

<
<

 EMAN

>
>

 EMAN
  EMAN will work with the batch submission software (PBS) installed on the cluster
  Need to make a batch submission script for cluster use
  sample 

 
#!/bin/bash

#PBS -N EMAN
#PBS -l nodes=8:ncpus=4
cd $PBS_O_WORKDIR

refine 8 ang=8.5 mask=38 pad=120 hard=25 classkeep=0.8 classiter=8 sym=c1 phasecls median proc=32

 
 Notes on script: 
 #!/bin/bash    --- use the bash shell
  #PBS -N EMAN  -- name the job EMAN (you can name it what you like)
  #PBS -l nodes=8:ncpus=4  --- Use 8 nodes on the cluster, with 4 processors per node
  cd $PBS_O_WORKDIR   ---  This is needed so that script will be run in the current directory, rather than your home directory
  refine 8 ang=8.5 ... proc=32    -- proc= command means use 32 processors (8 nodes * 4 CPU per node)
 
  To submit the script, if it is named emanrefine.sh
 
   > qsub emanrefine.sh
Changed:

<
<

 SPIDER
 Protomo

>
>

 SPIDER
 
 There are two versions of spider on the cluster:
Added:

>
>

 spidermp is the "normal" version of SPIDER: multiprocessor-aware, but it cannot work across nodes (i.e. it uses at most 4 CPUs at once)
  spidermpi is the MPI-aware version of SPIDER and can use the entire cluster
 only a small subset of commands are MPI-aware and working, so use this version only for the following commands:
 ap nq
  ap mq
  ap sh
  ap ref -- THIS COMMAND DOES NOT WORK IN MPI MODE, EVEN THOUGH IT SHOULD
 
 
 
  Usually, the whole cluster is only used in the refinement process
  A (modified) set of refinement scripts is attached to this page; these can be used on the cluster for refinement according to the SPIDER documentation
 
 

 pubsub does not work on the cluster, but it can be replaced with qsub, as follows
 
 Using PBS with SPIDER to replace pubsub: 
 
 Can use pbs to speed up loops on non-mp aware processes
  example of a spider script to speed up a non-linear filter
  Note that this sccript is designed to run such that each node runs one process. This process uses up to 6 GB, so the place=scatter command is used 
 
;========================input
fr l
[tomo]tomo1_crop.spd
;=========output=================
fr l
[filt]filt
;=========params==============
x90=60          ; cycles
x91=0.1         ; delta t
x92=0           ; sigma
;x93=0.05       ; lambda

;=============================
;---------- following set of commands sets the main qsub script for the cluster, one script per loop event
vm
echo "#!/bin/sh" > qsub-header.txt
vm
echo "#PBS -l select=ncpus=1:mem=6GB " >> qsub-header.txt
vm
echo "#PBS -l place=scatter" >> qsub-header.txt
vm
echo "cd /usr/data/asiebert/30_04_08_Rbs/tomo1" >> qsub-header.txt

;-----------------------------------------------

do lb1 x10=1,10                 ; this loop only makes the scripts using a "slave" script as template

  RR x93             ; adjust lambda according to the following sequence
  0.01,0.02,0.05,0.1,0.2,0.5,1,2,5,10                               ;  a nice log sequence

   vm
   echo "x93={**X93}" > filtscript{**x10}.$PRJEXT
   vm
   echo "x10={**X10}" >> filtscript{**x10}.$PRJEXT
   vm
   cat spider_script_filt1_clu_slave.$PRJEXT >> filtscript{**x10}.$PRJEXT
   vm
   cp qsub-header.txt qsub-script{**x10}.txt
   vm
   echo "/usr/apps/spidermp $PRJEXT/$DATEXT @filtscript{**x10}" >> qsub-script{**x10}.txt
   vm
   qsub qsub-script{**x10}.txt
lb1

en      ; end the main script                                                                  
 
 
 The script which actually does the work:
 
;========================input
fr l
[tomo]tomo1_crop.spd
;=========output=================
fr l
[filt]filt
;=========params==============
x90=60          ; cycles
x91=0.1         ; delta t
x92=0           ; sigma
;x93=0.05       ; lambda

;=============================

   ce ad
   [tomo]
   [filt]{***x10}
   HEG
   x90
   x91
   x92,x93
en d




 Protomo
  Protomo software will only use one CPU, but you can process multiple tomograms at once using the cluster
  Use the qsub submission process, and make a qsub script for each command
  sample refinement script: 

 
#! /bin/sh
#PBS -l nodes=1:ncpus=1
#PBS -N PROTOMO
cd $PBS_O_WORKDIR

cat gr668pt50t.param | sed 's/guess=false/guess=true/' > gr668pt50t-01.param
cp gr668pt50t-ali.tlt gr668pt50t-01-itr.tlt
tomo-refine.sh gr668pt50t-01.param >& refine-01.log
tomo-fit.sh gr668pt50t-01.param
cp gr668pt50t-01.param gr668pt50t-02.param
cp gr668pt50t-01-fitted.tlt gr668pt50t-02-itr.tlt
tomo-refine.sh gr668pt50t-02.param >& refine-02.log
tomo-fit.sh gr668pt50t-02.param
cat gr668pt50t-02.param | sed 's/guess=true/guess=false/' > gr668pt50t-03.param
cp gr668pt50t-02-fitted.tlt gr668pt50t-03-itr.tlt
tomo-refine.sh gr668pt50t-03.param >& refine-03.log
tomo-fit.sh gr668pt50t-03.param
cp gr668pt50t-03.param gr668pt50t-04.param
cp gr668pt50t-03-fitted.tlt gr668pt50t-04-itr.tlt
tomo-refine.sh gr668pt50t-04.param >& refine-04.log
tomo-fit.sh gr668pt50t-04.param
cp gr668pt50t-04.param gr668pt50t-05.param
cp gr668pt50t-04-fitted.tlt gr668pt50t-05-itr.tlt
tomo-refine.sh gr668pt50t-05.param >& refine-05.log
tomo-fit.sh gr668pt50t-05.param
cat gr668pt50t-05.param | sed 's/cormod=xcf/cormod=mcf/' > gr668pt50t-06.param
cp gr668pt50t-05-fitted.tlt gr668pt50t-06-itr.tlt
tomo-refine.sh gr668pt50t-06.param >& refine-06.log
tomo-fit.sh gr668pt50t-06.param
cp gr668pt50t-06.param gr668pt50t-07.param
cp gr668pt50t-06-fitted.tlt gr668pt50t-07-itr.tlt
tomo-refine.sh gr668pt50t-07.param >& refine-07.log
tomo-fit.sh gr668pt50t-07.param
cp gr668pt50t-07.param gr668pt50t-08.param
cp gr668pt50t-07-fitted.tlt gr668pt50t-08-itr.tlt
tomo-refine.sh gr668pt50t-08.param >& refine-08.log
tomo-fit.sh gr668pt50t-08.param
cp gr668pt50t-08.param gr668pt50t-09.param
cp gr668pt50t-08-fitted.tlt gr668pt50t-09-itr.tlt
tomo-refine.sh gr668pt50t-09.param >& refine-09.log
tomo-fit.sh gr668pt50t-09.param
cp gr668pt50t-09.param gr668pt50t-10.param
cp gr668pt50t-09-fitted.tlt gr668pt50t-10-itr.tlt
tomo-refine.sh gr668pt50t-10.param >& refine-10.log
tomo-fit.sh gr668pt50t-10.param
cat gr668pt50t-10.param | sed 's/cormod=mcf/cormod=pcf/' > gr668pt50t-11.param
cp gr668pt50t-10-fitted.tlt gr668pt50t-11-itr.tlt
tomo-refine.sh gr668pt50t-11.param >& refine-11.log
tomo-fit.sh gr668pt50t-11.param
cp gr668pt50t-11.param gr668pt50t-12.param
cp gr668pt50t-11-fitted.tlt gr668pt50t-12-itr.tlt
tomo-refine.sh gr668pt50t-12.param >& refine-12.log
tomo-fit.sh gr668pt50t-12.param
cp gr668pt50t-12.param gr668pt50t-13.param
cp gr668pt50t-12-fitted.tlt gr668pt50t-13-itr.tlt
tomo-refine.sh gr668pt50t-13.param >& refine-13.log
tomo-fit.sh gr668pt50t-13.param
cp gr668pt50t-13.param gr668pt50t-14.param
cp gr668pt50t-13-fitted.tlt gr668pt50t-14-itr.tlt
tomo-refine.sh gr668pt50t-14.param >& refine-14.log
tomo-fit.sh gr668pt50t-14.param
cp gr668pt50t-14.param gr668pt50t-15.param
cp gr668pt50t-14-fitted.tlt gr668pt50t-15-itr.tlt
tomo-refine.sh gr668pt50t-15.param >& refine-15.log
tomo-fit.sh gr668pt50t-15.param
cp gr668pt50t-15.param gr668pt50t-16.param
cp gr668pt50t-15-fitted.tlt gr668pt50t-16-itr.tlt
tomo-refine.sh gr668pt50t-16.param >& refine-16.log
tomo-fit.sh gr668pt50t-16.param

 
 Notes on script: 
 #PBS -l nodes=1:ncpus=1  --- Only use one CPU on one node
  #PBS -N PROTOMO  --- name the job PROTOMO
 
  Sample script for calculating tomogram: 

 
#! /bin/sh
#PBS -l nodes=1:ncpus=1

cd $PBS_O_WORKDIR
tomo-map.sh gr668pt50t-02.param gr668pt50t-02-fitted.tlt




 

 cd $PBS_O_WORKDIR   --- required to change to the working directory
 
 

 Set ALLOWTOPICVIEW = 
 

-- BillRice - 15 Aug 2008

Revision 515 Aug 2008 - Main.BillRice

   META TOPICPARENT   name="CemIT" 

Contents
 
 General guidelines
   Batch submission system
   EMAN
   SPIDER
   Protomo
 


 General guidelines 
 
 home directory has 20 GB space allocated to all of CEM computing
  /usr/data has 100 GB of space allocated to all of CEM computing
  both disks operate at the same speed
  In general, do most of your processing on /usr/data, since there is more space available
  Please remove data after processing so that others will have space to work
 

 Batch submission system 
 
 We have PBS (commercial version) for handling of jobs
  mpirun is also installed for MPI-aware applications (SPIDER is the only MPI-aware EM software)
 

 EMAN 
 
 EMAN will work with the batch submission software (PBS) installed on the cluster
  Need to make a batch submission script for cluster use
  sample 

 
#!/bin/bash

#PBS -N EMAN
#PBS -l nodes=8:ncpus=4
cd $PBS_O_WORKDIR

refine 8 ang=8.5 mask=38 pad=120 hard=25 classkeep=0.8 classiter=8 sym=c1 phasecls median proc=32

 
 Notes on script: 
 #!/bin/bash    --- use the bash shell
  #PBS -N EMAN  -- name the job EMAN (you can name it what you like)
  #PBS -l nodes=8:ncpus=4  --- Use 8 nodes on the cluster, with 4 processors per node
  cd $PBS_O_WORKDIR   ---  This is needed so that script will be run in the current directory, rather than your home directory
  refine 8 ang=8.5 ... proc=32    -- proc= command means use 32 processors (8 nodes * 4 CPU per node)
 
  To submit the script, if it is named emanrefine.sh
 
   > qsub emanrefine.sh


 SPIDER 
 Protomo 
 

 Protomo software will only use one CPU, but you can process multiple tomograms at once using the cluster
  Use the qsub submission process, and make a qsub script for each command
  sample refinement script: 

 
#! /bin/sh
#PBS -l nodes=1:ncpus=1
#PBS -N PROTOMO
cd $PBS_O_WORKDIR
Changed:

<
<

cat gr668_3_2a80nm11500xgkpt42t.param | sed 's/guess=false/guess=true/' > gr668_3_2a80nm11500xgkpt42t-01.param
cp gr668_3_2a80nm11500xgkpt42t-ali.tlt gr668_3_2a80nm11500xgkpt42t-01-itr.tlt
tomo-refine.sh gr668_3_2a80nm11500xgkpt42t-01.param >& refine-01.log
tomo-fit.sh gr668_3_2a80nm11500xgkpt42t-01.param
cp gr668_3_2a80nm11500xgkpt42t-01.param gr668_3_2a80nm11500xgkpt42t-02.param

...etc

>
>

cat gr668pt50t.param | sed 's/guess=false/guess=true/' > gr668pt50t-01.param
cp gr668pt50t-ali.tlt gr668pt50t-01-itr.tlt
tomo-refine.sh gr668pt50t-01.param >& refine-01.log
tomo-fit.sh gr668pt50t-01.param
cp gr668pt50t-01.param gr668pt50t-02.param
cp gr668pt50t-01-fitted.tlt gr668pt50t-02-itr.tlt
tomo-refine.sh gr668pt50t-02.param >& refine-02.log

Added:

>
>

tomo-fit.sh gr668pt50t-02.param
cat gr668pt50t-02.param | sed 's/guess=true/guess=false/' > gr668pt50t-03.param
cp gr668pt50t-02-fitted.tlt gr668pt50t-03-itr.tlt
tomo-refine.sh gr668pt50t-03.param >& refine-03.log
tomo-fit.sh gr668pt50t-03.param
cp gr668pt50t-03.param gr668pt50t-04.param
cp gr668pt50t-03-fitted.tlt gr668pt50t-04-itr.tlt
tomo-refine.sh gr668pt50t-04.param >& refine-04.log
tomo-fit.sh gr668pt50t-04.param
cp gr668pt50t-04.param gr668pt50t-05.param
cp gr668pt50t-04-fitted.tlt gr668pt50t-05-itr.tlt
tomo-refine.sh gr668pt50t-05.param >& refine-05.log
tomo-fit.sh gr668pt50t-05.param
cat gr668pt50t-05.param | sed 's/cormod=xcf/cormod=mcf/' > gr668pt50t-06.param
cp gr668pt50t-05-fitted.tlt gr668pt50t-06-itr.tlt
tomo-refine.sh gr668pt50t-06.param >& refine-06.log
tomo-fit.sh gr668pt50t-06.param
cp gr668pt50t-06.param gr668pt50t-07.param
cp gr668pt50t-06-fitted.tlt gr668pt50t-07-itr.tlt
tomo-refine.sh gr668pt50t-07.param >& refine-07.log
tomo-fit.sh gr668pt50t-07.param
cp gr668pt50t-07.param gr668pt50t-08.param
cp gr668pt50t-07-fitted.tlt gr668pt50t-08-itr.tlt
tomo-refine.sh gr668pt50t-08.param >& refine-08.log
tomo-fit.sh gr668pt50t-08.param
cp gr668pt50t-08.param gr668pt50t-09.param
cp gr668pt50t-08-fitted.tlt gr668pt50t-09-itr.tlt
tomo-refine.sh gr668pt50t-09.param >& refine-09.log
tomo-fit.sh gr668pt50t-09.param
cp gr668pt50t-09.param gr668pt50t-10.param
cp gr668pt50t-09-fitted.tlt gr668pt50t-10-itr.tlt
tomo-refine.sh gr668pt50t-10.param >& refine-10.log
tomo-fit.sh gr668pt50t-10.param
cat gr668pt50t-10.param | sed 's/cormod=mcf/cormod=pcf/' > gr668pt50t-11.param
cp gr668pt50t-10-fitted.tlt gr668pt50t-11-itr.tlt
tomo-refine.sh gr668pt50t-11.param >& refine-11.log
tomo-fit.sh gr668pt50t-11.param
cp gr668pt50t-11.param gr668pt50t-12.param
cp gr668pt50t-11-fitted.tlt gr668pt50t-12-itr.tlt
tomo-refine.sh gr668pt50t-12.param >& refine-12.log
tomo-fit.sh gr668pt50t-12.param
cp gr668pt50t-12.param gr668pt50t-13.param
cp gr668pt50t-12-fitted.tlt gr668pt50t-13-itr.tlt
tomo-refine.sh gr668pt50t-13.param >& refine-13.log
tomo-fit.sh gr668pt50t-13.param
cp gr668pt50t-13.param gr668pt50t-14.param
cp gr668pt50t-13-fitted.tlt gr668pt50t-14-itr.tlt
tomo-refine.sh gr668pt50t-14.param >& refine-14.log
tomo-fit.sh gr668pt50t-14.param
cp gr668pt50t-14.param gr668pt50t-15.param
cp gr668pt50t-14-fitted.tlt gr668pt50t-15-itr.tlt
tomo-refine.sh gr668pt50t-15.param >& refine-15.log
tomo-fit.sh gr668pt50t-15.param
cp gr668pt50t-15.param gr668pt50t-16.param
cp gr668pt50t-15-fitted.tlt gr668pt50t-16-itr.tlt
tomo-refine.sh gr668pt50t-16.param >& refine-16.log
tomo-fit.sh gr668pt50t-16.param
  Notes on script: 
 #PBS -l nodes=1:ncpus=1  --- Only use one CPU on one node
  #PBS -N PROTOMO  --- name the job PROTOMO
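
The refinement script above repeats the same four steps per iteration and changes a parameter only at iterations 01, 03, 06 and 11. A minimal sketch of a generator for those lines follows. The function name gen_refine_steps is hypothetical (not part of Protomo), it only prints commands, and it writes the log redirect as "> file 2>&1", the portable form of the ">&" used in the sample.

```shell
#!/bin/sh
# Print the refinement commands for iterations 01..$2 of tomogram base name $1.
# Illustrative helper, not part of Protomo: it echoes commands, it does not run them.
gen_refine_steps() {
    base=$1
    last=$2
    i=1
    while [ "$i" -le "$last" ]; do
        cur=$(printf '%02d' "$i")
        prev=$(printf '%02d' $((i - 1)))
        # Parameter file: edit a setting at the iterations where the sample
        # script does so, otherwise copy the previous file unchanged.
        case $i in
            1)  echo "sed 's/guess=false/guess=true/' $base.param > $base-$cur.param" ;;
            3)  echo "sed 's/guess=true/guess=false/' $base-$prev.param > $base-$cur.param" ;;
            6)  echo "sed 's/cormod=xcf/cormod=mcf/' $base-$prev.param > $base-$cur.param" ;;
            11) echo "sed 's/cormod=mcf/cormod=pcf/' $base-$prev.param > $base-$cur.param" ;;
            *)  echo "cp $base-$prev.param $base-$cur.param" ;;
        esac
        # Tilt file: start from the aligned file, then chain the fitted output.
        if [ "$i" -eq 1 ]; then
            echo "cp $base-ali.tlt $base-$cur-itr.tlt"
        else
            echo "cp $base-$prev-fitted.tlt $base-$cur-itr.tlt"
        fi
        echo "tomo-refine.sh $base-$cur.param > refine-$cur.log 2>&1"
        echo "tomo-fit.sh $base-$cur.param"
        i=$((i + 1))
    done
}
```

Running gen_refine_steps gr668pt50t 16 prints command lines equivalent to the 16 iterations of the sample; they can be appended to a job script below the #PBS header and the cd $PBS_O_WORKDIR line.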
 
  Sample script for calculating tomogram: 

 
#! /bin/sh
#PBS -l nodes=1:ncpus=1

cd $PBS_O_WORKDIR
tomo-map.sh gr668pt50t-02.param gr668pt50t-02-fitted.tlt




 

 cd $PBS_O_WORKDIR   --- required to change to the working directory
 
 

 Set ALLOWTOPICVIEW = 
 

-- BillRice - 15 Aug 2008

Revision 4 - 15 Aug 2008 - Main.BillRice

META TOPICPARENT	name="CemIT"

Contents

General guidelines

The home directory has 20 GB of space allocated to all of CEM computing
/usr/data has 100 GB of space allocated to all of CEM computing
Both disks operate at the same speed
In general, do most of your processing on /usr/data, since there is more space available
Please remove data after processing so that others will have space to work

Batch submission system

We have PBS (commercial version) for handling jobs
mpirun is also installed for MPI-aware applications (SPIDER is the only MPI-aware EM software)

EMAN

EMAN will work with the batch submission software (PBS) installed on the cluster
Need to make a batch submission script for cluster use
sample

#!/bin/bash

#PBS -N EMAN
#PBS -l nodes=8:ncpus=4
cd $PBS_O_WORKDIR

refine 8 ang=8.5 mask=38 pad=120 hard=25 classkeep=0.8 classiter=8 sym=c1 phasecls median proc=32

Notes on script:
- #!/bin/bash --- use the bash shell
- #PBS -N EMAN -- name the job EMAN (you can name it what you like)
- #PBS -l nodes=8:ncpus=4 --- Use 8 nodes on the cluster, with 4 processors per node
- cd $PBS_O_WORKDIR --- This is needed so that script will be run in the current directory, rather than your home directory
- refine 8 ang=8.5 ... proc=32 -- proc= command means use 32 processors (8 nodes * 4 CPU per node)
If the script is named emanrefine.sh, submit it with:

   > qsub emanrefine.sh
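
The notes above say that proc= should equal nodes times CPUs per node (8 * 4 = 32). A minimal sketch that keeps the two in step by computing proc= from the PBS request is shown here. The function name write_eman_job is hypothetical, and the refine options are copied unchanged from the sample.

```shell
#!/bin/sh
# Write emanrefine.sh so that proc= always matches the PBS resource request.
# $1 = number of nodes, $2 = CPUs per node.
# Illustrative helper; the refine options are those of the sample script.
write_eman_job() {
    nodes=$1
    ncpus=$2
    proc=$((nodes * ncpus))   # e.g. 8 nodes * 4 CPUs = 32 processors
    cat > emanrefine.sh <<EOF
#!/bin/bash
#PBS -N EMAN
#PBS -l nodes=$nodes:ncpus=$ncpus
cd \$PBS_O_WORKDIR

refine 8 ang=8.5 mask=38 pad=120 hard=25 classkeep=0.8 classiter=8 sym=c1 phasecls median proc=$proc
EOF
}
```

For example, write_eman_job 8 4 followed by qsub emanrefine.sh reproduces the sample request; asking for fewer nodes only requires changing the first argument.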

SPIDER

Protomo

Protomo software will only use one CPU, but you can process multiple tomograms at once using the cluster
Use the qsub submission process, and make a qsub script for each command
sample refinement script:

#! /bin/sh
#PBS -l nodes=1:ncpus=1
#PBS -N PROTOMO

Deleted:

<
<

cd $PBS_O_WORKDIR

cat gr668_3_2a80nm11500xgkpt42t.param | sed 's/guess=false/guess=true/' > gr668_3_2a80nm11500xgkpt42t-01.param
cp gr668_3_2a80nm11500xgkpt42t-ali.tlt gr668_3_2a80nm11500xgkpt42t-01-itr.tlt
tomo-refine.sh gr668_3_2a80nm11500xgkpt42t-01.param >& refine-01.log
tomo-fit.sh gr668_3_2a80nm11500xgkpt42t-01.param
cp gr668_3_2a80nm11500xgkpt42t-01.param gr668_3_2a80nm11500xgkpt42t-02.param

...etc

Notes on script:
- #PBS -l nodes=1:ncpus=1 --- Only use one CPU on one node
- #PBS -N PROTOMO --- name the job PROTOMO
Sample script for calculating tomogram:

#! /bin/sh
#PBS -l nodes=1:ncpus=1

cd $PBS_O_WORKDIR
tomo-map.sh gr668pt50t-02.param gr668pt50t-02-fitted.tlt

cd $PBS_O_WORKDIR --- required to change to the working directory
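
Since Protomo uses one CPU per job, processing several tomograms at once means one job script per tomogram. A minimal sketch of generating such scripts follows. The function name write_map_job and the output name map-<base>.sh are hypothetical conventions; the script body is the tomogram sample above plus the -N PROTOMO job name.

```shell
#!/bin/sh
# Write a one-CPU PBS job script that calculates one tomogram.
# $1 = tomogram base name, $2 = two-digit refinement iteration to map.
# Illustrative helper; prints the name of the script it wrote.
write_map_job() {
    base=$1
    iter=$2
    job="map-$base.sh"
    cat > "$job" <<EOF
#! /bin/sh
#PBS -l nodes=1:ncpus=1
#PBS -N PROTOMO

cd \$PBS_O_WORKDIR
tomo-map.sh $base-$iter.param $base-$iter-fitted.tlt
EOF
    echo "$job"
}

# Possible use on the cluster, one job per tomogram:
#   for base in gr668pt50t gr668_3_2a80nm11500xgkpt42t; do
#       qsub "$(write_map_job "$base" 02)"
#   done
```

Each generated script requests a single CPU, so PBS can run the jobs side by side on whatever CPUs are free.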

Set ALLOWTOPICVIEW =

-- BillRice - 15 Aug 2008

Revision 3 - 15 Aug 2008 - Main.BillRice

   META TOPICPARENT   name="CemIT" 

Contents
 
 General guidelines
   Batch submission system
   EMAN
 


 General guidelines 
 
 The home directory has 20 GB of space allocated to all of CEM computing
  /usr/data has 100 GB of space allocated to all of CEM computing
  Both disks operate at the same speed
  In general, do most of your processing on /usr/data, since there is more space available
  Please remove data after processing so that others will have space to work
 

 Batch submission system 
 
 We have PBS (commercial version) for handling jobs
  mpirun is also installed for MPI-aware applications (SPIDER is the only MPI-aware EM software)
 

 EMAN 
 
 EMAN will work with the batch submission software (PBS) installed on the cluster
  Need to make a batch submission script for cluster use
  sample 

 
#!/bin/bash

#PBS -N EMAN
#PBS -l nodes=8:ncpus=4
cd $PBS_O_WORKDIR

refine 8 ang=8.5 mask=38 pad=120 hard=25 classkeep=0.8 classiter=8 sym=c1 phasecls median proc=32

 
 Notes on script:
<
<
 #!/bin/bash    --- use the bash shell
  #PBS -l nodes=8:ncpus=4  --- Use 8 nodes on the cluster, with 4 processors per node
  cd $PBS_O_WORKDIR   ---  This is needed so that script will be run in the current directory, rather than your home directory
  refine 8 ang=8.5 ... proc=32    -- proc= command means use 32 processors (8 nodes * 4 CPU per node)
>
>
 #!/bin/bash    --- use the bash shell
  #PBS -N EMAN  -- name the job EMAN (you can name it what you like)
  #PBS -l nodes=8:ncpus=4  --- Use 8 nodes on the cluster, with 4 processors per node
  cd $PBS_O_WORKDIR   ---  This is needed so that script will be run in the current directory, rather than your home directory
>
>
 refine 8 ang=8.5 ... proc=32    -- proc= command means use 32 processors (8 nodes * 4 CPU per node)
 
  If the script is named emanrefine.sh, submit it with:
 
   > qsub emanrefine.sh
  SPIDER 
 Protomo 
 

 Protomo software will only use one CPU, but you can process multiple tomograms at once using the cluster
  Use the qsub submission process, and make a qsub script for each command
  sample refinement script: 

 
#! /bin/sh
#PBS -l nodes=1:ncpus=1
#PBS -N PROTOMO

cd $PBS_O_WORKDIR

cat gr668_3_2a80nm11500xgkpt42t.param | sed 's/guess=false/guess=true/' > gr668_3_2a80nm11500xgkpt42t-01.param
cp gr668_3_2a80nm11500xgkpt42t-ali.tlt gr668_3_2a80nm11500xgkpt42t-01-itr.tlt
tomo-refine.sh gr668_3_2a80nm11500xgkpt42t-01.param >& refine-01.log
tomo-fit.sh gr668_3_2a80nm11500xgkpt42t-01.param
cp gr668_3_2a80nm11500xgkpt42t-01.param gr668_3_2a80nm11500xgkpt42t-02.param
 
...etc

 
 Notes on script: 
 #PBS -l nodes=1:ncpus=1  --- Only use one CPU on one node
  #PBS -N PROTOMO  --- name the job PROTOMO
 
  Sample script for calculating tomogram: 

 
#! /bin/sh
#PBS -l nodes=1:ncpus=1

cd $PBS_O_WORKDIR
tomo-map.sh gr668pt50t-02.param gr668pt50t-02-fitted.tlt




 

 cd $PBS_O_WORKDIR   --- required to change to the working directory
 
 

 Set ALLOWTOPICVIEW = 
 

-- BillRice - 15 Aug 2008

Revision 2 - 15 Aug 2008 - Main.BillRice

META TOPICPARENT	name="CemIT"

Contents

General guidelines

The home directory has 20 GB of space allocated to all of CEM computing
/usr/data has 100 GB of space allocated to all of CEM computing
Both disks operate at the same speed
In general, do most of your processing on /usr/data, since there is more space available
Please remove data after processing so that others will have space to work

Batch submission system

We have PBS (commercial version) for handling jobs
mpirun is also installed for MPI-aware applications (SPIDER is the only MPI-aware EM software)

EMAN

EMAN will work with the batch submission software (PBS) installed on the cluster
Need to make a batch submission script for cluster use
sample

#!/bin/bash

#PBS -N EMAN
#PBS -l nodes=8:ncpus=4
cd $PBS_O_WORKDIR

refine 8 ang=8.5 mask=38 pad=120 hard=25 classkeep=0.8 classiter=8 sym=c1 phasecls median proc=32

Notes on script:
#!/bin/bash --- use the bash shell
#PBS -l nodes=8:ncpus=4 --- Use 8 nodes on the cluster, with 4 processors per node
cd $PBS_O_WORKDIR --- This is needed so that script will be run in the current directory, rather than your home directory
refine 8 ang=8.5 ... proc=32 -- proc= command means use 32 processors (8 nodes * 4 CPU per node)

SPIDER

Protomo

Added:

>
>

Protomo software will only use one CPU, but you can process multiple tomograms at once using the cluster
Use the qsub submission process, and make a qsub script for each command
sample refinement script:

#! /bin/sh
#PBS -l nodes=1:ncpus=1
#PBS -N PROTOMO

cd $PBS_O_WORKDIR

cat gr668_3_2a80nm11500xgkpt42t.param | sed 's/guess=false/guess=true/' > gr668_3_2a80nm11500xgkpt42t-01.param
cp gr668_3_2a80nm11500xgkpt42t-ali.tlt gr668_3_2a80nm11500xgkpt42t-01-itr.tlt
tomo-refine.sh gr668_3_2a80nm11500xgkpt42t-01.param >& refine-01.log
tomo-fit.sh gr668_3_2a80nm11500xgkpt42t-01.param
cp gr668_3_2a80nm11500xgkpt42t-01.param gr668_3_2a80nm11500xgkpt42t-02.param
 
...etc

Notes on script:
- #PBS -l nodes=1:ncpus=1 --- Only use one CPU on one node
- #PBS -N PROTOMO --- name the job PROTOMO
Sample script for calculating tomogram:

#! /bin/sh
#PBS -l nodes=1:ncpus=1

cd $PBS_O_WORKDIR
tomo-map.sh gr668pt50t-02.param gr668pt50t-02-fitted.tlt

cd $PBS_O_WORKDIR --- required to change to the working directory

Set ALLOWTOPICVIEW =

-- BillRice - 15 Aug 2008

Revision 1 - 15 Aug 2008 - Main.BillRice

META TOPICPARENT	name="CemIT"

Contents

General guidelines

The home directory has 20 GB of space allocated to all of CEM computing
/usr/data has 100 GB of space allocated to all of CEM computing
Both disks operate at the same speed
In general, do most of your processing on /usr/data, since there is more space available
Please remove data after processing so that others will have space to work

Batch submission system

We have PBS (commercial version) for handling jobs
mpirun is also installed for MPI-aware applications (SPIDER is the only MPI-aware EM software)

EMAN

EMAN will work with the batch submission software (PBS) installed on the cluster
Need to make a batch submission script for cluster use
sample

#!/bin/bash

#PBS -N EMAN
#PBS -l nodes=8:ncpus=4
cd $PBS_O_WORKDIR

refine 8 ang=8.5 mask=38 pad=120 hard=25 classkeep=0.8 classiter=8 sym=c1 phasecls median proc=32

Notes on script:
#!/bin/bash --- use the bash shell
#PBS -l nodes=8:ncpus=4 --- Use 8 nodes on the cluster, with 4 processors per node
cd $PBS_O_WORKDIR --- This is needed so that script will be run in the current directory, rather than your home directory
refine 8 ang=8.5 ... proc=32 -- proc= command means use 32 processors (8 nodes * 4 CPU per node)

SPIDER

Protomo

Set ALLOWTOPICVIEW =

-- BillRice - 15 Aug 2008

Copyright © by the contributing authors. All material on this collaboration platform is the property of the contributing authors.