Difference: CemCluster (1 vs. 18)

Revision 18 - 24 Mar 2009 - Main.FabianaRenzi

 

General guidelines

  • The home directory has 20 GB of space allocated to all of CEM computing
  • /usr/data has 100 GB of space allocated to all of CEM computing
  • Both disks operate at the same speed
  • In general, do most of your processing on /usr/data, since there is more space available
  • Please remove data after processing so that others will have space to work
  • The cluster consists of 16 nodes with 4 CPUs per node. Nodes are numbered 0-15
  • To use the cluster properly, submit jobs through the batch submission system
  • In general, do not run jobs on the head node (node 0)

Batch submission system

  • We have PBS (commercial version) for job handling
  • mpirun is also installed for MPI-aware applications (SPIDER is the only MPI-aware EM software)
  • In general, PBS submission scripts should start with the following 3 lines:
#!/bin/bash
#PBS -l nodes=4:ncpus=4 
cd $PBS_O_WORKDIR

  • If the script is called samplescript.sh, then the following unix command will submit it to the cluster:
> qsub samplescript.sh
  • When the script is finished, it will leave 2 output files:
    • samplescript.e88888 -- standard error from program
    • samplescript.o88888 -- standard output from program (may be a very large file with EMAN or spider)
  • To monitor cluster use, use the qstat command:
> qstat
Job id            Name             User              Time Use S Queue
----------------  ---------------- ----------------  -------- - -----
75291.node0       PROTOMO          asiebert          07:00:12 R workq
75292.node0       PROTOMO          asiebert          06:51:08 R workq
75294.node0       PROTOMO          asiebert          06:28:27 R workq
  • Another way is to run xpbsmon, which gives a graphical view of cluster use (screenshot: xpbsmon1.png)
  • Click on a node to get a detailed view of how busy it is (screenshot: xpbsmon2.png)
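
The three-line header described in this section can be produced by a small helper, so that the node and CPU counts are typed only once. This is a minimal sketch: pbs_header is an illustrative function name and not a PBS command, and the example values are the ones used above.

```shell
#!/bin/bash
# Print the three-line PBS header for a given node count and per-node CPU count.
# pbs_header is a hypothetical helper name, not part of PBS.
pbs_header() {
    nodes=$1
    ncpus=$2
    echo '#!/bin/bash'
    echo "#PBS -l nodes=${nodes}:ncpus=${ncpus}"
    # Single quotes keep $PBS_O_WORKDIR literal; PBS sets it when the job runs.
    echo 'cd $PBS_O_WORKDIR'
}

# Example: the header shown in this section (4 nodes, 4 CPUs per node).
pbs_header 4 4
```

Redirecting the output to a file (for example samplescript.sh) and appending the actual commands gives a script that can be submitted with qsub as described above.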

EMAN

  • EMAN works with the batch submission software (PBS) installed on the cluster
  • You need to make a batch submission script for cluster use
  • Sample refinement script:
#!/bin/bash

#PBS -N EMAN
#PBS -l nodes=8:ncpus=4
cd $PBS_O_WORKDIR

refine 8 ang=8.5 mask=38 pad=120 hard=25 classkeep=0.8 classiter=8 sym=c1 phasecls median proc=32

  • Notes on script:
    • #!/bin/bash --- use the bash shell
    • #PBS -N EMAN -- name the job EMAN (you can name it what you like)
    • #PBS -l nodes=8:ncpus=4 --- Use 8 nodes on the cluster, with 4 processors per node
    • cd $PBS_O_WORKDIR --- needed so that the script runs in the directory it was submitted from, rather than your home directory
    • refine 8 ang=8.5 ... proc=32 -- the proc= option means use 32 processors (8 nodes * 4 CPUs per node)
  • To submit the script, if it is named emanrefine.sh:
   > qsub emanrefine.sh
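
Because proc= has to match the nodes and ncpus values requested from PBS, it may help to compute it rather than type it by hand. The sketch below assumes the refine options from the sample script; eman_refine_cmd is an illustrative name, not an EMAN command.

```shell
#!/bin/bash
# Build the EMAN refine command so that proc= always equals nodes * ncpus.
# eman_refine_cmd is a hypothetical helper; the refine options are copied
# unchanged from the sample refinement script.
eman_refine_cmd() {
    nodes=$1
    ncpus=$2
    procs=$((nodes * ncpus))
    echo "refine 8 ang=8.5 mask=38 pad=120 hard=25 classkeep=0.8 classiter=8 sym=c1 phasecls median proc=${procs}"
}

# Example: 8 nodes with 4 CPUs each, as in the sample script.
eman_refine_cmd 8 4
```

If the resource request is changed to a different node count, calling the helper with the new values keeps the refine line consistent with the #PBS -l line.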

SPIDER

  • There are two versions of SPIDER on the cluster:
    • spidermp is the "normal" version of SPIDER. It is multiprocessor-aware but cannot work across nodes (i.e., it can use at most 4 CPUs at once)
    • spidermpi is the MPI-aware version of SPIDER and can use the entire cluster
      • Only a small subset of commands is MPI-aware and working, so only use this version for the following commands:
        • ap nq
        • ap mq
        • ap sh
        • ap ref -- THIS COMMAND DOES NOT WORK IN MPI MODE, EVEN THOUGH IT SHOULD
  • Usually, the whole cluster is only used in the refinement process
  • A (modified) set of refinement scripts is attached to this page. These can be used on the cluster for refinement according to the SPIDER documentation

  • pubsub does not work on the cluster, but it can be replaced with qsub, as follows:

Using PBS with SPIDER to replace pubsub:

  • PBS can be used to speed up loops over processes that are not multiprocessor-aware
  • Example: a SPIDER script to speed up a non-linear filter
  • Note that this script is designed so that each node runs one process. Each process uses up to 6 GB, so the place=scatter directive is used

filter.spi
;========================input
fr l
[tomo]tomo1_crop.spd
;=========output=================
fr l
[filt]filt
;=========params==============
x90=60          ; cycles
x91=0.1         ; delta t
x92=0           ; sigma
;x93=0.05       ; lambda

;=============================
;---------- following set of commands sets the main qsub script for the cluster, one script per loop event
vm
echo "#!/bin/sh" > qsub-header.txt
vm
echo "#PBS -l select=ncpus=1:mem=6GB " >> qsub-header.txt
vm
echo "#PBS -l place=scatter" >> qsub-header.txt
vm
echo "cd /usr/data/asiebert/30_04_08_Rbs/tomo1" >> qsub-header.txt

;-----------------------------------------------

do lb1 x10=1,10                 ; this loop only makes the scripts using a "slave" script as template

  RR x93             ; adjust lambda according to the following sequence
  0.01,0.02,0.05,0.1,0.2,0.5,1,2,5,10                               ;  a nice log sequence

   vm
   echo "x93={**X93}" > filtscript{**x10}.$PRJEXT
   vm
   echo "x10={**X10}" >> filtscript{**x10}.$PRJEXT
   vm
   cat spider_script_filt1_clu_slave.$PRJEXT >> filtscript{**x10}.$PRJEXT
   vm
   cp qsub-header.txt qsub-script{**x10}.txt
   vm
   echo "/usr/apps/spidermp $PRJEXT/$DATEXT @filtscript{**x10}" >> qsub-script{**x10}.txt
   vm
   qsub qsub-script{**x10}.txt
lb1

en      ; end the main script                                                                  
 
  • The script which actually does the work:

spider_script_filt1_clu_slave.spi
;========================input
fr l
[tomo]tomo1_crop.spd
;=========output=================
fr l
[filt]filt
;=========params==============
x90=60          ; cycles
x91=0.1         ; delta t
x92=0           ; sigma
;x93=0.05       ; lambda

;=============================

   ce ad
   [tomo]
   [filt]{***x10}
   HEG
   x90
   x91
   x92,x93
en d

  • To run the script:
    • spider spi/spd @filter
  • Even better, make a qsub submission script for filter.spi that uses one CPU on one node, and submit it
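
A wrapper of the kind suggested in the last bullet could look like the following sketch. It assumes the SPIDER binary path /usr/apps/spidermp used inside filter.spi and the spi/spd extensions from the run command above; write_filter_job and the file name filter.qsub are illustrative.

```shell
#!/bin/bash
# Write a one-node, one-CPU PBS wrapper that runs the master script filter.spi.
# write_filter_job and the default output name filter.qsub are hypothetical.
write_filter_job() {
    out=${1:-filter.qsub}
    {
        echo '#!/bin/sh'
        echo '#PBS -l nodes=1:ncpus=1'
        # Run from the directory the job was submitted from.
        echo 'cd $PBS_O_WORKDIR'
        # Binary path assumed from the qsub-script lines generated in filter.spi.
        echo '/usr/apps/spidermp spi/spd @filter'
    } > "$out"
}
```

Calling write_filter_job filter.qsub and then running qsub filter.qsub from the directory that contains filter.spi submits the master script as a job of its own.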

Using the MPI version of spider

  • Run the MPI version of SPIDER using mpirun
  • The PBS job file should have the following format:
#!/bin/sh
#PBS -l nodes=14:ncpus=4

/opt/hpmpi/bin/mpirun -np 56 -hostfile $PBS_NODEFILE /usr/apps/spider_mpi spi/spd @apsh
  • Notes on command:
    • -np 56 : number of processors = 56. This MUST equal the number of nodes times the number of CPUs per node
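
Since -np must equal nodes times CPUs per node, one way to avoid a mismatch is to derive it from the resource request line itself. This is a sketch under the assumption that the request uses the nodes=N:ncpus=M form shown above; np_from_pbs_line is an illustrative name and not part of PBS or mpirun.

```shell
#!/bin/bash
# Derive the mpirun -np value from a "#PBS -l nodes=N:ncpus=M" line.
# np_from_pbs_line is a hypothetical helper, not part of PBS or mpirun.
np_from_pbs_line() {
    nodes=$(printf '%s\n' "$1" | sed -n 's/.*nodes=\([0-9][0-9]*\).*/\1/p')
    ncpus=$(printf '%s\n' "$1" | sed -n 's/.*ncpus=\([0-9][0-9]*\).*/\1/p')
    if [ -z "$nodes" ] || [ -z "$ncpus" ]; then
        echo "could not parse nodes/ncpus from: $1" >&2
        return 1
    fi
    echo $((nodes * ncpus))
}

# Example: the request from the job file above.
np_from_pbs_line '#PBS -l nodes=14:ncpus=4'
```

The printed value can be compared with the -np argument in the job file before it is submitted.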

Spider Refinement on the cluster

  • Scripts for SPIDER refinement are provided in the following zipfile: cluster_refinement_scripts.zip
  • After unzipping, edit the file refine_setting.pam
    • Change the number of nodes to the number you want. Currently set for 8 nodes
  • Follow the regular instructions for spider refinement
  • To start the refinement, type
      >qsub refine_clu.qsub

Protomo

  • Protomo software will only use one CPU, but you can process multiple tomograms at once using the cluster
  • Use the qsub submission process, and make a qsub script for each command
  • sample refinement script:
#! /bin/sh
#PBS -l nodes=1:ncpus=1
#PBS -N PROTOMO
cd $PBS_O_WORKDIR

cat gr668pt50t.param | sed 's/guess=false/guess=true/' > gr668pt50t-01.param
cp gr668pt50t-ali.tlt gr668pt50t-01-itr.tlt
tomo-refine.sh gr668pt50t-01.param >& refine-01.log
tomo-fit.sh gr668pt50t-01.param
cp gr668pt50t-01.param gr668pt50t-02.param
cp gr668pt50t-01-fitted.tlt gr668pt50t-02-itr.tlt
tomo-refine.sh gr668pt50t-02.param >& refine-02.log
tomo-fit.sh gr668pt50t-02.param
cat gr668pt50t-02.param | sed 's/guess=true/guess=false/' > gr668pt50t-03.param
cp gr668pt50t-02-fitted.tlt gr668pt50t-03-itr.tlt
tomo-refine.sh gr668pt50t-03.param >& refine-03.log
tomo-fit.sh gr668pt50t-03.param
cp gr668pt50t-03.param gr668pt50t-04.param
cp gr668pt50t-03-fitted.tlt gr668pt50t-04-itr.tlt
tomo-refine.sh gr668pt50t-04.param >& refine-04.log
tomo-fit.sh gr668pt50t-04.param
cp gr668pt50t-04.param gr668pt50t-05.param
cp gr668pt50t-04-fitted.tlt gr668pt50t-05-itr.tlt
tomo-refine.sh gr668pt50t-05.param >& refine-05.log
tomo-fit.sh gr668pt50t-05.param
cat gr668pt50t-05.param | sed 's/cormod=xcf/cormod=mcf/' > gr668pt50t-06.param
cp gr668pt50t-05-fitted.tlt gr668pt50t-06-itr.tlt
tomo-refine.sh gr668pt50t-06.param >& refine-06.log
tomo-fit.sh gr668pt50t-06.param
cp gr668pt50t-06.param gr668pt50t-07.param
cp gr668pt50t-06-fitted.tlt gr668pt50t-07-itr.tlt
tomo-refine.sh gr668pt50t-07.param >& refine-07.log
tomo-fit.sh gr668pt50t-07.param
cp gr668pt50t-07.param gr668pt50t-08.param
cp gr668pt50t-07-fitted.tlt gr668pt50t-08-itr.tlt
tomo-refine.sh gr668pt50t-08.param >& refine-08.log
tomo-fit.sh gr668pt50t-08.param
cp gr668pt50t-08.param gr668pt50t-09.param
cp gr668pt50t-08-fitted.tlt gr668pt50t-09-itr.tlt
tomo-refine.sh gr668pt50t-09.param >& refine-09.log
tomo-fit.sh gr668pt50t-09.param
cp gr668pt50t-09.param gr668pt50t-10.param
cp gr668pt50t-09-fitted.tlt gr668pt50t-10-itr.tlt
tomo-refine.sh gr668pt50t-10.param >& refine-10.log
tomo-fit.sh gr668pt50t-10.param
cat gr668pt50t-10.param | sed 's/cormod=mcf/cormod=pcf/' > gr668pt50t-11.param
cp gr668pt50t-10-fitted.tlt gr668pt50t-11-itr.tlt
tomo-refine.sh gr668pt50t-11.param >& refine-11.log
tomo-fit.sh gr668pt50t-11.param
cp gr668pt50t-11.param gr668pt50t-12.param
cp gr668pt50t-11-fitted.tlt gr668pt50t-12-itr.tlt
tomo-refine.sh gr668pt50t-12.param >& refine-12.log
tomo-fit.sh gr668pt50t-12.param
cp gr668pt50t-12.param gr668pt50t-13.param
cp gr668pt50t-12-fitted.tlt gr668pt50t-13-itr.tlt
tomo-refine.sh gr668pt50t-13.param >& refine-13.log
tomo-fit.sh gr668pt50t-13.param
cp gr668pt50t-13.param gr668pt50t-14.param
cp gr668pt50t-13-fitted.tlt gr668pt50t-14-itr.tlt
tomo-refine.sh gr668pt50t-14.param >& refine-14.log
tomo-fit.sh gr668pt50t-14.param
cp gr668pt50t-14.param gr668pt50t-15.param
cp gr668pt50t-14-fitted.tlt gr668pt50t-15-itr.tlt
tomo-refine.sh gr668pt50t-15.param >& refine-15.log
tomo-fit.sh gr668pt50t-15.param
cp gr668pt50t-15.param gr668pt50t-16.param
cp gr668pt50t-15-fitted.tlt gr668pt50t-16-itr.tlt
tomo-refine.sh gr668pt50t-16.param >& refine-16.log
tomo-fit.sh gr668pt50t-16.param

  • Notes on script:
    • #PBS -l nodes=1:ncpus=1 --- Only use one CPU on one node
    • #PBS -N PROTOMO --- name the job PROTOMO
  • Sample script for calculating tomogram:
#! /bin/sh
#PBS -l nodes=1:ncpus=1

cd $PBS_O_WORKDIR
tomo-map.sh gr668pt50t-02.param gr668pt50t-02-fitted.tlt

  • cd $PBS_O_WORKDIR --- required to change to the working directory
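
The sample refinement script repeats the same four steps for iterations 01 to 16 and edits the parameter file only at iterations 1, 3, 6 and 11. The sketch below prints that command sequence from a loop instead of running it, so it can be reviewed first. protomo_steps is an illustrative name, and sed is called on the file directly rather than through cat, which should be equivalent.

```shell
#!/bin/bash
# Print (not run) the Protomo refinement commands for iterations 01..LAST.
# protomo_steps is a hypothetical helper; the file naming and the sed edits
# follow the sample refinement script. The body runs in a subshell so that
# its variables do not leak into the caller.
protomo_steps() (
    base=$1
    last=$2
    i=1
    while [ "$i" -le "$last" ]; do
        cur=$(printf '%02d' "$i")
        prev=$(printf '%02d' $((i - 1)))
        # Parameter edits happen only at iterations 1, 3, 6 and 11.
        case $i in
            1)  edit='s/guess=false/guess=true/' ;;
            3)  edit='s/guess=true/guess=false/' ;;
            6)  edit='s/cormod=xcf/cormod=mcf/' ;;
            11) edit='s/cormod=mcf/cormod=pcf/' ;;
            *)  edit='' ;;
        esac
        # The first iteration starts from the unnumbered input files.
        if [ "$i" -eq 1 ]; then
            src="$base.param"
            tlt="$base-ali.tlt"
        else
            src="$base-$prev.param"
            tlt="$base-$prev-fitted.tlt"
        fi
        if [ -n "$edit" ]; then
            echo "sed '$edit' $src > $base-$cur.param"
        else
            echo "cp $src $base-$cur.param"
        fi
        echo "cp $tlt $base-$cur-itr.tlt"
        echo "tomo-refine.sh $base-$cur.param >& refine-$cur.log"
        echo "tomo-fit.sh $base-$cur.param"
        i=$((i + 1))
    done
)
```

For example, protomo_steps gr668pt50t 16 prints the body of the sample script; the output can be appended after the #PBS header lines and cd $PBS_O_WORKDIR to form a submission script for another tomogram.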
Added:
>
>

Limiting memory on a job

  • To limit total memory used per node:
#PBS -l nodes=13:ncpus=3,mem=6gb
 

-- BillRice - 15 Aug 2008

  • xpbsmon1.png: xpbsmon screenshot1
  • xpbsmon2.png: xpbsmon screenshot2
  • cluster_refinement_scripts.zip: spider refinement scripts for cluster


Revision 17 - 21 Aug 2008 - Main.BillRice

 
Added:
>
>
  • When the script is finished, it will leave 2 output files:
    • samplescript.e88888 -- standard error from program
    • samplescript.o88888 -- standard output from program (may be a very large file with EMAN or spider)
 

Revision 16 - 18 Aug 2008 - Main.BillRice

 
Added:
>
>

Spider Refinement on the cluster

  • Scripts for SPIDER refinement are provided in the following zipfile: cluster_refinement_scripts.zip
  • After unzipping, edit the file refine_setting.pam
    • Change the number of nodes to the number you want. Currently set for 8 nodes
  • Follow the regular instructions for spider refinement
  • To start the refinement, type
      >qsub refine_clu.qsub
 

Protomo

  • Protomo software will only use one CPU, but you can process multiple tomograms at once using the cluster
  • Use the qsub submission process, and make a qsub script for each command
  • sample refinement script:
#! /bin/sh
#PBS -l nodes=1:ncpus=1
#PBS -N PROTOMO
cd $PBS_O_WORKDIR

cat gr668pt50t.param | sed 's/guess=false/guess=true/' > gr668pt50t-01.param
cp gr668pt50t-ali.tlt gr668pt50t-01-itr.tlt
tomo-refine.sh gr668pt50t-01.param >& refine-01.log
tomo-fit.sh gr668pt50t-01.param
cp gr668pt50t-01.param gr668pt50t-02.param
cp gr668pt50t-01-fitted.tlt gr668pt50t-02-itr.tlt
tomo-refine.sh gr668pt50t-02.param >& refine-02.log
tomo-fit.sh gr668pt50t-02.param
cat gr668pt50t-02.param | sed 's/guess=true/guess=false/' > gr668pt50t-03.param
cp gr668pt50t-02-fitted.tlt gr668pt50t-03-itr.tlt
tomo-refine.sh gr668pt50t-03.param >& refine-03.log
tomo-fit.sh gr668pt50t-03.param
cp gr668pt50t-03.param gr668pt50t-04.param
cp gr668pt50t-03-fitted.tlt gr668pt50t-04-itr.tlt
tomo-refine.sh gr668pt50t-04.param >& refine-04.log
tomo-fit.sh gr668pt50t-04.param
cp gr668pt50t-04.param gr668pt50t-05.param
cp gr668pt50t-04-fitted.tlt gr668pt50t-05-itr.tlt
tomo-refine.sh gr668pt50t-05.param >& refine-05.log
tomo-fit.sh gr668pt50t-05.param
cat gr668pt50t-05.param | sed 's/cormod=xcf/cormod=mcf/' > gr668pt50t-06.param
cp gr668pt50t-05-fitted.tlt gr668pt50t-06-itr.tlt
tomo-refine.sh gr668pt50t-06.param >& refine-06.log
tomo-fit.sh gr668pt50t-06.param
cp gr668pt50t-06.param gr668pt50t-07.param
cp gr668pt50t-06-fitted.tlt gr668pt50t-07-itr.tlt
tomo-refine.sh gr668pt50t-07.param >& refine-07.log
tomo-fit.sh gr668pt50t-07.param
cp gr668pt50t-07.param gr668pt50t-08.param
cp gr668pt50t-07-fitted.tlt gr668pt50t-08-itr.tlt
tomo-refine.sh gr668pt50t-08.param >& refine-08.log
tomo-fit.sh gr668pt50t-08.param
cp gr668pt50t-08.param gr668pt50t-09.param
cp gr668pt50t-08-fitted.tlt gr668pt50t-09-itr.tlt
tomo-refine.sh gr668pt50t-09.param >& refine-09.log
tomo-fit.sh gr668pt50t-09.param
cp gr668pt50t-09.param gr668pt50t-10.param
cp gr668pt50t-09-fitted.tlt gr668pt50t-10-itr.tlt
tomo-refine.sh gr668pt50t-10.param >& refine-10.log
tomo-fit.sh gr668pt50t-10.param
cat gr668pt50t-10.param | sed 's/cormod=mcf/cormod=pcf/' > gr668pt50t-11.param
cp gr668pt50t-10-fitted.tlt gr668pt50t-11-itr.tlt
tomo-refine.sh gr668pt50t-11.param >& refine-11.log
tomo-fit.sh gr668pt50t-11.param
cp gr668pt50t-11.param gr668pt50t-12.param
cp gr668pt50t-11-fitted.tlt gr668pt50t-12-itr.tlt
tomo-refine.sh gr668pt50t-12.param >& refine-12.log
tomo-fit.sh gr668pt50t-12.param
cp gr668pt50t-12.param gr668pt50t-13.param
cp gr668pt50t-12-fitted.tlt gr668pt50t-13-itr.tlt
tomo-refine.sh gr668pt50t-13.param >& refine-13.log
tomo-fit.sh gr668pt50t-13.param
cp gr668pt50t-13.param gr668pt50t-14.param
cp gr668pt50t-13-fitted.tlt gr668pt50t-14-itr.tlt
tomo-refine.sh gr668pt50t-14.param >& refine-14.log
tomo-fit.sh gr668pt50t-14.param
cp gr668pt50t-14.param gr668pt50t-15.param
cp gr668pt50t-14-fitted.tlt gr668pt50t-15-itr.tlt
tomo-refine.sh gr668pt50t-15.param >& refine-15.log
tomo-fit.sh gr668pt50t-15.param
cp gr668pt50t-15.param gr668pt50t-16.param
cp gr668pt50t-15-fitted.tlt gr668pt50t-16-itr.tlt
tomo-refine.sh gr668pt50t-16.param >& refine-16.log
tomo-fit.sh gr668pt50t-16.param

  • Notes on script:
    • #PBS -l nodes=1:ncpus=1 --- Only use one CPU on one node
    • #PBS -N PROTOMO --- name the job PROTOMO
  • Sample script for calculating tomogram:
#! /bin/sh
#PBS -l nodes=1:ncpus=1

cd $PBS_O_WORKDIR
tomo-map.sh gr668pt50t-02.param gr668pt50t-02-fitted.tlt

  • cd $PBS_O_WORKDIR --- required to change to the working directory

  • Set ALLOWTOPICVIEW =

-- BillRice - 15 Aug 2008 * xpbsmon screenshot1:

* cluster_refinement_scripts.zip: spider refinement scripts for cluster

META FILEATTACHMENT attr="" autoattached="1" comment="xpbsmon screenshot1" date="1218836224" name="xpbsmon1.png" path="xpbsmon1.png" size="10985" user="Main.BillRice" version=""
META FILEATTACHMENT attr="" autoattached="1" comment="xpbsmon screenshot2" date="1218836257" name="xpbsmon2.png" path="xpbsmon2.png" size="6977" user="Main.BillRice" version=""
Changed:
<
<
META FILEATTACHMENT attachment="cluster_refinement_scripts.zip" attr="" comment="spider refinement scripts for cluster" date="1219084946" name="cluster_refinement_scripts.zip" path="cluster_refinement_scripts.zip" size="46308" stream="cluster_refinement_scripts.zip" user="Main.BillRice" version="0"
>
>
META FILEATTACHMENT attr="" autoattached="1" comment="spider refinement scripts for cluster" date="1219084947" name="cluster_refinement_scripts.zip" path="cluster_refinement_scripts.zip" size="46308" user="Main.BillRice" version=""
 

Revision 15 - 18 Aug 2008 - Main.BillRice

 
META TOPICPARENT name="CemIT"
Contents

General guidelines

  • home directory has 20 GB space allocated to all of CEM computing
  • /usr/data has 100 GB of space allocated to all of CEM computing
  • both disks operate at the same speed
  • In general, do most of your processing on /usr/data, since there is more space available
  • Please remove data after processing so that others will have space to work
  • The cluster consists of 16 nodes with 4 CPUs per node. Nodes are numbered 0-15
  • To use the cluster properly, jobs should be submitted through a batch submission system
  • In general, do not run jobs on the head node (node 0)

Batch submission system

  • We have PBS (commercial version) for handling of jobs
  • Also have mpirun installed for mpi -aware applications (Spider is the only mpi -aware EM software)
  • In general, pbs submission scripts should start with the following 3 lines:
#!/bin/bash
#PBS -l nodes=4:ncpus=4 
cd $PBS_O_WORKDIR

  • If the script is called samplescript.sh, then the following unix command will submit it to the cluster:
> qsub samplescript.sh
  • To monitor cluster use, use the qstat command:
> qstat
Job id            Name             User              Time Use S Queue
----------------  ---------------- ----------------  -------- - -----
75291.node0       PROTOMO          asiebert          07:00:12 R workq
75292.node0       PROTOMO          asiebert          06:51:08 R workq
75294.node0       PROTOMO          asiebert          06:28:27 R workq
  • Another way is to run xpbsmon, which will give a graphical view of cluster use. xpbsmon1.png



* Click on a node to get a detailed view of how busy it is xpbsmon2.png
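
When the queue is long, the qstat listing can be condensed into a per-user count of running jobs. The following is a minimal sketch, assuming the default qstat column layout shown in the listing (job id, name, user, time, state, queue); count_running_jobs is an illustrative name, not a PBS command.

```shell
#!/bin/bash
# Count running (state R) jobs per user from qstat-style output on stdin.
# Assumes the default column layout: id, name, user, time, state, queue.
count_running_jobs() {
    # Skip the two header lines, then tally column 3 (user) where column 5 is R.
    awk 'NR > 2 && $5 == "R" { n[$3]++ } END { for (u in n) print u, n[u] }' | sort
}

# Demonstration on the sample listing from this page (no cluster needed).
count_running_jobs <<'EOF'
Job id            Name             User              Time Use S Queue
----------------  ---------------- ----------------  -------- - -----
75291.node0       PROTOMO          asiebert          07:00:12 R workq
75292.node0       PROTOMO          asiebert          06:51:08 R workq
75294.node0       PROTOMO          asiebert          06:28:27 R workq
EOF
```

On the cluster itself the same function would be fed from the live queue, for example with qstat | count_running_jobs.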

EMAN

  • EMAN will work with the batch submission software (PBS) installed on the cluster
  • Need to make a batch submission script for cluster use
  • sample refinement script:
#!/bin/bash

#PBS -N EMAN
#PBS -l nodes=8:ncpus=4
cd $PBS_O_WORKDIR

refine 8 ang=8.5 mask=38 pad=120 hard=25 classkeep=0.8 classiter=8 sym=c1 phasecls median proc=32

  • Notes on script:
    • #!/bin/bash --- use the bash shell
    • #PBS -N EMAN -- name the job EMAN (you can name it what you like)
    • #PBS -l nodes=8:ncpus=4 --- Use 8 nodes on the cluster, with 4 processors per node
    • cd $PBS_O_WORKDIR --- This is needed so that script will be run in the current directory, rather than your home directory
    • refine 8 ang=8.5 ... proc=32 -- proc= command means use 32 processors (8 nodes * 4 CPU per node)
  • To submit the script, if it is named emanrefine.sh
   > qsub emanrefine.sh
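
Because proc= has to match nodes times ncpus, it is easy for the two to drift apart when the node count is edited by hand. One possible way to keep them in step is to generate the job script from the two numbers. This is only a sketch: make_eman_job is a hypothetical helper, and the refine options are copied unchanged from the sample script.

```shell
#!/bin/bash
# Print an EMAN job script whose proc= value always equals nodes * ncpus.
# make_eman_job is an illustrative helper, not part of EMAN or PBS.
make_eman_job() {
    local nodes=$1 cpus=$2
    local nproc=$(( nodes * cpus ))
    # \$PBS_O_WORKDIR is escaped so it is expanded when the job runs, not now.
    cat <<EOF
#!/bin/bash
#PBS -N EMAN
#PBS -l nodes=${nodes}:ncpus=${cpus}
cd \$PBS_O_WORKDIR

refine 8 ang=8.5 mask=38 pad=120 hard=25 classkeep=0.8 classiter=8 sym=c1 phasecls median proc=${nproc}
EOF
}

# 8 nodes with 4 CPUs each, as in the sample script.
make_eman_job 8 4
```

Redirecting the output to emanrefine.sh gives a file that can be submitted with qsub as described.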

SPIDER

  • There are two versions of spider on the cluster:
    • spidermp is the "normal" version of spider, multiprocessor aware but cannot work across nodes (i.e. it can use at most 4 CPUs at once)
    • spidermpi is the MPI aware version of spider, can use entire cluster
      • only a small subset of commands are MPI -aware and working, therefore only use this version for the following commands:
        • ap nq
        • ap mq
        • ap sh
        • ap ref -- THIS COMMAND DOES NOT WORK IN MPI MODE, EVEN THOUGH IT SHOULD
  • Usually, the whole cluster is only used in the refinement process
  • A (modified) set of refinement scripts is attached to this page; these can be used on the cluster for refinement according to the spider documentation

  • pubsub does not work on the cluster, but it can be replaced with qsub, as follows:

Using PBS with SPIDER to replace pubsub:

  • Can use pbs to speed up loops on non-mp aware processes
  • example of a spider script to speed up a non-linear filter
  • Note that this script is designed to run such that each node runs one process. This process uses up to 6 GB, so the place=scatter command is used

filter.spi
;========================input
fr l
[tomo]tomo1_crop.spd
;=========output=================
fr l
[filt]filt
;=========params==============
x90=60          ; cycles
x91=0.1         ; delta t
x92=0           ; sigma
;x93=0.05       ; lambda

;=============================
;---------- following set of commands sets the main qsub script for the cluster, one script per loop event
vm
echo "#!/bin/sh" > qsub-header.txt
vm
echo "#PBS -l select=ncpus=1:mem=6GB " >> qsub-header.txt
vm
echo "#PBS -l place=scatter" >> qsub-header.txt
vm
echo "cd /usr/data/asiebert/30_04_08_Rbs/tomo1" >> qsub-header.txt

;-----------------------------------------------

do lb1 x10=1,10                 ; this loop only makes the scripts using a "slave" script as template

  RR x93             ; adjust lambda according to the following sequence
  0.01,0.02,0.05,0.1,0.2,0.5,1,2,5,10                               ;  a nice log sequence

   vm
   echo "x93={**X93}" > filtscript{**x10}.$PRJEXT
   vm
   echo "x10={**X10}" >> filtscript{**x10}.$PRJEXT
   vm
   cat spider_script_filt1_clu_slave.$PRJEXT >> filtscript{**x10}.$PRJEXT
   vm
   cp qsub-header.txt qsub-script{**x10}.txt
   vm
   echo "/usr/apps/spidermp $PRJEXT/$DATEXT @filtscript{**x10}" >> qsub-script{**x10}.txt
   vm
   qsub qsub-script{**x10}.txt
lb1

en      ; end the main script                                                                  
 
  • The script which actually does the work:

spider_script_filt1_clu_slave.spi
;========================input
fr l
[tomo]tomo1_crop.spd
;=========output=================
fr l
[filt]filt
;=========params==============
x90=60          ; cycles
x91=0.1         ; delta t
x92=0           ; sigma
;x93=0.05       ; lambda

;=============================

   ce ad
   [tomo]
   [filt]{***x10}
   HEG
   x90
   x91
   x92,x93
en d

  • To run the script:
    • spider spi/spd @filter
  • even better, make a qsub submission script for filter.spi, use one CPU and one node, and submit it
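
For comparison, the job-script part of filter.spi can also be written directly in the shell. The sketch below only covers the PBS job files (one per loop index, with the same header lines and the two-digit numbering that {**x10} produces); it does not create the filtscript files, and write_filter_jobs is a hypothetical name. It also assumes the project/data extensions are spi/spd, as in the run command above.

```shell
#!/bin/bash
# Write ten PBS job scripts, one per filter run, into the given directory.
# write_filter_jobs is an illustrative helper; nothing is submitted here.
write_filter_jobs() {
    local outdir=$1 workdir=$2
    local i idx
    for (( i = 1; i <= 10; i++ )); do
        idx=$(printf '%02d' "$i")    # 01..10, matching {**x10} in SPIDER
        cat > "${outdir}/qsub-script${idx}.txt" <<EOF
#!/bin/sh
#PBS -l select=ncpus=1:mem=6GB
#PBS -l place=scatter
cd ${workdir}
/usr/apps/spidermp spi/spd @filtscript${idx}
EOF
    done
}

# Demonstration: write the scripts to a scratch directory and list them.
jobdir=$(mktemp -d)
write_filter_jobs "$jobdir" /usr/data/asiebert/30_04_08_Rbs/tomo1
ls "$jobdir"
```

Each generated file could then be handed to qsub in a loop once the matching filtscript files exist in the working directory.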

Using the MPI version of spider

  • In the case of mpi spider, run it using mpirun
  • The PBS job file should be in the format -
#!/bin/sh
#PBS -l nodes=14:ncpus=4

/opt/hpmpi/bin/mpirun -np 56 -hostfile $PBS_NODEFILE /usr/apps/spider_mpi spi/spd @apsh
  • Notes on command
    • -np 56 : means number of processors = 56. MUST be equal to the number of nodes times the number of CPUs per node (here 14 * 4 = 56)
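
The -np arithmetic can be left to the shell so that it never disagrees with the requested resources. A minimal sketch follows, assuming the mpirun and spider_mpi paths quoted above; mpi_command is a hypothetical helper, and the limits it checks (16 nodes, 4 CPUs per node) are taken from the cluster description on this page.

```shell
#!/bin/bash
# Print the mpirun line for a SPIDER MPI job with -np = nodes * CPUs per node.
# mpi_command is an illustrative helper name, not part of PBS or HP-MPI.
mpi_command() {
    local nodes=$1 cpus=$2 script=$3
    # Refuse requests larger than the cluster described on this page.
    if [ "$nodes" -gt 16 ] || [ "$cpus" -gt 4 ]; then
        echo "request exceeds cluster size (16 nodes, 4 CPUs per node)" >&2
        return 1
    fi
    # \$PBS_NODEFILE stays literal so PBS fills it in when the job runs.
    echo "/opt/hpmpi/bin/mpirun -np $(( nodes * cpus )) -hostfile \$PBS_NODEFILE /usr/apps/spider_mpi spi/spd @${script}"
}

# 14 nodes with 4 CPUs each, as in the job file above.
mpi_command 14 4 apsh
```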

Protomo

  • Protomo software will only use one CPU, but you can process multiple tomograms at once using the cluster
  • Use the qsub submission process, and make a qsub script for each command
  • sample refinement script:
#! /bin/sh
#PBS -l nodes=1:ncpus=1
#PBS -N PROTOMO
cd $PBS_O_WORKDIR

cat gr668pt50t.param | sed 's/guess=false/guess=true/' > gr668pt50t-01.param
cp gr668pt50t-ali.tlt gr668pt50t-01-itr.tlt
tomo-refine.sh gr668pt50t-01.param >& refine-01.log
tomo-fit.sh gr668pt50t-01.param
cp gr668pt50t-01.param gr668pt50t-02.param
cp gr668pt50t-01-fitted.tlt gr668pt50t-02-itr.tlt
tomo-refine.sh gr668pt50t-02.param >& refine-02.log
tomo-fit.sh gr668pt50t-02.param
cat gr668pt50t-02.param | sed 's/guess=true/guess=false/' > gr668pt50t-03.param
cp gr668pt50t-02-fitted.tlt gr668pt50t-03-itr.tlt
tomo-refine.sh gr668pt50t-03.param >& refine-03.log
tomo-fit.sh gr668pt50t-03.param
cp gr668pt50t-03.param gr668pt50t-04.param
cp gr668pt50t-03-fitted.tlt gr668pt50t-04-itr.tlt
tomo-refine.sh gr668pt50t-04.param >& refine-04.log
tomo-fit.sh gr668pt50t-04.param
cp gr668pt50t-04.param gr668pt50t-05.param
cp gr668pt50t-04-fitted.tlt gr668pt50t-05-itr.tlt
tomo-refine.sh gr668pt50t-05.param >& refine-05.log
tomo-fit.sh gr668pt50t-05.param
cat gr668pt50t-05.param | sed 's/cormod=xcf/cormod=mcf/' > gr668pt50t-06.param
cp gr668pt50t-05-fitted.tlt gr668pt50t-06-itr.tlt
tomo-refine.sh gr668pt50t-06.param >& refine-06.log
tomo-fit.sh gr668pt50t-06.param
cp gr668pt50t-06.param gr668pt50t-07.param
cp gr668pt50t-06-fitted.tlt gr668pt50t-07-itr.tlt
tomo-refine.sh gr668pt50t-07.param >& refine-07.log
tomo-fit.sh gr668pt50t-07.param
cp gr668pt50t-07.param gr668pt50t-08.param
cp gr668pt50t-07-fitted.tlt gr668pt50t-08-itr.tlt
tomo-refine.sh gr668pt50t-08.param >& refine-08.log
tomo-fit.sh gr668pt50t-08.param
cp gr668pt50t-08.param gr668pt50t-09.param
cp gr668pt50t-08-fitted.tlt gr668pt50t-09-itr.tlt
tomo-refine.sh gr668pt50t-09.param >& refine-09.log
tomo-fit.sh gr668pt50t-09.param
cp gr668pt50t-09.param gr668pt50t-10.param
cp gr668pt50t-09-fitted.tlt gr668pt50t-10-itr.tlt
tomo-refine.sh gr668pt50t-10.param >& refine-10.log
tomo-fit.sh gr668pt50t-10.param
cat gr668pt50t-10.param | sed 's/cormod=mcf/cormod=pcf/' > gr668pt50t-11.param
cp gr668pt50t-10-fitted.tlt gr668pt50t-11-itr.tlt
tomo-refine.sh gr668pt50t-11.param >& refine-11.log
tomo-fit.sh gr668pt50t-11.param
cp gr668pt50t-11.param gr668pt50t-12.param
cp gr668pt50t-11-fitted.tlt gr668pt50t-12-itr.tlt
tomo-refine.sh gr668pt50t-12.param >& refine-12.log
tomo-fit.sh gr668pt50t-12.param
cp gr668pt50t-12.param gr668pt50t-13.param
cp gr668pt50t-12-fitted.tlt gr668pt50t-13-itr.tlt
tomo-refine.sh gr668pt50t-13.param >& refine-13.log
tomo-fit.sh gr668pt50t-13.param
cp gr668pt50t-13.param gr668pt50t-14.param
cp gr668pt50t-13-fitted.tlt gr668pt50t-14-itr.tlt
tomo-refine.sh gr668pt50t-14.param >& refine-14.log
tomo-fit.sh gr668pt50t-14.param
cp gr668pt50t-14.param gr668pt50t-15.param
cp gr668pt50t-14-fitted.tlt gr668pt50t-15-itr.tlt
tomo-refine.sh gr668pt50t-15.param >& refine-15.log
tomo-fit.sh gr668pt50t-15.param
cp gr668pt50t-15.param gr668pt50t-16.param
cp gr668pt50t-15-fitted.tlt gr668pt50t-16-itr.tlt
tomo-refine.sh gr668pt50t-16.param >& refine-16.log
tomo-fit.sh gr668pt50t-16.param

  • Notes on script:
    • #PBS -l nodes=1:ncpus=1 --- Only use one CPU on one node
    • #PBS -N PROTOMO --- name the job PROTOMO
  • Sample script for calculating tomogram:
#! /bin/sh
#PBS -l nodes=1:ncpus=1

cd $PBS_O_WORKDIR
tomo-map.sh gr668pt50t-02.param gr668pt50t-02-fitted.tlt

  • cd $PBS_O_WORKDIR --- required to change to the working directory
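
The sample refinement script repeats the same four commands sixteen times, changing a parameter only at iterations 01, 03, 06 and 11. The same sequence could be produced by a loop, sketched here under the assumption that the schedule stays exactly as listed. protomo_commands is a hypothetical helper that only prints the commands; the log redirection is written in the portable form > file 2>&1 rather than >&.

```shell
#!/bin/bash
# Print the 16-iteration Protomo refinement sequence for one tilt series.
# protomo_commands is an illustrative helper; it prints commands, it does not run them.
protomo_commands() {
    local base=$1
    local prev_param="${base}.param" prev_tlt="${base}-ali.tlt"
    local n cur edit
    for n in 01 02 03 04 05 06 07 08 09 10 11 12 13 14 15 16; do
        cur="${base}-${n}"
        # Parameter changes happen only at these iterations; all others copy.
        case $n in
            01) edit='s/guess=false/guess=true/' ;;
            03) edit='s/guess=true/guess=false/' ;;
            06) edit='s/cormod=xcf/cormod=mcf/' ;;
            11) edit='s/cormod=mcf/cormod=pcf/' ;;
            *)  edit='' ;;
        esac
        if [ -n "$edit" ]; then
            echo "cat ${prev_param} | sed '${edit}' > ${cur}.param"
        else
            echo "cp ${prev_param} ${cur}.param"
        fi
        echo "cp ${prev_tlt} ${cur}-itr.tlt"
        echo "tomo-refine.sh ${cur}.param > refine-${n}.log 2>&1"
        echo "tomo-fit.sh ${cur}.param"
        prev_param="${cur}.param"
        prev_tlt="${cur}-fitted.tlt"
    done
}

protomo_commands gr668pt50t
```

Inside a PBS job script, after the cd $PBS_O_WORKDIR line, the printed commands could be executed by piping them to the shell, for example protomo_commands gr668pt50t | bash.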

  • Set ALLOWTOPICVIEW =

-- BillRice - 15 Aug 2008 * xpbsmon screenshot1:

Added:
>
>
* cluster_refinement_scripts.zip: spider refinement scripts for cluster
 
META FILEATTACHMENT attr="" autoattached="1" comment="xpbsmon screenshot1" date="1218836224" name="xpbsmon1.png" path="xpbsmon1.png" size="10985" user="Main.BillRice" version=""
META FILEATTACHMENT attr="" autoattached="1" comment="xpbsmon screenshot2" date="1218836257" name="xpbsmon2.png" path="xpbsmon2.png" size="6977" user="Main.BillRice" version=""
Added:
>
>
META FILEATTACHMENT attachment="cluster_refinement_scripts.zip" attr="" comment="spider refinement scripts for cluster" date="1219084946" name="cluster_refinement_scripts.zip" path="cluster_refinement_scripts.zip" size="46308" stream="cluster_refinement_scripts.zip" user="Main.BillRice" version="0"
 

Revision 14 - 16 Aug 2008 - Main.BillRice

 
META TOPICPARENT name="CemIT"
Contents

General guidelines

  • home directory has 20 GB space allocated to all of CEM computing
  • /usr/data has 100 GB of space allocated to all of CEM computing
  • both disks operate at the same speed
  • In general, do most of your processing on /usr/data, since there is more space available
  • Please remove data after processing so that others will have space to work
Added:
>
>
  • The cluster consists of 16 nodes with 4 CPUs per node. Nodes are numbered 0-15
  • To use the cluster properly, jobs should be submitted through a batch submission system
  • In general, do not run jobs on the head node (node 0)
 

Batch submission system

  • We have PBS (commercial version) for handling of jobs
  • Also have mpirun installed for mpi -aware applications (Spider is the only mpi -aware EM software)
  • In general, pbs submission scripts should start with the following 3 lines:
#!/bin/bash
#PBS -l nodes=4:ncpus=4 
cd $PBS_O_WORKDIR

  • If the script is called samplescript.sh, then the following unix command will submit it to the cluster:
> qsub samplescript.sh
  • To monitor cluster use, use the qstat command:
> qstat
Job id            Name             User              Time Use S Queue
----------------  ---------------- ----------------  -------- - -----
75291.node0       PROTOMO          asiebert          07:00:12 R workq
75292.node0       PROTOMO          asiebert          06:51:08 R workq
75294.node0       PROTOMO          asiebert          06:28:27 R workq
  • Another way is to run xpbsmon, which will give a graphical view of cluster use. xpbsmon1.png



* Click on a node to get a detailed view of how busy it is xpbsmon2.png

EMAN

  • EMAN will work with the batch submission software (PBS) installed on the cluster
  • Need to make a batch submission script for cluster use
  • sample refinement script:
#!/bin/bash

#PBS -N EMAN
#PBS -l nodes=8:ncpus=4
cd $PBS_O_WORKDIR

refine 8 ang=8.5 mask=38 pad=120 hard=25 classkeep=0.8 classiter=8 sym=c1 phasecls median proc=32

  • Notes on script:
    • #!/bin/bash --- use the bash shell
    • #PBS -N EMAN -- name the job EMAN (you can name it what you like)
    • #PBS -l nodes=8:ncpus=4 --- Use 8 nodes on the cluster, with 4 processors per node
    • cd $PBS_O_WORKDIR --- This is needed so that script will be run in the current directory, rather than your home directory
    • refine 8 ang=8.5 ... proc=32 -- proc= command means use 32 processors (8 nodes * 4 CPU per node)
  • To submit the script, if it is named emanrefine.sh
   > qsub emanrefine.sh

SPIDER

  • There are two versions of spider on the cluster:
    • spidermp is the "normal" version of spider, multiprocessor aware but cannot work across nodes (i.e. it can use at most 4 CPUs at once)
    • spidermpi is the MPI aware version of spider, can use entire cluster
      • only a small subset of commands are MPI -aware and working, therefore only use this version for the following commands:
        • ap nq
        • ap mq
        • ap sh
        • ap ref -- THIS COMMAND DOES NOT WORK IN MPI MODE, EVEN THOUGH IT SHOULD
  • Usually, the whole cluster is only used in the refinement process
  • A (modified) set of refinement scripts is attached to this page; these can be used on the cluster for refinement according to the spider documentation

  • pubsub does not work on the cluster, but it can be replaced with qsub, as follows:

Using PBS with SPIDER to replace pubsub:

  • Can use pbs to speed up loops on non-mp aware processes
  • example of a spider script to speed up a non-linear filter
  • Note that this script is designed to run such that each node runs one process. This process uses up to 6 GB, so the place=scatter command is used

filter.spi
;========================input
fr l
[tomo]tomo1_crop.spd
;=========output=================
fr l
[filt]filt
;=========params==============
x90=60          ; cycles
x91=0.1         ; delta t
x92=0           ; sigma
;x93=0.05       ; lambda

;=============================
;---------- following set of commands sets the main qsub script for the cluster, one script per loop event
vm
echo "#!/bin/sh" > qsub-header.txt
vm
echo "#PBS -l select=ncpus=1:mem=6GB " >> qsub-header.txt
vm
echo "#PBS -l place=scatter" >> qsub-header.txt
vm
echo "cd /usr/data/asiebert/30_04_08_Rbs/tomo1" >> qsub-header.txt

;-----------------------------------------------

do lb1 x10=1,10                 ; this loop only makes the scripts using a "slave" script as template

  RR x93             ; adjust lambda according to the following sequence
  0.01,0.02,0.05,0.1,0.2,0.5,1,2,5,10                               ;  a nice log sequence

   vm
   echo "x93={**X93}" > filtscript{**x10}.$PRJEXT
   vm
   echo "x10={**X10}" >> filtscript{**x10}.$PRJEXT
   vm
   cat spider_script_filt1_clu_slave.$PRJEXT >> filtscript{**x10}.$PRJEXT
   vm
   cp qsub-header.txt qsub-script{**x10}.txt
   vm
   echo "/usr/apps/spidermp $PRJEXT/$DATEXT @filtscript{**x10}" >> qsub-script{**x10}.txt
   vm
   qsub qsub-script{**x10}.txt
lb1

en      ; end the main script                                                                  
 
  • The script which actually does the work:

spider_script_filt1_clu_slave.spi
;========================input
fr l
[tomo]tomo1_crop.spd
;=========output=================
fr l
[filt]filt
;=========params==============
x90=60          ; cycles
x91=0.1         ; delta t
x92=0           ; sigma
;x93=0.05       ; lambda

;=============================

   ce ad
   [tomo]
   [filt]{***x10}
   HEG
   x90
   x91
   x92,x93
en d

  • To run the script:
    • spider spi/spd @filter
  • even better, make a qsub submission script for filter.spi, use one CPU and one node, and submit it

Using the MPI version of spider

  • In the case of mpi spider, run it using mpirun
  • The PBS job file should be in the format -
#!/bin/sh
#PBS -l nodes=14:ncpus=4

/opt/hpmpi/bin/mpirun -np 56 -hostfile $PBS_NODEFILE /usr/apps/spider_mpi spi/spd @apsh
  • Notes on command
    • -np 56 : means number of processors = 56. MUST be equal to the number of nodes times the number of CPUs per node (here 14 * 4 = 56)

Protomo

  • Protomo software will only use one CPU, but you can process multiple tomograms at once using the cluster
  • Use the qsub submission process, and make a qsub script for each command
  • sample refinement script:
#! /bin/sh
#PBS -l nodes=1:ncpus=1
#PBS -N PROTOMO
cd $PBS_O_WORKDIR

cat gr668pt50t.param | sed 's/guess=false/guess=true/' > gr668pt50t-01.param
cp gr668pt50t-ali.tlt gr668pt50t-01-itr.tlt
tomo-refine.sh gr668pt50t-01.param >& refine-01.log
tomo-fit.sh gr668pt50t-01.param
cp gr668pt50t-01.param gr668pt50t-02.param
cp gr668pt50t-01-fitted.tlt gr668pt50t-02-itr.tlt
tomo-refine.sh gr668pt50t-02.param >& refine-02.log
tomo-fit.sh gr668pt50t-02.param
cat gr668pt50t-02.param | sed 's/guess=true/guess=false/' > gr668pt50t-03.param
cp gr668pt50t-02-fitted.tlt gr668pt50t-03-itr.tlt
tomo-refine.sh gr668pt50t-03.param >& refine-03.log
tomo-fit.sh gr668pt50t-03.param
cp gr668pt50t-03.param gr668pt50t-04.param
cp gr668pt50t-03-fitted.tlt gr668pt50t-04-itr.tlt
tomo-refine.sh gr668pt50t-04.param >& refine-04.log
tomo-fit.sh gr668pt50t-04.param
cp gr668pt50t-04.param gr668pt50t-05.param
cp gr668pt50t-04-fitted.tlt gr668pt50t-05-itr.tlt
tomo-refine.sh gr668pt50t-05.param >& refine-05.log
tomo-fit.sh gr668pt50t-05.param
cat gr668pt50t-05.param | sed 's/cormod=xcf/cormod=mcf/' > gr668pt50t-06.param
cp gr668pt50t-05-fitted.tlt gr668pt50t-06-itr.tlt
tomo-refine.sh gr668pt50t-06.param >& refine-06.log
tomo-fit.sh gr668pt50t-06.param
cp gr668pt50t-06.param gr668pt50t-07.param
cp gr668pt50t-06-fitted.tlt gr668pt50t-07-itr.tlt
tomo-refine.sh gr668pt50t-07.param >& refine-07.log
tomo-fit.sh gr668pt50t-07.param
cp gr668pt50t-07.param gr668pt50t-08.param
cp gr668pt50t-07-fitted.tlt gr668pt50t-08-itr.tlt
tomo-refine.sh gr668pt50t-08.param >& refine-08.log
tomo-fit.sh gr668pt50t-08.param
cp gr668pt50t-08.param gr668pt50t-09.param
cp gr668pt50t-08-fitted.tlt gr668pt50t-09-itr.tlt
tomo-refine.sh gr668pt50t-09.param >& refine-09.log
tomo-fit.sh gr668pt50t-09.param
cp gr668pt50t-09.param gr668pt50t-10.param
cp gr668pt50t-09-fitted.tlt gr668pt50t-10-itr.tlt
tomo-refine.sh gr668pt50t-10.param >& refine-10.log
tomo-fit.sh gr668pt50t-10.param
cat gr668pt50t-10.param | sed 's/cormod=mcf/cormod=pcf/' > gr668pt50t-11.param
cp gr668pt50t-10-fitted.tlt gr668pt50t-11-itr.tlt
tomo-refine.sh gr668pt50t-11.param >& refine-11.log
tomo-fit.sh gr668pt50t-11.param
cp gr668pt50t-11.param gr668pt50t-12.param
cp gr668pt50t-11-fitted.tlt gr668pt50t-12-itr.tlt
tomo-refine.sh gr668pt50t-12.param >& refine-12.log
tomo-fit.sh gr668pt50t-12.param
cp gr668pt50t-12.param gr668pt50t-13.param
cp gr668pt50t-12-fitted.tlt gr668pt50t-13-itr.tlt
tomo-refine.sh gr668pt50t-13.param >& refine-13.log
tomo-fit.sh gr668pt50t-13.param
cp gr668pt50t-13.param gr668pt50t-14.param
cp gr668pt50t-13-fitted.tlt gr668pt50t-14-itr.tlt
tomo-refine.sh gr668pt50t-14.param >& refine-14.log
tomo-fit.sh gr668pt50t-14.param
cp gr668pt50t-14.param gr668pt50t-15.param
cp gr668pt50t-14-fitted.tlt gr668pt50t-15-itr.tlt
tomo-refine.sh gr668pt50t-15.param >& refine-15.log
tomo-fit.sh gr668pt50t-15.param
cp gr668pt50t-15.param gr668pt50t-16.param
cp gr668pt50t-15-fitted.tlt gr668pt50t-16-itr.tlt
tomo-refine.sh gr668pt50t-16.param >& refine-16.log
tomo-fit.sh gr668pt50t-16.param

  • Notes on script:
    • #PBS -l nodes=1:ncpus=1 --- Only use one CPU on one node
    • #PBS -N PROTOMO --- name the job PROTOMO
  • Sample script for calculating tomogram:
#! /bin/sh
#PBS -l nodes=1:ncpus=1

cd $PBS_O_WORKDIR
tomo-map.sh gr668pt50t-02.param gr668pt50t-02-fitted.tlt

  • cd $PBS_O_WORKDIR --- required to change to the working directory

  • Set ALLOWTOPICVIEW =

-- BillRice - 15 Aug 2008 * xpbsmon screenshot1:

META FILEATTACHMENT attr="" autoattached="1" comment="xpbsmon screenshot1" date="1218836224" name="xpbsmon1.png" path="xpbsmon1.png" size="10985" user="Main.BillRice" version=""
META FILEATTACHMENT attr="" autoattached="1" comment="xpbsmon screenshot2" date="1218836257" name="xpbsmon2.png" path="xpbsmon2.png" size="6977" user="Main.BillRice" version=""

Revision 13 - 15 Aug 2008 - Main.BillRice

 
META TOPICPARENT name="CemIT"
Contents

General guidelines

  • home directory has 20 GB space allocated to all of CEM computing
  • /usr/data has 100 GB of space allocated to all of CEM computing
  • both disks operate at the same speed
  • In general, do most of your processing on /usr/data, since there is more space available
  • Please remove data after processing so that others will have space to work

Batch submission system

  • We have PBS (commercial version) for handling of jobs
  • Also have mpirun installed for mpi -aware applications (Spider is the only mpi -aware EM software)
  • In general, pbs submission scripts should start with the following 3 lines:
#!/bin/bash
#PBS -l nodes=4:ncpus=4 
cd $PBS_O_WORKDIR

  • If the script is called samplescript.sh, then the following unix command will submit it to the cluster:
> qsub samplescript.sh
  • To monitor cluster use, use the qstat command:
> qstat
Job id            Name             User              Time Use S Queue
----------------  ---------------- ----------------  -------- - -----
75291.node0       PROTOMO          asiebert          07:00:12 R workq
75292.node0       PROTOMO          asiebert          06:51:08 R workq
75294.node0       PROTOMO          asiebert          06:28:27 R workq
  • Another way is to run xpbsmon, which will give a graphical view of cluster use. xpbsmon1.png



* Click on a node to get a detailed view of how busy it is xpbsmon2.png

EMAN

  • EMAN will work with the batch submission software (PBS) installed on the cluster
  • Need to make a batch submission script for cluster use
  • sample refinement script:
#!/bin/bash

#PBS -N EMAN
#PBS -l nodes=8:ncpus=4
cd $PBS_O_WORKDIR

refine 8 ang=8.5 mask=38 pad=120 hard=25 classkeep=0.8 classiter=8 sym=c1 phasecls median proc=32

  • Notes on script:
    • #!/bin/bash --- use the bash shell
    • #PBS -N EMAN -- name the job EMAN (you can name it what you like)
    • #PBS -l nodes=8:ncpus=4 --- Use 8 nodes on the cluster, with 4 processors per node
    • cd $PBS_O_WORKDIR --- This is needed so that script will be run in the current directory, rather than your home directory
    • refine 8 ang=8.5 ... proc=32 -- proc= command means use 32 processors (8 nodes * 4 CPU per node)
  • To submit the script, if it is named emanrefine.sh
   > qsub emanrefine.sh

SPIDER

  • There are two versions of spider on the cluster:
    • spidermp is the "normal" version of spider, multiprocessor aware but cannot work across nodes (i.e. it can use at most 4 CPUs at once)
    • spidermpi is the MPI aware version of spider, can use entire cluster
      • only a small subset of commands are MPI -aware and working, therefore only use this version for the following commands:
        • ap nq
        • ap mq
        • ap sh
        • ap ref -- THIS COMMAND DOES NOT WORK IN MPI MODE, EVEN THOUGH IT SHOULD
  • Usually, the whole cluster is only used in the refinement process
  • A (modified) set of refinement scripts is attached to this page; these can be used on the cluster for refinement according to the spider documentation

  • pubsub does not work on the cluster, but it can be replaced with qsub, as follows:

Using PBS with SPIDER to replace pubsub:

  • Can use pbs to speed up loops on non-mp aware processes
  • example of a spider script to speed up a non-linear filter
  • Note that this script is designed to run such that each node runs one process. This process uses up to 6 GB, so the place=scatter command is used

filter.spi
;========================input
fr l
[tomo]tomo1_crop.spd
;=========output=================
fr l
[filt]filt
;=========params==============
x90=60          ; cycles
x91=0.1         ; delta t
x92=0           ; sigma
;x93=0.05       ; lambda

;=============================
;---------- following set of commands sets the main qsub script for the cluster, one script per loop event
vm
echo "#!/bin/sh" > qsub-header.txt
vm
echo "#PBS -l select=ncpus=1:mem=6GB " >> qsub-header.txt
vm
echo "#PBS -l place=scatter" >> qsub-header.txt
vm
echo "cd /usr/data/asiebert/30_04_08_Rbs/tomo1" >> qsub-header.txt

;-----------------------------------------------

do lb1 x10=1,10                 ; this loop only makes the scripts using a "slave" script as template

  RR x93             ; adjust lambda according to the following sequence
  0.01,0.02,0.05,0.1,0.2,0.5,1,2,5,10                               ;  a nice log sequence

   vm
   echo "x93={**X93}" > filtscript{**x10}.$PRJEXT
   vm
   echo "x10={**X10}" >> filtscript{**x10}.$PRJEXT
   vm
   cat spider_script_filt1_clu_slave.$PRJEXT >> filtscript{**x10}.$PRJEXT
   vm
   cp qsub-header.txt qsub-script{**x10}.txt
   vm
   echo "/usr/apps/spidermp $PRJEXT/$DATEXT @filtscript{**x10}" >> qsub-script{**x10}.txt
   vm
   qsub qsub-script{**x10}.txt
lb1

en      ; end the main script                                                                  
 
  • The script which actually does the work:

spider_script_filt1_clu_slave.spi
;========================input
fr l
[tomo]tomo1_crop.spd
;=========output=================
fr l
[filt]filt
;=========params==============
x90=60          ; cycles
x91=0.1         ; delta t
x92=0           ; sigma
;x93=0.05       ; lambda

;=============================

   ce ad
   [tomo]
   [filt]{***x10}
   HEG
   x90
   x91
   x92,x93
en d

  • To run the script:
    • spider spi/spd @filter
  • even better, make a qsub submission script for filter.spi, use one CPU and one node, and submit it
Added:
>
>

Using the MPI version of spider

  • In the case of mpi spider, run it using mpirun
  • The PBS job file should be in the format -
#!/bin/sh
#PBS -l nodes=14:ncpus=4
 
Added:
>
>
/opt/hpmpi/bin/mpirun -np 56 -hostfile $PBS_NODEFILE /usr/apps/spider_mpi spi/spd @apsh
  • Notes on command
    • -np 56 : means number of processors = 56. MUST be equal to the number of nodes times the number of CPUs per node (here 14 * 4 = 56)
 

Protomo

  • Protomo software will only use one CPU, but you can process multiple tomograms at once using the cluster
  • Use the qsub submission process, and make a qsub script for each command
  • sample refinement script:
#! /bin/sh
#PBS -l nodes=1:ncpus=1
#PBS -N PROTOMO
cd $PBS_O_WORKDIR

cat gr668pt50t.param | sed 's/guess=false/guess=true/' > gr668pt50t-01.param
cp gr668pt50t-ali.tlt gr668pt50t-01-itr.tlt
tomo-refine.sh gr668pt50t-01.param >& refine-01.log
tomo-fit.sh gr668pt50t-01.param
cp gr668pt50t-01.param gr668pt50t-02.param
cp gr668pt50t-01-fitted.tlt gr668pt50t-02-itr.tlt
tomo-refine.sh gr668pt50t-02.param >& refine-02.log
tomo-fit.sh gr668pt50t-02.param
cat gr668pt50t-02.param | sed 's/guess=true/guess=false/' > gr668pt50t-03.param
cp gr668pt50t-02-fitted.tlt gr668pt50t-03-itr.tlt
tomo-refine.sh gr668pt50t-03.param >& refine-03.log
tomo-fit.sh gr668pt50t-03.param
cp gr668pt50t-03.param gr668pt50t-04.param
cp gr668pt50t-03-fitted.tlt gr668pt50t-04-itr.tlt
tomo-refine.sh gr668pt50t-04.param >& refine-04.log
tomo-fit.sh gr668pt50t-04.param
cp gr668pt50t-04.param gr668pt50t-05.param
cp gr668pt50t-04-fitted.tlt gr668pt50t-05-itr.tlt
tomo-refine.sh gr668pt50t-05.param >& refine-05.log
tomo-fit.sh gr668pt50t-05.param
cat gr668pt50t-05.param | sed 's/cormod=xcf/cormod=mcf/' > gr668pt50t-06.param
cp gr668pt50t-05-fitted.tlt gr668pt50t-06-itr.tlt
tomo-refine.sh gr668pt50t-06.param >& refine-06.log
tomo-fit.sh gr668pt50t-06.param
cp gr668pt50t-06.param gr668pt50t-07.param
cp gr668pt50t-06-fitted.tlt gr668pt50t-07-itr.tlt
tomo-refine.sh gr668pt50t-07.param >& refine-07.log
tomo-fit.sh gr668pt50t-07.param
cp gr668pt50t-07.param gr668pt50t-08.param
cp gr668pt50t-07-fitted.tlt gr668pt50t-08-itr.tlt
tomo-refine.sh gr668pt50t-08.param >& refine-08.log
tomo-fit.sh gr668pt50t-08.param
cp gr668pt50t-08.param gr668pt50t-09.param
cp gr668pt50t-08-fitted.tlt gr668pt50t-09-itr.tlt
tomo-refine.sh gr668pt50t-09.param >& refine-09.log
tomo-fit.sh gr668pt50t-09.param
cp gr668pt50t-09.param gr668pt50t-10.param
cp gr668pt50t-09-fitted.tlt gr668pt50t-10-itr.tlt
tomo-refine.sh gr668pt50t-10.param >& refine-10.log
tomo-fit.sh gr668pt50t-10.param
cat gr668pt50t-10.param | sed 's/cormod=mcf/cormod=pcf/' > gr668pt50t-11.param
cp gr668pt50t-10-fitted.tlt gr668pt50t-11-itr.tlt
tomo-refine.sh gr668pt50t-11.param >& refine-11.log
tomo-fit.sh gr668pt50t-11.param
cp gr668pt50t-11.param gr668pt50t-12.param
cp gr668pt50t-11-fitted.tlt gr668pt50t-12-itr.tlt
tomo-refine.sh gr668pt50t-12.param >& refine-12.log
tomo-fit.sh gr668pt50t-12.param
cp gr668pt50t-12.param gr668pt50t-13.param
cp gr668pt50t-12-fitted.tlt gr668pt50t-13-itr.tlt
tomo-refine.sh gr668pt50t-13.param >& refine-13.log
tomo-fit.sh gr668pt50t-13.param
cp gr668pt50t-13.param gr668pt50t-14.param
cp gr668pt50t-13-fitted.tlt gr668pt50t-14-itr.tlt
tomo-refine.sh gr668pt50t-14.param >& refine-14.log
tomo-fit.sh gr668pt50t-14.param
cp gr668pt50t-14.param gr668pt50t-15.param
cp gr668pt50t-14-fitted.tlt gr668pt50t-15-itr.tlt
tomo-refine.sh gr668pt50t-15.param >& refine-15.log
tomo-fit.sh gr668pt50t-15.param
cp gr668pt50t-15.param gr668pt50t-16.param
cp gr668pt50t-15-fitted.tlt gr668pt50t-16-itr.tlt
tomo-refine.sh gr668pt50t-16.param >& refine-16.log
tomo-fit.sh gr668pt50t-16.param

  • Notes on script:
    • #PBS -l nodes=1:ncpus=1 --- Only use one CPU on one node
    • #PBS -N PROTOMO --- name the job PROTOMO
  • Sample script for calculating tomogram:
#! /bin/sh
#PBS -l nodes=1:ncpus=1

cd $PBS_O_WORKDIR
tomo-map.sh gr668pt50t-02.param gr668pt50t-02-fitted.tlt

  • cd $PBS_O_WORKDIR --- required to change to the working directory

  • Set ALLOWTOPICVIEW =

-- BillRice - 15 Aug 2008 * xpbsmon screenshot1:

META FILEATTACHMENT attr="" autoattached="1" comment="xpbsmon screenshot1" date="1218836224" name="xpbsmon1.png" path="xpbsmon1.png" size="10985" user="Main.BillRice" version=""
META FILEATTACHMENT attr="" autoattached="1" comment="xpbsmon screenshot2" date="1218836257" name="xpbsmon2.png" path="xpbsmon2.png" size="6977" user="Main.BillRice" version=""

Revision 12 - 15 Aug 2008 - Main.BillRice

 
META TOPICPARENT name="CemIT"
Contents

General guidelines

  • home directory has 20 GB space allocated to all of CEM computing
  • /usr/data has 100 GB of space allocated to all of CEM computing
  • both disks operate at the same speed
  • In general, do most of your processing on /usr/data, since there is more space available
  • Please remove data after processing so that others will have space to work

Batch submission system

  • We have PBS (commercial version) for handling of jobs
  • mpirun is also installed for MPI-aware applications (SPIDER is the only MPI-aware EM software)
  • In general, pbs submission scripts should start with the following 3 lines:
#!/bin/bash
#PBS -l nodes=4:ncpus=4 
cd $PBS_O_WORKDIR

  • If the script is called samplescript.sh, then the following unix command will submit it to the cluster:
> qsub samplescript.sh
  • To monitor cluster use, use the qstat command:
> qstat
Job id            Name             User              Time Use S Queue
----------------  ---------------- ----------------  -------- - -----
75291.node0       PROTOMO          asiebert          07:00:12 R workq
75292.node0       PROTOMO          asiebert          06:51:08 R workq
75294.node0       PROTOMO          asiebert          06:28:27 R workq
  • Another way is to run xpbsmon, which will give a graphical view of cluster use.
Added:
>
>
xpbsmon1.png



  • Click on a node to get a detailed view of how busy it is: xpbsmon2.png

 

EMAN

  • EMAN will work with the batch submission software (PBS) installed on the cluster
  • Need to make a batch submission script for cluster use
  • sample refinement script:
#!/bin/bash

#PBS -N EMAN
#PBS -l nodes=8:ncpus=4
cd $PBS_O_WORKDIR

refine 8 ang=8.5 mask=38 pad=120 hard=25 classkeep=0.8 classiter=8 sym=c1 phasecls median proc=32

  • Notes on script:
    • #!/bin/bash --- use the bash shell
    • #PBS -N EMAN -- name the job EMAN (you can name it what you like)
    • #PBS -l nodes=8:ncpus=4 --- Use 8 nodes on the cluster, with 4 processors per node
    • cd $PBS_O_WORKDIR --- This is needed so that script will be run in the current directory, rather than your home directory
    • refine 8 ang=8.5 ... proc=32 -- proc= command means use 32 processors (8 nodes * 4 CPU per node)
  • To submit the script, if it is named emanrefine.sh
   > qsub emanrefine.sh
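
Since proc= has to equal nodes times CPUs per node, the submission script can be generated so that the two never drift apart. The following is a minimal sketch under that assumption; make_eman_script is a hypothetical helper name, and the refine arguments are copied from the sample script above.

```shell
#!/bin/bash
# Sketch: write an EMAN submission script whose proc= always matches the
# requested nodes * CPUs per node. make_eman_script is a hypothetical helper.
make_eman_script() {
    local nodes=$1 ncpus=$2
    local nproc=$((nodes * ncpus))
    cat <<EOF
#!/bin/bash
#PBS -N EMAN
#PBS -l nodes=${nodes}:ncpus=${ncpus}
cd \$PBS_O_WORKDIR

refine 8 ang=8.5 mask=38 pad=120 hard=25 classkeep=0.8 classiter=8 sym=c1 phasecls median proc=${nproc}
EOF
}

# 8 nodes with 4 CPUs each, as in the sample script.
make_eman_script 8 4 > emanrefine.sh
cat emanrefine.sh
```

The generated emanrefine.sh is then submitted with qsub as described above.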

SPIDER

  • There are two versions of spider on the cluster:
    • spidermp is the "normal" version of spider: multiprocessor-aware, but it cannot work across nodes (i.e., it uses at most 4 CPUs at once)
    • spidermpi is the MPI-aware version of spider and can use the entire cluster
      • only a small subset of commands are MPI-aware and working, so use this version only for the following commands:
        • ap nq
        • ap mq
        • ap sh
        • ap ref -- THIS COMMAND DOES NOT WORK IN MPI MODE, EVEN THOUGH IT SHOULD
  • Usually, the whole cluster is only used in the refinement process
  • A (modified) set of refinement scripts is attached to this page; these can be used on the cluster for refinement according to the spider documentation

  • pubsub does not work on the cluster, but it can be replaced with qsub, as follows:

Using PBS with SPIDER to replace pubsub:

  • Can use pbs to speed up loops on non-mp aware processes
  • example of a spider script to speed up a non-linear filter
  • Note that this script is designed so that each node runs one process. Each process uses up to 6 GB, so the place=scatter directive is used

filter.spi
;========================input
fr l
[tomo]tomo1_crop.spd
;=========output=================
fr l
[filt]filt
;=========params==============
x90=60          ; cycles
x91=0.1         ; delta t
x92=0           ; sigma
;x93=0.05       ; lambda

;=============================
;---------- following set of commands sets the main qsub script for the cluster, one script per loop event
vm
echo "#!/bin/sh" > qsub-header.txt
vm
echo "#PBS -l select=ncpus=1:mem=6GB " >> qsub-header.txt
vm
echo "#PBS -l place=scatter" >> qsub-header.txt
vm
echo "cd /usr/data/asiebert/30_04_08_Rbs/tomo1" >> qsub-header.txt

;-----------------------------------------------

do lb1 x10=1,10                 ; this loop only makes the scripts using a "slave" script as template

  RR x93             ; adjust lambda according to the following sequence
  0.01,0.02,0.05,0.1,0.2,0.5,1,2,5,10                               ;  a nice log sequence

   vm
   echo "x93={**X93}" > filtscript{**x10}.$PRJEXT
   vm
   echo "x10={**X10}" >> filtscript{**x10}.$PRJEXT
   vm
   cat spider_script_filt1_clu_slave.$PRJEXT >> filtscript{**x10}.$PRJEXT
   vm
   cp qsub-header.txt qsub-script{**x10}.txt
   vm
   echo "/usr/apps/spidermp $PRJEXT/$DATEXT @filtscript{**x10}" >> qsub-script{**x10}.txt
   vm
   qsub qsub-script{**x10}.txt
lb1

en      ; end the main script                                                                  
 
  • The script which actually does the work:

spider_script_filt1_clu_slave.spi
;========================input
fr l
[tomo]tomo1_crop.spd
;=========output=================
fr l
[filt]filt
;=========params==============
x90=60          ; cycles
x91=0.1         ; delta t
x92=0           ; sigma
;x93=0.05       ; lambda

;=============================

   ce ad
   [tomo]
   [filt]{***x10}
   HEG
   x90
   x91
   x92,x93
en d

  • To run the script:
    • spider spi/spd @filter
  • even better, make a qsub submission script for filter.spi, use one CPU and one node, and submit it
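
For readers more comfortable with shell than with SPIDER procedure syntax, the fan-out performed by filter.spi (one job script per lambda value, each submitted with qsub) can be sketched as follows. This is an illustrative equivalent, not a replacement for the SPIDER driver: the function name submit_filter_jobs and the SUBMIT and WORKDIR variables are assumptions, SUBMIT defaults to a dry run that only echoes the qsub command, and the spi/spd extensions are taken from the run command above.

```shell
#!/bin/bash
# Sketch: one PBS job per lambda value, mirroring the loop in filter.spi.
# SUBMIT defaults to "echo qsub" (dry run); set SUBMIT=qsub to really submit.
SUBMIT=${SUBMIT:-echo qsub}
WORKDIR=${WORKDIR:-/usr/data/asiebert/30_04_08_Rbs/tomo1}
SLAVE=spider_script_filt1_clu_slave.spi

lambdas=(0.01 0.02 0.05 0.1 0.2 0.5 1 2 5 10)

submit_filter_jobs() {
    local i n lambda
    for i in "${!lambdas[@]}"; do
        n=$(printf '%02d' $((i + 1)))
        lambda=${lambdas[$i]}
        # Per-job SPIDER script: set lambda (x93) and the loop index (x10),
        # then append the worker script if it is present.
        {
            echo "x93=$lambda"
            echo "x10=$((i + 1))"
            if [ -f "$SLAVE" ]; then cat "$SLAVE"; fi
        } > "filtscript$n.spi"
        # Per-job PBS script: one CPU and 6 GB per job, as in the header
        # that filter.spi writes to qsub-header.txt.
        cat > "qsub-script$n.txt" <<EOF
#!/bin/sh
#PBS -l select=ncpus=1:mem=6GB
#PBS -l place=scatter
cd $WORKDIR
/usr/apps/spidermp spi/spd @filtscript$n
EOF
        $SUBMIT "qsub-script$n.txt"
    done
}

submit_filter_jobs
```

Run it from the directory that holds the worker script; with the default dry run it only writes the ten script pairs and prints the qsub commands it would issue.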

Protomo

  • Protomo software will only use one CPU, but you can process multiple tomograms at once using the cluster
  • Use the qsub submission process, and make a qsub script for each command
  • sample refinement script:
#! /bin/sh
#PBS -l nodes=1:ncpus=1
#PBS -N PROTOMO
cd $PBS_O_WORKDIR

cat gr668pt50t.param | sed 's/guess=false/guess=true/' > gr668pt50t-01.param
cp gr668pt50t-ali.tlt gr668pt50t-01-itr.tlt
tomo-refine.sh gr668pt50t-01.param >& refine-01.log
tomo-fit.sh gr668pt50t-01.param
cp gr668pt50t-01.param gr668pt50t-02.param
cp gr668pt50t-01-fitted.tlt gr668pt50t-02-itr.tlt
tomo-refine.sh gr668pt50t-02.param >& refine-02.log
tomo-fit.sh gr668pt50t-02.param
cat gr668pt50t-02.param | sed 's/guess=true/guess=false/' > gr668pt50t-03.param
cp gr668pt50t-02-fitted.tlt gr668pt50t-03-itr.tlt
tomo-refine.sh gr668pt50t-03.param >& refine-03.log
tomo-fit.sh gr668pt50t-03.param
cp gr668pt50t-03.param gr668pt50t-04.param
cp gr668pt50t-03-fitted.tlt gr668pt50t-04-itr.tlt
tomo-refine.sh gr668pt50t-04.param >& refine-04.log
tomo-fit.sh gr668pt50t-04.param
cp gr668pt50t-04.param gr668pt50t-05.param
cp gr668pt50t-04-fitted.tlt gr668pt50t-05-itr.tlt
tomo-refine.sh gr668pt50t-05.param >& refine-05.log
tomo-fit.sh gr668pt50t-05.param
cat gr668pt50t-05.param | sed 's/cormod=xcf/cormod=mcf/' > gr668pt50t-06.param
cp gr668pt50t-05-fitted.tlt gr668pt50t-06-itr.tlt
tomo-refine.sh gr668pt50t-06.param >& refine-06.log
tomo-fit.sh gr668pt50t-06.param
cp gr668pt50t-06.param gr668pt50t-07.param
cp gr668pt50t-06-fitted.tlt gr668pt50t-07-itr.tlt
tomo-refine.sh gr668pt50t-07.param >& refine-07.log
tomo-fit.sh gr668pt50t-07.param
cp gr668pt50t-07.param gr668pt50t-08.param
cp gr668pt50t-07-fitted.tlt gr668pt50t-08-itr.tlt
tomo-refine.sh gr668pt50t-08.param >& refine-08.log
tomo-fit.sh gr668pt50t-08.param
cp gr668pt50t-08.param gr668pt50t-09.param
cp gr668pt50t-08-fitted.tlt gr668pt50t-09-itr.tlt
tomo-refine.sh gr668pt50t-09.param >& refine-09.log
tomo-fit.sh gr668pt50t-09.param
cp gr668pt50t-09.param gr668pt50t-10.param
cp gr668pt50t-09-fitted.tlt gr668pt50t-10-itr.tlt
tomo-refine.sh gr668pt50t-10.param >& refine-10.log
tomo-fit.sh gr668pt50t-10.param
cat gr668pt50t-10.param | sed 's/cormod=mcf/cormod=pcf/' > gr668pt50t-11.param
cp gr668pt50t-10-fitted.tlt gr668pt50t-11-itr.tlt
tomo-refine.sh gr668pt50t-11.param >& refine-11.log
tomo-fit.sh gr668pt50t-11.param
cp gr668pt50t-11.param gr668pt50t-12.param
cp gr668pt50t-11-fitted.tlt gr668pt50t-12-itr.tlt
tomo-refine.sh gr668pt50t-12.param >& refine-12.log
tomo-fit.sh gr668pt50t-12.param
cp gr668pt50t-12.param gr668pt50t-13.param
cp gr668pt50t-12-fitted.tlt gr668pt50t-13-itr.tlt
tomo-refine.sh gr668pt50t-13.param >& refine-13.log
tomo-fit.sh gr668pt50t-13.param
cp gr668pt50t-13.param gr668pt50t-14.param
cp gr668pt50t-13-fitted.tlt gr668pt50t-14-itr.tlt
tomo-refine.sh gr668pt50t-14.param >& refine-14.log
tomo-fit.sh gr668pt50t-14.param
cp gr668pt50t-14.param gr668pt50t-15.param
cp gr668pt50t-14-fitted.tlt gr668pt50t-15-itr.tlt
tomo-refine.sh gr668pt50t-15.param >& refine-15.log
tomo-fit.sh gr668pt50t-15.param
cp gr668pt50t-15.param gr668pt50t-16.param
cp gr668pt50t-15-fitted.tlt gr668pt50t-16-itr.tlt
tomo-refine.sh gr668pt50t-16.param >& refine-16.log
tomo-fit.sh gr668pt50t-16.param

  • Notes on script:
    • #PBS -l nodes=1:ncpus=1 --- Only use one CPU on one node
    • #PBS -N PROTOMO --- name the job PROTOMO
  • Sample script for calculating tomogram:
#! /bin/sh
#PBS -l nodes=1:ncpus=1

cd $PBS_O_WORKDIR
tomo-map.sh gr668pt50t-02.param gr668pt50t-02-fitted.tlt

  • cd $PBS_O_WORKDIR --- required to change to the working directory (the directory from which qsub was run)
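
Because Protomo uses only one CPU, the cluster helps mainly by running several tomograms side by side, one single-CPU job each. The sketch below generates and submits one tomo-map.sh job per tomogram. It is a hedged illustration: submit_tomo_maps and the SUBMIT variable are hypothetical, SUBMIT defaults to a dry run, and only gr668pt50t is a name from this page (the other two base names are made up).

```shell
#!/bin/bash
# Sketch: one single-CPU PBS job per tomogram, so several run side by side.
# SUBMIT defaults to a dry run; set SUBMIT=qsub on the cluster.
SUBMIT=${SUBMIT:-echo qsub}

submit_tomo_maps() {
    local round=$1 base
    shift
    for base in "$@"; do
        # Same header and command as the sample tomogram script above.
        cat > "map-$base.sh" <<EOF
#!/bin/sh
#PBS -l nodes=1:ncpus=1
#PBS -N PROTOMO
cd \$PBS_O_WORKDIR
tomo-map.sh $base-$round.param $base-$round-fitted.tlt
EOF
        $SUBMIT "map-$base.sh"
    done
}

# gr668pt50t comes from the sample script; the other two names are made up.
submit_tomo_maps 02 gr668pt50t example_tomo_a example_tomo_b
```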

  • Set ALLOWTOPICVIEW =

-- BillRice - 15 Aug 2008

  • xpbsmon screenshot1:

Changed:
<
<
xpbsmon1.png
>
>
 
Changed:
<
<
* xpbsmon screenshot2:
xpbsmon2.png
>
>
META FILEATTACHMENT attr="" autoattached="1" comment="xpbsmon screenshot1" date="1218836224" name="xpbsmon1.png" path="xpbsmon1.png" size="10985" user="Main.BillRice" version=""
META FILEATTACHMENT attr="" autoattached="1" comment="xpbsmon screenshot2" date="1218836257" name="xpbsmon2.png" path="xpbsmon2.png" size="6977" user="Main.BillRice" version=""
Deleted:
<
<
META FILEATTACHMENT attachment="xpbsmon1.png" attr="" comment="xpbsmon screenshot1" date="1218836224" name="xpbsmon1.png" path="xpbsmon1.png" size="10985" stream="xpbsmon1.png" user="Main.BillRice" version="0"
META FILEATTACHMENT attachment="xpbsmon2.png" attr="" comment="xpbsmon screenshot2" date="1218836256" name="xpbsmon2.png" path="xpbsmon2.png" size="6977" stream="xpbsmon2.png" user="Main.BillRice" version="0"
 

Revision 11 - 15 Aug 2008 - Main.BillRice

 
META TOPICPARENT name="CemIT"
Contents

General guidelines

  • home directory has 20 GB space allocated to all of CEM computing
  • /usr/data has 100 GB of space allocated to all of CEM computing
  • both disks operate at the same speed
  • In general, do most of your processing on /usr/data, since there is more space available
  • Please remove data after processing so that others will have space to work

Batch submission system

  • We have PBS (commercial version) for handling of jobs
  • mpirun is also installed for MPI-aware applications (SPIDER is the only MPI-aware EM software)
  • In general, pbs submission scripts should start with the following 3 lines:
#!/bin/bash
#PBS -l nodes=4:ncpus=4 
cd $PBS_O_WORKDIR

  • If the script is called samplescript.sh, then the following unix command will submit it to the cluster:
> qsub samplescript.sh
  • To monitor cluster use, use the qstat command:
> qstat
Job id            Name             User              Time Use S Queue
----------------  ---------------- ----------------  -------- - -----
75291.node0       PROTOMO          asiebert          07:00:12 R workq
75292.node0       PROTOMO          asiebert          06:51:08 R workq
75294.node0       PROTOMO          asiebert          06:28:27 R workq
  • Another way is to run xpbsmon, which will give a graphical view of cluster use.
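
The qstat listing can also be summarised on the command line. The sketch below counts running jobs (state R) per user; running_per_user is a hypothetical helper, and it assumes the default column layout shown in the listing above (user in the third column, state in the fifth).

```shell
#!/bin/bash
# Sketch: count running (state R) jobs per user from qstat's default listing.
# running_per_user is a hypothetical helper that reads qstat output on stdin.
running_per_user() {
    # Skip the two header lines, then tally column 3 (User) where
    # column 5 (S) is R.
    awk 'NR > 2 && $5 == "R" { n[$3]++ }
         END { for (u in n) print u, n[u] }' | sort
}

# On the cluster:  qstat | running_per_user
# Here the sample listing from this page stands in for live output.
running_per_user <<'EOF'
Job id            Name             User              Time Use S Queue
----------------  ---------------- ----------------  -------- - -----
75291.node0       PROTOMO          asiebert          07:00:12 R workq
75292.node0       PROTOMO          asiebert          06:51:08 R workq
75294.node0       PROTOMO          asiebert          06:28:27 R workq
EOF
```

For the sample listing this prints "asiebert 3".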

EMAN

  • EMAN will work with the batch submission software (PBS) installed on the cluster
  • Need to make a batch submission script for cluster use
  • sample refinement script:
#!/bin/bash

#PBS -N EMAN
#PBS -l nodes=8:ncpus=4
cd $PBS_O_WORKDIR

refine 8 ang=8.5 mask=38 pad=120 hard=25 classkeep=0.8 classiter=8 sym=c1 phasecls median proc=32

  • Notes on script:
    • #!/bin/bash --- use the bash shell
    • #PBS -N EMAN -- name the job EMAN (you can name it what you like)
    • #PBS -l nodes=8:ncpus=4 --- Use 8 nodes on the cluster, with 4 processors per node
    • cd $PBS_O_WORKDIR --- This is needed so that script will be run in the current directory, rather than your home directory
    • refine 8 ang=8.5 ... proc=32 -- proc= command means use 32 processors (8 nodes * 4 CPU per node)
  • To submit the script, if it is named emanrefine.sh
   > qsub emanrefine.sh

SPIDER

  • There are two versions of spider on the cluster:
    • spidermp is the "normal" version of spider: multiprocessor-aware, but it cannot work across nodes (i.e., it uses at most 4 CPUs at once)
    • spidermpi is the MPI-aware version of spider and can use the entire cluster
      • only a small subset of commands are MPI-aware and working, so use this version only for the following commands:
        • ap nq
        • ap mq
        • ap sh
        • ap ref -- THIS COMMAND DOES NOT WORK IN MPI MODE, EVEN THOUGH IT SHOULD
  • Usually, the whole cluster is only used in the refinement process
  • A (modified) set of refinement scripts is attached to this page; these can be used on the cluster for refinement according to the spider documentation

  • pubsub does not work on the cluster, but it can be replaced with qsub, as follows:

Using PBS with SPIDER to replace pubsub:

  • Can use pbs to speed up loops on non-mp aware processes
  • example of a spider script to speed up a non-linear filter
  • Note that this script is designed so that each node runs one process. Each process uses up to 6 GB, so the place=scatter directive is used

filter.spi
;========================input
fr l
[tomo]tomo1_crop.spd
;=========output=================
fr l
[filt]filt
;=========params==============
x90=60          ; cycles
x91=0.1         ; delta t
x92=0           ; sigma
;x93=0.05       ; lambda

;=============================
;---------- following set of commands sets the main qsub script for the cluster, one script per loop event
vm
echo "#!/bin/sh" > qsub-header.txt
vm
echo "#PBS -l select=ncpus=1:mem=6GB " >> qsub-header.txt
vm
echo "#PBS -l place=scatter" >> qsub-header.txt
vm
echo "cd /usr/data/asiebert/30_04_08_Rbs/tomo1" >> qsub-header.txt

;-----------------------------------------------

do lb1 x10=1,10                 ; this loop only makes the scripts using a "slave" script as template

  RR x93             ; adjust lambda according to the following sequence
  0.01,0.02,0.05,0.1,0.2,0.5,1,2,5,10                               ;  a nice log sequence

   vm
   echo "x93={**X93}" > filtscript{**x10}.$PRJEXT
   vm
   echo "x10={**X10}" >> filtscript{**x10}.$PRJEXT
   vm
   cat spider_script_filt1_clu_slave.$PRJEXT >> filtscript{**x10}.$PRJEXT
   vm
   cp qsub-header.txt qsub-script{**x10}.txt
   vm
   echo "/usr/apps/spidermp $PRJEXT/$DATEXT @filtscript{**x10}" >> qsub-script{**x10}.txt
   vm
   qsub qsub-script{**x10}.txt
lb1

en      ; end the main script                                                                  
 
  • The script which actually does the work:

spider_script_filt1_clu_slave.spi
;========================input
fr l
[tomo]tomo1_crop.spd
;=========output=================
fr l
[filt]filt
;=========params==============
x90=60          ; cycles
x91=0.1         ; delta t
x92=0           ; sigma
;x93=0.05       ; lambda

;=============================

   ce ad
   [tomo]
   [filt]{***x10}
   HEG
   x90
   x91
   x92,x93
en d

  • To run the script:
    • spider spi/spd @filter
  • even better, make a qsub submission script for filter.spi, use one CPU and one node, and submit it
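
A one-node, one-CPU wrapper for the filter.spi driver might look like the sketch below. The job name FILTER and the file name filter-driver.sh are illustrative assumptions, and the SPIDER binary path is the one used in the generated job scripts above; adjust both to the local setup.

```shell
#!/bin/bash
# Sketch: wrap the filter.spi driver in a one-node, one-CPU PBS job rather
# than running it interactively. Job and file names are illustrative.
cat > filter-driver.sh <<'EOF'
#!/bin/bash
#PBS -N FILTER
#PBS -l nodes=1:ncpus=1
cd $PBS_O_WORKDIR
/usr/apps/spidermp spi/spd @filter
EOF

cat filter-driver.sh
# Submit on the cluster with:  qsub filter-driver.sh
```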

Protomo

  • Protomo software will only use one CPU, but you can process multiple tomograms at once using the cluster
  • Use the qsub submission process, and make a qsub script for each command
  • sample refinement script:
#! /bin/sh
#PBS -l nodes=1:ncpus=1
#PBS -N PROTOMO
cd $PBS_O_WORKDIR

cat gr668pt50t.param | sed 's/guess=false/guess=true/' > gr668pt50t-01.param
cp gr668pt50t-ali.tlt gr668pt50t-01-itr.tlt
tomo-refine.sh gr668pt50t-01.param >& refine-01.log
tomo-fit.sh gr668pt50t-01.param
cp gr668pt50t-01.param gr668pt50t-02.param
cp gr668pt50t-01-fitted.tlt gr668pt50t-02-itr.tlt
tomo-refine.sh gr668pt50t-02.param >& refine-02.log
tomo-fit.sh gr668pt50t-02.param
cat gr668pt50t-02.param | sed 's/guess=true/guess=false/' > gr668pt50t-03.param
cp gr668pt50t-02-fitted.tlt gr668pt50t-03-itr.tlt
tomo-refine.sh gr668pt50t-03.param >& refine-03.log
tomo-fit.sh gr668pt50t-03.param
cp gr668pt50t-03.param gr668pt50t-04.param
cp gr668pt50t-03-fitted.tlt gr668pt50t-04-itr.tlt
tomo-refine.sh gr668pt50t-04.param >& refine-04.log
tomo-fit.sh gr668pt50t-04.param
cp gr668pt50t-04.param gr668pt50t-05.param
cp gr668pt50t-04-fitted.tlt gr668pt50t-05-itr.tlt
tomo-refine.sh gr668pt50t-05.param >& refine-05.log
tomo-fit.sh gr668pt50t-05.param
cat gr668pt50t-05.param | sed 's/cormod=xcf/cormod=mcf/' > gr668pt50t-06.param
cp gr668pt50t-05-fitted.tlt gr668pt50t-06-itr.tlt
tomo-refine.sh gr668pt50t-06.param >& refine-06.log
tomo-fit.sh gr668pt50t-06.param
cp gr668pt50t-06.param gr668pt50t-07.param
cp gr668pt50t-06-fitted.tlt gr668pt50t-07-itr.tlt
tomo-refine.sh gr668pt50t-07.param >& refine-07.log
tomo-fit.sh gr668pt50t-07.param
cp gr668pt50t-07.param gr668pt50t-08.param
cp gr668pt50t-07-fitted.tlt gr668pt50t-08-itr.tlt
tomo-refine.sh gr668pt50t-08.param >& refine-08.log
tomo-fit.sh gr668pt50t-08.param
cp gr668pt50t-08.param gr668pt50t-09.param
cp gr668pt50t-08-fitted.tlt gr668pt50t-09-itr.tlt
tomo-refine.sh gr668pt50t-09.param >& refine-09.log
tomo-fit.sh gr668pt50t-09.param
cp gr668pt50t-09.param gr668pt50t-10.param
cp gr668pt50t-09-fitted.tlt gr668pt50t-10-itr.tlt
tomo-refine.sh gr668pt50t-10.param >& refine-10.log
tomo-fit.sh gr668pt50t-10.param
cat gr668pt50t-10.param | sed 's/cormod=mcf/cormod=pcf/' > gr668pt50t-11.param
cp gr668pt50t-10-fitted.tlt gr668pt50t-11-itr.tlt
tomo-refine.sh gr668pt50t-11.param >& refine-11.log
tomo-fit.sh gr668pt50t-11.param
cp gr668pt50t-11.param gr668pt50t-12.param
cp gr668pt50t-11-fitted.tlt gr668pt50t-12-itr.tlt
tomo-refine.sh gr668pt50t-12.param >& refine-12.log
tomo-fit.sh gr668pt50t-12.param
cp gr668pt50t-12.param gr668pt50t-13.param
cp gr668pt50t-12-fitted.tlt gr668pt50t-13-itr.tlt
tomo-refine.sh gr668pt50t-13.param >& refine-13.log
tomo-fit.sh gr668pt50t-13.param
cp gr668pt50t-13.param gr668pt50t-14.param
cp gr668pt50t-13-fitted.tlt gr668pt50t-14-itr.tlt
tomo-refine.sh gr668pt50t-14.param >& refine-14.log
tomo-fit.sh gr668pt50t-14.param
cp gr668pt50t-14.param gr668pt50t-15.param
cp gr668pt50t-14-fitted.tlt gr668pt50t-15-itr.tlt
tomo-refine.sh gr668pt50t-15.param >& refine-15.log
tomo-fit.sh gr668pt50t-15.param
cp gr668pt50t-15.param gr668pt50t-16.param
cp gr668pt50t-15-fitted.tlt gr668pt50t-16-itr.tlt
tomo-refine.sh gr668pt50t-16.param >& refine-16.log
tomo-fit.sh gr668pt50t-16.param

  • Notes on script:
    • #PBS -l nodes=1:ncpus=1 --- Only use one CPU on one node
    • #PBS -N PROTOMO --- name the job PROTOMO
  • Sample script for calculating tomogram:
#! /bin/sh
#PBS -l nodes=1:ncpus=1

cd $PBS_O_WORKDIR
tomo-map.sh gr668pt50t-02.param gr668pt50t-02-fitted.tlt

  • cd $PBS_O_WORKDIR --- required to change to the working directory (the directory from which qsub was run)

  • Set ALLOWTOPICVIEW =

-- BillRice - 15 Aug 2008

  • xpbsmon screenshot1:
xpbsmon1.png

Added:
>
>
* xpbsmon screenshot2:
xpbsmon2.png
 
META FILEATTACHMENT attachment="xpbsmon1.png" attr="" comment="xpbsmon screenshot1" date="1218836224" name="xpbsmon1.png" path="xpbsmon1.png" size="10985" stream="xpbsmon1.png" user="Main.BillRice" version="0"
Added:
>
>
META FILEATTACHMENT attachment="xpbsmon2.png" attr="" comment="xpbsmon screenshot2" date="1218836256" name="xpbsmon2.png" path="xpbsmon2.png" size="6977" stream="xpbsmon2.png" user="Main.BillRice" version="0"
 

Revision 10 - 15 Aug 2008 - Main.BillRice

 
META TOPICPARENT name="CemIT"
Contents

General guidelines

  • home directory has 20 GB space allocated to all of CEM computing
  • /usr/data has 100 GB of space allocated to all of CEM computing
  • both disks operate at the same speed
  • In general, do most of your processing on /usr/data, since there is more space available
  • Please remove data after processing so that others will have space to work
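
To find data worth removing, it helps to see what is taking up space. The following is a minimal sketch using standard df, du, and sort; biggest_entries is a hypothetical helper name, and /usr/data is the shared disk named above.

```shell
#!/bin/bash
# Sketch: list the largest entries under a directory so stale data can be
# found and removed. biggest_entries is a hypothetical helper.
biggest_entries() {
    local dir=${1:-/usr/data} count=${2:-10}
    # du -sk prints the size in KB of each entry; sort largest first.
    du -sk "$dir"/* 2>/dev/null | sort -rn | head -n "$count"
}

# Free space on the shared disk, then the ten largest entries in it.
df -h /usr/data 2>/dev/null
biggest_entries /usr/data 10
```

Pass your own directory as the first argument to check only your data.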

Batch submission system

  • We have PBS (commercial version) for handling of jobs
  • mpirun is also installed for MPI-aware applications (SPIDER is the only MPI-aware EM software)
  • In general, pbs submission scripts should start with the following 3 lines:
#!/bin/bash
#PBS -l nodes=4:ncpus=4 
cd $PBS_O_WORKDIR

  • If the script is called samplescript.sh, then the following unix command will submit it to the cluster:
> qsub samplescript.sh
  • To monitor cluster use, use the qstat command:
> qstat
Job id            Name             User              Time Use S Queue
----------------  ---------------- ----------------  -------- - -----
75291.node0       PROTOMO          asiebert          07:00:12 R workq
75292.node0       PROTOMO          asiebert          06:51:08 R workq
75294.node0       PROTOMO          asiebert          06:28:27 R workq
  • Another way is to run xpbsmon, which will give a graphical view of cluster use.

EMAN

  • EMAN will work with the batch submission software (PBS) installed on the cluster
  • Need to make a batch submission script for cluster use
  • sample refinement script:
#!/bin/bash

#PBS -N EMAN
#PBS -l nodes=8:ncpus=4
cd $PBS_O_WORKDIR

refine 8 ang=8.5 mask=38 pad=120 hard=25 classkeep=0.8 classiter=8 sym=c1 phasecls median proc=32

  • Notes on script:
    • #!/bin/bash --- use the bash shell
    • #PBS -N EMAN -- name the job EMAN (you can name it what you like)
    • #PBS -l nodes=8:ncpus=4 --- Use 8 nodes on the cluster, with 4 processors per node
    • cd $PBS_O_WORKDIR --- This is needed so that script will be run in the current directory, rather than your home directory
    • refine 8 ang=8.5 ... proc=32 -- proc= command means use 32 processors (8 nodes * 4 CPU per node)
  • To submit the script, if it is named emanrefine.sh
   > qsub emanrefine.sh

SPIDER

  • There are two versions of spider on the cluster:
    • spidermp is the "normal" version of spider: multiprocessor-aware, but it cannot work across nodes (i.e., it uses at most 4 CPUs at once)
    • spidermpi is the MPI-aware version of spider and can use the entire cluster
      • only a small subset of commands are MPI-aware and working, so use this version only for the following commands:
        • ap nq
        • ap mq
        • ap sh
        • ap ref -- THIS COMMAND DOES NOT WORK IN MPI MODE, EVEN THOUGH IT SHOULD
  • Usually, the whole cluster is only used in the refinement process
  • A (modified) set of refinement scripts is attached to this page; these can be used on the cluster for refinement according to the spider documentation

  • pubsub does not work on the cluster, but it can be replaced with qsub, as follows:

Using PBS with SPIDER to replace pubsub:

  • Can use pbs to speed up loops on non-mp aware processes
  • example of a spider script to speed up a non-linear filter
  • Note that this script is designed so that each node runs one process. Each process uses up to 6 GB, so the place=scatter directive is used

filter.spi
;========================input
fr l
[tomo]tomo1_crop.spd
;=========output=================
fr l
[filt]filt
;=========params==============
x90=60          ; cycles
x91=0.1         ; delta t
x92=0           ; sigma
;x93=0.05       ; lambda

;=============================
;---------- following set of commands sets the main qsub script for the cluster, one script per loop event
vm
echo "#!/bin/sh" > qsub-header.txt
vm
echo "#PBS -l select=ncpus=1:mem=6GB " >> qsub-header.txt
vm
echo "#PBS -l place=scatter" >> qsub-header.txt
vm
echo "cd /usr/data/asiebert/30_04_08_Rbs/tomo1" >> qsub-header.txt

;-----------------------------------------------

do lb1 x10=1,10                 ; this loop only makes the scripts using a "slave" script as template

  RR x93             ; adjust lambda according to the following sequence
  0.01,0.02,0.05,0.1,0.2,0.5,1,2,5,10                               ;  a nice log sequence

   vm
   echo "x93={**X93}" > filtscript{**x10}.$PRJEXT
   vm
   echo "x10={**X10}" >> filtscript{**x10}.$PRJEXT
   vm
   cat spider_script_filt1_clu_slave.$PRJEXT >> filtscript{**x10}.$PRJEXT
   vm
   cp qsub-header.txt qsub-script{**x10}.txt
   vm
   echo "/usr/apps/spidermp $PRJEXT/$DATEXT @filtscript{**x10}" >> qsub-script{**x10}.txt
   vm
   qsub qsub-script{**x10}.txt
lb1

en      ; end the main script                                                                  
 
  • The script which actually does the work:

spider_script_filt1_clu_slave.spi
;========================input
fr l
[tomo]tomo1_crop.spd
;=========output=================
fr l
[filt]filt
;=========params==============
x90=60          ; cycles
x91=0.1         ; delta t
x92=0           ; sigma
;x93=0.05       ; lambda

;=============================

   ce ad
   [tomo]
   [filt]{***x10}
   HEG
   x90
   x91
   x92,x93
en d

  • To run the script:
    • spider spi/spd @filter
  • even better, make a qsub submission script for filter.spi, use one CPU and one node, and submit it

Protomo

  • Protomo software will only use one CPU, but you can process multiple tomograms at once using the cluster
  • Use the qsub submission process, and make a qsub script for each command
  • sample refinement script:
#! /bin/sh
#PBS -l nodes=1:ncpus=1
#PBS -N PROTOMO
cd $PBS_O_WORKDIR

cat gr668pt50t.param | sed 's/guess=false/guess=true/' > gr668pt50t-01.param
cp gr668pt50t-ali.tlt gr668pt50t-01-itr.tlt
tomo-refine.sh gr668pt50t-01.param >& refine-01.log
tomo-fit.sh gr668pt50t-01.param
cp gr668pt50t-01.param gr668pt50t-02.param
cp gr668pt50t-01-fitted.tlt gr668pt50t-02-itr.tlt
tomo-refine.sh gr668pt50t-02.param >& refine-02.log
tomo-fit.sh gr668pt50t-02.param
cat gr668pt50t-02.param | sed 's/guess=true/guess=false/' > gr668pt50t-03.param
cp gr668pt50t-02-fitted.tlt gr668pt50t-03-itr.tlt
tomo-refine.sh gr668pt50t-03.param >& refine-03.log
tomo-fit.sh gr668pt50t-03.param
cp gr668pt50t-03.param gr668pt50t-04.param
cp gr668pt50t-03-fitted.tlt gr668pt50t-04-itr.tlt
tomo-refine.sh gr668pt50t-04.param >& refine-04.log
tomo-fit.sh gr668pt50t-04.param
cp gr668pt50t-04.param gr668pt50t-05.param
cp gr668pt50t-04-fitted.tlt gr668pt50t-05-itr.tlt
tomo-refine.sh gr668pt50t-05.param >& refine-05.log
tomo-fit.sh gr668pt50t-05.param
cat gr668pt50t-05.param | sed 's/cormod=xcf/cormod=mcf/' > gr668pt50t-06.param
cp gr668pt50t-05-fitted.tlt gr668pt50t-06-itr.tlt
tomo-refine.sh gr668pt50t-06.param >& refine-06.log
tomo-fit.sh gr668pt50t-06.param
cp gr668pt50t-06.param gr668pt50t-07.param
cp gr668pt50t-06-fitted.tlt gr668pt50t-07-itr.tlt
tomo-refine.sh gr668pt50t-07.param >& refine-07.log
tomo-fit.sh gr668pt50t-07.param
cp gr668pt50t-07.param gr668pt50t-08.param
cp gr668pt50t-07-fitted.tlt gr668pt50t-08-itr.tlt
tomo-refine.sh gr668pt50t-08.param >& refine-08.log
tomo-fit.sh gr668pt50t-08.param
cp gr668pt50t-08.param gr668pt50t-09.param
cp gr668pt50t-08-fitted.tlt gr668pt50t-09-itr.tlt
tomo-refine.sh gr668pt50t-09.param >& refine-09.log
tomo-fit.sh gr668pt50t-09.param
cp gr668pt50t-09.param gr668pt50t-10.param
cp gr668pt50t-09-fitted.tlt gr668pt50t-10-itr.tlt
tomo-refine.sh gr668pt50t-10.param >& refine-10.log
tomo-fit.sh gr668pt50t-10.param
cat gr668pt50t-10.param | sed 's/cormod=mcf/cormod=pcf/' > gr668pt50t-11.param
cp gr668pt50t-10-fitted.tlt gr668pt50t-11-itr.tlt
tomo-refine.sh gr668pt50t-11.param >& refine-11.log
tomo-fit.sh gr668pt50t-11.param
cp gr668pt50t-11.param gr668pt50t-12.param
cp gr668pt50t-11-fitted.tlt gr668pt50t-12-itr.tlt
tomo-refine.sh gr668pt50t-12.param >& refine-12.log
tomo-fit.sh gr668pt50t-12.param
cp gr668pt50t-12.param gr668pt50t-13.param
cp gr668pt50t-12-fitted.tlt gr668pt50t-13-itr.tlt
tomo-refine.sh gr668pt50t-13.param >& refine-13.log
tomo-fit.sh gr668pt50t-13.param
cp gr668pt50t-13.param gr668pt50t-14.param
cp gr668pt50t-13-fitted.tlt gr668pt50t-14-itr.tlt
tomo-refine.sh gr668pt50t-14.param >& refine-14.log
tomo-fit.sh gr668pt50t-14.param
cp gr668pt50t-14.param gr668pt50t-15.param
cp gr668pt50t-14-fitted.tlt gr668pt50t-15-itr.tlt
tomo-refine.sh gr668pt50t-15.param >& refine-15.log
tomo-fit.sh gr668pt50t-15.param
cp gr668pt50t-15.param gr668pt50t-16.param
cp gr668pt50t-15-fitted.tlt gr668pt50t-16-itr.tlt
tomo-refine.sh gr668pt50t-16.param >& refine-16.log
tomo-fit.sh gr668pt50t-16.param

  • Notes on script:
    • #PBS -l nodes=1:ncpus=1 --- Only use one CPU on one node
    • #PBS -N PROTOMO --- name the job PROTOMO
  • Sample script for calculating tomogram:
#! /bin/sh
#PBS -l nodes=1:ncpus=1

cd $PBS_O_WORKDIR
tomo-map.sh gr668pt50t-02.param gr668pt50t-02-fitted.tlt

  • cd $PBS_O_WORKDIR --- required to change to the working directory (the directory from which qsub was run)

  • Set ALLOWTOPICVIEW =

-- BillRice - 15 Aug 2008

Added:
>
>
* xpbsmon screenshot1:
xpbsmon1.png

META FILEATTACHMENT attachment="xpbsmon1.png" attr="" comment="xpbsmon screenshot1" date="1218836224" name="xpbsmon1.png" path="xpbsmon1.png" size="10985" stream="xpbsmon1.png" user="Main.BillRice" version="0"
 

Revision 9 - 15 Aug 2008 - Main.BillRice

 
META TOPICPARENT name="CemIT"
Contents

General guidelines

  • home directory has 20 GB space allocated to all of CEM computing
  • /usr/data has 100 GB of space allocated to all of CEM computing
  • both disks operate at the same speed
  • In general, do most of your processing on /usr/data, since there is more space available
  • Please remove data after processing so that others will have space to work

Batch submission system

  • We have PBS (commercial version) for handling of jobs
  • mpirun is also installed for MPI-aware applications (SPIDER is the only MPI-aware EM software)
Added:
>
>
  • In general, pbs submission scripts should start with the following 3 lines:
#!/bin/bash
#PBS -l nodes=4:ncpus=4 
cd $PBS_O_WORKDIR
 
Added:
>
>
  • If the script is called samplescript.sh, then the following unix command will submit it to the cluster:
> qsub samplescript.sh
  • To monitor cluster use, use the qstat command:
> qstat
Job id            Name             User              Time Use S Queue
----------------  ---------------- ----------------  -------- - -----
75291.node0       PROTOMO          asiebert          07:00:12 R workq
75292.node0       PROTOMO          asiebert          06:51:08 R workq
75294.node0       PROTOMO          asiebert          06:28:27 R workq
  • Another way is to run xpbsmon, which will give a graphical view of cluster use.
 

EMAN

  • EMAN will work with the batch submission software (PBS) installed on the cluster
  • Need to make a batch submission script for cluster use
  • sample refinement script:
#!/bin/bash

#PBS -N EMAN
#PBS -l nodes=8:ncpus=4
cd $PBS_O_WORKDIR

refine 8 ang=8.5 mask=38 pad=120 hard=25 classkeep=0.8 classiter=8 sym=c1 phasecls median proc=32

  • Notes on script:
    • #!/bin/bash --- use the bash shell
    • #PBS -N EMAN -- name the job EMAN (you can name it what you like)
    • #PBS -l nodes=8:ncpus=4 --- Use 8 nodes on the cluster, with 4 processors per node
    • cd $PBS_O_WORKDIR --- This is needed so that script will be run in the current directory, rather than your home directory
    • refine 8 ang=8.5 ... proc=32 -- proc= command means use 32 processors (8 nodes * 4 CPU per node)
  • To submit the script, if it is named emanrefine.sh
   > qsub emanrefine.sh

SPIDER

  • There are two versions of spider on the cluster:
    • spidermp is the "normal" version of spider: multiprocessor-aware, but it cannot work across nodes (i.e., it uses at most 4 CPUs at once)
    • spidermpi is the MPI-aware version of spider and can use the entire cluster
      • only a small subset of commands are MPI-aware and working, so use this version only for the following commands:
        • ap nq
        • ap mq
        • ap sh
        • ap ref -- THIS COMMAND DOES NOT WORK IN MPI MODE, EVEN THOUGH IT SHOULD
  • Usually, the whole cluster is only used in the refinement process
    • A (modified) set of refinement scripts is attached to this page; these can be used on the cluster for refinement according to the spider documentation

  • pubsub does not work on the cluster, but it can be replaced with qsub, as follows:

Using PBS with SPIDER to replace pubsub:

  • Can use pbs to speed up loops on non-mp aware processes
  • example of a spider script to speed up a non-linear filter
  • Note that this script is designed so that each node runs one process. Each process uses up to 6 GB of memory, so the place=scatter option is used

filter.spi
;========================input
fr l
[tomo]tomo1_crop.spd
;=========output=================
fr l
[filt]filt
;=========params==============
x90=60          ; cycles
x91=0.1         ; delta t
x92=0           ; sigma
;x93=0.05       ; lambda

;=============================
;---------- following set of commands sets the main qsub script for the cluster, one script per loop event
vm
echo "#!/bin/sh" > qsub-header.txt
vm
echo "#PBS -l select=ncpus=1:mem=6GB " >> qsub-header.txt
vm
echo "#PBS -l place=scatter" >> qsub-header.txt
vm
echo "cd /usr/data/asiebert/30_04_08_Rbs/tomo1" >> qsub-header.txt

;-----------------------------------------------

do lb1 x10=1,10                 ; this loop only makes the scripts using a "slave" script as template

  RR x93             ; adjust lambda according to the following sequence
  0.01,0.02,0.05,0.1,0.2,0.5,1,2,5,10                               ;  a nice log sequence

   vm
   echo "x93={**X93}" > filtscript{**x10}.$PRJEXT
   vm
   echo "x10={**X10}" >> filtscript{**x10}.$PRJEXT
   vm
   cat spider_script_filt1_clu_slave.$PRJEXT >> filtscript{**x10}.$PRJEXT
   vm
   cp qsub-header.txt qsub-script{**x10}.txt
   vm
   echo "/usr/apps/spidermp $PRJEXT/$DATEXT @filtscript{**x10}" >> qsub-script{**x10}.txt
   vm
   qsub qsub-script{**x10}.txt
lb1

en      ; end the main script                                                                  
 
  • The script which actually does the work:

spider_script_filt1_clu_slave.spi
;========================input
fr l
[tomo]tomo1_crop.spd
;=========output=================
fr l
[filt]filt
;=========params==============
x90=60          ; cycles
x91=0.1         ; delta t
x92=0           ; sigma
;x93=0.05       ; lambda

;=============================

   ce ad
   [tomo]
   [filt]{***x10}
   HEG
   x90
   x91
   x92,x93
en d

  • To run the script:
    • spider spi/spd @filter
  • even better, make a qsub submission script for filter.spi, use one CPU and one node, and submit it
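
A wrapper of that kind could look like the sketch below. The function name make_filter_job and the job name FILTER are assumptions made for illustration; the spider command line is the one given above, and the resource line follows the one-node, one-CPU form used elsewhere on this page.

```shell
# Sketch: print a one-node, one-CPU PBS job that runs filter.spi.
# The quoted 'EOF' keeps $PBS_O_WORKDIR literal in the generated script.
make_filter_job() {
    cat <<'EOF'
#!/bin/sh
#PBS -N FILTER
#PBS -l nodes=1:ncpus=1
cd $PBS_O_WORKDIR

spider spi/spd @filter
EOF
}

# Example (run from the directory that holds filter.spi):
#   make_filter_job > filter-job.sh
#   qsub filter-job.sh
```

Submitting the driver this way keeps it off the head node; it only writes the per-lambda scripts and calls qsub, so one CPU should be enough.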

Protomo

  • Protomo software will only use one CPU, but you can process multiple tomograms at once using the cluster
  • Use the qsub submission process, and make a qsub script for each command
  • sample refinement script:
#! /bin/sh
#PBS -l nodes=1:ncpus=1
#PBS -N PROTOMO
cd $PBS_O_WORKDIR

cat gr668pt50t.param | sed 's/guess=false/guess=true/' > gr668pt50t-01.param
cp gr668pt50t-ali.tlt gr668pt50t-01-itr.tlt
tomo-refine.sh gr668pt50t-01.param >& refine-01.log
tomo-fit.sh gr668pt50t-01.param
cp gr668pt50t-01.param gr668pt50t-02.param
cp gr668pt50t-01-fitted.tlt gr668pt50t-02-itr.tlt
tomo-refine.sh gr668pt50t-02.param >& refine-02.log
tomo-fit.sh gr668pt50t-02.param
cat gr668pt50t-02.param | sed 's/guess=true/guess=false/' > gr668pt50t-03.param
cp gr668pt50t-02-fitted.tlt gr668pt50t-03-itr.tlt
tomo-refine.sh gr668pt50t-03.param >& refine-03.log
tomo-fit.sh gr668pt50t-03.param
cp gr668pt50t-03.param gr668pt50t-04.param
cp gr668pt50t-03-fitted.tlt gr668pt50t-04-itr.tlt
tomo-refine.sh gr668pt50t-04.param >& refine-04.log
tomo-fit.sh gr668pt50t-04.param
cp gr668pt50t-04.param gr668pt50t-05.param
cp gr668pt50t-04-fitted.tlt gr668pt50t-05-itr.tlt
tomo-refine.sh gr668pt50t-05.param >& refine-05.log
tomo-fit.sh gr668pt50t-05.param
cat gr668pt50t-05.param | sed 's/cormod=xcf/cormod=mcf/' > gr668pt50t-06.param
cp gr668pt50t-05-fitted.tlt gr668pt50t-06-itr.tlt
tomo-refine.sh gr668pt50t-06.param >& refine-06.log
tomo-fit.sh gr668pt50t-06.param
cp gr668pt50t-06.param gr668pt50t-07.param
cp gr668pt50t-06-fitted.tlt gr668pt50t-07-itr.tlt
tomo-refine.sh gr668pt50t-07.param >& refine-07.log
tomo-fit.sh gr668pt50t-07.param
cp gr668pt50t-07.param gr668pt50t-08.param
cp gr668pt50t-07-fitted.tlt gr668pt50t-08-itr.tlt
tomo-refine.sh gr668pt50t-08.param >& refine-08.log
tomo-fit.sh gr668pt50t-08.param
cp gr668pt50t-08.param gr668pt50t-09.param
cp gr668pt50t-08-fitted.tlt gr668pt50t-09-itr.tlt
tomo-refine.sh gr668pt50t-09.param >& refine-09.log
tomo-fit.sh gr668pt50t-09.param
cp gr668pt50t-09.param gr668pt50t-10.param
cp gr668pt50t-09-fitted.tlt gr668pt50t-10-itr.tlt
tomo-refine.sh gr668pt50t-10.param >& refine-10.log
tomo-fit.sh gr668pt50t-10.param
cat gr668pt50t-10.param | sed 's/cormod=mcf/cormod=pcf/' > gr668pt50t-11.param
cp gr668pt50t-10-fitted.tlt gr668pt50t-11-itr.tlt
tomo-refine.sh gr668pt50t-11.param >& refine-11.log
tomo-fit.sh gr668pt50t-11.param
cp gr668pt50t-11.param gr668pt50t-12.param
cp gr668pt50t-11-fitted.tlt gr668pt50t-12-itr.tlt
tomo-refine.sh gr668pt50t-12.param >& refine-12.log
tomo-fit.sh gr668pt50t-12.param
cp gr668pt50t-12.param gr668pt50t-13.param
cp gr668pt50t-12-fitted.tlt gr668pt50t-13-itr.tlt
tomo-refine.sh gr668pt50t-13.param >& refine-13.log
tomo-fit.sh gr668pt50t-13.param
cp gr668pt50t-13.param gr668pt50t-14.param
cp gr668pt50t-13-fitted.tlt gr668pt50t-14-itr.tlt
tomo-refine.sh gr668pt50t-14.param >& refine-14.log
tomo-fit.sh gr668pt50t-14.param
cp gr668pt50t-14.param gr668pt50t-15.param
cp gr668pt50t-14-fitted.tlt gr668pt50t-15-itr.tlt
tomo-refine.sh gr668pt50t-15.param >& refine-15.log
tomo-fit.sh gr668pt50t-15.param
cp gr668pt50t-15.param gr668pt50t-16.param
cp gr668pt50t-15-fitted.tlt gr668pt50t-16-itr.tlt
tomo-refine.sh gr668pt50t-16.param >& refine-16.log
tomo-fit.sh gr668pt50t-16.param

  • Notes on script:
    • #PBS -l nodes=1:ncpus=1 --- Only use one CPU on one node
    • #PBS -N PROTOMO --- name the job PROTOMO
  • Sample script for calculating tomogram:
#! /bin/sh
#PBS -l nodes=1:ncpus=1

cd $PBS_O_WORKDIR
tomo-map.sh gr668pt50t-02.param gr668pt50t-02-fitted.tlt

  • cd $PBS_O_WORKDIR --- required to change to the working directory
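
The sixteen near-identical stanzas in the refinement script can also be produced by a loop. The sketch below only prints the commands rather than running them; the helper names param_edit and refine_step are hypothetical, and the iteration numbers at which parameters change (1, 3, 6 and 11) are read off the sample script above. It writes the redirection as "> file 2>&1", which is the portable equivalent of the ">& file" form used above.

```shell
# Sketch: print the body of the Protomo refinement job with a loop.
# BASE and the iteration count (16) come from the sample script above.
BASE=gr668pt50t

# sed expression used when creating the parameter file for iteration $1;
# empty output means the previous file is copied unchanged.
param_edit() {
    case $1 in
        1)  echo 's/guess=false/guess=true/' ;;
        3)  echo 's/guess=true/guess=false/' ;;
        6)  echo 's/cormod=xcf/cormod=mcf/' ;;
        11) echo 's/cormod=mcf/cormod=pcf/' ;;
        *)  echo '' ;;
    esac
}

# Print the four commands for iteration $1.
refine_step() {
    n=$1
    cur=$(printf '%02d' "$n")
    prev=$(printf '%02d' $((n - 1)))
    edit=$(param_edit "$n")
    if [ "$n" -eq 1 ]; then
        # The first iteration starts from the unnumbered input files.
        src=$BASE.param
        tlt=$BASE-ali.tlt
    else
        src=$BASE-$prev.param
        tlt=$BASE-$prev-fitted.tlt
    fi
    if [ -n "$edit" ]; then
        echo "sed '$edit' $src > $BASE-$cur.param"
    else
        echo "cp $src $BASE-$cur.param"
    fi
    echo "cp $tlt $BASE-$cur-itr.tlt"
    echo "tomo-refine.sh $BASE-$cur.param > refine-$cur.log 2>&1"
    echo "tomo-fit.sh $BASE-$cur.param"
}

# Emit iterations 1..16 in order.
i=1
while [ "$i" -le 16 ]; do
    refine_step "$i"
    i=$((i + 1))
done
```

Append the printed lines to the four header lines of the sample script (shebang, the two #PBS lines and the cd) to get a submittable job, and change BASE to process another tomogram.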

  • Set ALLOWTOPICVIEW =

-- BillRice - 15 Aug 2008

Revision 815 Aug 2008 - Main.BillRice

 
META TOPICPARENT name="CemIT"
Contents

General guidelines

  • home directory has 20 GB space allocated to all of CEM computing
  • /usr/data has 100 GB of space allocated to all of CEM computing
  • both disks operate at the same speed
  • In general, do most of your processing on /usr/data, since there is more space available
  • Please remove data after processing so that others will have space to work

Batch submission system

  • We have PBS (commercial version) for handling of jobs
  • Also have mpirun installed for mpi -aware applications (Spider is the only mpi -aware EM software)

EMAN

  • EMAN will work with the batch submission software (PBS) installed on the cluster
  • Need to make a batch submission script for cluster use
  • sample refinement script:
#!/bin/bash

#PBS -N EMAN
#PBS -l nodes=8:ncpus=4
cd $PBS_O_WORKDIR

refine 8 ang=8.5 mask=38 pad=120 hard=25 classkeep=0.8 classiter=8 sym=c1 phasecls median proc=32

  • Notes on script:
    • #!/bin/bash --- use the bash shell
    • #PBS -N EMAN -- name the job EMAN (you can name it what you like)
    • #PBS -l nodes=8:ncpus=4 --- Use 8 nodes on the cluster, with 4 processors per node
    • cd $PBS_O_WORKDIR --- This is needed so that script will be run in the current directory, rather than your home directory
    • refine 8 ang=8.5 ... proc=32 -- proc= command means use 32 processors (8 nodes * 4 CPU per node)
  • To submit the script, if it is named emanrefine.sh
   > qsub emanrefine.sh

SPIDER

  • There are two versions of spider on the cluster:
    • spidermp is the "normal" version of spider, multiprocessor aware but cannot work across nodes (ie use only max 4 CPU's at once)
    • spidermpi is the MPI aware version of spider, can use entire cluster
      • only a small subset of commands are MPI -aware and working, therefore only use this version for the following commands:
        • ap nq
        • ap mq
        • ap sh
        • ap ref -- THIS COMMAND DOES NOT WORK IN MPI MODE, EVEN THOUGH IT SHOULD
  • Usually, the whole cluster is only used in the refinement process
    • A (modified) set of refinement scripts is attached to this page; these can be used on the cluster for refinement according to the spider documentation

  • pubsub does not work on the cluster, but it can be replaced with qsub, as follows:

Using PBS with SPIDER to replace pubsub:

  • Can use pbs to speed up loops on non-mp aware processes
  • example of a spider script to speed up a non-linear filter
  • Note that this script is designed to run such that each node runs one process. This process uses up to 6 GB, so the place=scatter command is used
Changed:
<
<

filter.spi---------------------
>
>

filter.spi
 
;========================input
fr l
[tomo]tomo1_crop.spd
;=========output=================
fr l
[filt]filt
;=========params==============
x90=60          ; cycles
x91=0.1         ; delta t
x92=0           ; sigma
;x93=0.05       ; lambda

;=============================
;---------- following set of commands sets the main qsub script for the cluster, one script per loop event
vm
echo "#!/bin/sh" > qsub-header.txt
vm
echo "#PBS -l select=ncpus=1:mem=6GB " >> qsub-header.txt
vm
echo "#PBS -l place=scatter" >> qsub-header.txt
vm
echo "cd /usr/data/asiebert/30_04_08_Rbs/tomo1" >> qsub-header.txt

;-----------------------------------------------

do lb1 x10=1,10                 ; this loop only makes the scripts using a "slave" script as template

  RR x93             ; adjust lambda according to the following sequence
  0.01,0.02,0.05,0.1,0.2,0.5,1,2,5,10                               ;  a nice log sequence

   vm
   echo "x93={**X93}" > filtscript{**x10}.$PRJEXT
   vm
   echo "x10={**X10}" >> filtscript{**x10}.$PRJEXT
   vm
   cat spider_script_filt1_clu_slave.$PRJEXT >> filtscript{**x10}.$PRJEXT
   vm
   cp qsub-header.txt qsub-script{**x10}.txt
   vm
   echo "/usr/apps/spidermp $PRJEXT/$DATEXT @filtscript{**x10}" >> qsub-script{**x10}.txt
   vm
   qsub qsub-script{**x10}.txt
lb1

en      ; end the main script                                                                  
 
  • The script which actually does the work:
Changed:
<
<

spider_script_filt1_clu_slave.spi----------------------------------------------
>
>

spider_script_filt1_clu_slave.spi
 
;========================input
fr l
[tomo]tomo1_crop.spd
;=========output=================
fr l
[filt]filt
;=========params==============
x90=60          ; cycles
x91=0.1         ; delta t
x92=0           ; sigma
;x93=0.05       ; lambda

;=============================

   ce ad
   [tomo]
   [filt]{***x10}
   HEG
   x90
   x91
   x92,x93
en d

Added:
>
>
  • To run the script:
    • spider spi/spd @filter
  • even better, make a qsub submission script for filter.spi, use one CPU and one node, and submit it
 

Protomo

  • Protomo software will only use one CPU, but you can process multiple tomograms at once using the cluster
  • Use the qsub submission process, and make a qsub script for each command
  • sample refinement script:
#! /bin/sh
#PBS -l nodes=1:ncpus=1
#PBS -N PROTOMO
cd $PBS_O_WORKDIR

cat gr668pt50t.param | sed 's/guess=false/guess=true/' > gr668pt50t-01.param
cp gr668pt50t-ali.tlt gr668pt50t-01-itr.tlt
tomo-refine.sh gr668pt50t-01.param >& refine-01.log
tomo-fit.sh gr668pt50t-01.param
cp gr668pt50t-01.param gr668pt50t-02.param
cp gr668pt50t-01-fitted.tlt gr668pt50t-02-itr.tlt
tomo-refine.sh gr668pt50t-02.param >& refine-02.log
tomo-fit.sh gr668pt50t-02.param
cat gr668pt50t-02.param | sed 's/guess=true/guess=false/' > gr668pt50t-03.param
cp gr668pt50t-02-fitted.tlt gr668pt50t-03-itr.tlt
tomo-refine.sh gr668pt50t-03.param >& refine-03.log
tomo-fit.sh gr668pt50t-03.param
cp gr668pt50t-03.param gr668pt50t-04.param
cp gr668pt50t-03-fitted.tlt gr668pt50t-04-itr.tlt
tomo-refine.sh gr668pt50t-04.param >& refine-04.log
tomo-fit.sh gr668pt50t-04.param
cp gr668pt50t-04.param gr668pt50t-05.param
cp gr668pt50t-04-fitted.tlt gr668pt50t-05-itr.tlt
tomo-refine.sh gr668pt50t-05.param >& refine-05.log
tomo-fit.sh gr668pt50t-05.param
cat gr668pt50t-05.param | sed 's/cormod=xcf/cormod=mcf/' > gr668pt50t-06.param
cp gr668pt50t-05-fitted.tlt gr668pt50t-06-itr.tlt
tomo-refine.sh gr668pt50t-06.param >& refine-06.log
tomo-fit.sh gr668pt50t-06.param
cp gr668pt50t-06.param gr668pt50t-07.param
cp gr668pt50t-06-fitted.tlt gr668pt50t-07-itr.tlt
tomo-refine.sh gr668pt50t-07.param >& refine-07.log
tomo-fit.sh gr668pt50t-07.param
cp gr668pt50t-07.param gr668pt50t-08.param
cp gr668pt50t-07-fitted.tlt gr668pt50t-08-itr.tlt
tomo-refine.sh gr668pt50t-08.param >& refine-08.log
tomo-fit.sh gr668pt50t-08.param
cp gr668pt50t-08.param gr668pt50t-09.param
cp gr668pt50t-08-fitted.tlt gr668pt50t-09-itr.tlt
tomo-refine.sh gr668pt50t-09.param >& refine-09.log
tomo-fit.sh gr668pt50t-09.param
cp gr668pt50t-09.param gr668pt50t-10.param
cp gr668pt50t-09-fitted.tlt gr668pt50t-10-itr.tlt
tomo-refine.sh gr668pt50t-10.param >& refine-10.log
tomo-fit.sh gr668pt50t-10.param
cat gr668pt50t-10.param | sed 's/cormod=mcf/cormod=pcf/' > gr668pt50t-11.param
cp gr668pt50t-10-fitted.tlt gr668pt50t-11-itr.tlt
tomo-refine.sh gr668pt50t-11.param >& refine-11.log
tomo-fit.sh gr668pt50t-11.param
cp gr668pt50t-11.param gr668pt50t-12.param
cp gr668pt50t-11-fitted.tlt gr668pt50t-12-itr.tlt
tomo-refine.sh gr668pt50t-12.param >& refine-12.log
tomo-fit.sh gr668pt50t-12.param
cp gr668pt50t-12.param gr668pt50t-13.param
cp gr668pt50t-12-fitted.tlt gr668pt50t-13-itr.tlt
tomo-refine.sh gr668pt50t-13.param >& refine-13.log
tomo-fit.sh gr668pt50t-13.param
cp gr668pt50t-13.param gr668pt50t-14.param
cp gr668pt50t-13-fitted.tlt gr668pt50t-14-itr.tlt
tomo-refine.sh gr668pt50t-14.param >& refine-14.log
tomo-fit.sh gr668pt50t-14.param
cp gr668pt50t-14.param gr668pt50t-15.param
cp gr668pt50t-14-fitted.tlt gr668pt50t-15-itr.tlt
tomo-refine.sh gr668pt50t-15.param >& refine-15.log
tomo-fit.sh gr668pt50t-15.param
cp gr668pt50t-15.param gr668pt50t-16.param
cp gr668pt50t-15-fitted.tlt gr668pt50t-16-itr.tlt
tomo-refine.sh gr668pt50t-16.param >& refine-16.log
tomo-fit.sh gr668pt50t-16.param

  • Notes on script:
    • #PBS -l nodes=1:ncpus=1 --- Only use one CPU on one node
    • #PBS -N PROTOMO --- name the job PROTOMO
  • Sample script for calculating tomogram:
#! /bin/sh
#PBS -l nodes=1:ncpus=1

cd $PBS_O_WORKDIR
tomo-map.sh gr668pt50t-02.param gr668pt50t-02-fitted.tlt

  • cd $PBS_O_WORKDIR --- required to change to the working directory

  • Set ALLOWTOPICVIEW =

-- BillRice - 15 Aug 2008

Revision 715 Aug 2008 - Main.BillRice

 
META TOPICPARENT name="CemIT"
Contents

General guidelines

  • home directory has 20 GB space allocated to all of CEM computing
  • /usr/data has 100 GB of space allocated to all of CEM computing
  • both disks operate at the same speed
  • In general, do most of your processing on /usr/data, since there is more space available
  • Please remove data after processing so that others will have space to work

Batch submission system

  • We have PBS (commercial version) for handling of jobs
  • Also have mpirun installed for mpi -aware applications (Spider is the only mpi -aware EM software)

EMAN

  • EMAN will work with the batch submission software (PBS) installed on the cluster
  • Need to make a batch submission script for cluster use
Changed:
<
<
  • sample
>
>
  • sample refinement script:
 
#!/bin/bash

#PBS -N EMAN
#PBS -l nodes=8:ncpus=4
cd $PBS_O_WORKDIR

refine 8 ang=8.5 mask=38 pad=120 hard=25 classkeep=0.8 classiter=8 sym=c1 phasecls median proc=32

  • Notes on script:
    • #!/bin/bash --- use the bash shell
    • #PBS -N EMAN -- name the job EMAN (you can name it what you like)
    • #PBS -l nodes=8:ncpus=4 --- Use 8 nodes on the cluster, with 4 processors per node
    • cd $PBS_O_WORKDIR --- This is needed so that script will be run in the current directory, rather than your home directory
    • refine 8 ang=8.5 ... proc=32 -- proc= command means use 32 processors (8 nodes * 4 CPU per node)
  • To submit the script, if it is named emanrefine.sh
   > qsub emanrefine.sh

SPIDER

  • There are two versions of spider on the cluster:
    • spidermp is the "normal" version of spider, multiprocessor aware but cannot work across nodes (ie use only max 4 CPU's at once)
    • spidermpi is the MPI aware version of spider, can use entire cluster
      • only a small subset of commands are MPI -aware and working, therefore only use this version for the following commands:
        • ap nq
        • ap mq
        • ap sh
        • ap ref -- THIS COMMAND DOES NOT WORK IN MPI MODE, EVEN THOUGH IT SHOULD
  • Usually, the whole cluster is only used in the refinement process
    • A (modified) set of refinement scripts is attached to this page; these can be used on the cluster for refinement according to the spider documentation
Changed:
<
<
  • pubsub does not work on the cluster, but it can be replaced with qsub, as follows
>
>
  • pubsub does not work on the cluster, but it can be replaced with qsub, as follows:
 

Using PBS with SPIDER to replace pubsub:

  • Can use pbs to speed up loops on non-mp aware processes
  • example of a spider script to speed up a non-linear filter
Changed:
<
<
  • Note that this sccript is designed to run such that each node runs one process. This process uses up to 6 GB, so the place=scatter command is used
>
>
  • Note that this script is designed to run such that each node runs one process. This process uses up to 6 GB, so the place=scatter command is used
Added:
>
>

filter.spi---------------------
 
;========================input
fr l
[tomo]tomo1_crop.spd
;=========output=================
fr l
[filt]filt
;=========params==============
x90=60          ; cycles
x91=0.1         ; delta t
x92=0           ; sigma
;x93=0.05       ; lambda

;=============================
;---------- following set of commands sets the main qsub script for the cluster, one script per loop event
vm
echo "#!/bin/sh" > qsub-header.txt
vm
echo "#PBS -l select=ncpus=1:mem=6GB " >> qsub-header.txt
vm
echo "#PBS -l place=scatter" >> qsub-header.txt
vm
echo "cd /usr/data/asiebert/30_04_08_Rbs/tomo1" >> qsub-header.txt

;-----------------------------------------------

do lb1 x10=1,10                 ; this loop only makes the scripts using a "slave" script as template

  RR x93             ; adjust lambda according to the following sequence
  0.01,0.02,0.05,0.1,0.2,0.5,1,2,5,10                               ;  a nice log sequence

   vm
   echo "x93={**X93}" > filtscript{**x10}.$PRJEXT
   vm
   echo "x10={**X10}" >> filtscript{**x10}.$PRJEXT
   vm
   cat spider_script_filt1_clu_slave.$PRJEXT >> filtscript{**x10}.$PRJEXT
   vm
   cp qsub-header.txt qsub-script{**x10}.txt
   vm
   echo "/usr/apps/spidermp $PRJEXT/$DATEXT @filtscript{**x10}" >> qsub-script{**x10}.txt
   vm
   qsub qsub-script{**x10}.txt
lb1

en      ; end the main script                                                                  
 
  • The script which actually does the work:
Added:
>
>

spider_script_filt1_clu_slave.spi----------------------------------------------
 
;========================input
fr l
[tomo]tomo1_crop.spd
;=========output=================
fr l
[filt]filt
;=========params==============
x90=60          ; cycles
x91=0.1         ; delta t
x92=0           ; sigma
;x93=0.05       ; lambda

;=============================

   ce ad
   [tomo]
   [filt]{***x10}
   HEG
   x90
   x91
   x92,x93
en d

Protomo

  • Protomo software will only use one CPU, but you can process multiple tomograms at once using the cluster
  • Use the qsub submission process, and make a qsub script for each command
  • sample refinement script:
#! /bin/sh
#PBS -l nodes=1:ncpus=1
#PBS -N PROTOMO
cd $PBS_O_WORKDIR

cat gr668pt50t.param | sed 's/guess=false/guess=true/' > gr668pt50t-01.param
cp gr668pt50t-ali.tlt gr668pt50t-01-itr.tlt
tomo-refine.sh gr668pt50t-01.param >& refine-01.log
tomo-fit.sh gr668pt50t-01.param
cp gr668pt50t-01.param gr668pt50t-02.param
cp gr668pt50t-01-fitted.tlt gr668pt50t-02-itr.tlt
tomo-refine.sh gr668pt50t-02.param >& refine-02.log
tomo-fit.sh gr668pt50t-02.param
cat gr668pt50t-02.param | sed 's/guess=true/guess=false/' > gr668pt50t-03.param
cp gr668pt50t-02-fitted.tlt gr668pt50t-03-itr.tlt
tomo-refine.sh gr668pt50t-03.param >& refine-03.log
tomo-fit.sh gr668pt50t-03.param
cp gr668pt50t-03.param gr668pt50t-04.param
cp gr668pt50t-03-fitted.tlt gr668pt50t-04-itr.tlt
tomo-refine.sh gr668pt50t-04.param >& refine-04.log
tomo-fit.sh gr668pt50t-04.param
cp gr668pt50t-04.param gr668pt50t-05.param
cp gr668pt50t-04-fitted.tlt gr668pt50t-05-itr.tlt
tomo-refine.sh gr668pt50t-05.param >& refine-05.log
tomo-fit.sh gr668pt50t-05.param
cat gr668pt50t-05.param | sed 's/cormod=xcf/cormod=mcf/' > gr668pt50t-06.param
cp gr668pt50t-05-fitted.tlt gr668pt50t-06-itr.tlt
tomo-refine.sh gr668pt50t-06.param >& refine-06.log
tomo-fit.sh gr668pt50t-06.param
cp gr668pt50t-06.param gr668pt50t-07.param
cp gr668pt50t-06-fitted.tlt gr668pt50t-07-itr.tlt
tomo-refine.sh gr668pt50t-07.param >& refine-07.log
tomo-fit.sh gr668pt50t-07.param
cp gr668pt50t-07.param gr668pt50t-08.param
cp gr668pt50t-07-fitted.tlt gr668pt50t-08-itr.tlt
tomo-refine.sh gr668pt50t-08.param >& refine-08.log
tomo-fit.sh gr668pt50t-08.param
cp gr668pt50t-08.param gr668pt50t-09.param
cp gr668pt50t-08-fitted.tlt gr668pt50t-09-itr.tlt
tomo-refine.sh gr668pt50t-09.param >& refine-09.log
tomo-fit.sh gr668pt50t-09.param
cp gr668pt50t-09.param gr668pt50t-10.param
cp gr668pt50t-09-fitted.tlt gr668pt50t-10-itr.tlt
tomo-refine.sh gr668pt50t-10.param >& refine-10.log
tomo-fit.sh gr668pt50t-10.param
cat gr668pt50t-10.param | sed 's/cormod=mcf/cormod=pcf/' > gr668pt50t-11.param
cp gr668pt50t-10-fitted.tlt gr668pt50t-11-itr.tlt
tomo-refine.sh gr668pt50t-11.param >& refine-11.log
tomo-fit.sh gr668pt50t-11.param
cp gr668pt50t-11.param gr668pt50t-12.param
cp gr668pt50t-11-fitted.tlt gr668pt50t-12-itr.tlt
tomo-refine.sh gr668pt50t-12.param >& refine-12.log
tomo-fit.sh gr668pt50t-12.param
cp gr668pt50t-12.param gr668pt50t-13.param
cp gr668pt50t-12-fitted.tlt gr668pt50t-13-itr.tlt
tomo-refine.sh gr668pt50t-13.param >& refine-13.log
tomo-fit.sh gr668pt50t-13.param
cp gr668pt50t-13.param gr668pt50t-14.param
cp gr668pt50t-13-fitted.tlt gr668pt50t-14-itr.tlt
tomo-refine.sh gr668pt50t-14.param >& refine-14.log
tomo-fit.sh gr668pt50t-14.param
cp gr668pt50t-14.param gr668pt50t-15.param
cp gr668pt50t-14-fitted.tlt gr668pt50t-15-itr.tlt
tomo-refine.sh gr668pt50t-15.param >& refine-15.log
tomo-fit.sh gr668pt50t-15.param
cp gr668pt50t-15.param gr668pt50t-16.param
cp gr668pt50t-15-fitted.tlt gr668pt50t-16-itr.tlt
tomo-refine.sh gr668pt50t-16.param >& refine-16.log
tomo-fit.sh gr668pt50t-16.param

  • Notes on script:
    • #PBS -l nodes=1:ncpus=1 --- Only use one CPU on one node
    • #PBS -N PROTOMO --- name the job PROTOMO
  • Sample script for calculating tomogram:
#! /bin/sh
#PBS -l nodes=1:ncpus=1

cd $PBS_O_WORKDIR
tomo-map.sh gr668pt50t-02.param gr668pt50t-02-fitted.tlt

  • cd $PBS_O_WORKDIR --- required to change to the working directory

  • Set ALLOWTOPICVIEW =

-- BillRice - 15 Aug 2008

Revision 615 Aug 2008 - Main.BillRice

 
META TOPICPARENT name="CemIT"
Contents

General guidelines

  • home directory has 20 GB space allocated to all of CEM computing
  • /usr/data has 100 GB of space allocated to all of CEM computing
  • both disks operate at the same speed
  • In general, do most of your processing on /usr/data, since there is more space available
  • Please remove data after processing so that others will have space to work

Batch submission system

  • We have PBS (commercial version) for handling of jobs
  • Also have mpirun installed for mpi -aware applications (Spider is the only mpi -aware EM software)
Changed:
<
<

EMAN

>
>

EMAN

 
  • EMAN will work with the batch submission software (PBS) installed on the cluster
  • Need to make a batch submission script for cluster use
  • sample
#!/bin/bash

#PBS -N EMAN
#PBS -l nodes=8:ncpus=4
cd $PBS_O_WORKDIR

refine 8 ang=8.5 mask=38 pad=120 hard=25 classkeep=0.8 classiter=8 sym=c1 phasecls median proc=32

  • Notes on script:
    • #!/bin/bash --- use the bash shell
    • #PBS -N EMAN -- name the job EMAN (you can name it what you like)
    • #PBS -l nodes=8:ncpus=4 --- Use 8 nodes on the cluster, with 4 processors per node
    • cd $PBS_O_WORKDIR --- This is needed so that script will be run in the current directory, rather than your home directory
    • refine 8 ang=8.5 ... proc=32 -- proc= command means use 32 processors (8 nodes * 4 CPU per node)
  • To submit the script, if it is named emanrefine.sh
   > qsub emanrefine.sh
Changed:
<
<

SPIDER

Protomo

>
>

SPIDER

  • There are two versions of spider on the cluster:
Added:
>
>
    • spidermp is the "normal" version of spider, multiprocessor aware but cannot work across nodes (ie use only max 4 CPU's at once)
    • spidermpi is the MPI aware version of spider, can use entire cluster
      • only a small subset of commands are MPI -aware and working, therefore only use this version for the following commands:
        • ap nq
        • ap mq
        • ap sh
        • ap ref -- THIS COMMAND DOES NOT WORK IN MPI MODE, EVEN THOUGH IT SHOULD
  • Usually, the whole cluster is only used in the refinement process
    • A (modified) set of refinement scripts is attached to this page; these can be used on the cluster for refinement according to the spider documentation

  • pubsub does not work on the cluster, but it can be replaced with qsub, as follows

Using PBS with SPIDER to replace pubsub:

  • Can use pbs to speed up loops on non-mp aware processes
  • example of a spider script to speed up a non-linear filter
  • Note that this sccript is designed to run such that each node runs one process. This process uses up to 6 GB, so the place=scatter command is used
;========================input
fr l
[tomo]tomo1_crop.spd
;=========output=================
fr l
[filt]filt
;=========params==============
x90=60          ; cycles
x91=0.1         ; delta t
x92=0           ; sigma
;x93=0.05       ; lambda

;=============================
;---------- following set of commands sets the main qsub script for the cluster, one script per loop event
vm
echo "#!/bin/sh" > qsub-header.txt
vm
echo "#PBS -l select=ncpus=1:mem=6GB " >> qsub-header.txt
vm
echo "#PBS -l place=scatter" >> qsub-header.txt
vm
echo "cd /usr/data/asiebert/30_04_08_Rbs/tomo1" >> qsub-header.txt

;-----------------------------------------------

do lb1 x10=1,10                 ; this loop only makes the scripts using a "slave" script as template

  RR x93             ; adjust lambda according to the following sequence
  0.01,0.02,0.05,0.1,0.2,0.5,1,2,5,10                               ;  a nice log sequence

   vm
   echo "x93={**X93}" > filtscript{**x10}.$PRJEXT
   vm
   echo "x10={**X10}" >> filtscript{**x10}.$PRJEXT
   vm
   cat spider_script_filt1_clu_slave.$PRJEXT >> filtscript{**x10}.$PRJEXT
   vm
   cp qsub-header.txt qsub-script{**x10}.txt
   vm
   echo "/usr/apps/spidermp $PRJEXT/$DATEXT @filtscript{**x10}" >> qsub-script{**x10}.txt
   vm
   qsub qsub-script{**x10}.txt
lb1

en      ; end the main script                                                                  
 
  • The script which actually does the work:
;========================input
fr l
[tomo]tomo1_crop.spd
;=========output=================
fr l
[filt]filt
;=========params==============
x90=60          ; cycles
x91=0.1         ; delta t
x92=0           ; sigma
;x93=0.05       ; lambda

;=============================

   ce ad
   [tomo]
   [filt]{***x10}
   HEG
   x90
   x91
   x92,x93
en d

Protomo

 
  • Protomo software will only use one CPU, but you can process multiple tomograms at once using the cluster
  • Use the qsub submission process, and make a qsub script for each command
  • sample refinement script:
#! /bin/sh
#PBS -l nodes=1:ncpus=1
#PBS -N PROTOMO
cd $PBS_O_WORKDIR

cat gr668pt50t.param | sed 's/guess=false/guess=true/' > gr668pt50t-01.param
cp gr668pt50t-ali.tlt gr668pt50t-01-itr.tlt
tomo-refine.sh gr668pt50t-01.param >& refine-01.log
tomo-fit.sh gr668pt50t-01.param
cp gr668pt50t-01.param gr668pt50t-02.param
cp gr668pt50t-01-fitted.tlt gr668pt50t-02-itr.tlt
tomo-refine.sh gr668pt50t-02.param >& refine-02.log
tomo-fit.sh gr668pt50t-02.param
cat gr668pt50t-02.param | sed 's/guess=true/guess=false/' > gr668pt50t-03.param
cp gr668pt50t-02-fitted.tlt gr668pt50t-03-itr.tlt
tomo-refine.sh gr668pt50t-03.param >& refine-03.log
tomo-fit.sh gr668pt50t-03.param
cp gr668pt50t-03.param gr668pt50t-04.param
cp gr668pt50t-03-fitted.tlt gr668pt50t-04-itr.tlt
tomo-refine.sh gr668pt50t-04.param >& refine-04.log
tomo-fit.sh gr668pt50t-04.param
cp gr668pt50t-04.param gr668pt50t-05.param
cp gr668pt50t-04-fitted.tlt gr668pt50t-05-itr.tlt
tomo-refine.sh gr668pt50t-05.param >& refine-05.log
tomo-fit.sh gr668pt50t-05.param
cat gr668pt50t-05.param | sed 's/cormod=xcf/cormod=mcf/' > gr668pt50t-06.param
cp gr668pt50t-05-fitted.tlt gr668pt50t-06-itr.tlt
tomo-refine.sh gr668pt50t-06.param >& refine-06.log
tomo-fit.sh gr668pt50t-06.param
cp gr668pt50t-06.param gr668pt50t-07.param
cp gr668pt50t-06-fitted.tlt gr668pt50t-07-itr.tlt
tomo-refine.sh gr668pt50t-07.param >& refine-07.log
tomo-fit.sh gr668pt50t-07.param
cp gr668pt50t-07.param gr668pt50t-08.param
cp gr668pt50t-07-fitted.tlt gr668pt50t-08-itr.tlt
tomo-refine.sh gr668pt50t-08.param >& refine-08.log
tomo-fit.sh gr668pt50t-08.param
cp gr668pt50t-08.param gr668pt50t-09.param
cp gr668pt50t-08-fitted.tlt gr668pt50t-09-itr.tlt
tomo-refine.sh gr668pt50t-09.param >& refine-09.log
tomo-fit.sh gr668pt50t-09.param
cp gr668pt50t-09.param gr668pt50t-10.param
cp gr668pt50t-09-fitted.tlt gr668pt50t-10-itr.tlt
tomo-refine.sh gr668pt50t-10.param >& refine-10.log
tomo-fit.sh gr668pt50t-10.param
cat gr668pt50t-10.param | sed 's/cormod=mcf/cormod=pcf/' > gr668pt50t-11.param
cp gr668pt50t-10-fitted.tlt gr668pt50t-11-itr.tlt
tomo-refine.sh gr668pt50t-11.param >& refine-11.log
tomo-fit.sh gr668pt50t-11.param
cp gr668pt50t-11.param gr668pt50t-12.param
cp gr668pt50t-11-fitted.tlt gr668pt50t-12-itr.tlt
tomo-refine.sh gr668pt50t-12.param >& refine-12.log
tomo-fit.sh gr668pt50t-12.param
cp gr668pt50t-12.param gr668pt50t-13.param
cp gr668pt50t-12-fitted.tlt gr668pt50t-13-itr.tlt
tomo-refine.sh gr668pt50t-13.param >& refine-13.log
tomo-fit.sh gr668pt50t-13.param
cp gr668pt50t-13.param gr668pt50t-14.param
cp gr668pt50t-13-fitted.tlt gr668pt50t-14-itr.tlt
tomo-refine.sh gr668pt50t-14.param >& refine-14.log
tomo-fit.sh gr668pt50t-14.param
cp gr668pt50t-14.param gr668pt50t-15.param
cp gr668pt50t-14-fitted.tlt gr668pt50t-15-itr.tlt
tomo-refine.sh gr668pt50t-15.param >& refine-15.log
tomo-fit.sh gr668pt50t-15.param
cp gr668pt50t-15.param gr668pt50t-16.param
cp gr668pt50t-15-fitted.tlt gr668pt50t-16-itr.tlt
tomo-refine.sh gr668pt50t-16.param >& refine-16.log
tomo-fit.sh gr668pt50t-16.param

  • Notes on script:
    • #PBS -l nodes=1:ncpus=1 --- Only use one CPU on one node
    • #PBS -N PROTOMO --- name the job PROTOMO
  • Sample script for calculating tomogram:
#! /bin/sh
#PBS -l nodes=1:ncpus=1

cd $PBS_O_WORKDIR
tomo-map.sh gr668pt50t-02.param gr668pt50t-02-fitted.tlt

  • cd $PBS_O_WORKDIR --- required to change to the working directory

  • Set ALLOWTOPICVIEW =

-- BillRice - 15 Aug 2008

Revision 515 Aug 2008 - Main.BillRice

 
META TOPICPARENT name="CemIT"
Contents

General guidelines

  • home directory has 20 GB space allocated to all of CEM computing
  • /usr/data has 100 GB of space allocated to all of CEM computing
  • both disks operate at the same speed
  • In general, do most of your processing on /usr/data, since there is more space available
  • Please remove data after processing so that others will have space to work

Batch submission system

  • We have PBS (commercial version) for handling of jobs
  • Also have mpirun installed for MPI-aware applications (Spider is the only MPI-aware EM software)

EMAN

  • EMAN will work with the batch submission software (PBS) installed on the cluster
  • Need to make a batch submission script for cluster use
  • sample
#!/bin/bash

#PBS -N EMAN
#PBS -l nodes=8:ncpus=4
cd $PBS_O_WORKDIR

refine 8 ang=8.5 mask=38 pad=120 hard=25 classkeep=0.8 classiter=8 sym=c1 phasecls median proc=32

  • Notes on script:
    • #!/bin/bash --- use the bash shell
    • #PBS -N EMAN -- name the job EMAN (you can name it what you like)
    • #PBS -l nodes=8:ncpus=4 --- Use 8 nodes on the cluster, with 4 processors per node
    • cd $PBS_O_WORKDIR --- This is needed so that script will be run in the current directory, rather than your home directory
    • refine 8 ang=8.5 ... proc=32 -- proc= command means use 32 processors (8 nodes * 4 CPU per node)
  • To submit the script, if it is named emanrefine.sh
   > qsub emanrefine.sh
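
The proc= value in the refine command has to agree with the PBS resource request (nodes * CPUs per node). A small sketch of deriving it from two variables, so that the two numbers cannot drift apart, might look like this. The function name total_procs and the variable names are illustrative assumptions, and the script only prints the lines rather than submitting anything.

```shell
#!/bin/bash
# Sketch only: keeps proc= consistent with the PBS resource request.
NODES=8   # number of nodes requested
NCPUS=4   # CPUs per node (the cluster has 4 per node)

# Multiply nodes by CPUs per node.
total_procs() {
    echo $(( $1 * $2 ))
}

PROCS=$(total_procs "$NODES" "$NCPUS")

# Print the two lines that must agree with each other.
echo "#PBS -l nodes=${NODES}:ncpus=${NCPUS}"
echo "refine 8 ang=8.5 mask=38 pad=120 hard=25 classkeep=0.8 classiter=8 sym=c1 phasecls median proc=${PROCS}"
```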

SPIDER

Protomo

  • Protomo software will only use one CPU, but you can process multiple tomograms at once using the cluster
  • Use the qsub submission process, and make a qsub script for each command
  • sample refinement script:
#! /bin/sh
#PBS -l nodes=1:ncpus=1
#PBS -N PROTOMO
cd $PBS_O_WORKDIR
Changed:
<
<
cat gr668_3_2a80nm11500xgkpt42t.param | sed 's/guess=false/guess=true/' > gr668_3_2a80nm11500xgkpt42t-01.param
cp gr668_3_2a80nm11500xgkpt42t-ali.tlt gr668_3_2a80nm11500xgkpt42t-01-itr.tlt
tomo-refine.sh gr668_3_2a80nm11500xgkpt42t-01.param >& refine-01.log
tomo-fit.sh gr668_3_2a80nm11500xgkpt42t-01.param
cp gr668_3_2a80nm11500xgkpt42t-01.param gr668_3_2a80nm11500xgkpt42t-02.param

...etc

>
>
cat gr668pt50t.param | sed 's/guess=false/guess=true/' > gr668pt50t-01.param
cp gr668pt50t-ali.tlt gr668pt50t-01-itr.tlt
tomo-refine.sh gr668pt50t-01.param >& refine-01.log
tomo-fit.sh gr668pt50t-01.param
cp gr668pt50t-01.param gr668pt50t-02.param
cp gr668pt50t-01-fitted.tlt gr668pt50t-02-itr.tlt
tomo-refine.sh gr668pt50t-02.param >& refine-02.log
Added:
>
>
tomo-fit.sh gr668pt50t-02.param
cat gr668pt50t-02.param | sed 's/guess=true/guess=false/' > gr668pt50t-03.param
cp gr668pt50t-02-fitted.tlt gr668pt50t-03-itr.tlt
tomo-refine.sh gr668pt50t-03.param >& refine-03.log
tomo-fit.sh gr668pt50t-03.param
cp gr668pt50t-03.param gr668pt50t-04.param
cp gr668pt50t-03-fitted.tlt gr668pt50t-04-itr.tlt
tomo-refine.sh gr668pt50t-04.param >& refine-04.log
tomo-fit.sh gr668pt50t-04.param
cp gr668pt50t-04.param gr668pt50t-05.param
cp gr668pt50t-04-fitted.tlt gr668pt50t-05-itr.tlt
tomo-refine.sh gr668pt50t-05.param >& refine-05.log
tomo-fit.sh gr668pt50t-05.param
cat gr668pt50t-05.param | sed 's/cormod=xcf/cormod=mcf/' > gr668pt50t-06.param
cp gr668pt50t-05-fitted.tlt gr668pt50t-06-itr.tlt
tomo-refine.sh gr668pt50t-06.param >& refine-06.log
tomo-fit.sh gr668pt50t-06.param
cp gr668pt50t-06.param gr668pt50t-07.param
cp gr668pt50t-06-fitted.tlt gr668pt50t-07-itr.tlt
tomo-refine.sh gr668pt50t-07.param >& refine-07.log
tomo-fit.sh gr668pt50t-07.param
cp gr668pt50t-07.param gr668pt50t-08.param
cp gr668pt50t-07-fitted.tlt gr668pt50t-08-itr.tlt
tomo-refine.sh gr668pt50t-08.param >& refine-08.log
tomo-fit.sh gr668pt50t-08.param
cp gr668pt50t-08.param gr668pt50t-09.param
cp gr668pt50t-08-fitted.tlt gr668pt50t-09-itr.tlt
tomo-refine.sh gr668pt50t-09.param >& refine-09.log
tomo-fit.sh gr668pt50t-09.param
cp gr668pt50t-09.param gr668pt50t-10.param
cp gr668pt50t-09-fitted.tlt gr668pt50t-10-itr.tlt
tomo-refine.sh gr668pt50t-10.param >& refine-10.log
tomo-fit.sh gr668pt50t-10.param
cat gr668pt50t-10.param | sed 's/cormod=mcf/cormod=pcf/' > gr668pt50t-11.param
cp gr668pt50t-10-fitted.tlt gr668pt50t-11-itr.tlt
tomo-refine.sh gr668pt50t-11.param >& refine-11.log
tomo-fit.sh gr668pt50t-11.param
cp gr668pt50t-11.param gr668pt50t-12.param
cp gr668pt50t-11-fitted.tlt gr668pt50t-12-itr.tlt
tomo-refine.sh gr668pt50t-12.param >& refine-12.log
tomo-fit.sh gr668pt50t-12.param
cp gr668pt50t-12.param gr668pt50t-13.param
cp gr668pt50t-12-fitted.tlt gr668pt50t-13-itr.tlt
tomo-refine.sh gr668pt50t-13.param >& refine-13.log
tomo-fit.sh gr668pt50t-13.param
cp gr668pt50t-13.param gr668pt50t-14.param
cp gr668pt50t-13-fitted.tlt gr668pt50t-14-itr.tlt
tomo-refine.sh gr668pt50t-14.param >& refine-14.log
tomo-fit.sh gr668pt50t-14.param
cp gr668pt50t-14.param gr668pt50t-15.param
cp gr668pt50t-14-fitted.tlt gr668pt50t-15-itr.tlt
tomo-refine.sh gr668pt50t-15.param >& refine-15.log
tomo-fit.sh gr668pt50t-15.param
cp gr668pt50t-15.param gr668pt50t-16.param
cp gr668pt50t-15-fitted.tlt gr668pt50t-16-itr.tlt
tomo-refine.sh gr668pt50t-16.param >& refine-16.log
tomo-fit.sh gr668pt50t-16.param
 
  • Notes on script:
    • #PBS -l nodes=1:ncpus=1 --- Only use one CPU on one node
    • #PBS -N PROTOMO --- name the job PROTOMO
  • Sample script for calculating tomogram:
#! /bin/sh
#PBS -l nodes=1:ncpus=1

cd $PBS_O_WORKDIR
tomo-map.sh gr668pt50t-02.param gr668pt50t-02-fitted.tlt

  • cd $PBS_O_WORKDIR --- required to change to the working directory
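
Since Protomo uses a single CPU, processing several tomograms at once means writing one small PBS script per tomogram and submitting each with qsub. The following is a rough sketch of that idea, not a tested site procedure. The dataset names tomoA and tomoB, the output directory, and the function write_map_script are placeholders. The sketch writes the scripts and prints the qsub commands instead of running them.

```shell
#!/bin/sh
# Sketch only: writes one PBS script per tomogram and prints the qsub command.
# OUTDIR and the dataset names below are placeholders.
OUTDIR=${OUTDIR:-${TMPDIR:-/tmp}/pbs-scripts}
mkdir -p "$OUTDIR"

# Write a tomogram-calculation script for dataset $1 at iteration $2
# and print the path of the file that was written.
write_map_script() {
    name=$1
    iter=$2
    script="$OUTDIR/map-${name}.sh"
    cat > "$script" <<EOF
#! /bin/sh
#PBS -l nodes=1:ncpus=1
#PBS -N PROTOMO

cd \$PBS_O_WORKDIR
tomo-map.sh ${name}-${iter}.param ${name}-${iter}-fitted.tlt
EOF
    echo "$script"
}

for name in tomoA tomoB; do
    script=$(write_map_script "$name" 02)
    # Printed rather than executed, so the scripts can be inspected first.
    echo "qsub $script"
done
```

Each generated script requests one CPU on one node, so PBS can schedule the tomograms side by side.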

  • Set ALLOWTOPICVIEW =

-- BillRice - 15 Aug 2008

Revision 4 - 15 Aug 2008 - Main.BillRice

 
META TOPICPARENT name="CemIT"
Contents

General guidelines

  • home directory has 20 GB space allocated to all of CEM computing
  • /usr/data has 100 GB of space allocated to all of CEM computing
  • both disks operate at the same speed
  • In general, do most of your processing on /usr/data, since there is more space available
  • Please remove data after processing so that others will have space to work

Batch submission system

  • We have PBS (commercial version) for handling of jobs
  • Also have mpirun installed for MPI-aware applications (Spider is the only MPI-aware EM software)

EMAN

  • EMAN will work with the batch submission software (PBS) installed on the cluster
  • Need to make a batch submission script for cluster use
  • sample
#!/bin/bash

#PBS -N EMAN
#PBS -l nodes=8:ncpus=4
cd $PBS_O_WORKDIR

refine 8 ang=8.5 mask=38 pad=120 hard=25 classkeep=0.8 classiter=8 sym=c1 phasecls median proc=32

  • Notes on script:
    • #!/bin/bash --- use the bash shell
    • #PBS -N EMAN -- name the job EMAN (you can name it what you like)
    • #PBS -l nodes=8:ncpus=4 --- Use 8 nodes on the cluster, with 4 processors per node
    • cd $PBS_O_WORKDIR --- This is needed so that script will be run in the current directory, rather than your home directory
    • refine 8 ang=8.5 ... proc=32 -- proc= command means use 32 processors (8 nodes * 4 CPU per node)
  • To submit the script, if it is named emanrefine.sh
   > qsub emanrefine.sh

SPIDER

Protomo

  • Protomo software will only use one CPU, but you can process multiple tomograms at once using the cluster
  • Use the qsub submission process, and make a qsub script for each command
  • sample refinement script:
#! /bin/sh
#PBS -l nodes=1:ncpus=1
#PBS -N PROTOMO
Deleted:
<
<
 cd $PBS_O_WORKDIR

cat gr668_3_2a80nm11500xgkpt42t.param | sed 's/guess=false/guess=true/' > gr668_3_2a80nm11500xgkpt42t-01.param
cp gr668_3_2a80nm11500xgkpt42t-ali.tlt gr668_3_2a80nm11500xgkpt42t-01-itr.tlt
tomo-refine.sh gr668_3_2a80nm11500xgkpt42t-01.param >& refine-01.log
tomo-fit.sh gr668_3_2a80nm11500xgkpt42t-01.param
cp gr668_3_2a80nm11500xgkpt42t-01.param gr668_3_2a80nm11500xgkpt42t-02.param

...etc

  • Notes on script:
    • #PBS -l nodes=1:ncpus=1 --- Only use one CPU on one node
    • #PBS -N PROTOMO --- name the job PROTOMO
  • Sample script for calculating tomogram:
#! /bin/sh
#PBS -l nodes=1:ncpus=1

cd $PBS_O_WORKDIR
tomo-map.sh gr668pt50t-02.param gr668pt50t-02-fitted.tlt

  • cd $PBS_O_WORKDIR --- required to change to the working directory

  • Set ALLOWTOPICVIEW =

-- BillRice - 15 Aug 2008

Revision 3 - 15 Aug 2008 - Main.BillRice

 
META TOPICPARENT name="CemIT"
Contents

General guidelines

  • home directory has 20 GB space allocated to all of CEM computing
  • /usr/data has 100 GB of space allocated to all of CEM computing
  • both disks operate at the same speed
  • In general, do most of your processing on /usr/data, since there is more space available
  • Please remove data after processing so that others will have space to work

Batch submission system

  • We have PBS (commercial version) for handling of jobs
  • Also have mpirun installed for MPI-aware applications (Spider is the only MPI-aware EM software)

EMAN

  • EMAN will work with the batch submission software (PBS) installed on the cluster
  • Need to make a batch submission script for cluster use
  • sample
#!/bin/bash

#PBS -N EMAN
#PBS -l nodes=8:ncpus=4
cd $PBS_O_WORKDIR

refine 8 ang=8.5 mask=38 pad=120 hard=25 classkeep=0.8 classiter=8 sym=c1 phasecls median proc=32

  • Notes on script:
Changed:
<
<
  • #!/bin/bash --- use the bash shell
  • #PBS -l nodes=8:ncpus=4 --- Use 8 nodes on the cluster, with 4 processors per node
  • cd $PBS_O_WORKDIR --- This is needed so that script will be run in the current directory, rather than your home directory
  • refine 8 ang=8.5 ... proc=32 -- proc= command means use 32 processors (8 nodes * 4 CPU per node)
>
>
    • #!/bin/bash --- use the bash shell
    • #PBS -N EMAN -- name the job EMAN (you can name it what you like)
    • #PBS -l nodes=8:ncpus=4 --- Use 8 nodes on the cluster, with 4 processors per node
    • cd $PBS_O_WORKDIR --- This is needed so that script will be run in the current directory, rather than your home directory
Added:
>
>
    • refine 8 ang=8.5 ... proc=32 -- proc= command means use 32 processors (8 nodes * 4 CPU per node)
  • To submit the script, if it is named emanrefine.sh
   > qsub emanrefine.sh
 

SPIDER

Protomo

  • Protomo software will only use one CPU, but you can process multiple tomograms at once using the cluster
  • Use the qsub submission process, and make a qsub script for each command
  • sample refinement script:
#! /bin/sh
#PBS -l nodes=1:ncpus=1
#PBS -N PROTOMO

cd $PBS_O_WORKDIR

cat gr668_3_2a80nm11500xgkpt42t.param | sed 's/guess=false/guess=true/' > gr668_3_2a80nm11500xgkpt42t-01.param
cp gr668_3_2a80nm11500xgkpt42t-ali.tlt gr668_3_2a80nm11500xgkpt42t-01-itr.tlt
tomo-refine.sh gr668_3_2a80nm11500xgkpt42t-01.param >& refine-01.log
tomo-fit.sh gr668_3_2a80nm11500xgkpt42t-01.param
cp gr668_3_2a80nm11500xgkpt42t-01.param gr668_3_2a80nm11500xgkpt42t-02.param
 
...etc

  • Notes on script:
    • #PBS -l nodes=1:ncpus=1 --- Only use one CPU on one node
    • #PBS -N PROTOMO --- name the job PROTOMO
  • Sample script for calculating tomogram:
#! /bin/sh
#PBS -l nodes=1:ncpus=1

cd $PBS_O_WORKDIR
tomo-map.sh gr668pt50t-02.param gr668pt50t-02-fitted.tlt

  • cd $PBS_O_WORKDIR --- required to change to the working directory

  • Set ALLOWTOPICVIEW =

-- BillRice - 15 Aug 2008

Revision 2 - 15 Aug 2008 - Main.BillRice

 
META TOPICPARENT name="CemIT"
Contents

General guidelines

  • home directory has 20 GB space allocated to all of CEM computing
  • /usr/data has 100 GB of space allocated to all of CEM computing
  • both disks operate at the same speed
  • In general, do most of your processing on /usr/data, since there is more space available
  • Please remove data after processing so that others will have space to work

Batch submission system

  • We have PBS (commercial version) for handling of jobs
  • Also have mpirun installed for MPI-aware applications (Spider is the only MPI-aware EM software)

EMAN

  • EMAN will work with the batch submission software (PBS) installed on the cluster
  • Need to make a batch submission script for cluster use
  • sample
#!/bin/bash

#PBS -N EMAN
#PBS -l nodes=8:ncpus=4
cd $PBS_O_WORKDIR

refine 8 ang=8.5 mask=38 pad=120 hard=25 classkeep=0.8 classiter=8 sym=c1 phasecls median proc=32

  • Notes on script:
  • #!/bin/bash --- use the bash shell
  • #PBS -l nodes=8:ncpus=4 --- Use 8 nodes on the cluster, with 4 processors per node
  • cd $PBS_O_WORKDIR --- This is needed so that script will be run in the current directory, rather than your home directory
  • refine 8 ang=8.5 ... proc=32 -- proc= command means use 32 processors (8 nodes * 4 CPU per node)

SPIDER

Protomo

Added:
>
>
  • Protomo software will only use one CPU, but you can process multiple tomograms at once using the cluster
  • Use the qsub submission process, and make a qsub script for each command
  • sample refinement script:
#! /bin/sh
#PBS -l nodes=1:ncpus=1
#PBS -N PROTOMO

cd $PBS_O_WORKDIR

cat gr668_3_2a80nm11500xgkpt42t.param | sed 's/guess=false/guess=true/' > gr668_3_2a80nm11500xgkpt42t-01.param
cp gr668_3_2a80nm11500xgkpt42t-ali.tlt gr668_3_2a80nm11500xgkpt42t-01-itr.tlt
tomo-refine.sh gr668_3_2a80nm11500xgkpt42t-01.param >& refine-01.log
tomo-fit.sh gr668_3_2a80nm11500xgkpt42t-01.param
cp gr668_3_2a80nm11500xgkpt42t-01.param gr668_3_2a80nm11500xgkpt42t-02.param
 
...etc

  • Notes on script:
    • #PBS -l nodes=1:ncpus=1 --- Only use one CPU on one node
    • #PBS -N PROTOMO --- name the job PROTOMO
  • Sample script for calculating tomogram:
#! /bin/sh
#PBS -l nodes=1:ncpus=1

cd $PBS_O_WORKDIR
tomo-map.sh gr668pt50t-02.param gr668pt50t-02-fitted.tlt

  • cd $PBS_O_WORKDIR --- required to change to the working directory
 
  • Set ALLOWTOPICVIEW =

-- BillRice - 15 Aug 2008

Revision 1 - 15 Aug 2008 - Main.BillRice

 
META TOPICPARENT name="CemIT"
Contents

General guidelines

  • home directory has 20 GB space allocated to all of CEM computing
  • /usr/data has 100 GB of space allocated to all of CEM computing
  • both disks operate at the same speed
  • In general, do most of your processing on /usr/data, since there is more space available
  • Please remove data after processing so that others will have space to work

Batch submission system

  • We have PBS (commercial version) for handling of jobs
  • Also have mpirun installed for MPI-aware applications (Spider is the only MPI-aware EM software)

EMAN

  • EMAN will work with the batch submission software (PBS) installed on the cluster
  • Need to make a batch submission script for cluster use
  • sample
#!/bin/bash

#PBS -N EMAN
#PBS -l nodes=8:ncpus=4
cd $PBS_O_WORKDIR

refine 8 ang=8.5 mask=38 pad=120 hard=25 classkeep=0.8 classiter=8 sym=c1 phasecls median proc=32

  • Notes on script:
  • #!/bin/bash --- use the bash shell
  • #PBS -l nodes=8:ncpus=4 --- Use 8 nodes on the cluster, with 4 processors per node
  • cd $PBS_O_WORKDIR --- This is needed so that script will be run in the current directory, rather than your home directory
  • refine 8 ang=8.5 ... proc=32 -- proc= command means use 32 processors (8 nodes * 4 CPU per node)

SPIDER

Protomo

  • Set ALLOWTOPICVIEW =

-- BillRice - 15 Aug 2008

 
Copyright © by the contributing authors. All material on this collaboration platform is the property of the contributing authors.