Difference: CemEmanRefinement (12 vs. 13)

Revision 13, 16 Dec 2008 - Main.BillRice

 

EMAN Refinement

Files needed

  • threed.0a.mrc -- the starting model, perhaps produced by common lines
  • start.hed, start.img -- the particle stack
  • optional: structure factor file, if doing full CTF correction
  • optional: start.filt.hed, start.filt.img: if using filtered particles for projection matching

Description of command line options

  • EMAN uses a single command, refine, to do all refinement
  • A typical command is refine 8 hard=25 pad=120 sym=C1 mask=40 ctfcw=structure.sf dfilt classkeep=1 classiter=8 refine
  • There are a large number of switches; notes on what the main ones mean follow
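
As a rough illustration of how such a command line is put together, the sketch below assembles the typical command from its parts. The helper name build_refine_command is hypothetical (it is not part of EMAN), and the sketch assumes the order of the switches does not matter; it only uses switches that appear in the typical command above.

```python
def build_refine_command(iterations, options, flags=()):
    """Assemble an EMAN 'refine' command line as one string.

    iterations -- the bare [N] argument (number of refinement iterations)
    options    -- mapping of key=value switches, e.g. {"hard": 25}
    flags      -- bare switches that take no value, e.g. ("dfilt", "refine")
    """
    parts = ["refine", str(iterations)]
    # key=value switches, in the order they were given
    parts += [f"{key}={value}" for key, value in options.items()]
    # bare switches go last (assumption: switch order is not significant)
    parts += list(flags)
    return " ".join(parts)


typical = build_refine_command(
    8,
    {"hard": 25, "pad": 120, "sym": "C1", "mask": 40,
     "ctfcw": "structure.sf", "classkeep": 1, "classiter": 8},
    flags=("dfilt", "refine"),
)
print(typical)
```

Keeping the switches in a dictionary makes it easy to change one value between runs and to record exactly what was used for each refinement.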

Size options

  • ang=[Angular increment in degrees]
    • Take arctan (desired resolution / particle radius) as a first estimate
    • A fairly critical parameter. Highly resolution dependent; also depends on the size of the model and the level of detail
    • Do not be too aggressive at first
    • If you entered all the info in the first eman window, the program will give you an appropriate value to use
    • Run time varies with (1/n) squared, where n is the angular increment
  • mask=[Mask radius in A]
    • should be bigger than the particle size, in most cases substantially bigger
    • most of the time, use 1/2 box size
  • pad=[padded box size in pixels]
    • used to prevent artifacts caused by direct Fourier inversion
    • spreads particles out so that ripples are pushed to edge of padded box
    • why not automatically do this? Memory considerations
    • Should be 25% larger than box size and have small prime factors
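
The two rules of thumb above (arctan of resolution over radius for ang, and a padded box about 25% larger with small prime factors) can be sketched as follows. The function names are hypothetical, and treating 2, 3, 5 and 7 as the "small" primes is an assumption; the notes do not say which factors count as small.

```python
import math


def first_guess_ang(resolution_A, particle_radius_A):
    """First estimate of ang: arctan(desired resolution / particle radius), in degrees."""
    return math.degrees(math.atan(resolution_A / particle_radius_A))


def has_small_prime_factors(n, primes=(2, 3, 5, 7)):
    """True if n factors completely into the given primes (assumed 'small' set)."""
    for p in primes:
        while n % p == 0:
            n //= p
    return n == 1


def suggest_pad(box_size, margin=1.25):
    """Smallest size that is at least margin * box_size and has only small prime factors."""
    size = math.ceil(box_size * margin)
    while not has_small_prime_factors(size):
        size += 1
    return size


print(round(first_guess_ang(20, 100), 2))  # 20 A target, 100 A particle radius
print(suggest_pad(96))                      # 96-pixel box
```

For a 96-pixel box this gives 120 (2 x 2 x 2 x 3 x 5), which happens to match the pad value in the typical command. The ang value is only a starting point; the notes advise not being too aggressive at first.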

CTF correction parameters (one of the following three options)

  • ctfc=[RES in Angstroms]
    • No structure factor file needed
    • Good for low-resolution work only (within the first zero of the CTF)
    • Advantage of using it is that it avoids the high-pass filtering effect of the CTF
    • Do not use in most instances, as usually you want full CTF correction
  • ctfcw=[SF filename]
    • Need structure factor file, which must be the same as was used in determining the ctf value
  • median
    • Math of this is poorly defined (don't use often)
  • Note that ctfc and ctfcw require that ctf params have been added to the particle stack, otherwise the command will likely fail or give wrong results

Phase error options

  • dfilt, fscls, phasecls
    • if no CTF correction is being done, use fscls or phasecls (particle dependent)
    • fscls uses Fourier shell correlation to determine similarity
    • phasecls uses phase residual
    • dfilt is an optimized variance with a phase filter
    • if doing CTF correction, use dfilt
    • dfilt : works slightly better than the others, expect a 5 - 10% improvement in final resolution over fscls
  • hard=[phase error in degrees]
    • looks at mean phase error between class average and reconstruction projection
    • decides whether or not a given class average should be used
    • discards class averages which do not match
    • calculates mean phase error over a predefined frequency range
    • Need to watch output in order to see which class averages it is throwing away
    • In most cases, 25 is a good value
    • However, if starting with a Gaussian blob, may need to start bigger (45-55)
    • Don't make it much smaller than 25 -- too many class averages get thrown away
  • refine
    • almost always specified
    • refers to 2d alignment process
    • for alignment, EMAN uses an autocorrelation function, then transforms to polar coordinates, aligns, and transforms back to real space
    • refine then does a final refinement in real space, to sub-pixel resolution (0.02 pixel, 0.05 degrees)
    • Why not use? Takes twice as long
  • sym=[cn, dn, oct, or icos]
    • symmetry enforcement
    • for asymmetric particles, use sym=C1 or omit entirely
    • cn describes single n-fold rotational symmetry about the z-axis
    • dn describes n-fold dihedral symmetry (cn with n 2-folds in xy plane)
    • oct is octahedral (2-3-4 symmetry of cube)
    • icos is icosahedral (2-3-5)

Iteration options

  • [N]
    • Number of refinement iterations to do
    • It will continue from the most recently completed iteration, so refine 10 after refine 8 will start with iteration 9
  • classiter=[number]
    • Actually uses (N-2) iterations (why???)
    • so classiter=3 will do 1 iteration (1 and 2 are undefined)
    • 0 can be used, but risky -- use at end, or alternate with 3 to get more resolution
    • The number refers to the number of reference-free alignments of each class average done at each overall iteration
  • classkeep=[number]
    • happens same time as classiter
    • Decides how many particles will be kept for making class average
    • goes "classkeep" number of std deviations away from mean
    • classkeep=0: throw away 50% of particles
    • classkeep=1: throw away 10-20%
    • Usually use 1-2
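
The classkeep percentages above are what one would expect from a one-sided cut on a roughly normal distribution of particle scores. The sketch below illustrates that reading only; the notes do not state the exact statistic EMAN uses, so the one-sided cut, the "higher score is worse" convention, and both function names are assumptions.

```python
from statistics import NormalDist, mean, pstdev


def expected_discard_fraction(classkeep):
    """Fraction of particles more than `classkeep` standard deviations
    on the bad side of the mean, assuming normally distributed scores."""
    return 1.0 - NormalDist().cdf(classkeep)


def keep_particles(scores, classkeep):
    """Indices of particles whose score is at most mean + classkeep * sigma.

    Assumes a higher score means a worse match to the class average.
    """
    cutoff = mean(scores) + classkeep * pstdev(scores)
    return [i for i, s in enumerate(scores) if s <= cutoff]


for k in (0, 1, 2):
    print(k, round(expected_discard_fraction(k), 3))
```

Under these assumptions classkeep=0 discards half of the particles and classkeep=1 discards about 16%, in line with the 50% and 10-20% figures quoted above.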

Additional options

  • sep=[n]
    • useful -- "a poor-man's maximum likelihood"
    • each particle is put into best n classes, rather than just the best class
    • Use with oversampled ang values at end of refinement
    • no weighting, just puts into n bins
    • combine with a low classkeep value (<1, maybe <0) to throw away bad particles
  • xfiles=[a/pix,mass(kD),ali-to]
    • for each threed.1a.mrc, makes a x.1.mrc file scaled so that surface threshold of 1 gives mass specified
    • ali-to means to align to a specific iteration number, to prevent the molecule from drifting.
      • Using 99 is common, to prevent it from doing the alignment
    • a/pix -- need to specify pixel size for proper scaling
  • amask=<radius,threshold,iteration>
    • "Very important option"
    • does a 3d floodfill outside of particle (solvent flattening)
    • must use with xfiles option since threshold selection requires volume normalization
    • radius -- in pixels -- roughly the radius of the particle, just big enough to touch the inside density of the particle
    • threshold -- the isosurface level that is floodfilled; slightly lower than the surface threshold needed to see the particle (if 1 is edge, use 0.7)
    • iteration -- adds a number of trailing shells outside of the tight floodfilled mask; use about 10% of box size
    • If too tight, get artificial increases in resolution
  • shrink=[n]
    • reduces image size n-fold
    • use if you don't want to bother shrinking data at start
    • Don't use on final model
    • shrink=2: gives an 8-fold speed improvement
  • usefilt
    • No longer experimental, is now in frequent use
    • most helpful in full CTF correction
    • do any filtration you like (lowpass, Wiener, ...) and save particle stack as start.filt.hed
  • refmaskali
    • Use mask that was projected from 3d volume to do alignment (2d), but do not apply mask to aligned particles
    • prevent mask from throwing away extra chunks
    • good idea in early stages, but will give you lower resolution
    • A good safe option at first, until you are sure

Options not recommended

  • tree=[2,3]
    • uses a 2-step classification, to speed things up
    • thinks it is working ok, but may not work on cluster
    • shrinking particle beforehand is a better way to speed things up
  • filt3d
    • "If you need it, then something else is wrong"
  • euler2
    • Not suggested -- often only makes things worse
  • perturb
    • "probably not useful"
  • imask=[radius]
    • Apparently still works, but not tested in a long time
    • only useful in specific circumstances
  • collapse
    • Don't use
  • any other options not mentioned in this list
    • "Lots of options have been added to EMAN over the years, but never removed"
    • "Documentation has not been updated in several years"

Resolution determination

  • use eotest with same options as refine
  • Delete any options that give you an error


-- BillRice - 18 Aug 2008

 