Difference: XeasyModel (1 vs. 3)

Revision 3, 17 Sep 2008 - Main.DavidCowburn


3. The XEASY Model

To enable computer-supported interpretation of spectra, an operational model is needed that defines the terms on which the computer interacts with the spectroscopist. This model has to include the spectra, which are the raw data, and abstractions such as peaks, assignments, line shapes, geometries and strips, which are derived from the spectra and constitute the results of the program. These different elements are detailed in this chapter.

3.1 Spectrum

A spectrum is a two-, three- or four-dimensional box of intensities containing the frequency domain data of an NMR experiment. Spectra are loaded using the command "New spectrum [ns]".

The intensity information for each data point is encoded either in 8 bits or in 16 bits. The 8 bit format uses a logarithmic representation of the data with 1 byte per real data point. For a given data point sk the program first determines the integer l that minimizes the expression

                 [...]                 (3.1.A)

(i.e. [...]) and then stores in one byte

                 [...]                 (3.1.B)

This format can represent numbers approximately in the range [...], i.e. with a relative error of less than 20%. The 16 bit format uses a 16 bit floating point representation with the "exponent" ek given by Eq. 3.1.B in the lower valued byte and the mantissa

                 [...]                 (3.1.C)

([...] if [...]) in the higher valued byte (Eccles et al., 1991). This format can represent numbers in the same range as the 8 bit format but with a relative error of less than 1%.

The intensity data is stored in a spectrum data file with the extension ".3D.8" or ".3D.16", depending on the accuracy used. The information about the spectrum is kept in the parameter file with the extension ".3D.param". The example below is from a 2D spectrum:

Version ....................... 1
Number of dimensions .......... 2
16 or 8 bit file type ......... 16
Spectrometer frequency in w1 .. 60.811001
Spectrometer frequency in w2 .. 600.138000
Spectral sweep width in w1 .... 34.840000
Spectral sweep width in w2 .... 6.719000
Maximum chemical shift in w1 .. 135.384872
Maximum chemical shift in w2 .. 11.538800
Size of spectrum in w1 ........ 128
Size of spectrum in w2 ........ 512
Submatrix size in w1 .......... 256
Submatrix size in w2 .......... 256
Permutation for w1 ............ 2
Permutation for w2 ............ 1
Folding in w1 ................. RSH
Folding in w2 ................. RSH
Type of spectrum .............. C

The intensities are stored in the 16 bit format. For both dimensions the spectrometer frequencies are given in MHz and the spectral sweep widths in ppm. The maximum chemical shifts give the ppm frequency of the lower left corner of the spectrum. The total size of the spectrum is given in data points. The submatrix size gives the size of the blocks used to store the intensities in order to allow fast access to small parts of the spectrum. The permutation defines the sequential order in which the intensities are stored in the data file. The folding has to be set to either RSH, for Ruben States Haberkorn (Marion, D., Ikura, M., Tschudin, R. & Bax, A. (1989) J. Magn. Reson. 85, 393-399), or TPPI, for time proportional phase increment (Marion, D. and Wüthrich, K. (1983) Biochem. Biophys. Res. Comm. 113, 967-974). The folding type is used to fold the peaks into the spectrum when loading a peak list. The type of spectrum is not used by the program.
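For scripting outside XEASY, a parameter file of the form shown above can be read with a few lines of Python. This is only a sketch: the function name parse_param is hypothetical, and it assumes that every entry is a label, a run of at least two dots, and a single value. The ppm range computed at the end assumes that the spectrum extends from the maximum chemical shift downwards by one sweep width.

```python
import re

def parse_param(text):
    """Parse XEASY-style parameter text into a dict {label: value}."""
    params = {}
    for line in text.splitlines():
        # Expected shape: label, a run of at least two dots, then one value.
        match = re.match(r"^(.*?)\s*\.{2,}\s*(\S+)\s*$", line)
        if not match:
            continue
        label, raw = match.groups()
        # Convert to int or float where possible; keep strings such as "RSH".
        for cast in (int, float):
            try:
                params[label] = cast(raw)
                break
            except ValueError:
                continue
        else:
            params[label] = raw
    return params

example = """\
Version ....................... 1
Number of dimensions .......... 2
16 or 8 bit file type ......... 16
Spectrometer frequency in w1 .. 60.811001
Spectral sweep width in w1 .... 34.840000
Maximum chemical shift in w1 .. 135.384872
Size of spectrum in w1 ........ 128
Folding in w1 ................. RSH
"""

params = parse_param(example)
# Assumed ppm range in w1: from (maximum - sweep width) up to the maximum.
w1_min = (params["Maximum chemical shift in w1"]
          - params["Spectral sweep width in w1"])
print(params["Number of dimensions"], params["Folding in w1"], w1_min)
```

Keeping the full label text as the dictionary key avoids inventing short names that the parameter file itself does not define.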

To set the calibration of a spectrum the command "Calibration [ca]" is used. Since the peak positions are stored in ppm it is not recommended to work with uncalibrated spectra. The way the parameter file is interpreted is influenced by the value of the resource "XEasy*traditional_calibration".

A spectrum can be turned around without changing the data file by editing the parameter file. Using the following parameter file defines the data file from the above example to be from a three dimensional spectrum extending only one data point in the third dimension. An arbitrary calibration is used in w1:

Version ....................... 1
Number of dimensions .......... 3
16 or 8 bit file type ......... 16
Spectrometer frequency in w1 .. 500.000000
Spectrometer frequency in w2 .. 60.811001
Spectrometer frequency in w3 .. 600.138000
Spectral sweep width in w1 .... 10.000000
Spectral sweep width in w2 .... 34.840000
Spectral sweep width in w3 .... 6.719000
Maximum chemical shift in w1 .. 500.000000
Maximum chemical shift in w2 .. 135.384872
Maximum chemical shift in w3 .. 11.538800
Size of spectrum in w1 ........ 1
Size of spectrum in w2 ........ 128
Size of spectrum in w3 ........ 512
Submatrix size in w1 .......... 1
Submatrix size in w2 .......... 256
Submatrix size in w3 .......... 256
Permutation for w1 ............ 3
Permutation for w2 ............ 2
Permutation for w3 ............ 1
Folding in w1 ................. RSH
Folding in w2 ................. RSH
Folding in w3 ................. RSH
Type of spectrum .............. C
The different possibilities to generate spectra files in the XEASY format are given below. A description of conversion programs between different formats is given under external programs.
  • directly from PROSA: using the spectrum processing program PROSA (P. Güntert, V. Dötsch, G. Wider and K. Wüthrich, J. Biomol. NMR, 2 (1992) 619-629) data suitable for XEASY may be written out with the command write easy8 [filename] or write easy16 [filename]
  • converting a Bruker smx file into a 2D spectrum using the filegen2d program.
  • converting a series of Bruker smx files into one 3D spectrum using the filegen3d program.
  • copying Bruker smx files from the X32 to the workstation and converting them into 2D or 3D spectra. This can be done with the shell scripts cpx32 and cpx32_16bit.
  • converting a spectrum from the old EASY format into the new XEASY format with the program filecon2d.

In addition to the file format for XEASY, two outdated spectral formats from the EASY and EASY3D programs exist. The format from the EASY program can no longer be read by the XEASY program. These files have to be converted by the filecon2d program. Files used in EASY3D can still be read by the XEASY program. Since these parameter files contain binary information they may not be edited.

3.2 Zoom Regions

Many routines of the XEASY program display selected regions of a spectrum. One-dimensional cross sections of the spectrum are displayed as plots of intensity vs. data points in the slice window (see section 12.8 on page 95). Two-dimensional regions of the spectrum may be shown in the main window as contour plots or as intensity plots. The intensity plot maps the intensities onto a color scale. The contour plot draws lines of equal intensity. The following commands may be used to change between the different display modes:

3.2.1 Selecting Zoom Regions

XEASY provides routines to define and select regions from the spectrum for displaying. The interactive routines for zooming using no information from peaks, assignments or strip sequences are described below. In addition, there exist a number of commands which select zoom regions based on picked peaks or on strip sequences. These are described in the corresponding chapters about peak lists (page 23) and strip sequences (page 36).

To select an arbitrary region in the spectrum and also to restore the full spectrum the command

"Permutation [pm]"

can be used. This command as well as the "New spectrum [ns]" command define the view onto the spectrum, that is which dimension will be displayed horizontally (in x direction) and which vertically (in y direction). Many routines within XEASY refer to this view.

The following commands allow the definition of regions to be displayed by manually selecting their boundaries within the currently displayed zoom region:

In addition the two letter code "mo" may be used to select a region in the overview window. To modify the size or position of the displayed regions the following commands are available:

The zoom commands to move or resize regions take as a parameter the zoom factor, which can be defined using the command

"Zoom factor [zf]".

3.2.2 Comparing Zoom Regions

To compare different spectra, it is possible to display the contour plot of one spectrum on top of the intensity plot of another spectrum. The command

"Alternative spectrum [as]".

may be used to define additional spectra for which the commands

"Replace contour [rc]".
"Replace spectrum [rs]".

select the ones to display as contour and intensity plots. The spectra currently used for the intensity plot and the contour plot are indicated in the upper left corner of the main window. The upper name gives the spectrum used for the intensity plot, the lower name the one used for the contour plot. The routines in XEASY which access the intensities of a spectrum always use the spectrum of the intensity plot. To compare different regions the command

"Zoom alignment [za]".

is used to get several aligned regions. The large cursor

"Draw cursor [dc]".

allows the precise comparison of the positions of peaks within these aligned regions. When the intensity plots of different spectra are displayed together, it might be necessary to adjust the scale of color vs. intensities of one or more of them. This is possible with the command

"Scale display [sd]".

3.2.3 Zoom Region History

A history of the displayed zoom regions is kept and is used to call back previous displays. The command

"Restore zoom [rz]".

brings back the last stored display. Commands that can easily be reversed (e.g. the commands to move around a zoom region) are not stored in the zoom region history and cannot be restored with the [rz] command. The two-letter codes "zb", "zf" and "zo" can be used in the overview window to select any zoom from the zoom history.

Zooms may also be written to a file and recovered again. The commands are

"Load zoom [lz]".
"Write zoom [wz]".

These files have the extension .zoom and contain the information about the zoom regions. The data in the zoomed regions are not stored, but rather the ppm coordinates of the regions in the spectrum are saved. Whenever such a file is loaded, the display will be accordingly updated.

3.3 Phase Correction

The program XEASY can be used to determine phase correction parameters interactively on the basis of 1D cross-sections that are displayed in the slice window (see "Slice window [sw]").

With the conventions used by the program PROSA (P. Güntert, V. Dötsch, G. Wider and K. Wüthrich, J. Biomol. NMR, 2 (1992) 619-629), the phase-corrected spectrum is related to the original spectrum by

                 3.1.B
for the data points 0, ..., n-1 in every row along the dimension of interest. The relation contains two parameters: the constant and the linear (or first order) phase correction parameter. The program XEASY allows both phase correction parameters to be changed in real time while the corresponding phase-corrected 1D cross-sections are displayed, and thus provides a convenient environment for the accurate interactive selection of phase correction parameters. The phase correction parameters found with the help of XEASY can subsequently be used in data processing programs such as PROSA to perform the phase correction on the complete multidimensional spectrum.

To use the interactive phase correction routine in XEASY, first two separate spectrum files containing the real and imaginary parts of the complex spectrum are prepared (e.g., with the program PROSA). Then the real part of the spectrum is read into XEASY with the "New spectrum [ns]" command, and the imaginary part of the spectrum is defined as alternative spectrum with the "Alternative spectrum [as]" command. Next the user selects and displays suitable rows (or columns) in the slice window (see "Slice window [sw]") and clicks the Phase button in the slice window. Phase correction will be performed for the spectrum in the dimension of the current slice (i.e., the slice for which the Current button on top of the slice window is activated). The alternative spectrum window pops up and the user selects the file containing the imaginary part of the spectrum. The constant and linear phase correction parameters can now be adjusted using the keyboard according to table 3.3.

The linear phase correction parameter is changed such that the phase of the current slice at the cursor position in the slice window remains constant. The actual values of the phase correction parameters are monitored in the lower left corner of the slice window.

The user deactivates the phase correction mode by clicking the Phase button again. The 1D cross-sections of the original spectrum (i.e., the real part loaded with the [ns] command) are displayed again.

3.4 Assignments

Assignments represent the main result of the work with the program XEASY. Three lists, namely the peak list, the atom list and the fragment list contain the information about the assignments. The peak list contains the coordinates of the picked peaks used to assign the spectrum. The atom list contains the names and frequencies of possible resonances. They define the possible assignments for each dimension of a peak. For homonuclear, single quantum, proton spectra the atom list contains the names and frequencies of all protons in the molecule. In this case it corresponds to the proton list used in the programs EASY and EASY3D. The fragment list contains the names of the fragments which are used at the different stages of the assignment process. In early stages of the assignment, spin systems independent of the primary sequence of the molecule are used. Then increasingly more of them become mapped onto the primary sequence of the protein until finally the residues of the molecule under investigation are used. In this final stage the fragment list corresponds to the sequence list of the programs EASY and EASY3D.

In addition to these three lists, XEASY uses a library file defining the atoms and pseudo atoms for each fragment type. The following chapters provide first, a detailed description of the three lists and the library file, and then a description of how they can be used in conjunction in order to proceed through the different stages of the assignment process.

3.4.1 Peak List

A peak list contains entries for peaks picked in the spectrum. The following paragraphs present in detail the peak list file format, peak picking, assigning peaks, editing peaks, displaying relevant information from a peak list, folding, and how to treat different dimensionality of peak lists and spectra.

A peak list contains an entry for each peak. The fields describing a peak are listed in Table 3.4.1.A. In a 2D spectrum the fields for the w3 and w4 dimensions, and in a 3D spectrum those for w4, are not used. The peak numbers in a peak list must be unique but need not be consecutive.

Table 3.4.1.A Peak Fields

Dim.   Fields               Description

       peak number          unique number identifying the peak
       colour               colour in the range [1,6] used for displaying the peaks
       volume               volume of the peak
       volume error         volume error in percent
       integration method   method used for the integration: d, r, e, m, a, -
       comment              user defined comment
       possible ass.        data structure containing possible assignments

w1     shift                folded w1 position in ppm
       fold                 number of times peak is folded in w1
       atom number          assignment in w1: reference into the atom list

w2     shift                folded w2 position in ppm
       fold                 number of times peak is folded in w2
       atom number          assignment in w2: reference into the atom list

w3     shift                folded w3 position in ppm
       fold                 number of times peak is folded in w3
       atom number          assignment in w3: reference into the atom list

w4     shift                folded w4 position in ppm
       fold                 number of times peak is folded in w4
       atom number          assignment in w4: reference into the atom list

The peak list is stored in a peak list file with extension .peaks. It can be read by the program ASNO (P. Güntert et al., 1993) which uses a peak list, an atom list and selected structure coordinate files to generate possible assignments for NOESY cross peaks which can be loaded into XEASY. The program CALIBA (P. Güntert et al., J. Mol. Biol. (1991) 217, 517-530) translates the peak lists containing integrated peaks into distance constraints which can be used by the program DIANA (P. Güntert et al., J. Mol. Biol. (1991) 217, 517-530). An example of the first few lines of a two dimensional peak list is given below:
# Number of dimensions 2
   11  7.289  10.169 1 ?          2.048e+03  0.00e+00 -   0  126  128  0
       #  first peak
   12  7.119   9.413 1 ?          1.280e+02  0.00e+00 -   0  517  506  0
   3   7.106   7.497 1 ?          4.096e+03  0.00e+00 -   0  129  130  0
   4   7.228   7.411 1 ?          4.096e+03  0.00e+00 -   0  131  127  0
   5   7.106   7.411 1 ?          5.120e+02  0.00e+00 -   0  327  328  0
   6   6.838   7.094 1 ?          8.192e+03  0.00e+00 -   0  489  488  0
The number on the first line after the hash "#" indicates the dimensionality of the peak list. Subsequent lines starting with a number contain the fields for one peak. Additional lines starting with a hash "#" are comments for the peak on the line above. The fields on a peak line are, in order:
  • the peak number
  • the unfolded chemical shift coordinates in ppm in w1 and w2 (more numbers are listed for higher dimensional spectra)
  • the color code (a number from 1 to 6)
  • the user defined type of the spectrum where the peak is observed
  • the peak volume
  • the uncertainty of the volume in percent
  • the integration method ("d" for Denk integration, "r" for rectangular integration, "e" for elliptical integration, "m" for maximum integration, "a" for automatic integration, "-" for not integrated)
  • an unused number
  • the atom numbers giving the assignments in w1 and w2 (more numbers are listed for higher dimensional spectra)
  • a last number that is not used

The commands to load or write peak lists are "Load peaklist [lp]" and "Write peaklist [wp]".
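The field order described above can be turned into a small reader for use in scripts outside XEASY. The following Python sketch is not part of XEASY; the names Peak and parse_peaks are hypothetical, and it assumes whitespace-separated fields exactly in the order given above.

```python
from dataclasses import dataclass, field

@dataclass
class Peak:
    number: int
    shifts: list            # unfolded chemical shifts in ppm, one per dimension
    color: int
    spectrum_type: str
    volume: float
    volume_error: float     # in percent
    method: str             # d, r, e, m, a or -
    atoms: list             # assigned atom numbers, one per dimension
    comments: list = field(default_factory=list)

def parse_peaks(text):
    lines = text.splitlines()
    # The first line carries the dimensionality: "# Number of dimensions 2"
    ndim = int(lines[0].split()[-1])
    peaks = []
    for line in lines[1:]:
        stripped = line.strip()
        if not stripped:
            continue
        if stripped.startswith("#"):
            # A comment line belongs to the peak on the line above.
            if peaks:
                peaks[-1].comments.append(stripped.lstrip("#").strip())
            continue
        f = stripped.split()
        i = 1 + ndim  # index of the first field after the shifts
        peaks.append(Peak(
            number=int(f[0]),
            shifts=[float(x) for x in f[1:i]],
            color=int(f[i]),
            spectrum_type=f[i + 1],
            volume=float(f[i + 2]),
            volume_error=float(f[i + 3]),
            method=f[i + 4],
            # f[i + 5] is the unused number; the atom numbers follow it.
            atoms=[int(x) for x in f[i + 6:i + 6 + ndim]],
        ))
    return ndim, peaks

sample = """\
# Number of dimensions 2
   11  7.289  10.169 1 ?          2.048e+03  0.00e+00 -   0  126  128  0
       #  first peak
   12  7.119   9.413 1 ?          1.280e+02  0.00e+00 -   0  517  506  0
"""
ndim, peaks = parse_peaks(sample)
print(ndim, peaks[0].number, peaks[0].atoms, peaks[0].comments)
```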

Loading a peak list whose dimensionality differs from that of the spectrum is possible. If the peak list has more dimensions than the spectrum, the additional dimensions are simply ignored when working with the peak list. When loading a peak list with fewer dimensions than the spectrum, it has to be specified whether the needed additional dimensions are copied from the available dimensions or set to a fixed ppm value. For example, the 2D peak list from a [1H,15N]-HMQC spectrum can be loaded onto a 3D 15N-correlated [1H,1H]-NOESY by copying the 1H chemical shift of the 2D list to both 1H dimensions in the 3D spectrum. The resulting peaks are positioned on the diagonal of the 1H planes in the 3D spectrum.

Peaks may be picked either automatically or manually. For automatic peak picking of anti-phase peaks in 2D spectra the command "Anti-phase peaks [an]" is used. For automatic picking of in-phase peaks the external program pick described on page 105 or the command

"In-phase peak picking [in]" can be used.

Issuing the command "Peak picking [pp]" enters the manual peak picking mode. By selecting a position on the screen a peak placed at the corresponding position of the spectrum is added to the peak list. In 3D and 4D spectra the added peak is picked at the position of the displayed 2D region.

The peak list file always contains the unfolded chemical shifts. When a peak list is loaded onto a spectrum, all peaks are folded into the spectrum. The folding information is stored in the field fold. When the peak list is written out, this field is used to back-calculate the unfolded peak positions. In this manner the unfolded chemical shifts of a peak can be transferred from a spectrum with a large sweep width (e.g. a 2D HMQC experiment) to spectra with smaller sweep widths (e.g. 3D 15N-correlated [1H,1H] spectra). When picking additional peaks, one wants to retain the folding information from peaks already present in the spectrum. This is possible with the command "Copy and move peaks [cm]", which copies the folding information and the assignment from an existing peak to new peaks. Other methods to set the folding of a peak are to enter the unfolded chemical shift into the w field of the peak editing window, or to use the command "Set selected peak entries [pe]", which can be used to set or reset any field in the peak list.
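The relation between unfolded shift, folded shift and fold count can be illustrated with a short Python sketch. It is not XEASY's implementation. It assumes simple aliasing, in which a peak outside the spectral range reappears shifted by a whole number of sweep widths, and it assumes that the spectrum covers the range from the maximum chemical shift downwards by one sweep width. The sign convention of the fold count and the function names are assumptions as well; TPPI-type folding is not covered.

```python
import math

def fold(shift, max_shift, sweep_width):
    """Alias an unfolded shift (ppm) into (max_shift - sweep_width, max_shift].

    Returns the folded shift and the number of sweep widths that were added.
    """
    n = math.floor((max_shift - shift) / sweep_width)
    return shift + n * sweep_width, n

def unfold(folded, n, sweep_width):
    """Back-calculate the unfolded shift from the folded shift and fold count."""
    return folded - n * sweep_width

# w1 values of the example parameter file: maximum shift and sweep width in ppm
MAX_W1, SW_W1 = 135.384872, 34.84

folded_hi, n_hi = fold(140.0, MAX_W1, SW_W1)   # above the range: folded down
folded_in, n_in = fold(120.0, MAX_W1, SW_W1)   # inside the range: unchanged
folded_lo, n_lo = fold(90.0, MAX_W1, SW_W1)    # below the range: folded up
print(folded_hi, n_hi, folded_in, n_in, folded_lo, n_lo)
```

Storing the fold count next to the folded position, as the fold field does, is what makes the round trip back to the unfolded shift possible when the list is written out.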

Once peaks are picked, they may be assigned to atoms present in the atom list. This is done either for a single peak using the command "Assign peak [ap]"

or for all selected peaks within a region using the command "Assign peaks in one region [ar]".

For example, all peaks in the fingerprint region of a homonuclear proton COSY spectrum can be assigned to the HN and Ha atoms of a set of spin systems, or all peaks in a [1H,15N]-HMQC spectrum may be assigned to the N and HN atoms of backbone fragments.

When working with a peak list it is crucial to display information relevant for the current assignment task. First, often only a subset of peaks is relevant (e.g. only the peaks assigned to a certain fragment). The commands "Select peak [sp]" and "Display all peaks [da]" provide different criteria for selecting peaks to display. In 3D and 4D spectra, usually peaks of interest are only those close to the displayed 2D zoom region. The option "Maximal distance in planes" in the default window controls the number of adjacent planes from which to display peaks.

Second, peaks with different properties (e.g. assigned peaks versus unassigned peaks) can be displayed with different colors or peak shapes. The options "Peak color determined by", "Peak cross displayed is", "Lineshape displayed for" and "Coloring interval" in the default window control the peak colors.

Third, peaks may also be labelled with their assignment or the volume. The command

"Peak data window [pw]" pops up the peak data window in which the peak labels may be defined.

Fourth, peaks may be used to zoom interesting regions out of a spectrum. In the simplest case, the command "View peak [vp]" zooms a region that contains a specified peak. The command "Zoom peak [zp]" displays a small region around each of the peaks selected with the "Select peak [sp]" command. The command "View reference peak [vr]" displays, for each peak with the same assignment as the selected one, a slice in the slice window. Another powerful method to select relevant regions out of a spectrum is the strip list. It is described in section 6.0 on page 56.

It is possible to work with several peak lists together. When a peak list is loaded it is kept in memory until it is removed from memory with the

"Erase peak list [ep]"

command. To select a different peak list for displaying with a spectrum the command

"Exchange peak list [xc]"

is used. This command does not change the dimensionality or the folding of the loaded peak list. To adapt the folding of the peaks to a certain spectrum, to change the dimensionality or to permute the dimensions the command "Adapt peaks to spectrum [ad]" is used.

Several commands exist to edit the peak list entries. They are:

Worth mentioning is the "Move reference peak [mr]" command. It allows one to move together all peaks with the same assignment. This helps when adjusting a peak list picked and assigned in one spectrum to a related, but slightly different spectrum. Two examples are pH titrations and the use of a TOCSY peak list to identify the intra-residual peaks in a NOESY spectrum.

To check or to list the assignments the commands "List peak entries [le]" and "Report ass. stat.[ra]" are used. They check for duplicated assignments, i.e. two peaks that have the same assignment, and for large chemical shift errors, i.e. peaks at different frequencies which are assigned to the same atom. Both kinds of information are useful to identify wrong assignments.
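The two checks described above can be sketched in Python as follows. This is an illustration of the idea, not the algorithm used by XEASY. The function name, the tolerance of 0.02 ppm and the use of atom number 0 for "not assigned" are assumptions, and peaks 7 and 8 in the example are invented for demonstration.

```python
from collections import defaultdict

def check_assignments(peaks, tolerance=0.02):
    """peaks: list of (peak_number, shifts, atoms) tuples.

    Returns (duplicates, outliers):
      duplicates: {assignment tuple: [peak numbers]} for assignments used twice
      outliers:   {atom number: spread in ppm} where the spread exceeds tolerance
    Atom number 0 is treated as "not assigned" (an assumption).
    """
    by_assignment = defaultdict(list)
    shifts_by_atom = defaultdict(list)
    for number, shifts, atoms in peaks:
        if all(a != 0 for a in atoms):
            by_assignment[tuple(atoms)].append(number)
        for shift, atom in zip(shifts, atoms):
            if atom != 0:
                shifts_by_atom[atom].append(shift)
    duplicates = {k: v for k, v in by_assignment.items() if len(v) > 1}
    outliers = {atom: max(s) - min(s)
                for atom, s in shifts_by_atom.items()
                if max(s) - min(s) > tolerance}
    return duplicates, outliers

peaks = [
    (3, (7.106, 7.497), (129, 130)),
    (4, (7.228, 7.411), (131, 127)),
    (7, (7.110, 7.495), (129, 130)),   # hypothetical: same assignment as peak 3
    (8, (7.300, 8.100), (131, 0)),     # hypothetical: atom 131 far from peak 4
]
duplicates, outliers = check_assignments(peaks)
print(duplicates, outliers)
```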

3.4.2 Atom List

The atom list contains the names and frequencies of resonances. They define the possible assignments for each dimension of a peak.

Table 3.4.2.A Atom Fields

Fields            Description

atom number       unique number identifying the atom
shift             mean chemical shift in ppm
shift error       deviation of the assigned peaks from the mean value
name              atom name
fragment number   number of the fragment to which the atom belongs
lineshapes        data structure containing the reference lineshapes

The fields listed in Table 3.4.2.A constitute an atom entry. They can be modified in the peak editing window. The atom numbers, used to reference the atoms, must be unique but not necessarily continuous. The number -9999 is reserved to denote invalid entries. The average chemical shift and the shift error can be calculated from the assigned peaks. The command is

"Average chem. shift [ac]".

If the chemical shift is not defined it is set to the value 999.000.
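How the shift and shift error fields could be derived from assigned peaks is sketched below in Python. The manual does not specify the exact measure of "deviation"; the population standard deviation is used here as an assumption, and the function name and input layout are hypothetical. Atoms without any assigned peak receive the value 999.000 described above.

```python
from collections import defaultdict
from statistics import mean, pstdev

UNDEFINED_SHIFT = 999.0  # value used when the chemical shift is not defined

def average_shifts(atom_numbers, peaks):
    """Return {atom number: (mean shift, deviation)}.

    peaks: iterable of (shifts, atoms) pairs, one shift and atom per dimension.
    The deviation is the population standard deviation (an assumption).
    """
    observed = defaultdict(list)
    for shifts, atoms in peaks:
        for shift, atom in zip(shifts, atoms):
            observed[atom].append(shift)
    result = {}
    for atom in atom_numbers:
        values = observed.get(atom, [])
        if values:
            result[atom] = (mean(values), pstdev(values))
        else:
            result[atom] = (UNDEFINED_SHIFT, 0.0)
    return result

peaks = [
    ((7.106, 7.497), (129, 130)),
    ((7.110, 7.495), (129, 130)),
]
shifts = average_shifts([129, 130, 131], peaks)
print(shifts)
```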

A new atom list is generated each time when a fragment list is loaded. For each fragment the corresponding atoms are looked up in the fragment library file and added to the list. New atom entries are added to the atom list if a non existing atom is used with the "Assign peak [ap]" command, if the fragment type is changed in the peak editing window, or with the "Add new fragment [af]" command.

The atom list file has the extension ".prot" originating from the old EASY format. The following line is taken from such a file:

32   4.370  0.004   HA  2
The first number is the atom number, followed by its mean chemical shift and the deviation from the mean value. The atom name and the fragment number follow. The commands to read or write an atom list are

"Load atoms (chem. shift) [lc]" and "Write atoms (chem. shift) [wc]".

3.4.3 Fragment List

The fragment list contains the fragments currently used for the assignment. Depending on the stage of the assignment, the fragments are either spin systems or residues of the molecule. In the latter case, the fragment list corresponds to the sequence list used in the programs EASY and EASY3D.

Table 3.4.3.A Fragment Fields

Fields         Description

fragment number   unique number identifying the fragment
fragment type     name of the fragment
mapping number    used to map spin system fragments to residue fragments
comment           used to store possible spin system types or possible sequential neighbours

The fields listed in Table 3.4.3.A constitute a fragment. The fragment numbers, which are used in the atom list to reference fragments, must be unique but not necessarily continuous. The number -9999 is reserved to denote invalid entries. The fragment type must be defined in the fragment library. When starting a resonance assignment it is usually set to "SS", meaning that there is no information about the specific spin system type. In later stages it may be changed, for example to "ASP", using the peak editing window. The command "Add new fragment [af]" will add additional fragments to the list.

The mapping number defines a mapping from one set of fragments (e.g., spin systems) to another set (e.g., amino acid residues). When working with spin system fragments the mapping number is the residue number of the spin system. For example, a fragment with number 207, type GLY and mapping number 55 describes spin system 207, which probably corresponds to GLY 55. The mapping number -1 is reserved to denote invalid mappings. The comment can, for example, be used to store the fragment numbers of the sequentially neighboring spin systems. When using the automatic sequential assignment routine "Sequential assignment [op]", the fragment comment can be used to indicate for each spin system the probable amino acid types.

The fragment list is stored in a file with extension ".seq". If the first line starts with a hash "#", it is treated as a comment. The subsequent lines list the fragment types with one line per fragment. For example, the following file defines a tripeptide:

# tripeptide
ASP 0
GLY 5 203
LYS 7 209 "-1: 203"
The number after the fragment name, if given, denotes the fragment number. If no number is given for a fragment, its number is set to the previous fragment number plus one. If no number is given for the first fragment, it is set to one. The optional second number is the mapping number. An entry "-1" for the mapping number indicates that it is not defined. The optional text within the quotes is the comment. If the comment is present, the mapping number must also be listed.
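The numbering rules above can be made concrete with a small Python reader for fragment list text. This is a sketch for external scripts, not XEASY code; the function name parse_sequence and the tuple layout are hypothetical. It uses shlex.split so that the quoted comment stays a single token.

```python
import shlex

def parse_sequence(text):
    """Return a list of (fragment number, type, mapping number, comment)."""
    fragments = []
    last_number = 0  # so that an unnumbered first fragment becomes number 1
    for index, line in enumerate(text.splitlines()):
        if index == 0 and line.startswith("#"):
            continue  # a first line starting with "#" is a comment
        tokens = shlex.split(line)
        if not tokens:
            continue
        ftype = tokens[0]
        # Missing fragment number: previous fragment number plus one.
        number = int(tokens[1]) if len(tokens) > 1 else last_number + 1
        # Missing mapping number: -1 marks it as not defined.
        mapping = int(tokens[2]) if len(tokens) > 2 else -1
        comment = tokens[3] if len(tokens) > 3 else ""
        fragments.append((number, ftype, mapping, comment))
        last_number = number
    return fragments

tripeptide = '''# tripeptide
ASP 0
GLY 5 203
LYS 7 209 "-1: 203"
'''
fragments = parse_sequence(tripeptide)
unnumbered = parse_sequence("SER\nGLY\nARG 10\nLYS\n")
print(fragments)
print(unnumbered)
```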

The commands to read or write a sequence file are "Load sequence [ls]" and "Write sequence [ws]".

3.4.4 Fragment Library

The fragment library defines the different fragment types. All the atoms constituting a given fragment type are listed. The format is the one of the program DIANA (P. Güntert et al., J. Mol. Biol. (1991) 217, 517-530). Amino acid fragments, nucleic acid fragments, a general spin system fragment and a general amino acid backbone fragment are currently defined. The extension is ".lib". The default library, defined by the environment variable XEASY_LIB, is loaded when starting up XEASY. To load a different library the command "Load library [ll]" is used.

3.4.5 From Spin System Assignments to Residue Assignments

A reasonable setup for starting out with resonance assignments is to define a set of general spin systems in the fragment list and generate the corresponding atom list. The first few lines of such a fragment list are given below:
  SS 201
  SS 202
  SS 203
  SS 204
  SS 205
  SS 206
  SS 207
  SS 208
  SS 209
  ...
Some lines extracted from the corresponding atom list, which is generated by loading the fragment list into XEASY, are given as well:
   1 999.000 0.000 N     201
   2 999.000 0.000 HN    201
   3 999.000 0.000 CA    201
   4 999.000 0.000 HA    201
   5 999.000 0.000 QA    201
   6 999.000 0.000 CB    201
   7 999.000 0.000 HB2   201
   8 999.000 0.000 HB3   201
   9 999.000 0.000 QB    201
...
  34 999.000 0.000 N     202
  35 999.000 0.000 HN    202
  36 999.000 0.000 CA    202
  37 999.000 0.000 HA    202
  38 999.000 0.000 QA    202
  39 999.000 0.000 CB    202
  40 999.000 0.000 HB2   202
  41 999.000 0.000 HB3   202
  42 999.000 0.000 QB    202
...
With these two lists, peaks may be grouped by assigning them to the same fragment and further classified by assigning them to certain atoms within the fragments. For example when working with heteronuclear spectra, each peak in a [1H,15N] HMQC experiment can be assigned to the N and HN atoms of different fragments. Similarly when working with homonuclear spectra, each COSY cross peak in the fingerprint region may be assigned to the Ha and HN atoms of a different fragment. Peaks in the COSY or TOCSY spectrum lying on the same amide frequency as the HN Ha peak can then be assigned to the HN Hb2, HN Hg2, ... of the same fragment.

After the resonance assignment has been finished, the fragments of choice are the amino acid residues and the atom list has entries for the atoms in the molecule. The first few lines of the fragment list for the interleukin receptor antagonist are given below:


ARG 1
PRO 2
SER 3
GLY 4
ARG 5
LYS 6
Some lines extracted from the corresponding atom list with undefined chemical shifts are given below:
   1   999.000  0.000   C       1
   2   999.000  0.000   CA      1
   3   999.000  0.000   CB      1
   4   999.000  0.000   CD      1
   5   999.000  0.000   CG      1
   6   999.000  0.000   CZ      1
   7   999.000  0.000   HA      1
   8   999.000  0.000   HB2     1
   9   999.000  0.000   HB3     1
  10   999.000  0.000   HD2     1
  11   999.000  0.000   HD3     1
  12   999.000  0.000   HE      1
  13   999.000  0.000   HG2     1
  14   999.000  0.000   HG3     1
  15   999.000  0.000   HH1     1
  16   999.000  0.000   HH21    1
  17   999.000  0.000   HH22    1
  18   999.000  0.000   HN      1
  19   999.000  0.000   N       1
  20   999.000  0.000   NE      1
  21   999.000  0.000   NH1     1
  22   999.000  0.000   NH2     1
  23   999.000  0.000   QB      1
  24   999.000  0.000   QD      1
  25   999.000  0.000   QG      1
  26   999.000  0.000   QH2     1
  27   999.000  0.000   C       2
  28   999.000  0.000   CA      2
  29   999.000  0.000   CB      2
  30   999.000  0.000   CD      2
  31   999.000  0.000   CG      2
  32   999.000  0.000   HA      2
  33   999.000  0.000   HB2     2
  34   999.000  0.000   HB3     2
  35   999.000  0.000   HD2     2
  36   999.000  0.000   HD3     2
  37   999.000  0.000   HG2     2
  38   999.000  0.000   HG3     2
  39   999.000  0.000   N       2
  40   999.000  0.000   QB      2
  41   999.000  0.000   QD      2
  42   999.000  0.000   QG      2
...
These two lists can be used to make assignments of peaks to atoms in the molecule under investigation. For example, with these lists NOESY cross peaks can be assigned and integrated in order to extract distance constraints.
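The atom list layout shown above can be read with a few lines of Python. This is only a minimal sketch: the names `Atom`, `parse_atom_line` and `atoms_by_fragment` are hypothetical, and the meaning given to the third column (a shift uncertainty) is an assumption, since it is not spelled out here. The value 999.000 is treated as the marker for an undefined chemical shift, as in the listings above.

```python
from typing import Dict, Iterable, List, NamedTuple

UNDEFINED_SHIFT = 999.0  # shift value used in the listings for "not yet known"


class Atom(NamedTuple):
    number: int      # atom number referenced from the peak list
    shift: float     # chemical shift in ppm (999.000 = undefined)
    error: float     # third column; assumed here to be the shift uncertainty
    name: str        # atom name, e.g. "HN" or "QB"
    fragment: int    # number of the fragment the atom belongs to


def parse_atom_line(line: str) -> Atom:
    """Split one whitespace-separated atom list line into its five fields."""
    number, shift, error, name, fragment = line.split()
    return Atom(int(number), float(shift), float(error), name, int(fragment))


def atoms_by_fragment(lines: Iterable[str]) -> Dict[int, List[Atom]]:
    """Group atoms by fragment number, skipping blank and '...' lines."""
    table: Dict[int, List[Atom]] = {}
    for line in lines:
        stripped = line.strip()
        if not stripped or stripped == "...":
            continue
        atom = parse_atom_line(stripped)
        table.setdefault(atom.fragment, []).append(atom)
    return table
```

Grouping by fragment number mirrors the way peaks are classified: first by fragment, then by atom within the fragment.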

Starting from peaks assigned to spin system atoms, a method is needed to proceed to peaks assigned to atoms of the residues in the molecule. Changing assignments of single peaks is too time consuming. Changing the assignments for all peaks assigned to a given residue together is preferable. This is achieved by leaving the peak list and the atom numbers untouched and changing only the fragment and atom list.

The fragment type field can be changed from a general spin system (SS) into a glycine (GLY). This ensures that not only the fragment type in the fragment list is updated but also that all the atoms of the new fragment type are present in the atom list. In the above example, when changing from a spin system fragment to a glycine fragment, the Ha1, Ha2 and Qa atoms are inserted at the appropriate position in the atom list. For changing the fragment number (i.e. proceeding from spin system numbers to residue numbers) the mapping number is used. The mapping number is set in the peak editing window. In the case of a spin system fragment it is equal to the residue number of the spin system. The command "Switch fragment/map numbers [sn]" exchanges the fragment number and the mapping number. In order to keep the fragment and the atom list consistent the fragment number is changed in both lists together.

It is important to ensure that a given fragment number is not used at the same time for a spin system fragment and a residue fragment. The first way to avoid this, as illustrated in the above example, is to use different sets of numbers for the spin systems and the residues. The second way is to first set the mapping of all spin system fragments and only then exchange the fragment and the mapping numbers.
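The bookkeeping behind exchanging fragment and mapping numbers can be sketched as follows. This is an illustration under stated assumptions, not the program's implementation: the dictionary layout and the function name `switch_fragment_and_map_numbers` are hypothetical, and the sketch applies the exchange to all fragments at once. The peak list is not touched, because peaks refer to atom numbers only.

```python
def switch_fragment_and_map_numbers(fragments, atoms):
    """Exchange fragment and mapping numbers in both lists together.

    fragments: list of dicts with the keys 'number', 'type' and 'map'
    atoms:     list of dicts with (at least) the key 'fragment'
    Returns new lists; the inputs are left unchanged.
    """
    renumber = {frag["number"]: frag["map"] for frag in fragments}

    # Guard against two fragments ending up with the same fragment number.
    new_numbers = list(renumber.values())
    if len(set(new_numbers)) != len(new_numbers):
        raise ValueError("two fragments would receive the same fragment number")

    new_fragments = [
        dict(frag, number=frag["map"], map=frag["number"]) for frag in fragments
    ]
    # Atoms follow their fragment so that both lists stay consistent.
    new_atoms = [dict(atom, fragment=renumber[atom["fragment"]]) for atom in atoms]
    return new_fragments, new_atoms
```

Because every mapping is set before any number is exchanged, the sketch corresponds to the second of the two strategies described above.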

3.4.6 Changing Assignments

Different cases can be distinguished: either a peak is misassigned, a resonance is misassigned or the fragment number or fragment type is misassigned. Each of these cases is discussed separately.

To correct a misassigned peak the atom list and the fragment list are not changed and only the atom number fields in the peak list are changed in order to reflect the new assignment. This is done with the command "Assign peak [ap]". In contrast, to change a resonance assignment, the peak list and the fragment list should remain unchanged and only the atom list is adapted. This can be achieved by editing the atom fields in the peak editing window. When an atom name or fragment number is changed, the corresponding fields in the atom list are updated.

To change the fragment type or the fragment number the peak list remains unchanged while the atom list and fragment list are adapted. This is done by editing the fragment and the mapping field in the peak editing window and using the "Switch fragment/map numbers [sn]" command as described in the chapter From Spin System Assignments to Residue Assignments on page 31.

3.5 Possible Residue Types

Based on observed chemical shifts of a given fragment it is possible to identify likely residue types. Especially when 13C chemical shifts are known, discrimination between different amino acid types is readily achieved, e.g., in more than 50% of the cases the correct amino acid can be identified. The "Residue Type window [rw]" allows a set of identified frequencies to be matched to the frequencies expected for different residue types. The expected frequencies and their standard deviations are stored in the fragment library. If for a certain atom no expected frequency is specified in the fragment library the atom will not be considered.

3.6 Possible Assignments

Once the resonance assignments are nearly completed the known frequencies can be calculated using the "Average chem. shift [ac]" command and subsequently can be used to generate possible assignments for peaks. For each dimension of a peak, the atoms resonating at about the same frequency can be identified and their atom numbers stored in a data structure. The command "Possible assign. [pa]" asks for a maximal allowed deviation between the frequencies of the picked peak and the resonances of the atoms. It then generates all the assignment possibilities for all the dimensions of the peaks.

In addition to specifying the possible assignments for each dimension separately one often wants to include or exclude combinations of assignments. Since a 3D or 4D list of assignment possibilities would be too big, XEASY treats only two dimensional arrays of assignment possibilities. In a 3D or 4D spectrum only combinations of assignments in the two dimensions displayed horizontally and vertically are treated. In each dimension up to 10 possible assignments can be stored, that is, in the displayed plane a maximum of 100 possible combinations. Whenever there are more than ten assignment possibilities in a dimension, a warning message is displayed and the maximal allowed deviation between the peak and the atoms is lowered for this peak.

To view and edit the assignment possibilities the assignment window is available. It can be popped up by the command "Editing assign. [ea]". In the lower left side of the panel, the horizontal text lines list the possible assignments in the vertical dimension, and the vertical text lines in the upper right part list the possible assignments in the horizontal dimension. The assignments may be toggled using the buttons in the window. To select a single assignment possibility, the corresponding button has to be pushed while the shift key is held down.

Once a unique assignment is present in the assignment window the peak list is updated automatically. Another possibility to update the peak list according to the assignment list is provided by the command "Update assign. [ua]". It can also be used to check the two lists for contradictory assignments and to update the assignment list according to the peak list.

Two possibilities exist to reduce the number of proposed assignments. The command "Reduce to intrares. ass. [ia]" toggles all inter-residual assignments off. The command "Reduce using ass. peaks [ru]" uses already assigned peaks to reduce the assignment possibilities of unassigned peaks. Its main application is to include the information of the third and fourth dimension in 3D and 4D spectra.

The assignment possibilities are stored in a file with extension ".assign". This file contains the possible assignments for peaks. The number in the first line of the file indicates the dimension of the peak (2 means two dimensions and 3 means three dimensions). For each peak, several lines containing the assignment information follow. A typical block of data for a 2D peak is given here:

# 127
3 220 225 294
10 22 82 197 283 293 388 403 432 438 457
1050625 0 0 0
The peak number (127 in the example) is given after the hash "#". If, when loading the assignment list, no peak with the listed peak number is present in the loaded peak list, a warning message is displayed and the loading is aborted. Subsequent lines (in the example the second and third line) list the possible assignments for each dimension. On both lines the first number indicates the number of possible atom assignments in this dimension, followed by the individual atom numbers. Atom numbers in the file that are missing in the loaded atom list will be ignored. The four numbers on the last line encode the allowed assignment combinations corresponding to the toggle button matrix in the assignment window. The commands to read or write an assignment file are "Load assignment [la]" and "Write assignment [wa]".
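The per-peak block described above can be read as sketched below. The function name `parse_assign_block` is hypothetical. The four numbers on the last line are returned unchanged, because the way they encode the toggle button matrix is not specified here and the sketch does not try to guess it.

```python
def parse_assign_block(lines, ndim=2):
    """Parse one peak block of an ".assign" file.

    lines: the block as a list of strings, i.e. the "# <peak>" line, one line
           of possible assignments per dimension, and the combination line.
    Returns (peak_number, atoms_per_dimension, combination_words).
    """
    header = lines[0].split()
    if len(header) != 2 or header[0] != "#":
        raise ValueError("block must start with '# <peak number>'")
    peak_number = int(header[1])

    atoms_per_dimension = []
    for line in lines[1:1 + ndim]:
        fields = [int(field) for field in line.split()]
        count, atoms = fields[0], fields[1:]
        # The leading count must agree with the number of atom numbers listed.
        if count != len(atoms):
            raise ValueError("assignment count does not match the atoms listed")
        atoms_per_dimension.append(atoms)

    # Encoded allowed combinations; kept as raw integers (encoding not decoded).
    combination_words = [int(field) for field in lines[1 + ndim].split()]
    return peak_number, atoms_per_dimension, combination_words
```

Applied to the example block, the first dimension yields three candidate atoms and the second yields ten, which is the maximum stored per dimension.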

In later stages of the NOESY cross peak assignment, the number of assignment possibilities can be reduced using preliminary structures. This can be done using the program ASNO (P. Güntert et al., 1993). The program reads in structure coordinates, a peak list and an atom list. It produces all assignment possibilities not contradicting the structures or the frequencies of the atoms. The resulting assignment file from ASNO can also be read with the "Load assignment [la]" command. The first few lines of such a file are given below.

Assignment file
   Corresponding peaklist: toc.peaks
   Corresponding coherencelist: tend.prot
   Number of dimensions: 2
   Uncertainties:    0.020   0.020


#     1
    126  128 
#     2
     24   22 
    173  171 
    361  359 
    253  251 
    316  314 
The first five lines give the peak list, atom list and chemical shift tolerances in ppm used to produce the assignment file as well as the dimensionality of the assignment list. The entries for each peak start with a hash "#" followed by the peak number. The following lines list the assignment possibilities. Each line defines one nD assignment possibility defined by atom numbers in all dimensions.
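A reader for the ASNO style file shown above might look like the following sketch. The function name `parse_asno_assignments` is hypothetical, and the header lines are simply skipped rather than interpreted.

```python
def parse_asno_assignments(text):
    """Return {peak number: [tuple of atom numbers, ...]} for an ASNO style file.

    Every line before the first '#' line is treated as header and ignored.
    Each remaining non-empty line is one assignment possibility, with one
    atom number per dimension.
    """
    possibilities = {}
    current_peak = None
    for line in text.splitlines():
        stripped = line.strip()
        if stripped.startswith("#"):
            current_peak = int(stripped[1:])
            possibilities[current_peak] = []
        elif current_peak is not None and stripped:
            atoms = tuple(int(field) for field in stripped.split())
            possibilities[current_peak].append(atoms)
    return possibilities
```

In contrast to the ".assign" block format, each line here already is a complete combination, so no separate combination matrix is needed.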

3.7 Strips

The concept of strips originates from work with 3D spectra but can also be used for 2D and 4D spectra. Strips are 1D cross sections extending in a second dimension. They lie at the position of an atom (in 2D spectra) or atom group (in 3D or 4D spectra) and contain all the peaks involving this atom group. For example, in spectra involving 1H and 15N of the amide group, such as the 15N-correlated [1H,1H]-NOESY, 15N-correlated [1H,1H]-TOCSY, HNCA or HN(CO)CA experiments, a strip can be defined for each amide group. In the case of 2D 1H spectra a strip is defined for each proton, and in the case of 13C correlated spectra such as the HCCH TOCSY a strip can be defined for each proton bound to a carbon. In order to judge the lineshapes of the cross peaks in two dimensions the 1D cross sections are extended into a second dimension. To use strips in 4D spectra the width of the second dimension should be set equal to the sweep width.

For 3D or 4D spectra, working with a set of strips instead of planes reduces the complexity of the assignment steps. Methods to selectively search for certain strips further simplify the assignment procedure. The following paragraphs introduce the strip data structure, methods to define strip lists, display and edit them as well as methods to selectively search for strips that have either a specified intensity pattern or lie at a selected position in the spectrum.

A strip in XEASY has fields similar to those of a peak; however, a strip is not used to hold relevant assignment information. This has to reside in the peak lists. Assignments stored together with a strip are only used to remove duplicated entries, to sort the strip list, and to identify peaks belonging to a strip. The fields defining a strip are given in Table 3.7.0.A.

Table 3.7.0.A Strip Fields

Dim.   Fields        Description

       spectrum      pointer to the spectrum

w1     shift         folded w1 position in ppm
       fold          number of times strip is folded in w1
       atom number   assignment in w1: reference into the atom list

w2     shift         folded w2 position in ppm
       fold          number of times strip is folded in w2
       atom number   assignment in w2: reference into the atom list

w3     shift         folded w3 position in ppm
       fold          number of times strip is folded in w3
       atom number   assignment in w3: reference into the atom list

w4     shift         folded w4 position in ppm
       fold          number of times strip is folded in w4
       atom number   assignment in w4: reference into the atom list

As for the peak list, the folded position of the strip is used within the program but the unfolded shifts are written out to the strip list file. The folding information is stored in the field fold. The atom numbers define the assignment of the strip. In contrast to peaks, strips cannot be edited. They can only be defined from peaks or loaded from a file.

The strip list file has the extension .strips. An example of the first few lines out of a strip list file is given below:

     15nnoe-scp  3    0 116.385   8.312   8.311 1854 1853 1853
     15nnoe-scp  3    1 118.317   8.170   8.185 1887 1886 1886
     15nnoe-scp  3    2 117.788   7.822   7.817 1920 1919 1919
     15nnoe-scp  3    3 115.097   6.957   6.957 1953 1952 1952
     15nnoe-scp  3    4 112.377   8.099   8.099 1986 1985 1985
     15nnoe-scp  3    5 112.319   7.891   7.891 2019 2018 2018
     15nnoe-scp  3    6 112.020   6.961   6.960 2052 2051 2051
     15nnoe-scp  3    7 125.553   8.410   8.410 2085 2084 2084
     15nnoe-scp  3    8 111.984   8.932   8.932 2118 2117 2117
     15nnoe-scp  3    9 115.893   7.997   7.963 2151 2150 2150
     15nnoe-scp  3   10 106.044   7.482   7.478    0    0    0
     15nnoe-scp  3   11 117.601   7.746   7.733 2184 2183 2183
     15nnoe-scp  3   12 121.138   7.431   7.431 2217 2216 2216
     15nnoe-scp  3   13 119.677   8.409   8.409 2250 2249 2249
For each strip there is one line. First the name of the spectrum in which the strip was defined is indicated, followed by the dimensionality of this spectrum. When loading a strip sequence, XEASY tries to find the indicated spectrum. If this is not present, a spectrum with the same dimensionality is taken - if possible the currently displayed one. The next number, the strip number, is currently not used. Then come the chemical shifts in all the dimensions and at the end the atom numbers specifying the assignment in each dimension. The commands to load and write strip lists are "Load strip list [sl]" and "Write strip list [ss]".

A strip list can be built up from peaks in two different ways. First, all peaks selected with the "Select peak [sp]" command can be appended to the strip list using the command "Strip sequence [se]". New strips are generated at the positions of each of the selected peaks. The assignments of the new strips are taken from the peaks. The newly generated strips are appended to the current strip list. The assignment of the strips can be used to sort them or to delete strips that have the same assignment. Second, with the commands "Append strip [sa]" and "Insert before reference [ib]" a single peak can be selected to define a strip, which is either appended or inserted into the strip list.

Once a strip list is defined it can be displayed. In order to be able to work with strips efficiently only about ten strips should be displayed on the screen at once. The selection of strips to be displayed and the number of strips to display on one screen are specified with the command "Goto strips [gs]". It displays one screen full of strips from the current strip list. With this command the width of the strips in the horizontal dimension is also defined. The height of the displayed strips will be equal to the height of the zoomed region where the command was issued.
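The one-line-per-strip layout of the strip list file described earlier in this section (spectrum name, dimensionality, strip number, shifts, atom numbers) can be read as sketched below. The names `Strip` and `parse_strip_line` are hypothetical, and the sketch assumes that the spectrum name contains no white space.

```python
from typing import NamedTuple, Tuple


class Strip(NamedTuple):
    spectrum: str               # name of the spectrum the strip was defined in
    ndim: int                   # dimensionality of that spectrum
    number: int                 # strip number (currently unused by the program)
    shifts: Tuple[float, ...]   # unfolded shifts in ppm, one per dimension
    atoms: Tuple[int, ...]      # atom numbers, one per dimension


def parse_strip_line(line: str) -> Strip:
    """Parse one line of a ".strips" file."""
    fields = line.split()
    spectrum, ndim, number = fields[0], int(fields[1]), int(fields[2])
    rest = fields[3:]
    # ndim shifts followed by ndim atom numbers are expected.
    if len(rest) != 2 * ndim:
        raise ValueError("expected %d shifts and %d atom numbers" % (ndim, ndim))
    shifts = tuple(float(value) for value in rest[:ndim])
    atoms = tuple(int(value) for value in rest[ndim:])
    return Strip(spectrum, ndim, number, shifts, atoms)
```

Keeping the shifts and atom numbers as tuples of length `ndim` lets the same reader handle 2D, 3D and 4D strip lists.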

To move around in the strip list the commands "Forward strips [fs]" and "Backward strips [bs]" can be used. They move to the next or previous screen of strips. The command "Strip find [sf]" allows a fragment number to be specified. It then displays a screen of strips including the first strip assigned to this fragment. The whole strip list or single strips can be removed from memory using the commands "Erase strip list [es]" and "Remove one strip [ro]".

In many of the assignment tasks one starts with a strip and searches for one or several related strips. The command "Hold strip [sh]" fixes one or several strips. These fixed strips are thereafter always shown as reference on the left side of the screen. They can only be removed with the "Release strip [sr]" command.

In addition to defining and displaying strips methods are necessary to search for relevant strips. Two such methods are implemented in XEASY. The first searches for strips at a selected position. Possible applications of this method include assigning NOESY cross peaks in heteronuclear correlated 3D NOESY spectra, spin systems assignment using the HCCH TOCSY, sequential assignment using the 15N correlated NOESY or the HNCA and HN(CO)CA experiments. For a more detailed description of how to work with such spectra, refer to the section 16.0 on page 109.

The second method searches for strips with a selected intensity pattern. The positions of the strips can in addition be used to narrow down the set of strips for which the intensity patterns are compared. The patterns are derived from observed intensities. They can be modified using peaks with the same assignment as the reference strip. The intensities at the positions of these peaks are multiplied by a user specified factor (refer to the detailed description below). Since the similarity of the strips is determined by calculating the correlation coefficient between strip intensity patterns, the method is called the spectral correlation method (Bartels, C and Wüthrich, K. (1994) J. Biomol. NMR, ...). Its main application is the sequential assignment of 15N-correlated [1H,1H]-NOESY spectra: sequentially neighboring spin systems show similar cross peak patterns due to their spatial proximity. The frequencies of the sequential cross peaks can be identified using the peaks from the 15N-correlated [1H,1H]-TOCSY (see also section 16.0 on page 109).

In order to work with the method to search for strips at a selected position, reference strips first have to be defined. The command to define these reference strips is "Define reference strips [rd]". It copies the current strip list into the reference list. After this command has been issued the current strip list can be altered without affecting the search for close strips. The command "Compare close strips [pc]" can then be used to search for close strips. It allows a frequency to be selected in the vertical dimension of a zoomed region. It then displays a screen of strips. The first strip comes from the zoomed region at the position where the frequency was selected. The following strips are from the reference list. They are sorted by increasing distance from the selected frequency. To view the strips further away from the selected position the commands "Forward comparison [fc]" and "Backward comparison [bc]" are provided. Again the first displayed strip is the one where the frequency was selected.

The next few paragraphs describe the correlation method followed by instructions of how to apply the method. For a detailed discussion refer to Bartels, C and Wüthrich, K. (1994) J. Biomol. NMR, ....

For the given strip s all the remaining candidate strips k are sorted according to a distance measure dk such that the strip which is searched gets a low rank. The candidate strips, starting with the lowest ranking ones, are then displayed in order to allow the final assignment to be made interactively. The distance measure dk is an enhancement of the correlation function introduced for the sequential assignment using 3D 15N-correlated [1H,1H]-NOESY in Bartels and Wüthrich (1994). In contrast to the sequential assignment using 3D 15N-correlated [1H,1H]-NOESY, in all other assignment tasks it is possible to identify a frequency f in the given strip s at which there must be a peak in the sought strip k (see Applications section). This is used in the definition of the distance measure:

[1]

The condition in Eq. [1] discriminates those strips that are unlikely to correspond to the sought strip, since pk, their position in the vertical dimension, differs by more than a user specified tolerance from the expected frequency f. The tolerance is usually set to a value larger than the expected error in the determination of the frequency f or the peak positions pk. The other strips, whose positions lie within the tolerance, are possible candidates for the sought strip. They are further sorted according to the correlation function (Bartels and Wüthrich, 1994), which expresses the similarity of the peak pattern observed in strip k to the peak pattern expected for the sought strip and derived from the peak pattern observed in the given strip s. The peak patterns v = (i1,i2, . . ., in) are n-dimensional vectors with n equal to the number of data points along the vertical dimension, which are derived from the experimentally observed intensities iex according to

[2]

In Eq. [2] the experimental intensity is taken at the local maximum m of the absolute intensities, pl and ph are the positions of the two adjacent local minima of the absolute intensities and ib is a parameter usually set to 15 times the standard deviation of the noise. The constant Am - usually set to 1 - gives a weight to every local maximum. For particular assignment tasks, as for example sequential assignment using 3D 15N-correlated [1H,1H]-NOESY spectra, it can be used to emphasize or suppress subsets of peaks (Bartels and Wüthrich, 1994).

Note that setting the tolerance in Eq. [1] to infinity handles cases where there is no peak in the sought strip whose frequency f can be identified in the given strip, e.g., sequential assignment using 3D 15N-correlated [1H,1H]-NOESY. Setting it to 0.0 ppm causes the candidate strips to be sorted only by the deviation of the strip position pk from the identified frequency f.
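The ranking behaviour described above can be illustrated with a small sketch. It is not the published distance measure: the two-tier sort key and the Pearson form of the correlation coefficient are assumptions chosen only to reproduce the two limiting cases mentioned in the text (an infinite tolerance ranks purely by correlation, a tolerance of 0.0 ppm ranks purely by deviation from f). The names `correlation` and `rank_candidates` are hypothetical.

```python
import math


def correlation(u, v):
    """Pearson correlation coefficient of two equally long intensity patterns."""
    n = len(u)
    mean_u, mean_v = sum(u) / n, sum(v) / n
    cov = sum((a - mean_u) * (b - mean_v) for a, b in zip(u, v))
    norm_u = math.sqrt(sum((a - mean_u) ** 2 for a in u))
    norm_v = math.sqrt(sum((b - mean_v) ** 2 for b in v))
    if norm_u == 0.0 or norm_v == 0.0:
        return 0.0  # a flat pattern carries no information
    return cov / (norm_u * norm_v)


def rank_candidates(expected_pattern, candidates, f, tolerance):
    """Return candidate names, best match first.

    candidates: list of (name, position p_k in ppm, intensity pattern)
    f:          frequency at which the sought strip must show a peak
    tolerance:  maximal accepted deviation |p_k - f| in ppm
    """
    def sort_key(candidate):
        _, p_k, pattern = candidate
        deviation = abs(p_k - f)
        if deviation > tolerance:
            # Unlikely strips go last, ordered by their deviation from f.
            return (1, deviation)
        # Possible candidates come first, most strongly correlated first.
        return (0, -correlation(expected_pattern, pattern))

    return [name for name, _, _ in sorted(candidates, key=sort_key)]
```

Passing `tolerance=math.inf` corresponds to the case where no frequency f can be identified, so only the pattern similarity decides the order.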

To apply the spectral correlation method the noise level must first be defined. This is done using the contour plot command "Contour plot [cp]".

Currently displayed peaks with the same assignment as the strip define the positions at which the weights Am are used to modify the peak patterns. The remaining parameters necessary to calculate the correlation coefficients are defined using the command "Define correlation [cd]". It copies the current strip list to the correlation strip list and calculates the correlation coefficients between pairs of strips. Again the current strip list can be modified afterwards without affecting the correlation method. To search for correlated strips the command "Compare correlated strips [cc]" is available. It asks for a reference strip and a frequency f to be selected and then displays a screen full of strips for comparison. The first strip displayed is the selected reference strip. The following ones come from the correlation strip list and are sorted by decreasing correlation to the pattern of the reference strip. To display the previous or next screen of correlated strips the commands "Forward comparison [fc]" and "Backward comparison [bc]" are used. Again the first displayed strip is the reference strip.

The command "Sequential assignment [op]" uses the spectral correlation coefficients for determining the sequential assignment of a protein. The method combines simulated annealing with an algorithm described by R. Bernstein, C. Cieslar, A. Ross, H. Oschkinat, J. Freund and T. A. Holak in J. Biomol. NMR, 3 (1993) 245-251. Already known sequential assignments and assignments of spin systems to possible spin system types can be provided to the routine.

3.8 Integrals

The final step in spectra interpretation for the structure determination of proteins is the integration of the NOESY cross peaks. The peak integrals are in a first approximation proportional to rij^-6, where rij denotes the inter-proton distance between protons i and j. Because of this inverse sixth power relationship the volume accuracies are not critical - an error of 100% in volume leads to only a 12% change in inter-proton distance.
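The 12% figure follows directly from the inverse sixth power relationship: if the volume is off by a factor of two, the derived distance is off by a factor of 2^(1/6). The short sketch below works this out; the function name `distance_change` is hypothetical.

```python
def distance_change(volume_factor):
    """Relative change of the derived distance when V is proportional to r**-6.

    volume_factor: measured volume divided by true volume.
    Since r is proportional to V**(-1/6), the distance scales by
    volume_factor**(-1/6).
    """
    return volume_factor ** (-1.0 / 6.0) - 1.0


# Volume underestimated by a factor of two: distance about 12% too long.
print("V x 0.5 -> r changes by %+.1f%%" % (100.0 * distance_change(0.5)))
# Volume overestimated by a factor of two: distance about 11% too short.
print("V x 2.0 -> r changes by %+.1f%%" % (100.0 * distance_change(2.0)))
```

Even an order-of-magnitude error in volume changes the distance by less than a factor of 1.5, which is why overlap, rather than integration precision, is the limiting problem.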

The problem remaining when trying to integrate cross peaks is overlap. Proteins of even moderate size will have upwards of several thousand cross-peaks many of which will overlap with others. If one of the overlapping components is weak the evaluation of its volume can be difficult - even getting the order of magnitude correct may be impossible by conventional techniques. In hetero-nuclear correlated 3D spectra the NOESY cross peaks are split up in a third dimension. This significantly reduces the problem of overlapping peaks and therefore integration.

The approach used in XEASY for 2D spectra is line-shape integration. This method stems from the fact that peaks with the same assignment have the same line-shape. Distortion may however arise from zero quantum effects. Cross peaks which overlap are integrated by taking the w1 and w2 line shapes for each peak and then deconvoluting the peak cluster with the lineshapes into volumes and volume errors. This technique was proposed by Denk, W. Baumann, R. & Wagner, G. (1986) J. Magn. Reson., 67, 386-390. The approach consists of three steps: first, reference line-shapes along the x and y direction are determined for each resonance in the 2D spectrum. Next, the peaks are grouped together into clusters of overlapping peaks. Here, two peaks are said to overlap if the rectangles defined by their line-shape extents intersect. Finally, the volume of each peak in the cluster is determined by a linear least squares fit of the peak shapes constructed from the reference line-shapes to the experimental data points in the spectrum.

In mathematical terms the problem is one of adjusting the volumes Vp to minimize the following expression:

3.8.A

where S(w1,w2) is the spectral intensity at coordinate (w1,w2), Vp is the volume of peak p, m is the number of peaks in the cluster and Li is the reference line-shape for the wi resonance of peak p. This linear least squares problem is solved using standard methods (Press W.H., Flannery, B.P., Teukolsky, S.A., Vetterling, W.T. (1986) Numerical Recipes, the Art of Scientific Computing. Cambridge University Press). An uncertainty can be obtained for each peak volume by calculating the square-root of the above function over the peak region (not the total cluster). This can then be expressed as a percentage of the calculated volume.
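The fit can be illustrated with a small self-contained sketch. It assumes, based only on the definitions above, that the expression to minimize is the sum over all data points of [S(w1,w2) - sum over p of Vp L1(w1) L2(w2)]^2, with the line-shapes of each peak p normalized to unit integral. The function name `fit_cluster_volumes` and the dictionary based data layout are hypothetical, and the normal equations are solved by plain Gaussian elimination rather than by the routines cited above.

```python
def fit_cluster_volumes(spectrum, peaks):
    """Least squares volumes for a cluster of overlapping peaks.

    spectrum: dict mapping data point (i, j) to the measured intensity
    peaks:    list of (L1, L2); L1 maps w1 indices and L2 maps w2 indices
              to reference line-shape values (zero outside the line-shape)
    """
    points = sorted(spectrum)
    # One column per peak: its shape L1(i) * L2(j) sampled at every point.
    columns = [[l1.get(i, 0.0) * l2.get(j, 0.0) for i, j in points]
               for l1, l2 in peaks]
    measured = [spectrum[point] for point in points]
    m = len(peaks)

    # Normal equations A V = b of the linear least squares problem.
    a = [[sum(x * y for x, y in zip(columns[p], columns[q])) for q in range(m)]
         for p in range(m)]
    b = [sum(x * s for x, s in zip(columns[p], measured)) for p in range(m)]

    # Gaussian elimination with partial pivoting.
    for k in range(m):
        pivot = max(range(k, m), key=lambda row: abs(a[row][k]))
        if abs(a[pivot][k]) < 1e-12:
            raise ValueError("peak shapes are linearly dependent")
        a[k], a[pivot] = a[pivot], a[k]
        b[k], b[pivot] = b[pivot], b[k]
        for row in range(k + 1, m):
            factor = a[row][k] / a[k][k]
            for col in range(k, m):
                a[row][col] -= factor * a[k][col]
            b[row] -= factor * b[k]

    # Back substitution yields the volumes.
    volumes = [0.0] * m
    for k in reversed(range(m)):
        tail = sum(a[k][col] * volumes[col] for col in range(k + 1, m))
        volumes[k] = (b[k] - tail) / a[k][k]
    return volumes
```

Because the model is linear in the volumes, a weak peak overlapping a strong one can still be separated as long as their line-shapes differ in at least one dimension.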

For line shape integration, the command "Line-shape integration [li]" is used. Before it is applied, the lineshapes for each atom have to be defined and edited with the following commands

Lineshapes are saved into a file with extension ".ref". An example is given below:
# Number of dimensions 2
   1 -9999   0.000   0.000 -9999   0.000   0.000 
   2   774   4.211   4.266   155   4.206   4.258 
   3 -9999   0.000   0.000 -9999   0.000   0.000 
   4 -9999   0.000   0.000 -9999   0.000   0.000 
   5   155   3.057   3.167  1452   3.063   3.153 
The first line gives the dimensionality of the reference lineshape list. Each of the following lines defines for one atom the position in all dimensions where the lineshapes can be read out from the spectrum. The first number on each line is the atom number. The rest of the line is subdivided into groups of three numbers for each dimension. The first of these is the number of the reference peak; the second and third numbers give the lower and upper bound of the lineshape in ppm. The entry for atom 2 is illustrated in Figure 3.8.A.

The peak number "-9999" means that the w1 or w2 line-shape is not defined. The commands to load or save a line-shape list are "Load reference list [lr]" and "Write reference list [wr]".

The formats for the line shape lists of the programs EASY and EASY3D are different. Instead of specifying a reference peak, as in EASY and EASY3D, in XEASY a reference atom is specified for each dimension. Since this format is difficult to adapt to higher dimensional spectra it has been dropped. The old format can still be read by the XEASY program. However only the new format will be written out.
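The ".ref" layout written by XEASY, as described above, can be read with the following sketch. The names `parse_reference_lineshapes` and `UNDEFINED_PEAK` are hypothetical; the old EASY and EASY3D format is not handled.

```python
UNDEFINED_PEAK = -9999  # reference peak number marking an undefined line-shape


def parse_reference_lineshapes(text):
    """Return {atom number: [entry per dimension]} for a ".ref" file.

    Each entry is (reference peak, lower bound in ppm, upper bound in ppm),
    or None where the line-shape is not defined in that dimension.
    """
    lines = [line for line in text.splitlines() if line.strip()]
    # Header line, e.g. "# Number of dimensions 2".
    ndim = int(lines[0].split()[-1])

    table = {}
    for line in lines[1:]:
        fields = line.split()
        atom = int(fields[0])
        entries = []
        for dim in range(ndim):
            peak, low, high = fields[1 + 3 * dim:4 + 3 * dim]
            if int(peak) == UNDEFINED_PEAK:
                entries.append(None)
            else:
                entries.append((int(peak), float(low), float(high)))
        table[atom] = entries
    return table
```

Representing undefined line-shapes as `None` makes it easy to list the atoms for which a reference still has to be chosen before line-shape integration.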

Two additional commands exist to check the lineshapes. The command "Check ref. line-shape list [cr]" checks which lineshapes are defined and whether the lineshape positions and the atom frequencies are consistent. The command "View reference peak [vr]" displays for each peak with the same assignment as the selected one a 1D cross section in the slice window.

In 3D spectra, where overlap is a smaller problem, it suffices to partition the spectral data points into areas that belong to a given peak. Then the data points in these areas can be summed up to give the peak volume. The external program peakint (described in section 15.12 on page 106) is a routine to integrate peaks by partitioning. These two automated approaches have to be complemented by interactive methods where it is possible to specify the region to integrate in the spectrum. For interactive peak integration the commands "Interactive integration param. [ip]" and "Interactive integration [ii]" can be used.

A different problem occurs when determining rate constants by evaluating a time series of spectra. It is crucial that the same regions of each of the spectra at the different time points are integrated. This can be achieved by defining the regions to integrate by drawing rectangles (refer to section 3.9 on page 45), which can be saved and loaded again. The integration is performed with the command "Integrate rectangular regions [ir]".

Since NOESY spectral interpretation is an interactive process, where cross peak assignment, cross peak integration and structure calculation steps are repeated, efficient book keeping is crucial. For each peak, XEASY stores the integration method, the volume integral, and the error of the volume integral. The peak color can be used to reflect the state of the integration. In this way peaks with valid line shapes can be distinguished from peaks where lineshape information is missing or from peaks that were interactively integrated. Peaks can also be colored according to their integration method. To control the desired display mode the default window and the "Check ref. line-shape list [cr]" command are available.

In order to apply the lineshape integration selectively, peaks integrated interactively will not be changed by line shape integration. This allows the user to apply the lineshape integration routines to the whole spectrum without losing the information of the interactively integrated peaks. In order to change the volume of an interactively integrated peak by line shape integration its integration method must be set back to "-" (for example by using the "Set selected peak entries [pe]" command).

3.9 Geometries

A basic feature necessary to work with spectra is to draw lines and rectangles (i.e. geometries) to mark features in the spectrum. Correspondingly, a number of drawing commands exist in XEASY. The [ds] command draws lines from each selected peak to the diagonal. When applied to the TOCSY or COSY peaks of one fragment this corresponds to drawing spin system connectivities. The [df] command draws lines at selected frequencies. It can be used to identify a spin system using a TOCSY spectrum: clicking at all the peaks of a TOCSY tower will produce a grid identifying all the peaks belonging to one spin system.

Geometries can be removed with the commands "Remove lines [rl]" and "Remove rectangles [rr]". In order to remove all lines or rectangles at once these two commands can be combined with the "Apply to all ... [aa]" modifier command.

Geometries are saved in a file with the extension ".geom". The coordinates of the lines and rectangles are stored in ppm. A flag denotes whether the geometry is a line or a rectangle: "0" marks a line, "1" marks a rectangle. The commands to read or write a geometry file are "Load geometries [lg]" and "Write geometries [wg]".



-- DavidCowburn - 15 Jun 2005

Revision 2, 16 Jul 2008 - Main.DavidCowburn

 Contents XEASY manual: 3. The XEASY Model previous: 2. Using XEASY / contents

3. The XEASY Model

To enable computer supported spectra interpretation, an operational model is needed which defines the terms on which the computer interacts with the spectroscopist. This Model has to include the spectra which are the raw data and abstractions such as peaks, assignments, line shapes, geometries and strips which are derived from the spectra and constitute the results of the program. These different elements are detailed in this chapter.

3.1 Spectrum

A spectrum is a two-, three- or four-dimensional box of intensities containing the frequency domain data of an NMR experiment. Spectra are loaded using the command "New spectrum [ns]".

The intensity information for each data point is either encoded in 8 bits or in 16 bits. The format with 8 bits uses a logarithmic representation of the data with 1 byte per real data point. For a given data point sk the program first determines the integer l that minimizes the expression


                 3.1.A
(i. e. ) and then stores in one byte


                 3.1.B
This format can represent numbers approximately in the range , i. e. with a relative error of less than 20%. The format with 16 bits uses a 16 bit floating point format with the "exponent" ek given by Eq. 3.1.B in the lower valued byte and the mantissa

                 3.1.C

( if ) in the higher valued byte (Eccles et al., 1991). This format can represent numbers in the same range as the 8 bit format but with a relative error of less than 1%.

The intensity data is stored in a spectrum data file with the extension ".3D.8" or ".3D.16", depending on the accuracy used. The information about the spectrum is kept in the parameter file with the extension ".3D.param". The example below

Version ....................... 1
Number of dimensions .......... 2
16 or 8 bit file type ......... 16
Spectrometer frequency in w1 .. 60.811001
Spectrometer frequency in w2 .. 600.138000
Spectral sweep width in w1 .... 34.840000
Spectral sweep width in w2 .... 6.719000
Maximum chemical shift in w1 .. 135.384872
Maximum chemical shift in w2 .. 11.538800
Size of spectrum in w1 ........ 128
Size of spectrum in w2 ........ 512
Submatrix size in w1 .......... 256
Submatrix size in w2 .......... 256
Permutation for w1 ............ 2
Permutation for w2 ............ 1
Folding in w1 ................. RSH
Folding in w2 ................. RSH
Type of spectrum .............. C
is from a 2D spectrum. The intensities are stored in the 16 bit format. The spectrometer frequencies are given in MHz and the spectral sweep widths in ppm for both dimensions. The maximum chemical shifts give the ppm position of the lower left corner of the spectrum. The total size of the spectrum is given in data points. The submatrix size gives the size of the blocks used to store the intensities in order to allow fast access to small parts of the spectrum. The permutation defines the sequential order in which the intensities are stored in the data file. The folding has to be set to either RSH, for Ruben States Haberkorn (Marion, D., Ikura, M., Tschudin, R. & Bax, A. (1989) J. Magn. Reson. 85, 393-399), or TPPI, for time proportional phase increment (Marion, D. and Wüthrich, K. (1983) Biochem. Biophys. Res. Comm. 113, 967-974). The folding type is used to fold the peaks into the spectrum when loading a peak list. The type of spectrum is not used by the program.
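
The layout of the parameter file shown above (a label, a run of filler dots, then a value) can be read with a few lines of Python. This is only a sketch inferred from the example; the function names are hypothetical and nothing here is taken from the XEASY source code.

```python
import re

# Lines look like "Size of spectrum in w1 ........ 128": a label, a run of
# filler dots and a value. This layout is inferred from the example only.
PARAM_LINE = re.compile(r"^(.*?)\s\.{2,}\s+(\S+)\s*$")

def parse_param_text(text):
    """Return a dict mapping each label to its value (kept as a string)."""
    params = {}
    for line in text.splitlines():
        match = PARAM_LINE.match(line.strip())
        if match:
            params[match.group(1)] = match.group(2)
    return params

def ppm_per_point(params, dim):
    """Sweep width divided by the number of points for 'w1', 'w2', ..."""
    width = float(params[f"Spectral sweep width in {dim}"])
    size = int(params[f"Size of spectrum in {dim}"])
    return width / size

example = """\
Version ....................... 1
Number of dimensions .......... 2
16 or 8 bit file type ......... 16
Spectral sweep width in w1 .... 34.840000
Spectral sweep width in w2 .... 6.719000
Size of spectrum in w1 ........ 128
Size of spectrum in w2 ........ 512
Folding in w1 ................. RSH
"""
params = parse_param_text(example)
print(params["Number of dimensions"], params["Folding in w1"])
print(ppm_per_point(params, "w1"))
```

Values are kept as strings so that entries such as the folding type and the numeric entries can be handled by the same parser; the caller converts what it needs.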

To set the calibration of a spectrum the command "Calibration [ca]" is used. Since the peak positions are stored in ppm it is not recommended to work with uncalibrated spectra. The way the parameter file is interpreted is influenced by the value of the resource "XEasy*traditional_calibration".

A spectrum can be turned around without changing the data file by editing the parameter file. The following parameter file declares the data file from the above example to be a three-dimensional spectrum that extends over only one data point in the third dimension. An arbitrary calibration is used in w1:

Version ....................... 1
Number of dimensions .......... 3
16 or 8 bit file type ......... 16
Spectrometer frequency in w1 .. 500.000000
Spectrometer frequency in w2 .. 60.811001
Spectrometer frequency in w3 .. 600.138000
Spectral sweep width in w1 .... 10.000000
Spectral sweep width in w2 .... 34.840000
Spectral sweep width in w3 .... 6.719000
Maximum chemical shift in w1 .. 500.000000
Maximum chemical shift in w2 .. 135.384872
Maximum chemical shift in w3 .. 11.538800
Size of spectrum in w1 ........ 1
Size of spectrum in w2 ........ 128
Size of spectrum in w3 ........ 512
Submatrix size in w1 .......... 1
Submatrix size in w2 .......... 256
Submatrix size in w3 .......... 256
Permutation for w1 ............ 3
Permutation for w2 ............ 2
Permutation for w3 ............ 1
Folding in w1 ................. RSH
Folding in w2 ................. RSH
Folding in w3 ................. RSH
Type of spectrum .............. C
The different ways of generating spectrum files in the XEASY format are listed below. A description of conversion programs between different formats is given under external programs.
  • directly from PROSA: using the spectrum processing program PROSA (P. Güntert, V. Dötsch, G. Wider and K. Wüthrich, J. Biomol. NMR, 2 (1992) 619-629) data suitable for XEASY may be written out with the command write easy8 [filename] or write easy16 [filename]
  • converting a Bruker smx file into a 2D spectrum using the filegen2d program.
  • converting a series of Bruker smx files into one 3D spectrum using the filegen3d program.
  • copying Bruker smx files from the X32 to the workstation and converting them into 2D or 3D spectra. This can be done with the shell scripts cpx32 and cpx32_16bit.
  • converting a spectrum from the old EASY format into the new XEASY format with the program filecon2d.

In addition to the file format for XEASY, two outdated spectral formats from the EASY and EASY3D programs exist. The format from the EASY program can no longer be read by the XEASY program. These files have to be converted by the filecon2d program. Files used in EASY3D can still be read by the XEASY program. Since these parameter files contain binary information, they must not be edited.

    3.2 Zoom Regions

    Many routines of the XEASY program display selected regions of a spectrum. One dimensional cross sections of the spectrum are displayed as plots of intensity vs. data points in the slice window (see section 12.8 on page 95). Two dimensional regions of the spectrum may be shown in the main window as contour plots or as intensity plots. The intensity plot maps the intensities onto a color scale. The contour plot draws lines of equal intensity. The following commands may be used to change between the different display modes:

    3.2.1 Selecting Zoom Regions

    XEASY provides routines to define and select regions from the spectrum for displaying. The interactive routines for zooming using no information from peaks, assignments or strip sequences are described below. In addition, there exist a number of commands which select zoom regions based on picked peaks or on strip sequences. These are described in the corresponding chapters about peak lists (page 23) and strip sequences (page 36).

    To select an arbitrary region in the spectrum and also to restore the full spectrum the command

    "Permutation [pm]"

    can be used. This command as well as the "New spectrum [ns]" command define the view onto the spectrum, that is which dimension will be displayed horizontally (in x direction) and which vertically (in y direction). Many routines within XEASY refer to this view.

    The following commands allow the definition of regions to be displayed by manually selecting their boundaries within the currently displayed zoom region:

    In addition, the two-letter code "mo" may be used to select a region in the overview window. Commands are also available to modify the size or position of the displayed regions. The zoom commands that move or resize regions take the zoom factor as a parameter, which can be defined using the command

    "Zoom factor [zf]".

    3.2.2 Comparing Zoom Regions

    To compare different spectra, it is possible to display the contour plot of one spectrum on top of the intensity plot of another spectrum. The command

    "Alternative spectrum [as]"

    may be used to define additional spectra, from which the commands

    "Replace contour [rc]"
    "Replace spectrum [rs]"

    select the ones to display as contour and intensity plots. The spectra currently used for the intensity plot and the contour plot are indicated in the upper left corner of the main window. The upper name gives the spectrum used for intensity plots, the lower name the one used for the contour plot. The routines in XEASY which access the intensities of a spectrum always use the spectrum of the intensity plot. To compare different regions the command

    "Zoom alignment [za]"

    is used to get several aligned regions. The large cursor

    "Draw cursor [dc]"

    allows the precise comparison of the positions of peaks within these aligned regions. When the intensity plots of different spectra are displayed together, it might be necessary to adjust the scale of color vs. intensity of one or more of them. This is possible with the command

    "Scale display [sd]".

    3.2.3 Zoom Region History

    A history of the displayed zoom regions is kept and is used to call back previous displays. The command

    "Restore zoom [rz]"

    brings back the last stored display. Commands that can easily be reversed (e.g. the commands to move around a zoom region) are not stored in the zoom region history and cannot be restored with the [rz] command. The two-letter codes "zb", "zf" and "zo" can be used in the overview window to select any zoom from the zoom history.

    Zooms may also be written to a file and recovered again. The commands are

    "Load zoom [lz]".
    "Write zoom [wz]".

    These files have the extension .zoom and contain the information about the zoom regions. The data in the zoomed regions are not stored, but rather the ppm coordinates of the regions in the spectrum are saved. Whenever such a file is loaded, the display will be accordingly updated.

    3.3 Phase Correction

    The program XEASY can be used to determine phase correction parameters interactively on the basis of 1D cross-sections that are displayed in the slice window (see "Slice window [sw]").

    With the conventions used by the program PROSA (P. Güntert, V. Dötsch, G. Wider and K. Wüthrich, J. Biomol. NMR, 2 (1992) 619-629), the phase-corrected spectrum is related to the original spectrum by

    
    
    3.1.B

    for the data points 0, ..., n-1 in every row along the dimension of interest. The two parameters in this expression are the constant (zero order) and the linear (first order) phase correction parameters. XEASY allows these two parameters to be changed in real time while the corresponding phase-corrected 1D cross-sections are displayed, and thus provides a convenient environment for the accurate interactive selection of phase correction parameters. The phase correction parameters found with the help of XEASY can subsequently be used in data processing programs such as PROSA to perform the phase correction on the complete multidimensional spectrum.

    To use the interactive phase correction routine in XEASY, two separate spectrum files containing the real and imaginary parts of the complex spectrum are first prepared (e.g., with the program PROSA). The real part of the spectrum is then read into XEASY with the "New spectrum [ns]" command and the imaginary part of the spectrum is defined as alternative spectrum with the "Alternative spectrum [as]" command. Next the user selects and displays suitable rows (or columns) in the slice window (see "Slice window [sw]") and clicks the Phase button in the slice window. Phase correction will be performed for the spectrum in the dimension of the current slice (i.e., the slice for which the Current button on top of the slice window is activated). The alternative spectrum window pops up and the user selects the file containing the imaginary part of the spectrum. The constant and linear phase correction parameters can now be adjusted using the keyboard according to table 3.3.

    The linear phase correction parameter is changed such that the phase of the current slice at the cursor position in the slice window remains constant. The actual values of the phase correction parameters are monitored in the lower left corner of the slice window.
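
The interplay of the two parameters can be illustrated with a short Python sketch. Since the exact phase correction formula is not reproduced in this text, the convention used here is an assumption: data point k of n is rotated by the constant parameter plus the linear parameter times k/n, with angles in degrees. The function names are hypothetical. The second function shows one way to change the linear parameter while keeping the phase at the cursor position constant, as described above.

```python
import cmath
import math

def phase_correct(real, imag, phi0_deg, phi1_deg):
    """Return the real part of the phase-corrected row.

    ASSUMED convention (not taken from the manual's equation):
    point k of n is rotated by phi0 + phi1 * k / n, angles in degrees.
    """
    n = len(real)
    corrected = []
    for k, (re_k, im_k) in enumerate(zip(real, imag)):
        angle = math.radians(phi0_deg + phi1_deg * k / n)
        corrected.append((complex(re_k, im_k) * cmath.exp(1j * angle)).real)
    return corrected

def change_phi1_about_pivot(phi0_deg, phi1_deg, delta_deg, pivot, n):
    """Change phi1 by delta_deg while keeping the total phase at point
    'pivot' (e.g. the cursor position) unchanged by compensating phi0."""
    new_phi1 = phi1_deg + delta_deg
    new_phi0 = phi0_deg - delta_deg * pivot / n
    return new_phi0, new_phi1

# A row whose signal sits entirely in the imaginary part is brought into
# the real part by a constant correction of -90 degrees.
row_real = [0.0, 0.0, 0.0, 0.0]
row_imag = [1.0, 1.0, 1.0, 1.0]
print(phase_correct(row_real, row_imag, -90.0, 0.0))
print(change_phi1_about_pivot(10.0, 20.0, 40.0, pivot=256, n=512))
```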

    The user deactivates the phase correction mode by clicking the Phase button again. The 1D cross-sections of the original spectrum (i.e., the real part loaded with the [ns] command) are displayed again.

    3.4 Assignments

    Assignments represent the main result of the work with the program XEASY. Three lists, namely the peak list, the atom list and the fragment list contain the information about the assignments. The peak list contains the coordinates of the picked peaks used to assign the spectrum. The atom list contains the names and frequencies of possible resonances. They define the possible assignments for each dimension of a peak. For homonuclear, single quantum, proton spectra the atom list contains the names and frequencies of all protons in the molecule. In this case it corresponds to the proton list used in the programs EASY and EASY3D. The fragment list contains the names of the fragments which are used at the different stages of the assignment process. In early stages of the assignment, spin systems independent of the primary sequence of the molecule are used. Then increasingly more of them become mapped onto the primary sequence of the protein until finally the residues of the molecule under investigation are used. In this final stage the fragment list corresponds to the sequence list of the programs EASY and EASY3D.

    In addition to these three lists, XEASY uses a library file defining the atoms and pseudo atoms for each fragment type. The following chapters provide first, a detailed description of the three lists and the library file, and then a description of how they can be used in conjunction in order to proceed through the different stages of the assignment process.

    3.4.1 Peak List

    A peak list contains entries for peaks picked in the spectrum. The following paragraphs present in detail the peak list file format, peak picking, assigning peaks, editing peaks, displaying relevant information from a peak list, folding, and how to treat different dimensionality of peak lists and spectra.

    A peak list contains an entry for each peak. The fields describing a peak are listed in Table 3.4.1.A. In a 2D spectrum the fields for the w3 and w4 dimension and in a 3D spectrum those for w4 are not used. The peak numbers in a peaklist must be unique but not necessarily continuous.

    Table 3.4.1.A Peak Fields

    Dim.  Fields              Description
          peak number         unique number identifying the peak
          colour              colour in the range [1,6] used for displaying the peaks
          volume              volume of the peak
          volume error        volume error in percent
          integration method  method used for the integration: d, r, e, m, a, -
          comment             user defined comment
          possible ass.       data structure containing possible assignments
    w1    shift               folded w1 position in ppm
    w1    fold                number of times peak is folded in w1
    w1    atom number         assignment in w1: reference into the atom list
    w2    shift               folded w2 position in ppm
    w2    fold                number of times peak is folded in w2
    w2    atom number         assignment in w2: reference into the atom list
    w3    shift               folded w3 position in ppm
    w3    fold                number of times peak is folded in w3
    w3    atom number         assignment in w3: reference into the atom list
    w4    shift               folded w4 position in ppm
    w4    fold                number of times peak is folded in w4
    w4    atom number         assignment in w4: reference into the atom list

      The peak list is stored in a peak list file with extension .peaks. It can be read by the program ASNO (P. Güntert et al., 1993) which uses a peak list, an atom list and selected structure coordinate files to generate possible assignments for NOESY cross peaks which can be loaded into XEASY. The program CALIBA (P. Güntert et al., J. Mol. Biol. (1991) 217, 517-530) translates the peak lists containing integrated peaks into distance constraints which can be used by the program DIANA (P. Güntert et al., J. Mol. Biol. (1991) 217, 517-530). An example of the first few lines of a two dimensional peak list is given below:
    # Number of dimensions 2
    
    
    1 7.289 10.169 1 ? 2.048e+03 0.00e+00 - 0 126 128 0
    # first peak
    2 7.119 9.413 1 ? 1.280e+02 0.00e+00 - 0 517 506 0
    3 7.106 7.497 1 ? 4.096e+03 0.00e+00 - 0 129 130 0
    4 7.228 7.411 1 ? 4.096e+03 0.00e+00 - 0 131 127 0
    5 7.106 7.411 1 ? 5.120e+02 0.00e+00 - 0 327 328 0
    6 6.838 7.094 1 ? 8.192e+03 0.00e+00 - 0 489 488 0

    The number on the first line after the hash "#" indicates the dimensionality of the peak list. Subsequent lines starting with a number contain the fields for one peak. Additional lines starting with a hash "#" are comments for the peak on the line above. The first field for each peak is the peak number, followed by: the unfolded chemical shift coordinates in ppm in w1 and w2 (more numbers are listed for higher dimensional spectra), the color code (a number from 1 to 6), the user defined type of the spectrum in which the peak is observed, the peak volume, the uncertainty of the volume in percent, the integration method ("d" for Denk integration, "r" for rectangular integration, "e" for elliptical integration, "m" for maximum integration, "a" for automatic integration, "-" for not integrated), an unused number, the atom numbers giving the assignments in w1 and w2 (more numbers are listed for higher dimensional spectra), and a final number that is not used. The commands to load or write peak lists are "Load peaklist [lp]" and "Write peaklist [wp]".
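
The field order described above can be turned into a small reader. The following Python sketch is based only on that description; the class and function names are hypothetical and the two unused numbers are simply skipped.

```python
from dataclasses import dataclass

@dataclass
class Peak:
    number: int
    shifts: list          # unfolded ppm coordinates, one per dimension
    color: int            # colour code 1..6
    spectrum_type: str    # user defined type of the spectrum
    volume: float
    volume_error: float   # in percent
    method: str           # d, r, e, m, a or -
    atoms: list           # atom numbers, one per dimension
    comment: str = ""

def parse_peak_list(text):
    """Parse peak list text following the field order given in the manual."""
    ndim = None
    peaks = []
    for raw in text.splitlines():
        line = raw.strip()
        if not line:
            continue
        if line.startswith("#"):
            body = line[1:].strip()
            if body.startswith("Number of dimensions"):
                ndim = int(body.split()[-1])
            elif peaks:
                # A later "#" line is a comment for the peak above it.
                peaks[-1].comment = body
            continue
        if ndim is None:
            raise ValueError("missing '# Number of dimensions' header")
        f = line.split()
        peaks.append(Peak(
            number=int(f[0]),
            shifts=[float(x) for x in f[1:1 + ndim]],
            color=int(f[1 + ndim]),
            spectrum_type=f[2 + ndim],
            volume=float(f[3 + ndim]),
            volume_error=float(f[4 + ndim]),
            method=f[5 + ndim],
            # f[6 + ndim] is the unused number before the assignments.
            atoms=[int(x) for x in f[7 + ndim:7 + 2 * ndim]],
        ))
    return peaks

sample = """\
# Number of dimensions 2
1 7.289 10.169 1 ? 2.048e+03 0.00e+00 - 0 126 128 0
# first peak
2 7.119 9.413 1 ? 1.280e+02 0.00e+00 - 0 517 506 0
"""
peaks = parse_peak_list(sample)
print(peaks[0].number, peaks[0].shifts, peaks[0].atoms, peaks[0].comment)
```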

    Loading a peak list with a different dimensionality than the spectrum is possible. If the peak list has more dimensions than the spectrum, the additional dimensions are simply ignored when working with the peak list. When loading a peak list with fewer dimensions than the spectrum, it has to be specified whether the additional dimensions needed are copied from the available dimensions or set to a fixed ppm value. For example, the 2D peak list from a [1H,15N]-HMQC spectrum can be loaded onto a 3D 15N-correlated [1H,1H]-NOESY by copying the 1H chemical shift of the 2D list to both 1H dimensions in the 3D spectrum. The resulting peaks are positioned on the diagonal of the 1H planes in the 3D spectrum.
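
The two options for filling additional dimensions (copying an available dimension or using a fixed ppm value) can be sketched as follows. The function name, the "recipe" notation and the dimension order of the example spectra are assumptions made for illustration only.

```python
def expand_shifts(shifts, recipe):
    """Build higher-dimensional coordinates from a lower-dimensional peak.

    'recipe' has one entry per target dimension: ("copy", i) takes source
    dimension i (0-based), ("fixed", ppm) uses a constant ppm value.
    """
    expanded = []
    for kind, value in recipe:
        if kind == "copy":
            expanded.append(shifts[value])
        elif kind == "fixed":
            expanded.append(float(value))
        else:
            raise ValueError(f"unknown recipe entry: {kind}")
    return expanded

# Hypothetical dimension order: a 2D peak (1H, 15N) is placed into a
# 3D spectrum ordered (1H, 15N, 1H) by copying the 1H shift twice,
# which puts the peak on the diagonal of the 1H planes.
hmqc_peak = [8.25, 118.4]
noesy_peak = expand_shifts(hmqc_peak, [("copy", 0), ("copy", 1), ("copy", 0)])
print(noesy_peak)
```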

    Peaks may be picked either automatically or manually. For automatic peak picking of anti-phase peaks in 2D spectra the command "Anti-phase peaks [an]" is used. For automatic picking of in-phase peaks the external program pick described on page 105 or the command "In-phase peak picking [in]" can be used.

    Issuing the command "Peak picking [pp]" enters the manual peak picking mode. By selecting a position on the screen a peak placed at the corresponding position of the spectrum is added to the peak list. In 3D and 4D spectra the added peak is picked at the position of the displayed 2D region.

    The peak list file always contains the unfolded chemical shifts. When a peak list is loaded onto a spectrum, all peaks are folded into the spectrum. The folding information is stored in the field fold. When the peak list is written out, this field is used to back-calculate the unfolded peak positions. In this manner the unfolded chemical shifts of a peak can be transferred from a spectrum with a large sweep width (e.g. a 2D HMQC experiment) to spectra with smaller sweep widths (e.g. 3D 15N correlated [1H,1H] spectra). When picking additional peaks, one wants to retain the folding information from peaks already present in the spectrum. This is possible with the command "Copy and move peaks [cm]", which copies the folding information and the assignment from an existing peak to new peaks. Other ways to set the folding of a peak are to enter the unfolded chemical shift into the w field of the peak editing window or to use the "Set selected peak entries [pe]" command, which can set or reset any field in the peak list.
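
The relation between an unfolded shift, the folded position and the fold count can be sketched in Python. This is a simplified illustration under stated assumptions: only wrap-around aliasing into the window between (maximum chemical shift - sweep width) and the maximum chemical shift is modelled, the sign convention of the fold count is chosen here for convenience, and the difference between the RSH and TPPI folding types is not handled.

```python
import math

def fold_into_window(unfolded_ppm, max_ppm, sweep_width):
    """Alias a shift into the window (max_ppm - sweep_width, max_ppm].

    Returns (folded_ppm, fold) such that
    unfolded_ppm == folded_ppm + fold * sweep_width.
    ASSUMPTION: plain wrap-around; the sign of 'fold' is our own choice.
    """
    fold = math.ceil((unfolded_ppm - max_ppm) / sweep_width)
    folded = unfolded_ppm - fold * sweep_width
    return folded, fold

def unfold(folded_ppm, fold, sweep_width):
    """Back-calculate the unfolded shift from the stored fold count."""
    return folded_ppm + fold * sweep_width

# Window taken from the w1 dimension of the parameter file example
# in section 3.1: maximum 135.384872 ppm, sweep width 34.84 ppm.
folded, fold = fold_into_window(140.0, 135.384872, 34.84)
print(folded, fold, unfold(folded, fold, 34.84))
```

Storing the fold count next to the folded position is what allows the same peak list to be reloaded onto a spectrum with a different sweep width without losing the true chemical shift.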

    Once peaks are picked, they may be assigned to atoms present in the atom list. This is done either for a single peak using the command "Assign peak [ap]" or for all selected peaks within a region using the command "Assign peaks in one region [ar]".

    For example, all peaks in the fingerprint region of a homonuclear proton COSY spectrum can be assigned to the HN and Ha atoms of a set of spin systems, or all peaks in a [1H,15N]-HMQC spectrum may be assigned to the N and HN atoms of backbone fragments.

    When working with a peak list it is crucial to display information relevant for the current assignment task. First, often only a subset of peaks is relevant (e.g. only the peaks assigned to a certain fragment). The commands "Select peak [sp]" and "Display all peaks [da]" provide different criteria for selecting peaks to display. In 3D and 4D spectra, usually peaks of interest are only those close to the displayed 2D zoom region. The option "Maximal distance in planes" in the default window controls the number of adjacent planes from which to display peaks.

    Second, peaks with different properties (e.g. assigned peaks versus unassigned peaks) can be displayed with different colors or peak shapes. The options "Peak color determined by", "Peak cross displayed is", "Lineshape displayed for" and "Coloring interval" in the default window control the peak colors.

    Third, peaks may also be labelled with their assignment or the volume. The command "Peak data window [pw]" pops up the peak data window in which the peak labels may be defined.

    Fourth, peaks may be used to zoom interesting regions out of a spectrum. In the simplest case, the command "View peak [vp]" zooms a region that contains a specified peak. The command "Zoom peak [zp]" displays a small region around each of the peaks selected with the "Select peak [sp]" command. The command "View reference peak [vr]" displays, for each peak with the same assignment as the selected one, a slice in the slice window. Another powerful method to select relevant regions out of a spectrum is the strip list. It is described in section 6.0 on page 56.

    It is possible to work with several peak lists together. When a peak list is loaded it is kept in memory until it is removed from memory with the

    "Erase peak list [ep]"

    command. To select a different peak list for displaying with a spectrum the command

    "Exchange peak list [xc]"

    is used. This command does not change the dimensionality or the folding of the loaded peak list. To adapt the folding of the peaks to a certain spectrum, to change the dimensionality or to permute the dimensions the command "Adapt peaks to spectrum [ad]" is used.

    Several commands exist to edit the peak list entries.

    Worth mentioning is the "Move reference peak [mr]" command. It moves all peaks with the same assignment together. This helps when adjusting a peak list picked and assigned in one spectrum to a related, but slightly different spectrum. Two examples are pH titrations and the use of a TOCSY peak list to identify the intra-residual peaks in a NOESY spectrum.

    To check or to list the assignments, the commands "List peak entries [le]" and "Report ass. stat.[ra]" are used. They check for duplicated assignments, i.e. two peaks that have the same assignment, and for large chemical shift errors, i.e. peaks at different frequencies which are assigned to the same atom. Both kinds of information are useful for identifying wrong assignments.

    3.4.2 Atom List

    The atom list contains the names and frequencies of resonances. They define the possible assignments for each dimension of a peak.

    Table 3.4.2.A - Atom Fields

    Fields           Description
    atom number      unique number identifying the atom
    shift            mean chemical shift in ppm
    shift error      deviation of the assigned peaks from the mean value
    name             atom name
    fragment number  number of the fragment to which the atom belongs
    lineshapes       data structure containing the reference lineshapes

      The fields listed in Table 3.4.2.A constitute an atom entry. They can be modified in the peak editing window. The atom numbers, used to reference the atoms, must be unique but not necessarily continuous. The number -9999 is reserved to denote invalid entries. The average chemical shift and the shift error can be calculated from the assigned peaks. The command is

    "Average chem. shift [ac]".

    If the chemical shift is not defined it is set to the value 999.000.

    A new atom list is generated each time a fragment list is loaded. For each fragment the corresponding atoms are looked up in the fragment library file and added to the list. New atom entries are added to the atom list if a non-existing atom is used with the "Assign peak [ap]" command, if the fragment type is changed in the peak editing window, or with the "Add new fragment [af]" command.
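
The generation of an atom list from a fragment list can be sketched as a simple lookup. The library below is a hypothetical, shortened stand-in for the real fragment library (only atom names that appear in this chapter are used), and numbering the atoms consecutively from 1 is an assumption based on the example in section 3.4.5.

```python
UNDEFINED_SHIFT = 999.000  # value the manual gives for an undefined shift

# Hypothetical, shortened library: fragment type -> atom names.
MINI_LIBRARY = {
    "SS": ["N", "HN", "CA", "HA", "QA", "CB", "HB2", "HB3", "QB"],
}

def build_atom_list(fragments, library=MINI_LIBRARY):
    """fragments: list of (fragment_type, fragment_number) tuples.

    Returns rows (atom_number, shift, shift_error, name, fragment_number)
    with atoms numbered consecutively, starting at 1.
    """
    rows = []
    atom_number = 1
    for fragment_type, fragment_number in fragments:
        for name in library[fragment_type]:
            rows.append((atom_number, UNDEFINED_SHIFT, 0.0, name, fragment_number))
            atom_number += 1
    return rows

rows = build_atom_list([("SS", 201), ("SS", 202)])
for row in rows[:3]:
    print("%d %.3f %.3f %s %d" % row)
```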

    The atom list file has the extension ".prot" originating from the old EASY format. The following line is taken from such a file:

    
    
    32 4.370 0.004 HA 2

     The first number is the atom number, followed by its mean chemical shift and the deviation from the mean value. The atom name and the fragment number follow. The commands to read or write an atom list are

    "Load atoms (chem. shift) [lc]" and "Write atoms (chem. shift) [wc]".
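
A line of the atom list file can be split into its five fields as described above. The following Python sketch uses a hypothetical function name and treats the value 999.000 as "shift not defined", as stated earlier in this section.

```python
def parse_atom_line(line):
    """Split one atom list line into its five fields.

    Field order as described in the manual: atom number, mean chemical
    shift, shift error, atom name, fragment number.
    """
    number, shift, error, name, fragment = line.split()
    shift = float(shift)
    return {
        "number": int(number),
        # 999.000 marks an undefined chemical shift.
        "shift": None if shift == 999.0 else shift,
        "shift_error": float(error),
        "name": name,
        "fragment": int(fragment),
    }

atom = parse_atom_line("32 4.370 0.004 HA 2")
print(atom)
```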

    3.4.3 Fragment List

    The fragment list contains the fragments currently used for the assignment. Depending on the stage of the assignment, the fragments are either spin systems or residues of the molecule. In the latter case, the fragment list corresponds to the sequence list used in the programs EASY and EASY3D.

    Table 3.4.3.A Fragment Fields

    Fields           Description
    fragment number  unique number identifying the fragment
    fragment type    name of the fragment
    mapping number   used to map spin system fragments to residue fragments
    comment          used to store possible spin system types or possible sequential neighbours

    The fields listed in Table 3.4.3.A constitute a fragment. The fragment numbers, which are used in the atom list to reference fragments, must be unique but not necessarily continuous. The number -9999 is reserved to denote invalid entries. The fragment type must be defined in the fragment library. When starting a resonance assignment it is usually set to "SS", meaning that there is no information about the specific spin system type. In later stages it may be changed, for example to "ASP", using the peak editing window. The command "Add new fragment [af]" adds further fragments to the list.

    The mapping number defines a mapping from one set of fragments (e.g., spin systems) to another set (e.g., amino acid residues). When working with spin system fragments the mapping number is the residue number of the spin system. For example, a fragment with number 207, type GLY and mapping number 55 describes spin system 207, which probably corresponds to GLY 55. The mapping number -1 is reserved to denote invalid mappings. The comment can, for example, be used to store the fragment numbers of the sequentially neighboring spin systems. When using the automatic sequential assignment routine "Sequential assignment [op]", the fragment comment can be used to indicate the probable amino acid types for each spin system.

    The fragment list is stored in a file with extension ".seq". If the first line starts with a hash "#", it is treated as a comment. The subsequent lines list the fragment types with one line per fragment. For example, the following file defines a tripeptide:

    # tripeptide
    ASP 0
    GLY 5 203
    LYS 7 209 "-1: 203"
    
    The number, if given, after the fragment name denotes the fragment number. If there is no number given for a fragment, its number will be assigned according to the last fragment number plus one. If no number is given for the first fragment, it will be set to one. The optional second number is the mapping number. An entry "-1" for the mapping number indicates that it is not defined. The optional text within the quotes is the comment. If the comment is present the mapping number must also be listed.

    The commands to read or write a sequence file are "Load sequence [ls]" and "Write sequence [ws]".
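
The numbering rules for the fragment list file (optional fragment number, optional mapping number, optional quoted comment) can be sketched as a small parser. The function name is hypothetical; the rules are the ones stated above, with -1 used for an undefined mapping number.

```python
import shlex

def parse_fragment_list(text):
    """Return a list of (type, number, mapping, comment) tuples."""
    lines = text.splitlines()
    # A first line starting with "#" is treated as a comment.
    if lines and lines[0].lstrip().startswith("#"):
        lines = lines[1:]
    fragments = []
    last_number = 0  # so that an unnumbered first fragment becomes 1
    for raw in lines:
        if not raw.strip():
            continue
        tokens = shlex.split(raw)  # keeps the quoted comment as one token
        fragment_type = tokens[0]
        number = int(tokens[1]) if len(tokens) > 1 else last_number + 1
        mapping = int(tokens[2]) if len(tokens) > 2 else -1
        comment = tokens[3] if len(tokens) > 3 else ""
        fragments.append((fragment_type, number, mapping, comment))
        last_number = number
    return fragments

tripeptide = '# tripeptide\nASP 0\nGLY 5 203\nLYS 7 209 "-1: 203"\n'
for fragment in parse_fragment_list(tripeptide):
    print(fragment)
```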

    3.4.4 Fragment Library

    The fragment library defines the different fragment types. All the atoms constituting a given fragment type are listed. The format is that of the program DIANA (P. Güntert et al., J. Mol. Biol. (1991) 217, 517-530). Amino acid fragments, nucleic acid fragments, a general spin system fragment and a general amino acid backbone fragment are currently defined. The extension is ".lib". The default library, defined by the environment variable XEASY_LIB, is loaded when starting up XEASY. To load a different library the command "Load library [ll]" is used.

    3.4.5 From Spin System Assignments to Residue Assignments

    A reasonable setup for starting out with resonance assignments is to define a set of general spin systems in the fragment list and generate the corresponding atom list. The first few lines of such a fragment list are given below:
      SS 201
      SS 202
      SS 203
      SS 204
      SS 205
      SS 206
      SS 207
      SS 208
      SS 209
      ...
    
    Some lines extracted from the corresponding atom list, which is generated by loading the fragment list into XEASY, are given as well:

      1 999.000 0.000 N 201
      2 999.000 0.000 HN 201
      3 999.000 0.000 CA 201
      4 999.000 0.000 HA 201
      5 999.000 0.000 QA 201
      6 999.000 0.000 CB 201
      7 999.000 0.000 HB2 201
      8 999.000 0.000 HB3 201
      9 999.000 0.000 QB 201
      ...
      34 999.000 0.000 N 202
      35 999.000 0.000 HN 202
      36 999.000 0.000 CA 202
      37 999.000 0.000 HA 202
      38 999.000 0.000 QA 202
      39 999.000 0.000 CB 202
      40 999.000 0.000 HB2 202
      41 999.000 0.000 HB3 202
      42 999.000 0.000 QB 202
      ...

    With these two lists, peaks may be grouped by assigning them to the same fragment and further classified by assigning them to certain atoms within the fragments. For example, when working with heteronuclear spectra, each peak in a [1H,15N] HMQC experiment can be assigned to the N and HN atoms of different fragments. Similarly, when working with homonuclear spectra, each COSY cross peak in the fingerprint region may be assigned to the Ha and HN atoms of a different fragment. Peaks in the COSY or TOCSY spectrum lying on the same amide frequency as the HN Ha peak can then be assigned to the HN Hb2, HN Hg2, ... of the same fragment.

    After the resonance assignment has been finished, the fragments of choice are the amino acid residues and the atom list has entries for the atoms in the molecule. The first few lines of the fragment list for the interleukin receptor antagonist are given below:

    
    ARG 1
    PRO 2
    SER 3
    GLY 4
    ARG 5
    LYS 6
    
    Some lines extracted from the corresponding atom list with undefined chemical shifts are given below:
    
    
      1 999.000 0.000 C 1
      2 999.000 0.000 CA 1
      3 999.000 0.000 CB 1
      4 999.000 0.000 CD 1
      5 999.000 0.000 CG 1
      6 999.000 0.000 CZ 1
      7 999.000 0.000 HA 1
      8 999.000 0.000 HB2 1
      9 999.000 0.000 HB3 1
      10 999.000 0.000 HD2 1
      11 999.000 0.000 HD3 1
      12 999.000 0.000 HE 1
      13 999.000 0.000 HG2 1
      14 999.000 0.000 HG3 1
      15 999.000 0.000 HH1 1
      16 999.000 0.000 HH21 1
      17 999.000 0.000 HH22 1
      18 999.000 0.000 HN 1
      19 999.000 0.000 N 1
      20 999.000 0.000 NE 1
      21 999.000 0.000 NH1 1
      22 999.000 0.000 NH2 1
      23 999.000 0.000 QB 1
      24 999.000 0.000 QD 1
      25 999.000 0.000 QG 1
      26 999.000 0.000 QH2 1
      27 999.000 0.000 C 2
      28 999.000 0.000 CA 2
      29 999.000 0.000 CB 2
      30 999.000 0.000 CD 2
      31 999.000 0.000 CG 2
      32 999.000 0.000 HA 2
      33 999.000 0.000 HB2 2
      34 999.000 0.000 HB3 2
      35 999.000 0.000 HD2 2
      36 999.000 0.000 HD3 2
      37 999.000 0.000 HG2 2
      38 999.000 0.000 HG3 2
      39 999.000 0.000 N 2
      40 999.000 0.000 QB 2
      41 999.000 0.000 QD 2
      42 999.000 0.000 QG 2
      ...

    These two lists can be used to make assignments of peaks to atoms in the molecule under investigation. For example, with these lists NOESY cross peaks can be assigned and integrated in order to extract distance constraints.

    Starting from peaks assigned to spin system atoms, a method is needed to proceed to peaks assigned to atoms of the residues in the molecule. Changing assignments of single peaks is too time consuming. Changing the assignments for all peaks assigned to a given residue together is preferable. This is achieved by leaving the peak list and the atom numbers untouched and changing only the fragment and atom list.

    The fragment type field can be changed from a general spin system (SS) into a glycine (GLY). Thereby it is ensured that not only the fragment type in the fragment list is updated but also that all the atoms of the new fragment type will be present in the atom list. In the above example, changing from a spin system fragment to a glycine fragment the Ha1, Ha2 and Qa atoms are inserted at the appropriate position in the atom list. For changing the fragment number (i.e. proceeding from spin system numbers to residue numbers) the mapping number is used. The mapping number is set in the peak editing window. In the case of a spin system fragment it is equal to the residue number of the spin system. The command "Switch fragment/map numbers [sn]" exchanges the fragment number and the mapping number. In order to keep the fragment and the atom list consistent the fragment number is changed in both lists together.

    It is important to ensure that a certain fragment number is not used at the same time for a spin system fragment and a residue fragment. The first possibility to avoid this, as illustrated in the above example, is to use different sets of numbers for the spin systems and the residues. The second possibility is to first set the mapping of all spin system fragments and only then exchange the fragment and the mapping numbers.
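    The bookkeeping behind the exchange of fragment and mapping numbers can be sketched as follows. This is an illustration in Python, not the XEASY implementation: the dictionary keys and the function name switch_numbers are hypothetical, and it is an assumption that fragments whose mapping number is -1 are left untouched. The uniqueness check mirrors the warning above that one fragment number must not be used twice.

```python
def switch_numbers(fragments, atoms):
    """Exchange fragment and mapping numbers, updating the atom list as well.

    fragments: list of dicts with keys 'type', 'number', 'mapping'
    atoms:     list of dicts with keys 'atom', 'name', 'fragment'
    """
    # old fragment number -> new fragment number, for mapped fragments only
    renumber = {f["number"]: f["mapping"] for f in fragments if f["mapping"] != -1}
    kept = {f["number"] for f in fragments if f["mapping"] == -1}
    new_numbers = list(renumber.values())
    if len(set(new_numbers)) != len(new_numbers) or kept & set(new_numbers):
        raise ValueError("fragment numbers would no longer be unique")
    for f in fragments:
        if f["mapping"] != -1:
            f["number"], f["mapping"] = f["mapping"], f["number"]
    # atom numbers stay untouched; only the fragment reference changes
    for a in atoms:
        a["fragment"] = renumber.get(a["fragment"], a["fragment"])

fragments = [
    {"type": "GLY", "number": 207, "mapping": 55},
    {"type": "SS", "number": 208, "mapping": -1},
]
atoms = [
    {"atom": 1, "name": "HN", "fragment": 207},
    {"atom": 2, "name": "HN", "fragment": 208},
]
switch_numbers(fragments, atoms)
```

    Because the peak list refers to atoms only by atom number, it needs no change: every peak assigned to atom 1 now belongs to GLY 55.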

    3.4.6 Changing Assignments

    Different cases can be distinguished: a peak is misassigned, a resonance is misassigned, or the fragment number or fragment type is misassigned. Each of these cases is discussed separately.

    To correct a misassigned peak, the atom list and the fragment list are not changed; only the atom number fields in the peak list are changed to reflect the new assignment. This is done with the command "Assign peak [ap]". In contrast, to change a resonance assignment, the peak list and the fragment list should remain unchanged and only the atom list is adapted. This can be achieved by editing the atom fields in the peak editing window. When an atom name or fragment number is changed, the corresponding fields in the atom list are updated.

    To change the fragment type or the fragment number, the peak list remains unchanged while the atom list and fragment list are adapted. This is done by editing the fragment and the mapping field in the peak editing window and using the "Switch fragment/map numbers [sn]" command as described in section 3.4.5, From Spin System Assignments to Residue Assignments.

    3.5 Possible Residue Types

    Based on the observed chemical shifts of a given fragment it is possible to identify likely residue types. Especially when 13C chemical shifts are known, discrimination between different amino acid types is readily achieved, e.g., in more than 50% of the cases the correct amino acid can be identified. The "Residue Type window [rw]" allows a set of identified frequencies to be matched to the frequencies expected for different residue types. The expected frequencies and their standard deviations are stored in the fragment library. If no expected frequency is specified in the fragment library for a certain atom, the atom will not be considered.

    3.6 Possible Assignments

    Once the resonance assignments are nearly completed, the known frequencies can be calculated using the "Average chem. shift [ac]" command and subsequently used to generate possible assignments for peaks. For each dimension of a peak, the atoms resonating at about the same frequency can be identified and their atom numbers stored in a data structure. The command "Possible assign. [pa]" asks for a maximal allowed deviation between the frequencies of the picked peak and the resonances of the atoms. It then generates all the assignment possibilities for all the dimensions of the peaks.

    In addition to specifying the possible assignments for each dimension separately, one often wants to include or exclude combinations of assignments. Since a 3D or 4D list of assignment possibilities would be too big, XEASY treats only two-dimensional arrays of assignment possibilities. In a 3D or 4D spectrum only combinations of assignments in the two dimensions displayed horizontally and vertically are treated. In each dimension up to 10 possible assignments can be stored, that is, in the displayed plane a maximum of 100 possible combinations. Whenever there are more than ten assignment possibilities in a dimension, a warning message is displayed and the maximal allowed deviation between the peak and the atoms is lowered for this peak.

    To view and edit the assignment possibilities the assignment window is available. It can be popped up by the command "Editing assign. [ea]". In the lower left side of the panel, the horizontal text lines list the possible assignments in the vertical dimension, and the vertical text lines in the upper right part list the possible assignments in the horizontal dimension. The assignments may be toggled using the buttons in the window. To select a single assignment possibility, the corresponding button has to be pushed while the shift key is held down. Once a unique assignment is present in the assignment window the peak list is updated automatically. Another possibility to update the peak list according to the assignment list is provided by the command "Update assign. [ua]". It can also be used to check the two lists for contradictory assignments and to update the assignment list according to the peak list.
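    The generation of per-dimension assignment possibilities described above can be sketched in a few lines of Python. The function name and data layout are hypothetical. Two points are assumptions rather than documented behavior: atoms with the shift 999.000 (shown in the atom list examples for undefined chemical shifts) are skipped, and the lowering of the allowed deviation is modeled by keeping only the closest atoms up to the limit of ten.

```python
UNDEFINED_SHIFT = 999.0  # value shown in the atom list for undefined shifts

def possible_assignments(peak_shifts, atom_shifts, max_dev, limit=10):
    """Return, per peak dimension, the atom numbers within max_dev ppm.

    peak_shifts: chemical shift of the peak in each dimension (ppm)
    atom_shifts: {atom number: chemical shift in ppm}
    """
    result = []
    for shift in peak_shifts:
        hits = sorted(
            (abs(s - shift), number)
            for number, s in atom_shifts.items()
            if s != UNDEFINED_SHIFT and abs(s - shift) <= max_dev
        )
        # keeping the closest `limit` atoms stands in for lowering the deviation
        result.append([number for _, number in hits[:limit]])
    return result

atom_shifts = {1: 8.31, 2: 8.30, 3: 4.20, 4: 999.0}
candidates = possible_assignments([8.312, 4.21], atom_shifts, max_dev=0.02)
```

    With up to ten candidates per dimension, the displayed plane holds at most 10 x 10 = 100 combinations, which matches the limit stated above.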

    Two possibilities exist to reduce the number of proposed assignments. The command "Reduce to intrares. ass. [ia]" toggles all inter-residual assignments off. The command "Reduce using ass. peaks [ru]" uses already assigned peaks to reduce the assignment possibilities of unassigned peaks. Its main application is to include the information of the third and fourth dimension in 3D and 4D spectra.

    The assignment possibilities are stored in a file with extension ".assign". This file contains the possible assignments for peaks. The number in the first line of the file indicates the dimensionality of the peaks (2 means two dimensions and 3 means three dimensions). For each peak, several lines containing the assignment information follow. A typical block of data for a 2D peak is given here:

    # 127
    3 220 225 294
    10 22 82 197 283 293 388 403 432 438 457
    1050625 0 0 0
    
    The peak number (127 in the example) is given after the hash "#". If, when the assignment list is loaded, no peak with the listed peak number is present in the loaded peak list, a warning message is displayed and loading is aborted. Subsequent lines (in the example the second and third line) list the possible assignments for each dimension. On both lines the first number indicates the number of possible atom assignments in this dimension, followed by the individual atom numbers. Atom numbers in the file that are missing in the loaded atom list will be ignored. The four numbers on the last line encode the allowed assignment combinations corresponding to the toggle button matrix in the assignment window. The commands to read or write an assignment file are "Load assignment [la]" and "Write assignment [wa]".
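    As an illustration of the block layout just described, the following Python sketch reads one peak block of an ".assign" file. The function name and return layout are hypothetical. The four numbers on the last line are returned unchanged, because their bit layout is not specified in this section.

```python
def parse_assign_block(lines, ndim, known_atoms=None):
    """Parse one peak block: header, one line per dimension, combination line.

    Returns (peak number, list of atom-number lists, combination numbers).
    Atoms not in known_atoms (if given) are ignored, as described above.
    """
    header = lines[0].split()
    if len(header) != 2 or header[0] != "#":
        raise ValueError("block must start with '# <peak number>'")
    peak = int(header[1])
    per_dimension = []
    for line in lines[1:1 + ndim]:
        numbers = [int(x) for x in line.split()]
        count, atoms = numbers[0], numbers[1:]
        if count != len(atoms):
            raise ValueError("atom count does not match the listed atoms")
        if known_atoms is not None:
            atoms = [a for a in atoms if a in known_atoms]
        per_dimension.append(atoms)
    combinations = [int(x) for x in lines[1 + ndim].split()]
    return peak, per_dimension, combinations

block = [
    "# 127",
    "3 220 225 294",
    "10 22 82 197 283 293 388 403 432 438 457",
    "1050625 0 0 0",
]
peak, per_dimension, combinations = parse_assign_block(block, ndim=2)
```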

    In later stages of the NOESY cross peak assignment, the number of assignment possibilities can be reduced using preliminary structures. This can be done using the program ASNO (P. Güntert et al., 1993). The program reads in structure coordinates, a peak list and an atom list. It produces all assignment possibilities not contradicting the structures or the frequencies of the atoms. The resulting assignment file from ASNO can also be read with the "Load assignment [la]" command. The first few lines of such a file are given below.

    Assignment file
    
    
    Corresponding peaklist: toc.peaks
    Corresponding coherencelist: tend.prot
    Number of dimensions: 2
    Uncertainties
    0.020 0.020
    # 1
    126 128
    # 2
    24 22
    173 171
    361 359
    253 251
    316 314

    The first five lines give the peak list, atom list and chemical shift tolerances in ppm used to produce the assignment file as well as the dimensionality of the assignment list. The entries for each peak start with a hash "#" followed by the peak number. The following lines list the assignment possibilities. Each line defines one nD assignment possibility by giving the atom numbers in all dimensions.

    3.7 Strips

    The concept of strips originates from work with 3D spectra but can also be used for 2D and 4D spectra. Strips are 1D cross sections extending in a second dimension. They lie at the position of an atom (in 2D spectra) or atom group (in 3D or 4D spectra) and contain all the peaks involving this atom group. For example, in spectra involving 1H and 15N of the amide group, such as the 15N-correlated [1H,1H]-NOESY, 15N-correlated [1H,1H]-TOCSY, HNCA or HN(CO)CA experiments, a strip can be defined for each amide group. In the case of 2D 1H spectra a strip is defined for each proton, and in the case of 13C correlated spectra such as the HCCH TOCSY a strip can be defined for each proton bound to a carbon. In order to judge the lineshapes of the cross peaks in two dimensions the 1D cross sections are extended into a second dimension. To use strips in 4D spectra the width of the second dimension should be set equal to the sweep width.

    For 3D or 4D spectra, working with a set of strips instead of planes reduces the complexity of the assignment steps. Methods to selectively search for certain strips further simplify the assignment procedure. The following paragraphs introduce the strip data structure, methods to define strip lists, display and edit them as well as methods to selectively search for strips that have either a specified intensity pattern or lie at a selected position in the spectrum.

    A strip in XEASY has fields similar to those of a peak; however, a strip is not used to hold relevant assignment information. This has to reside in the peak lists. Assignments stored together with a strip are only used to remove duplicated entries, to sort the strip list, and to identify peaks belonging to a strip. The fields defining a strip are given in Table 3.7.0.A.

    Table 3.7.0.A Strip Fields

    Dim.  Fields       Description
          spectrum     pointer to the spectrum
    w1    shift        folded w1 position in ppm
          fold         number of times strip is folded in w1
          atom number  assignment in w1: reference into the atom list
    w2    shift        folded w2 position in ppm
          fold         number of times strip is folded in w2
          atom number  assignment in w2: reference into the atom list
    w3    shift        folded w3 position in ppm
          fold         number of times strip is folded in w3
          atom number  assignment in w3: reference into the atom list
    w4    shift        folded w4 position in ppm
          fold         number of times strip is folded in w4
          atom number  assignment in w4: reference into the atom list

    As for the peak list the folded position of the strip is used within the program but the unfolded shifts are written out to the strip list file. The folding information is stored in the field fold. The atom numbers define the assignment of the strip. In contrast to peaks, strips can not be edited. They can only be defined from peaks or loaded from a file.
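    The strip fields of Table 3.7.0.A can be pictured as a small record type. The Python sketch below is only an illustration: the class and function names are hypothetical, the spectrum pointer is represented by a name, and treating atom number 0 as "unassigned" is an assumption. It also shows one of the stated uses of the stored assignment, the removal of duplicated entries.

```python
from dataclasses import dataclass, field
from typing import List

@dataclass
class StripDim:
    shift: float   # folded position in ppm
    fold: int = 0  # number of times the strip is folded in this dimension
    atom: int = 0  # atom number: reference into the atom list

@dataclass
class Strip:
    spectrum: str  # stands in for the pointer to the spectrum
    dims: List[StripDim] = field(default_factory=list)

    def assignment(self):
        """Atom numbers in all dimensions, used only for sorting and matching."""
        return tuple(d.atom for d in self.dims)

def drop_duplicates(strips):
    """Keep the first strip of each assignment; unassigned strips are all kept."""
    seen, kept = set(), []
    for strip in strips:
        key = strip.assignment()
        if any(key) and key in seen:
            continue
        seen.add(key)
        kept.append(strip)
    return kept

a = Strip("15nnoe-scp", [StripDim(116.385, 0, 1854), StripDim(8.312, 0, 1853)])
b = Strip("15nnoe-scp", [StripDim(116.390, 0, 1854), StripDim(8.310, 0, 1853)])
c = Strip("15nnoe-scp", [StripDim(106.044), StripDim(7.482)])
unique = drop_duplicates([a, b, c, c])
```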

    The strip list file has the extension .strips. An example of the first few lines out of a strip list file is given below:

    
    
    15nnoe-scp 3 0 116.385 8.312 8.311 1854 1853 1853
    15nnoe-scp 3 1 118.317 8.170 8.185 1887 1886 1886
    15nnoe-scp 3 2 117.788 7.822 7.817 1920 1919 1919
    15nnoe-scp 3 3 115.097 6.957 6.957 1953 1952 1952
    15nnoe-scp 3 4 112.377 8.099 8.099 1986 1985 1985
    15nnoe-scp 3 5 112.319 7.891 7.891 2019 2018 2018
    15nnoe-scp 3 6 112.020 6.961 6.960 2052 2051 2051
    15nnoe-scp 3 7 125.553 8.410 8.410 2085 2084 2084
    15nnoe-scp 3 8 111.984 8.932 8.932 2118 2117 2117
    15nnoe-scp 3 9 115.893 7.997 7.963 2151 2150 2150
    15nnoe-scp 3 10 106.044 7.482 7.478 0 0 0
    15nnoe-scp 3 11 117.601 7.746 7.733 2184 2183 2183
    15nnoe-scp 3 12 121.138 7.431 7.431 2217 2216 2216
    15nnoe-scp 3 13 119.677 8.409 8.409 2250 2249 2249

    For each strip there is one line. First the name of the spectrum in which the strip was defined is given, followed by the dimensionality of this spectrum. When loading a strip sequence, XEASY tries to find the indicated spectrum. If this is not present, a spectrum with the same dimensionality is taken, if possible the currently displayed one. The next number, the strip number, is currently not used. Then come the chemical shifts in all the dimensions and at the end the atom numbers specifying the assignment in each dimension. The commands to load and write strip lists are "Load strip list [sl]" and "Write strip list [ss]".

    A strip list can be built up from peaks in two different ways. First, all peaks selected with the "Select peak [sp]" command can be appended to the strip list using the command "Strip sequence [se]". New strips are generated at the positions of each of the selected peaks. The assignments of the new strips are taken from the peaks. The newly generated strips are appended to the current strip list. The assignment of the strips can be used to sort them or to delete strips that have the same assignment. Second, with the commands "Append strip [sa]" and "Insert before reference [ib]" a single peak can be selected to define a strip, which is either appended or inserted into the strip list.

    Once a strip list is defined it can be displayed. In order to be able to work with strips efficiently only about ten strips should be displayed on the screen at once. The selection of strips to be displayed and the number of strips to display on one screen are specified with the command "Goto strips [gs]". It displays one screen full of strips from the current strip list. With this command the width of the strips in the horizontal dimension is also defined. The height of the displayed strips will be equal to the height of the zoomed region where the command was issued.

    To move around in the strip list the commands "Forward strips [fs]" and "Backward strips [bs]" can be used. They move to the next or previous screen of strips. The command "Strip find [sf]" allows a fragment number to be specified. It then displays a screen of strips including the first strip assigned to this fragment. The whole strip list or single strips can be removed from memory using the commands "Erase strip list [es]" and "Remove one strip [ro]".

    In many of the assignment tasks one starts with a strip and searches for one or several related strips. The command "Hold strip [sh]" fixes one or several strips. These fixed strips are thereafter always shown as reference on the left side of the screen. They can only be removed with the "Release strip [sr]" command.

    In addition to defining and displaying strips, methods are necessary to search for relevant strips. Two such methods are implemented in XEASY. The first searches for strips at a selected position. Possible applications of this method include assigning NOESY cross peaks in heteronuclear correlated 3D NOESY spectra, spin system assignment using the HCCH TOCSY, and sequential assignment using the 15N correlated NOESY or the HNCA and HN(CO)CA experiments. For a more detailed description of how to work with such spectra, refer to section 16.0.

    The second method searches for strips with a selected intensity pattern. The positions of the strips can in addition be used to narrow down the set of strips for which the intensity patterns are compared. The patterns are derived from observed intensities. They can be modified using peaks with the same assignment as the reference strip. The intensities at the positions of these peaks are multiplied by a user specified factor (refer to the detailed description below). Since the similarity of the strips is determined by calculating the correlation coefficient between strip intensity patterns, the method is called the spectral correlation method (Bartels, C. and Wüthrich, K. (1994) J. Biomol. NMR, ...). Its main application is the sequential assignment of 15N-correlated [1H,1H]-NOESY spectra: sequentially neighboring spin systems show similar cross peak patterns due to their spatial proximity. The frequencies of the sequential cross peaks can be identified using the peaks from the 15N-correlated [1H,1H]-TOCSY (see also section 16.0).
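    The idea of ranking strips by the correlation of their intensity patterns can be sketched as follows. This Python example is an assumption-laden illustration, not the published method: it uses the ordinary mean-centered (Pearson) correlation coefficient, whereas the exact correlation function and the derivation of the patterns are defined in Bartels and Wüthrich (1994). All names and the toy intensity vectors are hypothetical.

```python
import math

def correlation(u, v):
    """Pearson correlation coefficient of two equally long intensity vectors."""
    n = len(u)
    mean_u, mean_v = sum(u) / n, sum(v) / n
    cov = sum((a - mean_u) * (b - mean_v) for a, b in zip(u, v))
    norm_u = math.sqrt(sum((a - mean_u) ** 2 for a in u))
    norm_v = math.sqrt(sum((b - mean_v) ** 2 for b in v))
    if norm_u == 0.0 or norm_v == 0.0:
        return 0.0  # a flat pattern carries no information
    return cov / (norm_u * norm_v)

def rank_by_correlation(reference, candidates):
    """Candidate names sorted by decreasing correlation to the reference."""
    return sorted(
        candidates,
        key=lambda name: correlation(reference, candidates[name]),
        reverse=True,
    )

reference = [0.0, 5.0, 0.0, 2.0, 0.0]
candidates = {
    "strip_a": [0.0, 4.0, 0.0, 2.0, 0.0],  # similar peak pattern
    "strip_b": [3.0, 0.0, 0.0, 0.0, 3.0],  # peaks at other positions
}
ranking = rank_by_correlation(reference, candidates)
```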

    In order to work with the method to search for strips at a selected position, reference strips first have to be defined. The command to define these reference strips is "Define reference strips [rd]". It copies the current strip list into the reference list. After this command has been issued the current strip list can be altered without affecting the search for close strips. The command "Compare close strips [pc]" can then be used to search for close strips. It allows a frequency to be selected in the vertical dimension of a zoomed region. It then displays a screen of strips. The first strip comes from the zoomed region at the position where the frequency was selected. The following strips are from the reference list. They are sorted by increasing distance from the selected frequency. To view the strips further away from the selected position the commands "Forward comparison [fc]" and "Backward comparison [bc]" are provided. Again the first displayed strip is the one where the frequency was selected.
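    The ordering of the reference strips by increasing distance from the selected frequency amounts to a simple sort. The Python sketch below is an illustration only: the dictionary layout, the function name, and the choice of which strip dimension is compared with the selected frequency (the parameter dim) are assumptions.

```python
def close_strips(reference_strips, selected_ppm, dim):
    """Reference strips sorted by increasing distance from selected_ppm."""
    return sorted(
        reference_strips,
        key=lambda strip: abs(strip["shifts"][dim] - selected_ppm),
    )

reference_strips = [
    {"number": 0, "shifts": [116.385, 8.312, 8.311]},
    {"number": 1, "shifts": [118.317, 8.170, 8.185]},
    {"number": 2, "shifts": [117.788, 7.822, 7.817]},
]
ordered = close_strips(reference_strips, selected_ppm=8.2, dim=1)
order = [strip["number"] for strip in ordered]
```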

    The next few paragraphs describe the correlation method, followed by instructions on how to apply the method. For a detailed discussion refer to Bartels, C. and Wüthrich, K. (1994) J. Biomol. NMR, ....

    For the given strip s all the remaining candidate strips k are sorted according to a distance measure dk such that the sought strip gets a low rank. The candidate strips, starting with the lowest ranking ones, are then displayed in order to allow the final assignment to be made interactively. The distance measure dk is an enhancement of the correlation function introduced for the sequential assignment using 3D 15N-correlated [1H,1H]-NOESY in Bartels and Wüthrich (1994). In contrast to the sequential assignment using 3D 15N-correlated [1H,1H]-NOESY, in all other assignment tasks it is possible to identify a frequency f in the given strip s at which there must be a peak in the sought strip k (see Applications section). This is used in the definition of the distance measure:

    [1]

    The condition in Eq. [1] discriminates those strips that are unlikely to correspond to the sought strip, since pk, their position in the vertical dimension, differs from the expected frequency f by more than a user specified parameter. This parameter is usually set to a value larger than the expected error in the determination of the frequency f or the peak positions pk. The other strips, whose positions lie within this limit, are possible candidates for the sought strip. They are further sorted according to the correlation function (Bartels and Wüthrich, 1994), which expresses the similarity of the peak pattern observed in strip k to the peak pattern expected for the sought strip and derived from the peak pattern observed in the given strip s. The peak patterns v = (i1,i2, . . ., in) are n-dimensional vectors, with n equal to the number of data points along the vertical dimension, which are derived from the experimentally observed intensities iex according to

    [2]

    Here the experimental intensity is taken at the local maximum m of the absolute intensities, pl and ph are the positions of the two adjacent local minima of the absolute intensities, and ib is a parameter usually set to 15 times the standard deviation of the noise. The constant Am, usually set to 1, gives a weight to every local maximum. For particular assignment tasks, such as sequential assignment using 3D 15N-correlated [1H,1H]-NOESY spectra, it can be used to emphasize or suppress subsets of peaks (Bartels and Wüthrich, 1994).

    Note that setting the user specified parameter in Eq. [1] to infinity allows cases to be handled where there is no peak in the sought strip whose frequency f can be identified in the given strip, e.g., sequential assignment using 3D 15N-correlated [1H,1H]-NOESY. Setting it to 0.0 ppm causes the candidate strips to be sorted only by the deviation of the strip position pk from the identified frequency f.

    To apply the spectral correlation method the noise level must first be defined. This is done using the contour plot command "Contour plot [cp]".

    Currently displayed peaks with the same assignment as the strip define the positions Am used to modify the peak patterns. The remaining parameters necessary to calculate the correlation coefficients are defined using the command "Define correlation [cd]". It copies the current strip list to the correlation strip list and calculates the correlation coefficients between pairs of strips. Again the current strip list can be modified afterwards without affecting the correlation method. To search for correlated strips the command "Compare correlated strips [cc]" is available. It asks for a reference strip and a frequency f to be selected and then displays a screen full of strips for comparison. The first strip displayed is the selected reference strip. The following ones come from the correlation strip list and are sorted by decreasing correlation to the pattern of the reference strip. To display the previous or next screen of correlated strips the commands "Forward comparison [fc]" and "Backward comparison [bc]" are used. Again the first displayed strip is the reference strip.

    The command "Sequential assignment [op]" uses the spectral correlation coefficients for determining the sequential assignment of a protein. The method combines simulated annealing with an algorithm described by R. Bernstein, C. Cieslar, A. Ross, H. Oschkinat, J. Freund and T. A. Holak, J. Biomol. NMR, 3 (1993) 245-251. Already known sequential assignments and assignments of spin systems to possible spin system types can be provided to the routine.

    3.8 Integrals

    The final step in spectra interpretation for the structure determination of proteins is the integration of the NOESY cross peaks. The peak integrals are in a first approximation proportional to rij^-6, where rij denotes the inter-proton distance between protons i and j. Because of this inverse sixth power relationship the volume accuracies are not critical - an error of 100% in volume leads to only a 12% change in inter-proton distance.
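    The 12% figure follows directly from the inverse sixth power relationship. If the volume V is proportional to r^-6, then scaling V by a factor c scales the derived distance by c^(-1/6). The short Python check below works this out for a factor of two in either direction; the function name is hypothetical.

```python
def distance_ratio(volume_ratio):
    """Ratio r'/r of derived distances when the volume is scaled by volume_ratio.

    Follows from V proportional to r**-6, i.e. r proportional to V**(-1/6).
    """
    return volume_ratio ** (-1.0 / 6.0)

doubled = distance_ratio(2.0)  # volume overestimated by 100%
halved = distance_ratio(0.5)   # volume underestimated by a factor of two

print(f"volume x 2.0 -> distance x {doubled:.3f}")
print(f"volume x 0.5 -> distance x {halved:.3f}")
```

    A doubled volume shortens the derived distance by about 11%, and a halved volume lengthens it by about 12%, in line with the statement above.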

    The problem remaining when trying to integrate cross peaks is overlap. Proteins of even moderate size will have upwards of several thousand cross-peaks many of which will overlap with others. If one of the overlapping components is weak the evaluation of its volume can be difficult - even getting the order of magnitude correct may be impossible by conventional techniques. In hetero-nuclear correlated 3D spectra the NOESY cross peaks are split up in a third dimension. This significantly reduces the problem of overlapping peaks and therefore integration.

    The approach used in XEASY for 2D spectra is line-shape integration. This method rests on the fact that peaks with the same assignment in a given dimension have the same line shape in that dimension. Distortion may however arise from zero quantum effects. Cross peaks which overlap are integrated by taking the w1 and w2 line shapes for each peak and then deconvoluting the peak cluster with the line shapes into volumes and volume errors. This technique was proposed by Denk, W., Baumann, R. & Wagner, G. (1986) J. Magn. Reson., 67, 386-390. The approach consists of three steps. First, reference line shapes along the x and y direction are determined for each resonance in the 2D spectrum. Next, the peaks are grouped together into clusters of overlapping peaks. Here, two peaks are said to overlap if the rectangles defined by their line-shape extents intersect. Finally, the volume of each peak in the cluster is determined by a linear least squares fit of the peak shapes constructed from the reference line shapes to the experimental data points in the spectrum.

    In mathematical terms the problem is one of adjusting the volumes Vp to minimize the following expression:

    $$\sum_{w_1, w_2} \Big[ S(w_1, w_2) - \sum_{p=1}^{m} V_p \, L_1^{p}(w_1) \, L_2^{p}(w_2) \Big]^2 \qquad (3.8.A)$$
      where S(w1,w2) is the spectral intensity at coordinate (w1,w2), Vp is the volume of peak p, m is the number of peaks in the cluster and Li is the reference line-shape for the wi resonance of peak p. This linear least squares problem is solved using standard methods (Press W.H., Flannery, B.P., Teukolsky, S.A., Vetterling, W.T. (1986) Numerical Recipes, the Art of Scientific Computing. Cambridge University Press). An uncertainty can be obtained for each peak volume by calculating the square-root of the above function over the peak region (not the total cluster). This can then be expressed as a percentage of the calculated volume.
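The least squares fit described above can be sketched in plain Python. This is an illustrative sketch only, not the XEASY implementation: the function names `solve_linear` and `fit_volumes` are hypothetical, the normal equations are solved by Gaussian elimination rather than by the Numerical Recipes routines cited, and the fitted coefficients equal the integrated volumes only if the reference line-shapes are normalized to unit sum (an assumption).

```python
def solve_linear(a, b):
    """Solve a x = b by Gaussian elimination with partial pivoting."""
    n = len(b)
    m = [row[:] + [b[i]] for i, row in enumerate(a)]
    for col in range(n):
        pivot = max(range(col, n), key=lambda r: abs(m[r][col]))
        if abs(m[pivot][col]) < 1e-12:
            raise ValueError("singular system: line-shapes are not independent")
        m[col], m[pivot] = m[pivot], m[col]
        for r in range(col + 1, n):
            factor = m[r][col] / m[col][col]
            for c in range(col, n + 1):
                m[r][c] -= factor * m[col][c]
    x = [0.0] * n
    for i in reversed(range(n)):
        rest = sum(m[i][j] * x[j] for j in range(i + 1, n))
        x[i] = (m[i][n] - rest) / m[i][i]
    return x


def fit_volumes(spectrum, shapes):
    """Fit one coefficient V_p per peak of a cluster.

    spectrum[i][j] is the intensity at point (i, j) of the cluster region;
    shapes is a list of (l1, l2) pairs holding the w1 and w2 reference
    line-shapes of each peak, sampled on the same grid as the region.
    """
    rows, cols = len(spectrum), len(spectrum[0])
    points = [(i, j) for i in range(rows) for j in range(cols)]

    def basis(p, i, j):
        # Model shape of peak p: product of its w1 and w2 line-shapes.
        return shapes[p][0][i] * shapes[p][1][j]

    count = len(shapes)
    normal = [[sum(basis(p, i, j) * basis(q, i, j) for i, j in points)
               for q in range(count)] for p in range(count)]
    rhs = [sum(basis(p, i, j) * spectrum[i][j] for i, j in points)
           for p in range(count)]
    return solve_linear(normal, rhs)
```

For noise-free data built from two overlapping model peaks, the fit returns the coefficients that were used to build the data.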

    For line shape integration, the command "Line-shape integration [li]" is used. Before it is applied, the lineshapes for each atom have to be defined and edited with the following commands

    Lineshapes are saved into a file with extension ".ref". An example is given below:
    # Number of dimensions 2
    
    
    1. -9999 0.000 0.000 -9999 0.000 0.000
    2. 774 4.211 4.266 155 4.206 4.258
    3. -9999 0.000 0.000 -9999 0.000 0.000
    4. -9999 0.000 0.000 -9999 0.000 0.000
    5. 155 3.057 3.167 1452 3.063 3.153
      The first line gives the dimensionality of the reference line-shape list. Each of the following lines defines, for one atom, the position in all dimensions where the line-shapes can be read out from the spectrum. The first number on each line is the atom number. The rest of the line is subdivided into one group of three numbers per dimension: the first is the number of the reference peak, the second and third give the lower and upper bound of the line-shape in ppm. The entry for atom 2 is illustrated in the figure below:
    [Figure 3.8.A: reference line-shape entry for atom 2]
      The peak number "-9999" means that the w1 or w2 line-shape is not defined. The commands to load or save a line-shape list are "Load reference list [lr]" and "Write reference list [wr]".

    The formats for the line shape lists of the programs EASY and EASY3D are different. Instead of specifying a reference peak, as in EASY and EASY3D, in XEASY a reference atom is specified for each dimension. Since this format is difficult to adapt to higher dimensional spectra it has been dropped. The old format can still be read by the XEASY program. However only the new format will be written out.

    Two additional commands exist to check the line-shapes. The command "Check ref. line-shape list [cr]" checks which line-shapes are defined and whether the position of each line-shape is consistent with the atom frequency. The command "View reference peak [vr]" displays, for each peak with the same assignment as the selected one, a 1D cross section in the slice window.

    In 3D spectra, where overlap is a smaller problem, it suffices to partition the spectral data points into areas that belong to a given peak. The data points in these areas can then be summed to give the peak volume. The external program peakint (described in section 15.12 on page 106) integrates peaks by partitioning. These two automated approaches have to be complemented by interactive methods in which the region to integrate can be specified in the spectrum. For interactive peak integration the commands "Interactive integration param. [ip]" and "Interactive integration [ii]" can be used.

    A different problem occurs when determining rate constants by evaluating a time series of spectra. It is crucial that the same regions are integrated in each of the spectra at the different time points. This can be achieved by defining the regions to integrate by drawing rectangles (refer to section 3.9 on page 45), which can be saved and loaded again. The integration is performed with the command "Integrate rectangular regions [ir]".

    Since NOESY spectral interpretation is an interactive process, where cross peak assignment, cross peak integration and structure calculation steps are repeated, efficient bookkeeping is crucial. For each peak, XEASY stores the integration method, the volume integral, and the error of the volume integral. The peak color can be used to reflect the state of the integration. In this way peaks with valid line-shapes can be distinguished from peaks where line-shape information is missing or from peaks that were interactively integrated. Peaks can also be colored according to their integration method. To control the desired display mode the default window and the "Check ref. line-shape list [cr]" command are available.

    So that line-shape integration can be applied selectively, peaks that were integrated interactively are not changed by it. This allows the user to apply the line-shape integration routines to the whole spectrum without losing the information of the interactively integrated peaks. To change the volume of an interactively integrated peak by line-shape integration, its integration method must be set back to "-" (for example by using the "Set selected peak entries [pe]" command).

    3.9 Geometries

    A basic feature necessary to work with spectra is the ability to draw lines and rectangles (i.e. geometries) to mark features in the spectrum. Correspondingly, a number of drawing commands exist in XEASY. The [ds] command draws lines from each selected peak to the diagonal. When applied to the TOCSY or COSY peaks of one fragment this corresponds to drawing spin system connectivities. The [df] command draws lines at selected frequencies. It can be used to identify a spin system using a TOCSY spectrum: clicking at all the peaks of a TOCSY tower will produce a grid identifying all the peaks belonging to one spin system.

    Geometries can be removed with the commands "Remove lines [rl]" and "Remove rectangles [rr]". In order to remove all lines or rectangles at once these two commands can be combined with the "Apply to all ... [aa]" modifier command.

    Geometries are saved in a file with the extension ".geom". The coordinates of the lines and rectangles are stored in ppm. A flag denotes whether the geometry is a line or a rectangle: "0" marks a line, "1" marks a rectangle. The commands to read or write a geometry file are "Load geometries [lg]" and "Write geometries [wg]".



    -- DavidCowburn - 15 Jun 2005

    Revision 1 15 Jun 2005 - Main.DavidCowburn


    3. The XEASY Model

    To enable computer supported spectra interpretation, an operational model is needed which defines the terms on which the computer interacts with the spectroscopist. This Model has to include the spectra which are the raw data and abstractions such as peaks, assignments, line shapes, geometries and strips which are derived from the spectra and constitute the results of the program. These different elements are detailed in this chapter.

    3.1 Spectrum

    A spectrum is a 2D, 3D or 4D box of intensities containing the frequency domain data of an NMR experiment. Spectra are loaded using the command "New spectrum [ns]".

    The intensity information for each data point is either encoded in 8 bits or in 16 bits. The format with 8 bits uses a logarithmic representation of the data with 1 byte per real data point. For a given data point sk the program first determines the integer l that minimizes the expression

    				 	 3.1.A
    
    (i. e. ) and then stores in one byte

    				 	 3.1.B
    
    This format can represent numbers approximately in the range , i. e. with a relative error of less than 20%. The format with 16 bits uses a 16 bit floating point format with the "exponent" ek given by Eq. 3.1.B in the lower valued byte and the mantissa
    				 	 3.1.C
    
    
    ( if ) in the higher valued byte (Eccles et al., 1991). This format can represent numbers in the same range as the 8 bit format but with a relative error of less than 1%.

    The intensity data is stored in a spectrum data file with the extension ".3D.8" or ".3D.16", depending on the accuracy used. The information about the spectrum is kept in the parameter file with the extension ".3D.param". The example below

    Version ....................... 1
    Number of dimensions .......... 2
    16 or 8 bit file type ......... 16
    Spectrometer frequency in w1 .. 60.811001
    Spectrometer frequency in w2 .. 600.138000
    Spectral sweep width in w1 .... 34.840000
    Spectral sweep width in w2 .... 6.719000
    Maximum chemical shift in w1 .. 135.384872
    Maximum chemical shift in w2 .. 11.538800
    Size of spectrum in w1 ........ 128
    Size of spectrum in w2 ........ 512
    Submatrix size in w1 .......... 256
    Submatrix size in w2 .......... 256
    Permutation for w1 ............ 2
    Permutation for w2 ............ 1
    Folding in w1 ................. RSH
    Folding in w2 ................. RSH
    Type of spectrum .............. C
    
    is from a 2D spectrum. The intensities are stored in the 16 bit format. The spectrometer frequencies are given in MHz and the spectral sweep widths in ppm for both dimensions. The maximum chemical shifts give the ppm frequency of the lower left corner of the spectrum. The total size of the spectrum is given in data points. The submatrix size gives the size of the blocks used to store the intensities in order to allow fast access to small parts of the spectrum. The permutation defines the sequential order in which the intensities are stored in the data file. The folding has to be set to either RSH, for Ruben States Haberkorn (Marion, D., Ikura, M., Tschudin, R. & Bax, A. (1989) J. Magn. Reson. 85, 393-399), or TPPI, for time proportional phase increment (Marion, D. and Wüthrich, K. (1983) Biochem. Biophys. Res. Comm. 113, 967-974). The folding type is used to fold the peaks into the spectrum when loading a peak list. The type of spectrum is not used by the program.
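Because the parameter file is plain text with one "name ... value" pair per line, it is easy to read with a short script. The following is a minimal sketch based only on the example shown above; the function name `parse_param` is hypothetical and not part of XEASY, and all values are returned as strings.

```python
import re


def parse_param(text):
    """Parse the text of a ".3D.param" file into a dict of strings.

    Each line is expected to look like "Name of entry ..... value",
    i.e. a name, a run of at least two dots, and a single value.
    """
    params = {}
    for line in text.splitlines():
        match = re.match(r"^\s*(.+?)\s*\.{2,}\s*(\S+)\s*$", line)
        if match:
            params[match.group(1)] = match.group(2)
    return params


example = """Version ....................... 1
Number of dimensions .......... 2
Spectral sweep width in w2 .... 6.719000
Size of spectrum in w2 ........ 512
Folding in w2 ................. RSH
"""
params = parse_param(example)
# Digital resolution in w2, in ppm per data point.
print(float(params["Spectral sweep width in w2"]) / int(params["Size of spectrum in w2"]))
```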

    To set the calibration of a spectrum the command "Calibration [ca]" is used. Since the peak positions are stored in ppm it is not recommended to work with uncalibrated spectra. The way the parameter file is interpreted is influenced by the value of the resource "XEasy*traditional_calibration".

    A spectrum can be turned around without changing the data file by editing the parameter file. Using the following parameter file defines the data file from the above example to be from a three dimensional spectrum extending only one data point in the third dimension. An arbitrary calibration is used in w1:

    Version ....................... 1
    Number of dimensions .......... 3
    16 or 8 bit file type ......... 16
    Spectrometer frequency in w1 .. 500.000000
    Spectrometer frequency in w2 .. 60.811001
    Spectrometer frequency in w3 .. 600.138000
    Spectral sweep width in w1 .... 10.000000
    Spectral sweep width in w2 .... 34.840000
    Spectral sweep width in w3 .... 6.719000
    Maximum chemical shift in w1 .. 500.000000
    Maximum chemical shift in w2 .. 135.384872
    Maximum chemical shift in w3 .. 11.538800
    Size of spectrum in w1 ........ 1
    Size of spectrum in w2 ........ 128
    Size of spectrum in w3 ........ 512
    Submatrix size in w1 .......... 1
    Submatrix size in w2 .......... 256
    Submatrix size in w3 .......... 256
    Permutation for w1 ............ 3
    Permutation for w2 ............ 2
    Permutation for w3 ............ 1
    Folding in w1 ................. RSH
    Folding in w2 ................. RSH
    Folding in w3 ................. RSH
    Type of spectrum .............. C
    
    The different ways to generate spectrum files in the XEASY format are listed below. A description of conversion programs between different formats is given under external programs.
    • directly from PROSA: using the spectrum processing program PROSA (P. Güntert, V. Dötsch, G. Wider and K. Wüthrich, J. Biomol. NMR, 2 (1992) 619-629) data suitable for XEASY may be written out with the command write easy8 [filename] or write easy16 [filename]
    • converting a Bruker smx file into a 2D spectrum using the filegen2d program.
    • converting a series of Bruker smx files into one 3D spectrum using the filegen3d program.
    • copying Bruker smx files from the X32 to the workstation and converting them into 2D or 3D spectra. This can be done with the shell scripts cpx32 and cpx32_16bit.
    • converting a spectrum from the old EASY format into the new XEASY format with the program filecon2d.

    In addition to the file format for XEASY, two outdated spectral formats from the EASY and EASY3D programs exist. The format from the EASY program can no longer be read by the XEASY program. These files have to be converted by the filecon2d program. Files used in EASY3D can still be read by the XEASY program. Since these parameter files contain binary information they may not be edited.

    3.2 Zoom Regions

    Many routines of the XEASY program display selected regions of a spectrum. One dimensional cross sections of the spectrum are displayed as plots of intensity vs. data points in the slice window (see section 12.8 on page 95). Two dimensional regions of the spectrum may be shown in the main window as contour plots or as intensity plots. The intensity plot maps the intensities onto a color scale. The contour plot draws lines of equal intensity. The following commands may be used to change between the different display modes:

    3.2.1 Selecting Zoom Regions

    XEASY provides routines to define and select regions from the spectrum for displaying. The interactive routines for zooming using no information from peaks, assignments or strip sequences are described below. In addition, there exist a number of commands which select zoom regions based on picked peaks or on strip sequences. These are described in the corresponding chapters about peak lists (page 23) and strip sequences (page 36).

    To select an arbitrary region in the spectrum and also to restore the full spectrum the command

    "Permutation [pm]"

    can be used. This command as well as the "New spectrum [ns]" command define the view onto the spectrum, that is which dimension will be displayed horizontally (in x direction) and which vertically (in y direction). Many routines within XEASY refer to this view.

    The following commands allow the definition of regions to be displayed by manually selecting their boundaries within the currently displayed zoom region:

    In addition the two letter code "mo" may be used to select a region in the overview window. To modify the size or position of the displayed regions the following commands are available:

    The zoom commands to move or resize regions take as a parameter the zoom factor which can be defined using the command

    "Zoom factor [zf]".

    3.2.2 Comparing Zoom Regions

    To compare different spectra, it is possible to display the contour plot of one spectrum on top of the intensity plot of another spectrum. The command

    "Alternative spectrum [as]"

    may be used to define additional spectra, from which the commands

    "Replace contour [rc]"
    "Replace spectrum [rs]"

    select the ones to display as contour and intensity plots. The spectra currently used for the intensity plot and the contour plot are indicated in the upper left corner of the main window. The upper name gives the spectrum used for the intensity plot, the lower name the one used for the contour plot. The routines in XEASY which access the intensities of a spectrum always use the spectrum of the intensity plot. To compare different regions the command

    "Zoom alignment [za]"

    is used to get several aligned regions. The large cursor

    "Draw cursor [dc]"

    allows the precise comparison of the positions of peaks within these aligned regions. If displaying the intensity plots of different spectra together it might be necessary to adjust the scale of color vs. intensities of one or more of them. This is possible with the command

    "Scale display [sd]".

    3.2.3 Zoom Region History

    A history of the displayed zoom regions is kept and is used to call back previous displays. The command

    "Restore zoom [rz]"

    brings back the last stored display. Commands that can easily be reversed (e.g. the commands to move around a zoom region) are not stored in the zoom region history and cannot be restored with the [rz] command. The two-letter codes "zb", "zf" and "zo" can be used in the overview window to select any zoom from the zoom history.

    Zooms may also be written to a file and recovered again. The commands are

    "Load zoom [lz]".
    "Write zoom [wz]".

    These files have the extension .zoom and contain the information about the zoom regions. The data in the zoomed regions are not stored, but rather the ppm coordinates of the regions in the spectrum are saved. Whenever such a file is loaded, the display will be accordingly updated.

    3.3 Phase Correction

    The program XEASY can be used to determine phase correction parameters interactively on the basis of 1D cross-sections that are displayed in the slice window (see "Slice window [sw]").

    With the conventions used by the program PROSA (P. Güntert, V. Dötsch, G. Wider and K. Wüthrich, J. Biomol. NMR, 2 (1992) 619-629), the phase-corrected spectrum is related to the original spectrum by

    				 	 3.1.B
    
    for the data points 0, ..., n-1 in every row along the dimension of interest. The two parameters in this expression are the constant and the linear (or first order) phase correction parameter. The program XEASY allows real-time change of both phase correction parameters and display of the corresponding phase-corrected 1D cross-sections, and thus provides a convenient environment for the accurate interactive selection of phase correction parameters. The phase correction parameters found with the help of XEASY can subsequently be used in data processing programs such as PROSA to perform the phase correction on the complete multidimensional spectrum.

    To use the interactive phase correction routine in XEASY, two separate spectrum files containing the real and imaginary parts of the complex spectrum are first prepared (e.g., with the program PROSA). The real part of the spectrum is then read into XEASY with the "New spectrum [ns]" command and the imaginary part of the spectrum is defined as alternative spectrum with the "Alternative spectrum [as]" command. Next the user selects and displays suitable rows (or columns) in the slice window (see "Slice window [sw]") and clicks the Phase button in the slice window. Phase correction will be performed for the spectrum in the dimension of the current slice (i.e., the slice for which the Current button on top of the slice window is activated). The alternative spectrum window pops up and the user selects the file containing the imaginary part of the spectrum. The constant and linear phase correction parameters can now be adjusted using the keyboard according to table 3.3.

    The linear phase correction parameter is changed such that the phase of the current slice at the cursor position in the slice window remains constant. The actual values of the phase correction parameters are monitored in the lower left corner of the slice window.
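The idea of a constant plus linear phase correction with a pivot at the cursor position can be sketched as follows. This is an illustrative sketch only: the exact PROSA convention (sign of the phase, normalization of the linear term by n) is not reproduced in this text, so the form `phi0 + phi1 * k / n` with angles in degrees is an assumption, and the function names are hypothetical.

```python
import cmath
import math


def phase_correct(real, imag, phi0_deg, phi1_deg):
    """Return the real part of a 1D slice after phase correction.

    Assumed convention: point k of n is multiplied by
    exp(i * (phi0 + phi1 * k / n)), with both angles given in degrees.
    """
    n = len(real)
    corrected = []
    for k, (re_part, im_part) in enumerate(zip(real, imag)):
        angle = math.radians(phi0_deg + phi1_deg * k / n)
        corrected.append((complex(re_part, im_part) * cmath.exp(1j * angle)).real)
    return corrected


def change_linear_with_pivot(phi0_deg, phi1_deg, delta_deg, pivot, n):
    """Change the linear parameter by delta_deg while keeping the total
    phase at data point `pivot` (the cursor position) unchanged."""
    return phi0_deg - delta_deg * pivot / n, phi1_deg + delta_deg
```

Under this convention, changing the linear parameter by 30 degrees with the cursor at point 256 of 1024 lowers the constant parameter by 7.5 degrees, so the phase at the cursor position stays the same.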

    The user deactivates the phase correction mode by clicking the Phase button again. The 1D cross-sections of the original spectrum (i.e., the real part loaded with the [ns] command) are displayed again.

    3.4 Assignments

    Assignments represent the main result of the work with the program XEASY. Three lists, namely the peak list, the atom list and the fragment list contain the information about the assignments. The peak list contains the coordinates of the picked peaks used to assign the spectrum. The atom list contains the names and frequencies of possible resonances. They define the possible assignments for each dimension of a peak. For homonuclear, single quantum, proton spectra the atom list contains the names and frequencies of all protons in the molecule. In this case it corresponds to the proton list used in the programs EASY and EASY3D. The fragment list contains the names of the fragments which are used at the different stages of the assignment process. In early stages of the assignment, spin systems independent of the primary sequence of the molecule are used. Then increasingly more of them become mapped onto the primary sequence of the protein until finally the residues of the molecule under investigation are used. In this final stage the fragment list corresponds to the sequence list of the programs EASY and EASY3D.

    In addition to these three lists, XEASY uses a library file defining the atoms and pseudo atoms for each fragment type. The following chapters provide first, a detailed description of the three lists and the library file, and then a description of how they can be used in conjunction in order to proceed through the different stages of the assignment process.

    3.4.1 Peak List

    A peak list contains entries for peaks picked in the spectrum. The following paragraphs present in detail the peak list file format, peak picking, assigning peaks, editing peaks, displaying relevant information from a peak list, folding, and how to treat different dimensionality of peak lists and spectra.

    A peak list contains an entry for each peak. The fields describing a peak are listed in Table 3.4.1.A. In a 2D spectrum the fields for the w3 and w4 dimension and in a 3D spectrum those for w4 are not used. The peak numbers in a peaklist must be unique but not necessarily continuous.

    Table 3.4.1.A Peak Fields

    Dim.  Field               Description

          peak number         unique number identifying the peak
          colour              colour in the range [1,6] used for displaying the peak
          volume              volume of the peak
          volume error        volume error in percent
          integration method  method used for the integration: d, r, e, m, a, -
          comment             user defined comment
          possible ass.       data structure containing possible assignments

    w1    shift               folded w1 position in ppm
    w1    fold                number of times the peak is folded in w1
    w1    atom number         assignment in w1: reference into the atom list

    w2    shift               folded w2 position in ppm
    w2    fold                number of times the peak is folded in w2
    w2    atom number         assignment in w2: reference into the atom list

    w3    shift               folded w3 position in ppm
    w3    fold                number of times the peak is folded in w3
    w3    atom number         assignment in w3: reference into the atom list

    w4    shift               folded w4 position in ppm
    w4    fold                number of times the peak is folded in w4
    w4    atom number         assignment in w4: reference into the atom list
    
    The peak list is stored in a peak list file with extension .peaks. It can be read by the program ASNO (P. Güntert et al., 1993) which uses a peak list, an atom list and selected structure coordinate files to generate possible assignments for NOESY cross peaks which can be loaded into XEASY. The program CALIBA (P. Güntert et al., J. Mol. Biol. (1991) 217, 517-530) translates the peak lists containing integrated peaks into distance constraints which can be used by the program DIANA (P. Güntert et al., J. Mol. Biol. (1991) 217, 517-530). An example of the first few lines of a two dimensional peak list is given below:
    # Number of dimensions 2
    	11  7.289  10.169 1 ?			 2.048e+03  0.00e+00 -	0  126  128  0
    		 #  first peak
    	12  7.119	9.413 1 ?			 1.280e+02  0.00e+00 -	0  517  506  0
    	3	7.106	7.497 1 ?			 4.096e+03  0.00e+00 -	0  129  130  0
    	4	7.228	7.411 1 ?			 4.096e+03  0.00e+00 -	0  131  127  0
    	5	7.106	7.411 1 ?			 5.120e+02  0.00e+00 -	0  327  328  0
    	6	6.838	7.094 1 ?			 8.192e+03  0.00e+00 -	0  489  488  0
    
    The number on the first line after the hash "#" indicates the dimensionality of the peak list. Each subsequent line that starts with a number contains the fields for one peak. Additional lines starting with a hash "#" are comments for the peak on the line above. The fields of a peak are, in order: the peak number; the unfolded chemical shift coordinates in ppm in w1 and w2 (more numbers are listed for higher dimensional spectra); the color code (a number from 1 to 6); the user defined type of the spectrum in which the peak is observed; the peak volume; the uncertainty of the volume in percent; the integration method ("d" for Denk integration, "r" for rectangular integration, "e" for elliptical integration, "m" for maximum integration, "a" for automatic integration, "-" for not integrated); an unused number; the atom numbers giving the assignments in w1 and w2 (more numbers are listed for higher dimensional spectra); and a final number that is not used. The commands to load or write peak lists are "Load peaklist [lp]" and "Write peaklist [wp]".
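The field order described above translates directly into a small reader. The following is a minimal sketch based only on the example and description in this section; the function name `parse_peak_list` and the dictionary keys are hypothetical, and real peak lists may contain variations that this sketch does not handle.

```python
def parse_peak_list(text):
    """Parse the text of a ".peaks" file into (ndim, list of peak dicts)."""
    ndim = None
    peaks = []
    for line in text.splitlines():
        stripped = line.strip()
        if not stripped:
            continue
        if stripped.startswith("#"):
            body = stripped[1:].strip()
            if body.startswith("Number of dimensions"):
                ndim = int(body.split()[-1])
            elif peaks:
                # A comment line belongs to the peak on the line above.
                peaks[-1]["comment"] = body
            continue
        if ndim is None:
            raise ValueError("dimensionality line missing before first peak")
        f = stripped.split()
        peaks.append({
            "number": int(f[0]),
            "shifts": [float(x) for x in f[1:1 + ndim]],
            "colour": int(f[1 + ndim]),
            "spectrum_type": f[2 + ndim],
            "volume": float(f[3 + ndim]),
            "volume_error": float(f[4 + ndim]),
            "method": f[5 + ndim],
            # f[6 + ndim] is the unused number.
            "atoms": [int(x) for x in f[7 + ndim:7 + 2 * ndim]],
            "comment": "",
        })
    return ndim, peaks
```

Applied to the example above, the peak on the second line yields the shifts 7.289 and 10.169, the atom numbers 126 and 128, and the comment "first peak".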

    Loading a peak list with a different dimensionality than the spectrum is possible. If the peak list has more dimensions than the spectrum, the additional dimensions are simply ignored when working with the peak list. When loading a peak list with fewer dimensions than the spectrum, it has to be specified whether the additional dimensions needed are copied from the available dimensions or set to a fixed ppm value. For example, the 2D peak list from a [1H,15N]-HMQC spectrum can be loaded onto a 3D 15N-correlated [1H,1H]-NOESY by copying the 1H chemical shift of the 2D list to both 1H dimensions in the 3D spectrum. The resulting peaks are positioned on the diagonal of the 1H planes in the 3D spectrum.

    Peaks may be picked either automatically or manually. For automatic peak picking of anti-phase peaks in 2D spectra the command "Anti-phase peaks [an]" is used. For automatic picking of in-phase peaks the external program pick described on page 105 or the command "In-phase peak picking [in]" can be used.

    Issuing the command "Peak picking [pp]" enters the manual peak picking mode. By selecting a position on the screen a peak placed at the corresponding position of the spectrum is added to the peak list. In 3D and 4D spectra the added peak is picked at the position of the displayed 2D region.

    The peak list file always contains the unfolded chemical shifts. When a peak list is loaded onto a spectrum, all peaks are folded into the spectrum. The folding information is stored in the field fold. When writing out the peak list, this field is used to back calculate the unfolded peak positions. In this manner, the unfolded chemical shifts of a peak can be transferred from a spectrum with a large sweep width (e.g. a 2D HMQC experiment) to spectra with smaller sweep widths (e.g. 3D 15N correlated [1H,1H] spectra). When picking additional peaks, one wants to retain the folding information from peaks already present in the spectrum. This is possible with the command "Copy and move peaks [cm]", which copies the folding information and the assignment from an existing peak to new peaks. Other methods to set the folding of a peak are to enter the unfolded chemical shift into the w field of the peak editing window or to use the "Set selected peak entries [pe]" command, which can be used to set or reset any field in the peak list.
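The relationship between unfolded shift, folded shift and the fold field can be illustrated with a short sketch. It models folding as simple circular aliasing into the spectral window, which is an assumption made here for RSH-type folding; TPPI-type folding and the sign convention XEASY uses for the fold field are not covered, and the function names are hypothetical.

```python
import math


def fold_shift(unfolded_ppm, max_ppm, sweep_width_ppm):
    """Fold a shift into the window [max_ppm - sweep_width_ppm, max_ppm).

    Returns (folded_ppm, fold), where fold counts how many sweep widths
    were subtracted (negative if sweep widths were added).
    """
    min_ppm = max_ppm - sweep_width_ppm
    fold = math.floor((unfolded_ppm - min_ppm) / sweep_width_ppm)
    return unfolded_ppm - fold * sweep_width_ppm, fold


def unfold_shift(folded_ppm, fold, sweep_width_ppm):
    """Back calculate the unfolded shift from the folded shift and fold."""
    return folded_ppm + fold * sweep_width_ppm


# Window taken from the w2 entries of the parameter file example.
folded, fold = fold_shift(3.0, 11.5388, 6.719)
print(folded, fold, unfold_shift(folded, fold, 6.719))
```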

    Once peaks are picked, they may be assigned to atoms present in the atom list. This is done either for a single peak using the command "Assign peak [ap]" or for all selected peaks within a region using the command "Assign peaks in one region [ar]".

    For example, all peaks in the fingerprint region of a homonuclear proton COSY spectrum can be assigned to the HN and Ha atoms of a set of spin systems, or all peaks in a [1H,15N]-HMQC spectrum may be assigned to the N and HN atoms of backbone fragments.

    When working with a peak list it is crucial to display information relevant for the current assignment task. First, often only a subset of peaks is relevant (e.g. only the peaks assigned to a certain fragment). The commands "Select peak [sp]" and "Display all peaks [da]" provide different criteria for selecting peaks to display. In 3D and 4D spectra, usually peaks of interest are only those close to the displayed 2D zoom region. The option "Maximal distance in planes" in the default window controls the number of adjacent planes from which to display peaks.

    Second, peaks with different properties (e.g. assigned peaks versus unassigned peaks) can be displayed with different colors or peak shapes. The options "Peak color determined by", "Peak cross displayed is", "Lineshape displayed for" and "Coloring interval" in the default window control the peak colors.

    Third, peaks may also be labelled with their assignment or the volume. The command "Peak data window [pw]" pops up the peak data window in which the peak labels may be defined.

    Fourth, peaks may be used to zoom interesting regions out of a spectrum. In the simplest case, the command "View peak [vp]" zooms a region that contains a specified peak. The command "Zoom peak [zp]" displays a small region around each of the peaks selected with the "Select peak [sp]" command. The command "View reference peak [vr]" displays, for each peak with the same assignment as the selected one, a slice in the slice window. Another powerful method to select relevant regions out of a spectrum is the strip list. It is described in section 6.0 on page 56.

    It is possible to work with several peak lists together. When a peak list is loaded it is kept in memory until it is removed from memory with the

    "Erase peak list [ep]"

    command. To select a different peak list for displaying with a spectrum the command

    "Exchange peak list [xc]"

    is used. This command does not change the dimensionality or the folding of the loaded peak list. To adapt the folding of the peaks to a certain spectrum, to change the dimensionality or to permute the dimensions the command "Adapt peaks to spectrum [ad]" is used.

    Several commands exist to edit the peak list entries.

    Worth mentioning is the "Move reference peak [mr]" command. It allows one to move together all peaks with the same assignment. This helps when adjusting a peak list picked and assigned in one spectrum to a related, but slightly different spectrum. Two examples are pH titrations and the use of a TOCSY peak list to identify the intra-residual peaks in a NOESY spectrum.

    To check or to list the assignments the commands "List peak entries [le]" and "Report ass. stat. [ra]" are used. They check for duplicated assignments, i.e. two peaks that have the same assignment, and for large chemical shift errors, i.e. peaks at different frequencies which are assigned to the same atom. Both kinds of information are useful to identify wrong assignments.
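The check for duplicated assignments amounts to grouping peaks by their tuple of atom numbers. The following is a minimal sketch of that idea, not the XEASY routine: the function name and the dictionary layout are hypothetical, and treating atom numbers that are not positive as "unassigned" is an assumption.

```python
from collections import defaultdict


def find_duplicate_assignments(peaks):
    """Return {atom tuple: [peak numbers]} for assignments used more than once.

    Each peak is a dict with the keys "number" and "atoms". Peaks with
    any atom number <= 0 are treated as not fully assigned (assumption)
    and are skipped.
    """
    groups = defaultdict(list)
    for peak in peaks:
        atoms = tuple(peak["atoms"])
        if all(atom > 0 for atom in atoms):
            groups[atoms].append(peak["number"])
    return {atoms: numbers for atoms, numbers in groups.items() if len(numbers) > 1}


peaks = [
    {"number": 3, "atoms": [129, 130]},
    {"number": 4, "atoms": [131, 127]},
    {"number": 7, "atoms": [129, 130]},
    {"number": 8, "atoms": [0, 0]},
]
print(find_duplicate_assignments(peaks))
```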

    3.4.2 Atom List

    The atom list contains the names and frequencies of resonances. They define the possible assignments for each dimension of a peak.

    Table 3.4.2.A - Atom Fields

    
    Fields		Description
    
    atom number	unique number identifying the atom
    shift		mean chemical shift in ppm
    shift error	deviation of the assigned peaks from the mean value
    name		atom name
    fragment number	number of the fragment to which the atom belongs
    lineshapes	data structure containing the reference lineshapes
    
    The fields listed in Table 3.4.2.A constitute an atom entry. They can be modified in the peak editing window. The atom numbers, used to reference the atoms, must be unique but not necessarily continuous. The number -9999 is reserved to denote invalid entries. The average chemical shift and the shift error can be calculated from the assigned peaks with the command "Average chem. shift [ac]".

    If the chemical shift is not defined it is set to the value 999.000.
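
    As a minimal sketch of what such an averaging step might compute, the Python function below takes the peak positions assigned to one atom and returns a mean shift and a shift error. The function name is hypothetical, and the use of the population standard deviation as the "deviation" is an assumption; the manual does not state how XEASY defines the shift error.

```python
from statistics import mean, pstdev

UNDEFINED_SHIFT = 999.000  # value the manual reserves for an undefined shift


def average_shift(assigned_ppm):
    """Return (mean shift, shift error) for the peak positions of one atom.

    Assumption: the shift error is the population standard deviation of
    the assigned peak positions around their mean.
    """
    if not assigned_ppm:
        # No assigned peaks: the shift stays undefined.
        return UNDEFINED_SHIFT, 0.0
    return mean(assigned_ppm), pstdev(assigned_ppm)
```

    An atom without assigned peaks keeps the reserved value 999.000, which matches the undefined entries shown in the atom lists of this chapter.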

    A new atom list is generated each time a fragment list is loaded. For each fragment the corresponding atoms are looked up in the fragment library file and added to the list. New atom entries are added to the atom list if a nonexistent atom is used with the "Assign peak [ap]" command, if the fragment type is changed in the peak editing window, or with the "Add new fragment [af]" command.

    The atom list file has the extension ".prot" originating from the old EASY format. The following line is taken from such a file:

    32	4.370  0.004	HA  2
    
    The first number is the atom number, followed by its mean chemical shift and the deviation from the mean value. The atom name and the fragment number follow. The commands to read or write an atom list are

    "Load atoms (chem. shift) [lc]" and "Write atoms (chem. shift) [wc]".

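    The five whitespace-separated fields of an atom list line can be read with a few lines of Python. This is a sketch for working with ".prot" files outside XEASY; the names AtomEntry and parse_prot_line are hypothetical and not part of the program.

```python
from typing import NamedTuple


class AtomEntry(NamedTuple):
    number: int         # unique atom number
    shift: float        # mean chemical shift in ppm (999.000 if undefined)
    shift_error: float  # deviation from the mean value
    name: str           # atom name, e.g. "HA"
    fragment: int       # number of the fragment the atom belongs to


def parse_prot_line(line):
    """Parse one line of an atom list file into an AtomEntry."""
    fields = line.split()
    if len(fields) != 5:
        raise ValueError(f"expected 5 fields, got {len(fields)}: {line!r}")
    number, shift, error, name, fragment = fields
    return AtomEntry(int(number), float(shift), float(error), name, int(fragment))


entry = parse_prot_line("32\t4.370  0.004\tHA  2")
```

    Splitting on arbitrary whitespace is a deliberate choice here, since the example lines in this chapter mix tabs and spaces.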
    3.4.3 Fragment List

    The fragment list contains the fragments currently used for the assignment. Depending on the stage of the assignment, the fragments are either spin systems or residues of the molecule. In the latter case, the fragment list corresponds to the sequence list used in the programs EASY and EASY3D.

    Table 3.4.3.A Fragment Fields

    Fields             Description
    
    fragment number    unique number identifying the fragment
    fragment type      name of the fragment
    mapping number     used to map spin system fragments to residue fragments
    comment            used to store possible spin system types or possible sequential neighbours
    
    The fields listed in Table 3.4.3.A constitute a fragment. The fragment numbers, which are used in the atom list to reference fragments, must be unique but not necessarily continuous. The number -9999 is reserved to denote invalid entries. The fragment type must be defined in the fragment library. When starting a resonance assignment it is usually set to "SS", meaning that there is no information about the specific spin system type. In later stages it may be changed, for example to "ASP", using the peak editing window. The command "Add new fragment [af]" adds further fragments to the list.

    The mapping number defines a mapping from one set of fragments (e.g., spin systems) to another set (e.g., amino acid residues). When working with spin system fragments the mapping number is the residue number of the spin system. For example, a fragment with number 207, type GLY and mapping number 55 describes spin system 207, which probably corresponds to GLY 55. The mapping number -1 is reserved to denote invalid mappings. The comment can, for example, be used to store the fragment numbers of the sequentially neighboring spin systems. When using the automatic sequential assignment routine "Sequential assignment [op]" the fragment comment can be used to indicate the probable amino acid types for each spin system.

    The fragment list is stored in a file with extension ".seq". If the first line starts with a hash "#", it is treated as a comment. The subsequent lines list the fragment types with one line per fragment. For example, the following file defines a tripeptide:

    # tripeptide
    ASP 0
    GLY 5 203
    LYS 7 209 "-1: 203"
    
    The number, if given, after the fragment name denotes the fragment number. If there is no number given for a fragment, its number will be assigned according to the last fragment number plus one. If no number is given for the first fragment, it will be set to one. The optional second number is the mapping number. An entry "-1" for the mapping number indicates that it is not defined. The optional text within the quotes is the comment. If the comment is present the mapping number must also be listed.

    The commands to read or write a sequence file are "Load sequence [ls]" and "Write sequence [ws]".
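
    The numbering rules of the ".seq" format described above can be sketched in Python as follows. The names Fragment and parse_seq are hypothetical, and quoted comments are split with the standard shlex module, which is a convenience of this sketch rather than a statement about how XEASY tokenizes the file.

```python
import shlex
from typing import NamedTuple

UNDEFINED_MAPPING = -1  # mapping number reserved for "not defined"


class Fragment(NamedTuple):
    number: int
    kind: str      # fragment type, e.g. "GLY" or "SS"
    mapping: int
    comment: str


def parse_seq(text):
    """Parse the contents of a fragment list file into Fragment tuples."""
    fragments = []
    last_number = 0  # so that an unnumbered first fragment becomes 1
    for index, line in enumerate(text.splitlines()):
        if index == 0 and line.lstrip().startswith("#"):
            continue  # a leading "#" line is a comment
        fields = shlex.split(line, comments=False)
        if not fields:
            continue
        kind = fields[0]
        number = int(fields[1]) if len(fields) > 1 else last_number + 1
        mapping = int(fields[2]) if len(fields) > 2 else UNDEFINED_MAPPING
        comment = fields[3] if len(fields) > 3 else ""
        fragments.append(Fragment(number, kind, mapping, comment))
        last_number = number
    return fragments


tripeptide = parse_seq('# tripeptide\nASP 0\nGLY 5 203\nLYS 7 209 "-1: 203"\n')
```

    Because the comment is the fourth field, a line can only carry a comment when the mapping number is also present, which mirrors the rule stated above.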

    3.4.4 Fragment Library

    The fragment library defines the different fragment types. All the atoms constituting a given fragment type are listed. The format is that of the program DIANA (P. Güntert et al., J. Mol. Biol. (1991) 217, 517-530). Amino acid fragments, nucleic acid fragments, a general spin system fragment and a general amino acid backbone fragment are currently defined. The extension is ".lib". The default library, defined by the environment variable XEASY_LIB, is loaded when starting up XEASY. To load a different library the command "Load library [ll]" is used.

    3.4.5 From Spin System Assignments to Residue Assignments

    A reasonable setup for starting out with resonance assignments is to define a set of general spin systems in the fragment list and generate the corresponding atom list. The first few lines of such a fragment list are given below:
      SS 201
      SS 202
      SS 203
      SS 204
      SS 205
      SS 206
      SS 207
      SS 208
      SS 209
      ...
    
    Some lines extracted from the corresponding atom list, which is generated by loading the fragment list into XEASY are given as well:
    	1 999.000 0.000 N	  201
    	2 999.000 0.000 HN	 201
    	3 999.000 0.000 CA	 201
    	4 999.000 0.000 HA	 201
    	5 999.000 0.000 QA	 201
    	6 999.000 0.000 CB	 201
    	7 999.000 0.000 HB2	201
    	8 999.000 0.000 HB3	201
    	9 999.000 0.000 QB	 201
    ...
      34 999.000 0.000 N	  202
      35 999.000 0.000 HN	 202
      36 999.000 0.000 CA	 202
      37 999.000 0.000 HA	 202
      38 999.000 0.000 QA	 202
      39 999.000 0.000 CB	 202
      40 999.000 0.000 HB2	202
      41 999.000 0.000 HB3	202
      42 999.000 0.000 QB	 202
    ...
    
    With these two lists, peaks may be grouped by assigning them to the same fragment and further classified by assigning them to certain atoms within the fragments. For example when working with heteronuclear spectra, each peak in a [1H,15N] HMQC experiment can be assigned to the N and HN atoms of different fragments. Similarly when working with homonuclear spectra, each COSY cross peak in the fingerprint region may be assigned to the Ha and HN atoms of a different fragment. Peaks in the COSY or TOCSY spectrum lying on the same amide frequency as the HN Ha peak can then be assigned to the HN Hb2, HN Hg2, ... of the same fragment.

    After the resonance assignment has been finished, the fragments of choice are the amino acid residues and the atom list has entries for the atoms in the molecule. The first few lines of the fragment list for the interleukin receptor antagonist are given below:

    
    ARG 1
    PRO 2
    SER 3
    GLY 4
    ARG 5
    LYS 6
    
    Some lines extracted from the corresponding atom list with undefined chemical shifts are given below:
    	1	999.000  0.000	C		 1
    	2	999.000  0.000	CA		1
    	3	999.000  0.000	CB		1
    	4	999.000  0.000	CD		1
    	5	999.000  0.000	CG		1
    	6	999.000  0.000	CZ		1
    	7	999.000  0.000	HA		1
    	8	999.000  0.000	HB2	  1
    	9	999.000  0.000	HB3	  1
      10	999.000  0.000	HD2	  1
      11	999.000  0.000	HD3	  1
      12	999.000  0.000	HE		1
      13	999.000  0.000	HG2	  1
      14	999.000  0.000	HG3	  1
      15	999.000  0.000	HH1	  1
      16	999.000  0.000	HH21	 1
      17	999.000  0.000	HH22	 1
      18	999.000  0.000	HN		1
      19	999.000  0.000	N		 1
      20	999.000  0.000	NE		1
      21	999.000  0.000	NH1	  1
      22	999.000  0.000	NH2	  1
      23	999.000  0.000	QB		1
      24	999.000  0.000	QD		1
      25	999.000  0.000	QG		1
      26	999.000  0.000	QH2	  1
      27	999.000  0.000	C		 2
      28	999.000  0.000	CA		2
      29	999.000  0.000	CB		2
      30	999.000  0.000	CD		2
      31	999.000  0.000	CG		2
      32	999.000  0.000	HA		2
      33	999.000  0.000	HB2	  2
      34	999.000  0.000	HB3	  2
      35	999.000  0.000	HD2	  2
      36	999.000  0.000	HD3	  2
      37	999.000  0.000	HG2	  2
      38	999.000  0.000	HG3	  2
      39	999.000  0.000	N		 2
      40	999.000  0.000	QB		2
      41	999.000  0.000	QD		2
      42	999.000  0.000	QG		2
    ...
    
    These two lists can be used to make assignments of peaks to atoms in the molecule under investigation. For example, with these lists NOESY cross peaks can be assigned and integrated in order to extract distance constraints.

    Starting from peaks assigned to spin system atoms, a method is needed to proceed to peaks assigned to atoms of the residues in the molecule. Changing assignments of single peaks is too time consuming. Changing the assignments for all peaks assigned to a given residue together is preferable. This is achieved by leaving the peak list and the atom numbers untouched and changing only the fragment and atom list.

    The fragment type field can be changed from a general spin system (SS) into a glycine (GLY). This ensures not only that the fragment type in the fragment list is updated but also that all the atoms of the new fragment type are present in the atom list. In the above example, when changing from a spin system fragment to a glycine fragment, the Ha1, Ha2 and Qa atoms are inserted at the appropriate position in the atom list. To change the fragment number (i.e. to proceed from spin system numbers to residue numbers) the mapping number is used. The mapping number is set in the peak editing window. In the case of a spin system fragment it is equal to the residue number of the spin system. The command "Switch fragment/map numbers [sn]" exchanges the fragment number and the mapping number. In order to keep the fragment list and the atom list consistent, the fragment number is changed in both lists together.

    It is important to ensure that a certain fragment number is not used at the same time for a spin system fragment and a residue fragment. The first possibility to avoid this, as illustrated in the above example, is to use different sets of numbers for the spin systems and the residues. The second possibility is to first set the mapping of all spin system fragments and only then exchange the fragment and the mapping numbers.
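
    The bookkeeping behind such an exchange can be sketched in Python. The dictionary layout, the function name switch_numbers and the decision to switch every fragment with a defined mapping in one call are assumptions made for illustration; they are not taken from the XEASY source.

```python
UNDEFINED_MAPPING = -1


def switch_numbers(fragments, atoms):
    """Exchange fragment and mapping numbers, keeping both lists consistent.

    fragments: list of dicts with keys "number", "kind", "mapping"
    atoms:     list of dicts with keys "number", "name", "fragment"
    Atom numbers are left untouched, so a peak list that refers to atoms
    by number stays valid.
    """
    renumber = {f["number"]: f["mapping"] for f in fragments
                if f["mapping"] != UNDEFINED_MAPPING}
    untouched = {f["number"] for f in fragments
                 if f["mapping"] == UNDEFINED_MAPPING}
    new_numbers = list(renumber.values())
    # A fragment number must never be used twice after the switch.
    if len(set(new_numbers)) != len(new_numbers) or untouched & set(new_numbers):
        raise ValueError("fragment numbers would collide after the switch")
    new_fragments = [
        dict(f, number=renumber[f["number"]], mapping=f["number"])
        if f["number"] in renumber else dict(f)
        for f in fragments
    ]
    new_atoms = [dict(a, fragment=renumber.get(a["fragment"], a["fragment"]))
                 for a in atoms]
    return new_fragments, new_atoms
```

    The collision check corresponds to the warning above: if spin systems and residues share a range of numbers, a switch can map two fragments onto the same number.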

    3.4.6 Changing Assignments

    Different cases can be distinguished: either a peak is misassigned, a resonance is misassigned, or the fragment number or fragment type is misassigned. Each of these cases is discussed separately.

    To correct a misassigned peak, the atom list and the fragment list are not changed and only the atom number fields in the peak list are changed to reflect the new assignment. This is done with the command "Assign peak [ap]". In contrast, to change a resonance assignment, the peak list and the fragment list should remain unchanged and only the atom list is adapted. This can be achieved by editing the atom fields in the peak editing window. When an atom name or fragment number is changed, the corresponding fields in the atom list are updated.

    To change the fragment type or the fragment number, the peak list remains unchanged while the atom list and the fragment list are adapted. This is done by editing the fragment and the mapping fields in the peak editing window and using the "Switch fragment/map numbers [sn]" command as described in section 3.4.5, From Spin System Assignments to Residue Assignments.

    3.5 Possible Residue Types

    Based on the observed chemical shifts of a given fragment it is possible to identify likely residue types. Especially when 13C chemical shifts are known, discrimination between different amino acid types is readily achieved, e.g., in more than 50% of the cases the correct amino acid can be identified. The "Residue Type window [rw]" makes it possible to match a set of identified frequencies to the frequencies expected for different residue types. The expected frequencies and their standard deviations are stored in the fragment library. If no expected frequency is specified in the fragment library for a certain atom, that atom is not considered.
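
    One plausible way to score such a match is the mean squared deviation in units of the library standard deviation. The Python sketch below ranks residue types that way. The scoring function is an assumption, since the manual does not say how XEASY scores the match, and the shift values in the example are illustrative numbers that are not taken from any XEASY fragment library.

```python
def rank_residue_types(observed, library):
    """Rank residue types by agreement with the observed shifts.

    observed: dict atom name -> observed shift in ppm
    library:  dict residue type -> {atom name: (expected shift, std. dev.)}
    Atoms without an expected shift in the library are not considered.
    Returns residue types sorted from best to worst match.
    """
    scores = {}
    for residue, expected in library.items():
        squared = [((observed[atom] - mean) / deviation) ** 2
                   for atom, (mean, deviation) in expected.items()
                   if atom in observed]
        if squared:  # skip residue types with no comparable atom
            scores[residue] = sum(squared) / len(squared)
    return sorted(scores, key=scores.get)


# Illustrative values only.
library = {"GLY": {"CA": (45.1, 1.2)},
           "ALA": {"CA": (53.1, 2.0), "CB": (19.0, 1.8)}}
ranking = rank_residue_types({"CA": 45.3, "HA": 3.9}, library)
```

    In this example the observed HA shift is ignored because neither library entry lists an expected HA frequency, as described above.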

    3.6 Possible Assignments

    Once the resonance assignments are nearly complete, the known frequencies can be calculated using the "Average chem. shift [ac]" command and subsequently used to generate possible assignments for peaks. For each dimension of a peak, the atoms resonating at about the same frequency can be identified and their atom numbers stored in a data structure. The command "Possible assign. [pa]" asks for a maximal allowed deviation between the frequencies of the picked peak and the resonances of the atoms. It then generates all the assignment possibilities for all the dimensions of the peaks.

    In addition to specifying the possible assignments for each dimension separately, one often wants to include or exclude combinations of assignments. Since a 3D or 4D list of assignment possibilities would be too big, XEASY treats only two-dimensional arrays of assignment possibilities. In a 3D or 4D spectrum only combinations of assignments in the two dimensions displayed horizontally and vertically are treated. In each dimension up to 10 possible assignments can be stored, that is, a maximum of 100 possible combinations in the displayed plane. Whenever there are more than ten assignment possibilities in a dimension, a warning message is displayed and the maximal allowed deviation between the peak and the atoms is lowered for this peak.

    To view and edit the assignment possibilities the assignment window is available. It can be popped up with the command "Editing assign. [ea]". In the lower left side of the panel, the horizontal text lines list the possible assignments in the vertical dimension, and the vertical text lines in the upper right part list the possible assignments in the horizontal dimension. The assignments may be toggled using the buttons in the window. To select a single assignment possibility, the corresponding button has to be pushed while the shift key is held down.

    Once a unique assignment is present in the assignment window, the peak list is updated automatically. Another possibility to update the peak list according to the assignment list is provided by the command "Update assign. [ua]". It can also be used to check the two lists for contradictory assignments and to update the assignment list according to the peak list.
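
    The per-dimension search for possible assignments can be sketched in Python as below. The function name is hypothetical. Keeping only the ten closest atoms is this sketch's stand-in for "lowering the maximal allowed deviation", since the manual does not say by how much XEASY lowers it.

```python
MAX_POSSIBILITIES = 10     # limit per dimension stated in the manual
UNDEFINED_SHIFT = 999.000  # atoms with this shift have no known frequency


def possible_assignments(peak_ppm, atom_shifts, tolerance):
    """Return atom numbers whose shift lies within tolerance of the peak.

    atom_shifts: dict atom number -> chemical shift in ppm
    The result is sorted by increasing deviation and cut to at most
    MAX_POSSIBILITIES entries (assumption: the closest atoms are kept).
    """
    candidates = sorted(
        (abs(shift - peak_ppm), number)
        for number, shift in atom_shifts.items()
        if shift != UNDEFINED_SHIFT and abs(shift - peak_ppm) <= tolerance
    )
    return [number for _, number in candidates[:MAX_POSSIBILITIES]]


shifts = {1: 8.310, 2: 8.300, 3: 4.370, 4: UNDEFINED_SHIFT}
hits = possible_assignments(8.312, shifts, tolerance=0.02)
```

    Calling the function once per displayed dimension gives the two lists whose combinations make up the toggle matrix of at most 10 x 10 entries.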

    Two possibilities exist to reduce the number of proposed assignments. The command "Reduce to intrares. ass. [ia]" toggles all inter-residual assignments off. The command "Reduce using ass. peaks [ru]" uses already assigned peaks to reduce the assignment possibilities of unassigned peaks. Its main application is to include the information of the third and fourth dimension in 3D and 4D spectra.

    The assignment possibilities are stored in a file with extension ".assign". This file contains the possible assignments for peaks. The number in the first line of the file indicates the dimension of the peak (2 means two dimensions and 3 means three dimensions). For each peak, several lines containing the assignment information follow. A typical block of data for a 2D peak is given here:

    # 127
    3 220 225 294
    10 22 82 197 283 293 388 403 432 438 457
    1050625 0 0 0
    
    The peak number (127 in the example) is given after the hash "#". If, when loading the assignment list, no peak with the listed peak number is present in the loaded peak list, a warning message is displayed and the loading is aborted. Subsequent lines (in the example the second and third line) list the possible assignments for each dimension. On both lines the first number indicates the number of possible atom assignments in this dimension, followed by the individual atom numbers. Atom numbers in the file that are missing in the loaded atom list are ignored. The four numbers on the last line encode the allowed assignment combinations corresponding to the toggle button matrix in the assignment window. The commands to read or write an assignment file are "Load assignment [la]" and "Write assignment [wa]".
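
    A block of this kind can be read with the short Python sketch below. The function name is hypothetical. The four integers on the last line are returned unchanged, because their bit layout is not described in this section.

```python
def parse_assign_block(lines, dimensions):
    """Parse the lines of one peak from an ".assign" file.

    lines:      the "# <peak>" line, one line per dimension, and the line
                encoding the allowed combinations
    dimensions: number of dimensions given in the first line of the file
    Returns (peak number, atom numbers per dimension, raw combination codes).
    """
    header = lines[0].split()
    if len(header) != 2 or header[0] != "#":
        raise ValueError(f"not a peak header: {lines[0]!r}")
    peak_number = int(header[1])

    per_dimension = []
    for line in lines[1:1 + dimensions]:
        numbers = [int(field) for field in line.split()]
        count, atoms = numbers[0], numbers[1:]
        if count != len(atoms):
            raise ValueError(f"count {count} does not match {len(atoms)} atoms")
        per_dimension.append(atoms)

    # The encoding of these numbers is not documented here; keep them raw.
    combination_codes = [int(field) for field in lines[1 + dimensions].split()]
    return peak_number, per_dimension, combination_codes


block = ["# 127",
         "3 220 225 294",
         "10 22 82 197 283 293 388 403 432 438 457",
         "1050625 0 0 0"]
peak, atoms, codes = parse_assign_block(block, dimensions=2)
```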

    In later stages of the NOESY cross peak assignment, the number of assignment possibilities can be reduced using preliminary structures. This can be done using the program ASNO (P. Güntert et al., 1993). The program reads in structure coordinates, a peak list and an atom list. It produces all assignment possibilities not contradicting the structures or the frequencies of the atoms. The resulting assignment file from ASNO can also be read with the "Load assignment [la]" command. The first few lines of such a file are given below.

    Assignment file
    	Corresponding peaklist: toc.peaks
    	Corresponding coherencelist: tend.prot
    	Number of dimensions: 2
    	Uncertainties:	 0.020	0.020
    
    
    #	  1
    	 126  128 
    #	  2
    	  24	22 
    	 173  171 
    	 361  359 
    	 253  251 
    	 316  314 
    
    The first five lines give the peak list, atom list and chemical shift tolerances in ppm used to produce the assignment file as well as the dimensionality of the assignment list. The entries for each peak start with a hash "#" followed by the peak number. The following lines list the assignment possibilities. Each line defines one nD assignment possibility defined by atom numbers in all dimensions.

    3.7 Strips

    The concept of strips originates from work with 3D spectra but can also be used for 2D and 4D spectra. Strips are 1D cross sections extending in a second dimension. They lie at the position of an atom (in 2D spectra) or atom group (in 3D or 4D spectra) and contain all the peaks involving this atom group. For example, in spectra involving 1H and 15N of the amide group, such as the 15N-correlated [1H,1H]-NOESY, 15N-correlated [1H,1H]-TOCSY, HNCA or HN(CO)CA experiments, a strip can be defined for each amide group. In the case of 2D 1H spectra a strip is defined for each proton, and in the case of 13C-correlated spectra such as the HCCH TOCSY a strip can be defined for each proton bound to a carbon. In order to judge the lineshapes of the cross peaks in two dimensions, the 1D cross sections are extended into a second dimension. To use strips in 4D spectra the width of the second dimension should be set equal to the sweep width.

    For 3D or 4D spectra, working with a set of strips instead of planes reduces the complexity of the assignment steps. Methods to selectively search for certain strips further simplify the assignment procedure. The following paragraphs introduce the strip data structure, methods to define strip lists, display and edit them as well as methods to selectively search for strips that have either a specified intensity pattern or lie at a selected position in the spectrum.

    A strip in XEASY has fields similar to those of a peak; however, a strip is not used to hold relevant assignment information, which has to reside in the peak lists. Assignments stored together with a strip are only used to remove duplicated entries, to sort the strip list, and to identify peaks belonging to a strip. The fields defining a strip are given in Table 3.7.0.A.

    Table 3.7.0.A Strip Fields

    
    Dim.    Fields          Description
    
            spectrum        pointer to the spectrum
    
    w1      shift           folded w1 position in ppm
            fold            number of times the strip is folded in w1
            atom number     assignment in w1: reference into the atom list
    
    w2      shift           folded w2 position in ppm
            fold            number of times the strip is folded in w2
            atom number     assignment in w2: reference into the atom list
    
    w3      shift           folded w3 position in ppm
            fold            number of times the strip is folded in w3
            atom number     assignment in w3: reference into the atom list
    
    w4      shift           folded w4 position in ppm
            fold            number of times the strip is folded in w4
            atom number     assignment in w4: reference into the atom list
    
    
    As for the peak list, the folded position of the strip is used within the program but the unfolded shifts are written out to the strip list file. The folding information is stored in the field fold. The atom numbers define the assignment of the strip. In contrast to peaks, strips cannot be edited. They can only be defined from peaks or loaded from a file.

    The strip list file has the extension ".strips". An example of the first few lines of a strip list file is given below:

    	  15nnoe-scp  3	 0 116.385	8.312	8.311 1854 1853 1853
    	  15nnoe-scp  3	 1 118.317	8.170	8.185 1887 1886 1886
    	  15nnoe-scp  3	 2 117.788	7.822	7.817 1920 1919 1919
    	  15nnoe-scp  3	 3 115.097	6.957	6.957 1953 1952 1952
    	  15nnoe-scp  3	 4 112.377	8.099	8.099 1986 1985 1985
    	  15nnoe-scp  3	 5 112.319	7.891	7.891 2019 2018 2018
    	  15nnoe-scp  3	 6 112.020	6.961	6.960 2052 2051 2051
    	  15nnoe-scp  3	 7 125.553	8.410	8.410 2085 2084 2084
    	  15nnoe-scp  3	 8 111.984	8.932	8.932 2118 2117 2117
    	  15nnoe-scp  3	 9 115.893	7.997	7.963 2151 2150 2150
    	  15nnoe-scp  3	10 106.044	7.482	7.478	 0	 0	 0
    	  15nnoe-scp  3	11 117.601	7.746	7.733 2184 2183 2183
    	  15nnoe-scp  3	12 121.138	7.431	7.431 2217 2216 2216
    	  15nnoe-scp  3	13 119.677	8.409	8.409 2250 2249 2249
    
    For each strip there is one line. First the name of the spectrum in which the strip was defined is given, followed by the dimensionality of this spectrum. When loading a strip sequence, XEASY tries to find the indicated spectrum. If it is not present, a spectrum with the same dimensionality is taken, if possible the currently displayed one. The next number, the strip number, is currently not used. Then come the chemical shifts in all the dimensions and, at the end, the atom numbers specifying the assignment in each dimension. The commands to load and write strip lists are "Load strip list [sl]" and "Write strip list [ss]".

    A strip list can be built up from peaks in two different ways. First, all peaks selected with the "Select peak [sp]" command can be appended to the strip list using the command "Strip sequence [se]". New strips are generated at the positions of each of the selected peaks. The assignments of the new strips are taken from the peaks. The newly generated strips are appended to the current strip list. The assignment of the strips can be used to sort them or to delete strips that have the same assignment. Second, with the commands "Append strip [sa]" and "Insert before reference [ib]" a single peak can be selected to define a strip, which is either appended to or inserted into the strip list.

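    The layout of a strip list line described above (spectrum name, dimensionality, strip number, one shift per dimension, one atom number per dimension) can be read with the following Python sketch. The names Strip and parse_strip_line are hypothetical and not part of XEASY.

```python
from typing import NamedTuple, Tuple


class Strip(NamedTuple):
    spectrum: str              # name of the spectrum the strip was defined in
    dimensions: int            # dimensionality of that spectrum
    number: int                # strip number (currently unused by XEASY)
    shifts: Tuple[float, ...]  # unfolded chemical shifts, one per dimension
    atoms: Tuple[int, ...]     # atom numbers, one per dimension


def parse_strip_line(line):
    """Parse one line of a ".strips" file."""
    fields = line.split()
    spectrum, dimensions, number = fields[0], int(fields[1]), int(fields[2])
    if len(fields) != 3 + 2 * dimensions:
        raise ValueError(f"expected {3 + 2 * dimensions} fields: {line!r}")
    shifts = tuple(float(f) for f in fields[3:3 + dimensions])
    atoms = tuple(int(f) for f in fields[3 + dimensions:])
    return Strip(spectrum, dimensions, number, shifts, atoms)


strip = parse_strip_line("15nnoe-scp  3\t 0 116.385\t8.312\t8.311 1854 1853 1853")
```

    The field count is checked against the dimensionality so that a 2D, 3D or 4D strip list can be read with the same function.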
    Once a strip list is defined it can be displayed. In order to be able to work with strips efficiently only about ten strips should be displayed on the screen at once. The selection of strips to be displayed and the number of strips to display on one screen are specified with the command "Goto strips [gs]". It displays one screen full of strips from the current strip list. With this command the width of the strips in the horizontal dimension is also defined. The height of the displayed strips will be equal to the height of the zoomed region where the command was issued.

    To move around in the strip list the commands "Forward strips [fs]" and "Backward strips [bs]" can be used. They move to the next or previous screen of strips. The command "Strip find [sf]" allows a fragment number to be specified. It then displays a screen of strips including the first strip assigned to this fragment. The whole strip list or single strips can be removed from memory using the commands "Erase strip list [es]" and "Remove one strip [ro]".

    In many of the assignment tasks one starts with a strip and searches for one or several related strips. The command "Hold strip [sh]" fixes one or several strips. These fixed strips are thereafter always shown as a reference on the left side of the screen. They can only be removed with the "Release strip [sr]" command.

    In addition to defining and displaying strips, methods are necessary to search for relevant strips. Two such methods are implemented in XEASY. The first searches for strips at a selected position. Possible applications of this method include assigning NOESY cross peaks in heteronuclear correlated 3D NOESY spectra, spin system assignment using the HCCH TOCSY, and sequential assignment using the 15N-correlated NOESY or the HNCA and HN(CO)CA experiments. For a more detailed description of how to work with such spectra, refer to section 16.0.

    The second method searches for strips with a selected intensity pattern. The positions of the strips can in addition be used to narrow down the set of strips for which the intensity patterns are compared. The patterns are derived from observed intensities. They can be modified using peaks with the same assignment as the reference strip. The intensities at the positions of these peaks are multiplied by a user-specified factor (refer to the detailed description below). Since the similarity of the strips is determined by calculating the correlation coefficient between strip intensity patterns, the method is called the spectral correlation method (Bartels, C and Wüthrich, K. (1994) J. Biomol. NMR, ...). Its main application is the sequential assignment of 15N-correlated [1H,1H]-NOESY spectra: sequentially neighboring spin systems show similar cross peak patterns due to their spatial proximity. The frequencies of the sequential cross peaks can be identified using the peaks from the 15N-correlated [1H,1H]-TOCSY (see also section 16.0).

    In order to work with the method to search for strips at a selected position, reference strips first have to be defined. The command to define these reference strips is "Define reference strips [rd]". It copies the current strip list into the reference list. After this command has been issued, the current strip list can be altered without affecting the search for close strips. The command "Compare close strips [pc]" can then be used to search for close strips. It allows a frequency to be selected in the vertical dimension of a zoomed region. It then displays a screen of strips. The first strip comes from the zoomed region at the position where the frequency was selected. The following strips are from the reference list. They are sorted by increasing distance from the selected frequency. To view the strips further away from the selected position the commands "Forward comparison [fc]" and "Backward comparison [bc]" are provided. Again the first displayed strip is the one where the frequency was selected.
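
    The ordering used by this search can be sketched in a few lines of Python. The function names, the representation of a reference strip as a (strip number, position) pair and the screen size of ten strips are assumptions for illustration.

```python
def closest_strips(reference_strips, selected_ppm):
    """Sort reference strips by increasing distance from a frequency.

    reference_strips: list of (strip number, position in ppm), where the
    position is the one compared with the selected frequency.
    """
    return sorted(reference_strips,
                  key=lambda strip: abs(strip[1] - selected_ppm))


def screen(sorted_strips, page, per_screen=10):
    """Return the strips shown on one screen of the comparison."""
    start = page * per_screen
    return sorted_strips[start:start + per_screen]


reference = [(0, 8.312), (1, 8.170), (2, 7.822)]
ordered = closest_strips(reference, selected_ppm=8.200)
first_screen = screen(ordered, page=0, per_screen=2)
```

    Moving forward or backward in the comparison then corresponds to increasing or decreasing the page index while the sorted order stays fixed.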

    The next few paragraphs describe the correlation method, followed by instructions on how to apply the method. For a detailed discussion refer to Bartels, C and Wüthrich, K. (1994) J. Biomol. NMR, ....

    For the given strip s, all the remaining candidate strips k are sorted according to a distance measure dk such that the sought strip gets a low rank. The candidate strips, starting with the lowest ranking ones, are then displayed so that the final assignment can be made interactively. The distance measure dk is an enhancement of the correlation function introduced for the sequential assignment using 3D 15N-correlated [1H,1H]-NOESY in Bartels and Wüthrich (1994). In contrast to the sequential assignment using 3D 15N-correlated [1H,1H]-NOESY, in all other assignment tasks it is possible to identify a frequency f in the given strip s at which there must be a peak in the sought strip k (see Applications section). This is used in the definition of the distance measure:

    [1]

    The condition in Eq. [1] discriminates those strips that are unlikely to correspond to the sought strip, since pk, their position in the vertical dimension, differs by more than a user-specified tolerance from the expected frequency f. This tolerance is usually set to a value larger than the expected error in the determination of the frequency f or the peak positions pk. The other strips, whose positions lie within the tolerance, are possible candidates for the sought strip. They are further sorted according to the correlation function (Bartels and Wüthrich, 1994), which expresses the similarity of the peak pattern observed in strip k to the peak pattern expected for the sought strip and derived from the peak pattern observed in the given strip s. The peak patterns v = (i1, i2, ..., in) are n-dimensional vectors, with n equal to the number of data points along the vertical dimension, which are derived from the experimentally observed intensities iex according to

    [2]

    Here the intensity term is the experimental intensity at the local maximum m of the absolute intensities, pl and ph are the positions of the two adjacent local minima of the absolute intensities, and ib is a parameter usually set to 15 times the standard deviation of the noise. The constant Am, usually set to 1, gives a weight to every local maximum. For particular assignment tasks, for example sequential assignment using 3D 15N-correlated [1H,1H]-NOESY spectra, it can be used to emphasize or suppress subsets of peaks (Bartels and Wüthrich, 1994).

    Note that setting the tolerance in Eq. [1] to infinity makes it possible to handle cases where there is no peak in the sought strip whose frequency f can be identified in the given strip, e.g., sequential assignment using 3D 15N-correlated [1H,1H]-NOESY, whereas setting it to 0.0 ppm causes the candidate strips to be sorted only by the deviation of the strip position pk from the identified frequency f.
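
    The ranking behaviour described above can be illustrated with a small Python sketch. It is not the distance measure of Eq. [1]. It is an assumed ordering that merely reproduces the two limiting cases: candidates within the tolerance come first, sorted by decreasing correlation, and the remaining candidates follow, sorted by their deviation from f. The Pearson correlation coefficient stands in for the correlation function, and all names are hypothetical.

```python
import math


def correlation(a, b):
    """Pearson correlation coefficient of two equally long patterns."""
    mean_a, mean_b = sum(a) / len(a), sum(b) / len(b)
    covariance = sum((x - mean_a) * (y - mean_b) for x, y in zip(a, b))
    var_a = sum((x - mean_a) ** 2 for x in a)
    var_b = sum((y - mean_b) ** 2 for y in b)
    if var_a == 0 or var_b == 0:
        return 0.0  # a flat pattern carries no information
    return covariance / math.sqrt(var_a * var_b)


def rank_candidates(reference_pattern, candidates, f, tolerance):
    """Rank candidate strips; the best candidate comes first.

    candidates: list of (name, position pk in ppm, intensity pattern)
    """
    def key(candidate):
        _, position, pattern = candidate
        deviation = abs(position - f)
        if deviation <= tolerance:
            return (0, -correlation(reference_pattern, pattern))
        return (1, deviation)

    return [name for name, _, _ in sorted(candidates, key=key)]
```

    With the tolerance set to math.inf every candidate is ranked by correlation alone; with a tolerance of 0.0 the ranking is determined by the deviation from f.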

    To apply the spectral correlation method the noise level must first be defined. This is done using the contour plot command "Contour plot [cp]".

    Currently displayed peaks with the same assignment as the strip define the positions at which the weights Am are used to modify the peak patterns. The remaining parameters necessary to calculate the correlation coefficients are defined using the command "Define correlation [cd]". It copies the current strip list to the correlation strip list and calculates the correlation coefficients between pairs of strips. Again the current strip list can be modified afterwards without affecting the correlation method. To search for correlated strips the command "Compare correlated strips [cc]" is available. It asks for a reference strip and a frequency f to be selected and then displays a screen full of strips for comparison. The first strip displayed is the selected reference strip. The following ones come from the correlation strip list and are sorted by decreasing correlation to the pattern of the reference strip. To display the previous or next screen of correlated strips the commands "Forward comparison [fc]" and "Backward comparison [bc]" are used. Again the first displayed strip is the reference strip.

    The command "Sequential assignment [op]" uses the spectral correlation coefficients for determining the sequential assignment of a protein. The method combines simulated annealing with an algorithm described by R. Bernstein, C. Cieslar, A. Ross, H. Oschkinat, J. Freund and T. A. Holak in J. Biomol. NMR, 3 (1993) 245-251. Already known sequential assignments and assignments of spin systems to possible spin system types can be provided to the routine.

    3.8 Integrals

    The final step in spectra interpretation for the structure determination of proteins is the integration of the NOESY cross peaks. The peak integrals are, to a first approximation, proportional to rij^-6, where rij denotes the inter-proton distance between protons i and j. Because of this inverse sixth power relationship the volume accuracies are not critical - an error of 100% in volume leads to only a 12% change in inter-proton distance.
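
    The arithmetic behind this statement follows directly from inverting the sixth-power relation: a volume wrong by a factor of two changes the derived distance by a factor of 2^(1/6). The small Python sketch below, with a hypothetical function name, evaluates this.

```python
def distance_ratio(volume_ratio):
    """Factor by which the derived distance changes when the peak volume
    is off by volume_ratio, assuming volume proportional to r**-6."""
    return volume_ratio ** (-1.0 / 6.0)


# A volume that is too small by a factor of two gives a distance that is
# too long by 2**(1/6); a volume twice too large gives a distance too short
# by the inverse factor.
too_long = distance_ratio(0.5)
too_short = distance_ratio(2.0)
```

    Both factors differ from one by roughly 11 to 12%, which is the insensitivity to volume errors referred to above.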

    The problem remaining when trying to integrate cross peaks is overlap. Proteins of even moderate size will have upwards of several thousand cross-peaks many of which will overlap with others. If one of the overlapping components is weak the evaluation of its volume can be difficult - even getting the order of magnitude correct may be impossible by conventional techniques. In hetero-nuclear correlated 3D spectra the NOESY cross peaks are split up in a third dimension. This significantly reduces the problem of overlapping peaks and therefore integration.

    The approach used in XEASY for 2D spectra is line-shape integration. This method relies on the fact that peaks assigned to the same resonance have the same line-shape along the corresponding dimension. Distortions may, however, arise from zero quantum effects. Cross peaks which overlap are integrated by taking the w1 and w2 line-shapes for each peak and then deconvoluting the peak cluster with the line-shapes into volumes and volume errors. This technique was proposed by Denk, W., Baumann, R. & Wagner, G. (1986) J. Magn. Reson., 67, 386-390. The approach consists of three steps. First, reference line-shapes along the x and y directions are determined for each resonance in the 2D spectrum. Next, the peaks are grouped together into clusters of overlapping peaks. Here, two peaks are said to overlap if the rectangles defined by their line-shape extents intersect. Finally, the volume of each peak in the cluster is determined by a linear least squares fit of the peak shapes constructed from the reference line-shapes to the experimental data points in the spectrum.
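
    The grouping step can be sketched in Python with a union-find structure over rectangle intersections. The function names and the rectangle layout are hypothetical, and treating overlap as transitive (peak 1 and peak 3 end up in one cluster if both overlap peak 2) is an assumption of this sketch.

```python
def overlaps(a, b):
    """True if two rectangles (w1_low, w1_high, w2_low, w2_high) intersect."""
    return a[0] <= b[1] and b[0] <= a[1] and a[2] <= b[3] and b[2] <= a[3]


def cluster_peaks(extents):
    """Group peaks into clusters of (transitively) overlapping rectangles.

    extents: dict peak number -> rectangle of its line-shape extents in ppm
    Returns a sorted list of sorted lists of peak numbers.
    """
    parent = {peak: peak for peak in extents}

    def find(peak):
        # Follow parent links to the cluster representative.
        while parent[peak] != peak:
            parent[peak] = parent[parent[peak]]  # path halving
            peak = parent[peak]
        return peak

    peaks = sorted(extents)
    for i, first in enumerate(peaks):
        for second in peaks[i + 1:]:
            if overlaps(extents[first], extents[second]):
                parent[find(first)] = find(second)

    clusters = {}
    for peak in peaks:
        clusters.setdefault(find(peak), []).append(peak)
    return sorted(clusters.values())
```

    The pairwise comparison is quadratic in the number of peaks, which is acceptable for a sketch; a production implementation would more likely sort the rectangles along one axis first.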

    In mathematical terms the problem is one of adjusting the volumes Vp to minimize the following expression:

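    $$\sum_{w_1, w_2} \Bigl[ S(w_1, w_2) - \sum_{p=1}^{m} V_p \, L_1^{p}(w_1) \, L_2^{p}(w_2) \Bigr]^2 \qquad \text{(3.8.A)}$$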

    where S(w1,w2) is the spectral intensity at coordinate (w1,w2), Vp is the volume of peak p, m is the number of peaks in the cluster and Li is the reference line-shape for the wi resonance of peak p (i = 1, 2). This linear least-squares problem is solved using standard methods (Press, W.H., Flannery, B.P., Teukolsky, S.A., Vetterling, W.T. (1986) Numerical Recipes, the Art of Scientific Computing. Cambridge University Press). An uncertainty can be obtained for each peak volume by calculating the square root of the above function over the peak region (not the total cluster). This can then be expressed as a percentage of the calculated volume.
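
    A minimal sketch of such a fit is given below. It assumes that each peak is modelled as its volume times the product of its w1 and w2 reference line-shapes, with line-shapes normalized to unit sum, and it solves the normal equations by plain Gaussian elimination. The grid, the line-shapes and all function names are invented for illustration and do not reproduce the XEASY code.

```python
def peak_shape(l1, l2):
    """Unit-volume 2D peak: outer product of the w1 and w2 line-shapes."""
    return [[a * b for b in l2] for a in l1]


def solve(a, b):
    """Solve a small dense linear system (Gaussian elimination, pivoting)."""
    n = len(b)
    m = [row[:] + [rhs] for row, rhs in zip(a, b)]
    for col in range(n):
        pivot = max(range(col, n), key=lambda r: abs(m[r][col]))
        m[col], m[pivot] = m[pivot], m[col]
        for r in range(col + 1, n):
            f = m[r][col] / m[col][col]
            for c in range(col, n + 1):
                m[r][c] -= f * m[col][c]
    x = [0.0] * n
    for r in range(n - 1, -1, -1):
        s = sum(m[r][c] * x[c] for c in range(r + 1, n))
        x[r] = (m[r][n] - s) / m[r][r]
    return x


def fit_volumes(spectrum, shapes):
    """Volumes V_p minimizing sum over points of (S - sum_p V_p * shape_p)**2."""
    def dot(u, v):
        return sum(x * y for ru, rv in zip(u, v) for x, y in zip(ru, rv))

    normal = [[dot(p, q) for q in shapes] for p in shapes]
    rhs = [dot(spectrum, p) for p in shapes]
    return solve(normal, rhs)


# Two overlapping peaks on a 5 x 5 grid; the second is ten times weaker.
shape_a = peak_shape([0.2, 0.6, 0.2, 0.0, 0.0], [0.0, 0.25, 0.5, 0.25, 0.0])
shape_b = peak_shape([0.0, 0.2, 0.6, 0.2, 0.0], [0.0, 0.0, 0.25, 0.5, 0.25])
true_volumes = [100.0, 10.0]
spectrum = [[true_volumes[0] * a + true_volumes[1] * b
             for a, b in zip(ra, rb)] for ra, rb in zip(shape_a, shape_b)]

fitted = fit_volumes(spectrum, [shape_a, shape_b])
print(fitted)
```

    With noise-free synthetic data the fit recovers both volumes, including the weak component that sits under the shoulder of the strong one.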

    For line shape integration, the command "Line-shape integration [li]" is used. Before it is applied, the lineshapes for each atom have to be defined and edited with the following commands

    Lineshapes are saved into a file with extension ".ref". An example is given below:
    # Number of dimensions 2
    	1 -9999	0.000	0.000 -9999	0.000	0.000 
    	2	774	4.211	4.266	155	4.206	4.258 
    	3 -9999	0.000	0.000 -9999	0.000	0.000 
    	4 -9999	0.000	0.000 -9999	0.000	0.000 
    	5	155	3.057	3.167  1452	3.063	3.153 
    
    The first line gives the dimensionality of the reference line-shape list. Each of the following lines defines, for one atom, the position in each dimension where the line-shape can be read out from the spectrum. The first number on each line is the atom number. The rest of the line is subdivided into one group of three numbers per dimension: the first is the number of the reference peak, and the second and third give the lower and upper bounds of the line-shape in ppm. The entry for atom 2 is illustrated in Figure 3.8.A. The peak number "-9999" means that the w1 or w2 line-shape is not defined. The commands to load or save a line-shape list are "Load reference list [lr]" and "Write reference list [wr]".

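    A reader for this file layout might look like the sketch below. It assumes whitespace-separated fields exactly as in the example above (a real ".ref" file may differ in details not shown there), and the function and constant names are hypothetical.

```python
UNDEFINED_PEAK = -9999  # marks a dimension without a defined line-shape


def parse_reference_list(text):
    """Parse a reference line-shape list.

    Returns {atom_number: [entry per dimension]}, where an entry is
    (reference_peak, lower_ppm, upper_ppm) or None if undefined.
    """
    lines = [ln.strip() for ln in text.splitlines() if ln.strip()]
    # Header line, e.g. "# Number of dimensions 2"
    ndim = int(lines[0].split()[-1])
    result = {}
    for line in lines[1:]:
        fields = line.split()
        atom = int(fields[0])
        dims = []
        for d in range(ndim):
            peak, low, high = fields[1 + 3 * d: 4 + 3 * d]
            if int(peak) == UNDEFINED_PEAK:
                dims.append(None)
            else:
                dims.append((int(peak), float(low), float(high)))
        result[atom] = dims
    return result


example = """# Number of dimensions 2
    1 -9999  0.000  0.000 -9999  0.000  0.000
    2   774  4.211  4.266   155  4.206  4.258
    5   155  3.057  3.167  1452  3.063  3.153
"""
refs = parse_reference_list(example)
print(refs[2])
```

    For atom 2 this yields reference peak 774 with bounds 4.211 to 4.266 ppm in the first dimension and reference peak 155 with bounds 4.206 to 4.258 ppm in the second.
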
    The formats for the line-shape lists of the programs EASY and EASY3D are different. In EASY and EASY3D a reference atom, rather than a reference peak, was specified for each dimension. Since that format is difficult to adapt to higher-dimensional spectra, it has been dropped. The old format can still be read by XEASY, but only the new format is written out.

    Two additional commands exist to check the line-shapes. The command "Check ref. line-shape list [cr]" checks which line-shapes are defined and whether each line-shape position is consistent with the atom frequency. The command "View reference peak [vr]" displays, for each peak with the same assignment as the selected one, a 1D cross section in the slice window.

    In 3D spectra, where overlap is a smaller problem, it suffices to partition the spectral data points into areas that belong to a given peak. The data points in each area can then be summed to give the peak volume. The external program peakint (described in section 15.12 on page 106) integrates peaks by partitioning. These two automated approaches have to be complemented by interactive methods in which the region to integrate can be specified in the spectrum. For interactive peak integration, the commands "Interactive integration param. [ip]" and "Interactive integration [ii]" can be used.

    A different problem occurs when determining rate constants by evaluating a time series of spectra. It is crucial that the same regions are integrated in each spectrum of the series. This can be achieved by defining the regions to integrate as rectangles (refer to section 3.9 on page 45), which can be saved and loaded again. The integration is performed with the command "Integrate rectangular regions [ir]".
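
    The idea of reusing one region for every time point can be sketched as follows. For simplicity the rectangle is given here in data-point indices rather than in ppm, and the spectra are tiny invented grids; none of the names correspond to XEASY internals.

```python
def integrate_rectangle(spectrum, rect):
    """Sum intensities inside rect = (row_lo, row_hi, col_lo, col_hi), inclusive."""
    r0, r1, c0, c1 = rect
    return sum(sum(row[c0:c1 + 1]) for row in spectrum[r0:r1 + 1])


# Hypothetical time series: the same peak decaying over three spectra.
series = [
    [[0, 0, 0, 0], [0, 8, 4, 0], [0, 4, 2, 0], [0, 0, 0, 1]],
    [[0, 0, 0, 0], [0, 4, 2, 0], [0, 2, 1, 0], [0, 0, 0, 1]],
    [[0, 0, 0, 0], [0, 2, 1, 0], [0, 1, 0, 0], [0, 0, 0, 1]],
]

# One rectangle, defined once and reused unchanged for every time point.
region = (1, 2, 1, 2)
series_volumes = [integrate_rectangle(s, region) for s in series]
print(series_volumes)
```

    Because the region is identical for all spectra, the resulting volumes are directly comparable and can be passed on to a rate-constant fit.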

    Since NOESY spectral interpretation is an iterative process, in which cross peak assignment, cross peak integration and structure calculation steps are repeated, efficient bookkeeping is crucial. For each peak, XEASY stores the integration method, the volume integral, and the error of the volume integral. The peak color can be used to reflect the state of the integration. In this way peaks with valid line-shapes can be distinguished from peaks for which line-shape information is missing, or from peaks that were integrated interactively. Peaks can also be colored according to their integration method. To control the desired display mode, the default window and the "Check ref. line-shape list [cr]" command are available.

    So that line-shape integration can be applied selectively, peaks that were integrated interactively are not changed by line-shape integration. This allows the user to apply the line-shape integration routines to the whole spectrum without losing the information of the interactively integrated peaks. To change the volume of an interactively integrated peak by line-shape integration, its integration method must first be set back to "-" (for example by using the "Set selected peak entries [pe]" command).

    3.9 Geometries

    A basic feature necessary for working with spectra is the ability to draw lines and rectangles (i.e. geometries) to mark features in the spectrum. Correspondingly, a number of drawing commands exist in XEASY. The [ds] command draws lines from each selected peak to the diagonal. When applied to the TOCSY or COSY peaks of one fragment, this corresponds to drawing spin system connectivities. The [df] command draws lines at selected frequencies. It can be used to identify a spin system in a TOCSY spectrum: clicking on all the peaks of a TOCSY tower produces a grid identifying all the peaks belonging to one spin system.

    Geometries can be removed with the commands "Remove lines [rl]" and "Remove rectangles [rr]". To remove all lines or rectangles at once, these two commands can be combined with the "Apply to all ... [aa]" modifier command.

    Geometries are saved in a file with the extension ".geom". The coordinates of the lines and rectangles are stored in ppm. A flag denotes whether the geometry is a line or a rectangle: "0" marks a line, "1" marks a rectangle. The commands to read or write a geometry file are "Load geometries [lg]" and "Write geometries [wg]".



    -- DavidCowburn - 15 Jun 2005

     
    Copyright © by the contributing authors. All material on this collaboration platform is the property of the contributing authors.