protomo tutorial
version 2.2

Hanspeter Winkler

 

1 Getting started

1.1 Sample data set

This tutorial uses a small sample data set of insect flight muscle (IFM). The specimen is a section through a sarcomere that was stained and plastic-embedded. The section shows a myac layer with alternating thick filaments (myosin) and thin filaments (actin) that run vertically in the images. The tilt series was acquired on a CM300 electron microscope at a magnification of 19500, using the Saxton scheme. The initial angular increment at the untilted state is 4° and the maximal tilt angles are ±61°. There are 39 images in this tilt series.
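The numbers quoted above can be cross-checked with a short sketch. It assumes the common formulation of the Saxton scheme, in which each angular step equals the initial increment multiplied by the cosine of the current tilt angle; the function name saxton_angles is hypothetical and is not part of protomo. The acquisition software may have applied a different rounding or stopping rule.

```python
import math

def saxton_angles(initial_increment_deg, max_tilt_deg):
    """Return tilt angles in degrees, assuming each step is the initial
    increment scaled by the cosine of the current tilt angle."""
    positive = [0.0]
    while True:
        current = positive[-1]
        step = initial_increment_deg * math.cos(math.radians(current))
        if current + step > max_tilt_deg:
            break
        positive.append(current + step)
    # Mirror the positive branch to negative tilts; 0 degrees appears once.
    negative = [-a for a in reversed(positive[1:])]
    return negative + positive

angles = saxton_angles(4.0, 61.0)
print(len(angles), round(angles[0], 2), round(angles[-1], 2))
```

Under this assumption, an initial increment of 4° and a limit of 61° give 39 angles, which agrees with the size of the sample tilt series.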

1.2 Initial setup

Unpack the data by typing the following command at the shell prompt:
tar -xjf protomo-tutorial-2.2.x.tar.bz2
where x is the current release number. This will create a new subdirectory in the current working directory and in that subdirectory you will find all the required files and data. The raw input data is stored in the subdirectory “raw”. The results produced by this tutorial can be found in the “results” subdirectory. First we set the tutorial directory as our working directory and we create two subdirectories in it:
cd protomo-tutorial-2.2.x
mkdir cache out
The “cache” directory is used by the software to store intermediate files and should preferably reside on a local filesystem, and not on a networked filesystem, for efficient disk input/output. Files with the suffix “.i3c” are cache files and can be deleted anytime. They will be automatically regenerated if they are needed again. The “out” directory receives output files for diagnostic purposes. These files are not needed in subsequent alignment steps and can also be deleted anytime.
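Since cache files can be deleted at any time, the cleanup can be scripted. The following is a minimal sketch, assuming the cache files reside directly in the “cache” directory; the helper remove_cache_files is hypothetical and not part of protomo.

```python
from pathlib import Path

def remove_cache_files(cache_dir="cache", suffix=".i3c"):
    """Delete files with the given suffix in cache_dir and return
    the names of the files that were removed."""
    removed = []
    for path in sorted(Path(cache_dir).glob("*" + suffix)):
        if path.is_file():
            path.unlink()
            removed.append(path.name)
    return removed
```

Calling remove_cache_files() from the tutorial directory removes only the “.i3c” files; other files in the directory are left untouched.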

2 Parameter files

2.1 Tilt geometry

The tilt geometry is specified in a formatted text file. The file name usually has the suffix “.tlt”. The file can be created with a text editor; here we use a shell script that takes a simple list as input. Each line of this list contains the following information (see file “max.dat”):
column   description
1        file name prefix
2        tilt azimuth
3        tilt angle
4, 5     x and y coordinates of common origin
6        in-plane rotation
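The column layout above can be illustrated with a small parser. This is only a sketch: it assumes that the columns are separated by whitespace, and the names TiltEntry and parse_list_line, as well as the sample values, are hypothetical. The file “max.dat” remains the authoritative example of the format.

```python
from collections import namedtuple

# One record per line of the input list, in column order.
TiltEntry = namedtuple(
    "TiltEntry", "prefix azimuth tilt origin_x origin_y rotation")

def parse_list_line(line):
    """Split one whitespace-separated line into its six columns."""
    fields = line.split()
    if len(fields) != 6:
        raise ValueError("expected 6 columns, got %d" % len(fields))
    # Column 1 is text; columns 2 to 6 are numbers.
    return TiltEntry(fields[0], *(float(v) for v in fields[1:]))

# Hypothetical line, not taken from max.dat.
entry = parse_list_line("ifm101 90.0 -60.98 512.0 512.0 0.0")
```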

The tilt geometry file is generated with the command:

createtlt.sh max.dat max 101 >max.tlt
where the first parameter is the file name, the second one an identification string (a name for the tilt series), and the third one the starting number for numbering the images in the tilt series. The output of this script is redirected to a file named “max.tlt”.

2.2 Processing parameters

The files that specify all the necessary parameters for processing a data set are also plain text files. In such a file, the relevant parameters for tilt series alignment are contained in a section called “tiltseries” (see for instance the file “max.param”). The format of this file and the meaning of the parameters are described in more detail in the user’s guide. In our example, all parameters were assigned reasonable values, so the tutorial will work “out of the box”.

3 Coarse alignment

Images in the tilt series can be aligned manually if the image acquisition software did not keep the region of interest centered in the field of view. The graphical user interface for this task is started with the command below; error messages of the form “(tomoalign-gui:9999) Gtk-CRITICAL **” can be disregarded:
tomoalign-gui -tlt max.tlt max.param
Since this is the very first time we read the raw images, and we have selected preprocessing options in the parameter file, a cache file with preprocessed images is created (“cache/max_pre.i3c”). Since we have also selected binning with a sampling factor of 2, a second cache file with the binned images is created (“cache/max_pre_smp2.i3c”). In all subsequent operations, data is read directly from the cache files, unless they have been deleted, in which case they are regenerated automatically. The alignment parameters (transformation matrices and origins) are stored in a separate file (“max.i3t”), which is also created automatically. If tomoalign-gui is run a second time, this file is read instead of “max.tlt”, and the option -tlt is no longer needed. If we do not want to do an alignment at this point, the tilt series can also be set up and initialized with the command line program “tomoinit”, which takes the same arguments as “tomoalign-gui”.
We first increase the area to be aligned by zooming out. Select “View  →  zoom out” on the top menu bar. For manual alignment, left-click on the image and drag it while keeping the mouse button down. The reference image is displayed in red, and the image being aligned in green. When the two images are aligned, the addition of the two colors produces a grey-level image. A single image can also be aligned automatically by pressing the “align” button. The “reset” button returns to the initial state for the displayed image.
In our case, the tilt series is relatively well aligned, so we can use the automated function for translational alignment of the whole tilt series. We start the alignment process by choosing “Actions  →  align all”. When the alignment has finished (the “stop” button, to the right of the “reset” button, changes to “align” again), we notice that we still see the red-colored reference near the right edge of the specimen. This is because perfect alignment cannot be achieved by a simple translational alignment, and we need the area matching technique described in the following section. We can check the alignment quality with an animated display. First, switch to single image display with the menu option “View  →  image”, then choose “Actions  →  show movie”. Smooth transitions between images indicate good alignment; skips indicate misalignment. We now close the graphical interface and save the result. It is advisable to make a backup copy of the alignments:
cp -p max.i3t maxsaved.i3t

4 Alignment by area matching

4.1 Setting up the alignment

The following tasks are carried out in the Sparx environment. The command prompt in Sparx is denoted here as “>>>”, but may be different depending on how it was configured. First, we need to load the Python tomography extension module:
sparx
>>> import protomo
If we haven’t carried out a coarse alignment, we first need to create a new instance of a geometry object from the data in the text file:
>>> maxgeom = protomo.geom( "max.tlt" )
Then we also need a new instance of a parameter object:
>>> maxparam = protomo.param( "max.param" )
With these two objects, we create a new instance of a tilt series object:
>>> max = protomo.series( maxparam, maxgeom )
In our case, this command will fail with the error “file already exists”, because we ran the coarse alignment and the geometry is already defined, i.e. the file “max.i3t”, which contains all the alignment parameters, has already been created by “tomoalign-gui”. The name of this file is generated by default from the prefix of the parameter file name, “max” in our case. This prefix is also used subsequently when file names are generated for output files. It can be changed by specifying an additional string argument in the protomo.series method.
To create the tilt series object when the series has already been initialized, we simply leave out the second argument and create the object in this way:
>>> max = protomo.series( maxparam )
The geometric parameters are now retrieved from the file “max.i3t”, which keeps track of the alignment parameters. If we quit a Sparx session and leave this file unchanged, we can resume the alignment in a new session at the point where we left off.
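Whether the geometry argument is needed can be decided by testing for the alignment file before the tilt series object is created. The sketch below assumes the default naming described above, i.e. the prefix of the parameter file name followed by “.i3t”; the helper names alignment_file and is_initialized are hypothetical and not part of protomo.

```python
from pathlib import Path

def alignment_file(param_file, directory="."):
    """Return the default alignment file path for a parameter file,
    e.g. max.param -> max.i3t (assumed default naming)."""
    prefix = Path(param_file).stem
    return Path(directory) / (prefix + ".i3t")

def is_initialized(param_file, directory="."):
    """True if the alignment file already exists, in which case the
    geometry argument must be left out when creating the series."""
    return alignment_file(param_file, directory).is_file()
```

In a Sparx session, the result could be used to choose between protomo.series( maxparam ) and protomo.series( maxparam, maxgeom ).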
Before we start the alignment we want to check whether the mask and filter parameters have reasonable values. For this purpose we extract a window from the untilted image (number 120) with the parameters loaded previously and we display it:
>>> zero = max.image( 120 )
>>> zero.display()
The transform of this image is generated and displayed as follows:
>>> zero = max.transform( 120 )
>>> zero.display()
We need to adjust the contrast so that we can see the applied band-pass filter by clicking on the menu items “Image  →  Histogram”. In the histogram window, we decrease the upper limit until it nearly reaches the lower limit. In a similar way we can inspect the filtered image:
>>> zero = max.filter( 120 )
>>> zero.display()

4.2 The first alignment cycle

We are now ready to start the alignment by calling the align method of our tilt series object:
>>> max.align()
It calculates and stores a correction to the initially supplied geometry for each image by matching equivalent image areas in the tilt series. Since we have selected the “logging” option in our parameter object “maxparam”, a line with translational shifts, correction values and the cross-correlation coefficient is written to the terminal for each image. The alignment terminates prematurely at images 109 and 131, because the maximal correction of 4% specified in the parameter file (maxcorrection: 0.04) is exceeded for these images.
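The effect of such a limit can be illustrated with a short sketch. It assumes that the limit is compared with the deviation of a correction factor from 1; how protomo actually evaluates “maxcorrection” is described in the user’s guide and may differ. The function name and the numbers in the example are hypothetical.

```python
def images_exceeding_limit(factors, maxcorrection=0.04):
    """Return the image numbers whose correction factor deviates
    from 1 by more than maxcorrection (assumed criterion).

    factors maps image numbers to correction factors."""
    return sorted(number for number, factor in factors.items()
                  if abs(factor - 1.0) > maxcorrection)

# Hypothetical correction factors, not taken from the tutorial data.
example = {109: 1.05, 110: 1.01, 120: 1.0, 131: 0.95}
flagged = images_exceeding_limit(example)
```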
In an interactive session we can display a plot of the correction factors for the aligned images as follows:
>>> max.plot()
Another option is to write the factors to a file for subsequent plotting. If the file name is omitted, it is generated from the tilt series prefix by appending the cycle number, which is “00” for the first cycle. The two commands below are equivalent:
>>> max.corr()
>>> max.corr( "out/max00.corr" )
The output is a simple text file containing numbers that can be plotted with the program of your choice. There is a sample script “plot.sh” which produces PostScript files from this output (this needs to be run from the shell prompt):
plot.sh out/max00.corr
The file with the correction factors generated by the script is called “max00_cof.ps”. Additional files with the direction of stretching/compression indicated by the correction factors (max00_coa.ps) and the in-plane rotations (max00_rot.ps) are also generated. The angles are measured anti-clockwise from the x-axis, in degrees. The files are located in the subdirectory “out”.
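The file names mentioned so far follow one pattern: the tilt series prefix, the cycle number, and a suffix. A minimal sketch of this pattern is given below; it assumes a two-digit, zero-padded cycle number as in “00”, and the helper cycle_output_files is hypothetical and not part of protomo.

```python
def cycle_output_files(prefix, cycle, outdir="out"):
    """Return the expected output file names for one alignment cycle,
    assuming the pattern <outdir>/<prefix><cycle, two digits><suffix>."""
    stem = "%s/%s%02d" % (outdir, prefix, cycle)
    return {
        "corr": stem + ".corr",    # correction factors (text)
        "cof": stem + "_cof.ps",   # plot of the correction factors
        "coa": stem + "_coa.ps",   # direction of stretching/compression
        "rot": stem + "_rot.ps",   # in-plane rotations
    }

files = cycle_output_files("max", 0)
```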
We also requested the output of correlation peak images of size 128 × 128 pixels. These were written into the “out” subdirectory as a 3D stack of images (file “max00_cor.img”). The z-coordinate of each section corresponds to the sequence number of the image. The sequence number is the number enclosed in square brackets that appears in the terminal log output. It always starts at 0 and is not necessarily the same as the image number. Note that there is no correlation peak for image 120, sequence number [19], because this is the reference image, which does not need to be aligned. We can display the correlation peaks after reading the file (scroll in the z-direction with the up- and down-arrow keys):
>>> cor = protomo.image( "out/max00_cor.img" )
>>> cor.display()
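The relation between image numbers and sequence numbers in this tutorial can be sketched as follows. The sketch assumes that the images are numbered consecutively from the starting number 101 passed to createtlt.sh and that no image has been excluded; in general the two numbers need not be related this simply. The function names are hypothetical.

```python
def sequence_number(image_number, first_number=101):
    """Zero-based position of an image in a consecutively numbered
    tilt series (assumes no gaps or excluded images)."""
    return image_number - first_number

def image_number(sequence, first_number=101):
    """Inverse mapping, under the same assumptions."""
    return sequence + first_number

reference_section = sequence_number(120)
```

With these assumptions, image 120 maps to section 19 of the stack, in agreement with the sequence number [19] in the terminal log.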
To prepare for the next cycle, we re-evaluate the tilt geometry using the corrections obtained by area matching. The align command computed a shift vector and four correction parameters (a 2 × 2 matrix) for each image, which were recorded internally in the file “max.i3t”. From these, new geometric parameters are calculated by a least squares fit:
>>> max.fit()
The fitting could be executed multiple times with different parameters, if desired. The result of the most recent fit, however, will only be stored permanently if the tilt series object is updated, replacing the old with the new geometric parameters:
>>> max.update()

4.3 More alignment cycles

We use the same parameters for the next few cycles and write the correction data to a file for later examination:
>>> max.align()
>>> max.corr()
>>> max.plot()
The second cycle completes successfully, and the plot shows that the deviations of the correction factors are an order of magnitude smaller. Ideally, these factors should all be 1. We recompute the geometry again:
>>> max.fit()
>>> max.update()
and run another cycle:
>>> max.align()
>>> max.corr()
>>> max.fit()
>>> max.update()
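Since every cycle repeats the same sequence of calls, it can be wrapped in a small helper. The sketch below uses only the methods shown in this tutorial (align, corr, plot, fit, update); the function run_cycle itself is hypothetical and not part of protomo.

```python
def run_cycle(series, plot=False):
    """Run one alignment cycle on a tilt series object."""
    series.align()       # area matching with the current geometry
    series.corr()        # write correction factors to the default file
    if plot:
        series.plot()    # interactive plot of the correction factors
    series.fit()         # least squares fit of new geometric parameters
    series.update()      # store the fitted geometry permanently
```

In a Sparx session this would be called as run_cycle( max ). Note that the helper always fits and updates, so it is not suited for cycles in which the fit is to be repeated with different parameters before updating.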
There was no substantial improvement of the corrections, so we want to compute a preliminary backprojection map to check the thickness of the specimen:
>>> img = max.map()
>>> img.display()
When displaying the map we find that sections in the range −14 ≤ z ≤ 19 show the most contrast. The estimated thickness is then 19 − (−14) + 1 = 34 pixels at a sampling factor of 2, and we use this number for the next cycle (see parameter file max03.param).
The “map” method created a new image object in memory, which will persist as long as the current Sparx session is active and it will be deleted when we quit Sparx. Since we do not need this image anymore after viewing, we delete it to reclaim the used memory:
>>> del img
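The thickness estimate from the map can be written out as a short calculation. The sketch counts the sections in an inclusive z-range and converts a thickness in pixels from one sampling factor to another; both function names are hypothetical.

```python
def thickness_from_z_range(z_min, z_max):
    """Number of sections in the inclusive range z_min..z_max."""
    return z_max - z_min + 1

def rescale_thickness(thickness, old_sampling, new_sampling):
    """Convert a thickness in pixels between sampling factors."""
    return thickness * old_sampling // new_sampling

thickness = thickness_from_z_range(-14, 19)
full_resolution = rescale_thickness(thickness, 2, 1)
```

For the range found above this gives 34 pixels at a sampling factor of 2, which corresponds to 68 pixels at a sampling factor of 1.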

4.4 Maximal usable area

We now determine the maximal usable area in the tilt series. The “area” method generates a superposition of binary images, one for each projection, which are extracted with the stored geometric parameters. Pixels that lie within the raw image are assigned a value of 1, those outside a value of 0. In the superposition, the pixel value is a count of the images that contributed to the particular pixel. When displayed and zoomed out, the white area, or more precisely the area with the highest density value, represents the equivalent projected specimen areas that are common to all members of the tilt series.
>>> area = max.area()
>>> area.display()
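The counting principle behind the superposition can be sketched in plain Python. This is a simplified illustration, not the protomo implementation: the image footprints are assumed to be axis-aligned rectangles, whereas the real footprints are transformed with the stored geometric parameters. All names and numbers are hypothetical.

```python
def coverage_count(width, height, footprints):
    """Count, for every pixel of a width x height canvas, how many
    footprints cover it. Each footprint is a rectangle
    (x0, y0, x1, y1) with x1 and y1 exclusive."""
    counts = [[0] * width for _ in range(height)]
    for x0, y0, x1, y1 in footprints:
        for y in range(max(y0, 0), min(y1, height)):
            for x in range(max(x0, 0), min(x1, width)):
                counts[y][x] += 1
    return counts

def common_pixels(counts, n_images):
    """Return the (x, y) pixels covered by all n_images footprints."""
    return [(x, y)
            for y, row in enumerate(counts)
            for x, value in enumerate(row)
            if value == n_images]

# Hypothetical example: three footprints drifting to the right.
footprints = [(0, 0, 6, 4), (1, 0, 7, 4), (2, 0, 8, 4)]
counts = coverage_count(8, 4, footprints)
common = common_pixels(counts, len(footprints))
```

In this example the common area is limited horizontally but not vertically, which is the pattern expected for a horizontal drift.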
We can see in the resulting image that the individual projected regions overlap better vertically than horizontally, which suggests a horizontal drift during data collection. The displayed image should be zoomed out and thresholded just below the density 39, which is the number of images in the tilt series. The terminal output indicates that the maximal area has a size of 944 × 943 pixels, which is the rectangle that can be fit in the common area. It is slightly off-center by 23 pixels on the x-axis and -22 pixels on the y-axis. (The reported numbers may vary slightly depending on the versions of third-party libraries used.) We choose a slightly smaller area of 936 × 936 pixels. The number 936 is a product of small prime numbers (936 = 2³ × 3² × 13), for which Fourier transform computation is more efficient.
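The choice of the window size can be checked by factorizing the candidate sizes. The following sketch uses simple trial division; the function prime_factors is a hypothetical helper and not part of protomo.

```python
def prime_factors(n):
    """Return the prime factors of n in ascending order."""
    factors = []
    divisor = 2
    while divisor * divisor <= n:
        while n % divisor == 0:
            factors.append(divisor)
            n //= divisor
        divisor += 1
    if n > 1:
        factors.append(n)  # remaining prime factor
    return factors

for size in (944, 943, 936):
    print(size, prime_factors(size))
```

The reported sizes 944 and 943 contain the prime factors 59, and 23 and 41, respectively, whereas 936 has no prime factor larger than 13.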
A new alignment cycle is carried out with the increased area and the adjusted thickness. We first read a new parameter set from a text file, replace the old with the new parameters in the tilt series object and start a new alignment cycle:
>>> maxparam = protomo.param( "max03.param" )
>>> max.param( maxparam )
>>> max.align()
>>> max.corr()
>>> max.fit()
>>> max.update()

4.5 Final cycle

We set the sampling factor to 1, adjust the window size to 1800 × 1800, and double the thickness for the final alignment at full resolution. The parameters are again read from a text file:
>>> maxparam = protomo.param( "max04.param" )
>>> max.param( maxparam )
>>> max.align()
>>> max.corr()
>>> max.fit()
>>> max.update()
The final alignment parameters can be exported to a text file as follows:
>>> newgeom = max.geom()
>>> newgeom.write( "max04.tlt" )

5 Backprojection map

The x-y size of the final map could be set to 1800 × 1800 (the window size used for the final alignment) but we reduce it for this tutorial to speed up the map computation:
>>> max.setparam( "map.size", "{ 512 256 96 }" )
>>> max.mapfile()
We used the “mapfile” method, which writes the map directly to a disk file instead of storing it in memory as we did previously, because for large maps the required memory may exceed the available resources. If no file name is specified, the map is written to the file “out/max05_bck.img”.
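The memory argument can be made concrete with a rough estimate. The sketch assumes 4 bytes per voxel (single-precision values), which is an assumption and not taken from the protomo documentation; the function map_bytes is hypothetical, and the full-size example simply reuses the z-size of 96 for comparison.

```python
def map_bytes(nx, ny, nz, bytes_per_voxel=4):
    """Approximate size of an nx x ny x nz map in bytes,
    assuming a fixed number of bytes per voxel."""
    return nx * ny * nz * bytes_per_voxel

reduced = map_bytes(512, 256, 96)      # map size used in this tutorial
full = map_bytes(1800, 1800, 96)       # full window, same z-size assumed
print(reduced / 2**20, "MiB")
print(round(full / 2**30, 2), "GiB")
```

Under these assumptions the reduced map occupies 48 MiB, while a map covering the full 1800 × 1800 window would require more than 1 GiB.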