Segment
Program to extract overlapping segments from micrographs
Parameters
Parameter | Example (default) | Description |
---|---|---|
Micrographs | cs_scan034.tif | Input micrographs: accepted file formats (.tif, .mrc, .mrcs, .spi, .hdf, .img, .hed). |
Image output stack | protein_stack.hdf | Output stack: accepted file formats (hdf). |
Segment coordinates | scan034_boxes.txt | Input: coordinate file with the same name as the corresponding micrograph (accepted formats: EMAN’s Helixboxer/Boxer, EMAN2’s E2helixboxer and Bsoft filament parameter coordinates: .box, .txt, .star, .db). When using frame processing, specify a previously generated spring.db to provide the coordinates. Make sure that helix paths are continuous. A helix path can follow a C- or S-path but must NOT form a U-turn. |
Segment size in Angstrom | 700 | Molecular mass (i.e. signal) increases with segment size and helix defects become more pronounced. Final image size = segment size + step size (accepted values min=100, max=1500). |
Estimated helix width in Angstrom | 200 | Generous width measure of helix required for rectangular mask (accepted values min=0, max=1500). |
Step size of segmentation in Angstrom | 70 | Overlapping segments are related views according to helical symmetry, i.e. the step size should be a multiple of the helical rise (a step size of 0 corresponds to one central box per helix) (accepted values min=0, max=2000). |
Pixel size in Angstrom | 1.163 | Pixel size is an imaging parameter (accepted values min=0.001, max=100). |
Invert option | True | Inversion of image densities for cryo data, i.e. protein becomes white. |
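As a rough illustration of how the Angstrom-valued parameters above translate into image dimensions, the following sketch converts them to pixels using the pixel size. The helper name `angstrom_to_pixel` and the rounding to the nearest integer are assumptions made for illustration; Spring may round or pad differently.

```python
def angstrom_to_pixel(length_in_angstrom, pixel_size_in_angstrom):
    """Convert a length in Angstrom to whole pixels (nearest-integer rounding is an assumption)."""
    return int(round(length_in_angstrom / pixel_size_in_angstrom))


segment_size = 700.0   # Segment size in Angstrom
step_size = 70.0       # Step size of segmentation in Angstrom
helix_width = 200.0    # Estimated helix width in Angstrom
pixel_size = 1.163     # Pixel size in Angstrom

# Final image size = segment size + step size, as stated in the parameter description
final_image_size = segment_size + step_size

final_image_pixels = angstrom_to_pixel(final_image_size, pixel_size)
helix_width_pixels = angstrom_to_pixel(helix_width, pixel_size)
step_pixels = angstrom_to_pixel(step_size, pixel_size)

print(final_image_pixels, helix_width_pixels, step_pixels)
```

With the default values, the windowed image is several hundred pixels wide, which is one reason the binning option can be useful for early processing steps.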
Sample parameter file
You may run the program from the command line by providing the parameters via a text file:
segment --f parameterfile.txt
Where the format of the parameters is:
Micrographs = cs_scan034.tif
Image output stack = protein_stack.hdf
Segment coordinates = scan034_boxes.txt
Segment size in Angstrom = 700
Estimated helix width in Angstrom = 200
Step size of segmentation in Angstrom = 70
Pixel size in Angstrom = 1.163
Invert option = True
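For scripted runs it can be convenient to generate the parameter file programmatically. The sketch below formats the `name = value` lines shown above and assembles the `segment --f parameterfile.txt` call. The helper names `format_parameter_file` and `build_segment_command` are hypothetical and not part of Spring.

```python
def format_parameter_file(parameters):
    """Return the parameter file text, one 'name = value' line per entry."""
    return "".join(f"{name} = {value}\n" for name, value in parameters.items())


def build_segment_command(filename):
    """Return the command line that reads parameters from the given file."""
    return ["segment", "--f", filename]


# Values copied from the sample parameter file above
parameters = {
    "Micrographs": "cs_scan034.tif",
    "Image output stack": "protein_stack.hdf",
    "Segment coordinates": "scan034_boxes.txt",
    "Segment size in Angstrom": 700,
    "Estimated helix width in Angstrom": 200,
    "Step size of segmentation in Angstrom": 70,
    "Pixel size in Angstrom": 1.163,
    "Invert option": True,
}

text = format_parameter_file(parameters)
command = build_segment_command("parameterfile.txt")
print(text)
print(" ".join(command))
```

To actually launch the program, the text would be written to `parameterfile.txt` and the command passed to `subprocess.run(command, check=True)`, which requires a working Spring installation.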
Additional parameters (intermediate level)
Parameter | Example (default) | Description |
---|---|---|
Spring database option | True | If checked, a previous spring.db (SQLite-compatible database) is read; otherwise a new one is created. |
spring.db file | spring.db | Program requires a previously generated spring.db and writes an updated spring.db database in the working directory. |
Perturb step option | False | Perturb the segmentation step between the windowed segments. Takes the specified step size and applies a random shift along the helix between +/- stepsize // 2. This is useful to avoid artifacts in the Fourier transforms of class averages. |
CTF correct option | True | Segments are CTF corrected with determined CTF parameters. |
CTFFIND or CTFTILT | ctftilt | Choose whether ‘ctffind’ or ‘ctftilt’ values are used for CTF correction. |
convolve or phase-flip | convolve | Choose whether to ‘convolve’ or ‘phase-flip’ images with determined CTF. |
Binning option | False | Segments are reduced in size by binning. |
Binning factor | 6 | Segments are reduced in size by binning factor (accepted values min=1, max=20). |
Normalization option | True | Segments are normalized to a mean of 0 and a standard deviation of 1. |
Row normalization option | False | Option to normalize micrographs row by row to eliminate artifacts as they occur in Falcon II images or frames if they are not correctly linearized. |
Remove helix ends option | False | Ends of helices are removed by half the segment size. This depends on how you boxed the helices. |
Rotation option | True | Segments are rotated with helix axis perpendicular to image rows. |
MPI option | True | OpenMPI installed (mpirun). |
Number of CPUs | 8 | Number of processors to be used (accepted values min=1, max=1000). |
Temporary directory | /tmp | Temporary directory should have fast read and write access. |
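The effect of the ‘Perturb step option’ can be pictured with a small sketch: regular positions along the helix axis are each shifted by a random amount within +/- stepsize // 2. The function names, the uniform distribution and the fixed seed are assumptions for illustration only; this is not Spring's implementation.

```python
import random


def regular_positions(helix_length, step_size):
    """Evenly spaced positions along the helix axis, in the same unit as step_size."""
    count = int(helix_length // step_size) + 1
    return [index * step_size for index in range(count)]


def perturb_positions(positions, step_size, seed=None):
    """Shift every position by a random amount within +/- step_size // 2."""
    rng = random.Random(seed)
    half_step = step_size // 2
    # A uniform shift is an assumption; the documentation only gives the bounds
    return [position + rng.uniform(-half_step, half_step) for position in positions]


regular = regular_positions(helix_length=700, step_size=70)
perturbed = perturb_positions(regular, step_size=70, seed=0)

for before, after in zip(regular, perturbed):
    print(f"{before:6.1f} -> {after:6.1f}")
```

Because the spacing between neighbouring segments is no longer constant, the periodic sampling along the helix is broken up, which is the stated purpose of the option.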
Sample parameter file (intermediate level)
You may run the program from the command line by providing the parameters via a text file:
segment --f parameterfile.txt
Where the format of the parameters is:
Micrographs = cs_scan034.tif
Image output stack = protein_stack.hdf
Spring database option = True
spring.db file = spring.db
Segment coordinates = scan034_boxes.txt
Segment size in Angstrom = 700
Estimated helix width in Angstrom = 200
Step size of segmentation in Angstrom = 70
Perturb step option = False
Pixel size in Angstrom = 1.163
CTF correct option = True
CTFFIND or CTFTILT = ctftilt
convolve or phase-flip = convolve
Binning option = False
Binning factor = 6
Invert option = True
Normalization option = True
Row normalization option = False
Remove helix ends option = False
Rotation option = True
MPI option = True
Number of CPUs = 8
Temporary directory = /tmp
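Before launching a long run it may help to check a parameter file against the accepted value ranges listed in the tables. The sketch below parses `name = value` lines and validates two numeric entries. The functions `parse_parameter_text`, `as_bool` and `check_range` are hypothetical helpers, not Spring's own parser.

```python
def parse_parameter_text(text):
    """Parse 'name = value' lines into a dictionary of strings."""
    parameters = {}
    for line in text.splitlines():
        name, separator, value = line.partition("=")
        if not separator:
            continue  # skip blank lines or lines without '='
        parameters[name.strip()] = value.strip()
    return parameters


def as_bool(value):
    """Interpret 'True'/'False' as written in the parameter file."""
    return value.strip().lower() == "true"


def check_range(name, value, minimum, maximum):
    """Raise an error if value lies outside the accepted range."""
    if not minimum <= value <= maximum:
        raise ValueError(f"{name} = {value} is outside {minimum}..{maximum}")
    return value


example_text = """\
Binning option = False
Binning factor = 6
MPI option = True
Number of CPUs = 8
Temporary directory = /tmp
"""

parameters = parse_parameter_text(example_text)
# Ranges taken from the parameter tables: binning 1..20, CPUs 1..1000
binning_factor = check_range("Binning factor", int(parameters["Binning factor"]), 1, 20)
cpu_count = check_range("Number of CPUs", int(parameters["Number of CPUs"]), 1, 1000)
use_mpi = as_bool(parameters["MPI option"])
print(binning_factor, cpu_count, use_mpi)
```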
Additional parameters (expert level)
Parameter | Example (default) | Description |
---|---|---|
Astigmatism correction | True | Option to correct for astigmatism in the image; otherwise the average defocus is used. |
Micrographs select option | False | Choose whether to select any particular micrographs. |
Include or exclude micrographs | include | Choose whether to ‘include’ or ‘exclude’ specified micrographs. |
Micrographs list | 1-9, 11, 13 | List of comma-separated micrograph ids, e.g. ‘1-10, 12, 14’ (1st micrograph is 1). |
Helices select option | False | Choose whether to select any particular helices. |
Include or exclude helices | include | Choose whether to ‘include’ or ‘exclude’ specified helices. |
Helices list | 1-9, 11, 13 | List of comma-separated helix ids, e.g. ‘1-10, 12, 14’ (1st helix is 1). |
Straightness select option | False | Choose whether to select any helices based on straightness. |
Include or exclude straight helices | include | Choose whether to ‘include’ or ‘exclude’ helices of specified persistence length. |
Persistence length range | (80, 100) | Range of persistence length in percent, i.e. the upper 10 percent of the distribution is expressed as the 90 - 100 percent range, the lower 20 percent as the 0 - 20 percent range, etc. 90 - 100 % corresponds to the straightest helices. Values from the database are stored in m, e.g. ‘0-0.0001’. Persistence length is calculated as: p = -ln(2 * (end_to_end_distance / contour_length) ** 2 - 1) / contour_length)), i.e. short persistence lengths of 1 nm correspond to very flexible helices whereas 1 m corresponds to extremely straight helices. Examples are TMV: 2.9 mm (2.9e-3 m), amyloid beta filaments: 300 µm (3e-4 m) and DNA: 100 nm (1e-7 m). Due to the alignment error of the segments this value may not be directly comparable to persistence lengths determined by other methods, but it remains valid as a relative measure of straightness (accepted values min=0, max=100). |
Defocus select option | False | Choose whether to select any segments based on defocus. |
Include or exclude defocus range | include | Choose whether to ‘include’ or ‘exclude’ segments of specified defocus. |
Defocus range | (10000, 40000) | Range of defocus in Angstrom, e.g. ‘10000-40000’ (accepted values min=0, max=100000). |
Astigmatism select option | False | Choose whether to select any segments based on astigmatism. |
Include or exclude astigmatic segments | include | Choose whether to ‘include’ or ‘exclude’ segments of specified astigmatism amplitude in Angstrom. |
Astigmatism range | (0, 4000) | Range of astigmatism amplitude (difference between defocus one and two) in Angstrom, e.g. ‘0-4000’ (accepted values min=0, max=100000). |
Frame processing option | False | This option will prepare a stack containing frame helix segments from direct electron detectors and is intended for subsequent helix-based movie processing using ‘segmentrefine3d’. Prior to this option run ‘segmentrefine3d’ using the combined average of all frames. For input of the ‘Frame processing option’ using ‘segment’ please provide: 1. ‘Micrographs’ as an mrc-stack file, 2. ‘Segment coordinates’: use the previous spring.db as input instead of pure coordinate files, 3. ‘spring.db file’: the previous spring.db (same file as 2.) and 4. ‘Refinement.db to process’ from your last ‘segmentrefine3d’ cycle. This option will generate the following output: 1. a stack of frame helix segments, 2. spring_frames.db with copies of all segment entries from the previous spring.db and 3. refinement_frames.db with copies of previous orientation parameters. With those output files of the ‘segment’ run you can launch ‘segmentrefine3d’ with ‘Frame motion correction’. |
First and last frame | (0, 6) | Choose the first and last frame to be processed from direct detector movies. Remember, the first frame corresponds to frame 0 (accepted values min=0, max=400). |
Refinement.db to process | refinement.db | Input: refinement.db from previous combined average frame run of segmentrefine3d. |
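The ‘Micrographs list’ and ‘Helices list’ entries use a compact notation such as ‘1-9, 11, 13’. The sketch below shows one way such a string could be expanded and combined with the ‘include’/‘exclude’ choice. It assumes that ranges are inclusive at both ends; the function names are hypothetical and not taken from Spring.

```python
def expand_id_list(id_text):
    """Expand a string like '1-9, 11, 13' into a list of integer ids (ranges assumed inclusive)."""
    ids = []
    for part in id_text.split(","):
        part = part.strip()
        if not part:
            continue
        if "-" in part:
            first, last = part.split("-", 1)
            ids.extend(range(int(first), int(last) + 1))
        else:
            ids.append(int(part))
    return ids


def apply_selection(all_ids, listed_ids, mode="include"):
    """Keep or drop the listed ids depending on the include/exclude mode."""
    listed = set(listed_ids)
    if mode == "include":
        return [item for item in all_ids if item in listed]
    if mode == "exclude":
        return [item for item in all_ids if item not in listed]
    raise ValueError("mode must be 'include' or 'exclude'")


listed = expand_id_list("1-9, 11, 13")
all_micrographs = list(range(1, 16))  # ids start at 1, as stated in the table
included = apply_selection(all_micrographs, listed, "include")
excluded = apply_selection(all_micrographs, listed, "exclude")
print(included)
print(excluded)
```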
Sample parameter file (expert level)
You may run the program from the command line by providing the parameters via a text file:
segment --f parameterfile.txt
Where the format of the parameters is:
Micrographs = cs_scan034.tif
Image output stack = protein_stack.hdf
Spring database option = True
spring.db file = spring.db
Segment coordinates = scan034_boxes.txt
Segment size in Angstrom = 700
Estimated helix width in Angstrom = 200
Step size of segmentation in Angstrom = 70
Perturb step option = False
Pixel size in Angstrom = 1.163
CTF correct option = True
CTFFIND or CTFTILT = ctftilt
convolve or phase-flip = convolve
Astigmatism correction = True
Binning option = False
Binning factor = 6
Invert option = True
Normalization option = True
Row normalization option = False
Micrographs select option = False
Include or exclude micrographs = include
Micrographs list = 1-9, 11, 13
Helices select option = False
Include or exclude helices = include
Helices list = 1-9, 11, 13
Straightness select option = False
Include or exclude straight helices = include
Persistence length range = (80, 100)
Defocus select option = False
Include or exclude defocus range = include
Defocus range = (10000, 40000)
Astigmatism select option = False
Include or exclude astigmatic segments = include
Astigmatism range = (0, 4000)
Remove helix ends option = False
Rotation option = True
Frame processing option = False
First and last frame = (0, 6)
Refinement.db to process = refinement.db
MPI option = True
Number of CPUs = 8
Temporary directory = /tmp
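The ‘Persistence length range’ entry is given in percent of the distribution rather than in absolute units. A minimal sketch of how such a percent range could pick helices is shown below. The rank-based percentile convention and the function name are assumptions, and the persistence lengths are made-up values in m for illustration only.

```python
def select_by_percent_range(values, lower_percent, upper_percent):
    """Return indices of values whose rank percentile lies in (lower_percent, upper_percent]."""
    count = len(values)
    order = sorted(range(count), key=lambda index: values[index])
    selected = []
    for rank, index in enumerate(order):
        # Rank percentile: the smallest value gets 100 / count, the largest gets 100
        percent = 100.0 * (rank + 1) / count
        if lower_percent < percent <= upper_percent:
            selected.append(index)
    return sorted(selected)


# Hypothetical persistence lengths in m, one per helix
persistence_lengths = [3e-4, 1e-7, 2.9e-3, 5e-6, 1e-3, 2e-5, 5e-3, 8e-5, 6e-4, 1e-4]

straightest = select_by_percent_range(persistence_lengths, 80, 100)
most_flexible = select_by_percent_range(persistence_lengths, 0, 20)
print(straightest, most_flexible)
```

With ten helices, the (80, 100) range keeps the two with the longest persistence length, and (0, 20) keeps the two with the shortest.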
Command line options
When invoking segment, you may specify any of these options:
usage: segment [-h] [--g] [--p] [--f FILENAME] [--c] [--l LOGFILENAME] [--d DIRECTORY_NAME] [--version] [--spring_database_option]
[--perturb_step_option] [--ctf_correct_option] [--astigmatism_correction] [--binning_option] [--invert_option] [--normalization_option]
[--row_normalization_option] [--micrographs_select_option] [--helices_select_option] [--straightness_select_option]
[--defocus_select_option] [--astigmatism_select_option] [--remove_helix_ends_option] [--rotation_option] [--frame_processing_option]
[--mpi_option]
[input_output [input_output ...]]
Program to extract overlapping segments from micrographs
positional arguments:
input_output Input and output files
optional arguments:
-h, --help show this help message and exit
--g, --GUI GUI option: read input parameters from GUI
--p, --promptuser Prompt user option: read input parameters from prompt
--f FILENAME, --parameterfile FILENAME
File option: read input parameters from FILENAME
--c, --cmd Command line parameter option: read only boolean input parameters from command line and all other parameters will be assigned
from other sources
--l LOGFILENAME, --logfile LOGFILENAME
Output logfile name as specified
--d DIRECTORY_NAME, --directory DIRECTORY_NAME
Output directory name as specified
--version show program's version number and exit
--spring_database_option, --spr
If checked will read previous spring.db (Sqlite-compatible database) otherwise will create new one. (default: False)
--perturb_step_option, --per
Perturb the segmentation step between the windowed segments. Takes specified step size and applies a random shift along the
helix between +/- stepsize // 2. This is useful to avoid artifacts in the Fourier transforms of class averages. (default:
False)
--ctf_correct_option, --ctf
Segments are CTF corrected with determined CTF parameters. (default: False)
--astigmatism_correction, --ast
Option to correct for astigmatism in image otherwise average defocus is used. (default: False)
--binning_option, --bin
Segments are reduced in size by binning. (default: False)
--invert_option, --inv
Inversion of image densities for cryo data, i.e. protein becomes white. (default: False)
--normalization_option, --nor
Segments are normalized with a mean of 0 and standard deviation of 1. (default: False)
--row_normalization_option, --row
Option to normalize micrographs row by row to eliminate artifacts as they occur in Falcon II images or frames if they are not
correctly linearized. (default: False)
--micrographs_select_option, --mic
Choose whether to select any particular micrographs. (default: False)
--helices_select_option, --hel
Choose whether to select any particular helices. (default: False)
--straightness_select_option, --str
Choose whether to select any helices based on straightness. (default: False)
--defocus_select_option, --def
Choose whether to select any segments based on defocus. (default: False)
--astigmatism_select_option
Choose whether to select any segments based on astigmatism. (default: False)
--remove_helix_ends_option, --rem
Ends of helices are removed by half the segment size. This depends on how you boxed the helices. (default: False)
--rotation_option, --rot
Segments are rotated with helix axis perpendicular to image rows. (default: False)
--frame_processing_option, --fra
This option will prepare of stack containing frame helix segments from direct electron detectors and is intended for
subsequent helix-based movie processing using 'segmentrefine3d'. Prior to this option run 'segmentrefine3d' using the combined
average of all frames. For input of the 'Frame processing option' using 'segment' please provide: 1. 'Micrographs' as an mrc-
stack file 2. 'Segment coordinates' - use previous spring.db as input instead of pure coordinate files. 3. 'spring.db file'
previous spring.db (same file as 2.) and 4. 'Refinement.db to process' from your last 'segmentrefine3d' cycle. This option
will generate the following output: 1. Stack of frame helix segments, 2. spring_frames.db with copies of all segment entries
from the previous spring.db and 3. refinement_frames.db with copies of previous orientation parameters. With those output
files of the 'segment' run you can launch 'segmentrefine3d' with 'Frame motion correction' (default: False)
--mpi_option, --mpi OpenMPI installed (mpirun). (default: False)
Program flow
assign_reorganize: Initialize micrographs and segments to convert them into Spring’s file structure
single_out: Single out individual helices from micrograph
readmic: Loading new micrograph
center_segments: Segments are centered with respect to helix axis
window_segment: Windowing segments from micrograph
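To make the windowing step more concrete, the sketch below places segment centers at regular intervals along a boxed helix path. It follows the parameter descriptions (a step size of 0 gives one central box, and removing helix ends trims half the segment size from each end), but the function names and the linear interpolation along a polyline are assumptions and do not reproduce Spring's code.

```python
import math


def cumulative_lengths(path):
    """Cumulative distance along a polyline given as (x, y) points."""
    lengths = [0.0]
    for (x0, y0), (x1, y1) in zip(path, path[1:]):
        lengths.append(lengths[-1] + math.hypot(x1 - x0, y1 - y0))
    return lengths


def point_at(path, lengths, distance):
    """Linearly interpolate the point at a given distance along the path."""
    for index in range(1, len(path)):
        if distance <= lengths[index]:
            span = lengths[index] - lengths[index - 1]
            fraction = 0.0 if span == 0 else (distance - lengths[index - 1]) / span
            (x0, y0), (x1, y1) = path[index - 1], path[index]
            return (x0 + fraction * (x1 - x0), y0 + fraction * (y1 - y0))
    return path[-1]


def segment_centers(path, step_size, segment_size, remove_ends=False):
    """Return segment center coordinates along the helix path (all lengths in pixels)."""
    lengths = cumulative_lengths(path)
    total = lengths[-1]
    if step_size == 0:
        # A step size of 0 corresponds to one central box per helix
        return [point_at(path, lengths, total / 2.0)]
    margin = segment_size / 2.0 if remove_ends else 0.0
    centers = []
    distance = margin
    while distance <= total - margin + 1e-9:
        centers.append(point_at(path, lengths, distance))
        distance += step_size
    return centers


helix_path = [(0.0, 0.0), (0.0, 1000.0)]  # hypothetical straight helix, 1000 pixels long
all_centers = segment_centers(helix_path, step_size=100, segment_size=400)
trimmed_centers = segment_centers(helix_path, step_size=100, segment_size=400, remove_ends=True)
central_only = segment_centers(helix_path, step_size=0, segment_size=400)
print(len(all_centers), len(trimmed_centers), central_only)
```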