Michelixtracegrid¶
Program to optimize michelixtrace by varying michelixtrace parameters systematically on a grid
Parameters¶
Parameter | Example (default) | Description |
---|---|---|
Micrographs | cs_scan034.tif | Input micrographs: accepted file formats (.tif, .mrc, .mrcs, .spi, .hdf, .img, .hed). |
Diagnostic plot pattern | michelixtracegrid_diag.pdf | If single input micrograph: name of diagnostic plot file. In case of multiple input micrographs: suffix to be attached to corresponding input micrograph. Output: accepted file formats (.pdf, .png, .bmp, .emf, .eps, .gif, .jpeg, .jpg, .ps, .raw, .rgba, .svg, .svgz, .tif, .tiff). |
First parameter | alpha_threshold | Choose parameter to be varied in first dimension: ‘tile_size_power’; ‘tile_overlap’; ‘binning_factor’; ‘alpha_threshold’; ‘min_helix_length’; ‘max_helix_length’; ‘order_fit’. |
Second parameter | min_helix_length | Choose parameter to be varied in second dimension: ‘none’; ‘tile_size_power’; ‘tile_overlap’; ‘binning_factor’; ‘alpha_threshold’; ‘min_helix_length’; ‘max_helix_length’; ‘order_fit’. |
Lower and upper limit first parameter | (1.4, 1.9) | Lower and upper limit of first parameter for grid search. Unit dependent on quantity (accepted values min=-1e+07, max=1e+07). |
Lower and upper limit second parameter | (22.0, 24.0) | Lower and upper limit of second parameter for grid search. Unit dependent on quantity (accepted values min=-1e+07, max=1e+07). |
First and second parameter increment | (0.1, 0.3) | First and second parameter increment for grid search. Unit dependent on quantity (accepted values min=-1e+08, max=1e+08). |
Helix reference | helix_reference.hdf | Helix reference: long rectangular straight box of helix to be traced. Accepted file formats (.spi, .hdf, .img, .hed). |
Estimated helix width in Angstrom | 200 | Generous width measure of helix required for rectangular mask (accepted values min=0, max=1500). |
Pixel size in Angstrom | 1.163 | Pixel size is an imaging parameter (accepted values min=0.001, max=100). |
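The grid defined by the limits and increments above can be enumerated as in the following minimal sketch. It assumes that both limits are inclusive and that values are spaced evenly starting from the lower limit; how the program itself treats the upper limit is not stated here, and `grid_values` is a hypothetical helper, not part of SPRING.

```python
import math
from itertools import product


def grid_values(lower, upper, step):
    """Return evenly spaced values from lower to upper (assumed inclusive)."""
    if step <= 0:
        raise ValueError("step must be positive")
    # Count whole steps with a small tolerance so that floating-point
    # rounding does not drop the upper limit.
    n_steps = math.floor((upper - lower) / step + 1e-9)
    return [round(lower + i * step, 10) for i in range(n_steps + 1)]


# Example values from the table above.
first = grid_values(1.4, 1.9, 0.1)     # first parameter (alpha_threshold)
second = grid_values(22.0, 24.0, 0.3)  # second parameter (min_helix_length)

# Every combination of first and second value is one grid point.
grid = list(product(first, second))
print(len(first), len(second), len(grid))
```

Under these assumptions the number of tracing runs grows as the product of the two value counts, so small increments over wide limits quickly become expensive.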
Sample parameter file¶
You may run the program in the command line by providing the parameters via a text file:
michelixtracegrid --f parameterfile.txt
Where the format of the parameters is:
Micrographs = cs_scan034.tif
Diagnostic plot pattern = michelixtracegrid_diag.pdf
First parameter = alpha_threshold
Second parameter = min_helix_length
Lower and upper limit first parameter = (1.4, 1.9)
Lower and upper limit second parameter = (22.0, 24.0)
First and second parameter increment = (0.1, 0.3)
Helix reference = helix_reference.hdf
Estimated helix width in Angstrom = 200
Pixel size in Angstrom = 1.163
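When many runs are scripted, the parameter file can also be generated programmatically. The sketch below writes the `name = value` lines shown above and builds the documented `michelixtracegrid --f` call; the helper name `write_parameter_file` is hypothetical, and the program is only launched if it is found on the `PATH`.

```python
import shutil
import subprocess
from pathlib import Path

# Parameter names and example values copied from the sample file above.
parameters = {
    "Micrographs": "cs_scan034.tif",
    "Diagnostic plot pattern": "michelixtracegrid_diag.pdf",
    "First parameter": "alpha_threshold",
    "Second parameter": "min_helix_length",
    "Lower and upper limit first parameter": "(1.4, 1.9)",
    "Lower and upper limit second parameter": "(22.0, 24.0)",
    "First and second parameter increment": "(0.1, 0.3)",
    "Helix reference": "helix_reference.hdf",
    "Estimated helix width in Angstrom": "200",
    "Pixel size in Angstrom": "1.163",
}


def write_parameter_file(path, params):
    """Write one 'name = value' line per parameter and return the path."""
    lines = [f"{name} = {value}" for name, value in params.items()]
    Path(path).write_text("\n".join(lines) + "\n")
    return Path(path)


parameter_file = write_parameter_file("parameterfile.txt", parameters)
command = ["michelixtracegrid", "--f", str(parameter_file)]

# Only launch the program if it is installed and available on PATH.
if shutil.which(command[0]) is not None:
    subprocess.run(command, check=True)
else:
    print("michelixtracegrid not found; would run:", " ".join(command))
```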
Additional parameters (intermediate level)¶
Parameter | Example (default) | Description |
---|---|---|
Binning option | True | Micrograph is reduced in size by binning. |
Binning factor | 4 | Micrograph is reduced in size by binning factor (accepted values min=1, max=20). |
MPI option | True | OpenMPI installed (mpirun). |
Number of CPUs | 2 | Number of processors to be used. Maximum number corresponds directly to number of input scans, i.e. no gain in performance if single input micrograph chosen (accepted values min=1, max=300). |
Temporary directory | /tmp | Temporary directory should have fast read and write access. |
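The effect of the binning factor and of the CPU limit described above can be worked out with a few lines of arithmetic. This is an illustrative sketch only: the function names and the 4096 x 4096 micrograph size are assumptions, and truncating image dimensions by integer division is one possible convention rather than the program's documented behaviour.

```python
def binned_pixel_size(pixel_size, binning_factor):
    """Binning by an integer factor multiplies the pixel size by that factor."""
    return pixel_size * binning_factor


def binned_shape(shape, binning_factor):
    """Each image dimension shrinks by the binning factor (truncation assumed)."""
    return tuple(dim // binning_factor for dim in shape)


def useful_cpu_count(requested_cpus, micrograph_count):
    """CPUs beyond the number of input micrographs bring no gain."""
    return max(1, min(requested_cpus, micrograph_count))


# Example values from the tables: 1.163 Angstrom/pixel, binning factor 4.
print(binned_pixel_size(1.163, 4))     # effective pixel size in Angstrom
print(binned_shape((4096, 4096), 4))   # hypothetical micrograph dimensions
print(useful_cpu_count(2, 1))          # two CPUs requested, one micrograph
```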
Sample parameter file (intermediate level)¶
You may run the program in the command line by providing the parameters via a text file:
michelixtracegrid --f parameterfile.txt
Where the format of the parameters is:
Micrographs = cs_scan034.tif
Diagnostic plot pattern = michelixtracegrid_diag.pdf
First parameter = alpha_threshold
Second parameter = min_helix_length
Lower and upper limit first parameter = (1.4, 1.9)
Lower and upper limit second parameter = (22.0, 24.0)
First and second parameter increment = (0.1, 0.3)
Helix reference = helix_reference.hdf
Estimated helix width in Angstrom = 200
Pixel size in Angstrom = 1.163
Binning option = True
Binning factor = 4
MPI option = True
Number of CPUs = 2
Temporary directory = /tmp
Additional parameters (expert level)¶
Parameter | Example (default) | Description |
---|---|---|
Subgrid option | False | Run subgrids to parallelize expensive grid searches. |
Part and number of subgrids | (1, 3) | E.g. one out of a total of three subgrids will be run. This feature is intended for parallelization of expensive grid searches (accepted values min=1, max=100). |
Grid continue option | False | Continue grid refinement in case of interrupted grid searches. |
Grid database | grid.db | Grid database used to continue grid refinement in case of interrupted grid searches. |
Invert option | False | Inversion of contrast of reference, e.g. when using inverted class-average. Reference must have the same contrast as the micrograph, e.g. protein must be black in the micrograph as well as in the reference. |
Tile size power spectrum in Angstrom | 500 | Tile size to be used for analysis (accepted values min=1, max=10000). |
Tile overlap in percent | 80 | Overlap influences degree of averaging (accepted values min=0, max=90). |
Alpha threshold cc-map | 0.001 | Parameter for adaptive thresholding of CC-map: The significance of cross-correlation values in the micrograph will be judged by how extreme values compare to an exponential null hypothesis. The corresponding p-values are considered significant if below significance level alpha. Lower this value in orders of magnitude if helix tracing is too promiscuous (accepted values min=0, max=1). |
Absolute threshold option cc-map | False | If True, then adaptive thresholding using Alpha threshold will not be used. Instead, an absolute CC-value can be defined using the Absolute threshold parameter. |
Absolute threshold cc-map | 0.2 | Absolute CC threshold to regard pixel in CC-map as helix. Can only be used if Absolute threshold option is on (accepted values min=0, max=10). |
Order fit | 2 | Order of polynomial fit to the coordinates of detected helix (1=linear, 2=quadratic, 3=cubic …). Can be used as a further restraint (accepted values min=1, max=19). |
Minimum and maximum helix length | (500, 1500) | Sets the minimum and maximum allowed helix length in Angstrom. Too short values can lead to contaminations being recognized as helices. Too large values can be too stringent, especially for overlapping or highly bent helices. Longer helices will be split in half. Maximum helix length is recommended to be at least double the minimum helix length (accepted values min=100, max=7000). |
Pruning cutoff bending | 2.0 | Outlier helices that are too bent or kinked are removed in this pruning step. The distribution of persistence length measures is analyzed once a population of more than 100 helices has been detected. The pruning cutoff determines how many standard deviations (estimated by MAD) the persistence length is allowed to be below the median of the distribution. Diagnostic output file “PersistenceLength.pdf” is generated. Values between 1 and 3 are recommended (accepted values min=0, max=10). |
Box file coordinate step | 70.0 | If resulting box files are to be used in another software, step size in Angstrom between coordinates can be set here. Leave unchanged for subsequent usage within SPRING, since this can be adjusted in the SPRING program segment separately (accepted values min=1, max=500). |
Compute performance score | False | Option to compute measures of tracing performance based on recall, precision, F1-measure and F05-measure by comparison of traced with provided ground truth helices. |
Parameter search option | False | If True, tracing is run with multiple parameter pairs of Alpha threshold and Minimum helix length cutoff to determine optimum parameter set. The grid search will output a ParameterSpace.pdf file. |
Manually traced helix file | mic.box | Interactively traced helix file considered to be the ground truth for parameter search. Input: file with identical name of corresponding micrograph (accepted file formats EMAN’s Helixboxer/Boxer, EMAN2’s E2helixboxer and Bsoft filament parameters coordinates: .box, .txt). Make sure that helix paths are continuous. A helix path can follow a C- or S-path but must NOT form a U-turn. |
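The pruning rule described for "Pruning cutoff bending" can be sketched as follows. This is a minimal illustration under stated assumptions, not the program's implementation: the factor 1.4826 converts the MAD to a standard deviation estimate for roughly normal data, the persistence length values are made up, and the sketch skips the requirement that more than 100 helices have been detected.

```python
import statistics

# Scale factor turning the median absolute deviation (MAD) into an
# estimate of the standard deviation; assumes roughly normal data.
MAD_TO_STD = 1.4826


def prune_bent_helices(persistence_lengths, cutoff=2.0):
    """Keep helices whose persistence length is not too far below the median."""
    median = statistics.median(persistence_lengths)
    mad = statistics.median([abs(x - median) for x in persistence_lengths])
    robust_std = MAD_TO_STD * mad
    # Helices more than `cutoff` robust standard deviations below the
    # median are treated as too bent or kinked.
    threshold = median - cutoff * robust_std
    kept = [x for x in persistence_lengths if x >= threshold]
    return kept, threshold


# Hypothetical persistence length measures (arbitrary units).
lengths = [100, 110, 95, 105, 98, 102, 20]
kept, threshold = prune_bent_helices(lengths, cutoff=2.0)
print(threshold, kept)
```

A smaller cutoff prunes more aggressively because the threshold moves closer to the median; only values below the median can be removed, so unusually straight helices are always kept.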
Sample parameter file (expert level)¶
You may run the program in the command line by providing the parameters via a text file:
michelixtracegrid --f parameterfile.txt
Where the format of the parameters is:
Micrographs = cs_scan034.tif
Diagnostic plot pattern = michelixtracegrid_diag.pdf
First parameter = alpha_threshold
Second parameter = min_helix_length
Lower and upper limit first parameter = (1.4, 1.9)
Lower and upper limit second parameter = (22.0, 24.0)
First and second parameter increment = (0.1, 0.3)
Subgrid option = False
Part and number of subgrids = (1, 3)
Grid continue option = False
Grid database = grid.db
Helix reference = helix_reference.hdf
Invert option = False
Estimated helix width in Angstrom = 200
Pixel size in Angstrom = 1.163
Binning option = True
Binning factor = 4
Tile size power spectrum in Angstrom = 500
Tile overlap in percent = 80
Alpha threshold cc-map = 0.001
Absolute threshold option cc-map = False
Absolute threshold cc-map = 0.2
Order fit = 2
Minimum and maximum helix length = (500, 1500)
Pruning cutoff bending = 2.0
Box file coordinate step = 70.0
Compute performance score = False
Parameter search option = False
Manually traced helix file = mic.box
MPI option = True
Number of CPUs = 2
Temporary directory = /tmp
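The "Part and number of subgrids" entry selects one portion of the full grid per run, so that several runs can cover an expensive search between them. How the program assigns grid points to subgrids is not described here; the sketch below assumes a simple striding scheme purely for illustration, and `subgrid` is a hypothetical helper.

```python
def subgrid(points, part, number_of_subgrids):
    """Return the grid points of subgrid `part` (1-based), taking every n-th point."""
    if not 1 <= part <= number_of_subgrids:
        raise ValueError("part must be between 1 and number_of_subgrids")
    return points[part - 1::number_of_subgrids]


# A small hypothetical grid of (first parameter, second parameter) pairs.
points = [(a, b) for a in (1.4, 1.5, 1.6) for b in (22.0, 22.3)]

number_of_subgrids = 3
parts = [subgrid(points, part, number_of_subgrids)
         for part in range(1, number_of_subgrids + 1)]

# One parameter file entry per run, in the format of the sample file above.
for part in range(1, number_of_subgrids + 1):
    print(f"Part and number of subgrids = ({part}, {number_of_subgrids})")
```

Each run would then use its own parameter file with "Subgrid option = True" and a different part number.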
Command line options¶
When invoking michelixtracegrid, you may specify any of these options:
usage: michelixtracegrid [-h] [--g] [--p] [--f FILENAME] [--c] [--l LOGFILENAME] [--d DIRECTORY_NAME] [--version] [--subgrid_option]
[--grid_continue_option] [--invert_option] [--binning_option] [--absolute_threshold_option_cc-map]
[--compute_performance_score] [--parameter_search_option] [--mpi_option]
[input_output [input_output ...]]
Program to optimize michelixtrace by varying michelixtrace parameters systematically on a grid
positional arguments:
input_output Input and output files
optional arguments:
-h, --help show this help message and exit
--g, --GUI GUI option: read input parameters from GUI
--p, --promptuser Prompt user option: read input parameters from prompt
--f FILENAME, --parameterfile FILENAME
File option: read input parameters from FILENAME
--c, --cmd Command line parameter option: read only boolean input parameters from command line and all other parameters will be assigned
from other sources
--l LOGFILENAME, --logfile LOGFILENAME
Output logfile name as specified
--d DIRECTORY_NAME, --directory DIRECTORY_NAME
Output directory name as specified
--version show program's version number and exit
--subgrid_option, --sub
Run subgrids to parallelize expensive grid searches. (default: False)
--grid_continue_option, --gri
Continue grid refinement in case of interrupted grid searches. (default: False)
--invert_option, --inv
Inversion of contrast of reference, e.g. when using inverted class-averageReference must have same contrast than the
micrograph, e.g. protein requires to be black in micrograph as well as reference. (default: False)
--binning_option, --bin
Micrograph is reduced in size by binning. (default: False)
--absolute_threshold_option_cc-map, --abs
If True, then adaptive thresholding using Alpha threhold will not be used. Instead, absolute CC-value can be defined using
Absolute threshold parameter. (default: False)
--compute_performance_score, --com
Option to compute measures of tracing performance based on recall, precision F1-measure, F05-measure by comparison of traced
with provided ground truth helices. (default: False)
--parameter_search_option, --par
If True, tracing is run with multiple parameter pairs of Alpha threshold and Minimum helix length cutoff to determine optimum
parameter set. The grid search will output a ParameterSpace.pdf file. (default: False)
--mpi_option, --mpi OpenMPI installed (mpirun). (default: False)
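The measures named under `--compute_performance_score` follow standard definitions, sketched below. Two assumptions are made: "F05-measure" is read as the F-measure with beta = 0.5, which weights precision more than recall, and the counts of correctly traced, wrongly traced and missed helix segments are hypothetical inputs, since the way traced helices are matched to the ground truth is not described here.

```python
def f_beta(precision, recall, beta=1.0):
    """Weighted harmonic mean of precision and recall."""
    if precision == 0 and recall == 0:
        return 0.0
    beta_sq = beta ** 2
    return (1 + beta_sq) * precision * recall / (beta_sq * precision + recall)


def tracing_scores(true_positives, false_positives, false_negatives):
    """Compute precision, recall, F1 and F0.5 from match counts."""
    traced = true_positives + false_positives
    truth = true_positives + false_negatives
    precision = true_positives / traced if traced else 0.0
    recall = true_positives / truth if truth else 0.0
    return {
        "precision": precision,
        "recall": recall,
        "f1": f_beta(precision, recall, beta=1.0),
        "f05": f_beta(precision, recall, beta=0.5),
    }


# Hypothetical counts: 80 correct, 20 spurious, 80 missed.
scores = tracing_scores(true_positives=80, false_positives=20, false_negatives=80)
print(scores)
```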
Program flow¶
orient_reference_power_with_overlapping_powers: Find orientations by matching power spectra.
find_translations_by_cc: Find translations by cross-correlation.
perform_connected_component_analysis: Extract individual helices by connected component analysis.
build_cc_image_of_helices: Compute fine map of helix localisation.
visualize_traces_in_diagnostic_plot: Generate diagnostic plot.
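The idea behind the connected component step can be illustrated with a small pure-Python sketch: pixels of a thresholded map that touch each other are grouped under one label, so that each group can be treated as one helix candidate. The binary map, the 4-connectivity and the function name are illustrative assumptions and do not reproduce the program's implementation.

```python
from collections import deque


def label_components(binary_map):
    """Label 4-connected groups of non-zero pixels; return labels and count."""
    rows = len(binary_map)
    cols = len(binary_map[0]) if rows else 0
    labels = [[0] * cols for _ in range(rows)]
    current = 0
    for r in range(rows):
        for c in range(cols):
            if not binary_map[r][c] or labels[r][c]:
                continue  # background pixel or already labelled
            current += 1
            labels[r][c] = current
            queue = deque([(r, c)])
            # Breadth-first search over neighbouring foreground pixels.
            while queue:
                y, x = queue.popleft()
                for ny, nx in ((y - 1, x), (y + 1, x), (y, x - 1), (y, x + 1)):
                    inside = 0 <= ny < rows and 0 <= nx < cols
                    if inside and binary_map[ny][nx] and not labels[ny][nx]:
                        labels[ny][nx] = current
                        queue.append((ny, nx))
    return labels, current


# Hypothetical thresholded cross-correlation map (1 = above threshold).
thresholded = [
    [1, 1, 0, 0, 0],
    [0, 1, 0, 0, 1],
    [0, 0, 0, 0, 1],
    [1, 0, 0, 0, 1],
]
labels, count = label_components(thresholded)
print(count)
for row in labels:
    print(row)
```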