Parameters: | query_sequence : string
The query sequence. It may be upper- or lowercase and must be drawn
from the set {A, C, G, T, N} (nucleotide) or from the set
{A, R, N, D, C, Q, E, G, H, I, L, K, M, F, P, S, T, W, Y, V, B, Z, X, *}
(protein).
gap_open_penalty : int, optional
The penalty applied to creating a gap in the alignment. This CANNOT
be 0.
Default is 5.
gap_extend_penalty : int, optional
The penalty applied to extending a gap in the alignment. This CANNOT
be 0.
Default is 2.
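As a rough illustration of how the two gap penalties combine, the sketch below computes the cost of a single gap. It assumes the common affine convention in which the first gapped position costs gap_open_penalty and each further position costs gap_extend_penalty. That convention and the helper name gap_cost are assumptions made for this example, not something stated above.

```python
def gap_cost(gap_length: int,
             gap_open_penalty: int = 5,
             gap_extend_penalty: int = 2) -> int:
    """Return the total penalty for one gap of ``gap_length`` positions.

    Hypothetical helper. Assumed convention: the first position costs
    gap_open_penalty, every additional position costs gap_extend_penalty.
    """
    if gap_open_penalty == 0 or gap_extend_penalty == 0:
        # The parameter descriptions state that neither penalty may be 0.
        raise ValueError("gap penalties cannot be 0")
    if gap_length < 1:
        raise ValueError("gap_length must be at least 1")
    return gap_open_penalty + (gap_length - 1) * gap_extend_penalty
```

Under this assumed convention and the default penalties, a gap of three positions costs 5 + 2 + 2.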
score_size : int, optional
If your estimated best alignment score is < 255 this should be 0.
If your estimated best alignment score is >= 255, this should be 1.
If you don’t know, this should be 2.
Default is 2.
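The three cases for score_size can be written as a small helper. The name choose_score_size is hypothetical and not part of the library; the function only restates the rule given above.

```python
from typing import Optional


def choose_score_size(estimated_best_score: Optional[int]) -> int:
    """Map an estimated best alignment score to a score_size value.

    Hypothetical helper that only encodes the documented rule.
    Pass None when no estimate is available.
    """
    if estimated_best_score is None:
        return 2  # unknown score: use the documented default
    if estimated_best_score < 255:
        return 0
    return 1  # estimated score is >= 255
```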
mask_length : int, optional
The minimum distance between the ending positions of the optimal and
suboptimal alignments: the suboptimal alignment must end at least
mask_length positions away from the optimal one. We suggest using
len(query_sequence)/2 if you have no special concerns.
Detailed description of mask_length: After locating the optimal
alignment ending position, the suboptimal alignment score is found
heuristically by taking the second-largest score in the array
that contains the maximal score of each column of the SW matrix. To
avoid picking scores that belong to alignments sharing part of the
best alignment, the SSW C library masks the reference loci within
mask_length of the best alignment ending position and takes the
second-largest score from the unmasked elements.
Default is 15.
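The masking heuristic described for mask_length can be sketched in pure Python. This is an illustration of the idea only, not the SSW C library's code: the function name, the plain-list input, and the exact treatment of the mask boundary (positions closer than mask_length are masked) are assumptions.

```python
from typing import Sequence


def suboptimal_score(column_maxima: Sequence[int], mask_length: int) -> int:
    """Return the best score outside the masked region.

    ``column_maxima`` holds the maximal score of each column of the
    Smith-Waterman matrix. Hypothetical sketch of the heuristic.
    """
    # Column where the optimal alignment ends (first one on ties).
    best_end = max(range(len(column_maxima)), key=column_maxima.__getitem__)
    # Keep only columns at least mask_length away from that ending column.
    unmasked = [
        score
        for position, score in enumerate(column_maxima)
        if abs(position - best_end) >= mask_length
    ]
    # If every column is masked, report 0 (assumption of this sketch).
    return max(unmasked, default=0)
```

Without the mask, the second-largest column maximum would usually sit right next to the optimal ending position and describe the same alignment.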
mask_auto : bool, optional
When True, the mask length actually used is set automatically to
max(int(len(query_sequence)/2), mask_length).
Default is True.
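The interaction between mask_auto and mask_length amounts to the following. The helper name effective_mask_length is hypothetical; the expression itself is taken from the description above.

```python
def effective_mask_length(query_sequence: str,
                          mask_length: int = 15,
                          mask_auto: bool = True) -> int:
    """Return the mask length that would actually be used.

    Hypothetical helper restating the documented expression.
    """
    if mask_auto:
        # Half the query length, but never less than mask_length.
        return max(int(len(query_sequence) / 2), mask_length)
    return mask_length
```

With the defaults, queries shorter than 30 characters keep a mask length of 15, while longer queries use half their length.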
score_only : bool, optional
This will prevent the best alignment beginning positions (BABP) and the
cigar from being returned as a result. This overrides any setting on
score_filter, distance_filter, and override_skip_babp. It has the
highest precedence.
Default is False.
score_filter : int, optional
If set, this will prevent the cigar and best alignment beginning
positions (BABP) from being returned if the optimal alignment score is
less than score_filter, saving some computation time. This filter
may be overridden by score_only (prevents BABP and cigar, regardless
of other arguments), distance_filter (may prevent cigar, but will
cause BABP to be calculated), and override_skip_babp (will ensure
BABP is returned).
Default is None.
distance_filter : int, optional
If set, this will prevent the cigar from being returned if the length
of the query_sequence or the target_sequence is less than
distance_filter, saving some computation time. The results of
this filter may be overridden by score_only (prevents BABP and cigar,
regardless of other arguments) and score_filter (may prevent cigar).
override_skip_babp has no effect with this filter applied, as BABP
must be calculated to perform the filter.
Default is None.
override_skip_babp : bool, optional
When True, the best alignment beginning positions (BABP) will always be
returned unless score_only is set to True.
Default is False.
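The precedence among score_only, score_filter, distance_filter, and override_skip_babp can be hard to follow in prose. The sketch below is one reading of the descriptions above, expressed as a function that reports which optional results would be returned. It is not the library's implementation: the function name, its arguments, and the handling of combined filters are assumptions.

```python
from typing import Dict, Optional


def returned_fields(score: int,
                    query_length: int,
                    target_length: int,
                    score_only: bool = False,
                    score_filter: Optional[int] = None,
                    distance_filter: Optional[int] = None,
                    override_skip_babp: bool = False) -> Dict[str, bool]:
    """Report whether BABP and the cigar would be returned.

    Hypothetical sketch of the documented precedence rules.
    """
    if score_only:
        # Highest precedence: nothing beyond the score is returned.
        return {"babp": False, "cigar": False}

    babp, cigar = True, True
    if score_filter is not None and score < score_filter:
        babp, cigar = False, False
    if distance_filter is not None:
        # BABP must be calculated to evaluate this filter.
        babp = True
        if min(query_length, target_length) < distance_filter:
            cigar = False
    if override_skip_babp:
        babp = True
    return {"babp": babp, "cigar": cigar}
```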
protein : bool, optional
When True, the query_sequence and target_sequence will be read as
protein sequences. When False, the query_sequence and
target_sequence will be read as nucleotide sequences. If True, a
substitution_matrix must be supplied.
Default is False.
match_score : int, optional
When using a nucleotide sequence, the match_score is the score added
when a match occurs. This is ignored if substitution_matrix is
provided.
Default is 2.
mismatch_score : int, optional
When using a nucleotide sequence, the mismatch_score is the score
applied when a mismatch occurs. This should be a negative integer, so
that a mismatch lowers the alignment score.
This is ignored if substitution_matrix is provided.
Default is -3.
substitution_matrix : 2D dict, optional
Provides the score for each possible substitution of sequence
characters. This may be used for protein or nucleotide sequences. The
entire set of possible combinations for the relevant sequence type MUST
be enumerated in the dict of dicts. This will override match_score
and mismatch_score. Required when protein is True.
Default is None.
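A dict of dicts of the kind described for substitution_matrix can be built programmatically. The sketch below does so for the nucleotide alphabet from a match and a mismatch score. The helper name is hypothetical, and scoring any pair involving N as 0 is an assumption of this example. Every combination of characters is enumerated, as the description requires.

```python
from typing import Dict


def make_nucleotide_matrix(match_score: int = 2,
                           mismatch_score: int = -3,
                           alphabet: str = "ACGTN") -> Dict[str, Dict[str, int]]:
    """Build a 2D dict scoring every pair of characters in ``alphabet``.

    Hypothetical helper. Pairs involving 'N' score 0 (an assumption).
    """
    matrix: Dict[str, Dict[str, int]] = {}
    for row in alphabet:
        matrix[row] = {}
        for column in alphabet:
            if row == "N" or column == "N":
                matrix[row][column] = 0
            elif row == column:
                matrix[row][column] = match_score
            else:
                matrix[row][column] = mismatch_score
    return matrix
```

A score is then looked up as matrix["A"]["C"]. A protein matrix would have the same shape, with one row and one column per character of the protein set.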
suppress_sequences : bool, optional
If True, the query and target sequences will not be included in the
returned result. By default they are returned for convenience.
Default is False.
zero_index : bool, optional
If True, all indices will start at 0. If False, all indices will
start at 1.
Default is True.
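To show what the zero_index setting means in practice, the sketch below extracts an aligned region from a sequence given begin and end positions under either convention. The helper name is hypothetical, and treating the end position as inclusive is an assumption of this example.

```python
def aligned_slice(sequence: str, begin: int, end: int,
                  zero_index: bool = True) -> str:
    """Return the characters from ``begin`` through ``end``.

    Hypothetical helper. ``end`` is assumed to be inclusive. Positions
    start at 0 when zero_index is True and at 1 otherwise.
    """
    offset = 0 if zero_index else 1
    # Convert to Python's 0-based, end-exclusive slice.
    return sequence[begin - offset:end - offset + 1]
```

The same region is therefore described by (2, 4) with zero_index=True and by (3, 5) with zero_index=False.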
|