skbio.alignment.global_pairwise_align_nucleotide¶
-
skbio.alignment.
global_pairwise_align_nucleotide
(seq1, seq2, gap_open_penalty=5, gap_extend_penalty=2, match_score=1, mismatch_score=-2, substitution_matrix=None, penalize_terminal_gaps=False)[source]¶ Globally align nucleotide seqs or alignments with Needleman-Wunsch
State: Experimental as of 0.4.0.
- Parameters
seq1 (DNA, RNA, or TabularMSA[DNA|RNA]) – The first unaligned sequence(s).
seq2 (DNA, RNA, or TabularMSA[DNA|RNA]) – The second unaligned sequence(s).
gap_open_penalty (int or float, optional) – Penalty for opening a gap (this is substracted from previous best alignment score, so is typically positive).
gap_extend_penalty (int or float, optional) – Penalty for extending a gap (this is substracted from previous best alignment score, so is typically positive).
match_score (int or float, optional) – The score to add for a match between a pair of bases (this is added to the previous best alignment score, so is typically positive).
mismatch_score (int or float, optional) – The score to add for a mismatch between a pair of bases (this is added to the previous best alignment score, so is typically negative).
substitution_matrix (2D dict (or similar)) – Lookup for substitution scores (these values are added to the previous best alignment score). If provided, this overrides
match_score
andmismatch_score
.penalize_terminal_gaps (bool, optional) – If True, will continue to penalize gaps even after one sequence has been aligned through its end. This behavior is true Needleman-Wunsch alignment, but results in (biologically irrelevant) artifacts when the sequences being aligned are of different length. This is
False
by default, which is very likely to be the behavior you want in all or nearly all cases.
- Returns
TabularMSA
object containing the aligned sequences, alignment score (float), and start/end positions of each input sequence (iterable of two-item tuples). Note that start/end positions are indexes into the unaligned sequences.- Return type
See also
local_pairwise_align()
,local_pairwise_align_protein()
,local_pairwise_align_nucleotide()
,skbio.alignment.local_pairwise_align_ssw()
,global_pairwise_align()
,global_pairwise_align_protein()
Notes
Default
match_score
,mismatch_score
,gap_open_penalty
andgap_extend_penalty
parameters are derived from the NCBI BLAST Server 1.This function can be use to align either a pair of sequences, a pair of alignments, or a sequence and an alignment.
References