bioin.replication.approximate_pattern_count
bioin.replication.approximate_pattern_count(pattern, text, d)
Compute the number of occurrences of pattern in text with at most d mismatches.
Given strings text and pattern and an integer d, approximate_pattern_count(pattern, text, d) extends the definition of pattern_count: it counts every substring of text that has the same length as pattern and differs from it in at most d positions. For example, approximate_pattern_count('AAAAA', 'AACAAGCATAAACATTAAAGAG', 1) = 4.
This is because AAAAA appears four times in this string with at most one mismatch: AACAA, ATAAA, AAACA, and AAAGA. Notice that two of these occurrences overlap.
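The counting rule above can be sketched as a sliding-window comparison. This is only an illustration under stated assumptions, not the bioin implementation: the helper names hamming_distance, approximate_matches, and approximate_pattern_count_sketch are hypothetical and were chosen for this example.

```python
def hamming_distance(a, b):
    """Number of positions at which two equal-length strings differ."""
    if len(a) != len(b):
        raise ValueError("strings must have equal length")
    return sum(x != y for x, y in zip(a, b))


def approximate_matches(pattern, text, d):
    """Return (start index, window) for each window of text that is
    within d mismatches of pattern. Hypothetical helper for illustration."""
    k = len(pattern)
    return [
        (i, text[i:i + k])
        for i in range(len(text) - k + 1)  # every window of length k
        if hamming_distance(pattern, text[i:i + k]) <= d
    ]


def approximate_pattern_count_sketch(pattern, text, d):
    """Count windows of text within d mismatches of pattern (a sketch)."""
    return len(approximate_matches(pattern, text, d))


if __name__ == "__main__":
    for start, window in approximate_matches("AAAAA", "AACAAGCATAAACATTAAAGAG", 1):
        print(start, window)
```

For the example above, the windows found start at indices 0, 7, 9, and 16; the windows at 7 (ATAAA) and 9 (AAACA) share three positions, which is the overlap noted earlier. Every window is checked independently, so overlapping occurrences are each counted.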
Parameters: - pattern (str) – the DNA pattern to search for.
- text (str) – the DNA string to search in.
- d (int) – the maximum number of mismatches allowed.
Returns: Integer, the number of occurrences of pattern in text with at most d mismatches.
Examples
Count the occurrences of pattern in text with at most d mismatches:
>>> from bioin.replication import approximate_pattern_count
>>> pattern = 'GAGG'
>>> text = 'TTTAGAGCCTTCAGAGG'
>>> d = 2
>>> approx_count = approximate_pattern_count(pattern, text, d)
>>> approx_count
4