
bioin.motif.greedy_motif_search

bioin.motif.greedy_motif_search(dna, k, t)

Find t k-mers, one from each string in dna, that together have the best score (i.e. the k-mers that are most similar to one another across the given strings).

Parameters:
  • dna (list) – list of t DNA strings, treated as a matrix with t rows.
  • k (int) – length of each k-mer, i.e. the number of nucleotides in each returned string.
  • t (int) – number of k-mers to return, equal to the number of rows in the dna matrix.
Returns:

List of t k-mers, one taken from each string in dna (it can also be viewed as a 2D matrix, a submatrix of dna).

Examples

GreedyMotifSearch starts by setting best_motifs to the first k-mer from each string in dna (one k-mer per row). It then ranges over every possible k-mer in dna[0]. For each one, it sets Motifs[0] to that k-mer, builds a profile matrix Profile from this lone k-mer, and sets Motifs[1] to the Profile-most probable k-mer in dna[1]. It then updates Profile to the profile matrix formed from Motifs[0] and Motifs[1], and sets Motifs[2] to the Profile-most probable k-mer in dna[2]. In general, after finding the k-mers Motifs[0], ..., Motifs[i-1] in the first i strings of dna, GreedyMotifSearch constructs Profile(Motifs) and sets Motifs[i] to the Profile-most probable k-mer in dna[i] based on this profile matrix. Once a k-mer has been chosen from every string, Motifs replaces best_motifs if it has a better score.

>>> from bioin.motif import greedy_motif_search
>>> dna = ['GGCGTTCAGGCA', 'AAGAATCAGTCA', 'CAAGGAGTTCGC', 'CACGTCAATCAC', 'CAATAATATTCG']
>>> k = 3
>>> t = 5
>>> t_kmers = greedy_motif_search(dna, k, t)
>>> t_kmers
['CAG', 'CAG', 'CAA', 'CAA', 'CAA']
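
The greedy procedure described above can be sketched in plain Python. This is a minimal illustration, not bioin's actual source: the names count_matrix, most_probable_kmer, score and greedy_motif_search_sketch are hypothetical. It assumes that a lower score is better (the score counts mismatches against the most common nucleotide in each column), that ties are broken in favour of the earliest k-mer, and that no pseudocounts are used.

```python
NUCLEOTIDES = "ACGT"


def count_matrix(motifs):
    """Count how often each nucleotide occurs in each column of motifs."""
    k = len(motifs[0])
    counts = {n: [0] * k for n in NUCLEOTIDES}
    for motif in motifs:
        for j, n in enumerate(motif):
            counts[n][j] += 1
    return counts


def most_probable_kmer(text, k, counts):
    """Return the first k-mer in text with the highest profile probability.

    Integer counts are multiplied instead of frequencies. Dividing every
    column by the number of motifs would scale all k-mers by the same
    factor, so the ranking is unchanged and no rounding error can occur.
    """
    best_kmer, best_weight = text[:k], -1
    for i in range(len(text) - k + 1):
        kmer = text[i:i + k]
        weight = 1
        for j, n in enumerate(kmer):
            weight *= counts[n][j]
        if weight > best_weight:  # strict: the earliest k-mer wins ties
            best_kmer, best_weight = kmer, weight
    return best_kmer


def score(motifs):
    """Total number of mismatches against the per-column majority nucleotide."""
    counts = count_matrix(motifs)
    t, k = len(motifs), len(motifs[0])
    return sum(t - max(counts[n][j] for n in NUCLEOTIDES) for j in range(k))


def greedy_motif_search_sketch(dna, k, t):
    """Greedy motif search as described in the text (illustrative only)."""
    # Start from the first k-mer of each string.
    best_motifs = [row[:k] for row in dna[:t]]
    # Try every k-mer of the first string as Motifs[0].
    for i in range(len(dna[0]) - k + 1):
        motifs = [dna[0][i:i + k]]
        # Extend one string at a time using the profile of the motifs so far.
        for row in range(1, t):
            counts = count_matrix(motifs)
            motifs.append(most_probable_kmer(dna[row], k, counts))
        if score(motifs) < score(best_motifs):
            best_motifs = motifs
    return best_motifs


dna = ['GGCGTTCAGGCA', 'AAGAATCAGTCA', 'CAAGGAGTTCGC',
       'CACGTCAATCAC', 'CAATAATATTCG']
print(greedy_motif_search_sketch(dna, 3, 5))
# prints ['CAG', 'CAG', 'CAA', 'CAA', 'CAA']
```

Under these assumptions the sketch reproduces the output of the example above. Because no pseudocounts are applied, a single zero count in the profile gives a k-mer probability zero, which is why the earliest k-mer is often chosen by default.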

© Copyright 2020, Lihua. Project structure based on the Computational Molecular Science Python Cookiecutter version 1.2 Revision 4b0adfbe.
