bioin.motif.GibbsSampler
bioin.motif.GibbsSampler(Dna, k, t, N)
Uses Gibbs sampling to return the best motifs found: one k-mer from each of the t strings in Dna.
Parameters:
- Dna (list) – a collection of t DNA strings, one string per row.
- k (int) – the length of each motif (k-mer).
- t (int) – the number of k-mers to return; equal to the number of strings (rows) in Dna.
- N (int) – the number of sampling iterations to run.
Returns:
- list – the best-scoring motifs found, one k-mer from each string in Dna.
Examples
Although GibbsSampler performs well in many cases, it may converge to a suboptimal solution, particularly for difficult search problems with elusive motifs. A local optimum is a solution that is optimal within a small neighboring set of solutions, which is in contrast to a global optimum, or the optimal solution among all possible solutions. Since GibbsSampler explores just a small subset of solutions, it may “get stuck” in a local optimum. For this reason, similarly to RandomizedMotifSearch, it should be run many times with the hope that one of these runs will produce the best-scoring motifs. Yet convergence to a local optimum is just one of many issues we must consider in motif finding.
>>> Dna = ['CGCCCCTCTCGGGGGTGTTCAGTAAACGGCCA',
...        'GGGCGAGGTATGTGTAAGTGCCAAGGTGCCAG',
...        'TAGTACCGAGACCGAAAGAAGTATACAGGCGT',
...        'TAGATCAAGTTTCAGGTGCACGTCGGTGAACC',
...        'AATCCACCAGCTCCACGTGCAATGTTGGCCTA']
>>> k = 8
>>> t = 5
>>> N = 100
>>> best_motif_gibs = GibbsSampler(Dna, k, t, N)
>>> best_motif_gibs
['AACGGCCA', 'AAGTGCCA', 'TAGTACCG', 'AAGTTTCA', 'ACGTGCAA']
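
The advice above to run the sampler many times can be sketched as follows. This is a minimal, self-contained illustration and not the bioin implementation: the helper names (profile_with_pseudocounts, score, sample_kmer, gibbs_sampler_sketch, run_with_restarts), the +1 pseudocounts, and the mismatch-count score are assumptions chosen for the example. Because the procedure is randomized, the motifs returned may differ between runs and seeds.

```python
import random

def profile_with_pseudocounts(motifs, k):
    """Column-wise nucleotide probabilities, using +1 pseudocounts (an assumption)."""
    counts = {nuc: [1] * k for nuc in "ACGT"}
    for motif in motifs:
        for j, nuc in enumerate(motif):
            counts[nuc][j] += 1
    total = len(motifs) + 4  # each column sums to len(motifs) + 4 pseudocounts
    return {nuc: [c / total for c in col] for nuc, col in counts.items()}

def score(motifs):
    """Total mismatches against the column-wise consensus (lower is better)."""
    total = 0
    for column in zip(*motifs):
        most_common = max(column.count(n) for n in "ACGT")
        total += len(column) - most_common
    return total

def sample_kmer(text, k, profile, rng):
    """Draw one k-mer from text, weighted by its probability under the profile."""
    kmers = [text[i:i + k] for i in range(len(text) - k + 1)]
    weights = []
    for kmer in kmers:
        p = 1.0
        for j, nuc in enumerate(kmer):
            p *= profile[nuc][j]
        weights.append(p)
    return rng.choices(kmers, weights=weights, k=1)[0]

def gibbs_sampler_sketch(dna, k, t, n, rng):
    """One run: random start, then n single-motif resampling steps."""
    motifs = []
    for text in dna:
        start = rng.randrange(len(text) - k + 1)
        motifs.append(text[start:start + k])
    best, best_score = list(motifs), score(motifs)
    for _ in range(n):
        i = rng.randrange(t)                    # pick one string to resample
        others = motifs[:i] + motifs[i + 1:]    # profile is built without motif i
        profile = profile_with_pseudocounts(others, k)
        motifs[i] = sample_kmer(dna[i], k, profile, rng)
        current = score(motifs)
        if current < best_score:
            best, best_score = list(motifs), current
    return best, best_score

def run_with_restarts(dna, k, t, n, restarts, seed=0):
    """Repeat independent runs and keep the lowest-scoring motif set."""
    rng = random.Random(seed)
    runs = [gibbs_sampler_sketch(dna, k, t, n, rng) for _ in range(restarts)]
    return min(runs, key=lambda run: run[1])

dna = ['CGCCCCTCTCGGGGGTGTTCAGTAAACGGCCA',
       'GGGCGAGGTATGTGTAAGTGCCAAGGTGCCAG',
       'TAGTACCGAGACCGAAAGAAGTATACAGGCGT',
       'TAGATCAAGTTTCAGGTGCACGTCGGTGAACC',
       'AATCCACCAGCTCCACGTGCAATGTTGGCCTA']
best, best_score = run_with_restarts(dna, k=8, t=5, n=100, restarts=20)
print(best, best_score)
```

Each restart begins from a fresh random selection of k-mers, so different runs can settle into different local optima; keeping the lowest score across runs is what makes repeated execution worthwhile.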