bioin.motif.profile_with_pseudocount

bioin.motif.profile_with_pseudocount(motifs)

Compute the column-wise frequency of each nucleotide in a motif matrix, with pseudocounts applied.

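Parameters: motifs (list) – Motif matrix, given as a list of equal-length strings with one motif per row.
Returns: Dictionary mapping each nucleotide ('A', 'C', 'G', 'T') to a list of its pseudocount-adjusted frequencies, one value per column of the motif matrix. The values are fractions between 0 and 1, not percentages.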

Examples

Given a list of motif strings, the function first builds the pseudocount matrix with count_with_pseudocount(motifs). It then divides each entry of that matrix by the number of motifs plus four (one pseudocount for each of the four nucleotides) to obtain the profile, which is returned as a dictionary of lists. In the example below there are five motifs, so every count is divided by nine.

>>> motifs = ['AACGTA', 'CCCGTT', 'CACCTT', 'GGATTA', 'TTCCGG']
>>> profile_pseudo_dict = profile_with_pseudocount(motifs)
>>> profile_pseudo_dict
    {'A': [0.2222222222222222, 0.3333333333333333, 0.2222222222222222, 0.1111111111111111, 0.1111111111111111, 0.3333333333333333], 'C': [0.3333333333333333, 0.2222222222222222, 0.5555555555555556, 0.3333333333333333, 0.1111111111111111, 0.1111111111111111], 'G': [0.2222222222222222, 0.2222222222222222, 0.1111111111111111, 0.3333333333333333, 0.2222222222222222, 0.2222222222222222], 'T': [0.2222222222222222, 0.2222222222222222, 0.1111111111111111, 0.2222222222222222, 0.5555555555555556, 0.3333333333333333]}
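
The computation described above can be sketched in a few lines of plain Python. This is an illustrative re-implementation, not the bioin source: the name profile_with_pseudocount_sketch is hypothetical, and the sketch assumes that every motif has the same length and contains only the letters A, C, G and T.

```python
def profile_with_pseudocount_sketch(motifs):
    """Hypothetical stand-in for profile_with_pseudocount (not the bioin code)."""
    n_rows = len(motifs)
    n_cols = len(motifs[0])

    # Start every count at 1: this is the pseudocount for each nucleotide.
    counts = {nuc: [1] * n_cols for nuc in "ACGT"}

    # Add one for every nucleotide observed in each column.
    for motif in motifs:
        for col, nuc in enumerate(motif):
            counts[nuc][col] += 1

    # Each column now sums to n_rows + 4, so divide by that total.
    total = n_rows + 4
    return {nuc: [c / total for c in column] for nuc, column in counts.items()}


motifs = ['AACGTA', 'CCCGTT', 'CACCTT', 'GGATTA', 'TTCCGG']
profile = profile_with_pseudocount_sketch(motifs)
print(profile['C'][2])  # 'C' appears 4 times in column 2, so (4 + 1) / 9
```

Because every count starts at one, no entry of the profile is ever zero, and each column still sums to one.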