Question | Answer |
What is the Hamming Distance Problem? | For every alignment i, output Ham(i), the hamming distance between P and T[i ... i+m-1] |
By computing cross correlations to solve the hamming distance problem, what time and space complexity can we expect? | O(n|sigma|logm) total time O(n) space |
Briefly explain how to solve the hamming distance problem using cross correlations | |
Name a situation where the cross correlation method to solve the hamming distance problem is worse than the naive method | When the alphabet is longer than m |
For the O(n(rootm)logm) method, how many times must a symbol occur for it to be classed as frequent? | root m |
How many frequent symbols can there be? | root m at most |
Stage 1 involves counting the number of matches involving frequent symbols. What is the time complexity of this step? | O(n(rootm)logm) |
Stage 1 involves counting the matches involving frequent symbols using cross correlations. How does this differ from stage 2, which involves counting the matches involving infrequent symbols? | An array of length (n - m + 1) is constructed initially containing zeros. During a single pass of T, if T[k] is infrequent, for all j such that T[k] = P[j], increase A[k - j] by one, as long as k - j > 0. A[i] is the number of matches at alignment i involving an infrequent symbol. |
How quick is stage 2 of the O(n(rootm)logm) algorithm? | O(n root m) |
How quickly can we classify each symbol as frequent or infrequent? | O(mlogm) |
How can we improve the O(n(rootm)logm) algorithm to O(n (rootmlogm))? | By changing the definition of frequent to be at least root(mlogm) occurrences. |
Want to create your own Flashcards for free with GoConqr? Learn more.