What is done in pairwise alignment?
Two sequences are placed on top of each other and the common areas are lined up
Two sequences are aligned from the beginning and their similarity analysed
Two sequences are aligned and their similarity analysed
Local alignment assigns correspondences to all residues in the sequence, leaving gaps if necessary
The image shows which of the following?
An indel
A mismatch
The image shows a mismatch
What can sequence identity be used for?
DNA sequence
Amino acid sequence
Gene sequence
Sequence identity can be how specific?
Global
Local
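As a rough illustration of sequence identity, the sketch below computes percent identity over two already-aligned sequences. The function name `percent_identity`, the `-` gap character, and the choice of denominator (all alignment columns that are not gap-only) are assumptions for this example; other tools may define the denominator differently.

```python
def percent_identity(aligned_a, aligned_b, gap="-"):
    """Percent of alignment columns in which both sequences carry the same residue."""
    if len(aligned_a) != len(aligned_b):
        raise ValueError("aligned sequences must have the same length")
    # Ignore columns where both sequences have a gap (assumed convention).
    columns = [(x, y) for x, y in zip(aligned_a, aligned_b)
               if not (x == gap and y == gap)]
    if not columns:
        return 0.0
    # A column counts as identical only if the residues match and are not gaps.
    matches = sum(1 for x, y in columns if x == y and x != gap)
    return 100.0 * matches / len(columns)
```

The same function works for a global alignment (whole sequences) or a local one (only the aligned region), which is why identity can be reported at either level.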
What are gaps?
Spaces in part of the sequence that allow other areas to align better
Areas with a high density of indels
Proteins that align when sequences are swapped over
What is the scoring system that determines an optimal alignment?
+2 Match -1 Indel -2 Mismatch
+2 Mismatch -1 Match -2 Indel
+2 Match -1 Mismatch -2 Indel
The Needleman-Wunsch algorithm finds the highest-scoring path across a matrix
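To make the "highest-scoring path across a matrix" idea concrete, here is a minimal sketch of Needleman-Wunsch global alignment. The default scores (+2 match, -1 mismatch, -2 indel) are an assumption taken from one of the answer options above, and the function name and return shape are illustrative only.

```python
def needleman_wunsch(a, b, match=2, mismatch=-1, indel=-2):
    """Return (score, aligned_a, aligned_b) for one optimal global alignment."""
    rows, cols = len(a) + 1, len(b) + 1
    # score[i][j] holds the best score for aligning a[:i] with b[:j].
    score = [[0] * cols for _ in range(rows)]
    for i in range(1, rows):
        score[i][0] = i * indel
    for j in range(1, cols):
        score[0][j] = j * indel

    def substitution(i, j):
        return match if a[i - 1] == b[j - 1] else mismatch

    # Fill the matrix: each cell takes the best of diagonal, up, or left.
    for i in range(1, rows):
        for j in range(1, cols):
            score[i][j] = max(
                score[i - 1][j - 1] + substitution(i, j),  # match or mismatch
                score[i - 1][j] + indel,                   # gap in b
                score[i][j - 1] + indel,                   # gap in a
            )

    # Trace back from the bottom-right corner to recover one optimal path.
    i, j = len(a), len(b)
    out_a, out_b = [], []
    while i > 0 or j > 0:
        if i > 0 and j > 0 and score[i][j] == score[i - 1][j - 1] + substitution(i, j):
            out_a.append(a[i - 1])
            out_b.append(b[j - 1])
            i, j = i - 1, j - 1
        elif i > 0 and score[i][j] == score[i - 1][j] + indel:
            out_a.append(a[i - 1])
            out_b.append("-")
            i -= 1
        else:
            out_a.append("-")
            out_b.append(b[j - 1])
            j -= 1
    return score[len(a)][len(b)], "".join(reversed(out_a)), "".join(reversed(out_b))
```

For example, `needleman_wunsch("ACGT", "AGT")` aligns three matching residues and introduces one gap, giving a score of 3 × 2 - 2 = 4.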
Which is the light grey, and which is the dark grey?
Light: remote homologues. Dark: close homologues
Light: close homologues. Dark: remote homologues
The PAM matrix used varies depending on the sequence identity (S.I.)
PAM250 scoring is used for remote homologues
In PAM250 scoring, scores are given based on what?
The residue rarity
The properties of the replacing residue, e.g. whether one small hydrophobic residue is replacing another
What does BLAST stand for?
Basic Loci Alignment Search Tool
Basic Loci Alignment Sequencing Tool
Basic Local Alignment Search Tool
What is different about BLAST techniques?
They compare residue 'words' instead of single residues/bases
They work in areas less than 50 bp long
They cover the whole genome very rapidly
A 'word' is 3bp or 11 aa long
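The word-based idea can be sketched as follows: split the query into overlapping words of a fixed size, index the words of a subject sequence, and report positions where a query word occurs in the subject. This is a simplified, exact-match-only illustration. The names `index_words` and `seed_hits` are hypothetical, the word size is left as a parameter, and the real BLAST program does more (for example, scoring similar words and extending hits).

```python
from collections import defaultdict


def index_words(sequence, size):
    """Map every overlapping word of length `size` to its start positions."""
    index = defaultdict(list)
    for start in range(len(sequence) - size + 1):
        index[sequence[start:start + size]].append(start)
    return index


def seed_hits(query, subject, size):
    """Return (query_pos, subject_pos, word) for every exact shared word."""
    subject_index = index_words(subject, size)
    hits = []
    for q_pos in range(len(query) - size + 1):
        word = query[q_pos:q_pos + size]
        for s_pos in subject_index.get(word, []):
            hits.append((q_pos, s_pos, word))
    return hits
```

Looking up whole words in an index is what makes this kind of search fast compared with filling a full alignment matrix for every database sequence.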
Once the words have been found in the sequence, they are compared to the database- why?
To find exact matches
To find evolutionary relationships
To detect how common the words are
An E value is expectation, not probability
What is the range of the E value, and which values are significant? Smaller values are more significant
0-10
1-2
0-1
0.5
5
0.005
The E value is the probability of getting a result by chance in a database of this size
What is the difference between P and E values?
Both are values for 'What is the probability of getting this result by chance alone?', but P accounts for database size
Both are values for 'What is the probability of getting this result by chance alone?', but E accounts for database size
In P values, <0.05 is significant
P value relies on a normal distribution
What are the limits of P?
0-0.1
An E value of 4 means: in a database of this size, I could expect 4 matches by chance alone
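The relationship between the two values can be shown with a short sketch. It assumes the number of chance hits follows a Poisson distribution, as in standard BLAST statistics, so the probability of at least one chance hit is P = 1 - e^(-E). The function name `p_from_e` is illustrative.

```python
import math


def p_from_e(e_value):
    """Probability of at least one chance hit, given an expected count E.

    Assumes chance hits are Poisson distributed: P = 1 - exp(-E).
    """
    if e_value < 0:
        raise ValueError("E value cannot be negative")
    # -expm1(-E) equals 1 - exp(-E) but stays accurate for very small E.
    return -math.expm1(-e_value)
```

Under this assumption an E of 4 gives a P of about 0.98, while for small values such as E = 0.005 the two numbers are almost identical. This also shows why E has no upper limit while P stays between 0 and 1.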