Protein Evolution

Description

BSc Protein Form and Function Mind Map on Protein Evolution, created by Jen Harris on 26/05/2013.

Resource summary

Protein Evolution
  1. Common supersecondary motifs
    1. Helix-loop-helix
      1. Coiled-coil
        1. Helix bundle
          1. Beta-alpha-beta unit
            1. Hairpin
              1. Beta-meander
                1. Greek key
                  1. Beta-sandwich
                    1. Combine to form domains
                      1. Independent folding units in tertiary structure
                        1. Individual domains have specific function
                          1. The major driving force in their folding is hydrophobic interactions
                        2. Important evolutionary units
                          1. ~40% of known structures in PDB are multidomain proteins
                            1. Of these >40% contain discontinuous domains
                              1. Domain insertion is a common evolutionary mechanism
                            2. 60-80% of genes in genomes code for multidomain proteins
                            3. Combine with different partners
                              1. Extend functional repertoire
                          2. Comparing proteins
                            1. Basic scoring system
                              1. Sequence identity % = (number of identical residues / number of aligned residues) × 100
                                1. >35% identity means proteins are evolutionarily related
                                2. Superimposition
                                  1. Root mean square deviation
                                    1. Compares the proximity of residues to one another
                                      1. < 3.5 Å if related
                                        1. Make alignment
                                          1. Sequence
                                            1. Structure
                                              1. Calculate distance between C-alpha of each pair of aligned residues
                                                1. Pythagoras in 3D
                                                  1. Square each distance and add together for all residues
                                                    1. Divide by number of residues, then take the square root
                                          2. So - if 2 proteins have 99% identity...
                                            1. They will have a common 3D structure as long as <50 residues have this
                                              1. Their function is probably similar, but even a single residue difference can completely change function.
                                            2. Recognising domains
                                              1. DNA sequence
                                                1. Limited use
                                                  1. Very closely related proteins will have very similar DNA sequences
                                                    1. Perhaps useful for very short evolutionary distances
                                                  2. Amino acid sequences
                                                    1. Domains have similar amino acid sequences
                                                      1. As they diverge the sequence pattern is lost
                                                      2. Protein structure
                                                        1. Domains have the same fold
                                                          1. Has it been seen before?
                                                          2. Independently folding units
                                                            1. Many computer programs try to use structural data to identify domains
                                                          3. Each domain takes a specific topology/fold
                                                            1. There is a limit to how many folds are possible in nature
                                                              1. ~10^3 to 10^4
                                                                1. Even though there are millions of protein sequences there are only so many different ways that structures can be fitted together!
                                                          4. Why structural similarity?
                                                            1. Divergent evolution from common ancestor
                                                              1. If something works it is unlikely to be selected against
                                                                1. Structure is much more highly conserved than sequence
                                                                  1. Makes sense: a sequence difference is absolute, but some amino acids share properties, so the structure can be retained even when the sequence is not.
                                                                2. Convergent evolution
                                                                  1. Only so many ways to pack helices and strands in 3D space
                                                                    1. Energetically favourable
                                                                  2. Protein classification
                                                                    1. Domains are an important part of structure
                                                                      1. Structure is conserved because it determines function
                                                                        1. Protein data bank
                                                                      2. CATH
                                                                        1. Class
                                                                          1. Assigned automatically for >90%
                                                                            1. Major secondary structure
                                                                            2. Architecture
                                                                              1. Gross orientation of secondary structures
                                                                                1. Shape of the fold
                                                                                2. Beta-roll
                                                                                  1. Up-down bundle
                                                                                    1. Alphabeta-prism
                                                                                      1. alpha-beta-alpha sandwich
                                                                                      2. Topology
                                                                                        1. Connections
                                                                                          1. Number of 2ry structures
                                                                                          2. Homologous superfamily
                                                                                            1. Highly similar structures and functions
                                                                                              1. Is there enough evidence for shared evolutionary origin?
                                                                                              2. Process
                                                                                                1. Step 1: Chop proteins into domains
                                                                                                  1. Step 2: Sequence and structural analysis programs group by evolutionary and structural families
                                                                                                2. CATH domain structures and sequence relatives in sequenced genomes are fairly representative
                                                                                                  1. Applications
                                                                                                    1. "Ab initio" methods using protein structure
                                                                                                      1. Algorithms for recognising boundaries
                                                                                                        1. Swindells, 1995 - Detective
                                                                                                          1. Each domain should have a recognisable hydrophobic core
                                                                                                          2. Siddiqui and Barton, 1995 - DOMAK
                                                                                                            1. Residues comprising a domain make more internal contacts than external ones
                                                                                                            2. Holm & Sander, 1994 - PUU
                                                                                                              1. Parser for protein folding units
                                                                                                                1. Maximal interaction within domains
                                                                                                                  1. Minimal interaction between domains
                                                                                                                2. Seek consensus; in practice ~20% of cases
                                                                                                                  1. E.g. Beta-amylase
                                                                                                                    1. PUU: 1 domain
                                                                                                                      1. DOMAK: 2 domains
                                                                                                                        1. Detective: 3 domains
                                                                                                                    2. Sequence methods
                                                                                                                      1. Sequence-sequence comparison
                                                                                                                        1. BLAST
                                                                                                                          1. Needleman-Wunsch (NW)
                                                                                                                          2. Sequence-profile comparison
                                                                                                                            1. PSI-BLAST
                                                                                                                              1. Align sequences and colour-code degree of conservation
                                                                                                                                1. If identity <35%
                                                                                                                              2. Structural methods
                                                                                                                                1. Contact map
                                                                                                                                  1. Points of contact between residues in a protein
                                                                                                                                  2. Distance matrices
                                                                                                                                2. Structural classification of proteins
                                                                                                                                  1. Class
                                                                                                                                    1. Fold
                                                                                                                                      1. Superfamily
                                                                                                                                        1. Family
                                                                                                                                          1. Species
                                                                                                                                          2. Protein family database Pfam
                                                                                                                                            1. Superfamily
                                                                                                                                              1. Clan
                                                                                                                                                1. Sequence based
                                                                                                                                                  1. Concurs with SCOP
                                                                                                                                                2. Methods vary but agree on 60-70% of cases
                                                                                                                                                3. Homology
                                                                                                                                                  1. An absolute value
                                                                                                                                                    1. Either homologous or not!
                                                                                                                                                    2. Orthologs
                                                                                                                                                      1. Common ancestor
                                                                                                                                                        1. Different species
                                                                                                                                                          1. Same function
                                                                                                                                                        2. Paralogs
                                                                                                                                                          1. Common ancestor
                                                                                                                                                            1. Gene duplication
                                                                                                                                                              1. Same or different species
                                                                                                                                                                1. Different but related function
                                                                                                                                                              2. At least 2 conditions must be met:
                                                                                                                                                                1. Significant structural similarity
                                                                                                                                                                  1. Significant sequence similarity
                                                                                                                                                                    1. Functional similarity
                                                                                                                                                                    2. e.g. toxins
                                                                                                                                                                      1. Cholera
                                                                                                                                                                        1. Pertussis
                                                                                                                                                                          1. Heat stable enterotoxin
                                                                                                                                                                            1. High structural similarity
                                                                                                                                                                              1. Related functions
                                                                                                                                                                                1. No evidence that they evolved from a common ancestor
                                                                                                                                                                            2. Rossmann fold
                                                                                                                                                                              1. NAD binding domain
                                                                                                                                                                                1. Cofactor that reversibly accepts a hydride ion
                                                                                                                                                                                  1. Lost or gained by the substrate in the redox reaction
                                                                                                                                                                                    1. Found in all living cells
                                                                                                                                                                                      1. Metabolism
                                                                                                                                                                                        1. Accepts or donates electrons in redox
                                                                                                                                                                                          1. Remove two hydrogen atoms from reactant (R)
                                                                                                                                                                                            1. Hydride ion H-
                                                                                                                                                                                              1. Reduces NAD+ to NADH
                                                                                                                                                                                              2. Proton H+
                                                                                                                                                                                                1. Released into solution
                                                                                                                                                                                        2. 2 nucleotides joined through their phosphate groups
                                                                                                                                                                                          1. One contains an adenine base
                                                                                                                                                                                            1. The other nicotinamide
                                                                                                                                                                                        3. One of the most ubiquitous domains
                                                                                                                                                                                          1. Alpha-beta fold
                                                                                                                                                                                            1. Central beta sheet
                                                                                                                                                                                              1. Surrounded by approximately 5 alpha helices
                                                                                                                                                                                                1. Strands in characteristic order 654123
                                                                                                                                                                                                  1. Crossover forms binding site
                                                                                                                                                                                              2. Lactate dehydrogenase
                                                                                                                                                                                                1. Has a Rossmann fold at N-terminal domain
                                                                                                                                                                                                  1. Interconverts L-lactate and pyruvate
                                                                                                                                                                                                    1. Pyruvate to lactate is the last step in anaerobic glycolysis
                                                                                                                                                                                                    2. C-terminal catalytic domain
                                                                                                                                                                                                      1. Substrate specificity
                                                                                                                                                                                                        1. Precise reaction
                                                                                                                                                                                                        2. Specific to lactate/malate dehydrogenases
                                                                                                                                                                                                          1. Malate dehydrogenase
                                                                                                                                                                                                            1. Interconversion of malate and oxaloacetate
                                                                                                                                                                                                              1. N-terminal domain is a Rossmann fold
                                                                                                                                                                                                                1. C-terminal catalytic domain
                                                                                                                                                                                                                2. Paralogs
                                                                                                                                                                                                                  1. 17% sequence identity in humans
                                                                                                                                                                                                                    1. Duplication event
                                                                                                                                                                                                                      1. BUT structurally very similar
                                                                                                                                                                                                                        1. Function of NAD-binding domain conserved
                                                                                                                                                                                                                          1. Change in sequence confers change in substrate specificity
                                                                                                                                                                                                                3. Human/zebrafish orthologs
                                                                                                                                                                                                                  1. 76% sequence identity
                                                                                                                                                                                                                    1. Suggests speciation from common ancestor
                                                                                                                                                                                                                4. Alcohol dehydrogenase
                                                                                                                                                                                                                  1. Two ADCDs flank a Rossmann fold
                                                                                                                                                                                                                    1. Same structure as LDH
                                                                                                                                                                                                                      1. 17% identity
                                                                                                                                                                                                                    2. Combine with different catalytic domains to achieve different functions
                                                                                                                                                                                                                    3. Sequence diversity
                                                                                                                                                                                                                      1. Different evolutionary constraints in different positions in the protein structure
                                                                                                                                                                                                                        1. Surface residues have the least evolutionary constraints and can also accommodate small insertions and deletions
                                                                                                                                                                                                                          1. Core residues more highly conserved, critical for folding and stability
                                                                                                                                                                                                                            1. Functional residues also highly conserved
                                                                                                                                                                                                                              1. Critical for enzyme function or for protein-protein interactions
                                                                                                                                                                                                                            2. Structural diversity
                                                                                                                                                                                                                              1. The core is usually highly conserved
                                                                                                                                                                                                                                1. Within a family ~50%
                                                                                                                                                                                                                                2. Residue insertions in loops connecting secondary structures
                                                                                                                                                                                                                                  1. Substitutions can cause shifts in orientations of secondary structures
                                                                                                                                                                                                                                    1. Domain embellishments
                                                                                                                                                                                                                                    2. Functional diversity
                                                                                                                                                                                                                                      1. Domain superfamily
                                                                                                                                                                                                                                        1. Dependent on the fold
                                                                                                                                                                                                                                          1. Some can support many similar functions
                                                                                                                                                                                                                                            1. P-loop hydrolases
                                                                                                                                                                                                                                            2. Some have limited repertoire of functions
                                                                                                                                                                                                                                              1. Globins
                                                                                                                                                                                                                                          2. A single amino acid change can change the function of a protein
                                                                                                                                                                                                                                            1. Proteins can share <10% sequence identity but still have the same function in different organisms
                                                                                                                                                                                                                                              1. Defining function
                                                                                                                                                                                                                                                1. Biochemical
                                                                                                                                                                                                                                                  1. Conservation
                                                                                                                                                                                                                                                    1. Chemistry?
                                                                                                                                                                                                                                                      1. Substrate?
                                                                                                                                                                                                                                                        1. Product?
                                                                                                                                                                                                                                                      2. Biological
                                                                                                                                                                                                                                                        1. Is cell localisation conserved?
                                                                                                                                                                                                                                                          1. Myoglobin
                                                                                                                                                                                                                                                            1. Haemoglobin
                                                                                                                                                                                                                                                          2. Schema
                                                                                                                                                                                                                                                            1. Enzyme classification system
                                                                                                                                                                                                                                                              1. GO terms
                                                                                                                                                                                                                                                        2. Challenges in comparing protein structures
                                                                                                                                                                                                                                                          1. Insertions or deletions of residues
                                                                                                                                                                                                                                                            1. Usually in connecting loops not secondary structures
                                                                                                                                                                                                                                                              1. Indels
                                                                                                                                                                                                                                                                1. Structural cores usually highly conserved
                                                                                                                                                                                                                                                                  1. Can still be considerable structural differences between relatives outside the core.
                                                                                                                                                                                                                                                                  2. Insertions usually in loops connecting secondary structures
                                                                                                                                                                                                                                                                    1. Substitutions can shift orientations of secondary structures
                                                                                                                                                                                                                                                                      1. Coping
                                                                                                                                                                                                                                                                        1. Ignore variable loop regions
                                                                                                                                                                                                                                                                          1. Only compare secondary structures
                                                                                                                                                                                                                                                                          2. Use algorithms which explicitly handle insertions/deletions
                                                                                                                                                                                                                                                                            1. Dynamic programming
                                                                                                                                                                                                                                                                              1. Simulated annealing
                                                                                                                                                                                                                                                                        2. CATHEDRAL
                                                                                                                                                                                                                                                                          1. Combines rapid graph theory secondary structure filter with dynamic programming
                                                                                                                                                                                                                                                                            1. Accurate residue alignment
                                                                                                                                                                                                                                                                              1. SVM
                                                                                                                                                                                                                                                                                1. Combine scores
                                                                                                                                                                                                                                                                                  1. Assess significance of match
                                                                                                                                                                                                                                                                                2. Scan against all domain structural representatives in CATH
                                                                                                                                                                                                                                                                                3. Fast structure comparison
                                                                                                                                                                                                                                                                                  1. Dihedral angles + chirality
                                                                                                                                                                                                                                                                                    1. Create overlap graph
                                                                                                                                                                                                                                                                                      1. Largest common structural motif
                                                                                                                                                                                                                                                                                        1. Compare using Bron Kerbosch algorithm
                                                                                                                                                                                                                                                                                          1. Largest common graph
                                                                                                                                                                                                                                                                                      2. Generally ~1000x faster than residue-based methods
                                                                                                                                                                                                                                                                                    2. SSAP
                                                                                                                                                                                                                                                                                      1. Residue-based method
                                                                                                                                                                                                                                                                                        1. Double dynamic programming
                                                                                                                                                                                                                                                                                        2. Compare vector environments
                                                                                                                                                                                                                                                                                          1. Path matrix
                                                                                                                                                                                                                                                                                            1. Generate path for each
                                                                                                                                                                                                                                                                                              1. Compare with dynamic programming
                                                                                                                                                                                                                                                                                                1. Add path to summary matrix
                                                                                                                                                                                                                                                                                                  1. Apply dynamic programming to summary matrix
                                                                                                                                                                                                                                                                                                    1. Final alignment
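
The fast structure comparison branch above names the Bron-Kerbosch algorithm for finding the largest common motif in an overlap graph. The sketch below is the basic (non-pivoting) form of that algorithm on a toy graph. It is an illustration only, not CATHEDRAL's implementation: the node labels, the graph, and the function names `bron_kerbosch` and `largest_clique` are all hypothetical.

```python
def bron_kerbosch(r, p, x, adj, cliques):
    """Collect every maximal clique of the graph described by adj.

    r: nodes in the clique being grown
    p: candidate nodes that could still extend r
    x: nodes already explored (used to avoid reporting non-maximal cliques)
    """
    if not p and not x:
        cliques.append(set(r))
        return
    for v in list(p):
        bron_kerbosch(r | {v}, p & adj[v], x & adj[v], adj, cliques)
        p = p - {v}
        x = x | {v}


def largest_clique(adj):
    """Return the biggest maximal clique found by Bron-Kerbosch."""
    cliques = []
    bron_kerbosch(set(), set(adj), set(), adj, cliques)
    return max(cliques, key=len)


# Toy overlap graph (hypothetical). Each node stands for a pairing of a
# secondary structure element in protein A with one in protein B; an edge
# means the two pairings are geometrically compatible with each other.
overlap_graph = {
    "a1-b1": {"a2-b2", "a3-b3", "a4-b4"},
    "a2-b2": {"a1-b1", "a3-b3"},
    "a3-b3": {"a1-b1", "a2-b2"},
    "a4-b4": {"a1-b1"},
}

print(sorted(largest_clique(overlap_graph)))
```

Here the largest clique is the set of three mutually compatible pairings, which plays the role of the largest common structural motif. Working on a handful of secondary structure elements rather than hundreds of residues is what keeps this kind of graph search fast.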
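
The outline lists dynamic programming as a way to handle insertions and deletions, and SSAP as applying it twice (double dynamic programming). As a rough single-level illustration, here is a Needleman-Wunsch style global alignment of two short sequences. The scoring values (`match`, `mismatch`, `gap`) and the example sequences are assumptions chosen for readability; SSAP itself scores structural vector environments rather than residue letters.

```python
def align(a, b, match=1, mismatch=-1, gap=-2):
    """Global alignment of a and b by dynamic programming."""
    n, m = len(a), len(b)

    def sub(i, j):
        # Score for pairing a[i-1] with b[j-1]
        return match if a[i - 1] == b[j - 1] else mismatch

    # score[i][j] = best score for aligning a[:i] with b[:j]
    score = [[0] * (m + 1) for _ in range(n + 1)]
    for i in range(1, n + 1):
        score[i][0] = i * gap
    for j in range(1, m + 1):
        score[0][j] = j * gap
    for i in range(1, n + 1):
        for j in range(1, m + 1):
            score[i][j] = max(
                score[i - 1][j - 1] + sub(i, j),  # pair two residues
                score[i - 1][j] + gap,            # gap in b (deletion)
                score[i][j - 1] + gap,            # gap in a (insertion)
            )

    # Trace back from the bottom-right corner to recover one best path
    out_a, out_b = [], []
    i, j = n, m
    while i > 0 or j > 0:
        if i > 0 and j > 0 and score[i][j] == score[i - 1][j - 1] + sub(i, j):
            out_a.append(a[i - 1])
            out_b.append(b[j - 1])
            i, j = i - 1, j - 1
        elif i > 0 and score[i][j] == score[i - 1][j] + gap:
            out_a.append(a[i - 1])
            out_b.append("-")
            i -= 1
        else:
            out_a.append("-")
            out_b.append(b[j - 1])
            j -= 1
    return score[n][m], "".join(reversed(out_a)), "".join(reversed(out_b))


total, top, bottom = align("ACDEFG", "ACEFG")
print(total)
print(top)
print(bottom)
```

With these assumed scores the second sequence is aligned as `AC-EFG` against `ACDEFG`, so the missing residue is absorbed as a single gap instead of shifting every later position. In double dynamic programming, a comparison like this is run for many residue pairs, the resulting paths are accumulated in a summary matrix, and a final dynamic programming pass over that matrix gives the alignment.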