MB 451 Microbial Diversity

Department of Microbiology - NC State University

Problem set: Sequence alignment


  1. Align the following two sequences: Sequence A = GGCCUUCCGGCCACA and Sequence B = GCCCUUCCGGGCGCA

  2. Now add the following sequence to this alignment: Sequence C = GUCCUUUGGACGC

  3. Now add the following sequence to this alignment: Sequence D = GACCUUUCGGUCAC

  4. Align the following sequences:
    • Sequence A : GUAGCAGUCCGUGGAUC
    • Sequence B : UAGUAGCAGCCGUGGAUC
    • Sequence C : GUAGCAGGCCGCGGUACC

  5. Align the following sequences:
    • Sequence A : CUCGAGUUAACCCGGCACCCG
    • Sequence B : GCUCGGGUUAACACGGACCCG
    • Sequence C : UCGAGCCAACUCGGACCCG

  6. Align the following sequences:
    • Sequence A : GGAUCGAGAGUUCC
    • Sequence B : UGGAACGAAAGAUCC
    • Sequence C : GGAUCGAUGAGAUCU

  7. Align the following sequences:
    • Sequence A: CCCCAGCUUCGGCUGGGGGAGG
    • Sequence B: CCUUAGCGAAAGCUAAGGAGG
    • Sequence C: CCUCAGCGUGAGCUGAGGAGG
    • Sequence D: CCCAAGCUUUGCUAUGGGAGG

  8. Now add Sequence E: CCAGCUUUGGCUGGAGG

  9. Now add Sequence F : CCAAGCGAGAGCUUGGAGG

  10. Align the following sequences (note that these are in FASTA format, which is commonly used for the electronic transfer of sequence data):

    >tRNA-A
    GGGCUCAUAGCUCAGCGGUAGAGUGCCUCCUUUGCAAGGAGGAUGCCCUGGGUUCGAAUCCCAGUGAGUCCA

    >tRNA-B
    GGGCUCAUCGCUCAGCGGUAGAGUGCCUCCCUUGCAAGGAGGAUGCCCUGGGUUCGAAUCCCAGUGAGUCCA

    >tRNA-C
    GGGCUCGUAGCUCAGCGGGAGAGCGCCGCCUUUGCGAGGCGGAGGCCGCGGGUUCAAAUCCCGCCGAGUCCA

    >tRNA-D
    GGGCUCGUAGCUCAGCGGGAGAGCGCCGCCUUCGCGAGGCGGAGGCCGCGGGUUCAAAUCCCGCCGAGUCCA

    >tRNA-E
    GGGCCGGUAGCUCAGUCUGGUAGAGCGUCGCCUUGGCAUGGCGAAGGCCGGGGUUCAAAUCCCCACCGGU
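
Since question 10 supplies its sequences in FASTA format, here is a minimal Python sketch of how such records might be read into a program. The function name `parse_fasta` is hypothetical and is not part of the course materials. It assumes each record starts with a `>` header line and that the sequence may span one or more following lines.

```python
def parse_fasta(text):
    """Return a dict mapping each record name to its sequence string."""
    records = {}
    name = None
    for line in text.splitlines():
        line = line.strip()
        if not line:
            continue  # skip blank lines between records
        if line.startswith(">"):
            # Header line: everything after '>' is the record name.
            name = line[1:].strip()
            records[name] = ""
        elif name is not None:
            # Sequence line: append to the current record.
            records[name] += line
    return records


example = """
>tRNA-A
GGGCUCAUAGCUCAGCGGUAGAGUGCCUCCUUUGCAAGGAGGAUGCCCUGGGUUCGAAUCCCAGUGAGUCCA

>tRNA-E
GGGCCGGUAGCUCAGUCUGGUAGAGCGUCGCCUUGGCAUGGCGAAGGCCGGGGUUCAAAUCCCCACCGGU
"""

records = parse_fasta(example)
for name, seq in records.items():
    print(name, len(seq), seq[:10] + "...")
```

Reading the records into a dictionary keeps each name attached to its sequence, which makes it easier to label the rows of an alignment later.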

Answer key

  1. No gaps are required; these just line up!
    Sequence A = GGCCUUCCGGCCACA
    Sequence B = GCCCUUCCGGGCGCA
  2. The first gap could go in either of two places; each is as good as the other.
    Sequence A = GGCCUUCCGGCCACA
    Sequence B = GCCCUUCCGGGCGCA
    Sequence C = GUCCUUU-GGACGC- OR
    Sequence C = GUCCUU-UGGACGC-
    
  3. This one is easy, but it shows which of the two good alignments of C is better.
    Sequence A = GGCCUUCCGGCCACA
    Sequence B = GCCCUUCCGGGCGCA
    Sequence C = GUCCUUU-GGACGC-
    Sequence D = GACCUUUCGGUCAC-
  4. This one's trickier because the sequences don't start at the same place. Why do you think the middle gaps work best where they are?
    Sequence A : GUAGCAGU--CCGUGG-AUC
    Sequence B : -UAGUAGCAGCCGUGG-AUC
    Sequence C : GUAGCAG--GCCGCGGUACC
  5. Don't be fooled by the easy ones!
    Sequence A : -CUCGAGUUAACCCGGCACCCG
    Sequence B : GCUCGGGUUAACACGG-ACCCG
    Sequence C : --UCGAGCCAACUCGG-ACCCG
              
  6. Another easy one.
    Sequence A : -GGAUCGA-GAGUUCC
    Sequence B : UGGAACGA-AAGAUCC
    Sequence C : -GGAUCGAUGAGAUCU
        
  7. Don't fall into the trap of over-using gaps. Are there alternative placements for these gaps? Why might one placement be better than another?
    Sequence A: CCCCAGCUUCGGCUGGGGGAGG
    Sequence B: CCUUAGCGAAAGCU-AAGGAGG
    Sequence C: CCUCAGCGUGAGCU-GAGGAGG
    Sequence D: CCCAAGCUUU-GCUAUGGGAGG
         
  8. Positions that are variable in sequence are better places to put gaps than positions that are conserved.
    Sequence A: CCCCAGCUUCGGCUGGGGGAGG
    Sequence B: CCUUAGCGAAAGCU-AAGGAGG
    Sequence C: CCUCAGCGUGAGCU-GAGGAGG
    Sequence D: CCCAAGCUUU-GCUAUGGGAGG
    Sequence E: CC--AGCUUUGGCU---GGAGG
    
  9. Placing these gaps is a balancing act.
    Sequence A: CCCCAGCUUCGGCUGGGGGAGG
    Sequence B: CCUUAGCGAAAGCU-AAGGAGG
    Sequence C: CCUCAGCGUGAGCU-GAGGAGG
    Sequence D: CCCAAGCUUU-GCUAUGGGAGG
    Sequence E: CC--AGCUUUGGCU---GGAGG
    Sequence F: CC-AAGCGAGAGCU--UGGAGG
  10. Long alignments are no different than short ones.
    tRNA-A GGGCUCAUAGCUCAGC--GGUAGAGUGCCUCCUUUGCAAGGAGGAUGCCCUGGGUUCGAAUCCCAGUGAGUCCA
    tRNA-B GGGCUCAUCGCUCAGC--GGUAGAGUGCCUCCCUUGCAAGGAGGAUGCCCUGGGUUCGAAUCCCAGUGAGUCCA
    tRNA-C GGGCUCGUAGCUCAGC--GGGAGAGCGCCGCCUUUGCGAGGCGGAGGCCGCGGGUUCAAAUCCCGCCGAGUCCA
    tRNA-D GGGCUCGUAGCUCAGC--GGGAGAGCGCCGCCUUCGCGAGGCGGAGGCCGCGGGUUCAAAUCCCGCCGAGUCCA
    tRNA-E GGGCCGGUAGCUCAGUCUGGUAGAGCGUCGCCUUGGCAUGGCGAAGGCC-GGGGUUCAAAUCCCCACCGGU---
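
A hand-built alignment like those above can be checked mechanically: every row should have the same length, and removing the gaps from a row should give back the original sequence. The Python sketch below does this for answer 6 and also summarizes each column, which relates to the point in answer 8 about variable versus conserved positions. The function names `check_alignment` and `column_summary` are hypothetical helpers, not part of the course materials, and the sketch only checks consistency; it does not judge whether an alignment is a good one.

```python
GAP = "-"


def check_alignment(aligned, originals):
    """Return a list of problems found; an empty list means the rows are consistent."""
    problems = []
    # All rows of an alignment must have the same number of columns.
    lengths = {len(row) for row in aligned.values()}
    if len(lengths) > 1:
        problems.append(f"rows differ in length: {sorted(lengths)}")
    # Removing the gaps must give back the unaligned sequence exactly.
    for name, row in aligned.items():
        if row.replace(GAP, "") != originals[name]:
            problems.append(f"{name}: removing gaps does not give back the original")
    return problems


def column_summary(aligned):
    """For each column, return (number of distinct bases, number of gaps)."""
    rows = list(aligned.values())
    summary = []
    for column in zip(*rows):
        bases = {c for c in column if c != GAP}
        summary.append((len(bases), column.count(GAP)))
    return summary


# Sequences and alignment from question 6 and its answer.
originals = {
    "A": "GGAUCGAGAGUUCC",
    "B": "UGGAACGAAAGAUCC",
    "C": "GGAUCGAUGAGAUCU",
}
aligned = {
    "A": "-GGAUCGA-GAGUUCC",
    "B": "UGGAACGA-AAGAUCC",
    "C": "-GGAUCGAUGAGAUCU",
}

print(check_alignment(aligned, originals))
for position, (n_bases, n_gaps) in enumerate(column_summary(aligned), start=1):
    label = "conserved" if n_bases == 1 and n_gaps == 0 else "variable or gapped"
    print(position, n_bases, n_gaps, label)
```

A column with one distinct base and no gaps is fully conserved in this small sample; columns with several bases are the variable positions where, as answer 8 suggests, a gap is less costly.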
Last updated April 05, 2009 by James W Brown