You can view your sequencing data by opening the .pdf files you downloaded. Look carefully at your data. How does it look? Here is an example section from the beginning of a good sequence:

At the top is the sequence as the machine interprets it, from left to right, numbered just beneath. This example is from the start of the sequence - notice the sequence numbering "10", then "20", below the printed sequence. Below both the interpreted sequence and the numbering is the raw data from the sequencing machine.
Some sequences don't
start off this cleanly - the sequence only becomes clear after
a few bases.
You can read the sequence directly from the printout. The first 500 bases of sequence (after perhaps a dozen or so if it has a rough start) should be reliable. Somewhere between 500 and 800 bases, the sequence quality will degrade to the point of unreliability.
If your sequence comes from more than one template, i.e. your culture wasn't pure or the PCR reaction was contaminated, you will have sequences in which some peaks look good (where both sequences have the same base at that position) and some are two peaks in the same place (where the two sequences differ):

If one of the sequences is much stronger than the other, this is no problem; the extra peak will be small compared to the main peak, and the machine can correctly read the stronger sequence. If they are close to the same strength, the machine will not correctly read either sequence. If the two sequences are from very closely-related organisms, these double peaks may be sporadic, and concentrated in the most variable regions of the rRNA. If they are from distantly-related organisms, the double peaks will be more common, because as soon as the two sequences have a difference in length (an insertion/deletion relative to each other), they will be out of sync and most of the peaks will be doubled.
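To see why a single insertion causes so many more double peaks than a single base change, here is a minimal Python sketch. The sequences are made up for illustration (they are not real rRNA), and the function name double_peak_positions is hypothetical. It simply lines two reads up position by position, as they would be superimposed in the raw data, and lists where they disagree.

```python
def double_peak_positions(seq_a, seq_b):
    """Return the 0-based positions where two superimposed reads disagree.

    Each disagreement is a position where the trace would show two
    peaks on top of each other, assuming both templates are present
    at similar strength.
    """
    return [i for i, (a, b) in enumerate(zip(seq_a, seq_b)) if a != b]


# Made-up example template, for illustration only.
template = "ACGTTGCAAGCTTAGGCTAACGGATCCATG"

# Second template differs by one base change at position 10.
substituted = template[:10] + "T" + template[11:]

# Second template has one extra base inserted at position 10,
# so everything after it is shifted by one.
inserted = template[:10] + "T" + template[10:]

sub_peaks = double_peak_positions(template, substituted)
ins_peaks = double_peak_positions(template, inserted)

print("base change:", len(sub_peaks), "double peak(s)")
print("insertion:  ", len(ins_peaks), "double peak(s)")
```

The base change gives a double peak at one position only. The insertion gives double peaks at most positions from the insertion onward; the two reads agree only where neighboring bases happen to be identical.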
Print out a copy of your sequencing data (the .pdf file); you'll need this to turn in with your Term Project.
Now open the .clip file in a text editor (Notepad, Word, TextEdit, whatever), and print it out. This is the part of your sequence that the computer program in the sequencing machine has filtered and thinks is reliable. This is the sequence you'll actually use for your analysis. Go back to the printout of the .pdf of your data, and highlight the region of this sequence that is in the .clip file.
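If you also have the full, unclipped sequence as plain text, a short script can tell you which base numbers to highlight. This is only a sketch under assumptions: it assumes both the full read and the .clip contents are plain strings of bases (any header line removed by hand), and the function name locate_clip and the example sequences are made up for illustration.

```python
def locate_clip(full_read, clip):
    """Find where the clipped sequence sits inside the full read.

    Returns (first_base, last_base) using 1-based numbering, to match
    the numbering printed under the sequence, or None if the clipped
    sequence is not found in the full read.
    """
    # Remove line breaks and spaces, and ignore upper/lower case.
    full = "".join(full_read.split()).upper()
    part = "".join(clip.split()).upper()

    start = full.find(part)  # 0-based index, or -1 if absent
    if start == -1:
        return None
    return start + 1, start + len(part)


# Made-up example: a rough start, a clean middle, and a poor end.
full_read = "NNNNACGTACGTTTGACCANNN"
clip_text = "acgtacgt\nttgacca"

region = locate_clip(full_read, clip_text)
print(region)
```

For this example the clipped region runs from base 5 to base 19 of the full read, which is the stretch you would highlight on the printout.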
Be sure to open and look at (and print out) the data for all of your PCR reactions.