Term Project Part 2 - Database Searches
Getting started
Your next task is to perform a search of the Ribosomal Database Project with
your sequence(s). This will give you a good idea of what kind
of organisms your sequences might come from.
Logging-in to the RDP web site
The URL of the RDP web site is: http://rdp.cme.msu.edu/
Click on the web address above to go there. This link
will open the RDP web page in a new browser
window, so you can go back-and-forth between the RDP site and
these directions.
Loading your sequence into the RDP
- On the RDP web site, click on the link for "myRDP". This takes you to a myRDP login page. You don't need an account to use this, however; just click on the "Test Drive" button. Now you're on the myRDP overview page. This page lists all of the public user data.
- Click on the "Upload" button. On the upload page, use these settings:
- Choose gene...: Bacterial 16S rRNA (you aren't likely to have an archaeal sequence)
- Assign group name : enter "MB451" followed by your initials, e.g. "MB451 JWB"
- Project : enter your full name
- Choose .clip file to load : click this button, & navigate to wherever you saved the .clip data file for your sequence.
- Now click "upload". If there's a problem with your sequence, it'll let you know & return you to the Upload page. If it looks OK, it'll tell you there's 1 sequence in the file & ask if you want to load it - click "Continue".
- If you have more than one good sequence, repeat this process with these sequences as well.
- Your sequences should now appear at the top on the myRDP Overview page list. While you're doing other things, it will align your sequence(s) to the database; when it's done, the "1" will move from the "pending" column to the "A" (aligned) column.
- Click on the grey "+" box in front of each of your sequence listings. Each one should change to a red "-" box. This adds your sequences to your working list.
Sequence Match
Sequence Match is used to identify the most similar sequences in the RDP to yours.
- Now, click on the link to "Sequence Match" in the menubar at the top of the page.
- Scroll to the bottom of the page, & use the following settings:
- Strains : Type - to show only defined species in the results
- Source : Isolates - to exclude sequences from uncultivated organisms.
- Size : >1200 - so that only nearly full-length sequences are included.
- Quality : Good - to exclude potentially poor sequence data.
- KNN matches : 20 - so the best 20 sequence matches will be shown.
- Click on the "Do Seqmatch with Selected Sequence" button and wait for the results - usually less than a minute.
- Look at the "Hierarchy View" - this gives you the taxonomy (lineage) of the sequence(s) as the RDP sees it; Domain, Phylum, &c, &c, down usually to the genus (depending on how closely related your sequence is to something in the database).
- Click "Show Printer Friendly Results" to see the details. There will be a list of the 20 best matches in the database, and the similarity of each of these sequences to yours (S_ab) is shown in orange (the similarity score in purple will probably not be calculated). S_ab is a complex similarity score, but 2 identical sequences will have a score of 1.0, and the closer the score is to 1.0, the more similar the sequences are.
- If none of the matches are very close (less than 0.5), try going back and changing the "Size" setting from the default of ">1200" to "Both" to allow the program to search the shorter sequences in the database. If you can't get any matches above S_ab=0.5, please see me about it ASAP.
- Once you have an informative lineage, print out this page.
- Look through the resulting sequence list and find the best match (highest S_ab or similarity score), and click on the number in front of it (it should look something like "S000463918") to pull up its sequence record. Print out this page. If you have ties, print them all out.
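To make the S_ab score and the "best match, including ties" step more concrete, here is a minimal Python sketch. It assumes the definition usually given for S_ab: the number of unique 7-base oligomers shared by two sequences, divided by the smaller of the two unique-oligomer counts. Treat that definition as an assumption, not a statement of exactly what the RDP computes. The function names, the short test sequences, and every record number except "S000463918" are hypothetical.

```python
def kmers(seq, k=7):
    """Return the set of unique k-base oligomers in a sequence."""
    seq = seq.upper()
    return {seq[i:i + k] for i in range(len(seq) - k + 1)}


def s_ab(seq_a, seq_b, k=7):
    """Assumed S_ab: shared unique k-mers / smaller unique k-mer count."""
    a, b = kmers(seq_a, k), kmers(seq_b, k)
    if not a or not b:
        return 0.0  # a sequence shorter than k has no oligomers to compare
    return len(a & b) / min(len(a), len(b))


def best_matches(scores):
    """Return every (record, score) pair tied for the highest score."""
    top = max(scores.values())
    return sorted((rec, s) for rec, s in scores.items() if s == top)


def needs_wider_search(scores, cutoff=0.5):
    """True if no match reaches the cutoff used in these directions."""
    return max(scores.values()) < cutoff


# Hypothetical scores, as they might be read off a results page.
example_scores = {"S000463918": 0.97, "S000000001": 0.97, "S000000002": 0.80}
tied = best_matches(example_scores)
```

Here `tied` holds both records that scored 0.97, which is the situation where you would print out more than one sequence record. `needs_wider_search` mirrors the advice to change the "Size" setting when nothing scores above 0.5.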
Classifier
Classifier is used to estimate the taxonomy of your sequence.
- Click the link to "Classifier" in the menubar at the top of the page.
- Click the "Do Classification with Selected Sequences" button and wait for the results - it should just be a few seconds.
- Look at the "Hierarchy View" - this gives you the taxonomy (lineage) of the sequence according to this analysis.
This should look a lot like the results for the "Sequence Match".
- Change the "Confidence Threshold" to its lowest level - 50% - and see if this changes the result (it usually doesn't).
- Print out this page.
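To see why lowering the Confidence Threshold can change the reported lineage, here is a small Python sketch. It assumes the Classifier attaches a confidence value to each rank and stops naming ranks once the confidence drops below the threshold. Both that behavior and the example lineage with its confidence values are illustrative assumptions, not real Classifier output.

```python
def apply_threshold(lineage, threshold):
    """Keep ranks from the top down until one falls below the threshold."""
    kept = []
    for rank, name, confidence in lineage:
        if confidence < threshold:
            break  # this rank and everything below it is left unassigned
        kept.append((rank, name))
    return kept


# Hypothetical lineage: (rank, name, confidence as a fraction of 1).
example_lineage = [
    ("domain", "Bacteria", 1.00),
    ("phylum", "Firmicutes", 0.98),
    ("class", "Bacilli", 0.95),
    ("order", "Bacillales", 0.90),
    ("family", "Bacillaceae", 0.72),
    ("genus", "Bacillus", 0.61),
]

at_80 = apply_threshold(example_lineage, 0.80)
at_50 = apply_threshold(example_lineage, 0.50)
```

With these made-up numbers, `at_80` stops at the order while `at_50` reaches the genus. When every rank already clears the higher threshold, lowering it changes nothing, which is the usual outcome.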
Critical reminder!
Remember that what you have identified is the closest relative of your isolate whose 16S rRNA sequence is available in the RDP.
You have not identified your isolate unless it is a perfect match - and even then you can't be sure!
Last updated April 03, 2009 by James W Brown