
Bioinformatics Education Dissemination: Reaching Out, Connecting and Knitting-together


West Nile Virus: Using Nucleotide Data At An Ecological Scale

Author: Stacey Kiser
Biology Instructor
Lane Community College
Eugene, OR


Today's lab focuses on one very specific area for which the West Nile virus (WNV) RNA has been sequenced. In ecological terms, WNV is a much larger problem; see the Centers for Disease Control and Prevention's web site for more information.

Today's objectives are:

Learn how to use the Biology Workbench site to compare DNA sequences

Build a phylogenetic tree to solve the problem presented in the case study about the White Stork in Israel.

Compare your trees and persuade your peers of your answer.

Warning: The Biology Workbench pages are research-level web sites. As students, we will be accepting the default settings for most of the programs (at this stage). You will also have to be patient and scroll many times to find the control buttons. Call me over any time you need some help or find yourself stuck.

Step 1: Starting your own session

Starting a Session:

Go to the Biology Workbench and select the Enter the Biology Workbench link to begin a session.

Select the Set up a free account link and follow the instructions to set up your own free account. You are now able to work on your problem (or future sequence problems) from any internet computer. Your sequences and alignments will be stored on the Workbench servers at the San Diego Supercomputer Center (University of California, San Diego) for some time (months or years).

Record your account name: 

The most basic structure of the site asks you to choose a specific set of Tools, which offers you a specific set of commands. Scroll down the information screen that comes up after you log in. Near the bottom you will find the Tool Bar.

Choose Session Tools to begin a new session for your lab group. 

To start a new session, select the Start New Session option in the choice window and hit the Run button. It will ask you to type in a name for your session. Enter something that identifies your group, then hit the Start New Session button. You will then return to the Session Tools page. Your new session name should be highlighted (instead of Default Session). If it is not highlighted, scroll down through the list of session names and click on the grey button next to your session name to highlight it.

Your group name is:

You can return to your group's session from any internet connection by logging in to the account you used to create it and selecting your group's session name on the Session Tools page. Select the Resume Session choice, then hit Run to get back into your previous session.

Make sure that the radio button next to your new session name is selected, then click on the Nucleic Tools hyperlink to start working with nucleic acid sequences.

Nucleotides are the building blocks of the nucleic acids DNA and RNA. The genetic material of the West Nile virus is RNA, but we can convert the RNA into the complementary DNA and compare the sequences from the different bird samples collected in Israel. Remember, viruses can mutate very quickly, and these differences will show up in their genetic code. That way we can begin to build a tree that describes their relationships based on the accumulated changes in their DNA/RNA.
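The conversion from RNA to DNA letters can be sketched in a few lines of Python. This is only an illustration with made-up function names and a made-up four-base sequence; it is not part of the Workbench. It shows two things: the complementary DNA strand (each base paired with its partner), and the same-sense DNA spelling in which U is simply written as T, which is how RNA virus sequences usually appear in sequence databases.

```python
# Base pairing from an RNA base to its complementary DNA base.
RNA_TO_DNA_COMPLEMENT = {"A": "T", "U": "A", "G": "C", "C": "G"}


def rna_to_cdna(rna):
    """Return the complementary DNA strand, written 5' to 3'."""
    rna = rna.upper()
    # Pair each base with its partner, reading the RNA backwards so the
    # new strand comes out in the conventional 5' to 3' direction.
    return "".join(RNA_TO_DNA_COMPLEMENT[base] for base in reversed(rna))


def rna_as_dna(rna):
    """Return the same-sense DNA spelling of an RNA: U becomes T."""
    return rna.upper().replace("U", "T")


# Made-up example sequence, not real WNV data.
example_rna = "AUGC"
complementary = rna_to_cdna(example_rna)
same_sense = rna_as_dna(example_rna)
print(complementary, same_sense)
```

Either spelling carries the same information, so differences between bird samples show up in the same positions whichever one is compared.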

Step 2: Finding Nucleic Sequences

We next need to import the sequences from the original research paper, “Introduction of West Nile virus in the Middle East by Migrating White Storks.” The sequences that the authors, Malkinson et al., used in their research are publicly available at the NCBI web site.

Go there and use the Entrez search function to find Malkinson's original paper and the links to the nucleotide sequences. Search for “Malkinson m” to narrow the choices.

You will get a screen with multiple returns for “Malkinson m.” Choose the PubMed database and search the results until you get the link for the “Introduction of West Nile virus” paper. On the far right is a Links hotlink. Clicking on that brings up anything publicly available as a link to this paper, including nucleotide and protein sequences. 

Choose the nucleotide link and examine the titles on the next page. One of the sequences listed is the entire genome for the virus. This is much too large to align, so we will not import that sequence to our Workbench session. Select the four partial envelope protein sequences: two geese “Goo”, one stork “ST” and one gull “Gull”. We need to save the sequence data to our lab computers as a FASTA file, then upload the sequence data to the Workbench site.

To save the files as FASTA, select the FASTA option under the Display controls, then choose File under the Send to controls and click on Send to to save the data. The computer will probably ask you to either save the file or open it with an application of your choosing. Save the file to the desktop, giving it a name you can recall when you need to upload the file to the Workbench site.
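If you open the saved file in a text editor, you will see the FASTA layout: each sequence starts with a header line beginning with the > symbol, followed by one or more lines of sequence letters. The short Python sketch below reads that layout. The function name and the two records are invented for illustration and are not the real WNV data.

```python
def parse_fasta(text):
    """Read FASTA text into a dict mapping header -> sequence."""
    records = {}
    name = None
    for line in text.splitlines():
        line = line.strip()
        if not line:
            continue  # skip blank lines
        if line.startswith(">"):
            # A header line starts a new record; drop the leading ">".
            name = line[1:]
            records[name] = ""
        elif name is not None:
            # Sequence lines are joined into one continuous string.
            records[name] += line
    return records


# Made-up records, only to show the file layout.
example = """>sample_A hypothetical record
ACGTAC
GTTA
>sample_B hypothetical record
ACGTTC
"""

records = parse_fasta(example)
for header, sequence in records.items():
    print(header, len(sequence))
```

A file that holds all four bird samples would simply contain four such header-plus-sequence records, one after another.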

Step 3: Uploading Nucleic Sequences

Now that we have saved the nucleotide sequences that we want to compare, we need to upload them to our Workbench site. Highlight Add New Nucleic Sequence, then hit Run.

You can then use the Browse feature to find the FASTA file you just saved on our computer. Once you have that selected and it is showing in your Browse window, click Upload File. The sequences should now show up in the windows below the Browse window.

We can then edit the names of the sequences so we can more easily see the source. While the sequences are still visible, delete some of the information in the sequence box (not the label box) so that the name better reflects the source of the data:


Delete everything from “gi_” to right before “ISR98”. Leave the > symbol at the beginning. Repeat this for all four sequences, click Upload and then Save to save the sequences to your Workbench files.
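The same header clean-up can be expressed as a small Python function, which may help if you ever need to rename many sequences at once. This is a sketch under assumptions: the function name is invented, and the example header is a made-up stand-in, since the exact text of the real headers is not reproduced here.

```python
def shorten_header(header, keep_from="ISR98"):
    """Keep the leading ">" and everything from keep_from onward."""
    if not header.startswith(">"):
        return header  # not a FASTA header line; leave it alone
    position = header.find(keep_from)
    if position == -1:
        return header  # marker not found; leave the header unchanged
    return ">" + header[position:]


# Hypothetical header; the real ones will contain different text.
example_header = ">gi_00000000_hypothetical_text_ISR98_example_sample"
short_header = shorten_header(example_header)
print(short_header)
```

Lines that are not headers, or headers without the marker, are returned unchanged so that no sequence data is lost by accident.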

Step 4: Aligning Nucleic Sequences

The next step is to compare the sequences for differences and similarities like we did with the paper exercise alignments. After saving the edited sequences, select all four by clicking on the boxes to the left of their names. Scroll down through the tool selection screen until you find CLUSTALW - Multiple Sequence Alignment. Highlight that and hit Run.

The next screen has a series of settings on it. These settings can be very important to researchers, and you may want to investigate those in a future course. For right now we are going to accept all the default settings. Hit Submit. This part may take a few seconds as the computer aligns the sequences based on a series of mathematical calculations.
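One of the simplest calculations behind a comparison like this is the fraction of aligned positions at which two sequences differ. The Python sketch below computes that fraction for every pair of sequences. It is not the method CLUSTALW itself uses, which relies on more elaborate scoring, and the function names and toy sequences are invented for illustration.

```python
from itertools import combinations


def p_distance(seq_a, seq_b):
    """Fraction of compared positions at which two aligned sequences differ."""
    if len(seq_a) != len(seq_b):
        raise ValueError("aligned sequences must have the same length")
    compared = 0
    differences = 0
    for a, b in zip(seq_a, seq_b):
        if a == "-" or b == "-":
            continue  # ignore positions where either sequence has a gap
        compared += 1
        if a != b:
            differences += 1
    return differences / compared if compared else 0.0


def distance_table(alignment):
    """Distance for every pair of named sequences in the alignment."""
    return {
        (name_1, name_2): p_distance(alignment[name_1], alignment[name_2])
        for name_1, name_2 in combinations(alignment, 2)
    }


# Toy alignment, not real WNV data.
toy_alignment = {
    "bird_1": "ACGTACGT",
    "bird_2": "ACGTACGA",
    "bird_3": "ACCTACGA",
}

table = distance_table(toy_alignment)
for pair, distance in table.items():
    print(pair, distance)
```

Pairs with smaller distances share more of their sequence, and a tree-building program places such pairs closer together.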

Scroll down the next window. You will see a color-coded sequence alignment, a phylogenetic tree and some statistical data about the alignments. Use these to answer the following questions about your data.


1. In how many places are there differences among the DNA sequences? Describe these points of difference and make sure to identify which sequence(s) differ(s) from the other(s).

2. Carefully look at the years of the samples. Birds use the Middle Eastern area as a stop-over when migrating twice a year between European breeding grounds and African over-wintering areas. Go to the NCBI web site and look up the whole Malkinson et al. paper “Introduction of West Nile virus in the Middle East by migrating white storks.” Find information on the birds they sampled and whether or not they were migrants.




3. Based on the above information about migration, try to describe the spread of WNV, assuming a steady accumulation of changes in the virus's genetic sequence and exchange among birds, either local or migratory. Ask your instructor if you need some help answering this question.
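If you would like to double-check your count for question 1, the idea of a "point of difference" can be written as a short Python sketch: walk along the alignment one column at a time and report every column in which the sequences do not all carry the same letter. The function name and the toy alignment are invented for illustration and are not the real WNV data.

```python
def variable_sites(alignment):
    """List (position, column) for every column where the sequences differ."""
    names = list(alignment)
    length = len(alignment[names[0]])
    if any(len(alignment[name]) != length for name in names):
        raise ValueError("aligned sequences must have the same length")
    sites = []
    for index in range(length):
        # Collect the letter each sequence has at this column.
        column = {name: alignment[name][index] for name in names}
        if len(set(column.values())) > 1:
            # Report positions counting from 1, as alignment viewers do.
            sites.append((index + 1, column))
    return sites


# Toy alignment, not real WNV data.
toy_alignment = {
    "bird_1": "ACGTACGT",
    "bird_2": "ACGTACGA",
    "bird_3": "ACCTACGA",
}

sites = variable_sites(toy_alignment)
for position, column in sites:
    print(position, column)
```

Each reported column also shows which sample carries which letter, which is the information question 1 asks you to describe.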

Step 5: Extended Exercises