This project was prepared as part of a BioQUEST faculty development workshop entitled Bioinformatics in Biology Education: Working with Sequence, Structure and Function, held at the University of Vermont in April 2003. The BioQUEST Curriculum Consortium is committed to the reform of undergraduate biology instruction through an emphasis on engaging students in realistic scientific practices. This approach is sometimes characterized as inquiry-driven and is captured in BioQUEST's three P's (problem-posing, problem-solving, and peer-persuasion). As part of this workshop, groups of faculty were encouraged to initiate innovative curricular projects. We are sharing these works in progress in the hope that they will stimulate further exploration, collaboration and development. Please see the following links for additional information:

Upcoming events               BEDROCK Problem Spaces

Bioinformatic Assignments in Computer Science Courses



Jim Mahoney
Marlboro College

Peter Wilkinson
McGill University


Possible Audiences:

Computer science instructors teaching classes at various levels, or instructors teaching programming in biology classes


Brief Overview:

The two of us spent an hour on Saturday afternoon tossing around ideas on how to put bioinformatics material into computer science courses, particularly in open-ended exploratory projects. In this case the point is not primarily to help the students learn NCBI tools to study a biology question, but instead to use the biological problems as examples of the computer science issues behind these tools.  


Project Materials:

Some examples of these kinds of problems include:

  • information management issues
    • various types of databases, file formats, APIs
    • relational databases / SQL
    • XML
    • parsing text files (.fasta, ...)
  • genetic sequences as codes, including
    • parsing them (finding open reading frames, start and stop codons, ...)
    • differences between the nuclear, mitochondrial, and chloroplast codes
    • how to characterize them (codon frequency, Shannon entropy, ...)
  • discrete mathematics of phylogenetic trees
    • calculating distances between given sequences
    • creating trees by hand, then with a program
  • scripting techniques and WWW agents
    • automating routine tasks like those seen this weekend
    • web scraping, parsing, ...
  • user interface issues
  • 3D graphics and data visualization techniques
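
Several of the sequence-related items above are small enough to prototype in a few lines, which may help an instructor gauge how hard they would be as assignments. The following is a minimal Python sketch, not part of the original workshop materials, that uses only the standard library. The function names (parse_fasta, find_orfs, codon_frequency, shannon_entropy, hamming_distance) are hypothetical, and the sketch assumes the standard nuclear stop codons, the forward strand only, and equal-length sequences for the distance calculation.

```python
import math
from collections import Counter

# Assumption: stop codons of the standard nuclear genetic code.
# Mitochondrial and chloroplast codes differ and would need their own tables.
STOP_CODONS = {"TAA", "TAG", "TGA"}


def parse_fasta(text):
    """Return {header: sequence} from FASTA-formatted text."""
    records, header = {}, None
    for line in text.splitlines():
        line = line.strip()
        if not line:
            continue
        if line.startswith(">"):
            header = line[1:]
            records[header] = []
        elif header is not None:
            records[header].append(line.upper())
    return {name: "".join(parts) for name, parts in records.items()}


def find_orfs(seq, min_codons=2):
    """Yield (start, end) for forward-strand ORFs: ATG up to an in-frame stop."""
    for frame in range(3):
        start = None
        for i in range(frame, len(seq) - 2, 3):
            codon = seq[i:i + 3]
            if start is None and codon == "ATG":
                start = i
            elif start is not None and codon in STOP_CODONS:
                if (i + 3 - start) // 3 >= min_codons:
                    yield (start, i + 3)
                start = None


def codon_frequency(seq, frame=0):
    """Count the codons found in one reading frame."""
    return Counter(seq[i:i + 3] for i in range(frame, len(seq) - 2, 3))


def shannon_entropy(seq):
    """Shannon entropy of the symbol distribution, in bits per symbol."""
    counts = Counter(seq)
    total = len(seq)
    return -sum((n / total) * math.log2(n / total) for n in counts.values())


def hamming_distance(a, b):
    """Number of mismatched positions between two equal-length sequences."""
    if len(a) != len(b):
        raise ValueError("sequences must have the same length")
    return sum(x != y for x, y in zip(a, b))


if __name__ == "__main__":
    # Toy data invented for illustration only.
    demo = ">seq1 toy\nATGAAATAG\nCC\n>seq2\nATGAAGTAGCC\n"
    seqs = parse_fasta(demo)
    for name, seq in seqs.items():
        print(name, list(find_orfs(seq)), round(shannon_entropy(seq), 3))
```

Each function is deliberately naive. Students could extend the sketch to the reverse strand, swap in an alternative codon table, or feed pairwise distances into a tree-building routine, which ties several of the topics in the list together.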


Resources and References:
a similar effort at another BEDROCK workshop
Sequence Logos - another way to graphically present bioinformatic data
Claude Shannon's paper on information theory
background on Huffman codes
phylogenetic trees
translation and open reading frames
Perl Bio module
Python Bio module
BioSQL project
MySQL database




- ESTproblem.doc