This project was prepared as part of a BioQUEST faculty development workshop entitled Bioinformatics in Biology Education: Working with Sequence, Structure and Function, held at the University of Vermont in April 2003. The BioQUEST Curriculum Consortium is committed to the reform of undergraduate biology instruction through an emphasis on engaging students in realistic scientific practices. This approach is sometimes characterized as inquiry-driven and is captured in BioQUEST's three P's (problem-posing, problem-solving, and peer-persuasion). As part of this workshop, groups of faculty were encouraged to initiate innovative curricular projects. We are sharing these works in progress in the hope that they will stimulate further exploration, collaboration, and development. Please see the following links for additional information:

Upcoming events               BEDROCK Problem Spaces

 
Bioinformatic Assignments in Computer Science Courses
 
 
 

 


Authors


Jim Mahoney
Marlboro College


Peter Wilkinson
McGill University

 
   
 


Possible Audiences:

Computer science instructors teaching classes at various levels, or instructors who teach programming in biology classes

 
 


Brief Overview:

The two of us spent an hour on Saturday afternoon tossing around ideas on how to put bioinformatics material into computer science courses, particularly in open-ended exploratory projects. In this case the point is not primarily to help the students learn NCBI tools to study a biology question, but instead to use the biological problems as examples of the computer science issues behind these tools.  

 
   
 


Project Materials:

Some examples of these kinds of problems include

  • information management issues
    • various types of databases, file formats, APIs
    • relational databases / SQL
    • XML
    • parsing text files (.fasta, ...)
  • genetic sequences as codes, including
    • parsing them (finding open reading frames, start and stop codons, ...)
    • differences between nuclear, mitochondrial, and chloroplast codes
    • how to characterize them (codon frequency, Shannon's entropy, ...)
  • discrete mathematics of phylogenetic trees
    • calculating distances between given sequences
    • creating trees by hand, then with a program
  • scripting techniques and WWW agents
    • automating routine tasks like those seen this weekend
    • web scraping, parsing, ...
  • user interface issues
  • 3D graphics and data visualization techniques
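
As one illustration of the file-parsing and sequence-characterization items in the list above, here is a minimal Python sketch that reads FASTA-formatted text, counts codons, and computes Shannon's entropy from observed symbol frequencies. The function names and the toy record are assumptions made up for this example; they do not come from the workshop materials or from any real database entry, and the parser handles only the simple case of ">" header lines followed by sequence lines.

```python
import math
from collections import Counter


def parse_fasta(text):
    """Parse FASTA-formatted text into a dict mapping header -> sequence."""
    records = {}
    header = None
    for line in text.splitlines():
        line = line.strip()
        if not line:
            continue
        if line.startswith(">"):
            # A new record starts; everything after ">" is used as the key.
            header = line[1:]
            records[header] = []
        elif header is not None:
            records[header].append(line.upper())
    return {name: "".join(parts) for name, parts in records.items()}


def codon_frequency(sequence):
    """Count non-overlapping codons, reading from position 0."""
    usable = len(sequence) - len(sequence) % 3
    return Counter(sequence[i:i + 3] for i in range(0, usable, 3))


def shannon_entropy(sequence):
    """Shannon entropy in bits per symbol, from observed symbol frequencies."""
    counts = Counter(sequence)
    total = len(sequence)
    return -sum((n / total) * math.log2(n / total) for n in counts.values())


# Made-up example data (hypothetical, not a real database record).
example = """>toy_sequence hypothetical example
ATGGCCATGGCC
TAA
"""
records = parse_fasta(example)
seq = records["toy_sequence hypothetical example"]
print(seq)
print(codon_frequency(seq))
print(shannon_entropy(seq))
```

A possible student exercise is to compare the entropy of a coding region with that of a shuffled copy of the same sequence, or to replace the hand-written parser with the one provided by a library such as Biopython and discuss the trade-offs.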
 
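
The "finding open reading frames" item in the list above can be sketched as a short scan over the three forward reading frames. This is a simplified version under stated assumptions: it uses the standard nuclear start codon (ATG) and stop codons (TAA, TAG, TGA), ignores the reverse strand, and reports only the first start codon before each stop. The name `find_orfs` and the `min_codons` parameter are hypothetical choices for this example.

```python
START = "ATG"
STOPS = {"TAA", "TAG", "TGA"}  # assumption: standard nuclear genetic code


def find_orfs(sequence, min_codons=2):
    """Return (start, end, frame) for each ORF on the forward strand.

    start is the index of the ATG; end is the index just past the stop codon,
    so sequence[start:end] is the full ORF including the stop.
    """
    sequence = sequence.upper()
    orfs = []
    for frame in range(3):
        start = None
        for i in range(frame, len(sequence) - 2, 3):
            codon = sequence[i:i + 3]
            if start is None and codon == START:
                start = i
            elif start is not None and codon in STOPS:
                # Length in codons, counting the stop codon itself.
                if (i + 3 - start) // 3 >= min_codons:
                    orfs.append((start, i + 3, frame))
                start = None
    return orfs


# Made-up toy sequence for illustration only.
toy = "CCATGGCCTAAGG"
for start, end, frame in find_orfs(toy):
    print(frame, start, end, toy[start:end])
```

Swapping in a different stop-codon set is one way to let students explore how mitochondrial codes change which reading frames stay open.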

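
For the "calculating distances between given sequences" item in the list above, one simple starting point is the proportion of differing positions between two already-aligned sequences (often called the p-distance). The sketch below assumes the sequences have equal length and are already aligned; the names `p_distance` and `distance_matrix` and the toy data are invented for this example. Other distance measures could be substituted without changing the overall structure.

```python
def p_distance(a, b):
    """Proportion of positions at which two aligned sequences differ."""
    if len(a) != len(b):
        raise ValueError("sequences must be aligned to the same length")
    if not a:
        return 0.0
    return sum(x != y for x, y in zip(a, b)) / len(a)


def distance_matrix(sequences):
    """Map each (name, name) pair to the p-distance between the two sequences."""
    names = sorted(sequences)
    return {
        (m, n): p_distance(sequences[m], sequences[n])
        for m in names
        for n in names
    }


# Made-up aligned toy sequences (hypothetical data).
toy_alignment = {"A": "ACGT", "B": "ACGA", "C": "TCGA"}
matrix = distance_matrix(toy_alignment)
for (m, n), d in sorted(matrix.items()):
    print(m, n, d)
```

A matrix like this is small enough for students to build a tree by hand first and then compare their result with what a clustering program produces.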
 
 


Resources and References:

http://bioquest.org/bedrock/sunnyvale_workshop/project3.htm
a similar effort at another BEDROCK workshop

http://www.bio.cam.ac.uk/seqlogo/
Sequence Logos - another way to graphically present bioinformatic data

http://cm.bell-labs.com/cm/ms/what/shannonday/paper.html
Claude Shannon's paper on information theory

http://www.maths.abdn.ac.uk/~igc/tch/mx4002/notes/node59.html
background on Huffman codes

http://aleph0.clarku.edu/~djoyce/java/Phyltree/cover.html
phylogenetic trees

http://bioweb.uwlax.edu/GenWeb/Molecular/Seq_Anal/Translation/translation.html
translation and open reading frames

http://www.bioperl.org
Perl Bio module (BioPerl)

http://www.biopython.org
Python Bio module (Biopython)

http://www.biosql.org
BioSQL project

http://www.mysql.org
MySQL database

 

 
   
 


Attachments


- ESTproblem.doc