Homework: Computing Human Evolution
The files for this homework are in hw02.zip. The only file you need to modify is
"dna.ml". Once you are finished, be sure to submit your work.
Frequently Asked Questions
The course staff has documented some of the most frequently asked questions for this assignment
from past semesters in an FAQ.
- There are 90 points possible on the autograded part of this
assignment, plus 10 points that will be added during a later phase of
manual grading. Style and tests will each account for 5 points.
The functions for which we will specifically grade your tests are as follows:
- Although we will grade only the tests for the problems defined above,
you must be sure to always exhaustively test all functions in your program.
- You may submit without penalty up to THREE times.
- Each extra submission costs you 5 points.
Biologists use evolutionary trees (Figure A below) to show species
evolving from ancestor species. To practice using enumerations and
recursion over lists and trees, we will write a program that automatically
generates hypothesis trees like the one in Figure A.
Figure A: Evolutionary Tree for Apes
Our program input will be real samples of DNA, the genetic code that describes
how to build organisms (Figure B). DNA has two complementary helices, each a
sequence of the nucleotides adenine, thymine, guanine, and cytosine.
Adenine always appears opposite thymine; same for guanine and cytosine.
Figure B: Modeling the DNA Double Helix
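As a minimal sketch of the pairing rule described above, nucleotides can be modeled as an enumeration and a helix as a list. The names nucleotide, helix, complement, and complementary_strand are assumptions made for illustration; the definitions in "dna.ml" may differ.

```ocaml
(* Hypothetical names: the types and functions in "dna.ml" may differ. *)
type nucleotide = A | T | G | C

(* Assumption: a helix is modeled as a list of nucleotides. *)
type helix = nucleotide list

(* Adenine pairs with thymine; guanine pairs with cytosine. *)
let complement (n : nucleotide) : nucleotide =
  match n with
  | A -> T
  | T -> A
  | G -> C
  | C -> G

(* Build the opposite strand by complementing each position in order. *)
let rec complementary_strand (h : helix) : helix =
  match h with
  | [] -> []
  | n :: rest -> complement n :: complementary_strand rest
```

Under these assumptions, complementary_strand [G; C; A; T] evaluates to [C; G; T; A].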
The assignment file "dna.ml" contains the DNA sequences for 8 ape species. This is
real data from the Entrez Nucleotide database.
You will use this data to:
- Generate all possible evolutionary trees with the ape DNA at
the leaves. Internal nodes in these trees correspond to ancestor species.
- Estimate each tree's complexity.
- Choose the simplest tree as a candidate.
- Compare your candidate with the tree in Figure A.
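The "choose the simplest tree" step amounts to picking the minimum of a list under some cost function. The sketch below shows that selection in isolation. The name simplest and the cost parameter are hypothetical; cost stands in for whatever complexity estimate the homework file defines, and the candidates can be values of any type.

```ocaml
(* Return the candidate with the smallest cost, or None for an empty list.
   Assumption: [cost] is a stand-in for the assignment's complexity estimate.
   When several candidates tie, the earliest one in the list is kept. *)
let simplest (cost : 'a -> int) (candidates : 'a list) : 'a option =
  let rec go best best_cost rest =
    match rest with
    | [] -> best
    | c :: tl ->
      let k = cost c in
      if k < best_cost then go c k tl else go best best_cost tl
  in
  match candidates with
  | [] -> None
  | c :: tl -> Some (go c (cost c) tl)
```

For instance, simplest String.length ["abc"; "a"; "ab"] evaluates to Some "a". Returning an option makes the empty-input case explicit instead of raising an exception.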
For example, given the four DNA helices ("GCAT", "TCGT", "TAGA" and "GAGA")
one possible evolutionary tree is shown in Figure C below. This constitutes an
evolutionary tree for the given helices, since the helices are at the leaves
of the tree, while the internal nodes have as labels helices that are in some
sense as close as possible to the labels of their children. Specific details
on constructing evolutionary trees are provided in the homework file.
Figure C: Completely Labeled Tree with Helices of Length Four
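One way to make "as close as possible" concrete is to count the positions at which two helices differ (the Hamming distance) and sum that count over every parent-child edge of a labeled tree. This is only one plausible reading; the homework file gives the actual definitions. In the sketch below, helices are plain strings as in the example above, the names hamming, tree, label, and total_changes are hypothetical, and the internal labels of the example tree are made up for illustration rather than taken from Figure C.

```ocaml
(* Count the positions at which two equal-length helices differ. *)
let hamming (a : string) (b : string) : int =
  let n = String.length a in
  if n <> String.length b then
    invalid_arg "hamming: helices must have equal length";
  let rec go i acc =
    if i = n then acc
    else go (i + 1) (if a.[i] <> b.[i] then acc + 1 else acc)
  in
  go 0 0

(* Assumption: a binary tree whose leaves and internal nodes carry helices. *)
type tree =
  | Leaf of string
  | Node of string * tree * tree

(* The helix stored at the root of a tree. *)
let label (t : tree) : string =
  match t with
  | Leaf h -> h
  | Node (h, _, _) -> h

(* Sum of Hamming distances along every parent-child edge. *)
let rec total_changes (t : tree) : int =
  match t with
  | Leaf _ -> 0
  | Node (h, l, r) ->
    hamming h (label l) + hamming h (label r)
    + total_changes l + total_changes r

(* The four example helices at the leaves; internal labels are illustrative. *)
let example : tree =
  Node ("GCGA",
        Node ("GCGT", Leaf "GCAT", Leaf "TCGT"),
        Node ("GAGA", Leaf "TAGA", Leaf "GAGA"))
```

With these made-up labels, total_changes example evaluates to 5. A different choice of internal labels would generally give a different total.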
Browse these links for more background information:
We derive our images from Wikimedia sources available under Creative Commons licenses.