CSci 493 Syllabus        FAQ



Problem Set 3: Agreement Forests
CSci 493: Algorithms for Tree Distributions
Fall 2022


Learning Objective: to recap standard tree traversal techniques and gain fluency with the biopython phylo and nexus submodules.

General Notes

Program Descriptions

Program 3.1: Subtree Replacement.Due 10am, Wednesday, 26 September.
Learning Objective: to use built-in functions to manipulate tree objects.
Available Libraries: Biopython and core Python 3.7+.

Description: Write a program that takes a tree (as a Newick string) and a pair of leaves, and returns a new tree with the subtree containing those leaves replaced by a placeholder.

For example, if the user enters:

((1,2),((3,4),5))
as the tree, and 3 5, your program should find the least common ancestor of the leaves (in this case, the parent of (3,4) and 5), and replace it by a placeholder:
((1,2),A)

If instead, the leaves entered were 3 4, the output would be:

((1,2),(A,5))

Program 3.2: Cluster Reduction Rule.Due 10am, Thursday, 27 September.
Learning Objective: to use built-in functions to manipulate tree objects.
Available Libraries: biopython and core Python 3.7+.

Description: Write a program that takes two trees (as a Newick strings). If the trees have no identical subtrees, the two inputted trees are returned. Otherwise, an identical subtree (with 2 or more leaves) is replaced by a placeholder and the modified trees are returned.

For example, if the user enters:

((1,2),((3,4),5)); ((((1,2),3),4),5);
The only subtree in common is (1,2), so the output is:
(A,((3,4),5)); (((A,3),4),5);

For example, if the user enters:

((1,2),((3,4),(5,6)); ((((1,2),3),4),(5,6));
There are two subtrees in common: (1,2) and (5,6). Your program only needs to replace one of them. So, a possible output is:
((1,2),((3,4),A)); ((((1,2),3),4),A);

Program 3.3: Forest Equality.Due 10am, Friday, 28 September.
Learning Objective: to use built-in functions to manipulate tree objects.
Available Libraries: biopython and core Python 3.7+.

Description: Write a program that takes two forests and returns true if they are equal and false otherwise.

For example, if the user enters:

(1,2); ((3,4),5)); (6);
as the first forest and
(1,(6,2); ((3,4),5));
as the second forest, your program should return false, since the first forest has 3 trees and the second forest has 2 trees, so are not equal.

But, if the user enters:

((3,4),5)); (1,(2,6);
as the first forest and
(1,(6,2); ((3,4),5));
as the second forest, the forests have the same number of trees, and the trees can be matched to equivalent trees in the other forest, so, it should return true.