Abstraction and Modularity

In this homework, we’ll be working with some of OCaml’s features for abstraction and modularity.

Assignment Overview

This is the first homework where you’ll be submitting multiple files:

higherOrder.ml: practice with higher-order functions and generic types
tree.ml: practice with binary trees and binary search trees
setTest.ml: a reuseable module that we’ll use for testing different set implementations
listSet.ml: contains OLSet, an implementation of the set abstract data type using sorted lists
treeSet.ml: contains BSTSet, an implementation of the set abstract data type using binary search trees
vocab.ml: some practice with sets and higher-order functions

You must submit all six files to run the online testers.

In addition to these files, we have provided you with setInterface.ml, which defines the 'a set abstract data type and the operations which can be performed on it. Although you should read this file thoroughly, you should not modify it (doing so might cause your homework not to compile on our submission server, and you will receive no credit)!

You should do the work in the order specified below—our server will grade in that order, and it will be easier for you!

Setting Up

When you configure your OCaml project, make sure to make executables for the six files listed above.

Notice that while setTest.ml contains generic tests for all the set interfaces, if there are specific test cases you’ve written within either the OLSet or BSTSet modules, you will need to run the appropriate executable to run those tests.

Instructions

The goal of this homework is to implement several generic data structures and then use them to perform a basic, real-world task. The homework consists of eight problems, divided into three parts.

You may not use any OCaml library functions in this homework assignment, unless explicitly instructed to.

There is also an FAQ document for this homework. Now download the homework and follow the instructions below to get started—good luck!

higherOrder.ml

In the first problem, you will get some practice with functions that use generic types. Problem 2 introduces higher-order functions (which take another function as an argument, and/or produce a new function as a result). You should write as many tests as you deem necessary to ensure the correctness of your functions.

tree.ml

Problem 3 introduces binary trees and some recursive functions that operate on trees. Problem 4 gives you practice with binary search trees. Although implementations of some of these functions are available in the lecture notes, you should be confident that you understand the code you are submitting.

setInterface.ml

There’s no code to write for this part, but you should read through this file in its entirety. This file defines a module signature, also known as an interface, for the 'a set abstract data type. The word “abstract” means that we can define a set in terms of its behavior and mathematical properties, rather than its concrete implementation—as we’ll see later on, lists, binary search trees, and many other structures (even functions!) can be used to represent sets.

You should familiarize yourself with the definitions and values in this module before moving on to setTest.ml. (Remember, understand the problem! We’ve written the interface for you.)

It might be helpful to keep this file open in an Eclispse tab or separate window while you work on the rest of the assignment. Remember, do not modify this file.

setTest.ml

This file introduces a reuseable module that can be used to test other modules which conform to the SET interface defined in setInterface.ml. Because we will be implementing SET in two different ways, we can save some work by writing test cases against the interface (since the two implementations have that in common).

You should make sure to thoroughly test all of the values defined in setInterface.ml. Your TAs will be manually grading the completeness of your test coverage in Problem 5. Finish writing your tests before moving on to the next file!

listSet.ml

Problem 6 is the first concrete implementation of the SET interface. It defines a module structure that conforms to SET, and so it must contain all of the functions defined in the interface.

Inside this module, we tell the OCaml compiler that a 'a set is actually a 'a list, and so they are equivalent for purposes of the implementation. However, because the external interface doesn’t specify what a 'a set actually is, the fact that it’s a list is not made available to users outside the OLSet module. (That’s why you had to write test cases using only the SET interface.)

Representation Invariants

In addition to just implementing the interface, you will be responsible two representation invariants about the lists in the function.

The lists must be sorted in ascending order.
The lists may not contain duplicate elements.

If you have a value of type 'a set, you can assume it conforms to these invariants. Note that an arbitrary 'a list may not. These invariants make particular functions easier to implement, while some have to be a bit more careful about ensuring that all the 'a sets you create conform to the invariants.

Part of your grade for this part will be based on how well you maintain and leverage the invariants in your implementation. We will be automatically testing this. Note that some functions, while technically correct, are not the most optimal implementations given your invariants. Think carefully about your inputs and outputs!

You can define as many helper functions as you want in your solutions to these problems, but don't edit the SET interface or any of the mli files! Our tester relies on the interface we’ve given you. If you change it, your code may not be gradeable.

treeSet.ml

This is another implementation of the SET interface, but instead of lists, we will use binary search trees as the backing data structure. Remember to pay attention to the BST invariants and maximize code reuse where possible.

vocab.ml

This file explores using set data structures to solve a practical problem—how many SAT words show up in a body of text?

Grading

There are 100 total points, broken down as follows:

Problem 1 (Generics): 8
Problem 2 (Higher-order functions): 12
Problem 3 (Trees): 12
Problem 4 (BSTs): 8
Problem 5 (Testing sets): 5 (manually graded)
Problem 6 (Lists as sets): 15
Problem 7 (BSTs as sets): 15
Problem 8 (Vocab): 15
Kudos Problem (Frequency Table): n/a
Invariants: 5 (manually graded)
Coding style: 5 (manually graded)

As with Homework 2, we will be manually grading a portion of your homework. You will get a maximum of 85 points when you submit.

For this assignment, you may submit without penalty up to 3 times. Each additional (on time) submission costs you 5 points.