In this homework, we’ll be working with some of OCaml’s features for abstraction and modularity.

This is the first homework where you’ll be submitting multiple files:

**higherOrder.ml**: practice with higher-order functions and generic types**tree.ml**: practice with binary trees and binary search trees**setTest.ml**: a reuseable module that we’ll use for testing different set implementations**listSet.ml**: contains`OLSet`

, an implementation of the set abstract data type using sorted lists**treeSet.ml**: contains`BSTSet`

, an implementation of the set abstract data type using binary search trees**vocab.ml**: some practice with sets and higher-order functions

You must submit all six files to run the online testers.

In addition to these files, we have provided you with
**setInterface.ml**, which defines the `'a set`

abstract data type and the operations which can be performed on it. Although
you should read this file thoroughly, you *should not* modify it (doing
so might cause your homework not to compile on our submission server, and you
will receive no credit)!

You should do the work in the order specified below—our server will grade in that order, and it will be easier for you!

When you configure your OCaml project, make sure to make executables for the six files listed above.

Notice that while setTest.ml contains generic tests for *all* the set
interfaces, if there are specific test cases you’ve written within either the
`OLSet`

or `BSTSet`

modules, you will need to run the
appropriate executable to run those tests.

The goal of this homework is to implement several generic data structures and then use them to perform a basic, real-world task. The homework consists of eight problems, divided into three parts.

You *may not* use
any OCaml library functions in this homework assignment, unless explicitly
instructed to.

There is also an FAQ document for this homework. Now download the homework and follow the instructions below to get started—good luck!

In the first problem, you will get some practice with functions that use
*generic types*. Problem 2 introduces *higher-order functions*
(which take another function as an argument, and/or produce a new function as a
result). You should write as many tests as you deem necessary to ensure the
correctness of your functions.

Problem 3 introduces *binary trees* and some recursive functions that
operate on trees. Problem 4 gives you practice with *binary
search trees*. Although implementations of some of these
functions are available in the lecture notes, you should be confident that you
understand the code you are submitting.

There’s no code to write for this part, but you should read through this
file in its entirety. This file defines a *module signature*, also known
as an *interface*, for the `'a set`

abstract data type. The
word “abstract” means that we can define a set in terms of its behavior and
mathematical properties, rather than its concrete implementation—as we’ll see
later on, lists, binary search trees, and many other structures (even
functions!) can be used to represent sets.

You should familiarize yourself with the definitions and values in this module before moving on to setTest.ml. (Remember, understand the problem! We’ve written the interface for you.)

It might be helpful to keep this file open in an Eclispse tab or separate
window while you work on the rest of the assignment. Remember, *do not*
modify this file.

This file introduces a reuseable module that can be used to test other
modules which conform to the `SET`

interface defined in
setInterface.ml. Because we will be implementing `SET`

in two
different ways, we can save some work by writing test cases against the
interface (since the two implementations have that in common).

You should make sure to *thoroughly* test all of the values defined
in setInterface.ml. Your TAs will be manually grading the completeness of your
test coverage in Problem 5. Finish writing your tests before moving on to the
next file!

Problem 6 is the first concrete implementation of the `SET`

interface. It defines a module *structure* that conforms to
`SET`

, and so it must contain all of the functions defined in the
interface.

Inside this module, we tell the OCaml compiler that a `'a set`

is
actually a `'a list`

, and so they are equivalent for purposes of the
implementation. However, because the external interface doesn’t specify what a
`'a set`

actually is, the fact that it’s a list is not made
available to users outside the `OLSet`

module. (That’s why you had
to write test cases using only the `SET`

interface.)

In addition to just implementing the interface, you will be responsible two
representation *invariants* about the lists in the function.

- The lists must be sorted in ascending order.
- The lists may not contain duplicate elements.

If you have a value of type `'a set`

, you can assume it conforms
to these invariants. Note that an arbitrary `'a list`

may not. These
invariants make particular functions easier to implement, while some have to be
a bit more careful about ensuring that all the `'a set`

s you create
conform to the invariants.

Part of your grade for this part will be based on how well you maintain and leverage the invariants in your implementation. We will be automatically testing this. Note that some functions, while technically correct, are not the most optimal implementations given your invariants. Think carefully about your inputs and outputs!

You can define as many helper functions as you want in your
solutions to these problems, but don't edit the `SET` interface or any of
the `mli` files! Our tester relies on the interface we’ve given you. If you
change it, your code may not be gradeable.

This is another implementation of the `SET`

interface, but
instead of lists, we will use binary search trees as the backing data
structure. Remember to pay attention to the BST invariants and maximize code
reuse where possible.

This file explores using set data structures to solve a practical problem—how many SAT words show up in a body of text?

There are 100 total points, broken down as follows:

- Problem 1 (Generics): 8
- Problem 2 (Higher-order functions): 12
- Problem 3 (Trees): 12
- Problem 4 (BSTs): 8
- Problem 5 (Testing sets): 5 (manually graded)
- Problem 6 (Lists as sets): 15
- Problem 7 (BSTs as sets): 15
- Problem 8 (Vocab): 15
- Kudos Problem (Frequency Table): n/a
- Invariants: 5 (manually graded)
- Coding style: 5 (manually graded)

As with Homework 2, we will be manually grading a portion of your homework. You will get a maximum of 85 points when you submit.

For this assignment, you may submit without penalty up to **3**
times. Each additional (on time) submission costs you 5 points.