CIS 1200

Homework 3: Abstraction and Modularity

In this homework, we’ll be working with some of OCaml’s features for abstraction and modularity.

Assignment Overview

This is the first homework where you’ll be working in multiple files:

higherOrder.ml: Problems 1 and 2. Practice with higher-order functions and generic types

tree.ml: Problems 3 and 4. Practice with binary trees and binary search trees

setTest.ml: Problem 5. A reuseable module that we’ll use for testing different set implementations

listSet.ml: Problem 6. Contains OrderedListSet, an implementation of the set abstract data type using sorted lists

treeSet.ml: Problem 7. Contains BSTSet, an implementation of the set abstract data type using binary search trees

vocab.ml: Problem 8. Further practice with sets and higher-order functions

In addition to these files, we have provided you with setInterface.ml,which defines the abstract data type 'a set and the operations that can be performed on it. Although you should read this file thoroughly, you should not modify it in any way (and doing so might cause your homework not to compile on our submission server, and you will receive no credit)!

You should work on the problems in the order specified below—-our server will grade in that order, so it will be easier for you to follow along!

Note: The Codio menus will now provide several “Run …” options, one for each file that you need to execute.

Once you are finished, be sure to submit.zip`. (The zip file created by Codio will include just the six files mentioned above, but if, for some reason, you create the zip file yourself, be sure that it includes just these six.)

Setting Up

Visit the CIS 1200 course in Codio and select the “HW03: Abstraction and Modularity” project.

  • Checkpoint 0: You have already learned what you need to know to complete Problems 1, 3, and 4. Get started whenever you’d like!
  • Checkpoint 1: after learning the lecture 9 material (transform and fold), you can complete Problem 2.
  • Checkpoint 2: after learning the lecture 10 material (sets), you can complete the rest (Problems 5, 6, 7, and 8).

Feel free to try problems sooner than the checkpoints recommend, using the textbook, deep dive videos, or other resources to get you started!

Instructions

The goal of this homework is to implement several generic data structures and then use them to perform a basic, real-world task. The homework consists of eight problems, divided into three parts.

You may not use any OCaml library functions in this homework assignment, unless explicitly instructed to. Additionally, only = should be used for comparison (not ==).

There is also a FAQ document for this homework. This zip file contains the original source, in case you need to refer to it.

higherOrder.ml

In problem 1, you will get some practice with functions that use generic types. Recall that generic types in OCaml can be used to help abstract out frequently used programming patterns. Problem 2 introduces higher-order functions (functions that take another function as an argument and/or produce a new function as a result). You should write as many tests as you feel are necessary to ensure the correctness of your functions.

These higher-order functions are provided:

(* `transform` takes another function as an argument (and applies it to every element of a list). *)
let rec transform (f: 'a -> 'b) (l: 'a list) : 'b list =
  begin match l with
  | [] -> []
  | x :: xs -> f x :: transform f xs
  end
(* `fold` collapses a list into a single result value. *)
let rec fold (combine: 'a -> 'b -> 'b) (base: 'b) (l: 'a list) : 'b =
  begin match l with
  | [] -> base
  | x :: xs -> combine x (fold combine base xs)
  end
(* `filter` uses a predicate function to filter out elements that don't satisfy a condition `pred`. *)
let rec filter (pred: 'a -> bool) (l: 'a list) : 'a list =
  begin match l with
  | [] -> []
  | x :: xs ->
      let rest = filter pred xs in
      if pred x then x :: rest else rest
  end

tree.ml

Problem 3 covers binary trees and some recursive functions that operate on trees. Problem 4 gives you practice with *binary search trees. Recall that a binary search tree is a binary tree that follows some additional invariants:

  • Empty is a binary search tree, and
  • Node (lt, v, rt) is a binary search tree if both
    • lt is a binary search tree, and every value in lt is less than v
    • rt is a binary search tree, and every value in rt is greater than v

Notice that this description is recursive, just like our datatype definition! You may assume that all of the trees that are provided to the functions in this problem during grading satisfy this invariant.

NOTE: Many of the functions in the remainder of this file are available in the CIS 1200 lecture notes. Although it is okay to use those as a reference, you should ensure you understand them. The best way to do this is to rewrite them from scratch, without looking at the lecture notes as you do so.

setInterface.ml

There’s no code to write for this part, but you should read through this file in its entirety. It defines a module signature, also known as an interface, for the 'a set abstract data type. The word “abstract” means that we can define a set in terms of its behavior and mathematical properties, rather than its concrete implementation; we’ll see later on that lists, binary search trees, and many other structures (even functions!) can be used to represent sets.

Familiarize yourself with the definitions and values in this module before moving on to setTest.ml. (Remember, understand the problem!)

It may be helpful to keep this file open in a separate window while you work on the rest of the assignment. But please remember, do not modify this file.

setTest.ml

This file introduces a reuseable module that can be used to test other modules that conform to the SET interface defined in setInterface.ml. Because we will be implementing SET in two different ways, we can save work by writing a single set of test cases that can be used for both.

Make sure to write tests that thoroughly exercise all of the functions defined in setInterface.ml. Your TAs will be manually grading the completeness of your test coverage in Problem 5. Finish writing your tests before moving on to the next file!

NOTE: We strongly advise you write your tests for all the functions in the order they appear in the interface. That way, after writing all tests, you will be able to implement the functions one at a time in the same order and see your tests start to pass incrementally.

listSet.ml

Problem 6 is the first concrete implementation of the SET interface. It defines a module structure that conforms to SET, and so it must contain all of the functions defined in the interface.

Inside this module, we tell the OCaml compiler that an 'a set is actually an 'a list, and so they are equivalent for purposes of the implementation. However, because the external interface doesn’t specify what a 'a set actually is, the fact that it’s a list is not made available to users outside the OrderedListSet module. (That’s why you had to write test cases using only the SET interface.)

Representation Invariants

In addition to just implementing the interface, you will be responsible for maintaining two representation invariants on the lists returned by the function.

  • The lists must be sorted in ascending order.
  • The lists may not contain duplicate elements.

When you are given a value of type 'a set, you can assume it conforms to these invariants. They make particular functions easier to implement, while some have to be a bit more careful about ensuring that all the 'a sets you create conform to the invariants. Part of your grade for this part will be based on how well you maintain and leverage the invariants in your implementation. Note that some ways of implementing the required functions, while technically correct, are not the best implementations, given the invariants. Think carefully about your inputs and outputs! You can define as many helper functions as you want in your solutions to these problems, but don’t edit the SET interface or any of the mli files. (Our tester relies on the interface we’ve given you. If you change it, your code may not be gradeable!)

treeSet.ml

Problem 7 develops another implementation of the SETinterface, using binary search trees as the backing data structure instead of lists. Remember to pay attention to the BST invariants and maximize code reuse where possible.

NOTE: Two BSTs can be equal as sets without having the exact same structure. The OCaml (=) operator will only return true if the trees look exactly the same, and not if they contain the same elements but don’t have the same exact structure.

vocab.ml

This file demonstrates how to use set data structures to solve a practical problem – counting how many words from the official SAT study guide show up in a given body of text.

NOTE: Remember to re-comment the designated lines before submitting!

OCaml StyleChecker

Before submitting to Gradescope, you should run the stylechecker in Codio to make sure you have no style violations.

  1. In your project, select “Style Check Project” from the dropdown menu at the top of the screen where “Build Project” lives.
  2. Each style violation will appear with its corresponding file, line number, and a suggestion on how to fix it. If you have no violations, “No style violations” will be printed.
  3. An assignment with “No style violations” in Codio will receive 5.0/5.0 pts for Autograded Style Score in Gradescope.

If you are ever unsure about OCaml style, check out the CIS 1200 OCaml Programming Style Guide.

Grading

There are 100 total points, broken down as follows:

  • Problem 1 (Generics): 8
  • Problem 2 (Higher-order functions): 12
  • Problem 3 (Trees): 12
  • Problem 4 (BSTs): 8
  • Problem 5 (Testing sets): 5 (manually graded)
  • Problem 6 (Lists as sets): 15
  • Problem 7 (BSTs as sets): 15
  • Problem 8 (Vocab): 15
  • Kudos Problem (Frequency Table): n/a
  • Coding Style: 5
  • Design: 5 (manually graded)

As with Homework 2, we will be manually grading a portion of your homework. 5 points will go to “design”, which includes utilizing previous functions and leveraging invariants in your implementation. Another 5 points will go to testing. You will receive a maximum of 90 points when you submit.

For this assignment, you may submit without penalty up to 3 times. Each additional submission before the deadline costs you 5 points.