This assignment is due before 11:00 PM on Wednesday, October 18, 2017. There are two parts to this homework assignment: a programming assignment (Kd-Trees) and a written assignment (Binary Search Trees).

You’ll need to submit your solutions to both parts of the homework before the deadline.

Collaboration policy.

Programming Assignment 4: Kd-Trees

This assignment is also used in Coursera. You may not download solutions to a Coursera version of the assignment from the web. You may not use Coursera’s autograder to check the correctness of your solution.

Write a data type to represent a set of points in the unit square (all points have x- and y-coordinates between 0 and 1) using a 2d-tree to support efficient range search (find all of the points contained in a query rectangle) and nearest-neighbor search (find a closest point to a query point). 2d-trees have numerous applications, ranging from classifying astronomical objects to computer animation to speeding up neural networks to mining data to image retrieval.

Operations

The goals of this part of the assignment are:

Geometric primitives

To get started, use the following geometric primitives for points and axis-aligned rectangles in the plane.

Primitives

The immutable data type Point2D (part of algs4.jar) represents points in the plane. Here is the subset of its API that you may use:

public class Point2D implements Comparable<Point2D> {
   public Point2D(double x, double y)              // construct the point (x, y)
   public  double x()                              // x-coordinate 
   public  double y()                              // y-coordinate 
   public  double distanceTo(Point2D that)         // Euclidean distance between two points 
   public  double distanceSquaredTo(Point2D that)  // square of Euclidean distance between two points 
   public     int compareTo(Point2D that)          // for use in an ordered symbol table 
   public boolean equals(Object that)              // does this point equal that object? 
   public    void draw()                           // draw to standard draw 
   public  String toString()                       // string representation 
}

The immutable data type RectHV (part of algs4.jar) represents axis-aligned rectangles. Here is the subset of its API that you may use:

public class RectHV {
   public    RectHV(double xmin, double ymin,      // construct the rectangle [xmin, xmax] x [ymin, ymax] 
                    double xmax, double ymax)      // throw a java.lang.IllegalArgumentException if (xmin > xmax) or (ymin > ymax)
   public  double xmin()                           // minimum x-coordinate of rectangle 
   public  double ymin()                           // minimum y-coordinate of rectangle 
   public  double xmax()                           // maximum x-coordinate of rectangle 
   public  double ymax()                           // maximum y-coordinate of rectangle 
   public boolean contains(Point2D p)              // does this rectangle contain the point p (either inside or on boundary)? 
   public boolean intersects(RectHV that)          // does this rectangle intersect that rectangle (at one or more points)? 
   public  double distanceTo(Point2D p)            // Euclidean distance from point p to closest point in rectangle 
   public  double distanceSquaredTo(Point2D p)     // square of Euclidean distance from point p to closest point in rectangle 
   public boolean equals(Object that)              // does this rectangle equal that object? 
   public    void draw()                           // draw to standard draw 
   public  String toString()                       // string representation 
}

Do not modify these data types.

Brute-force implementation

Write a mutable data type PointSET.java that represents a set of points in the unit square. Implement the following API by using a red–black BST:

public class PointSET {
   public         PointSET()                               // construct an empty set of points 
   public           boolean isEmpty()                      // is the set empty? 
   public               int size()                         // number of points in the set 
   public              void insert(Point2D p)              // add the point to the set (if it is not already in the set)
   public           boolean contains(Point2D p)            // does the set contain point p? 
   public              void draw()                         // draw all points to standard draw 
   public Iterable<Point2D> range(RectHV rect)             // all points that are inside the rectangle (or on the boundary) 
   public           Point2D nearest(Point2D p)             // a nearest neighbor in the set to point p; null if the set is empty 

   public static void main(String[] args)                  // unit testing of the methods (optional) 
}

Implementation requirements

You must use either SET or java.util.TreeSet; do not implement your own red–black BST.

Performance requirements

Your implementation should support insert() and contains() in time proportional to the logarithm of the number of points in the set in the worst case; it should support nearest() and range() in time proportional to the number of points in the set.

Corner cases

Throw a java.lang.IllegalArgumentException if any argument is null.
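
The brute-force idea can be sketched as follows. This is a minimal illustration under stated assumptions, not the required implementation: the class name BruteForceSketch and the point type Pt are hypothetical stand-ins (Pt replaces Point2D so the block compiles without algs4.jar), and the rectangle is passed as four doubles instead of a RectHV. The ordered set gives logarithmic insert() and contains(), while range() and nearest() simply scan every point.

```java
import java.util.ArrayList;
import java.util.List;
import java.util.TreeSet;

// Sketch only. Pt is a hypothetical stand-in for Point2D so that the
// example compiles without algs4.jar.
class BruteForceSketch {
    static final class Pt implements Comparable<Pt> {
        final double x, y;
        Pt(double x, double y) { this.x = x; this.y = y; }

        double distanceSquaredTo(Pt that) {
            double dx = x - that.x, dy = y - that.y;
            return dx * dx + dy * dy;
        }

        // Assumed ordering: by y, then by x. Any total order would do.
        @Override
        public int compareTo(Pt that) {
            int c = Double.compare(y, that.y);
            return c != 0 ? c : Double.compare(x, that.x);
        }
    }

    private final TreeSet<Pt> points = new TreeSet<>();

    void insert(Pt p) {
        if (p == null) throw new IllegalArgumentException("argument is null");
        points.add(p); // logarithmic; a duplicate is ignored
    }

    boolean contains(Pt p) {
        if (p == null) throw new IllegalArgumentException("argument is null");
        return points.contains(p); // logarithmic
    }

    int size() { return points.size(); }

    // Linear scan: test every point against [xmin, xmax] x [ymin, ymax].
    List<Pt> range(double xmin, double ymin, double xmax, double ymax) {
        List<Pt> inside = new ArrayList<>();
        for (Pt p : points)
            if (p.x >= xmin && p.x <= xmax && p.y >= ymin && p.y <= ymax)
                inside.add(p);
        return inside;
    }

    // Linear scan: keep the point with the smallest squared distance.
    Pt nearest(Pt query) {
        if (query == null) throw new IllegalArgumentException("argument is null");
        Pt best = null;
        for (Pt p : points)
            if (best == null || p.distanceSquaredTo(query) < best.distanceSquaredTo(query))
                best = p;
        return best; // null if the set is empty
    }
}
```

Comparing squared distances avoids a square root per point and does not change which point is closest.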

2d-tree implementation

Write a mutable data type KdTree.java that uses a 2d-tree to implement the same API (but replace PointSET with KdTree). A 2d-tree is a generalization of a BST to two-dimensional keys. The idea is to build a BST with points in the nodes, using the x- and y-coordinates of the points as keys in strictly alternating sequence.

[Figure: a 2d-tree built by inserting the points (0.7, 0.2), (0.5, 0.4), (0.2, 0.3), (0.4, 0.7), and (0.9, 0.6), in that order.]
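
The alternating-key insertion can be sketched as below, using the five sample points (0.7, 0.2), (0.5, 0.4), (0.2, 0.3), (0.4, 0.7), and (0.9, 0.6). This is a hedged illustration, not the required KdTree API: the class name KdInsertSketch, the Node layout, and the use of raw doubles instead of Point2D are all assumptions made so the block is self-contained. Sending ties on the compared coordinate to the right subtree is also an assumption; the text above does not fix a tie-breaking rule.

```java
// Sketch only: names and layout are hypothetical, chosen for illustration.
class KdInsertSketch {
    static final class Node {
        final double x, y;
        Node left, right;
        Node(double x, double y) { this.x = x; this.y = y; }
    }

    Node root;
    int size;

    void insert(double x, double y) {
        root = insert(root, x, y, true); // the root compares x-coordinates
    }

    // useX is true on even levels (compare x) and false on odd levels (compare y).
    private Node insert(Node h, double x, double y, boolean useX) {
        if (h == null) {
            size++;
            return new Node(x, y);
        }
        if (h.x == x && h.y == y) return h; // point already in the set

        boolean goLeft = useX ? x < h.x : y < h.y;
        // Assumption: ties on the compared coordinate go right.
        if (goLeft) h.left = insert(h.left, x, y, !useX);
        else        h.right = insert(h.right, x, y, !useX);
        return h;
    }

    boolean contains(double x, double y) {
        Node h = root;
        boolean useX = true;
        while (h != null) {
            if (h.x == x && h.y == y) return true;
            boolean goLeft = useX ? x < h.x : y < h.y;
            h = goLeft ? h.left : h.right;
            useX = !useX; // switch coordinate at each level
        }
        return false;
    }
}
```

With the five points inserted in the order listed, (0.7, 0.2) is the root, (0.5, 0.4) is its left child, (0.2, 0.3) and (0.4, 0.7) are the left and right children of (0.5, 0.4), and (0.9, 0.6) is the right child of the root.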

The prime advantage of a 2d-tree over a BST is that it supports efficient implementation of range search and nearest-neighbor search. Each node corresponds to an axis-aligned rectangle in the unit square, which encloses all of the points in its subtree. The root corresponds to the unit square; the left and right children of the root correspond to the two rectangles split by the x-coordinate of the point at the root; and so forth.
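
One way to picture the node rectangles is the sketch below, which stores a rectangle in each node and uses it to skip subtrees during range search. It is an illustration under assumptions rather than the prescribed design: KdRangeSketch, Rect (a stand-in for RectHV), the visited counter, and the choice to store the rectangle in the node instead of recomputing it during the search are all hypothetical. For simplicity it allocates a candidate rectangle at every level of an insertion, even when the child already exists.

```java
import java.util.ArrayList;
import java.util.List;

// Sketch only: Rect is a hypothetical stand-in for RectHV.
class KdRangeSketch {
    static final class Rect {
        final double xmin, ymin, xmax, ymax;
        Rect(double xmin, double ymin, double xmax, double ymax) {
            this.xmin = xmin; this.ymin = ymin; this.xmax = xmax; this.ymax = ymax;
        }
        boolean contains(double x, double y) {
            return x >= xmin && x <= xmax && y >= ymin && y <= ymax;
        }
        boolean intersects(Rect r) {
            return xmax >= r.xmin && ymax >= r.ymin && r.xmax >= xmin && r.ymax >= ymin;
        }
    }

    static final class Node {
        final double x, y;
        final Rect rect; // encloses every point in this node's subtree
        Node left, right;
        Node(double x, double y, Rect rect) { this.x = x; this.y = y; this.rect = rect; }
    }

    Node root;
    int visited; // nodes examined by the most recent range() call

    void insert(double x, double y) {
        root = insert(root, x, y, true, new Rect(0.0, 0.0, 1.0, 1.0)); // root: unit square
    }

    private Node insert(Node h, double x, double y, boolean useX, Rect rect) {
        if (h == null) return new Node(x, y, rect);
        if (h.x == x && h.y == y) return h;
        Rect r = h.rect;
        if (useX) { // vertical split at h.x
            if (x < h.x) h.left = insert(h.left, x, y, false, new Rect(r.xmin, r.ymin, h.x, r.ymax));
            else h.right = insert(h.right, x, y, false, new Rect(h.x, r.ymin, r.xmax, r.ymax));
        } else {    // horizontal split at h.y
            if (y < h.y) h.left = insert(h.left, x, y, true, new Rect(r.xmin, r.ymin, r.xmax, h.y));
            else h.right = insert(h.right, x, y, true, new Rect(r.xmin, h.y, r.xmax, r.ymax));
        }
        return h;
    }

    List<double[]> range(Rect query) {
        List<double[]> found = new ArrayList<>();
        visited = 0;
        range(root, query, found);
        return found;
    }

    private void range(Node h, Rect query, List<double[]> found) {
        // Prune: a subtree whose rectangle misses the query holds no matches.
        if (h == null || !h.rect.intersects(query)) return;
        visited++;
        if (query.contains(h.x, h.y)) found.add(new double[] { h.x, h.y });
        range(h.left, query, found);
        range(h.right, query, found);
    }
}
```

For example, after inserting (0.7, 0.2), (0.5, 0.4), (0.2, 0.3), (0.4, 0.7), and (0.9, 0.6), a query rectangle [0.1, 0.45] x [0.25, 0.45] lies entirely to the left of the root's splitting line, so the right subtree of the root is never explored.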

Clients

You may use the following interactive client programs to test and debug your code.

Analysis of running time and memory usage (optional and not graded)

Deliverables

Submit the files PointSET.java and KdTree.java along with PointSETTest.java and KdTreeTest.java.

We will supply algs4.jar. You may not call library functions except those in java.lang, java.util, and algs4.jar.

Written Assignment 4: Binary Search Trees

The goals of this assignment are to test your understanding of the material covered in sections 3.1 to 3.3 of the textbook, and the lecture and recitation materials. You should read the textbook chapters before doing this part of the assignment.

Written homeworks must be typeset in LaTeX and submitted in PDF format. Please insert a page break between each question, so that your answer to each question starts on a new page in your PDF document.

Q1. BST height

What is the best-case BST height? The worst case? If shuffling leads, probabilistically, to a tree height of log n, why don’t we simply shuffle our input data before building our BST-based symbol table to avoid worst-case behavior?

Q2. Deleting nodes from a BST

Describe two methods for deleting nodes from a BST. What effect do they have on its running time? (No formal proof is needed; an English explanation is fine.)

Q3. Traversals

Draw the binary tree for this level-order traversal: P E S A N V L Y I. Give the in-order traversal, preorder traversal, and postorder traversal of the tree that you drew, including elements for the null leaves. For each of these traversals, could you reconstruct the tree? If so, explain how.

Q4. Proof 1

Prove that if a node in a BST has two children, its successor has no left child and its predecessor has no right child.

Q5. Proof 2

Prove that no compare-based algorithm can build a BST using fewer than lg(N!) ~ N lg N compares.