Due: 26 April 2017 at 11:59:59pm

Download: hw6.zip

To Submit: Go Here

This is a group project. Teams of size 1 or 2 are acceptable. If you are working in a team of two, only one student needs to submit the project.

Getting Started

Note:The compilation setup is the same as in HW5. You'll need Menhir for the parser. You'll also need to add the nums and unix libraries to compile this project. If you use OCamlBuild, you can compile the project from the command line by running ocamlbuild -Is x86,util -libs nums,unix -use-menhir main.native (or use the provided Makefile).

Many of the files in this project are taken from the earlier projects. The new files (only) and their uses are listed below. Only those preceeded with an asterisk will need to be modified for this assignment.

llprograms/*.ll   - example .ll programs used in testing
atprograms/*.oat  - example .oat programs used in testing

datastructures.ml - set and map modules (enhanced with printing)
analysis.ml       - helper functions for propagating dataflow facts
cfg.ml            - "view" of LL control-flow graphs as dataflow graphs
*solver.ml        - the general-purpose iterative dataflow analysis solver 
*liveness.ml      - you will implement liveness analysis here
alias.ml          - an example dataflow analysis for testing solver
analysistests.ml  - test cases for liveness

*backend.ml      - you will implement register allocation heuristics here
----------------------------------------------------------------------------
(used in developing our tests, but perhaps useful to you:)
printanalysis.ml  - a standalone program to print the results of an analysis
Note: As usual, running main.native --test will run the test suite. main.native also now supports the -opt flag, which turns on the Oat optimization pass.

Overview

The OAT compiler we have developed so far produces fairly inefficient code, since it performs no optimizations at any stage of the compilation pipeline. In this project, you will implement a a simple dataflow analyses and register-allocaiont optimizations at the level of our LLVMlite intermediate representation in order to improve code size and speed.

Provided Code

The provided code makes extensive use of modules, module signatures, and functors. These aid in code reuse and abstraction. If you need a refresher on OCaml functors, we recommend reading through Chapter 9 of Real World OCaml.

In datastructures.ml, we provide you with a number of useful modules, module signatures, and functors for the assignment, including:

Task I: Dataflow Analysis

Your first task is to implement a version of the worklist algorithm for solving dataflow flow equations presented in lecture 21. Since we would ideally be defining several analyses, we'd like to reuse as much code as possible between each one. In lecture, we saw that each analysis differs only in the choice of the lattice, gen and kill sets, the direction of the analysis, and whether we take the meet or join to compute the flow into a node. We can take advantage of this by writing a generic solver as an OCaml functor and instantiating it with these parameters.

The Algorithm

Assuming only that we have a directed graph where each node is labeled with a dataflow fact and a flow function, we can compute a fixpoint of the flow on the graph as follows:

let w = new set with all nodes
repeat until w is empty
  let n = w.pop()
  old_out = out[n]
  let in = combine(preds[n])
  out[n] := flow[n](in)
  if (!equal old_out out[n]),
    for all m in succs[n], w.add(m)
end
  

Here equal, combine and flow are abstract operations that will be instantiated with lattice equality, meets or joins and the function computed by the gen and kill sets of the analysis, respectively. Similarly, preds and succs are the graph predecessors and successors in the flow graph, and do not correspond to the control flow of the program. They can be instantiated appropriately to create a forwards or backwards analysis.

Note: Don't try to use OCaml's polymorphic equality operator (=) to compare old_out and out[n]! Use the supplied Fact.compare.

Getting Started and Testing

Be sure to review the comments in the DFA_GRAPH and FACT module signatures in solver.ml, which define the parameters of the solver. Make sure you understand what each declaration the signature does -- your solver will need to use each one (other than the printing functions)!

(a.) Implement the solver

Your first subtask is to fill in the solve function in the Solver.Make functor in solver.ml. The input to the function is a flow graph labeled with the initial facts. It should compute the fixpoint and return a graph with the corresponding labeling. You may find the set datatype from datastructures.ml useful for manipulating sets of nodes.

To test your solver, we have provided a full implementation of an alias analysis in alias.ml. Once you've completed part 1, the alias tests in the test suite should all be passing. These tests compare the output of your solver on a number of programs with pre-computed solutions in analysistest.ml. Each entry in this file describes the alias information in a program from ./llprograms. To debug, you can compare these with the output of the Graph.to_string function on the flow graphs you will be manipulating.

(b.) Implement the Liveness flow functions

Task II: Register Allocation

For this task, you will implement a better register allocation strategy that makes use of the liveness information that you compute in Part I. Most of the instructions for this part of the assignment are found in backend.ml, where we have modified the code generation strategy to be able to make use of liveness information. The task is to implement a single function live_layout that beats our example "simple" register allocation strategy. We recommend familiarizing yourself with the way that the simple stragies work before attemting to write your own allocator.

The compiler now supports several additional command-line switches that can be used to select among different analysis and code generation options for testing purposes:

  --print-regs prints the register usage statistics for x86 code
  --regalloc {none|simple|live} use the specified register allocator
  --liveness {trivial|dataflow} use the specified liveness analysis
  

Note that for testing purposes, you can run the compiler with the -v verbose flag and/or use the --print-regs flag to get more information about how your algorithm is performing.

The goal for this part of the homework is to create a strategy such that code generated with the --regalloc live --liveness dataflow flags is "better" than code generated using the simple settings, which are --regalloc simple --liveness trivial. See the discussion about how we compare "live" register allocation to "simple" allocation in backend.ml. The "quality" test cases report the results of these comparisons.

Of course your register allocation strategy should be correct, so we still perform all of the correctness tests that we have used in previous version of the compiler. Your allocation strategy should not break any of these tests.

Task III: Experimentation / Validation

Of course we want to understand how much of an impact your register allocation strategy has on actual execution time. For the final task, you will create a new OAT program that highlights the difference. There are two parts to this task.

Create a test case

Post an OAT program to piazza. This program should exhibit significantly different performance when compiled using the "simple" register allocation strategy vs. using the "live" register allocation strategy with dataflow information. See the file atprograms/regalloctest.oat for an uninspired example of such a program. Yours should be more interesting.

Post your running time

Use the unix time command to test the performance of your register allocation algorithm. This should take the form of a simple table of timing information for several test cases, including the one you create and those mentioned below. You should test the performance in several configurations:
  1. using the --liveness trivial --regalloc none flags (baseline)
  2. using the --liveness trivial --regalloc simple flags (simple)
  3. using the --liveness dataflow --regalloc live flags (live)
  4. using the --clang flags (clang)

Test your compiler on at least these three programs:

Report the processor and OS version that you use to test. For best results, use a "lightly loaded" machine (close all other applications) and average the timing over several trial runs.

The example below shows one interaction used to test the matmul.ll file in several configurations from the command line:

~/cis341/hw/hw.regalloc/soln> ./main.native --liveness trivial --regalloc none llprograms/matmul.ll 
~/cis341/hw/hw.regalloc/soln> time ./a.out

real	0m1.647s
user	0m1.639s
sys	0m0.002s

~/cis341/hw/hw.regalloc/soln> ./main.native --liveness dataflow --regalloc live llprograms/matmul.ll 
~/cis341/hw/hw.regalloc/soln> time ./a.out

real	0m1.164s
user	0m1.154s
sys	0m0.002s

~/cis341/hw/hw.regalloc/soln> ./main.native --clang llprograms/matmul.ll 
~/cis341/hw/hw.regalloc/soln> time ./a.out

real	0m0.061s
user	0m0.053s
sys	0m0.004s

Grading

Projects that do not compile will receive no credit!

Your team's grade for this project will be based on:

Last modified: Fri Apr 14 09:43:08 EDT 2017