This package depends on the files from Project 2.
This is a group project. Groups of size 1 or 2 are acceptable.
A complete lexer for this project is included in lexer.mll (incuding rules for the extra credit). You should understand this file, but do not modify it.
Find in cfast.ml a specification of the AST for this assignment. There are already rules in parser.mly and lexer.mll for arithmetic. You must add the following rules, summarized below:
Instead of directly building X86 assembler as you did in the expressions assignment, this time you'll be generating intermediate code. The abstract instructions are defined in cfil.mli. These are mapped to X86 instructions in the file cfilerase.ml, which you may read at your leisure. All of your code generation logic belongs in controlflow.ml, where you must implement il_of_ast as defined in controlflow.mli.
We have provided some code to get you started. You don't need to use our definitions if you have a different design in mind.
To store temporary values or the values of variables in this assignment, you should use space on the stack. The module Cfil defines the type svar to represent a single 4-byte stack location. You can generate unique svars with mk_svar (). You'll have to keep track of the svars you use, as when you finally construct the proc containing your IL, you will need to specify all of the stack locations you use as an svar list.
You don't have to recycle svars after you're finished with them, but doing so will result in better code (since we aren't doing any optimization).
The let node, like OCaml's let, introduces a scoped variable binding. A variable's name bound by some let is only valid inside the expression wrapped by that let. It is not an error for one let to shadow another let's binding; that is, a program like this:
is perfectly valid (and has a return value of 6).
Note that, as in C and Java, the expression a = 3 returns the value 3 (and not unit as would a := 3 in OCaml).
Module Cfil expects you to generate program code in basic blocks. Since this can be inconvenient, we suggest that you first generate your program as a flat list of instructions. After this step, you can easily convert the list to a control flow graph. Look at the type cgext we have provided and the IL pseudo-instruction Ext for hints. Also remember that you can create fresh (that is, uniquely named to the current run of the compiler) labels using X86struct.mk_label ().
Seq evaluates its left hand side, discards the result if there is any, then evaluates its right hand side.
If evaluates its guard. If the guard is nonzero, the If evaluates and uses the result of its left branch. If the guard is zero, it evaluates and uses the result of its right branch.
Since this language does not require any static checking, it is possible
to write unsafe programs. In particular, you can write a program of
the form
While evaluates its guard. If the guard is nonzero, the body of the While is executed and its result is discarded; control flow then returns to the start of the loop. If the guard is zero, then the body of the While is skipped.
For evaluates its condition (or uses 1 as the value of its condition if none was given). If the condition is nonzero, then the body of the For is evaluated and its result is discarded; For then evaluates its increment expression (or does nothing if none was given) and returns to the start of the loop. If the condition is zero, then the body of the loop (and its increment expression) are skipped.
The while and for loops do not ever return values. Seq and Let return the value of their right-hand expressions. if may return values or it may return unit (or no value). Again, assume that if one branch of the if returns a value, the other branch will do the same.
Unless you're doing the extra credit, you don't need to generate code for the Switch or the ScBinop AST nodes. For those nodes, you can simply failwith "No extra credit."
Implement some combination of switch and short-circuit evaluation. switch looks up an integer in a table of labels. If the integer is bound to a label in the table, control flow passes to that label. If nothing in the table matches, control flow passes to the default label. Switches can return values (and it is expected that every branch will consistently return or not return a value). In short-circuit evaluation, the right-hand side of a logical operator is evaluated only if necessary. If the operator is ||, the right-hand side need not be evaluated if the left-hand side is nonzero. If the operator is &&, the right-hand side need not be evaluated if the left-hand side is zero. Make sure to return the correct value!
For switch, you will need to add the following to the grammar:
There are no delimiters between switch branches:
No new grammar rules for short-circuit evaluation are necessary. You will need to generate code for the ScBinop node. There are no special instructions for short-circuit evaluation.
You can choose to implement switch in terms of If
instructions or in terms of the Switch control-flow IL instruction.
If you choose to do the latter, you will need to implement the
Il.Switch case in cfilerase.ml. This will allow you
to use different dispatch techniques, including jump tables (in which
case you are permitted to use
We will also award extra credit points for interesting and clever solutions. This will be based in part on timings for benchmarks collected on ENIAC.
Edit the included readme.txt file, making sure to fill in your name(s) and email address(es).
You should also spend some time to ensure that your code is readable and well-commented. We have some suggestions and requirements for your programming style. You will definitely lose points for the file submission requirements (code must compile, 80 column limit, no tabs); however, barring any particularly egregious cases, you may take the remaining rules as suggestions.
Part of your grade will be based on testing. Some example tests and utility functions are provided in controlflow_test.ml. For this assignment, we require that you define at least 10 new unit tests. (These must be sufficiently different from one another to count— tests like 1 + 1 is 2, 1 + 2 is 3, 1 + 3 is 4 are too similar to be useful). In addition, you should write (and identify in your documentation) an eleventh more difficult test. Although anything will do, it is better if this test exercises most of the project's functionality. We will collect everyone's difficult tests and run them as part of our standard test suite during grading.
You should contribute at least one small test to the mailing list and are encouraged to send in test cases to highlight parts of the specification about which you may have questions. It's OK to count tests from the mailing list toward your ten (but you must write your own unique larger test case).
The grading breakdown for this project is as follows:
Extra credit breaks down like so:
Remember that your raw score will be scaled— that is, more difficult assignments will be worth more than less difficult assignments, even if the more difficult assignment has fewer available points.