CIS 194: Final project

Overview/important dates

For CIS 194 you will complete a final project which will tie together some of the things you have learned and give you some practical Haskell development experience. The expectation is for you to spend around 15-20 hours working on the project. Here are some important dates:

Friday, November 21 – Project proposals due
Friday, December 5 – Checkpoint submission
Thursday, December 11, 12noon-2pm – Final project demos, Levine Lobby
Monday, December 15 – Final project submission deadline
Thursday, December 18, 2:30pm-4:30pm – Final project demos, Greenberg lounge, 1st floor Skirkanich

Get started early!

Format

You may work by yourself, or in groups of up to three students. Note, however, that projects for groups of three will be held to somewhat higher standards than those for individuals or pairs. Groups of five are right out.

There are two types of projects you may complete:

Application/library

For your project you may write some sort of Haskell application or library which does something fun/useful/interesting. Your imagination is the limit. Some possibilities/suggestions include:
- A program to play a game (like tic-tac-toe, Connect 4, othello, gomoku, poker, mancala, …) against the user.
- A program to solve puzzles like sudoku or kenken.
- A program to generate random mazes and let the user interactively solve them, or to solve mazes input by the user.
- An implementation of some interesting data structure like red-black trees, 2-3-4 trees, binomial heaps, or Fibonacci heaps.
- A parser and interpreter for a small programming language, such as a while language.
- A raytracer.
- Take an interesting program you have written in some other language, and figure out how to port/re-implement it in idiomatic Haskell.
- Write an IRC bot offering functionality of your choice. You might want to look at existing Haskell IRC libraries such as fastirc, simpleirc, irc, ircbot, etc.
- Write an alternative to a common command-line tool in Haskell; something simple like grep or netcat is OK. For the intrepid, consider implementing an ssh client, HTTP daemon, or similar.
- Do something cool with APIs available on the Internet. (A Twitter bot, perhaps?)
- Write a server-side library in Haskell to streamline pragmatic access to some dataset over an HTTP API.
- Find an existing web service that does not have an official Haskell wrapper and write one yourself. A good example that came up recently in a personal project is Lob.
- Write a toolkit to normalize/transform data: this toolkit would include, for example, data structures/functions to take a messily-formatted phone number and normalize it to E.164, data structures/functions to take a human name and store it in a standard format (caveat), data structures/functions to normalize US postal addresses, etc. This could be massively helpful to other developers who use Haskell for scraping and data processing.
- Build a parser and theorem prover for intuitionistic propositional logic.
- Build a parser, type checker, and interpreter for the lambda calculus.
- Whatever else your creativity suggests!
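
To give a flavor of the data-normalization idea in the list above, here is a minimal sketch of turning a messily formatted phone number into E.164 form. It assumes US/Canada numbers only; the names `E164` and `normalizeNANP` are made up for this illustration, and a real toolkit would need to handle other country codes and validate much more carefully.

```haskell
import Data.Char (isDigit)

-- | A phone number in E.164 form: '+' followed by at most 15 digits.
-- (Hypothetical wrapper type, for illustration only.)
newtype E164 = E164 String
  deriving (Eq, Show)

-- | Normalize a messily formatted number, ASSUMING it is a US/Canada
-- number. Accepts 10 digits, or 11 digits with a leading country code 1;
-- all punctuation and spacing is ignored. No further validation is done.
normalizeNANP :: String -> Maybe E164
normalizeNANP raw =
  case filter isDigit raw of
    ds@('1' : rest) | length rest == 10 -> Just (E164 ('+' : ds))
    ds              | length ds == 10   -> Just (E164 ("+1" ++ ds))
    _                                   -> Nothing
```

Wrapping the result in a newtype means the rest of a program can only obtain an `E164` value through the normalizer, which is one way to let the type system track which strings have already been cleaned up.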

Open-source contribution

For your project you may choose an open-source library or application on Hackage to contribute to. Contributions may include bug fixes, new features, and/or documentation. Here are a few suggestions—these are projects whose authors/maintainers have indicated that there would be good ways for beginning Haskell students to contribute. (But you are free to work on any project you like, as long as you can find a reasonable way to contribute.) If you want to try contributing to one of these projects, you should contact the relevant person(s) and discuss it with them prior to submitting your project proposal.

Open-source projects students have contributed to in prior years include a package to efficiently compute prime numbers using a mutable-array-based sieve and Haskell bindings to the Kinect.
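
For readers unfamiliar with the technique mentioned above, the following is a generic sketch of a sieve of Eratosthenes over a mutable unboxed array in the `ST` monad. It is not the code of the package referred to; the function names `sieve` and `primesUpTo` are chosen here for illustration, and it relies on the `array` package that ships with GHC.

```haskell
import Control.Monad (forM_, when)
import Data.Array.ST (newArray, readArray, runSTUArray, writeArray)
import Data.Array.Unboxed (UArray, assocs)

-- | Sieve of Eratosthenes. The array is mutated inside ST and then
-- frozen, so callers only ever see a pure, immutable result.
sieve :: Int -> UArray Int Bool
sieve n = runSTUArray $ do
  isPrime <- newArray (2, n) True          -- empty array when n < 2
  forM_ (takeWhile (\i -> i * i <= n) [2 ..]) $ \i -> do
    stillPrime <- readArray isPrime i
    when stillPrime $
      -- cross out every multiple of i, starting at i * i
      forM_ [i * i, i * i + i .. n] $ \j ->
        writeArray isPrime j False
  return isPrime

-- | All primes less than or equal to n, in increasing order.
primesUpTo :: Int -> [Int]
primesUpTo n = [i | (i, True) <- assocs (sieve n)]
```

For example, `primesUpTo 30` evaluates to `[2,3,5,7,11,13,17,19,23,29]`.
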
- Flowbox (Contact: Wojciech Daniło, wojciech.danilo at gmail dot com)
  
  From Wojciech: We are creating Flowbox – a dataflow programming framework intended for high-performance data processing and analysis. Flowbox facilitates the processing of any kind of data – like sound, images or big data – in an easy and visual way. It is based on our programming language – Luna, which has two interchangeable representations – a visual and a textual one – and you can switch between them at any time.
  
  On top of it we have built an example use case for the platform – Flowbox FX, intended for high-end image and video post processing. We have some strong background in the VFX industry (I was, for example, leading for a few years the R&D department in one of the biggest film studios in Europe – in Alvernia Studios).
  
  A few months ago we went to San Francisco and Los Angeles and were talking to some big studios out there, like DreamWorks, Tippett Studio or Rhythm and Hues to name a few. Right now we are releasing our commercial product – the Flowbox FX and it is being tested in some facilities. But everything we are doing is based on our programming language. The visual representation is easy to grasp and suitable for non-programmers to create advanced components – like fire or smoke simulations. And, we are strongly thinking about releasing the language as an open source project – so far I’m sure we will release it free for everyone.
  
  Luna (the language) is very interesting – it is a pure functional, lazy, object oriented one (with immutable objects) and some funny things, like easier monad support built into the compiler. In the beginning we were compiling it to Haskell, and right now we are slowly switching to compile it to Haskell core – and this will be the final solution we were looking for.
  
  Contact me if you are interested in learning more and contributing.
- IHaskell (Contact: Andrew Gibiansky, andrew.gibiansky at gmail dot com)
  
  IHaskell is a platform for interactive evaluation and analysis of Haskell expressions. Andrew suggests the following ways a student from CIS 194 might contribute:
  - Try to get IHaskell working on 7.10 before it is released – not terribly hard, probably, but requires getting 7.10 installed and maybe making sure some of the IHaskell dependencies work on 7.10.
  - Implementing a smarter autocomplete: scan sources of popular libraries on Hackage, collect statistics about identifier and module usage, and suggest common ones before less common ones.
  - Take on some of the more interesting issues, such as adding :m -MyModule to remove a module from scope, making a comprehensive test suite (and fixing some issues with the previous one), allowing inline template Haskell, parsing and using LANGUAGE pragmas, and others.
  - Adding ihaskell-display package support for some common libraries.
  - Working with me to hammer out a few uses for interactive widgets. I have a proof of concept interactive widget for Parsec working, where if you display a Parsec a (with Show a), it pops up a textbox and anything you type into the textbox is asynchronously and automatically parsed, results are displayed, and errors are highlighted. Many other interactive widgets could be useful, but haven’t been developed because I lack the time.
  - Using chrisdone’s present library in IHaskell to lazily show data types for debugging.
  - Using TypeHoles to implement smarter completion for types and values.
  Andrew is clear in his email to me that he is eager to provide mentorship through the process.
- Chatter (Contact: Rogan Creswick, creswich at gmail dot com)
  
  There are a handful of tasks on Chatter (an NLP toolkit) that might work.
  
  Setting up a framework for evaluating performance might be a fun and cleanly-separable task of a good size (it’d mostly mean wiring up existing APIs, defining a ui and optionally learning about how lazy evaluation impacts taking timing measurements in Haskell and using criterion).
- Mateusz Kowalczyk (fuuzetsu at fuuzetsu dot co dot uk) has several ideas. He says:
  - tsuntsun is a front-end to tesseract OCR software. Possible work involves improving the interface (hey, I’m a programmer not a designer…), adding features such as on-the-fly translation through Bing or another service (this seems like a nice little project, you end up with a lib to talk to the service even if they don’t get to integrate it), add support for history (probably not enough for a project by itself) or if the student is more ambitious, automatic region detection as boasted by the (proprietary) software KanjiTomo. So there’s talking to the service, messing around with a GUI (gtk2hs) or work with images/pattern recognition (I can only help with Haskell side here). The plus side is that the existing code is pretty primitive if they choose to do something with the GUI, no 7-layer deep monad transformers.
  - free-game is a game library but it is pretty small in what it offers: you get some basic stuff but it’s not a full-blown suite. A project could involve writing useful libraries around it, I could come up with some specifics if there’s interest. I can’t offer help with hacking on free-game itself but I wouldn’t mind overseeing any libs that spawn around it.
  - For a while now I have quite an annoyance with criterion; it produces those pretty HTML + JS graphs you can mouse-over and stuff, right? The problem is that they are absolutely useless to the point of hanging your browser if you have more than a few benchmarks on the page. I think a nice project would be developing a reporting package with ‘diagrams’ or something which takes Criterion’s output (CSV) and spits out images we can actually inspect. This seems like something a student can get on with pretty easily and take it as far as they wish while having practical value at the same time.
  I wish I could offer more ideas but it’s hard to come up with something that will fit into 20 hours including getting the feel for things and that might be interesting to the student, have some value to the rest of us and doesn’t feel like an exercise. I am a big fan of “learn by actually hacking stuff” approach, just not in such a (relatively) small timescale ;).
  
  I don’t mind overseeing someone if they happen to pick anything I mentioned here or something that interests me. I don’t mind volunteering as “overseer” for some other project if the student is willing and is likely to join the community for longer.
- Robot (Contact: Chris Wong, lambda.fairy at gmail dot com)
  
  From Chris: I maintain a GUI testing package called Robot. It’s a simple library with clear semantics.
  
  Some project ideas, from easiest to hardest:
  - Taking screenshots. XHB exposes a GetImage call; it shouldn’t be too much work integrating that into the library.
  - Adding a configurable delay between operations. This involves some work with monads (ReaderT specifically).
  - Windows and Mac support. Some good practice with Cabal and the FFI here.
  - xdotool does a few things, like searching for windows by title, that I’d like to see in Robot. Porting some of these features over sounds like a good (albeit open-ended) project.
- Ernesto Rodriguez (neto at netowork dot me) is a Master’s student at Utrecht University. He supplies the following ideas:
  
  A project that might be good for an introductory FP course is Cryptographer. Its objective is to encrypt data in html files so you can publicly share those files but only people with the password can see the contents. The nice thing is that you can send one of those files to anyone since it’s html so all you need is a browser to open it. Anyways, in my wishlist (and future steps) for this tool I have the following:
  - When appending data to an encrypted file, add checks to ensure the provided decryption key is correct (currently, if you give a wrong key it simply decrypts gibberish and appends your content to it)
  - Add support for data other than text. For example, embedding images and files by encrypting the base64 encoding of their bits.
  - I use the tool primarily for passwords, so would be nice to add some tools for them. In particular, it would be nice if the encrypted file could contain buttons which one can click to copy passwords into clipboard. Also if it could have a setTimeout() somewhere so the file gets encrypted again after some time automatically.
  - Add more ciphers. Currently I encrypt data using TwoFish. I use it because I like the cipher, but also because I use GHCJS to generate the html file that performs decryption, so the cipher must be written 100% in Haskell (or do some foreign calls to an external JavaScript library). I used to support BlowFish as well but I removed it since the cipher is not 100% secure.
  - A UI (both html and desktop) would be nice. Even if encryption could be done 100% on a HTML UI would also be advanced. But in order for appending to work over the net, code has to be added so files are retrieved via Ajax (not wget as I currently do it :P).
  - Improve the command line interface. I use my own experimental extension of CmdArgs for the command line (which I wrote only to try GHC Generics out). An ambitious student could consider improving that tool or simply using standard cmdargs for the command line arguments.
  The most advanced library I use in the project is Pipes, which is easy to grasp, and a lot of the code is pure code (i.e. encryption algorithms, generating html, etc.) so that makes things simple as well.

Project proposal

You must submit a project proposal by Friday, November 21. This gives us a chance to discuss your proposal and ensure it will make a suitable project. You are encouraged to submit your proposal earlier than November 21 if you already have an idea. You should also feel free to submit several project proposals if you would like help deciding which is most suitable.

To submit your proposal, send an email of a few paragraphs to me (eir at cis dot upenn dot edu) with the subject “CIS 194 final project proposal”. Try to answer the questions: What do you propose to do? What do you hope to learn from the project? What are some concrete goals, i.e. how will we judge the success of your project?

There is no formal formatting requirement for the proposal, but I will ask you to revise proposals that are too vague. I should have a decent idea of what the final product will look like, so that the TAs and I can evaluate whether you achieved what you set out to do. Your proposal must also address what to expect at the checkpoint, and must explicitly discuss how you plan on testing your work (that is, via unit tests, or QuickCheck, or …).
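
As a rough illustration of what a testing plan might translate into, the sketch below states one property in the style QuickCheck checks and also runs it over a few hand-picked cases, unit-test style. The function `insertSorted` is a made-up stand-in for whatever your project implements, and the sketch deliberately uses only `base`; with QuickCheck installed, one would instead hand the property to `quickCheck` from `Test.QuickCheck`.

```haskell
import Data.List (sort)

-- | Hypothetical function under test: insert into an already sorted list.
insertSorted :: Int -> [Int] -> [Int]
insertSorted x ys = smaller ++ [x] ++ rest
  where
    (smaller, rest) = span (< x) ys

-- | Property: inserting into a sorted list agrees with consing the
-- element on and re-sorting. QuickCheck would test this on random inputs.
prop_insertKeepsSorted :: Int -> [Int] -> Bool
prop_insertKeepsSorted x ys = insertSorted x (sort ys) == sort (x : ys)

-- | Unit-test style: the same property over a few hand-picked inputs.
handPickedCasesPass :: Bool
handPickedCasesPass =
  and
    [ prop_insertKeepsSorted x ys
    | x <- [-1, 0, 3]
    , ys <- [[], [0], [3, 1, 2], [5, 5, 0]]
    ]
```

Stating properties like this in the proposal, even informally, makes it much easier to judge later whether the project does what it set out to do.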

Checkpoint submission

By midnight on Friday, December 5, you must submit your progress toward your final project. This submission must include a README that describes the progress you have made and what we should be looking for in your work. Your submission must compile and run, doing something interesting. Of course, your project will be incomplete at this stage, but what you show us should convince us that it’s likely you will be able to finish on time.

(In rare cases, it may be appropriate for a checkpoint submission not to compile. If you believe this applies to you, you must email me no later than Wednesday, December 3, explaining why you think it’s appropriate that your checkpoint version not compile and/or run.)

Note that the checkpoint submission will constitute part of your final project grade, though there’s no need to stress out about it. In particular, we will not be grading style at the checkpoint. Indeed, we hope not to look at your code here, but instead to run your program and see what progress you’ve made that way.

The checkpoint is also a good opportunity to ask questions about what a good next step would be. Include these questions in your README.

The purpose of the checkpoint is twofold: to make sure you get started on your project before the last minute, and to provide a convenient space for you to ask questions and get some direction and feedback.

Final submission

Final submissions are due by Monday, December 15.

Your final submission should consist of any and all code you have written, along with a document describing your project (a simple text file is fine). The document should contain:

- a description of your project and what you accomplished;
- instructions on how to compile/run/try out/play with your project;
- a description of work you did and things you learned along the way.

Submit your project as a compressed file (.tar.gz, .zip, etc.) through Canvas. If you contributed to an external project, then your submission should contain a specific listing of what, exactly, were your contributions. The code itself can be on, e.g., GitHub – you don’t have to submit a copy.

Grading will be as follows:

Checkpoint (25%). Did you make some progress on your project by the time of the checkpoint submission?
Style (25%). Your project should use good Haskell style and be well-documented.
Correctness (25%). Your project should be free of compilation errors and should correctly accomplish whatever it is supposed to accomplish. This means that if the deadline is looming, your time would be better spent fixing bugs in what you already have than adding one last feature.
Effort/accomplishment (25%). We will be looking for evidence that you put energy and effort (~15-20 hours) into your project and that you have learned something. This is where the document you submit along with your project comes in: be sure to use it to highlight work you did and things you learned, especially if it is not obvious from looking at the final product. For example, if you spent two hours trying an approach that ultimately did not work, you should write about that and what you learned from the experience. However, we will not necessarily look with sympathy on unnecessary work: for example, if you spent five hours trying to track down a bug without asking for help, that’s just plain silly stubbornness. If you are stuck on something, please ask for help. We want you to spend your time making progress on your project, not banging your head against a wall (although a small amount of head-banging can be healthy).

Project demos

You will demonstrate your working project at either of the demo days listed at the top:

Thursday, December 11, 12noon-2pm – Final project demos, Levine Lobby
Thursday, December 18, 2:30pm-4:30pm – Final project demos, Greenberg lounge, 1st floor Skirkanich

You do not need to attend both, though you’re welcome to. A project demo should show off your hard work, show us what’s interesting about your project, and highlight a particularly challenging bit of code. I do not expect PowerPoint slides!

The demo day on Dec. 11 is a combined demo with students from the other CIS19x courses. Lunch will be served. The demos will be presented in a “science fair” format, with folks wandering around from presenter to presenter. Come with a laptop and show off your work. Although this date is before the final deadline, I expect the work you’re showing to be close to complete. If you are not ready in time, wait to demo until the following week! Conversely, you can always demo your work, get feedback from the instructors, and then incorporate that feedback into your submission.

The demo day on Dec. 18 is a more traditional presentation format, with you standing at the front of the room and presenting on a projector. I expect demos to run 5-10 minutes.

The demo is a required part of the final project. Please be in touch now if you cannot make either of these dates.