Stephen's project ideas

This page collects a selection of ideas for projects that might be of interest to Bachelor's, Master's and doctoral students. Many of them could be of interest to research collaborators in general.

Warning: these are almost all challenging and research-oriented project ideas. I certainly don't want to discourage anyone by saying that. But they do require a strong student with the right mix of practical skill, self-management ability and determination. (Students: these are skills you learn by challenging yourself! So if any of them interests you, you are very much encouraged to take on the challenge....)

Despite all being somewhere on the challenging side, they do vary a fair bit in difficulty. They also vary in bakedness (i.e. how much I've thought them through) and in research value. I can't promise they are all totally novel, although there should be some mileage for novel research-worthy approaches in all of them except possibly those marked with an asterisk (*). Note that for a Bachelor's project, novelty is not important.

There are dependencies between some of the projects, such that I envisage somebody attempting one only after another one has been completed. I've highlighted the clearly dependency-free cases by putting them first, although it's not entirely clear-cut—some of the later ones could also be attempted from a fresh start. In any case, you can probably pick out most of the dependencies yourself from reading the titles, and I have tried to put the more dependent titles later in the second list.

Don't hesitate (really!) to ask me about any of them. The titles are just here to give you a flavour. In most cases I have prepared some more detailed notes which are not (yet) online. I will do my best to add these once they become readable enough, but in the meantime I am always happy to send more information by private correspondence.

It would bring you bad karma to take these titles or ideas and work on them without at least contacting me. (I doubt you can grok the idea from just reading the title, but who knows.) Some of these ideas arose from discussions with others, whom I have done my best to acknowledge.

Dependency-free projects

10/2014 | A generic API fuzz-tester | med/high | B+
9/2014 | A linker, mostly functionally | med/high | B+ | REMS
9/2014 | An optimising linker | med | B+ | REMS
9/2014 | System calls: intercepting, specifying, emulating | med | B+ | REMS
9/2014 | A debug-time program slicer | med | B+
9/2014 | Bounds checking using libcrunch | high | B+
9/2014 | A garbage collector using liballocs | med/high | B+
6/2014 | Debugging information for free: observing a simple compiler | med | PhD+
6/2014 | A simple compiler in Prolog (or: “the next 700 intermediate representations”) | med | B+
- | “ptrace() for the 21st century”: parallel dynamic analysis using observer processes | low/med | M+
12/2013 | Economic frame allocation | med/high | B+ | Jeremy Singer
12/2013 | Federated garbage collection | med | B+
- | A tool for inferring file formats by dynamic analysis of programs that use them | med | B+
02/2013 | Framed byte streams: inferring structure back out of Unix I/O | low/med | B+
- | Data description for Unix IPC channels, and using it to reason about multiprocess compositions | low/med | B+
02/2013 | Don't be over-eager: detecting avoidable eager memory commitment in user programs (doing file I/O) | low/med | B+
- | Really descriptive data description: a DDL capturing both textual and binary encoding idioms | med | M+
05/2013 | Copy profiling: understanding how programs copy data structures | low/med | M+ | FAN project
05/2013 | Repetition profiling: finding avoidably repeated work in program executions | low/med | M+ | FAN project
- | “Unanticipated software update”: binary backporting of source-level patches | med | M+
- | Lightweight link-time optimisation | med/high | B+
- | Lazy I/O: a clever execution model for stupid code | low | M+
- | A domain-specific language for debugger-friendly compiler optimisations | low | M+
- | Whole-program symbolic execution without recompiling (*) | med/high | B
- | Input-sensitive symbolic execution | low/med | M+
- | API usage analysis using symbolic traces (a data-dependent form of API usage pattern) | low/med | M+
07/2013 | Selective dynamic recompilation for dynamic analysis (e.g. bounds checking) | low/med | M+
- | DTrace with dynamic compilation: low-overhead probes for JVMs and similar runtimes | med | B+
09/2012 | ChannelScope: a communication-centric tracing tool (for Unix IPC; perhaps intra-process too) | low/med | B+
- | “Compiling for the web”: a radically traditional toolchain for web development (using Emscripten) | med | B+
- | Stable source-code coordinates: a feedback-driven approach | low/med | B+
- | Language-independent APIs: using symbolic traces to understand language interoperability | low/med | M+
- | Language-independent APIs: using model theory to formalise language interoperability | low | M+ | Alan Mycroft
- | Library transitions made a piece of Cake | med/high | M+ | Michael Tautschnig

Projects with as-yet-unfulfilled dependencies

05/2013 | Beyond text: an interactive data editor | low/med | B+
07/2013 | Reactive heaps: an abstraction for programming with file data as objects | med/high | B+
- | “Everything is a dlopen()”: retrofitting greater flexibility onto Unix I/O | low | M+
- | “Solving the zgrep problem”: OS-enabled smart rendezvous between code and data | med | B+
- | Symbolic execution of partial programs using API usage contracts | low/med | M+
- | Scrap your scraper: a declarative approach to coding file I/O | low/med | M+

Content updated at Mon 6 Oct 19:58:00 BST 2014.