Stephen Kell

About Me

Photo by Emilie Selbourne, Cambridge, June 2017. Thanks, Emilie.

I'm a Lecturer (“Assistant Professor” in US English) at King's College London's Department of Informatics, where I teach topics in computer systems and software development, and do research within the Software Systems group.

Although my forename is very common in English-speaking countries, increasingly even English-speakers seem to have trouble pronouncing it. Try the following mouseover-assisted IPA, due to Wikipedia, if you're not sure: stvən/.

For my publications, click here (or keep reading).

I am a practical “computer scientist” who builds things; I'd argue this makes me more engineer than scientist per se. I have wide interests: my goal is to make it easier and cheaper to develop useful, high-quality software systems. So far, my work has mostly focused on programming languages and the systems that support them—including language runtimes and operating systems.

Micro-blog and calendar

Recent blog entries

(all entries)

Research overview

My primary research goal is to make possible increasingly humane, programmable computer systems. Currently, I am pursuing this through a humane evolution of the Unix-like operating system.

If I had to limit myself to only a dozen research areas, they would be these.

Most of my research concerns programming, but I identify as a “systems” researcher. For me, “systems” is a mindset rather than a research topic. It means I am primarily interested in the practical consequences of an idea; its abstract properties are of interest only subserviently. (If you can come up with a better definition, let me know.) I generally hesitate to identify as a “programming languages” researcher, not because I dislike languages but because they're only one piece of the puzzle. I'd identify as a “programming systems” researcher if that were a thing.

You can read more below about what I'm working on. A recurring theme is that I prefer pushing innovations into the software stack at relatively low levels, to the extent that this is sensible (which I believe can be surprising). Another is the pursuit of infrastructure that is well-integrated, observable, simple, flexible, and designed for human beings—particularly if we believe (as I do) that computers are for augmenting the human intellect, not just for high-performance data crunching.

I believe that technology cannot be pursued in isolation. “Good” engineering research is done with an understanding of history, human culture, and that research's place in them. This might mean asking small questions, like the impact of a technical change on a wider software ecosystem. It also means asking bigger questions of the human and social assumptions behind the work: who might use it; who it will benefit; why this is an overall good; and so on. By its answers to such questions, perhaps implicitly, all technical work invariably subscribes to what one could call a “philosophical basis”. I believe that understanding these bases can lead to more valuable technical work. Some of the relevant questions are so big that they cannot be answered by experimental methods. I do not subscribe to the view that to be “scientific” is an unalloyed good, or that the methods of the humanities are faulty. To do good engineering research is to be more than purely a scientist.

Very brief biography

I did my Bachelor's degree in computer science in Cambridge, graduating in 2005. I then stayed on for a year as a Research Assistant, before starting my PhD in 2006.

My PhD work was based in the Networks and Operating Systems group under the supervision of Dr David Greaves, and centred on the problem of building software by composition of ill-matched components, for which I developed the Cake language.

During summer 2007 I took an internship at Fraser Research, doing networking research.

From January 2011 until March 2012 I was a research assistant in the Department of Computer Science at the University of Oxford. There I worked within the research programme of the Oxford Martin School's Institute for the Future of Computing. My work mostly focused on constucting a continuum between state-space methods of program analysis (notably symbolic execution) with syntactic methods (such as type checking). I ended up moving on before I could see this to fruition, but in 2019 I was awarded some funding from VeTSS and it was briefly taken up in the (now aborted) PhD work of Justus Adam (see the paper at TAPAS '20).

From May 2012 to May 2013, I was a postdoctoral researcher at USI's Faculty of Informatics in Lugano, Switzerland, within the FAN project.

From May until October 2013 I was temporarily a Research Assistant at Oracle Labs, on the Alphabet Soup project.

I returned to Cambridge in October 2013, as a Research Associate (later Senior Research Associate) at the Computer Laboratory (now become the Department of Computer Science and Technology). I worked within the REMS project, and participated in both the Programming, Logic and Semantics group and the Systems Research Group (particularly its Networks and Operating Systems subgroup). As of August 2018 I continue to be a visiting researcher at the Department.

In my industrial life, I enjoyed some spells working for Opal Telecom and ARM around my Bachelor's studies. More recently, I have been consulting for Ellexus, and conducted some industrial research work (in reality rather development-heavy) for Oracle Labs.

I maintain (sometimes neglectfully) two CVs: a snappy two-page one here, and a longer academic one here. Please be aware that both may be a little rough or incomplete at times. If you're recruiting for jobs in the financial sector, even technical ones, please don't bother reading my CV or contacting me—I promise I am not interested.

Research

Current work

Here's what I'm working on as of April 2020. In most cases these are my own individual projects; a few are with collaborators, as I describe.

Making the Unix process environment more descriptive and reflective

Unix-like processes are with us to stay, but are impoverished in ways that hold us back. It is too difficult to build robust tools and systems which automate or abstract meta-level programming tasks, such as debugging, profiling, visualisation, instrumentation, composition, error reproduction and dynamic update. In turn, this massively hampers software development generally! In large part, the root problem is that the state of a Unix process is too opaque. Unix lacks adequate meta-level facilities for describing user-supplied abstractions such as data representation, memory layouts, procedural interfaces, file formats, inter-process protocols, and so on. To fix this, rather than designing a whole new system with strong meta-level abstractions, as has been done in Smalltalk and Lisp programming systems, my work chooses instead to evolve the services offered by existing Unix-like systems. Since meta-level tasks are far from uncommon, vestigial features to help carry them out can often be found “lurking” in modern Unix. By extending and evolving these features, my work can (only half-jokingly) be said to be “dragging Unix into the 1980s”. It does so while maximising compatibility, allowing new benefits to be applied to old code, in whatever language—the work spans C and C++ right up to JavaScript, Python and OCaml. One output of this work is liballocs, a run-time library fast whole-process tracking of per-allocation metadata, including detailed type information. Another is libdlbind, which helps make JIT-compiled code visible to debuggers and dynamic loaders. Both of these are building blocks used by my other research efforts (below). Motto: if you must unify, unify at the meta-level of the operating system, not at the base language level or in language implementations.

Unsafe languages made safer and more debuggable

Huge amounts of code have been written, and continue to be written, in unsafe programming languages including C, C++, Objective-C, Fortran and Ada. Performance is one motivator for choosing these languages; continuity with existing code, particularly libraries, is another. The experience of compiling and debugging using these languages is a patchy one, and often creates large time-sinks for the programmer in the form of debugging mysterious crashes or working around tools' limitations. I have developed the libcrunch system for run-time type checking and robust bounds checking. This consists of a runtime library based on liballocs, together with compilers wrappers and other toolchain extensions for instrumenting code and collecting metadata. Slowdowns are typically in the range 0–35% for type checking and 15–100% for bounds checking. The system is also unusually good at supporting use of uninstrumented libraries, recovering from errors, and supporting multiple languages. Its design is a nice illustration of an approach I call toolchain extension: realising new facilities using a combination of source- and binary-level techniques, both pre- and post-compilation, together with the relevant run-time extensions. Critically, it re-uses the stock toolchain as much as possible, and takes care to preserve binary compatibility with code compiled without use of these extensions. The mainline implementation uses the CIL library; during 2014—15, Chris Diamand developed a Clang front-end, with help from David Chisnall. In 2016–17 Richard Tynan developed a version of the runtime for the FreeBSD kernel. Motto: unsafe languages allow mostly-safe implementations; combine compile- and run-time techniques to make this the pragmatic option

Rigorous specification of systems languages and system services

The semantics of programming languages are well studied, but in some cases, notably C and C++, they remain hard to nail down. With Cambridge and ISO WG14 colleagues, I'm involved in an effort to put C as actually practised on a firmer semantic footing. Relatedly, with JKU and Kent colleagues I have worked on understanding the most common ways in which C is effectively extended with inline assembly. Meanwhile, two equally crucial but much less-studied areas are the semantics of linking and (separately) of system calls. The challenges of verified toolchains and verified run-time environments depend critically on accurate specifications for these services. With my colleagues on the REMS project, we're working both on detailed specifications of ELF linking and Linux system call interface. This also means developing tools for experimentally testing and refining these specifications. In summer 2015 Jon French developed tools for specifying system calls in terms of their memory footprints. Motto: a specification for every instruction.

Principled approaches to debugging information

Debugging native code often involves fighting the tools. Programmers must trust to luck that the debugger can walk the stack, print values or call functions; when this fails, they must work with incomplete or apparently impossible stack traces, “value optimized out” messages, and the like. Obtaining debugging insight is even harder in production environments, where unreliable debugging infrastructure risks both downtime and exploits (see Linus Torvalds in this thread, among others). With REMS colleagues, Francesco Zappa Nardelli, and talented students including Théophile Bastian and Simon Ser, we're working on tools and techniques for providing a robust, high-quality debugging experience. This comes in several at its simplest, it means sanity-checking compiler-generated debugging information. At its most complex, it means propagating more information across compiler passes, to enable debugging information that is correct by construction. Debugging is also in tension with compiler optimisation, and is also inherently in need of sandboxing (to limit who can consume what application state). Motto: if it compiles, it should also debug.

Cross-language programming without FFIs

Language barriers are holding back programming. They splinter the ecosystem into duplicated fragments, each repeating work. Foreign function interfaces (FFIs) are the language implementers' attempt at a solution, but do not serve programmers' needs. FFIs must be eliminated! Instead, using a descriptive plane (again, provided by liballocs), we can make higher-level language objects appear directly within the same metamodel that spans the entire Unix process, alongside lower-level ones, which are mapped into each language's “object space”. For example, in my hacked nodeJS (quite a bit of V8 hacking was required too), all native functions visible to the dynamic linker are accessible by name under the existing process object. You're no longer dependent on hand-written C++ wrapper code, nor packages like node-ffi to access these objects. The type information gathered by liballocs enables direct access, out-of-the-box. It's far from a robust system at the moment, but the proof-of-proof-of-concept-concept is encouraging. Motto: FFIs are evil and must be destroyed, but one VM need not (cannot) rule them all.

Alternative approaches to static type- and memory-correctness assurances

An infamous dichotomy has long divided programming and programmers: between statically type-checked programming, and “not that”. Type-correctness is a great property, because it is highly generic (all programs have some latent notion of “type”) and is very nearly always what you want. Meanwhile, syntax-directed compile-time checking is a specific and limited method of establishing that property. Gradual typing helps with refactoring towards a syntactically checkable form, but still banks on syntax-directed checkability as the end point. Dependently typed languages are pushing that envelope from one particular direction: by using increasingly sophisticated, less syntactic compile-time analyses, expanding the range of checkable properties while seeking to preserve soundness. I'm interested in a third, complementary approach: what if we throw away the hard requirement of soundness, and also lose the paradigm of devising new languages, but try to generalise our approach to static checking beyond the syntactic, but retaining the sweet spot that checking remains sound on those programs that are syntactically amenable? This would still allow some (perhaps underapproximate) static checking even on gnarlier programs, such as reflective or downcast-heavy code. This has been taken up by the PhD work of Justus Adam.

Comprehensive, easy-to-use instrumentation on managed platforms

Programmers often want to develop custom tooling to help them profile and debug their applications. On virtual machines like the JVM, this traditionally means messing with low-level APIs such as JVMTI, or at best, frameworks like ASM which let you rewrite the application's bytecode. In Lugano, work done together with my DAG colleagues proposed a new programming model for these tasks based on aspect-oriented primitives (join points and advice), and also a yet higher-level model based on an event processing abstraction reminiscent of publish-subscribe middleware. On the run-time side, our infrastructure addresses the critical problem of isolation: traditionally, instrumentation and application share a virtual machine, and may interfere with one another in ways that are difficult to predict and impossible, in general, to avoid except by arbitrarily excluding shared classes (such as java.lang.* and the like). Our work gets around this by avoiding doing any bytecode instrumentation within the VM—instead at any instrumentation point we insert only a straight-to-native code path that hands off run-time events to a separate process (the “shadow VM”) which performs the analysis. This allows much stronger coverage properties than are possible with in-VM bytecode instrumentation, and nicely handles distributed applications—but also brings challenges with event synchronisation and ordering. Motto: robust, high-coverage without bytecode munging or hacky exclusion lists.

Radically easier composition

Programming by composition is the norm, but the norm also imposes needlessly high costs on developers. Keeping up with APIs as they evolve is a familiar drain on programmers' time. Inability to migrate to a different underlying library or platform is a familiar inhibitor of progress. It doesn't need to be this way. Why isn't software soft, malleable and composable? Why is so much tedious work left to the human programmer? My PhD work explored an alternative approach in which software is composed out of units which aren't expected to match precisely. A separate bunch of languages and tools, collectively called an “integration domain”, are provided to tackle the details of resolving this mismatch. I designed the language Cake to provide a rule-based mostly-declarative notation for API adaptation. The Cake compiler accepts a bunch of rules relating calls and data structure elements between APIs, including in various context-sensitive fashions. The Cake compiler then generates (nasty C++) wrapper code, saving the programmer from writing wrappers by hand. Motto: down with manual glue coding, down with silos!

What is that?

A recurring theme of my work is that I like to ask big questions, including questions of the form “what is X?” (for various X). The X so far have been the following.

Published peer-reviewed articles

A note about paywalls: those unfamiliar with research publishing may not be aware that authors see none of the proceeds from paywalled content. I have no interest, financial or otherwise, in keeping my work behind paywalls. If something you'd like to read isn't available, this is either from my mistake or neglect; please contact me.


See also my author page on DBLP (with much credit to Michael Ley and others for this great bibliographic resource).

Peer-reviewed abstracts, tool demonstrations and similar

Manuscripts, reports, invited papers, dissertations etc.

Talks

Grants

I'm currently an investigator on the following research grants.

  • August 2020 – July 2023: CapC: Capability C semantics, tools and reasoning. EPSRC, ref EP/V000470/1, £485k (PI: Batty)
  • I was previously an investigator on the following grants.

    I believe that funding policy in the UK (and elsewhere) is on the wrong track, and increasingly so. We should fund people, not projects. In universities, that means mostly funding institutions via a block grant, with corresonding reduction of the funds allocated to individual investigators' short-term projects.

    Interests

    My research interests are outlined at the top of this page. In summary: although broad, they fall mostly within the intersections of systems, programming languages and software engineering.

    PhD work

    For my PhD I worked on supporting software adaptation at the level of the operating system's linking and loading mechanisms. Here “adaptation” means techniques for connecting and combining software components which were developed independently, and therefore do not have matched interfaces. My emphasis was on techniques which are practical, adoptable and work with a wide variety of target code, at the (hopefully temporary) expense of safety and automation.

    My dissertation is now available as a technical report.

    Students

    I am currently not supervising any PhD students.

    I previously supervised the following students

    I am actively seeking research students who wish to do a PhD under my supervision. If this interests you, please contact me.

    Funding is available, by competition, through King's College London, and possibly from other sources. Please contact me (regardless of the time of year!).

    I frequently supervise research-flavoured projects conducted by Bachelor's and Master's students. These are in the Teaching section.

    I also maintain a list of project ideas for students at all levels, and am always happy to talk about other ideas.

    History

    During 2005–06 I was a Research Assistant in Cambridge on the XenSE and Open Trusted Computing projects, under Steven Hand. Both projects seek to implement a practical secure computing platform, using virtualisation (and similar technologies) as the isolation mechanism.

    XenSE never had a web page of its own, but you might want to look at the abstract on the project's EPSRC Grant Portfolio page, or check out the mailing list.

    OpenTC is a large EU-funded project involving many major industrial and academic partners, focused on the use of Trusted Computing Group technology to realise many common secure computing use cases.

    As part of my work as an RA, I became interested in secure graphical user interfaces including L4's Nitpicker, a minimal secure windowing system. I began work on ports of this system to Linux, XenoLinux and the Xen mini-OS: the Linux version became mostly functional (but not yet secure!) while the others were stymied by various limitations with shared memory on Xen. These limitations are mostly fixed now, but I haven't had time to revisit the project since. Feel free to contact me for more information. If you wanted to take these up, I'd be glad to hear from you.

    See also my attic page.

    Service and appointments

    In research

    Currently I'm the Programme Chair for Onward! Papers 2020, on the External Review Committee for PLDI 2020, on the PC for ECOOP 2020, on the PC for <Programming> 2020, and on several workshop PCs.

    In the past few years I have been on the PC for <Programming> 2019, on the External Review Committee for OOPSLA 2019, on the PC for ISMM 2019, on the PC for the PLOS workshop in 2019, and on the PC for Salon des Refusés at <Programming> 2019. a co-organiser of the 2018 VMIL workshop, on the PC for SPLASH workshops (2016), a co-organiser of Salon des Refusés workshop, and a reviewer for ACM TACO and Software: Practice and Experience.

    Before that, on the PC for Onward! Essays (2015) and contributor to a CCC/CRA visioning workshop reporting on funding priorities for topics in accessible computing. Previously I was an external reviewer for the ASE journal, on the programme committee for PPPJ 2013, publicity chair for SC 2013, on the programme committee for RESoLVE 2012 at ASPLOS, and the shadow programme committee for Eurosys 2012.

    Even longer ago, I was privileged to contribute external reviews to SVT at SAC 2012, ESOP 2010 and EuroSys 2009.

    I'm a member of the ACM, ACM SIGPLAN and SIGOPS.

    At the University of Kent

    I continue to be an Associate Member of the Kent Interdisciplinary Research Centre in Cyber Security (iCCS), in connection with the security applications of my work on topics of run-time safety, program analysis and formal specification.

    In Cambridge

    Since August 2018, when I moved from Cambridge to Kent, I've remained a (sometimes-) visiting researcher at the University of Cambridge's Computer Laboratory (now, more properly, the Department of Computer Science and Technology).

    From December 2015 until September 2017 I served on the University of Cambridge's Board of Scrutiny, which has an oversight role in University governance. During the 2016–17 year I also acted as the Board's Secretary.

    At the Computer Laboratory I served on, and (from June 2017 until July 2018) chaired the “post-doc forum”, later renamed Research Staff Forum, representing research staff.

    From May 2015 until July 2018, I was a postdoctoral affiliate at Christ's College (of which I'm also an alumnus, from my BA and PhD days).

    From summer 2015 until summer 2018 I was an elected officer of the Postdocs of Cambridge Society (PdOC) which, in addition to providing networking and social events within the cross-disciplinary postdoctoral community, represents postdoctoral staff to various University bodies.

    Since May 2008 I've been a Fellow of the Cambridge Philosophical Society.

    Teaching

    Lecturing

    I was recently a lecturer on the following modules at the University of Kent.

    Previously at Kent, I lectured on the following discontinued modules.

    Projects

    I'm interested in supervising bright and enthusiastic Bachelor's and Master's students for their individual projects. For ideas and to find out what I'm interested in, see my list of suggested projects. I'm also always extremely happy to talk to students who have their own ideas.

    I've supervised many projects in the past. I was fortunate to work with the following final-year research project (CO620) student in Kent:

    Although I'm no longer in Cambridge, Bachelor's students there can still read my thoughts about Part II projects. I supervised the following Part II projects in Cambridge:

    Older teaching material (including various Cambridge-specific things) can be found on my attic page.

    Software

    My projects

    Note: as of late 2011, I have started using GitHub. Over time, code for all my larger projects will start to appear on my GitHub page. This page is likely to remain the more complete list, for the time being.

    My research work involves building software. Inevitably, this software is never “finished”, rarely reaches release-worthy state, and usually rots quickly even then. So not much here is downloadable yet; this will improve over time, but in the meantime I list everything and invite you to get in touch if anything sounds interesting. My main projects, past and present, are:

    Smaller contributions

    I've also submitted small patches to various open-source projects including LLVM (bugfix to bitcode linker), binutils (objcopy extension), gcc (documentation fix for gcj), Knit (compile fixes), Datascript (bug fixes), DICE (bug fixes), pdfjam (support “page template” option) and Claws Mail (support cookies file in RSS plugin). Some of them have even been applied. :-)

    Goodies

    Apart from my main development projects, I sometimes produce scripts and other odds and ends which might be useful to other people. Where time permits, I try to package them half-decently and make them available here.

    For computer science researchers

    I have written some scripts which attempt to retrieve decent BibTeX for a given paper (as a PDF or PostScript file). details

    For researchers in a research institution

    I've written a nifty script for printing papers, which helps people save paper, share printed-out papers and discover perhaps unexpected collaborators within their institution. details

    For supervisors of Tripos courses (in Cambridge)

    I have a Makefile which downloads and compiles Tripos past-paper questions. It's pretty much self-documenting. Here it is.

    For Computer Laboratory historians

    I compiled a list of the University Officers of the Mathematical Laboratory or Computer Laboratory over the period 1937–2000, using the data from ACAD.

    For general audiences

    I have built a sizable collection of vaguely useful shell scripts and other small fragments. One day “soon” I will get round to publishing them. The biggest chunks are my login scripts, which use Subversion to share a versioned repository of config files across all the Unix boxes that I have an account on, and the makefile and m4 templates that build this web page. I need to clean these up a bit. In the meantime, if you're interested in getting hold of them, let me know.

    Thoughts

    Occasionally I write down some thoughts which somehow relate to my work. They now appear in the form of a blog.Posts are categorised into research, and teaching, development, publishing, higher education and meta strands. There's also an index of all entries.

    Personal

    I have the beginnings of a personal web page. It's very sparse right now. Have a look, if you like.

    Contact

    Officetbd
    E-mailsrkell@acm.org
    PostDr Stephen Kell
    Department of Informatics
    King's College London
    Bush House, 30 Aldwych
    London, WC2B 4BG
    United Kingdom

    Content updated at Thu 5 Aug 23:27:00 BST 2021.
    validate this page