Entitled ``Software Components: Only The Giants Survive'', his essay argues that component-based software development is doomed to failure. Starting from McIlroy's initial proposal of a software component library [2], he cites three reasons why it ``won't work''. Each contains a worthwhile point, but is argued somewhat obtusely, and the arguments in no way justify the conclusions. I'll now respond to each of his major points in turn.
Firstly, he claims that there's ``no business model'' for components. Perhaps owing to a latent assumption that components only ever arise as by-products from in-house development work, what he actually argues is something quite different: that the particular development model he envisages wouldn't be economical.
It's true: highly general, out-of-the-box re-usable components are more expensive to create than ``good enough'' versions, and small components are difficult to market. But Lampson's argument is predicated on the assumption that components are developed and delivered in a very particular way: as unmodifiable black boxes in a unidirectional provider-consumer fashion. As a result, they can't be developed collaboratively or incrementally (hence the ``generality'' problem). Similarly, consumers can't read the source code when documentation fails them (``documentation''), or fix bugs themselves (``testing'').
So we agree that this development model does not make a good one for component-based development - but it's not the only way. Take a look at the open-source world: bugs are plentiful and documentation is often scant, but communication between developers is extensive in each direction, and source code is fully available. Under this development model, most of Lampson's stated problems go away. Perhaps unwittingly, he's done a good job of furthering the existing arguments (such as those of Raymond [3] and, more recently, Spinellis and Szyperski [4]) as to why closed-source non-collaborative methods are not such a good way to develop software in the first place.
Note that components don't require all the liberty of open source: even a simple Microsoft-style ``shared source'' license, together with close collaboration between a component's consumers and developers, would avoid most of the problems Lampson identifies. We can therefore sidestep awkward arguments about funding, because component developers can still charge their customers directly. Crucially, this development model embraces ``opportunistic reuse'': if a partial solution is available, use it - and ideally contribute any enhancements and fixes back upstream.
Interestingly, open-source software still suffers from surprisingly high levels of code duplication, either by reimplementation or, better but still undesirable, by source-level copy-and-paste. That phenomenon is mostly outside the scope of this piece, but much of the work I cite later has direct application to the problems which underlie it.
I haven't yet covered a few of the properties which Lampson claims make components too expensive to develop: simplicity, customisability and stability. The last of these is a familiar problem: inevitably, some kinds of innovation and improvement require interface changes. Any sensible component author avoids making unnecessary interface changes. When changes do happen, the approach is simple: inform your clients of ongoing changes, consult them about future changes, and keep the old version available. If neither you nor they want to maintain support for the old interface, then they can stick with the old version of the component. Alternatively, there is research into tools which can automatically glue plug-incompatible interfaces together by generating adapters [8,28].
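To make the adapter idea concrete, here is a minimal hand-written sketch of the kind of glue such tools aim to generate. Everything in it is hypothetical and chosen only for illustration: StoreV1 stands for a component's old interface, StoreV2 for a revised one with renamed methods, and V1OnV2Adapter lets an unmodified old client run against the new version. It is not taken from the cited work.

```python
class StoreV1:
    """Hypothetical old interface: put(key, value) and get(key)."""

    def __init__(self):
        self._data = {}

    def put(self, key, value):
        self._data[key] = value

    def get(self, key):
        return self._data[key]  # raises KeyError if the key is absent


class StoreV2:
    """Hypothetical new interface: renamed methods, default on lookup."""

    def __init__(self):
        self._data = {}

    def store(self, key, value):
        self._data[key] = value

    def fetch(self, key, default=None):
        return self._data.get(key, default)


class V1OnV2Adapter:
    """Presents the old V1 interface on top of a V2 component."""

    _MISSING = object()  # sentinel, so that a stored None is not mistaken for absence

    def __init__(self, backend):
        self._backend = backend

    def put(self, key, value):
        self._backend.store(key, value)

    def get(self, key):
        value = self._backend.fetch(key, self._MISSING)
        if value is self._MISSING:
            raise KeyError(key)  # preserve V1's behaviour on a missing key
        return value


def legacy_client(store):
    """A client written against V1; it knows nothing about V2."""
    store.put("answer", 42)
    return store.get("answer")


# The same client code runs against either version.
result_old = legacy_client(StoreV1())
result_new = legacy_client(V1OnV2Adapter(StoreV2()))
```

The adapter has to preserve behaviour as well as method names: here the old interface signals a missing key with an exception, so the adapter must reinstate that on top of the new default-returning lookup.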
Of course, that's only half of the stability problem: subtle and unwitting interface breakages can be caused by either client or provider making unstated assumptions which are subsequently broken by the other party, without any compilation error or other obvious message as a flag. Stricter and more precise interface specification is the obvious approach, and Lampson himself later mentions the need for this (``specification with teeth'') - without acknowledging its obvious application to components. Better specifications would also help with the ``simplicity'' requirement, since they naturally aid comprehension.
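As a small illustration of an unstated assumption being turned into a stated and enforced one, consider the following sketch. The function name contains and its sortedness precondition are my own hypothetical example. A run-time check is only the crudest form of ``teeth'', and the linear-time check here defeats the purpose of a binary search; a realistic toolchain would aim to discharge such a condition statically, or check it only in debug builds.

```python
import bisect


def contains(sorted_items, x):
    """Return True if x occurs in sorted_items.

    Precondition (stated and checked): sorted_items is a list or tuple
    in ascending order.
    """
    # Enforce the assumption rather than leaving it implicit.
    if any(a > b for a, b in zip(sorted_items, sorted_items[1:])):
        raise ValueError("precondition violated: input is not sorted")
    i = bisect.bisect_left(sorted_items, x)
    return i < len(sorted_items) and sorted_items[i] == x
```

Without the check, a call such as contains([3, 1], 1) would quietly report that 1 is absent, which is exactly the kind of breakage that produces no compilation error or other obvious flag. With it, the broken assumption is reported at the interface where it was violated.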
Finally, his take on customisability is a particularly odd one: it's clear that a reusable component should be carefully parameterised, but it seems to me that the need for programmability is far rarer than he implies. We don't need a programmable hash table, a programmable HTML parser, a programmable MPEG decoder or a programmable gzip. Perhaps this insistence on programmability foreshadows his focus on very large ``components'', to which I'll return later.
Lampson's second stated problem is a converse of the first: components are difficult for a client to understand, and therefore expensive (often unpredictably so) to consume. Again, this is certainly a problem, but not an insurmountable one, and again, specifications are key.
Of course, most software currently isn't specified very well, let alone using a toolchain ``with teeth''. It's a fair point that uncertainty over the hidden costs associated with the specification's failings might disincline developers from reusing a foreign component. That's why people are working on better specification methods. Even better, this work is taken very seriously by a number of high-profile tool vendors - not least Microsoft, who fund a great deal of relevant research. As a result, practical advances are continually making their way into compilers and other tools that people really use. Here, off the top of my head, are a few of the relevant research areas, with sample references for each. (I can't claim to be an expert in most of the areas, so many of the references are not the most up-to-date examples.)
As more of this technology makes its way into popular tools and languages, we can expect to see continuing improvements in software, and increased adoption of component-based techniques.
The third and final of Lampson's arguments is that of ``conflicting world views''. Many aspects of ``world view'' are simple environmental dependencies such as conventions for data encoding, memory management, exception handling and so on. These are essentially language-level features, although lower-level languages (such as C) may permit many simultaneous different approaches to each.
Whether inter- or intra-language, this kind of interoperability is rarely impossible, since most languages provide some kind of escape hatch (such as Java's JNI [18] and Haskell's FFI [19]). What varies is how much glue code it is necessary to write, and how much this reduces the maintainability of the program. Both of these concerns are subjects of research. Aspect-oriented programming [20] can improve maintainability in the intra-language case. Various autogeneration tools [21], virtual machines [5,22] and language extensions [23] try to minimise the need for hand-written glue code in the inter-language case.
It's not always the language-level features which cause conflicts. Harder are the problems summarised as ``architectural mismatch'', after the paper by Garlan et al. [24]. Any component's interface makes certain assumptions about the environment within which the component was designed to be instantiated; traditionally these aren't stated explicitly in the interface specification. Such assumptions might involve the threading model, the presence or absence of other components, the behaviour or performance of component connection mechanisms, the order of component initialisation, and other factors.
This sounds like a hard problem, but the counter-strategies are simple and familiar. Again, they centre on improving the expressivity and checkability of specifications. When writing a component, a developer should assume as little as possible about their clients and dependencies. Anything they do assume must be declared explicitly. This is not common practice in most software development today. Making it convenient to specify components in this way is a central goal of component research, as evidenced by the work cited in Section 3.
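As a rough sketch of what explicit declaration might look like, the following toy checker records two of the kinds of assumption mentioned above (which other components must be present, and the threading model) and reports mismatches before anything is wired together. The Spec record, its field names and check_composition are all hypothetical, invented for this illustration; real specification languages are far richer.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Spec:
    """Hypothetical declaration of what a component offers and assumes."""
    name: str
    provides: frozenset = frozenset()
    requires: frozenset = frozenset()
    thread_safe: bool = False


def check_composition(specs, multithreaded):
    """Return a list of human-readable mismatches; empty means compatible."""
    provided = set()
    for s in specs:
        provided |= s.provides

    problems = []
    for s in specs:
        # Assumption 1: every required service is provided by some component.
        for missing in sorted(s.requires - provided):
            problems.append(f"{s.name} requires '{missing}', which nothing provides")
        # Assumption 2: the component tolerates the host's threading model.
        if multithreaded and not s.thread_safe:
            problems.append(f"{s.name} is not declared thread-safe")
    return problems


lexer = Spec("lexer", provides=frozenset({"tokens"}), thread_safe=True)
parser = Spec("parser", provides=frozenset({"ast"}), requires=frozenset({"tokens"}))

ok = check_composition([lexer, parser], multithreaded=False)
mismatch = check_composition([lexer, parser], multithreaded=True)
```

The point is not the checker itself, which is trivial, but that the mismatch surfaces at composition time as a readable message rather than later as a mysterious run-time failure.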
The complementary step is to make it convenient to compose components which have individually been specified in this way. Separation between computational and compositional code, with specially-designed languages and tools for creating the latter, is a well-known approach. Here are some examples of relevant work:
A final point of Lampson's is that ``giant components'' such as popular web browsers, databases, operating system kernels and virtual machines, are a success. What does ``success'' mean here? For that matter, what have we been meaning by ``component'' all this time?
Looking not only at McIlroy's initial proposal but at the entire canon of ensuing research, it seems clear that ``success'' with ``components'' is something observed in the development work needed to realise a new project. We should be able to construct new software by re-using pre-existing implementation unchanged, wherever any is available which fulfils some intrinsic requirements of the new project. The converse property should also be true: if any new implementation is necessary, it should be possible later to re-use that in any project for which it is suitable. Components are the units of re-use, and relative success is the extent to which these properties hold.
Lampson hails these giant components as a success because they're widely re-used. This is certainly true: we cannot deny their usefulness. It's also clear, however, that they're not sufficient, because they're too general. The construction of any application will require lots of additional code (or ``customisation'' as Lampson might call it), in addition to these giant components. Much of this code is not inherently application-specific, so we still face the problem of how to re-use this code wherever possible.
Of course, Lampson could be right that the component approach can never solve this latter problem. Hopefully my earlier arguments have done something to counter that suggestion - and if there are uncertainties remaining, there are surely enough promising ideas emerging to justify continued research.
It's worth skipping back to Lampson's early comments on Unix tools, which he cites as a rare successful example of small components. They are successful, he claims, because they ``have a very simple interface and ... most of them were written by a single tightly-knit group''. But he chooses not to explore the obvious follow-up questions: by what criteria are they a success, and what makes other components so different that they fail by these criteria?
Just like the ``giant components'' (see Section 5 above), Unix tools are a ``success'' because they're widely deployed and widely used. Since they were written by a single tightly-knit group, they compose easily with each other. But again, they're sufficiently general-purpose that they're only useful when combined not only with each other, but with some code or data of your own. This is where the difficulties start: most applications are concerned with entities more complex than character streams. Although it's easy for experienced users to forget it, Unix tools are anything but easy to use.
The supposedly advantageous ``simple interface'' is actually rather too simple, because it's oblivious to the internal structure of data. This shortcoming will be familiar to anyone who's tried to process some moderately-structured data through a Unix pipeline without it being corrupted by countless interacting text-substitution and escaping conventions. Users must also contend with a bewildering variety of command-line options and formatting conventions, with less consistency than you might expect. It takes considerable smarts and experience to master these intricacies. Unsurprisingly, 99% of computer users will never touch Unix tools, and engineers with Unix experience can demand significantly higher rates of pay.
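A tiny example shows the kind of corruption I mean. The record below is made up for illustration; the first approach mimics what a field-splitting tool such as cut -d, -f2 effectively does with it, and the second uses a parser that knows the quoting convention.

```python
import csv
import io

# One comma-separated record whose second field itself contains a comma.
line = 'Smith,"Widgets, Inc.",42\n'

# A structure-oblivious split on the delimiter.
naive_fields = line.rstrip("\n").split(",")

# A parser that understands the quoting convention.
parsed_fields = next(csv.reader(io.StringIO(line)))

print(len(naive_fields), naive_fields[1])    # 4 "Widgets
print(len(parsed_fields), parsed_fields[1])  # 3 Widgets, Inc.
```

The naive split yields four fields instead of three and truncates the company name, and nothing in the pipeline notices. Every tool downstream has to know, and agree on, the escaping convention in use.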
In this light, Unix tools don't seem like such a success any more. Why, then, haven't they been overtaken by something better? The answer can only be the continued popularity of Unix itself, which in turn is better explained by social phenomena than by any technical study. By contrast, other components cannot hope to piggy-back on Unix's established market position and giant install base, nor indeed to be quite so general-purpose, so their ``success'' is unlikely to rank highly on Lampson's scale. Fortunately, as we've already discussed, this is not the only way to evaluate success.
To give credit where it's due, Lampson's comments on the nature of software efficiency are undoubtedly true. We sacrifice raw mechanical efficiency for human efficiency, i.e. delivering results quickly using little manpower.
On the other hand, there's possibly an interesting discrepancy between actual and apparent costs here. If you worked out the cost of optimising away 80% of the bloat and computational wastefulness of the world's most-used software, and compared this to the costs in electricity and new hardware that these inefficiencies incur every year, there might well suddenly be a case for carefully optimising these few programs. Alternatively, there might not. It'd be interesting to read more on this.
Lampson mentions ``declarative programming'' as a technique which could be researched in favour of component technology. His definition seems roughly to equate to higher-level and more domain-specific languages.
Undoubtedly these are helpful ideas, but they've also been known for a long time - since far earlier, in fact, than the start of serious research into software re-use. They haven't solved all the problems yet. In application, they are inherently less universal than the idea of components. For example, there's a limit to how domain-specific a language can be before diminishing returns set in, making it no longer worth the effort of implementing an interpreter. For the software that's left over, for the interpreters themselves, and for the domain-specific code itself, component re-use is a worthwhile goal.
Perhaps in a few situations, reimplementation will prove inherently easier than solving the difficulties of re-use - but only when we've tackled the re-use problem in all other situations will that be a case for giving up. We're still a very long way from that point.
Dominating the tone of Lampson's entire argument is a pessimism which seems almost to overlook the very nature of research. We do research to solve unsolved problems. To convince anyone that something ``won't work'' takes far more than a summary of the deficiencies among existing practices. His argument is quite a plausible explanation as to why certain sectors of the software industry haven't had more success with components. But the conclusion that research into component technology should be abandoned is not remotely justified.
The translation was initiated by Stephen Kell on 2024-04-03