A Response to Lampson's Essay on Software Components

Stephen Kell

Stephen.Kell@cl.cam.ac.uk

Introduction

I was recently fortunate enough to chat briefly with Andrew Birrell and Michael Isard, of Microsoft Research Silicon Valley, while they were visiting the Computer Laboratory here in Cambridge. I told them a little about my interests in component-based software development, and, playing Devil's advocate, they pointed me towards Butler Lampson's essay on the subject [1]. It's an interesting read.

Entitled ``Software Components: Only The Giants Survive'', his essay argues that component-based software development is doomed to failure. Starting from McIlroy's initial proposal of a software component library [2], he cites three reasons why it ``won't work''. Each contains a worthwhile point, but is argued somewhat obtusely, and the arguments in no way justify the conclusions. I'll now respond to each of his major points in turn.


Business model? Development model

Firstly, he claims that there's ``no business model'' for components. Perhaps owing to a latent assumption that components only ever arise as by-products from in-house development work, what he actually argues is something quite different: that the particular development model he envisages wouldn't be economical.


How not to develop components

It's true: highly general, out-of-the-box re-usable components are more expensive to create than a ``good enough'' version, and small components are difficult to market. But Lampson's argument is predicated on the assumption that components are developed and delivered in a very particular way: as unmodifiable black-boxes in a unidirectional provider-consumer fashion. As a result, they can't be developed collaboratively or incrementally (hence the ``generality'' problem). Similarly, consumers can't read the source code when documentation fails them (``documentation''), or fix bugs themselves (``testing'').

So we agree that this development model is not a good one for component-based development - but it's not the only one. Take a look at the open-source world: bugs are plentiful and documentation is often scant, but communication between developers is extensive in each direction, and source code is fully available. Under this development model, most of Lampson's stated problems go away. Perhaps unwittingly, he's done a good job of furthering the existing arguments (such as those of Raymond [3] and, more recently, Spinellis and Szyperski [4]) as to why closed-source non-collaborative methods are not such a good way to develop software in the first place.

Note that components don't require all the liberty of open source: even a simple Microsoft-style ``shared source'' license, together with close collaboration between a component's consumers and developers, would avoid most of the problems Lampson identifies. We can therefore sidestep awkward arguments about funding, because component developers can still charge their customers directly. Crucially, this development model embraces ``opportunistic reuse'': if a partial solution is available, use it - and ideally contribute any enhancements and fixes back upstream.

Interestingly, open-source software still suffers from surprisingly high levels of code duplication, whether by reimplementation or, better but still undesirable, by source-level copy-and-paste. That phenomenon is mostly outside the scope of this piece, but much of the work I cite later has direct application to the problems which underlie it.


What else needs fixing?

A few of the properties which Lampson claims make components too expensive to develop remain to be covered: simplicity, customisability and stability. The last of these is a familiar problem: inevitably, some kinds of innovation and improvement require interface changes. Any sensible component author avoids making unnecessary interface changes. When changes do happen, the approach is simple: inform your clients of ongoing changes, consult them about future changes, and keep the old version available. If neither you nor they want to maintain support for the old interface, clients can simply stick with the old version of the component. Alternatively, there is research into tools which can automatically glue plug-incompatible interfaces together by generating adapters [8,28].
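To make the adapter idea concrete, here is a minimal hand-written sketch in Java; the cited tools [8,28] aim to generate this kind of code automatically, and all the interface names here are hypothetical.

    // Hypothetical old interface that existing clients compile against.
    interface Logger {
        void log(String message);
    }

    // Hypothetical replacement component with a richer,
    // plug-incompatible interface.
    class StructuredLogger {
        enum Level { INFO, WARN, ERROR }
        void log(Level level, String message) {
            System.out.println("[" + level + "] " + message);
        }
    }

    // The adapter keeps the old interface alive by delegating to the
    // new component, fixing the extra parameter to a sensible default.
    class LoggerAdapter implements Logger {
        private final StructuredLogger delegate = new StructuredLogger();
        public void log(String message) {
            delegate.log(StructuredLogger.Level.INFO, message);
        }
    }

Old clients continue to work unmodified against Logger, while new clients migrate to StructuredLogger at their own pace.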

Of course, that's only half of the stability problem: subtle and unwitting interface breakages can be caused by either client or provider making unstated assumptions which are subsequently broken by the other party, without any compilation error or other obvious message as a flag. Stricter and more precise interface specification is the obvious approach, and Lampson himself later mentions the need for this (``specification with teeth'') - without acknowledging its obvious application to components. Better specifications would also help with the ``simplicity'' requirement, since they naturally aid comprehension.
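To illustrate what a specification ``with teeth'' might look like at the interface level, here is a small Java interface annotated in the style of JML, the notation family behind checkers like ESC/Java [13]; the interface itself is hypothetical.

    // A hypothetical component interface with machine-checkable contracts.
    interface BoundedBuffer {
        /*@ pure @*/ int size();
        /*@ pure @*/ int capacity();

        // The precondition makes the client's obligation explicit,
        // rather than leaving "don't overfill the buffer" as an
        // unstated assumption in the documentation.
        //@ requires size() < capacity();
        //@ ensures size() == \old(size()) + 1;
        void put(Object item);
    }

A checker can now flag a client which calls put without establishing the precondition, turning a subtle interface breakage into a static error rather than a latent bug.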

Finally, his take on customisability is a particularly odd one: it's clear that a reusable component should be carefully parameterised, but it seems to me that the need for programmability is far rarer than he implies. We don't need a programmable hash table, a programmable HTML parser, a programmable MPEG decoder or a programmable gzip. Perhaps this insistence on programmability foreshadows his focus on very large ``components'', to which I'll return later.
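As a small concrete example, Java's built-in deflate compressor is parameterised rather than programmable: a compression level and a strategy flag cover the realistic use cases, with no hook for arbitrary client code.

    import java.util.zip.Deflater;

    public class ParameterisedCompression {
        public static void main(String[] args) {
            // Parameterisation, not programmability: well-chosen knobs,
            // not an embedded language.
            Deflater deflater = new Deflater(Deflater.BEST_COMPRESSION);
            deflater.setStrategy(Deflater.FILTERED);

            byte[] input = "some moderately repetitive input text".getBytes();
            deflater.setInput(input);
            deflater.finish();

            byte[] output = new byte[input.length * 2 + 64];
            int n = deflater.deflate(output);
            deflater.end();
            System.out.println("Compressed " + input.length + " bytes to " + n);
        }
    }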


Understanding requires specification

Lampson's second stated problem is a converse of the first: components are difficult for a client to understand, and therefore expensive (often unpredictably so) to consume. Again, this is certainly a problem, but not an insurmountable one, and again, specifications are key.

Of course, most software currently isn't specified very well, let alone using a toolchain ``with teeth''. It's a fair point that uncertainty over the hidden costs associated with a specification's failings might disincline developers from reusing a foreign component. That's why people are working on better specification methods. Even better, this work is taken very seriously by a number of high-profile tool vendors - not least Microsoft, who fund a great deal of relevant research. As a result, practical advances are continually making their way into compilers and other tools that people really use. Off the top of my head, relevant research areas include: precisely defined programming languages [5,6,10]; protocol and behavioural interface specification [7,12]; extended static checking and contract verification [13,14]; component metadata [9]; versioning of module interfaces [11]; domain-specific modelling [15]; tools for program comprehension [16]; and specification-based browsing of component libraries [17]. (I can't claim to be an expert in most of these areas, so many of the references are not the most up-to-date examples.)

As more of this technology makes its way into popular tools and languages, we can expect to see continuing improvements in software, and increased adoption of component-based techniques.


World views? Languages and architectures

The third and final of Lampson's arguments is that of ``conflicting world views''. Many aspects of ``world view'' are simple environmental dependencies such as conventions for data encoding, memory management, exception handling and so on. These are essentially language-level features, although lower-level languages (such as C) may permit many different approaches to each to coexist.

Whether inter- or intra-language, this kind of interoperability is rarely impossible, since most languages provide some kind of escape hatch (such as Java's JNI [18] and Haskell's FFI [19]). What varies is how much glue code it is necessary to write, and how much this reduces the maintainability of the program. Both of these concerns are subjects of research. Aspect-oriented programming [20] can improve maintainability in the intra-language case. Various autogeneration tools [21], virtual machines [5,22] and language extensions [23] try to minimise the need for hand-written glue code in the inter-language case.
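To recall what such an escape hatch looks like, here is the Java half of a minimal JNI [18] binding; the library name and native function are hypothetical, and generators like SWIG [21] exist precisely to produce the corresponding C glue automatically.

    public class NativeChecksum {
        // Declared in Java, implemented in C: the JVM resolves this
        // symbol in the named native library at run time.
        public static native int crc32(byte[] data);

        static {
            // Hypothetical library: libchecksum.so or checksum.dll.
            System.loadLibrary("checksum");
        }

        public static void main(String[] args) {
            System.out.println(Integer.toHexString(crc32("hello".getBytes())));
        }
    }

The C side must export a function named Java_NativeChecksum_crc32 with the JNI calling convention - exactly the kind of mechanical glue that tools can generate.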

It's not always the language-level features which cause conflicts. Harder are the problems summarised as ``architectural mismatch'', after the paper by Garlan et al [24]. Any component's interface makes certain assumptions about the environment within which the component was designed to be instantiated; traditionally these aren't stated explicitly in the interface specification. Such assumptions might involve the threading model, the presence or absence of other components, the behaviour or performance of component connection mechanisms, the order of component initialisation, and other factors.

This sounds like a hard problem, but the counter-strategies are simple and familiar. Again, they centre on improving the expressivity and checkability of specifications. When writing a component, a developer should assume as little as possible about their clients and dependencies. Anything they do assume must be declared explicitly. This is not common practice in most software development today. Making it convenient to specify components in this way is a central goal of component research, as evidenced by the work cited in Section 3.
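A minimal Java sketch of this discipline: the component names each environmental assumption as an explicit dependency in its constructor, rather than reaching for ambient globals. All the names here are hypothetical.

    import java.util.concurrent.Executor;

    // Every environmental dependency is declared, not assumed: where
    // time comes from, where output goes, which threading model applies.
    class ReportGenerator {
        interface Clock { long nowMillis(); }
        interface Store { void save(String name, byte[] contents); }

        private final Clock clock;
        private final Store store;
        private final Executor executor;

        ReportGenerator(Clock clock, Store store, Executor executor) {
            this.clock = clock;
            this.store = store;
            this.executor = executor;
        }

        void generate(String name) {
            executor.execute(() -> store.save(name,
                ("generated at " + clock.nowMillis()).getBytes()));
        }
    }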

The complementary step is to make it convenient to compose components which have individually been specified in this way. Separation between computational and compositional code, with specially-designed languages and tools for creating the latter, is a well-known approach; relevant work includes parameterised modules such as Units [25], the composition language Piccola [26], component composition for systems software with Knit [27], and framework-based composition in the J2EE world [29].
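Continuing the hypothetical sketch from above, the compositional code can then live in its own small module, containing nothing but instantiation and wiring; composition languages such as Piccola [26] and Knit [27] make this layer a first-class artefact, where this plain-Java version merely imitates the separation.

    import java.util.concurrent.Executor;

    // Compositional code only: no computation, just wiring decisions.
    class ReportAssembly {
        public static void main(String[] args) {
            ReportGenerator.Clock clock = System::currentTimeMillis;
            ReportGenerator.Store store = (name, contents) ->
                System.out.println(name + ": " + contents.length + " bytes");
            Executor runInline = Runnable::run;  // wiring decision: no threads

            new ReportGenerator(clock, store, runInline).generate("monthly");
        }
    }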


Exploding the ``giant component''

A final point of Lampson's is that ``giant components'' such as popular web browsers, databases, operating system kernels and virtual machines are a success. What does ``success'' mean here? For that matter, what have we been meaning by ``component'' all this time?

Looking not only at McIlroy's initial proposal but at the entire canon of ensuing research, it seems clear that ``success'' with ``components'' is something observed in the development work needed to realise a new project. We should be able to construct new software by re-using pre-existing implementations unchanged, wherever one is available which fulfils some intrinsic requirement of the new project. The converse property should also hold: if any new implementation is necessary, it should later be possible to re-use it in any project for which it is suitable. Components are the units of re-use, and relative success is the extent to which these properties hold.

Lampson hails these giant components as a success because they're widely re-used. This is certainly true: we cannot deny their usefulness. It's also clear, however, that they're not sufficient, because they're too general. The construction of any application requires a great deal of code (or ``customisation'', as Lampson might call it) beyond these giant components. Much of that code is not inherently application-specific, so we still face the problem of how to re-use it wherever possible.

Of course, Lampson could be right that the component approach can never solve this latter problem. Hopefully my earlier arguments have done something to counter that suggestion - and if there are uncertainties remaining, there are surely enough promising ideas emerging to justify continued research.


Odds and ends


The Unix anomaly

It's worth skipping back to Lampson's early comments on Unix tools, which he cites as a rare successful example of small components. They are successful, he claims, because they ``have a very simple interface and ... most of them were written by a single tightly-knit group''. But he chooses not to explore the obvious follow-up questions: by what criteria are they a success, and what makes other components so different that they fail by these criteria?

Just like the ``giant components'' (see Section 5 above), Unix tools are a ``success'' because they're widely deployed and widely used. Since they were written by a single tightly-knit group, they compose easily with each other. But again, they're sufficiently general-purpose that they're only useful when combined not only with each other, but with some code or data of your own. This is where the difficulties start: most applications are concerned with entities more complex than character streams. Although it's easy for experienced users to forget it, Unix tools are anything but easy to use.

The supposedly advantageous ``simple interface'' is actually rather too simple, because it's oblivious to the internal structure of data. This shortcoming will be familiar to anyone who's tried to push some moderately-structured data through a Unix pipeline without it being corrupted by countless interacting text-substitution and escaping conventions. Users must also contend with a bewildering variety of command-line options and formatting conventions, with less consistency than you might expect. It takes considerable skill and experience to master these intricacies. Unsurprisingly, the overwhelming majority of computer users will never touch Unix tools, and engineers with Unix experience can command significantly higher rates of pay.

In this light, Unix tools don't seem like such a success any more. Why, then, haven't they been overtaken by something better? The answer can only be the continued popularity of Unix itself, which in turn is better explained by social phenomena than by any technical study. By contrast, other components cannot hope to piggy-back on Unix's established market position and giant install base, nor indeed to be quite so general-purpose, so their ``success'' is unlikely to rank highly on Lampson's scale. Fortunately, as we've already discussed, this is not the only way to evaluate success.


Efficiency, and some speculation about actual cost

To give credit where it's due, Lampson's comments on the nature of software efficiency are undoubtedly true. We sacrifice raw mechanical efficiency for human efficiency, i.e. delivering results quickly using little manpower.

On the other hand, there's possibly an interesting discrepancy between actual and apparent costs here. If you worked out the cost of optimising away 80% of the bloat and computational wastefulness of the world's most-used software, and compared this to the costs in electricity and new hardware that these inefficiencies incur every year, there might well suddenly be a case for carefully optimising these few programs. Alternatively, there might not. It'd be interesting to read more on this.
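To make the shape of the comparison concrete, take some deliberately invented numbers: if avoidable inefficiency wastes 10 W on each of 10 million machines, that is 100 MW continuously, or roughly 876 GWh per year - about $88m annually at $0.10 per kWh, before counting extra hardware. Against that, even a very large optimisation effort, say 100 engineers for a year at $200k each, would cost $20m. Whether the real figures line up anything like this is exactly the question worth investigating.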


The role of declarative programming

Lampson mentions ``declarative programming'' as a technique which could be pursued in preference to component technology. His definition seems roughly to equate to higher-level and more domain-specific languages.

Undoubtedly these are helpful ideas, but they've also been known for a long time - since far earlier, in fact, than the start of serious research into software re-use. They haven't solved all the problems yet. In application, they are inherently less universal than the idea of components. For example, there's a limit to how domain-specific a language can be before diminishing returns set in, making it no longer worth the effort of implementing an interpreter. For the software that's left over, for the interpreters themselves, and for the domain-specific code itself, component re-use is a worthwhile goal.

Perhaps in a few situations, reimplementation will prove inherently easier than solving the difficulties of re-use - but only when we've tackled the re-use problem in all other situations will that be a case for giving up. We're still a very long way from that point.


Conclusions

Dominating the tone of Lampson's entire argument is a pessimism which seems almost to overlook the very nature of research. We do research to solve unsolved problems. To convince anyone that something ``won't work'' takes far more than a summary of the deficiencies among existing practices. His argument is quite a plausible explanation as to why certain sectors of the software industry haven't had more success with components. But the conclusion that research into component technology should be abandoned is not remotely justified.

Bibliography

1
B.W. Lampson. Software Components: Only the Giants Survive. In Computer Systems: Theory, Technology, and Applications, K. Spärck Jones and A. Herbert (editors), Springer, 2004, pp. 137-146.

2
M.D. McIlroy. Mass produced software components. Report on the NATO Software Engineering Conference, pages 79-87, 1968.

3
E.S. Raymond. The Cathedral & the Bazaar: Musings on Linux and Open Source by an Accidental Revolutionary. 2nd Edition, O'Reilly, 2001.

4
D. Spinellis and C. Szyperski. How Is Open Source Affecting Software Development? Guest editorial, IEEE Software vol. 21 issue 1, January 2004.

5
J. Gosling, B. Joy, and G. Steele. The Java Language Specification. Addison Wesley, 1996.

6
R. Milner, M. Tofte, R. Harper, and D. MacQueen. The Definition of Standard ML (Revised). MIT Press, 1997.

7
D.M. Yellin and R.E. Strom. Protocol Specifications and Component Adaptors. ACM Transactions on Programming Languages and Systems, vol. 19, no. 2, pp. 292-333, 1997.

8
R. Passerone, L. de Alfaro, T.A. Henzinger, and A. Sangiovanni-Vincentelli. Convertibility verification and converter synthesis: two faces of the same coin. In Proceedings of the 2002 IEEE/ACM international Conference on Computer-Aided Design. ACM Press, New York, pp. 132-139, 2002.

9
A. Orso, M. Harrold, and D. Rosenblum. Component Metadata for Software Engineering Tasks. Proc. 2nd International Workshop on Engineering Distributed Objects (EDO 2000), Springer-Verlag Lecture Notes in Computer Science, Nov. 2000, pp. 129-144.

10
J. Gosling, K. Arnold, and D. Holmes. The Java Programming Language. 4th Edition. Addison Wesley, 2005.

11
P. Sewell. Modules, abstract types, and distributed versioning. In Proceedings of the 28th ACM SIGPLAN-SIGACT Symposium on Principles of Programming Languages, POPL '01, pp. 236-247. ACM Press, New York, 2001.

12
B. Meyer. Applying ``Design By Contract''. IEEE Computer vol. 25, issue 10 (Oct. 1992), pp. 40-51.

13
C. Flanagan, K.M. Leino, M. Lillibridge, G. Nelson, J.B. Saxe, and R. Stata. Extended static checking for Java. In Proceedings of the ACM SIGPLAN 2002 Conference on Programming Language Design and Implementation, PLDI '02, pp. 234-245. ACM Press, New York, 2002.

14
M. Barnett, K.R.M. Leino, and W. Schulte. The Spec# programming system: An overview. In CASSIS 2004, LNCS vol. 3362, Springer, 2004.

15
J.-P. Tolvanen and M. Rossi. MetaEdit+: defining and using domain-specific modeling languages and code generators. In Companion of the 18th Annual ACM SIGPLAN Conference on Object-Oriented Programming, Systems, Languages, and Applications, OOPSLA '03, pp. 92-93. ACM Press, New York, 2003.

16
M. Lanza and S. Ducasse. CodeCrawler: An Extensible and Language Independent 2D and 3D Software Visualization Tool. Tools for Software Maintenance and Reengineering, pp. 74-94, Franco Angeli, Milano, 2005.

17
B. Fischer. Specification-Based Browsing of Software Component Libraries. Proc. Automated Software Engineering, pp. 246-254, Kluwer, 1998.

18
Sun Microsystems Inc. Java Native Interface Specification. 1997.

19
M. Chakravarty (ed.). The Haskell 98 foreign function interface 1.0: An addendum to the Haskell 98 report. http://www.cse.unsw.edu.au/%7Echak/haskell/ffi/.

20
G. Kiczales. Aspect-oriented programming. ACM Computing Surveys, 28A(4), 1996.

21
D.M. Beazley. SWIG: An Easy to Use Tool for Integrating Scripting Languages with C and C++. In 4th Annual Tcl/Tk Workshop, Monterey, July 1996.

22
D.S. Platt and K. Ballinger. Introducing Microsoft .NET. Microsoft Press. Redmond, WA, 2001.

23
K.E. Gray, R.B. Findler, and M. Flatt. Fine-grained interoperability through mirrors and contracts. In Proceedings of the 20th Annual ACM SIGPLAN Conference on Object Oriented Programming, Systems, Languages, and Applications, OOPSLA '05, pp. 231-245. ACM Press, New York, 2005.

24
D. Garlan, R. Allen, and J. Ockerbloom. Architectural mismatch or why it's hard to build systems out of existing parts. In Proceedings of the 17th international Conference on Software Engineering, ICSE '95. ACM Press, New York, pp. 179-185, 1995.

25
M. Flatt and M. Felleisen. Units: cool modules for HOT languages. In Proceedings of the ACM SIGPLAN 1998 Conference on Programming Language Design and Implementation, PLDI '98, pp. 236-248. ACM Press, New York, 1998.

26
F. Achermann, M. Lumpe, J.-G. Schneider and O. Nierstrasz. Piccola: a Small Composition Language. In Formal Methods for Distributed Processing: A Survey of Object-Oriented Approaches, H. Bowman and J. Derrick (Eds.), pp. 403-426, Cambridge University Press, 2001.

27
A. Reid, M. Flatt, L. Stoller, J. Lepreau, and E. Eide. Knit: Component Composition for Systems Software. In Proc. of the 4th Symposium on Operating Systems Design and Implementation, pages 347-360, San Diego, CA, October 2000.

28
C. Haack, B. Howard, A. Stoughton, and J.B. Wells. Fully Automatic Adaptation of Software Components Based on Semantic Specifications. In Proceedings of the 9th international Conference on Algebraic Methodology and Software Technology, Lecture Notes In Computer Science, vol. 2422, pp 83-98. Springer-Verlag, London, 2002.

29
R. Johnson. J2EE Development Frameworks. IEEE Computer vol. 38, issue 1 (Jan. 2005), pp. 107-110.
