A different angle on the basic features of object-oriented programming

Many texts characterise object-oriented programming (OOP) by its support for encapsulation, inheritance and polymorphism. That's probably a good way to learn about it, but as I've thought about it more, I've come to see OOP in slightly different terms. My own take is quite useful when thinking about module systems: in particular, it gives a new perspective on the arguments of Szyperski [1], which I'll explain afterwards. So, here's what I see as the fundamental differences between a (statically typed) object-oriented language, such as C++, and a more conventional procedural language, such as C.

1. Module identity and dynamic module instantiation

Many procedural languages have some sort of module system. It's not always clear what "module system" means, but roughly it's any language feature which lets some parts of a program be prevented from referencing the internals of other parts (i.e. one which supports information hiding). It often comes together with separate compilation. C's module system, for example, is based on compilation units: declaring a global variable or function static gives it internal linkage, so that it can't be referenced from outside its compilation unit, much as C++'s private makes a definition inaccessible outside its containing class.
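As a rough illustration of a compilation unit acting as a module, here is a minimal sketch. The file name ticket.c and the names last_issued, is_valid and ticket_next are my own inventions for the example, not taken from any real library.

```c
#include <assert.h>

/* ticket.c: a hypothetical compilation unit acting as a module. */

/* Internal linkage: other compilation units cannot name this variable. */
static int last_issued = 0;

/* A private helper, hidden in the same way. */
static int is_valid(int n)
{
    return n > 0;
}

/* External linkage: this function is the module's whole interface. */
int ticket_next(void)
{
    last_issued++;
    assert(is_valid(last_issued));
    return last_issued;
}
```

Another compilation unit can call ticket_next(), but any attempt to refer to last_issued or is_valid from outside this file fails at link time. Note that there is exactly one copy of last_issued in the whole program.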

So, what's the minimum we have to add to C to support C++-style encapsulation? Suppose we allowed C programs to contain more than one instance of a particular compilation unit. In other words, all the globals it defines could exist multiple times in memory. We could then crudely map C++-style class definitions to C-style module definitions. We also need some way of naming the different copies of the module independently, just as we can identify different C++ objects using their address. In other words, we need module identity. Finally, we need to be able to dynamically instantiate modules at run time (akin to C++'s new operator). Together, these features add encapsulated objects to C.
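C programmers usually approximate this by hand. The sketch below gathers a module's globals into a struct, so that each copy has its own identity (its address) and can be created at run time. The counter module and all of its names are hypothetical, chosen only for illustration.

```c
#include <assert.h>
#include <stdlib.h>

/* The "globals" of a hypothetical counter module, gathered into one record
   so that several copies can exist in memory at once. */
struct counter {
    int count;
    int step;
};

/* Dynamic instantiation, roughly what C++'s new does.
   Returns NULL if memory cannot be allocated. */
struct counter *counter_new(int step)
{
    struct counter *self = malloc(sizeof *self);
    if (self != NULL) {
        self->count = 0;
        self->step = step;
    }
    return self;
}

/* Every operation is told which instance to act on; the instance's
   address serves as its identity. */
int counter_next(struct counter *self)
{
    assert(self != NULL);
    self->count += self->step;
    return self->count;
}

/* Destroy an instance created by counter_new. */
void counter_delete(struct counter *self)
{
    free(self);
}
```

Two calls to counter_new yield two independent instances, each with its own count. In this sketch the fields are still visible to callers; to recover the hiding that static gave us, the struct definition would be kept in the .c file and only an incomplete declaration (struct counter;) published in the header.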

2. Subtyping

All object-oriented languages feature subtyping in some form. Interface inheritance is the most obvious form: if we give each module or object a type based on the operations it exports, then one type is naturally a subtype of another when its set of operations is a superset of the other's. Given this, implementation inheritance is nothing but a shorthand for module definitions: we can base a new module definition on an existing one, without duplicating the latter's code. The new module type automatically inherits the interface of the existing one, so it's automatically a subtype.
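In plain C, the nearest equivalent of basing one definition on another is to embed the existing record as the first member of the new one. The following is only a sketch of that idiom; point, circle and their operations are hypothetical names, and C itself does not check the subtype relation for us.

```c
#include <assert.h>
#include <stddef.h>

/* The existing type: a point that can be moved. */
struct point {
    int x, y;
};

void point_move(struct point *p, int dx, int dy)
{
    assert(p != NULL);
    p->x += dx;
    p->y += dy;
}

/* The new type is based on the old one. Because the point is the first
   member, a circle and its embedded point share the same address. */
struct circle {
    struct point base;
    int radius;
};

/* An operation which only the new type has. */
void circle_grow(struct circle *c, int dr)
{
    assert(c != NULL);
    c->radius += dr;
}

/* Explicit upcast: view a circle as the point it extends. */
struct point *circle_as_point(struct circle *c)
{
    assert(c != NULL);
    return &c->base;
}
```

A circle supports everything a point does (via circle_as_point) plus circle_grow, so its operations are a superset of point's, and point_move is reused rather than duplicated. What an object-oriented language adds is that the upcast becomes implicit and type-checked.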

3. Subtype polymorphism as a language feature

All that's left is the ability to override methods in our new module types, so that they support virtual function calls, or more generally late binding. Most C programmers will know how to do this: we define a table structure containing a set of function pointers. Instead of calling functions directly by name, we use the table associated with our particular module instance. The address of the appropriate function resides at a known index in the table, so when we want to call the virtual function, we index into the table and call the function whose address we find there. Object-oriented languages make this look-up a language feature, with the same syntax as a normal call. Whenever we call a function which may be overridden (a virtual function, in C++ terms), the compiler emits code to do the table look-up.
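The hand-written table technique might look something like the sketch below. The animal example and every name in it are hypothetical, and real C++ compilers lay out their tables in implementation-specific ways, so this shows the idea rather than what any particular compiler emits.

```c
#include <assert.h>
#include <stddef.h>

struct animal;

/* The table: one slot per overridable operation, at a fixed position. */
struct animal_vtable {
    const char *(*sound)(const struct animal *self);
    int (*legs)(const struct animal *self);
};

/* Every instance carries a pointer to the table for its own type. */
struct animal {
    const struct animal_vtable *vt;
};

static const char *dog_sound(const struct animal *self) { (void)self; return "woof"; }
static int dog_legs(const struct animal *self) { (void)self; return 4; }

static const char *bird_sound(const struct animal *self) { (void)self; return "tweet"; }
static int bird_legs(const struct animal *self) { (void)self; return 2; }

/* One table per type, shared by all instances of that type. */
const struct animal_vtable dog_vtable = { dog_sound, dog_legs };
const struct animal_vtable bird_vtable = { bird_sound, bird_legs };

/* Late-bound calls: look the function up in the instance's table,
   then call whatever address is found there. */
const char *animal_sound(const struct animal *a)
{
    assert(a != NULL && a->vt != NULL);
    return a->vt->sound(a);
}

int animal_legs(const struct animal *a)
{
    assert(a != NULL && a->vt != NULL);
    return a->vt->legs(a);
}
```

Given struct animal d = { &dog_vtable }; and struct animal b = { &bird_vtable };, the same call animal_legs(...) yields 4 for d and 2 for b: which function runs is decided by the instance, not by the name used at the call site. In C++ the wrappers animal_sound and animal_legs disappear, because the compiler generates the look-up for every virtual call.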

Reflections

Szyperski argues that we need concepts of both "import" and "inheritance" in an object-oriented language, because classes don't do all the things required by a module system. Hopefully his argument will leave you asking the converse question, which he doesn't address: can module systems do all the things that we use classes for? My three points above are the answer: yes, as long as the language and module system have a few simple features. I find this quite useful in understanding the relationship between OOP, structured programming and module systems. If you have any comments, please contact me.

References

  1. Szyperski, C. A. "Import is Not Inheritance - Why We Need Both: Modules and Classes." Proceedings of the European Conference on Object-Oriented Programming (1992): 19-32.

Content updated at Fri 19 Jan 17:17:43 GMT 2007.