Three Myths About Turbo Pascal

I don't like the winter solstice. The nights are long and the short, cold days are too dark because the sun is so low in the sky. I find, as my ancestors did, that it's an especially good time for fires and candles and their reminder of the sun that was and will be. Perhaps it's appropriate, then, that I ran into three Turbo Pascal myths between 1991 and 1992.

Myth number one: A procedural variable must always point to actual code; it can't have Nil as its value.

Myth number two: Since you can't assign object methods to procedural variables, you can't call methods indirectly (that is, through a pointer).

Myth number three: You can't write object-oriented DLLs in TPW, v1.

If you believe any of these myths, you should read on because, as you'll see, each is only `sort of' true: Each elevates a restriction or a difficulty to the status of a full-fledged impossibility.

The First Myth

The first myth springs from the peculiar syntax of procedural variables, which tends to hide the fact that a procedural variable is really just a special type of 32-bit pointer. (I would have preferred to have had procedural types implemented more like normal pointers, so that we would have said something like ProcVar^(ProcArg) to make an indirect call, just as we use ^ to "dereference" a normal pointer to data. This would have made the `pointer nature' of procedural types quite obvious and eliminated the need for the ridiculous @ and @@ kludge. However, Borland apparently decided that making indirect calls look just like direct calls was more important than syntactic regularity, which I think is a shame.) Part of the syntactic peculiarity is that while you can freely cast any 32-bit variable (including longints and records) to any pointer or procedural type, you can't cast 32-bit expressions to procedural expressions. Thus, although you can say PtrVar := PtrFn(FnArg); ProcVar := ProcType(PtrVar);, you can't say ProcVar := ProcType(PtrFn(FnArg));. Similarly, though you can say const NilVar: pointer = Nil; ProcVar := ProcType(NilVar);, you can't say ProcVar := ProcType(Nil);.

Since Pascal treats parameter binding much like variable assignment, the same restrictions apply to passing procedural types to other procedures. If a procedure expects a ProcType argument, we can pass it ProcType(NilVar) but we can't pass it Nil or ProcType(Nil).

Since it's incredibly unlikely that the interrupt table at 0:0 will contain a reasonable procedure, we don't want to ever actually call a procedural variable whose value is Nil: the Nil value signals that the program shouldn't do anything here, or that it should take some sort of default action. Thus, if we know that at some point a procedural variable may be set to Nil, we should precede its use with a test like if (@ ProcVar) = Nil then {take the default action} else {use the ProcVar};.
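As a minimal sketch of the points above, the following program stores Nil in a procedural variable, passes Nil as a procedural parameter, and tests for it before calling. The names Report and Invoke are hypothetical, chosen only for illustration; ProcType, ProcVar and NilVar follow the usage in the text. It assumes Turbo Pascal 5.5 or later with far calls enabled.

```pascal
program NilProc;

{$F+}  { routines assigned to procedural variables must use the far call model }

type
  ProcType = procedure (Count: integer);

const
  NilVar: pointer = Nil;  { a typed constant is a variable, so it can be cast }

var
  ProcVar: ProcType;

{ Hypothetical handler, used only to have something to point at }
procedure Report(Count: integer);
begin
  WriteLn('Report called with ', Count);
end;

{ Hypothetical caller: checks for Nil before making the indirect call }
procedure Invoke(Handler: ProcType; Count: integer);
begin
  if (@ Handler) = Nil
    then WriteLn('No handler for ', Count, '; taking the default action')
    else Handler(Count);
end;

begin
  ProcVar := Report;
  Invoke(ProcVar, 1);

  ProcVar := ProcType(NilVar);   { legal: cast of a pointer variable }
  Invoke(ProcVar, 2);

  Invoke(ProcType(NilVar), 3);   { legal: cast variable as an argument }
  { Invoke(Nil, 4) and Invoke(ProcType(Nil), 4) are rejected by the compiler }
end.
```

The Nil test lives inside Invoke, so callers that pass a Nil handler never reach the indirect call.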

The Second Myth

Debunking the second myth is going to require two things: aggressive use of casting, and a bit of background on just what a method call is. While method calls look and act very differently than normal calls -- the call looks like a reference to one of the object's fields, and there's the implicit with Self do that lets us refer to the object's fields as if they were global variables -- at the level of words on the stack they're not all that different than a normal procedure or function call. All methods have an `invisible', or implicit, parameter, var Self, after any regular, or explicit, parameters; constructors and destructors also add an implicit word parameter (the 16-bit VMT pointer) between the explicit parameters and Self. Also, while constructors act as if they return a boolean, they actually return a pointer which contains @ Self if Fail was not called, and Nil if it was.

The implicit parameters and the special handling of constructor results are the only differences between method calls and normal calls: there's no magic involved. If we simply define a procedural type, ProcType, that explicitly declares the method's implicit parameters after any normal parameters, we can then use ProcType to cast any pointer variable to a procedural variable. Once it's cast, the pointer acts just like a normal procedural variable; we can assign it to another procedural variable or use it to call a procedure. Just as with a normal procedural type, the only difference between a direct and indirect call lies in the way we make the call: The parameters are pushed and popped in the same way; the called code operates just the same; and indirectly called methods have the same full access to their object's fields (through the Self pointer) as directly called methods do.

Thus, if we have a method with no arguments and no results, we would simply make the declaration type Niladic = procedure (var Self);. To use it, we remember that we can only cast pointer variables, not pointer expressions, and so do something like PtrVar := @ ObjectType.Method; Niladic(PtrVar)(Self);. Now, while there is something strange looking about a cast (in parentheses) followed by an argument list (in parentheses), indirect method calls are typically rare and concentrated in a few key routines, even in programs that rely heavily on them. (My typical uses for indirect method calls involve things like executing a list of object/method pairs on every timer tick, or calling a window object's message handler after DMT lookup reveals that it does have a handler.) What's more, the strange look of an indirect method call does not translate into strange object code: Using a cast to a procedural type generates the exact same code as using a normal procedural type, and that's both a little smaller and only slightly slower than a normal, direct procedure or function call.

Methods that require parameters or that return results are only slightly different than our Niladic example above. We simply have to remember to put any explicit parameters before the implicit parameter(s). Thus, we might use type SimplePredicate = function (var Self): boolean; for a method that takes no arguments and returns a boolean, and type UntypedDyadic = procedure (var A, B; var Self); for a method that requires two untyped memory references.
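Here is a small sketch of indirect calls to static methods, one with no arguments and one with a single explicit argument. The Counter object and the AddProc type are hypothetical names invented for this example; Niladic and PtrVar follow the text. It assumes Turbo Pascal 5.5 or later, and that Self is passed last as a 32-bit reference, as described above.

```pascal
program IndirectMethod;

{$F+}  { keep everything far so the casts below match the real call model }

type
  { Hypothetical object with two static methods }
  Counter = object
    Count: integer;
    procedure Reset;
    procedure Add(Amount: integer);
  end;

  { Procedural types that spell out the implicit Self parameter, }
  { placed after any explicit parameters                         }
  Niladic = procedure (var Self);
  AddProc = procedure (Amount: integer; var Self);

procedure Counter.Reset;
begin
  Count := 0;
end;

procedure Counter.Add(Amount: integer);
begin
  Count := Count + Amount;
end;

var
  C: Counter;
  PtrVar: pointer;

begin
  PtrVar := @ Counter.Reset;   { address of the method's code }
  Niladic(PtrVar)(C);          { same effect as C.Reset }

  PtrVar := @ Counter.Add;
  AddProc(PtrVar)(5, C);       { same effect as C.Add(5) }
  AddProc(PtrVar)(7, C);       { same effect as C.Add(7) }

  WriteLn('Count = ', C.Count);
end.
```

The cast must name a procedural type whose parameter list matches the method's. Swapping Niladic and AddProc here would compile, but it would push the wrong number of words.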

Just as with a normal procedure call, the compiler will not let us make an indirect method call with the wrong number or type of arguments. This is obviously desirable behavior, but it's tempered with a bit of a caveat: When we make a cast, we are effectively telling the compiler that we know exactly what we are doing. If we accidentally use a pointer to an UntypedDyadic method as a Niladic, the compiler will neither require nor accept the two var parameters to the UntypedDyadic method, but the procedure will probably use them and the result will not be pretty! Similarly, the compiler will not complain if you cast a data pointer or the address of a near routine into a procedural type: it will blithely generate code that will (at best!) crash your computer.

The Third Myth

The third myth is, in many ways, the closest to true. While we can build object-oriented DLLs using TPW v1, it does take a bit of work, and the result suffers from the not insignificant restriction that we can only call a virtual method within the module that defines it: We cannot define a virtual method within a DLL and then call it from another DLL or from an EXE module. This restriction comes from the fact that TPW v1 only implements `near objects', with 16-bit VMT pointers that refer to the defining module's data segment, and not `far objects' with 32-bit VMT pointers. When you make a virtual call, TPW expects the virtual method table to be in the current data segment. Windows, however, gives each DLL and EXE module its own data segment. When you make a virtual method call from outside the module that contains the code, the current data segment will not have the object's VMT in it. The call will go to an essentially random address! It's important to note that it's not enough to merely avoid explicitly making virtual calls: Calling a static method that ultimately calls a virtual method is just as dangerous.

There's no getting around this ban on virtual method calls, but it will probably go away in some future release of TPW and, in the meanwhile, it doesn't prevent us from making static method calls across EXE/DLL and DLL/DLL boundaries.

Now, any DLL entry point has to be declared export so that it can have the special prologue/epilogue code that saves and changes DS on entry and restores it on exit. A little experimentation will convince you that you can't directly export objects' methods: you have to export a `flat' (not object-oriented) shell routine that in turn calls the method. The bindings unit on the DLL user's end simply repeats the objects' declarations and declares that the DLL shell routines are their external implementations.

Figure 1 is an extremely simplistic example of an object-oriented DLL's export library. It consists entirely of a series of shell routines and the associated exports statements. (Figure 1 exports objects that are defined in the SimplObj unit, though there's no reason that exported objects can't be defined in the export library.) Just like procedural types that are used to make indirect method calls, the exported shell routines have to explicitly declare both the methods' explicit arguments (if any) and their implicit (VMT and Self) arguments. All the shell routines have to do (besides invisibly set and restore DS) is to pass all their arguments on to the actual method. The simplest and most general way to do this seems to be to use a procedural type to make an indirect call through a typed constant pointer to the actual method.
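The sketch below illustrates the shell-routine technique just described. It is not the article's Figure 1: the library name CountDll, the Counter object, and every routine and type name in it are hypothetical. It assumes TPW 1.x syntax (library, export, exports), and that a typed constant pointer may be initialized with the address of a method. The object is defined in the library itself to keep the example self-contained.

```pascal
library CountDll;   { hypothetical library name }

type
  { Hypothetical exported object: static methods only, no virtuals }
  Counter = object
    Count: integer;
    procedure Reset;
    procedure Add(Amount: integer);
  end;

  { Procedural types that make the implicit Self parameter explicit }
  ResetProc = procedure (var Self);
  AddProc   = procedure (Amount: integer; var Self);

procedure Counter.Reset;
begin
  Count := 0;
end;

procedure Counter.Add(Amount: integer);
begin
  Count := Count + Amount;
end;

const
  { Typed constant pointers to the actual methods }
  ResetPtr: pointer = @ Counter.Reset;
  AddPtr:   pointer = @ Counter.Add;

{ Flat shell routines: export supplies the DS-switching prologue and }
{ epilogue; the body just passes every argument on to the method     }
procedure Counter_Reset(var Self); export;
begin
  ResetProc(ResetPtr)(Self);
end;

procedure Counter_Add(Amount: integer; var Self); export;
begin
  AddProc(AddPtr)(Amount, Self);
end;

exports
  Counter_Reset index 1,
  Counter_Add   index 2;

begin
end.
```

Each shell routine declares exactly the words its method expects, in the same order, so the caller's stack frame can be repushed unchanged.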

Figure 2 is the corresponding bindings unit for the DLL user. Windows is designed to make it easy for you to call DLL entry points under any name you wish, so there isn't any problem binding a shell routine named Simple_Setup to the using code's Simple.Setup.
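A bindings unit along these lines might look like the following sketch. Again, this is not the article's Figure 2. It assumes a hypothetical DLL named COUNTDLL that exports flat shell routines for a Counter object's Reset and Add methods at indexes 1 and 2. The unit name CountBnd is likewise invented.

```pascal
unit CountBnd;   { hypothetical bindings unit for a hypothetical COUNTDLL }

interface

type
  { Repeats the object declaration exactly as the DLL defines it; }
  { the field layout must match, since the DLL reads Self^ directly }
  Counter = object
    Count: integer;
    procedure Reset;
    procedure Add(Amount: integer);
  end;

implementation

{ Each method's implementation is the DLL's shell routine.       }
{ Binding by index means the shell's own name (Counter_Reset,    }
{ say) never has to match the method name used here.             }
procedure Counter.Reset; external 'COUNTDLL' index 1;
procedure Counter.Add;   external 'COUNTDLL' index 2;

end.
```

Code that uses this unit would then declare a Counter variable and call C.Reset or C.Add(5) as if the methods were linked into the EXE.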

Figure 2 also illustrates another use for renaming, or aliasing, DLL entry points. Simple.Method and Simple.Alias both refer to the same procedure - external DLL index 3 - but where Method expects a pointer argument, Alias expects segment and offset words. When you construct an alias, you have to be very careful that the alias pushes the same number of words, in the same order, as the routine it's aliasing. You should also be thoroughly familiar with the parameter passing conventions detailed in the Inside Turbo Pascal section of the manual, as in many cases only a pointer is passed and the called code is responsible for copying the argument into a local variable.

Virtual methods aside, DLL-resident objects act just like EXE-resident objects: The using code can even call constructors and destructors through the extended syntax to New() and Dispose(). While the shell routines do slow execution down a bit, so does normal DLL linkage. For most reasonably sized argument lists, the shell routine's repushing the arguments and calling the method will cost about as much as the standard export routine prologue/epilogue code. In other words, not much: Only the shortest and fastest methods will be significantly slowed by placing them in a DLL.

Admittedly, constructing the export library and the bindings unit is a tedious, error-prone chore, but it's also quite susceptible to automation: My shareware DllMaker (available on the PC Techniques code disk) will read the interface section of any number of units, and generate an export library and a bindings unit.

While a ban on virtual methods is certainly not insignificant, it's not as if static objects are worthless! Not only do they provide us with handy tools to implement data abstraction and modularity, they also offer all the power of explicit inheritance. In other words, static objects are better than no objects.

As we head into spring here in California, I've dug a new garden bed along my North fence line and debunked three Pascal myths. The one thing all three parts of this article have in common is the basic unity of procedural and pointer types. With any luck, this `seed notion' will bear as fruitfully for you as the raspberries and blueberries I'll probably plant along the fence. One area you may wish to explore is using procedural types and indirect calls to implement procedure aliases without using DLLs.


September, 1994:
The bare-root raspberries I planted in '92 never came to life. The ones I planted last summer seem to be virus infected: New canes grow for a while, then wither and die. I'm getting a handful of berries, but that's about it. The blueberries never thrived, either, and this year one of the two bushes died, so I'm not getting any fruit on the survivor.

I hope this article has borne more fruit than that garden bed has!


This article originally appeared in PC Techniques

Copyright © 1992, Jon Shemitz - jon@midnightbeach.com - html markup 9-3-94..10-16-94

Bio:
Jon Shemitz has used Turbo Pascal since version 1, which is longer than he's known the mother of his son. All four live together happily in Santa Cruz, California.
