Performance and Flexibility =========================== This chapter covers the advanced topic of balancing performance and flexibility in a Dylan program. If you are writing a stand-alone program and are comfortable with using type constraints as you would in a static language, you do not need to read this chapter carefully. You may want to skim the chapter, so that you have an idea of what options are available to you in the future for larger or more complex projects. We start out by describing what Dylan’s execution model is, and what we mean by an *efficiency model*. The efficiency model can help a programmer to choose the appropriate language features for a particular problem. We also explore advanced features of Dylan that will let the programmer negotiate with the compiler to trade away part of the flexibility of the execution model for enhanced performance. Execution model --------------- Dylan is a dynamic language — everything in Dylan is defined in terms of a dynamic *execution model*. As we saw in :ref:`offset-method-dispatch`, the execution model of how a method is chosen when a generic function is called with a particular set of arguments is highly dynamic: the arguments are evaluated; the types of the arguments are determined; the applicable methods are found and sorted according to specificity; and, finally, the most specific, applicable method is called. This model implies that values and types can change, and that methods can be added right up until the generic function is called, and any of these changes still have an effect on which method is ultimately chosen. This dynamism — the model that value, number, and type of arguments; return values; applicable method; and method choice and execution are all determined at the last possible moment — is what gives the Dylan language its power. You might think that this dynamism also means that Dylan must perform poorly, because the only way to obey its execution model is to do a lot of extra computation at run time. But not every program makes use of dynamic features. Most functions accept and return a fixed number of values (often they return only one), and those values are often of a fixed or constrained type. Even programs that do use dynamism will not require it everywhere. So, a good Dylan compiler will identify the static parts of a program, and will compile them statically (that is, in a manner that is competitive with what a compiler of any good static language would do). To do that, the compiler uses a technique called *partial evaluation* — operations that can be evaluated at compile time (that the compiler knows can have only one outcome), will be done at compile time. Thus, even though the programmer can continue to think and program in terms of Dylan’s dynamic execution model, the compiler will generate efficient code when it can show that it can obtain the same return value without carrying out the full process at run time. For small projects — projects that can fit in a single library — the compiler can analyze the entire project and generate code that is competitive with any static language. If type constraints are used for all module variables, slots, parameters, and return values (as they would be in a static language), the compiler can generate code equivalent to that generated by compilers for static languages. In the remainder of this chapter, we examine how we can use type constraints, limited types, open classes, open generic functions, domain sealing, and primary classes to balance performance and flexibility in Dylan programs. Efficiency model ---------------- Dylan is a powerful language: Many of the built-in, or primitive, language operations are high-level operations, such as the method-dispatch mechanism, the collection facility, and the exception mechanism. Because of Dylan’s powerful features, it can be hard for the programmer to develop an *efficiency model* — a model of the absolute or relative cost of different approaches to a problem. In contrast, in other languages, such as C, every language construct can be explained directly in terms of a small number of machine instructions. Although it may be easy to understand the performance of a C program in terms of a simple model, programming in C is more work for the programmer — the higher-level abstractions are not provided, and must often be built from scratch. For example, a C programmer expects that the run-time cost of calling a function is the cost of possibly saving registers on a stack, passing the arguments, executing a machine instruction for jumping to a subroutine, and then executing a return instruction at the end of the function; if it is a call through a function pointer, or a C++ virtual function, the cost of an indirect jump must be added. In Dylan, the story is more complicated, because Dylan has a more sophisticated execution model: A call to a generic function might be much more expensive in a dynamic situation, because computing the most specific method could take much longer than would execution of the method itself. To write efficient programs in Dylan, you have to understand what constructs in the language can be expensive in time or space, and how you can reduce those costs in common cases. This understanding is based on an *efficiency model* — a conceptual model of how a program in Dylan runs at a low level. One problem with developing an efficiency model is that there is no single way to implement many Dylan operations. Different compilers do things in different ways, and certain compilers have multiple techniques for compiling the same piece of code, depending on circumstances. Nonetheless, we shall try to give an intuitive feel for which features of Dylan are costly, and which features enable the compiler to make optimizations. .. index:: single: type constraint; performance implications single: performance; type constraints Type constraints ---------------- In Dylan, variables, parameters, return values, and slots can all have type constraints. Dylan’s dynamic nature means that type constraints can be looser than is typical of a static language, or can even be deferred altogether, in support of rapid prototyping or evolutionary development. Type constraints in a dynamic language serve three primary purposes: #. Type constraints are required for method dispatch: the methods of a generic function are distinguished by the types of their required arguments. The generic function chooses the applicable methods by sorting them according to the type constraints of their parameters. #. Type constraints can be used optionally to enforce program restrictions. The compiler ensures that a variable, parameter, return value, or slot will never take on a value that is incompatible with the type constraint of the parameter, return value, or slot. (If the compiler cannot prove at compile time that an incorrect type is impossible, it inserts a run-time check to enforce the type constraint.) #. Type constraints allow the compiler to generate better code, because they are a contract between the programmer and the compiler that the variable, parameter, return value, or slot in question will never take on a value that is incompatible with its type constraint; hence, the compiler needs only to generate code for dealing with the declared type. Many Dylan compilers use *type inferencing* to determine the possible types of variables, parameters, and slots that do not have explicit type constraints. Within a library, the compiler essentially knows everything about the variables and functions that are not exported at the library interface — it can analyze all uses of variables, and all callers and callees of functions. Through this analysis, the compiler can develop a worst-case scenario of the possible types of every variable, parameter, return value, and slot. As a result, these compilers generate efficient code even if the programmer does not fully declare all types (as would be required in most static languages). .. topic:: Comparison with C: Static languages such as C have little need for type inferencing, because the type of every value must be declared, and the types can be checked easily at compile time. On the other hand, when a problem domain is ill-specified, the program is evolving through development, or a value may take on one of several types, the programmer must construct union types, and must use variant records or other bookkeeping to track the actual type of the value manually. Dylan automatically handles this bookkeeping and uses type inferencing to minimize the associated overhead. At the same time, when the type of a variable can change at run time, Dylan also automatically tracks the changing type. Some compilers have a facility for generating *performance warnings*, which inform you when type inferencing is not able to determine types sufficiently to generate optimal code. Some compilers have a facility for generating *safety warnings*, informing you when type inferencing is not able to determine types sufficiently to omit run-time type checking. As an example, consider these definitions (which are similar to, but not exactly the same as, the definitions on which we settled in :doc:`time-mod`): .. code-block:: dylan define abstract open class () slot total-seconds :: = 0, init-keyword: total-seconds:; end class ; define method decode-total-seconds (sixty-unit :: ) => (hours :: , minutes :: , seconds :: ) let total-seconds = abs(sixty-unit.total-seconds); let (total-minutes, seconds) = truncate/(total-seconds, 60); let (max-unit, minutes) = truncate/(total-minutes, 60); values (max-unit, minutes, seconds); end method decode-total-seconds; Because we made the choice to store ``total-seconds`` as an integer, and because *60* is an integer constant, the compiler can infer that the *truncate/* calls are for an integer divided by integer. There is no need to consider whether to use floating-point or integer division. If we were more concerned with testing out ideas, we might have left unspecified the type of the ``total-seconds`` slot (implicitly, its type would then be ````), or, if we wanted to keep the option of having times more accurate than just seconds, we might have specified that its type was ````, allowing for the possibility of using floating-point numbers, which can express fractional seconds. If we left the type of the ``total-seconds`` slot unspecified, the compiler would need to check the arguments to ``truncate/``, on the off chance that an argument was not numeric at all. In some compilers, you would be able to get a compile-time safety warning stating that a run-time type error is possible (which, if unhandled, will result in program failure), and that the check, and the possibility of a run-time error, could be avoided if the compiler knew that ``total-seconds`` was a ````. .. topic:: What is a safe program? Dylan is always safe in that a programming error cannot cause a corruption of the program (or of other programs). For example, an out-of-bound array access or passing an argument of incompatible type simply cannot happen. The compiler will either prove that the requested action is impossible, or will insert code to verify bounds or type at run time, and will signal an error if the bounds or type is incorrect. When we discuss safety in this section, we are referring to whether or not such errors will be visible to the user. If we have not provided for a recovery action, signaling of an error will halt the program. See :doc:`exceptions`, for an example of how run-time errors can be handled by the program. .. topic:: Comparison with Java: Java recognizes the need for safe operations, and has eliminated many of the unsafe practices of C and C++, adding such checks as array-bounds checks and type-cast checks at run time. However, Java retains the C mathematical model that trades performance for correctness. Java integers are of a fixed size, and computations that cannot be represented in that size silently overflow. In contrast, Dylan requires numeric operations to complete correctly or to signal an error. Several Dylan implementations are also expected to provide libraries for infinite-precision numerical operations. If we specified the type of the ``total-seconds`` slot as ````, the compiler would have to dispatch on the type of ``total-seconds``, using either floating-point or integer division as necessary. In some compilers, we would be able to get a compile-time performance warning stating that this dispatch could be omitted if the compiler knew that ``total-seconds`` was of a more restricted type. Note that the type of the return value of ``decode-total-seconds`` can be inferred: ``max-unit`` and ``minutes`` must be ```` (inferred from the definition of ``truncate/``), and ``seconds`` must have the same type as ``total-seconds`` (````, in our example); thus, the compiler does not have to insert any type checks on the return values of ``decode-total-seconds``. Dylan enforces declared return types in the same way as it enforces parameter types, by eliminating the check where type inferencing can show it is not needed, and using the enforced types to make further inferences. From this example, you can see how the compiler can get a lot of mileage from a small number of constraints, and how it can point you to the places where further clarification will produce the most performance and safety benefits. At the same time, Dylan does not require that you have all your types thought out in advance of compiling the program; the dynamic nature of the language allows Dylan to defer considering type information until the program is actually running. In good Dylan development environments, there is support for resolving and continuing from run-time type errors during program development (rather than requiring editing of the code and recompilation). Remember that your code is more suited to reuse when it has fewer and more general type constraints. If you have a compiler that can issue safety and performance notes, try to generalize and minimize your type constraints, being guided by your safety and performance requirements. Often, just the constraints required to specify method applicability will be sufficient for good safety and performance. Declaring the types of module variables, slots, and return values of functions is also useful and can help to document your program. Declaring types for constants and local variables can be useful for enforcing program correctness, but is unlikely to create optimization opportunities, and might actually reduce performance, because the compiler will insert type checks to enforce such constraints if they are overly restrictive. .. _perform-limited-types: Limited types ------------- Some of Dylan’s built-in types are extremely general. When these types are used, the compiler’s type inferencing is thwarted, and less efficient code will be generated. The place where this situation is most obvious is in the ```` types, where the elements of a collection are essentially like multiple slots, all with the same type constraint. For the built-in collections, elements typically have a general default type (often simply ````), and there can be an arbitrary number of them. The ``limited`` mechanism is a way to specify that you expect to store objects of a particular type in the collection, and to specify how many elements will be in the collection. As an example, in :ref:`nlanding-vehicle-containers`, the ``generate-gates`` method returns a ````. Without further information, the compiler must assume that that vector might contain objects of any types. As a result, the following code in the ``build-simple-airport`` method from :ref:`nlanding-airport-test-file`, will be inefficient: .. code-block:: dylan let gates = generate-gates(gates-per-terminal, capacity); ... for (gate in gates) gate.connected-to := taxiway-vector; end for; Because the compiler can infer only that ``gates`` is a ````, it must generate extra code to determine whether each ``gate`` has a ``connected-to`` method on it. We can use limited types to constrain ``gate-instances`` as follows: .. code-block:: dylan define constant = limited(, of: ); define method generate-gates (gates-per-terminal :: , default-gate-capacity :: ) => (gates :: ) let result = make(, size: reduce1(\+, gates-per-terminal)); ... values(result); end method generate-gates; With the limited constraint of the return value of ``generate-gates``, the compiler can ensure that only gate objects will ever be stored in the vector; hence, it can be sure that each ``gate`` will be a ```` and will have a ``connected-to`` method. Note that limited-collection types are instantiable types; that is, you can make an object of a limited type. This capability is different from similar constructs in certain other languages, in which those constructs are only an assertion about the range or type of values to be stored in the collection. Having declared the return value of ``generate-gates`` to be a ````, it would be an error to return a ```` instead; hence, we changed the argument to ``make`` when constructing ``result`` to be ```` instead of the original ````. If ```` and ``connected-to`` are not *open* (as described in `Open generic functions`_ and `Open classes`_), the compiler can infer that ``connected-to`` is used here to set a slot in the gate instance and to further optimize the code generated. We do not delve into the exact details of what the compiler has to know to make this optimization, but it is worth noting that, if either the class or the generic function were open, the optimization could not be made. .. topic:: Comparison with C++: The Dylan limited-collection types provide a capability similar to that offered by the C++ template classes. Unlike in C++, the base type of a limited-collection type (the equivalent of a C++ class template — in the example above, ````) is also a valid type. Dylan’s dynamic capabilities mean that Dylan can defer determining the element type of a collection until run time, in effect adapting the class template as it goes along. By using a limited type, the compiler can generate more efficient code. Another use of limited types is to allow compact representations. We can use ``limited`` with the built-in type ```` to specify numbers with a limited range that can be stored more compactly than integers. It is especially useful to use a limited range in combination with a limited collection; for example, .. code-block:: dylan define constant = limited(, of: limited(, min: -128, max 127)); In the preceding example, we define a type that can be represented as a one-dimensional array of 8-bit bytes. .. topic:: Comparison with C: C provides efficient data representations, because its data types typically map directly to underlying hardware representations. A drawback of C is that its efficient data representations are often not portable: The size of a ``short int`` may vary across platforms, for instance. Dylan takes the more abstract approach of describing the requirements of a data type, and letting the compiler choose the most efficient underlying representation. A drawback of the Dylan approach is that it cannot easily be used for low-level systems programming, where data structures must map reliably to the underlying hardware. Most Dylan systems provide a foreign-function interface to allow calling out to C or some other language more suitable to these low-level tasks. Some Dylan systems augment the language with machine-level constructs that provide the level of control necessary while staying within the object model as much as possible. .. topic:: Comparison with Java: Java recognizes that portable programs need well-defined data types, rather than types that map to the particular underlying hardware differently in each implementation. However, Java retains some of C’s concreteness in simply specifying four distinct sizes of integer (in terms of how many binary digits they hold), and forcing the programmer to convert integer types to objects manually, when object-oriented operations are to be performed. In contrast, Dylan’s limited-integer types specify, at the program level, the abstract requirements of the type, giving the compiler freedom to map the program requirements as efficiently as possible to the underlying architecture. .. _perform-enumerations: Enumerations ------------ Many languages provide enumeration types both to enforce program correctness and to provide more compact representation of multiple-choice values. Dylan does not have a built-in enumeration type, but you can easily construct enumerations using the ``type-union`` and ``singleton`` type constructors. For example, consider the ```` and ```` classes, where there are only two valid values for the ``direction`` slot in each class. Rather than enforcing the restrictions programmatically, as we did in :ref:`slots-virtual-slots`, we can create types that do the job for us: .. code-block:: dylan define abstract class () slot direction :: , required-init-keyword: direction:; end class ; define constant = type-union(singleton(#"north"), singleton(#"south")); define class () keyword direction:, type: ; end class ; define constant = type-union(singleton(#"east"), singleton(#"west")); define class () keyword direction:, type: ; end class ; Here, the abstract superclass specifies that the read-only slot ``direction`` must be a ````, and that it must be initialized when an instance is created with the keyword ``direction:``. The constant ```` is a type specification that permits only the symbol ``#"north"`` or the symbol ``#"south"``. The class ```` specifies that, when an instance of ```` is made, the initial value must be of the ```` type. We handled the longitude case similarly. The use of ``type-union`` and ``singleton`` to create enumeration types in this fashion is common enough that the function ``one-of`` is usually available in a utility library as a shorthand: .. code-block:: dylan define constant one-of = method (#rest objects) apply(type-union, map(singleton, objects)) end method; With this abbreviation, the direction types can be written more compactly: .. code-block:: dylan define constant = one-of(#"north", #"south"); define constant = one-of(#"east", #"west"); Some Dylan compilers will recognize the idiomatic use of ``type-union`` and ``singleton`` to represent such enumerations more compactly. For instance, a compiler could represent the direction slot of a latitude or longitude as a single bit, using the getter and setter functions to translate back and forth to the appropriate symbol. Direct methods -------------- The definition of the ``one-of`` constant is a method called a *direct method* or *bare* *method*. It is the equivalent of a function in other languages. A bare method does not create an implicit generic function, and invoking a bare method does not use method-dispatch procedure, but rather calls the method directly. We choose to use a bare method here because we are sure that ``one-of`` will never need method dispatch: it performs the same operation independent of the types of its arguments. The bare method serves to document this intent. If there were some possibility of additional methods, it would be more perspicuous to use a generic function, even if there is initially only one method. Most Dylan compilers will generate equally efficient code for a bare method and for a generic function with only one method, so the choice of which to use should be based on whether or not it would ever make sense to have additional methods that discriminate on parameter types. Tail calls ---------- The most important construct in the Dylan execution model is the function call, because function calls are the most common operation in the language. Remember that all slot accesses and assignments, arithmetic operations, and collection accesses obey the execution model of function calls, even if the syntax for them does not look like that of function calls. We have already discussed how Dylan compilers can optimize away run-time checking of argument types and the overhead of method dispatch, and that good compilers will generate equally efficient code for calls to single-method generic functions or direct methods. There is one additional optimization that good Dylan compilers will make, which is enabled by a particular style of programming. If the final operation in a method is a call to another function (called a *tail call*) then the calling function can jump directly to the called function, rather than using a call-and-return sequence. Thus, the return from the called function returns to its caller’s caller. As an example, consider this ``decode-total-seconds`` method: .. code-block:: dylan define method decode-total-seconds (sixty-unit :: ) => (hours :: , minutes :: , seconds :: ) decode-total-seconds(sixty-unit.total-seconds); end method decode-total-seconds; The inner call to ``decode-total-seconds`` can be a direct jump rather than a function call, because the compiler can infer which method should be called and that the return values already have the correct constraints. Typed generic functions ----------------------- In addition to specifying the types of the parameters and return values of methods, you can specify the types of the parameters and return values of a generic function. You usually restrict the parameter types of a generic function to establish the *contract* of the generic function — that is, to define the domain of arguments that the generic function is intended to handle, and the domain of the values that it will return. If we define a method without also defining a generic function, Dylan creates an implicit generic function with the most general types for each parameter and return value that are compatible with the method. For example, assume that we defined a method for ``next-landing-step``, and did not explicitly create a generic function for it. The method is as follows: .. code-block:: dylan define method next-landing-step (storage :: , aircraft :: ) => (next-class :: false-or(), duration :: false-or()) ... end if; end method next-landing-step; When we define a method without also defining a generic function, the compiler will generate an implicit generic function for us, which, in this case, will be as though we had defined the generic function like this: .. code-block:: dylan define generic next-landing-step (o1 :: , o2 :: ) => (#rest r :: ); In :ref:`nlanding-schedule-file`, where we did define a generic function, we used a simple definition, just documenting the number of arguments, and giving them mnemonic names: .. code-block:: dylan define generic next-landing-step (container, vehicle); Because we did not specify types of the arguments or return values, they default to ````, just as they did in the preceding implicit generic function. Although the generic function that we wrote does prevent us from defining methods with the wrong number of arguments, it does not constrain the types of those arguments or the format or type of return values in any way. A sophisticated compiler may be able to make inferences based on the methods that we define, but we could both aid the compiler and more clearly document the protocol of ``next-landing-step`` by specifying the types of the parameters and return values in the definition of the generic function: .. code-block:: dylan define generic next-landing-step (storage :: , aircraft :: ) => (next-storage :: , elapsed-time :: ); Now, the compiler can help us. If we define a method whose arguments are not a subclass of ```` and a subclass of ```` (for example, if we provided the arguments in the wrong order), the compiler will report the error. Furthermore, the compiler can use the value declaration to detect errors in the return values (for example, if we returned only a single value or returned a value of the wrong type). Finally, the compiler can be asked to issue a warning if there is a subclass of the argument types for which no method is applicable. In addition to establishing a contract, specifying the types of the parameters and return values of generic functions can allow the compiler to make additional inferences, as described in `Type constraints`_ with regard to ``truncate/``. In the absence of other information, the compiler is limited in the optimizations that it can make based solely on the parameter types in the generic function, so it is generally best not to restrict artificially the types of a generic function, but rather to use the restricted types to document the generic function’s protocol. Open generic functions ---------------------- By default, generic functions are *sealed*. When you use ``define generic``, that is the same as using ``define sealed generic``. No other library can add methods to a sealed generic function — not even on new classes that they may introduce. Methods cannot be added to, or removed from, the generic function at run time. The only methods on a sealed generic function are the methods that are defined in the library where the generic function itself is defined. Because of the restrictions on a sealed generic function, the compiler, using type-inference information, can usually narrow the choice of applicable methods for any particular call to the generic function, eliminating most or all of the overhead of run-time dispatching that would normally be expected of a dynamic language. We saw in :doc:`libraries`, that we must define a generic function that is part of a shared protocol using ``define open generic``, so that libraries sharing the protocol can implement the protocol for the classes that they define, by adding methods. If we do not define the generic function to be open, other libraries are prohibited from adding methods to the generic function, which would make it useless as a protocol. Unfortunately, a generic function that is open cannot be optimized. Even when the compiler may be able to infer the exact types of the arguments to the generic function in a particular call, because an open generic function may have methods added or removed, even at run time, the compiler must produce code to handle all these possibilities. Because open generic functions cannot be optimized, you should use them only when necessary. You need to balance the division of your program into libraries against the need to export and open more generic functions if the program is too finely divided. This balance is illustrated by the considerations we made in designing a protocol in :ref:`libraries-protocol-design`. When we chose to split the ``time`` and ``angle`` libraries, we were forced to create the ``say`` protocol library and open the generic function ``say``. In `Sealed domains`_, we show how to regain certain optimizations when you decide that opening a generic function is required. Note that generic functions that are defined implicitly in a library — such as those that are defined when you define only a single method, or those that are defined for slot accessors — are sealed by default. If you expect other libraries to add methods to one of these implicit generic functions, you must define the generic function explicitly to be open using ``define open generic``. Open classes ------------ By default, classes are ``sealed``. When you use ``define class``, that is the same as using ``define sealed class``. Other libraries cannot directly subclass a sealed class — they cannot define new classes that have your sealed class as a direct superclass. The only direct subclasses of the class are those subclasses that are defined in the library where the class itself is defined. Extensive optimization opportunities occur when the methods of a sealed generic function are specialized on sealed classes. In this case, the compiler can usually choose the correct method of the generic function to call at compile time, eliminating any run-time overhead for using a generic function. We saw in :doc:`libraries`, that we must define a class that is a shared substrate, such as ````, using ``define open class``, if the libraries sharing the substrate are expected to subclass the class. If we did not define the class to be open, other libraries would be prevented from subclassing it — which might be reasonable if the substrate were not intended to be extended by subclassing. Unlike an open generic function, an open class does not prevent all optimization. If a generic function has a method applicable to an open class, but the generic function is sealed, then the compiler might still be able to optimize method dispatch if that compiler can infer the types of the arguments to the generic function at a particular call. Sometimes, the dispatch code will be slightly less optimal, because it must allow for arbitrary subclasses, rather than a fixed set of subclasses; in general, however, opening a class is less costly than is opening a generic function. Note that, although you cannot directly subclass a sealed class from another library, you can subclass a sealed class in the library that defines the sealed class. It may not be obvious, but a corollary of this rule of sealing is that you can define an *open subclass* of a sealed class in the library that defines the sealed class. Using a sealed class with an open subclass is one simple way to get both flexibility and efficiency — the classes in the sealed branch will be optimized by the compiler, while the open subclass can be exported for other libraries to build on and extend. Sealed domains -------------- When you define a protocol that is meant to be extended by many libraries, both the base classes and the generic functions that make up the protocol must be open. This simple exigency might make it seem that there is no hope of optimizing such a protocol — however, there is hope. You use the ``define sealed domain`` form to seal selectively subsets or *branches* of the protocol, permitting the compiler to make all the optimizations that would be possible if the classes and generic functions were sealed, but only for the particular subset or branch in question. .. topic:: Advanced topic: Sealed domains are one of the most difficult concepts of the Dylan language to understand fully. It is reasonable to defer careful reading of this section until you are faced with a situation similar to the example — an imported open class and generic function that will be specialized by your library. As an example, consider the ``say`` protocol as used in the ``time`` library. Because the ``say`` generic function is defined to be open, even if the compiler can infer that the argument to ``say`` is a ``