Defining your own data types

Tuples and lists are quite powerful, but for more sophisticated applications, we'll need to define our own types. User-defined types make programs clearer and help the type-checker to help the programmer. We introduce a new data type in a data declaration:

     data vague == yes ++ no ++ maybe ;

data is a reserved word and vague is the name of the new type. == is read as `is defined as' and ++ is read as `or'. yes, no and maybe are the names for the constructor functions of the new type. We can now write function definitions that use these constructors in patterns:

     dec evade : vague -> vague ;
     --- evade ( yes )   <= maybe ;
     --- evade ( maybe ) <= no ;

The constructors can be parameterised with any type of object, including the type that's being defined. We can define types like lists, whose objects are of unlimited size using this kind of recursive definition. As an example, here's a user-defined binary tree that can contain numbers as its leaves:

     data tree == empty ++ tip ( num ) ++ node ( tree # tree ) ;

There are three constructors. empty has no parameters and defines a tree with nothing in it. tip defines a tree in terms of a single num, and node defines a tree in terms of two other trees. Here's a typical tree:

                                      .
                                     / \
                                    /   \
                                ___/     \___
                       ________/             \________
                      /       /               \       \
                     /       /\               /\       \
                    1       /  \             /  \       \
                           /    \           /    \       5
                          2      3         .      4

Here's an example of a function that manipulates trees. It returns the sum of all the numbers contained in one:

     dec sumtree : tree -> num ;
     --- sumtree ( empty )         <= 0 ;
     --- sumtree ( tip ( n ) )     <= n ;
     --- sumtree ( node ( l, r ) ) <= sumtree ( l ) + sumtree ( r ) ;

Unfortunately there's no shorthand for writing tree constants like there is for list constants, so we've got to write them out the long way using constructors. If we want to use sumtree to add up all the numbers in the example tree above, we must type in the expression:

     sumtree ( node ( node ( tip ( 1 ),
                        node ( tip ( 2 ),
                               tip ( 3 ) ) ),
                 node ( node ( empty,
                               tip ( 4 ) ),
                        tip ( 5 ) ) ) ) ;

This isn't really a drawback, because programs that manipulate complex data structures like trees will generally define them using other functions. However, it's very useful to be able to type any kind of constant data structure at the terminal when we're checking out an individual function like sumtree. When we want to test a Pascal program piecemeal, we usually have to write elaborate test harnesses or stubs to generate test data.

Roger Bailey <rb@doc.ic.ac.uk>