9. Data Interface Schemes

A Data Interface Scheme, or DIS, tells Green Card how to translate from a Haskell data type to a C data type, and vice versa.

9.1. Forms of DISs

The syntax of DISs is as follows:

Dis : DisFun [ arg_1 ... arg_n ]         Application
    | Cons   [ arg_1 ... arg_n ]         Constructor, (n >= 0)
    |  Cons  '{' field_1 = dis_1 ,
                 ... ,
                 field_n = dis_n '}'     Record,      (n >= 1)
    | < Var / Var >
             [ arg_1, ... , arg_n]       User defined
                                         marshalling, (n >= 0)
    | 'declare' [cexp Var] 'in' [Dis]
    | Adis

Adis : '(' Dis ')'
     | TypeCast Cexp                Result only
     | TypeCast Var
     | Var                          Bound by '%dis'
     | '(' [ Dis_1, ... , Dis_n] ')'  Tuple n >= 1

Arg  : Adis
     | Cexp
     | Var

DisFun   : Var

TypeCast : Cexp                         C Expression

Var      : Var                          Initial letter lower case

It is designed to be similar to the syntax of Haskell patterns. A DIS takes one of the following forms:

The application of a DIS macro to zero or more arguments.

Like Haskell functions, a DIS macro starts with a lower-case letter. DIS macros are described in Section 9.2. Standard DIS functions include int, float, double; the full set is given in Section 10. For example:

  %fun foo :: This -> Int -> That
  %call (this x y) (int z)
  %code r = c_foo( x, y, z );
  %result (that r)

In this example this and that are DIS functions defined elsewhere.

The application of a Haskell data constructor to zero or more DISs.

For example:

newtype Age = Age Int
%fun foo :: (Age,Age) -> Age
%call (Age (int x), Age (int y))
%code r = foo(x,y);
%result (Age (int r))

As the %call line of this example illustrates, tuples are understood as data constructors, including their special syntax. The labelled fields syntax is also supported, i.e.

data Point = Point { px,py::Int }

%fun foo :: Point -> Point
%call (Point { px = int x, py = int y })
...

The use of records is also the reason for the restriction that simple C expressions can't contain assignment. Without this restriction examples like this would be ambiguous:

%result Foo { a = bar x, b = bar y }

Green Card does not attempt to perform type inference; it simply assumes that any DIS starting with an upper case letter is a data constructor, and that the number of argument DISs matches the arity of the constructor.

The application of a user function to one or more DISs

This form allows you to do user-defined marshalling, using a pair of Haskell functions. Since DISs are used both to pack (marshall) a Haskell value into a form that can be passed to C and to unpack values that come back, a pair of Haskell functions is required. For example:

data Nat = Zero | Succ Nat
fromNat :: Nat -> Int
toNat   :: Int -> Nat

%fun square :: Nat -> Nat
%call (< fromNat / toNat > (int x))
%code r = square(x);
%result (< fromNat / toNat > (int r))

Here the function fromNat is applied to square's argument, converting it to an integer before it crosses the fence into C. Likewise, the result coming back from C is converted back to type Nat by the function toNat.
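
The example above declares fromNat and toNat but does not define them. A minimal sketch of how they might be written is given here; the clamping of non-positive integers to Zero is an assumption made for this sketch, not something Green Card requires:

```haskell
-- Peano naturals, as in the example above.
data Nat = Zero | Succ Nat
  deriving (Eq, Show)

-- Haskell -> C direction: count the Succ constructors.
fromNat :: Nat -> Int
fromNat Zero     = 0
fromNat (Succ n) = 1 + fromNat n

-- C -> Haskell direction.
-- Assumption: non-positive inputs are mapped to Zero rather than rejected.
toNat :: Int -> Nat
toNat n
  | n <= 0    = Zero
  | otherwise = Succ (toNat (n - 1))
```

Both functions are pure, so they fit the single angle bracket form < fromNat / toNat >.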

The user functions can have any name at all: in fact, the <../..> syntax simply encloses two fragments of arbitrary Haskell to be applied to the succeeding arguments. One may specify a partially applied function, or anything else (excluding the use of the / and > symbols - so lambda abstractions are unfortunately not possible.) The user-defined DIS may of course also take more than one parameter. For example:

data Polar = P Dist Vector
polarToCart :: Polar     -> (Int,Int)
cartToPolar :: (Int,Int) -> Polar

%fun mirror :: Polar -> Polar
%call (< polarToCart / cartToPolar > (int x) (int y))
%code y = -y;
%     x = -x;
%result (< polarToCart / cartToPolar > (int x) (int y))

Notice that all the example marshalling functions so far have been pure functions; e.g., fromNat has type Nat -> Int rather than Nat -> IO Int. Sometimes you need to write a marshalling function that performs side effects. When you do, you'll need to inform Green Card of this, so that it can generate code that invokes the marshalling functions correctly. For example:

marshallString   :: String -> IO Addr
unmarshallString :: Addr   -> IO String

%fun setWindowTitle :: String -> IO ()
%call (<< marshallString / unmarshallString >> str)

i.e., stateful marshalling functions are enclosed by double angle brackets.
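
As a rough illustration of what a pair of stateful marshalling functions could look like, the sketch below uses the standard Foreign.C.String routines. It assumes CString (a Ptr CChar) as a stand-in for the older Addr type, and roundTrip is a hypothetical helper added only to show allocation and release together:

```haskell
import Foreign.C.String (CString, newCString, peekCString)
import Foreign.Marshal.Alloc (free)

-- Assumption: CString stands in for the Addr type used in the text.
-- newCString allocates a NUL-terminated copy with malloc; the caller
-- is responsible for freeing it.
marshallString :: String -> IO CString
marshallString = newCString

-- Copies the NUL-terminated C string back into a Haskell String.
unmarshallString :: CString -> IO String
unmarshallString = peekCString

-- Hypothetical helper: marshall, unmarshall, then release the buffer.
roundTrip :: String -> IO String
roundTrip s = do
  p  <- marshallString s
  s' <- unmarshallString p
  free p
  return s'
```

Because both functions live in IO, they would be named inside double angle brackets rather than single ones.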

A C type cast

Occasionally one wishes to declare and use a C variable at a type which differs slightly from the type produced by a standard DIS, although it shares the same machine representation. The declare {cexp} var in dis form can be used to do the necessary type conversion in C. Examples:

%fun foo :: Int -> IO ()
%call (declare {unsigned int} x in (int x))

data T = MkT Int
%fun faz :: T -> IO ()
%call (declare {c_t} x in MkT (int x))

The application of a base DIS to exactly one variable

This is the primitive form of a DIS -- the way all values actually get passed across the Haskell - C boundary. Base DISs denote a fixed set of primitive types known to both C and Haskell, such as int and Int respectively, and consist of the Haskell type name prefixed by %% (e.g., %%Int). Because the exact set of base DISs may vary slightly between compilers, it is recommended that programmers use the standard DIS macros listed in Section 10 instead. The base form is noted here primarily for completeness.

9.2. DIS macros

It would be unbearably tedious to have to write out complete DISs in every procedure specification, so Green Card supports DIS functions in much the same way that Haskell provides functions. (The big difference is that DIS functions can be used in ``patterns'' -- such as %call statements -- whereas Haskell functions cannot.)

DIS macros allow the programmer to define abbreviations for commonly-occurring DISs. For example:

data This = MkThis Int (Float, Float)
%dis this x y z = MkThis (int x) (float y, float z)

Along with the data declaration, the programmer can write a %dis function definition that defines the DIS function this in the obvious manner.

DIS macros are simply expanded out by Green Card before it generates code. So for example, if we write:

%fun f :: This -> This
%call (this p q r)
...

Green Card will expand the call to this:

%fun f :: This -> This
%call (MkThis (int p) (float q, float r))
...

(In fact, int and float are also DIS macros defined in Green Card's standard DIS prelude, so the %call line is further expanded to: [1]

%fun f :: This -> This
%call (MkThis (I# ({int} p)) (F# ({float} q), F# ({float} r)))
...

The fully expanded calls describe the marshalling code in full detail; you can see why it would be inconvenient to write them out literally on each occasion!)

Notice that DIS macros are automatically bidirectional; that is, they can be used to convert Haskell values to C and vice versa. For example, we can write:

%fun f :: This -> This
%call (MkThis (int p) (float q, float r))
%code int a; float b, c;
%     f( p, q, r, &a, &b, &c);
%result (this a b c)

The form of DIS macro definitions, given in Section 9.1, is very simple. The formal parameters can only be variables (not patterns), and the right hand side is simply another DIS. Only first-order DIS macros are permitted.
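
The expansion step described in this section can be modelled in a few lines of Haskell. The sketch below is only a toy model: the Dis type, subst, expand and thisMacros are hypothetical names invented here and are not Green Card's own data structures. It assumes macros are non-recursive; a recursive definition would make expand loop.

```haskell
-- A deliberately small model of DIS syntax (hypothetical).
data Dis
  = Var String          -- a variable such as p
  | App String [Dis]    -- DIS function application, e.g. this p q r
  | Con String [Dis]    -- data constructor application, e.g. MkThis ...
  | Tup [Dis]           -- tuple
  deriving (Eq, Show)

-- A macro has variable-only formal parameters and a DIS as its body.
type Macro = ([String], Dis)

-- Replace formal parameters by the actual arguments, all at once.
subst :: [(String, Dis)] -> Dis -> Dis
subst env d = case d of
  Var v      -> maybe d id (lookup v env)
  App f args -> App f (map (subst env) args)
  Con c args -> Con c (map (subst env) args)
  Tup ds     -> Tup (map (subst env) ds)

-- Expand applications whose head names a macro of matching arity;
-- every other application is left as it is.
expand :: [(String, Macro)] -> Dis -> Dis
expand ms d = case d of
  Var _      -> d
  Con c args -> Con c (map (expand ms) args)
  Tup ds     -> Tup (map (expand ms) ds)
  App f args ->
    let args' = map (expand ms) args
    in case lookup f ms of
         Just (ps, body)
           | length ps == length args' -> expand ms (subst (zip ps args') body)
         _ -> App f args'

-- Model of: %dis this x y z = MkThis (int x) (float y, float z)
thisMacros :: [(String, Macro)]
thisMacros =
  [ ( "this"
    , ( ["x", "y", "z"]
      , Con "MkThis"
          [ App "int" [Var "x"]
          , Tup [App "float" [Var "y"], App "float" [Var "z"]]
          ]
      )
    )
  ]
```

In this model, expanding App "this" [Var "p", Var "q", Var "r"] against thisMacros yields the term for MkThis (int p) (float q, float r). Because int and float are not in the table, they are left unexpanded here, whereas Green Card would go on to expand them from its standard DIS prelude.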

9.3. User-defined DISs

Sometimes Green Card's primitive DISs (data constructors) are insufficiently expressive. For recursive types, such as lists, it is obviously no good to write a single data constructor.

Green Card therefore provides a ``trap door'' to allow a sufficiently brave programmer to write his or her own marshalling functions. For example:

data T = Zero | Succ T

%fun square :: T -> T
%call (t (int x))
%code r = square( x );
%result (t (int r))

Use of t requires that the programmer define two ordinary Haskell functions, marshall_t to convert from Haskell to C, and unmarshall_t to convert in the other direction. In this example, these functions would have the types:

marshall_t   :: T -> Int
unmarshall_t :: Int -> T

The functions must have precisely these names: ``marshall_'' followed by the name of the DIS, and similarly for unmarshalling. Notice that these marshalling functions have pure types (e.g. marshall_t has type T -> Int rather than T -> IO Int). Sometimes one wants to write a marshalling function that is internally stateful. For example, it might pack a String into a ByteArray, by allocating a MutableByteArray and filling it in with the characters one at a time. This can be done using runST, or even unsafePerformIO. (These are all GHC-centric comments; so far as Green Card is concerned it is simply up to the programmer to supply suitably-typed marshalling functions.)
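
A marshalling function that is internally stateful but pure from the outside might look like the sketch below. It is an illustration only: the DIS name packed is hypothetical, UArray from the array package stands in for GHC's ByteArray, and characters are assumed to fit in a single byte (anything above code point 255 would be truncated).

```haskell
import Data.Array.ST (newArray, writeArray, runSTUArray)
import Data.Array.Unboxed (UArray, elems)
import Data.Char (chr, ord)
import Data.Word (Word8)

-- Hypothetical user-defined DIS named packed, so the functions are
-- called marshall_packed and unmarshall_packed.
-- Allocates a mutable array in ST, fills it one character at a time,
-- and freezes it; the trailing slot stays 0 as a NUL terminator.
marshall_packed :: String -> UArray Int Word8
marshall_packed s = runSTUArray (do
  arr <- newArray (0, length s) 0
  mapM_ (\(i, c) -> writeArray arr i (fromIntegral (ord c))) (zip [0 ..] s)
  return arr)

-- Reads bytes up to (not including) the first NUL.
unmarshall_packed :: UArray Int Word8 -> String
unmarshall_packed = map (chr . fromIntegral) . takeWhile (/= 0) . elems
```

The mutation is confined to the ST computation, so marshall_packed keeps the pure type that a user-defined DIS expects.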

Green Card distinguishes user-defined DISs from DIS macros by omission: if there is a DIS macro definition for a DIS function f then Green Card treats f as a macro; otherwise it assumes f is a user-defined DIS and generates calls to marshall_f and/or unmarshall_f.

9.4. Marshalling complex structures

The full power of DIS macros becomes apparent when mapping between a structured Haskell type and a C struct. For example, to interface the Haskell ColourPoint type with the outside world:

data ColourPoint = CP Int Int Colour
data Colour = Red | Green | Blue | ... deriving ( Enum )

which we want to map onto the following C structure:

typedef struct CPoint {
   int x;
   int y;
   enum colour c;
} CPoint;

It requires just two DIS macros to capture the mapping between the two:

%dis colourPoint cp = 
%   declare {CPoint} cp in
%   CP (int {%cp.x}) (int {%cp.y}) (colour {%cp.c})

%dis colour c =
%   declare {enum colour} c in
%   < fromEnum / toEnum > (int c)

Using these, it is then very easy to implement the required interfaces to foreign functions that manipulate coloured points:

%fun translate :: Int -> Int -> ColourPoint -> IO ColourPoint
%call (int xrel) (int yrel) (colourPoint p)
%code p.x += xrel;
%     p.y += yrel;
%     render(&p);
%result (colourPoint {p})

Note that in this example, the return value is actually the same structure as the argument value (destructively updated.) It is for this reason that the p on the %result line is quoted as a C literal - this prevents the declare clause of the DIS macro from generating a second (overlapping) declaration of the variable in C.
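
The colour macro relies on the derived Enum instance to move between constructors and integers. The sketch below shows that mapping on the Haskell side. It assumes a three-constructor Colour (the original elides the remaining constructors), and colourToC and colourFromC are hypothetical helper names. The mapping only lines up with the C side if enum colour lists its constants in the same order starting from 0.

```haskell
-- Assumption: only three constructors; the text leaves the rest elided.
data Colour = Red | Green | Blue
  deriving (Eq, Show, Enum, Bounded)

-- Haskell -> C: the integer handed to the int DIS (what fromEnum gives).
colourToC :: Colour -> Int
colourToC = fromEnum

-- C -> Haskell: toEnum raises an error on out-of-range values, so this
-- hypothetical helper checks the range first and returns Nothing instead.
colourFromC :: Int -> Maybe Colour
colourFromC n
  | n >= lo && n <= hi = Just (toEnum n)
  | otherwise          = Nothing
  where
    lo = fromEnum (minBound :: Colour)
    hi = fromEnum (maxBound :: Colour)
```

Using plain toEnum, as the macro does, is fine as long as the C code never produces a value outside the enumeration.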

9.5. Semantics of DISs

How does Green Card use these DISs to convert between Haskell values and C values? We give an informal algorithm here, although most programmers should hopefully be able to manage without knowing the details.

To convert from Haskell values to C values, guided by a DIS, Green Card does the following:

First, Green Card rewrites all DIS function applications, replacing left hand side by right hand side.
Next, Green Card works from outside in, as follows:
- For a data constructor DIS (in either positional or record form), Green Card generates a Haskell case statement to take the value apart.
- For a user-defined DIS, Green Card calls the DIS's marshall function.
- For a type-cast-with-variable DIS, Green Card does no translation.
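
To make the outside-in step concrete, here is a hand-written approximation of the Haskell side for a %call of the form (MkThis (int p) (float q, float r)). It is a sketch of the general shape, not Green Card's actual output: cF is a hypothetical pure stand-in for the foreign routine, and the unboxing performed by the fully expanded int and float macros is omitted.

```haskell
data This = MkThis Int (Float, Float)

-- Hypothetical stand-in for the C routine; generated code would make
-- a foreign call here instead.
cF :: Int -> Float -> Float -> Float
cF p q r = fromIntegral p + q + r

-- Outside in: the constructor DIS becomes a case expression, the tuple
-- DIS a nested pattern, and each leaf variable is passed on unchanged.
callF :: This -> Float
callF v = case v of
  MkThis p (q, r) -> cF p q r
```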