This package implements a foreign function interface (FFI) for Tcl.
An FFI provides the ability for Tcl to invoke, with some limitations, arbitrary functions from any dynamically linked shared library without having to write C code.
Calling a C function from a shared library (or DLL; the two terms are used interchangeably here) using this package involves:
- Loading the library into the process by instantiating a ::cffi::Wrapper object.
- Defining commands that wrap the C functions of interest through method calls on the Wrapper object.
- Invoking the wrapped commands as any other Tcl commands.
This page provides an introduction to the package. Details are described in:
|Concepts||Describes CFFI functionality.|
|Cookbook||Provides recipes for converting C declarations to CFFI declarations.|
|::cffi||Reference documentation for commands in the ::cffi namespace.|
Source distributions and binaries for some platforms are available from https://sourceforge.net/projects/magicsplat/files/cffi.
Binary distributions can be extracted into any directory present in Tcl's auto_path variable.
For instructions on building and installing from source, see the BUILD.md file in the source distribution. The package can be built with one of two back end libraries that implement FFI functionality: libffi or dyncall.
The selection of the back end library is made at build time as described in the build instructions. Note that the dyncall back end has some limitations compared to libffi.
The following examples illustrate basic usage of the package. Most examples use Windows as it has standard system libraries in known locations. The usage is identical on other platforms.
After loading the package, the user32 system DLL, which contains the functions of interest, is loaded.
This creates a shared library object that can then be used for defining functions implemented within the library. It is recommended to always pass a full path to the shared library though that is not strictly required.
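The two steps above can be sketched as follows. This assumes a Windows system where the windir environment variable points at the Windows directory; the object name user32 is simply a convenient choice.

```tcl
# Run in a fresh tclsh on Windows with the cffi package installed.
package require cffi

# Load user32.dll by full path. The new object becomes the command ::user32.
cffi::Wrapper create user32 [file join $env(windir) system32 user32.dll]
```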
The next step is to define the prototype for the function to get a handle to the desktop window. In C, this would look like HWND GetDesktopWindow(void), where HWND is actually a C pointer typedef.
To call this function from Tcl we first have to define a wrapper for it. Wrappers can be defined with either the function or stdcall methods on the shared library object that implements the function. The two methods differ only on 32-bit Windows, where function is used to call functions that use the C calling convention while stdcall is used to call functions using the stdcall calling convention. On all other platforms, including 64-bit Windows, the two are identical. Since we are demonstrating using Windows, the stdcall method is used in some of the examples.
The prototype for the above Windows function can then be defined as:
The last argument is the parameter list, which in this case is empty. The pointer.HWND return value is a tagged pointer. As illustrated later, pointer tags help with type safety. The definition could just as well have typed it as pointer, which would be the equivalent of void* in C, but that would lose the ability to check types.
The C function can be called as any other Tcl command. Note how the returned pointer value is tagged as HWND.
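The definition and call described above can be sketched together as follows. This assumes the function in question is the Win32 GetDesktopWindow function; the variable name win is arbitrary.

```tcl
# Run in a fresh tclsh on Windows with the cffi package installed.
package require cffi
cffi::Wrapper create user32 [file join $env(windir) system32 user32.dll]

# Return type is a pointer tagged HWND; the parameter list is empty.
user32 stdcall GetDesktopWindow pointer.HWND {}

# The wrapper is now an ordinary Tcl command.
set win [GetDesktopWindow]
puts $win   ;# the printed form carries both the address and the tag
```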
To retrieve the dimensions of this window we need to call the GetWindowRect function from the same library. This has a slightly more complex prototype. In particular, the window dimensions are returned in a RECT structure whose address is passed in to the function. We thus need to first define a corresponding structure.
A structure is defined simply by a list of alternating field names and types, of which int is one of several numeric types available. The result is a Struct object which can be used in function declarations, memory allocations and other places.
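As a sketch, the Win32 RECT structure has four 32-bit integer fields (left, top, right, bottom), so a matching definition might look like this:

```tcl
package require cffi

# Field names alternate with field types, mirroring the layout of the C RECT.
cffi::Struct create RECT {
    left   int
    top    int
    right  int
    bottom int
}
```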
The GetWindowRect function is wrapped using this struct definition.
The parameter list for a function follows the same form as for structs, an alternating list of parameter names and type declarations.
Note how the type for the second parameter, rect, is specified. In general, type declarations consist of a type optionally followed by a list of annotations that serve multiple purposes. In this case struct.RECT specifies the data type while out is an annotation that indicates it is an output parameter for the function and thus to be passed by reference. At the script level, the parameter must then be a variable name, and not a value, into which the output can be stored.
The function is called as:
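A minimal sketch of the wrapper definition and the call, assuming the user32 wrapper, the RECT struct and the GetDesktopWindow wrapper are set up as described earlier. The variable name dimensions is arbitrary.

```tcl
# Run in a fresh tclsh on Windows with the cffi package installed.
package require cffi
cffi::Wrapper create user32 [file join $env(windir) system32 user32.dll]
cffi::Struct create RECT {left int top int right int bottom int}
user32 stdcall GetDesktopWindow pointer.HWND {}

# rect is an output parameter: the caller passes a variable name, not a value.
user32 stdcall GetWindowRect int {hwnd pointer.HWND rect {struct.RECT out}}

set win [GetDesktopWindow]
GetWindowRect $win dimensions

# The struct is delivered as a dictionary keyed by field name.
set width  [expr {[dict get $dimensions right] - [dict get $dimensions left]}]
set height [expr {[dict get $dimensions bottom] - [dict get $dimensions top]}]
puts "Desktop is $width x $height"
```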
As seen above, struct values are automatically decoded into dictionaries with keys corresponding to struct field names. We could have also selected to receive structs as binary values in their native form as shown further on.
The char * type in C is more often than not used for null-terminated strings. These may be dealt with in the same fashion as pointers at the script level as discussed in Pointers. However, in most cases it is more convenient to use the string and chars types. The difference between the two is that the former corresponds to a C char * pointer while the latter corresponds to a C char[N] array. When passed to a function, both are of course passed as char * pointers, but the latter is more convenient where the parameter corresponds to an output buffer of some maximum size. The former, on the other hand, is convenient for input parameters where the size is of no concern.
The following fragment illustrates their use.
In the above declaration the path argument is declared as an array whose size is given by another parameter, nchars, instead of an integer constant. This guards against a common error in calling C functions that expect a pointer and size pair of parameters, where a size larger than the declared array size may be mistakenly passed.
The above functions can then be simply invoked.
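The declarations and calls described above can be sketched as follows, assuming the functions are the Win32 GetCurrentDirectoryA and SetCurrentDirectoryA functions from kernel32.dll. The wrapper name kernel and the variable names are arbitrary.

```tcl
# Run in a fresh tclsh on Windows with the cffi package installed.
package require cffi
cffi::Wrapper create kernel [file join $env(windir) system32 kernel32.dll]

# path is an output array of chars whose size is taken from the nchars parameter.
kernel stdcall GetCurrentDirectoryA uint {nchars int path {chars[nchars] out}}
# path is an input string; no size needs to be declared.
kernel stdcall SetCurrentDirectoryA int {path string}

GetCurrentDirectoryA 512 cwd        ;# stores the current directory in cwd
SetCurrentDirectoryA $env(windir)   ;# returns non-zero on success
GetCurrentDirectoryA 512 newcwd
puts "Moved from $cwd to $newcwd"
SetCurrentDirectoryA $cwd           ;# restore the original directory
```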
In the above examples, the strings in all cases are encoded in the system encoding before being passed to the C function. Similarly, the output buffers returned by the C function are assumed to be in system encoding and translated into Tcl's internal form accordingly. Both string and chars can be tagged with an encoding name. For example, if the functions were defined using string.jis0208 or chars.jis0208, the JIS0208 encoding would be used to transform the strings in both directions.
The package has other related types that serve a similar purpose. The unistring and unichars types are analogous to string and chars except they assume the encoding is that of the Tcl_UniChar C type and have all sizes specified in Tcl_UniChar units as opposed to chars/bytes. This is primarily useful on Windows. The binary and bytes types are again analogous except they assume binary data (a Tcl byte array at the C level).
C functions indicate errors primarily through their return values. In some cases, the return value is the error code while in others it is only a boolean status indicator with detail being available either through another function such as GetLastError on Windows or a global variable such as errno.
Here is an attempt to change to a non-existing directory.
The return value of 0 here indicates failure. The caller must then specifically check for errors. Moreover, if the return value does not actually indicate the cause of the error, another call has to be made to GetLastError etc. This has multiple issues:
- first, the caller has to specifically check for errors,
- second, and more important, by the time a secondary call is made to retrieve errno etc., the original error is likely to have been overwritten.
To deal with the first, a return type can be annotated with conditions that specify expected return values. Since SetCurrentDirectory returns a non-zero value on success, the function return type may be annotated with nonzero. Our previous definition of SetCurrentDirectoryA can instead be written as
Passing a non-existing directory will then raise an error exception.
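A sketch of the annotated definition and its effect follows. The directory name is a placeholder assumed not to exist, and the exact wording of the error message is left to the package.

```tcl
# Run in a fresh tclsh on Windows with the cffi package installed.
package require cffi
cffi::Wrapper create kernel [file join $env(windir) system32 kernel32.dll]

# A zero return value is now treated as a failure and raised as a Tcl error.
kernel stdcall SetCurrentDirectoryA {int nonzero} {path string}

# Placeholder path, assumed not to exist on the system.
set failed [catch {SetCurrentDirectoryA C:/no/such/directory} message]
puts "failed=$failed message=$message"
```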
Note however, that the error message is generic and only indicates the function return value did not meet the expected success criterion. To fix this, the function definition can include an annotation for an error retrieval mechanism:
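One such mechanism on Windows is the lasterror annotation. The sketch below adds it to the return type; as before, the directory name is a placeholder assumed not to exist.

```tcl
# Run in a fresh tclsh on Windows with the cffi package installed.
package require cffi
cffi::Wrapper create kernel [file join $env(windir) system32 kernel32.dll]

# nonzero detects the failure; lasterror fetches the detail at the time of failure.
kernel stdcall SetCurrentDirectoryA {int nonzero lasterror} {path string}

set failed [catch {SetCurrentDirectoryA C:/no/such/directory} message]
puts $message   ;# now carries the system's description of the error
```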
With the inclusion of lasterror in the type declaration, the error message is much clearer. This also eliminates the second issue mentioned above with the error detail being lost before the function to retrieve it is called.
The lasterror annotation uses GetLastError as the retrieval mechanism and is specific to Windows. Other annotations for error retrieval are also available. For example, the errno annotation serves a similar purpose except it retrieves the error based on the errno facility, which is cross-platform and commonly applicable to both the C runtime and system calls on Unix.
For cases where the error is library-specific and not derived from the system, the onerror annotation can be used to customize the handling of error conditions. See the examples in the source repository.
When a function's return value is used primarily as a status indicator with the actual result being returned through an output parameter, it can be more convenient to return the output parameter value as the result of the wrapped command. For example, consider the earlier definition of GetCurrentDirectoryA. This was invoked by passing a buffer size and the name of a variable to receive the path.
The return value from the function, the number of characters, is not very useful at the Tcl level except to indicate errors when zero. We can instead have the wrapped command return the actual path as the return value by annotating the buffer with retval.
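A sketch of such a definition follows. The default size of 512 is an arbitrary choice for illustration.

```tcl
# Run in a fresh tclsh on Windows with the cffi package installed.
package require cffi
cffi::Wrapper create kernel [file join $env(windir) system32 kernel32.dll]

# nchars defaults to 512 when omitted; path becomes the command's result.
kernel stdcall GetCurrentDirectoryA {uint nonzero} {
    nchars {int {default 512}}
    path   {chars[nchars] retval}
}

# No buffer size and no output variable at the call site.
set cwd [GetCurrentDirectoryA]
puts $cwd
```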
The invocation now feels more natural. In addition to the retval annotation for the parameter, the nonzero error checking annotation was added to the function return value. An error checking annotation is required for use of retval. The default annotation is not required but is a convenience so we do not have to specify a buffer size at the call site.
Pointers are ubiquitous in C. They give C much of its power while also being the source of many bugs. In many cases, pointers can be avoided through the use of out parameters and binary types that use pointers under the covers. Many times though, this is not possible or desirable and raw access to the native storage of the data is needed. The pointer type provides this access while also attempting to guard against some common errors through multiple mechanisms:
- Pointers can be optionally tagged so a pointer to the wrong resource type is not inadvertently passed where a different one is expected.
- Pointers are marked as safe by default when returned from a function. Safe pointers are registered in an internal table which is checked whenever a pointer is accessed. It is sometimes necessary to bypass this check and the unsafe attribute is provided for the purpose.
- Null pointers passed as parameters or returned from functions will raise a Tcl exception by default. The nullok annotation provides for cases where NULL pointers can be legitimately used.
Note a pointer tag is not the same as a data type. For example, you may have a single C structure type XY containing two numerical fields. You can choose to tag pointers to the structure with two different tags, one for use as the co-ordinates of a point and another, Dimensions, for use as width and height dimensions. The two tags will be treated as different.
The examples below repeat the previous ones, but this time using pointers in place of structs and strings.
First, define the call to GetWindowRect using pointers. Note we are renaming the function as GetWindowRectAsPointer so as to distinguish it from our previous definition.
Since the pointer value itself is passed by value, notice that the rect parameter in the function definition was not marked as an out parameter and was passed as a value in the actual call itself.
Unlike the case with the struct type, memory to hold the structure now has to be explicitly allocated because we are passing raw pointers.
Finally, the structure contents can be extracted.
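The pointer-based steps above can be sketched as follows, assuming the user32 wrapper and RECT struct from the earlier examples. The allocate, fromnative and free methods of the struct object are used for storage management and decoding.

```tcl
# Run in a fresh tclsh on Windows with the cffi package installed.
package require cffi
cffi::Wrapper create user32 [file join $env(windir) system32 user32.dll]
cffi::Struct create RECT {left int top int right int bottom int}
user32 stdcall GetDesktopWindow pointer.HWND {}

# Same C function under a different Tcl name; rect is a plain tagged pointer.
user32 stdcall {GetWindowRect GetWindowRectAsPointer} int {
    hwnd pointer.HWND
    rect pointer.RECT
}

set prect [RECT allocate]                  ;# native storage for one RECT
GetWindowRectAsPointer [GetDesktopWindow] $prect
set dimensions [RECT fromnative $prect]    ;# decode native memory into a dictionary
RECT free $prect                           ;# release the storage
puts $dimensions
```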
It is not strictly necessary to define a structure at all. Below is yet another way to get dimensions without making use of the RECT struct and using direct memory allocation.
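A sketch of this approach follows. The size of 16 bytes assumes four 4-byte integers, and the use of binary scan with a native-order format to decode them is a choice made for this example.

```tcl
# Run in a fresh tclsh on Windows with the cffi package installed.
package require cffi
cffi::Wrapper create user32 [file join $env(windir) system32 user32.dll]
user32 stdcall GetDesktopWindow pointer.HWND {}
user32 stdcall {GetWindowRect GetWindowRectAsPointer} int {
    hwnd pointer.HWND
    rect pointer.RECT
}

# 16 bytes = four 4-byte ints (assumed layout). The pointer is tagged RECT.
set prect [cffi::memory allocate 16 RECT]
GetWindowRectAsPointer [GetDesktopWindow] $prect

# Copy the raw bytes into a Tcl binary string and decode four native-order ints.
binary scan [cffi::memory tobinary $prect 16] nnnn left top right bottom
cffi::memory free $prect
puts "left=$left top=$top right=$right bottom=$bottom"
```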
In the above fragment, cffi::memory allocate is used to allocate memory tagged as RECT. Note that this does not require that the struct RECT have been previously defined. The cffi::memory tobinary command is then used to convert the allocated memory content to a Tcl binary string.
The use of struct definitions is to be preferred to this raw memory access for convenience and safety reasons. Still, there are cases, for example variable length structures in memory, where this is required.
As an example of the protection against errors offered by pointer tags, here is an attempt to retrieve the window dimensions where the arguments are passed in the wrong order.
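A sketch of such a mistaken call follows, assuming the pointer-based GetWindowRectAsPointer wrapper described earlier. The call is wrapped in catch so that the error can be inspected; the message text is left to the package.

```tcl
# Run in a fresh tclsh on Windows with the cffi package installed.
package require cffi
cffi::Wrapper create user32 [file join $env(windir) system32 user32.dll]
user32 stdcall GetDesktopWindow pointer.HWND {}
user32 stdcall {GetWindowRect GetWindowRectAsPointer} int {
    hwnd pointer.HWND
    rect pointer.RECT
}

set win   [GetDesktopWindow]
set prect [cffi::memory allocate 16 RECT]

# Arguments swapped: a RECT pointer where an HWND is expected, and vice versa.
set failed [catch {GetWindowRectAsPointer $prect $win} message]
puts "failed=$failed message=$message"

cffi::memory free $prect
```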
While tags offer protection against type mismatches, another mechanism guards against invalid pointers and double frees. For example, attempting to free memory that we already freed above results in an error being raised.
Similarly, explicitly allocated struct storage needs to be freed. Attempting to free multiple times will raise an error. The first call below succeeds but the second fails.
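Both forms of double free can be sketched as follows. No shared library is needed for this example; the second free in each pair is wrapped in catch to capture the error.

```tcl
package require cffi

# Raw memory: the second free is rejected because the pointer is no longer registered.
set p [cffi::memory allocate 16 RECT]
cffi::memory free $p
set failed1 [catch {cffi::memory free $p} message1]

# Struct storage behaves the same way.
cffi::Struct create RECT {left int top int right int bottom int}
set prect [RECT allocate]
RECT free $prect
set failed2 [catch {RECT free $prect} message2]

puts "raw: $failed1, struct: $failed2"
```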
Needless to say, these protection mechanisms are far from foolproof.
The above mechanism for detecting invalid pointers can clearly work for memory allocated through the package where it can be internally tracked. But what about allocations done through calls to loaded shared libraries? For this, the package treats any pointer returned from a function or through an output parameter as a valid pointer and registers it internally. Any pointers that are passed as input to a function are by default checked to ensure they were previously registered and an error raised otherwise. The question then remains as to how and when the pointer is marked as invalid. The dispose type attribute is provided for this purpose.
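The following Windows calls to allocate heaps illustrate the use. The C functions are prototyped as HANDLE HeapCreate(DWORD flOptions, SIZE_T dwInitialSize, SIZE_T dwMaximumSize) and BOOL HeapDestroy(HANDLE hHeap).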
The above C prototypes use Windows type definitions. Rather than translate them into appropriate (32- or 64-bit) native C types, the ::cffi::alias define command, detailed later, allows platform-specific type definitions to be predefined.
On Windows, this will define common types like DWORD etc. so the above functions can be wrapped using those type names.
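A sketch of the wrapped heap functions follows. It assumes that cffi::alias load win32 provides the DWORD and SIZE_T aliases; the HEAP tag, the parameter names and the heap sizes are arbitrary choices for this example.

```tcl
# Run in a fresh tclsh on Windows with the cffi package installed.
package require cffi
cffi::Wrapper create kernel [file join $env(windir) system32 kernel32.dll]

# Assumed to define DWORD and SIZE_T, among other Windows typedefs.
cffi::alias load win32

kernel stdcall HeapCreate pointer.HEAP {opts DWORD initSize SIZE_T maxSize SIZE_T}
# dispose: the pointer is unregistered once it has been passed to HeapDestroy.
kernel stdcall HeapDestroy {int nonzero lasterror} {heap {pointer.HEAP dispose}}

set heap [HeapCreate 0 1024 1024]
HeapDestroy $heap

# The pointer is no longer registered, so this is rejected before reaching C.
set failed [catch {HeapDestroy $heap} message]
puts "failed=$failed message=$message"
```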
The pointer value returned from HeapCreate is by default registered as a valid pointer. When the pointer is passed to HeapDestroy, it is first validated. The presence of the dispose attribute will then remove its registration, causing any further attempts to use it, for example even to free it, to fail.
Here is another example, this time from Linux. Note the tag SOMETHING, chosen to reinforce that tags have no semantics in terms of the actual type that the pointer references. A tag of FILE would of course have been more reflective of the referenced type, but this is not mandated.
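A sketch of such a Linux example follows, assuming the C library is available as libc.so.6 and wrapping the standard fopen and fclose functions. The file /dev/null is used only because it is known to exist.

```tcl
# Run in a fresh tclsh on Linux with the cffi package installed.
package require cffi
cffi::Wrapper create libc libc.so.6

# A NULL return raises an error; errno supplies the detail for the message.
libc function fopen {pointer.SOMETHING errno} {path string mode string}
# dispose: the pointer is unregistered once it has been passed to fclose.
libc function fclose int {file {pointer.SOMETHING dispose}}

set fp [fopen /dev/null r]
fclose $fp

# Closing again is rejected because the pointer is no longer registered.
set failed [catch {fclose $fp} message]
puts "failed=$failed message=$message"
```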
As an aside, note the use of the errno annotation above. An attempt to open a non-existent file for reading will return a NULL pointer which by default causes cffi to raise an exception. The errno annotation retrieves the system error for a more readable error message.
There are a couple of situations where the pointer registration mechanism is a hindrance.
- One is when the pointer is acquired by some means other than through a call made through this package.
- Another case is when the pointer is to a resource that is reference counted and whose acquisition may return the same pointer value multiple times.
To deal with the first case, a pointer return type or parameter may have the unsafe attribute. This will result in bypassing of any pointer registration or checks. So for example, if the heap functions had been defined with the unsafe attribute, multiple calls could be made to HeapDestroy. NOTE: Do not actually try it, as your shell will crash because Windows does not itself check pointer validity for HeapDestroy.
Note that unsafe pointers only bypass registration checks; the pointer tags are still verified if present.
The second case has to do with API calls like LoadLibrary. When multiple calls are made to LoadLibraryA, it returns the same pointer while keeping an internal reference count. The expectation is for the application to make the same number of calls to FreeLibrary. However, if FreeLibrary is wrapped with the dispose attribute on its parameter, a sequence of two LoadLibraryA calls followed by two FreeLibrary calls fails because the first call to FreeLibrary unregisters the pointer, resulting in failure of the second call to FreeLibrary. The counted attribute is provided for this use case. When specified, the pointer is still registered but permits multiple registrations and a corresponding number of disposals. To illustrate, the return type of LoadLibraryA can be declared with the counted attribute.
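A sketch of the counted form follows. The HMODULE tag and the choice of advapi32.dll as the library to load are arbitrary for this example.

```tcl
# Run in a fresh tclsh on Windows with the cffi package installed.
package require cffi
cffi::Wrapper create kernel [file join $env(windir) system32 kernel32.dll]

# counted: the same pointer value may be registered more than once.
kernel stdcall LoadLibraryA {pointer.HMODULE counted} {path string}
kernel stdcall FreeLibrary {int nonzero lasterror} {lib {pointer.HMODULE dispose}}

set lib1 [LoadLibraryA advapi32.dll]
set lib2 [LoadLibraryA advapi32.dll]   ;# same pointer value, second registration

FreeLibrary $lib1
FreeLibrary $lib2

# A third disposal has no matching registration and is rejected.
set failed [catch {FreeLibrary $lib2} message]
puts "failed=$failed message=$message"
```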
Now multiple registrations are allowed which stay valid until a corresponding number of disposals are done. An additional attempt to free will fail as desired.
Type aliases are a convenience feature to avoid repetition and improve readability. They may be added with the ::cffi::alias define command. As an example, consider our previous prototype for the SetCurrentDirectoryA function, whose return type was declared as int with the nonzero and lasterror annotations.
This return type declaration is very common for Windows API calls. Instead of repeating this triple for every such call, a new type can be defined for the purpose and used in all prototypes.
This facility is also useful for abstracting platform differences. For example, many Windows allocation functions use the C typedef SIZE_T, which translates to either a 64-bit or 32-bit C integer type depending on whether the program was built for 64- or 32-bit Windows. Instead of defining separate prototypes for every function using the type, a single type definition can be used.
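Both uses of ::cffi::alias define can be sketched as follows. The alias names STATUS and MY_SIZE_T are hypothetical, chosen for this example so as not to clash with any predefined aliases, and tcl_platform(pointerSize) is used here as the test for a 64-bit build.

```tcl
# Run in a fresh tclsh on Windows with the cffi package installed.
package require cffi

# One name for the common "non-zero on success, detail via GetLastError" return type.
cffi::alias define STATUS {int nonzero lasterror}

# One name for a pointer-sized unsigned integer, resolved once per platform.
if {$tcl_platform(pointerSize) == 8} {
    cffi::alias define MY_SIZE_T ulonglong
} else {
    cffi::alias define MY_SIZE_T uint
}

# Prototypes can then use the aliases in place of the full declarations.
cffi::Wrapper create kernel [file join $env(windir) system32 kernel32.dll]
kernel stdcall SetCurrentDirectoryA STATUS {path string}
```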
As a further convenience, the ::cffi::alias load command defines commonly useful typedefs including cross-platform ones such as size_t as well as platform-specific ones such as DWORD etc. on Windows.
There are times when an application is forced to drop down to raw C pointer and memory manipulation. The ::cffi::memory and ::cffi::pointer command ensembles implement functionality useful for this purpose. See the reference documentation for a description.
The package includes a help command ensemble that is useful during interactive development. The ::cffi::help function command describes the syntax for a command wrapping a C function.
The ::cffi::help functions command lists wrapped functions matching a pattern.
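A sketch of both help commands follows, using a SetCurrentDirectoryA wrapper as the subject. It assumes the commands return their descriptions as strings, which are printed here with puts.

```tcl
# Run in a fresh tclsh on Windows with the cffi package installed.
package require cffi
cffi::Wrapper create kernel [file join $env(windir) system32 kernel32.dll]
kernel stdcall SetCurrentDirectoryA {int nonzero lasterror} {path string}

# Syntax of a single wrapped command.
puts [cffi::help function SetCurrentDirectoryA]

# Wrapped commands whose names match a glob pattern.
puts [cffi::help functions Set*]
```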
The dyncall back end has the following limitations:
- Structs can only be passed by reference, not by value.
- Callbacks into Tcl from C functions are not implemented.
- Functions with a variable number of arguments are not implemented.