Getting startedTop, Main, Index

This package implements a foreign function interface (FFI) for Tcl based on the cross-platform C library, dyncall from Daniel Adler and Tassilo Philipp.

OverviewTop, Main, Index

A FFI provides the ability for Tcl to invoke, with some limitations, arbitrary functions from any dynamically linked shared library without having to write C code.

Calling a C function from a shared library (or DLL, using the term interchangeably) using cffi involves

  • First loading the library into the process by instantiating a ::cffi::dyncall::Library object.
  • Then defining function prototypes for the C functions of interest in that library through method calls on that object.

The created commands thus created can then be invoked as any other Tcl commands.

This page provides an introduction to the package. Details are described in

  • The Concepts page, which describes how C level constructs are translated to the Tcl script level; for example, type definitions, memory management, pointers etc.
  • The ::cffi page which contains reference documentation for the above
  • The ::cffi::dyncall page which contains reference documentation for loading shared libraries using the dyncall library and declaring function prototypes.
  • Some tutorial material is available at magicsplat and the samples directory.

DownloadsTop, Main, Index

Source distributions and binaries for some platforms are available from https://sourceforge.net/projects/magicsplat/files/cffi.

InstallationTop, Main, Index

Binary distributions can be extracted into any directory present in Tcl's auto_path variable.

To install from source, please see the BUILD.md file in the source distribution.

Quick startTop, Main, Index

The following examples illustrate basic usage of the package. Most examples use Windows as they have standard system libraries in known locations. The usage is identical on other platforms.

Basic callsTop, Main, Index

After loading the package, the user32 system DLL which contains the functions of interest is loaded.

% package require cffi
1.0a7
% cffi::dyncall::Library create user32 [file join $env(windir) system32 user32.dll]
::user32

This creates a shared library object that can then be used for defining functions implemented within the library. It is recommended to always pass a full path to the shared library though that is not strictly required on Windows.

The next step is to define the prototype for the function to get a handle to the desktop window. In C, this would look like

HWND GetDesktopWindow()

where HWND is actually a C pointer typedef.

To call this function from Tcl we first have to define a prototype for it. Prototypes can be defined with either the function or stdcall methods on the shared library object that implements the function. The two commands are identical on all platforms except 32-bit Windows where function is used to call functions that use the C calling convention while stdcall is used to call functions using the stdcall calling convention. On all other platforms, including 64-bit Windows, the two are identical. Since we are demonstrating using Windows, the stdcall command is used in some of the examples.

The prototype for the above Windows function can then be defined as:

% user32 stdcall GetDesktopWindow pointer.HWND {}

The last argument is the parameter list which in this case is empty. The pointer.HWND return value is a tagged pointer. As illustrated later, pointer tags help in type safety. The definition could have as well just typed it as pointer which would be the equivalent of void* in C but that would lose the ability to check types.

The C function can be called as any other Tcl command.

% set win [GetDesktopWindow]
0x0000000000010010^HWND

Note how the return pointer value is tagged as HWND.

StructuresTop, Main, Index

To retrieve the dimensions of this window we need to call the GetWindowRect function from the same library. This has a slightly more complex prototype.

int GetWindowRect(HWND hWnd, RECT *lpRect);

In particular, the window dimensions are returned in a RECT structure whose address is passed in to the function. We thus need to first define a corresponding structure.

% cffi::Struct create RECT {left int top int right int bottom int}
::RECT

A structure is defined simply by a list of alternating field names and types of which int is one of several numeric types available. The result is a Struct object which can be used in function declarations, memory allocations and other places.

The GetWindowRect is defined using this struct definition.

% user32 stdcall GetWindowRect int {hwnd pointer.HWND rect {struct.RECT out}}

The parameter list for a function follows the same form as for structs, an alternating list of parameter names and type declarations. The name of the parameter is irrelevant for invocation but requiring parameter names makes for more informative error messages.

Note how the type for the second parameter, rect, is specified. In general, type declarations consist of a type optionally followed by a list of annotations that serve multiple purposes as we will see below. In this case struct.RECT specifies the data type while out is an annotation that indicates it is an output parameter for the function and thus to be passed by reference. At the script level, the parameter must then be a variable name, and not a value, into which the output can be stored.

The function is called as:

% GetWindowRect $win dimensions
1
% set dimensions
left 0 top 0 right 1920 bottom 1080

As seen above, struct values are automatically decoded into dictionaries with keys corresponding to struct field names. As we see later, we could have also selected to receive structs as binary values in their native form.

Strings and binariesTop, Main, Index

The char * type in C is more often than not used for null-terminated strings. These may be dealt with in the same fashion as pointers at the script level as discussed in Pointers further on. However, in most cases it is more convenient to use the string and chars types. The difference between the two is that the former corresponds to a C char * pointer while the latter corresponds to a C char[N] array. When passed to a function, of course both are passed as char * pointers but the latter is more convenient for output buffers where the parameter corresponds to a output buffer of some maximum size. The former on the other hand is convenient for input parameters where the size is of no concern.

The following fragment illustrates their use.

% cffi::dyncall::Library create kernel kernel32.dll
::kernel
% kernel stdcalls {
    GetCurrentDirectoryA uint {nchars int path {chars[nchars] out}}
    SetCurrentDirectoryA uint {path string}
}

In the above declaration the path argument is declared as an array whose size is given by another parameter nchars instead of an integer constant. This makes for fewer errors in calling C functions which expect a size and pointer combination of parameters where a larger size than the declared array size may be mistakenly passed.

As an aside, the stdcalls method above to define multiple functions in a single call. There is also a corresponding functions method.

The above functions can then be simply invoked.

% GetCurrentDirectoryA 512 buffer
40
% set buffer
D:\src\tcl-cffi\win\Release_AMD64_VC1916
% SetCurrentDirectoryA d:/temp
1
% pwd
D:/Temp
% SetCurrentDirectoryA $buffer
1
% pwd
D:/src/tcl-cffi/win/Release_AMD64_VC1916 

In the above examples, the strings in all cases are encoded in the system encoding before being passed to the C function. Similarly, the output buffers returned by the C function are assumed to be in system encoding and translated into Tcl's internal form accordingly. Both string and chars can be tagged with an encoding name. For example, if the functions were defined using string.jis0208 or chars[512].jis0208, the JIS0208 encoding would be used to transform the strings in both directions.

The package has other related types that server a similar purpose.

  • The unistring and unichars type are analogous to string and chars except they assume the encoding is that of the Tcl_UniChar C type and have all sizes specified in Tcl_UniChar units as opposed to chars/bytes. This is primarily useful on Windows.
  • The binary and bytes types are again analogous except they assume binary data (Tcl byte array at the C level).

Error checkingTop, Main, Index

C functions indicate errors primarily through their return values. In some cases, the return value is the error code while in others it is only a boolean status indicator with detail being available either through another function such as GetLastError on Windows or a global variable such as errno.

Here is an attempt to change to a non-existing directory.

% SetCurrentDirectoryA "nosuchdir"
0

The return value of 0 here indicates failure. The caller must then specifically check for errors. Moreover, if the return value does not actually indicate cause of error, another call has to be made to GetLastError etc.. This has multiple issues:

  • first, caller has to specifically check for errors,
  • second, and more important, by the time a secondary call is made to retrieve errno etc. the originally error is likely to have been overwritten.

To deal with the first, a return type can be annotated with conditions that specify expected return values. Since SetCurrentDirectory returns a non-zero value on success, the function return type may be annotated with nonzero:

% kernel stdcall SetCurrentDirectoryA {int nonzero} {path string}

Passing a non-existing directory will then raise an error exception.

% SetCurrentDirectoryA "nosuchdir"
Invalid value "0". Function returned zero.

Note however, that the error message is generic and only indicates the function return value did not meet the expected success criterion. To fix this, the function definition can include an annotation for an error retrieval mechanism:

% kernel stdcall SetCurrentDirectoryA {int nonzero lasterror} {path string}
% SetCurrentDirectoryA "nosuchdir"
The system cannot find the file specified.
% set errorCode
cffi WINERROR {The system cannot find the file specified. }

With the inclusion lasterror in the type declaration, the error message is much clearer. This also eliminates the second issue mentioned above with the error detail being lost before the call to retrieve it is made.

While lasterror implies GetLastError as the retrieval mechanism and is specific to Windows, other annotation for error retrieval are also available. For example, the errno annotation serves a similar purpose except it retrieves the error based on the errno facility which is cross-platform and commonly applicable to both the C runtime as well as system calls on Unix.

For cases where the error is library-specific and not derived from the system, the onerror annotation can be used to customize the handling of error conditions.

Pointers and memoryTop, Main, Index

Pointers are ubiquitous in C. They give C much of its power while also being the source of many bugs. In many cases, pointers can be avoided through the use of the out parameters and struct, string and binary types that use pointers under the covers. Many times though, this is not possible or desirable and raw access to the native storage of the data is needed. The pointer type provides this access while also attempting to guard against some common errors through multiple mechanisms:

  • pointers can be optionally tagged so a pointer to the wrong resource type is not inadvertently passed where a different one is expected.
  • Pointers are marked as safe by default when returned from a function. Safe pointers are registered in an internal table which is checked whenever a pointer is accessed. It is sometimes necessary to bypass this check and the unsafe attribute is provided for the purpose.

Note a pointer tag is not the same as a data type. For example, you may have a single C structure type XY containing two numerical fields. You can choose to tag pointers to the structure with two different tags, Point and Dimensions depending on whether it is used as co-ordinates of a point or as width/height dimensions. The two tags will be treated as different.

The examples below repeat the previous ones, but this time using pointers in place of structs and strings.

First, define the call to GetWindowRect using pointers. Note we are renaming the function as GetWindowRectAsPointer so distinguish from our previous definition.

% user32 stdcall {GetWindowRect GetWindowRectAsPointer} int {hwnd pointer.HWND rect pointer.RECT}

Since the pointer value itself is passed by value, notice the rect parameter in the function definition was not marked as an out parameter and was passed as a value in the actual call itself.

Unlike the case with the struct type, memory to hold the structure has now to be explicitly allocated because we are passing raw pointers.

% set prect [RECT allocate]
0x00000274be29a510^RECT
% GetWindowRectAsPointer $win $prect
1

Finally, the structure contents can be extracted.

% RECT fromnative $prect
left 0 top 0 right 1920 bottom 1080

It is not even strictly necessary to even define a structure at all. Below is yet another way to get dimensions without making use of the RECT struct and using direct memory allocation.

% set praw [cffi::memory allocate 16 RECT]
0x000001E53EAB8F90^RECT
% GetWindowRectAsPointer $win $praw
1
% binary scan [cffi::memory tobinary $praw 16] iiii left top right bottom
4
% puts [list $left $top $right $bottom]
0 0 1920 1080
% cffi::memory free $praw

In the above fragment, cffi::memory allocate is used to allocate memory tagged as RECT. Note that this does not require that the struct RECT have been previously defined. The cffi::memory tobinary command is then used to convert the allocated memory content to a Tcl binary string.

Needless to say, the use of struct definitions is to be preferred to this raw memory access for convenience and safety reasons. Still, there are cases, for example variable length structures in memory, where this is required.

As an example of the protection against errors offered by pointer tags, here is an attempt to retrieve the window dimensions where the arguments are passed in the wrong order.

% GetWindowRect $prect $win
Value "0x000001E53EA3BEB0^RECT" has the wrong type. Pointer type mismatch.

While tags offer protection against type mismatches, another mechanism guards against invalid pointers and double frees. For example, attempting to free memory that we already freed above results in an error being raised.

% cffi::memory free $praw
Pointer 000001E53EAB8F90^ is not registered.

Similarly, explicitly allocated struct storage needs to be freed. Attempting to free multiple times will raise an error. The first call below succeeds but the second fails.

% RECT free $prect
% RECT free $prect
Pointer 000001E53EA3BEB0^ of type RECT is not registered.

Needless to say, these protection mechanisms are far from foolproof.

The above mechanism for detecting invalid pointers can clearly work for memory allocated through the package where it can be internally tracked. But what about allocations done through calls to loaded shared libraries? For this, the package treats any pointer returned from a function or through an output parameter as a valid pointer and registers it internally. Any pointers that are passed as input to a function are by default checked to ensure they were previously registered and an error raised otherwise. The question then remains as to how and when the pointer is marked as invalid. The dispose type attribute is provided for this purpose.

The following Windows calls to allocate heaps illustrate the use. The C functions are prototyped as

HANDLE HeapCreate(DWORD  flOptions, SIZE_T dwInitialSize, SIZE_T dwMaximumSize);
BOOL HeapDestroy(HANDLE hHeap);

The above C prototypes use Windows type definitions. Rather than translate them into appropriate (32- or 64-bit) native C types, the cffi::alias command, detailed later, allows platform-specific type definitions to be predefined.

cffi::alias load win32

On Windows, this will define common types like DWORD etc. so the functions can be defined as

kernel stdcall HeapCreate pointer.HEAP {opts DWORD initSize SIZE_T maxSize SIZE_T}
kernel stdcall HeapDestroy {BOOL nonzero lasterror} {heapPtr {pointer.HEAP dispose}}

The pointer value returned from HeapCreate is by default registered as a valid pointer. When the pointer is passed to HeapDestroy, it is first validated. The presence of the dispose attribute will then remove its registration causing any further attempts to use it, for example even to free it, to fail.

% set p [HeapCreate 0 1024 1024]
0x00000274c0b50000^HEAP
% HeapDestroy $p
1
% HeapDestroy $p
Pointer 0x00000274c0b50000^HEAP is not registered.

Here is another example, this time from Linux. Note the tag SOMETHING, chosen to reinforce that tags have no semantics in terms of the actual type that the pointer references. A tag of FILE would have of course been more reflective of the referenced type, but this is not mandated.

% cffi::dyncall::Library create myprocess
::myprocess
% myprocess function fopen {pointer.SOMETHING nonzero errno} {path string mode string}
% myprocess function fclose int {file {pointer.SOMETHING dispose}}
% set fileptr [fopen test.txt w]
0x55a273bf32e0^SOMETHING
% fclose $fileptr
0
% fclose $fileptr
Pointer 0x55a273bf32e0^SOMETHING is not registered.

There are a couple of situations where this pointer registration mechanism is a hindrance.

  • One is when the pointer is acquired by some means other than through a call made through this package.
  • Another case is when the pointer is to a resource that is reference counted and whose acquisition may return the same pointer value multiple times.

To deal with the first case, a pointer return type or parameter may have the unsafe attribute. This will result in bypassing of any pointer registration or checks. So for example, if the heap functions had been defined as below, multiple calls could be made to HeapDestroy. NOTE: Do not actually try it as your shell will crash as Windows does not itself check pointer validity for HeapDestroy.

kernel stdcall HeapCreate {pointer.HEAP unsafe} {opts DWORD initSize SIZE_T maxSize SIZE_T}
kernel stdcall HeapDestroy {BOOL nonzero lasterror} {heapPtr {pointer.HEAP unsafe}}

Note that unsafe pointers only bypass registration checks; the pointer tags are still verified.

The second case has to do with API calls like LoadLibrary which we can prototype as

kernel stdcall LoadLibraryA pointer.HMODULE {path string}
kernel stdcall FreeLibrary BOOL {libhandle {pointer.HMODULE dispose}}

When multiple calls are made to this function, it returns the same pointer while keeping an internal reference count. However, registered pointers are expected to be unique by default and the following sequence of calls fails.

% set libptr [LoadLibraryA advapi32.dll]
0x00007FFDD0000000^HMODULE
% set libptr [LoadLibraryA advapi32.dll]
Registered pointer already exists. 

The registered pointer needs to be freed first with FreeLibrary before another call to load the same library is made. This artificial limitation on the use of LoadLibrary can be worked around using the unsafe attribute discussed earlier. However, that would mean there could be more calls made to FreeLibrary than to LoadLibraryA resulting in whatever Windows decided was an appropriate failure mode, most likely random crashes.

The counted attribute is provided for this use case. When specified, the pointer is still registered but permits multiple registrations and a corresponding number of disposals. To illustrate,

kernel stdcall LoadLibraryA {pointer.HMODULE counted} {path string}
kernel stdcall FreeLibrary BOOL {libhandle {pointer.HMODULE dispose}}

Now multiple registrations are allowed which stay valid until a corresponding number of disposals are done.

% set libptr [LoadLibraryA advapi32.dll]
0x00007FFDD0000000^HMODULE
% set libptr2 [LoadLibraryA advapi32.dll]
0x00007FFDD0000000^HMODULE
% FreeLibrary $libptr
1
% FreeLibrary $libptr2
1
% FreeLibrary $libptr
Pointer 00007FFDD0000000 of type HMODULE is not registered.

Type aliasesTop, Main, Index

Type aliases are a convenience feature to avoid repetition and improve readability. They may be added with the ::cffi::alias define command. As an example, consider our previous prototype for the SetCurrentDirectoryA function.

kernel stdcall SetCurrentDirectoryA {int nonzero lasterror} {path string}

This return type declaration is very common for Windows API calls. Instead of repeating this triple for every such call, a new type can be defined for the purpose and used in all prototypes.

cffi::alias define LASTERROR {int nonzero lasterror}
kernel stdcall SetCurrentDirectoryA LASTERROR {path string}

This facility is also useful for abstracting platform differences. For example, many windows allocation functions use the C typedef SIZE_T which translates to either a 64-bit or 32-bit C integer type depending on whether the program was built for 32- or 64-bit Windows. Instead of defining separate prototypes for every function using the type, a single type definition can be used.

if {$tcl_platform(pointerSize) == 8} {
    cffi::type SIZE_T ulonglong
} else {
    cffi::type SIZE_T ulonglong
}
kernel stdcalls {
    HeapCreate pointer.HEAP {opts DWORD initSize SIZE_T maxSize SIZE_T}
    HeapDestroy LASTERROR {heapPtr {pointer.HEAP dispose}}
    HeapAlloc {pointer nonzero} {heapPtr pointer.HEAP flags uint size SIZE_T}
}

As a further convenience, the ::cffi::alias load command defines commonly useful typedefs including cross-platform ones such as size_t as well as platform-specific ones such as HANDLE, DWORD etc. on Windows.

LimitationsTop, Main, Index

The current version has the following limitations:

  • Arrays of non-numeric types are not implemented.
  • Structs cannot be passed or returned by value, only by reference.
  • Callbacks into Tcl from C functions are not implemented.
  • C functions with a variable number of arguments are not implemented.
Document generated by Ruff!