Getting started
This package implements a foreign function interface (FFI) for Tcl.
Overview
An FFI provides the ability for Tcl to invoke, with some limitations, arbitrary functions from any dynamically linked shared library without having to write C code.
Calling a C function from a shared library (or DLL; the two terms are used interchangeably here) using cffi involves:
- Loading the library into the process by instantiating a ::cffi::Wrapper object.
- Defining commands that wrap the C functions of interest through method calls on the Wrapper object.
- Invoking the wrapped commands as any other Tcl commands.
This page provides an introduction to the package. Details are described in the following pages:
- A description of CFFI functionality.
- Recipes for converting C declarations to CFFI declarations.
- Reference documentation for commands in the ::cffi namespace.
Some tutorial material is available at magicsplat and the examples directory in the repository.
Downloads
Source distributions and binaries for some platforms are available from https://sourceforge.net/projects/magicsplat/files/cffi.
Installation
Binary distributions can be extracted into any directory present in Tcl's auto_path variable.
For instructions on building and installing from source, see the BUILD.md file in the source distribution. The package can be built with one of two back end libraries that implement FFI functionality:
- libffi
- dyncall
The selection of the back end library to use is made at build time as described in the build instructions. Also see Limitations.
Quick start
The following examples illustrate basic usage of the package. Most examples use Windows because it has standard system libraries in known locations. The usage is identical on other platforms.
Basic calls
After loading the package, the user32 system DLL, which contains the functions of interest, is loaded.
This creates a shared library object that can then be used for defining functions implemented within the library. It is recommended to always pass a full path to the shared library though that is not strictly required.
The next step is to define the prototype for the function to get a handle to the desktop window. In C, this would look like HWND GetDesktopWindow(void), where HWND is actually a C pointer typedef.
To call this function from Tcl we first have to define a wrapper for it. Wrappers can be defined with either the function or stdcall methods on the shared library object that implements the function. The two methods differ only on 32-bit Windows, where function is used to call functions that use the C calling convention while stdcall is used to call functions using the stdcall calling convention. On all other platforms, including 64-bit Windows, the two are identical. Since we are demonstrating using Windows, the stdcall method is used in some of the examples.
The prototype for the above Windows function can then be defined as:
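A minimal sketch of such a definition, assuming a Windows system where user32.dll lives in the system32 directory under %windir%. The wrapper object name user32 is simply the name chosen here.

```tcl
package require cffi

# Load user32.dll by full path. "user32" is the name given to the
# wrapper object; any valid command name would do.
cffi::Wrapper create user32 [file join $env(windir) system32 user32.dll]

# C prototype: HWND GetDesktopWindow(void);
# Arguments: C function name, return type, parameter list (empty here).
user32 stdcall GetDesktopWindow pointer.HWND {}
```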
The last argument is the parameter list which in this case is empty. The pointer.HWND return value is a tagged pointer. As illustrated later, pointer tags help in type safety. The definition could just as well have typed it as pointer, which would be the equivalent of void* in C, but that would lose the ability to check types.
The C function can be called as any other Tcl command.
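As a sketch, again assuming Windows and the user32 wrapper name used for illustration, the call looks like this:

```tcl
package require cffi
cffi::Wrapper create user32 [file join $env(windir) system32 user32.dll]
user32 stdcall GetDesktopWindow pointer.HWND {}

# The wrapper is an ordinary Tcl command taking no arguments.
set win [GetDesktopWindow]

# The printed value is an address with the HWND tag attached; the
# actual address differs from system to system.
puts $win
```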
Note how the return pointer value is tagged as HWND.
Structures
To retrieve the dimensions of this window we need to call the GetWindowRect function from the same library. This has a slightly more complex prototype, BOOL GetWindowRect(HWND hWnd, LPRECT lpRect). In particular, the window dimensions are returned in a RECT structure whose address is passed in to the function. We thus need to first define a corresponding structure.
A structure is defined simply by a list of alternating field names and types, of which int is one of several numeric types available. The result is a Struct object which can be used in function declarations, memory allocations and other places.
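A sketch of the RECT definition, assuming the Windows SDK layout of four 32-bit LONG fields:

```tcl
package require cffi

# Field names alternate with field types, mirroring the C declaration
#   typedef struct { LONG left, top, right, bottom; } RECT;
cffi::Struct create RECT {
    left   int
    top    int
    right  int
    bottom int
}
```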
The GetWindowRect function is wrapped using this struct definition.
The parameter list for a function follows the same form as for structs, an alternating list of parameter names and type declarations.
Note how the type for the second parameter, rect, is specified. In general, type declarations consist of a type optionally followed by a list of annotations that serve multiple purposes. In this case struct.RECT specifies the data type while out is an annotation that indicates it is an output parameter for the function and thus to be passed by reference. At the script level, the parameter must then be a variable name, and not a value, into which the output can be stored.
The function is called as:
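The following sketch puts the definition and the call together, assuming Windows and the illustrative wrapper name user32. The variable name dimensions is arbitrary.

```tcl
package require cffi
cffi::Wrapper create user32 [file join $env(windir) system32 user32.dll]
cffi::Struct create RECT {left int top int right int bottom int}
user32 stdcall GetDesktopWindow pointer.HWND {}

# The second parameter is an output struct, so the caller passes the
# name of a variable rather than a value.
user32 stdcall GetWindowRect int {
    hwnd pointer.HWND
    rect {struct.RECT out}
}

set win [GetDesktopWindow]
GetWindowRect $win dimensions

# dimensions now holds a dictionary keyed by the RECT field names.
puts $dimensions
puts "width: [expr {[dict get $dimensions right] - [dict get $dimensions left]}]"
```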
As seen above, struct values are automatically decoded into dictionaries with keys corresponding to struct field names. We could also have chosen to receive structs as binary values in their native form as shown further on.
Strings and binaries
The char * type in C is more often than not used for null-terminated strings. These may be dealt with in the same fashion as pointers at the script level as discussed in Pointers. However, in most cases it is more convenient to use the string and chars types. The difference between the two is that the former corresponds to a C char * pointer while the latter corresponds to a C char[N] array. When passed to a function, both are of course passed as char * pointers, but the latter is more convenient for output parameters that correspond to a buffer of some maximum size. The former on the other hand is convenient for input parameters where the size is of no concern.
The following fragment illustrates their use.
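A sketch using the kernel32 functions GetCurrentDirectoryA and SetCurrentDirectoryA, assuming Windows. The wrapper name kernel and the parameter names are choices made for illustration.

```tcl
package require cffi
cffi::Wrapper create kernel [file join $env(windir) system32 kernel32.dll]

# DWORD GetCurrentDirectoryA(DWORD nBufferLength, LPSTR lpBuffer);
# The output buffer is a chars array sized by the nchars parameter.
kernel stdcall GetCurrentDirectoryA uint {
    nchars int
    path   {chars[nchars] out}
}

# BOOL SetCurrentDirectoryA(LPCSTR lpPathName);
# An input string whose size is of no concern, hence the string type.
kernel stdcall SetCurrentDirectoryA int {path string}
```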
In the above declaration the path argument is declared as an array whose size is given by another parameter nchars instead of an integer constant. This makes for fewer errors in calling C functions which expect a size and pointer combination of parameters where a larger size than the declared array size may be mistakenly passed.
The above functions can then be simply invoked.
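A sketch of the invocations, assuming Windows and the illustrative kernel wrapper. The buffer size of 512 is an arbitrary choice.

```tcl
package require cffi
cffi::Wrapper create kernel [file join $env(windir) system32 kernel32.dll]
kernel stdcall GetCurrentDirectoryA uint {nchars int path {chars[nchars] out}}
kernel stdcall SetCurrentDirectoryA int {path string}

# Pass the buffer size and the name of the variable to receive the path.
set n [GetCurrentDirectoryA 512 path]
puts "$n characters: $path"

# Changing to the directory we are already in succeeds with a nonzero result.
set status [SetCurrentDirectoryA $path]
puts $status
```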
In the above examples, the strings in all cases are encoded in the system encoding before being passed to the C function. Similarly, the output buffers returned by the C function are assumed to be in system encoding and translated into Tcl's internal form accordingly. Both string and chars can be tagged with an encoding name. For example, if the functions were defined using string.jis0208 or chars[512].jis0208, the JIS0208 encoding would be used to transform the strings in both directions.
The package has other related types that serve a similar purpose.
- The unistring and unichars types are analogous to string and chars except they assume the encoding is that of the Tcl_UniChar C type and have all sizes specified in Tcl_UniChar units as opposed to chars/bytes.
- The winstring and winchars types are similarly analogous to string and chars except they assume the encoding is that of the WCHAR C type (UTF-16) and have all sizes specified in WCHAR units as opposed to chars/bytes. These types are specific to Windows and only available there.
- The binary and bytes types are again analogous except they assume binary data (Tcl byte array at the C level).
Error checking
C functions indicate errors primarily through their return values. In some cases, the return value is the error code while in others it is only a boolean status indicator with detail being available either through another function such as GetLastError on Windows or a global variable such as errno.
Here is an attempt to change to a non-existing directory.
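A sketch of such a call, assuming Windows, the illustrative kernel wrapper, and that the hypothetical path C:/nosuchdir does not exist on the system:

```tcl
package require cffi
cffi::Wrapper create kernel [file join $env(windir) system32 kernel32.dll]
kernel stdcall SetCurrentDirectoryA int {path string}

# No error is raised; the failure is only visible in the return value.
set status [SetCurrentDirectoryA "C:/nosuchdir"]
puts $status
```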
The return value of 0 here indicates failure. The caller must then specifically check for errors. Moreover, if the return value does not actually indicate the cause of the error, another call has to be made to GetLastError or similar. This has multiple issues:
- first, the caller has to specifically check for errors,
- second, and more important, by the time a secondary call is made to retrieve errno or similar, the original error is likely to have been overwritten.
To deal with the first, a return type can be annotated with conditions that specify expected return values. Since SetCurrentDirectory returns a non-zero value on success, the function return type may be annotated with nonzero. Our previous definition of SetCurrentDirectoryA can instead be written as
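A sketch of the annotated definition, assuming Windows, the illustrative kernel wrapper, and a hypothetical non-existent path:

```tcl
package require cffi
cffi::Wrapper create kernel [file join $env(windir) system32 kernel32.dll]

# The return type is now a list: the base type followed by annotations.
kernel stdcall SetCurrentDirectoryA {int nonzero} {path string}

# A zero return value is converted into a Tcl error.
set failed [catch {SetCurrentDirectoryA "C:/nosuchdir"} msg]
puts "failed=$failed message=$msg"
```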
Passing a non-existing directory will then raise an error exception.
Note however, that the error message is generic and only indicates the function return value did not meet the expected success criterion. To fix this, the function definition can include an annotation for an error retrieval mechanism:
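A sketch with the lasterror annotation added, under the same assumptions (Windows, the illustrative kernel wrapper, a hypothetical non-existent path):

```tcl
package require cffi
cffi::Wrapper create kernel [file join $env(windir) system32 kernel32.dll]

# nonzero defines the success criterion; lasterror names the mechanism
# used to retrieve the error detail when the criterion is not met.
kernel stdcall SetCurrentDirectoryA {int nonzero lasterror} {path string}

set failed [catch {SetCurrentDirectoryA "C:/nosuchdir"} msg]
puts "failed=$failed message=$msg"
```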
With the inclusion of lasterror in the type declaration, the error message is much clearer. This also eliminates the second issue mentioned above with the error detail being lost before the function to retrieve it is called.
While lasterror implies GetLastError as the retrieval mechanism and is specific to Windows, other annotations for error retrieval are also available. For example, the errno annotation serves a similar purpose except it retrieves the error based on the errno facility which is cross-platform and commonly applicable to both the C runtime as well as system calls on Unix.
For cases where the error is library-specific and not derived from the system, the onerror annotation can be used to customize the handling of error conditions. See the examples in the source repository.
Delegating return values
When a function's return value is used primarily as a status indicator with the actual result being returned through an output parameter, it can be more convenient to return the output parameter value as the result of the wrapped command. For example, consider the earlier definition of GetCurrentDirectoryA, which was invoked with a buffer size and the name of a variable to receive the path.
The return value from the function, the number of characters, is not very useful at the Tcl level except to indicate errors when zero. We can instead have the wrapped command return the actual path as the return value by annotating the buffer with retval.
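A sketch of the revised definition, assuming Windows and the illustrative kernel wrapper. The default buffer size of 512 is an arbitrary choice.

```tcl
package require cffi
cffi::Wrapper create kernel [file join $env(windir) system32 kernel32.dll]

# path is marked retval, so it is no longer passed by the caller and
# its content becomes the result of the command. nchars has a default
# so the call site does not need to supply a buffer size.
kernel stdcall GetCurrentDirectoryA {uint nonzero lasterror} {
    nchars {int {default 512}}
    path   {chars[nchars] retval}
}

set dir [GetCurrentDirectoryA]
puts $dir
```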
The invocation now feels more natural. In addition to the retval annotation for the parameter, the nonzero error checking annotation was added to the function return value. An error checking annotation is required for use of retval.
The default annotation is not required but is a convenience so we do not have to specify a buffer size at the call site.
Pointers and memory
Pointers are ubiquitous in C. They give C much of its power while also being the source of many bugs. In many cases, pointers can be avoided through the use of the out parameters and struct, string and binary types that use pointers under the covers. Many times though, this is not possible or desirable and raw access to the native storage of the data is needed. The pointer type provides this access while also attempting to guard against some common errors through multiple mechanisms:
- Pointers can be optionally tagged so a pointer to the wrong resource type is not inadvertently passed where a different one is expected.
- Pointers are marked as safe by default when returned from a function. Safe pointers are registered in an internal table which is checked whenever a pointer is passed to a function or accessed via safe commands. It is sometimes necessary to bypass this check and the unsafe attribute is provided for the purpose. Pointer fields within structs are always unsafe and have to be accessed using unsafe commands.
- Null pointers passed as parameters or returned from functions will raise a Tcl exception by default. The nullok annotation provides for cases where NULL pointers can be legitimately used.
Note a pointer tag is not the same as a data type. For example, you may have a single C structure type XY containing two numerical fields. You can choose to tag pointers to the structure with two different tags, Point and Dimensions, depending on whether it is used as co-ordinates of a point or as width and height dimensions. The two tags will be treated as different.
The examples below repeat the previous ones, but this time using pointers in place of structs and strings.
First, define the call to GetWindowRect using pointers. Note we are renaming the function as GetWindowRectAsPointer so as to distinguish from our previous definition.
Since the pointer value itself is passed by value, notice the rect parameter in the function definition was not marked as an out parameter and was passed as a value in the actual call itself.
Unlike the case with the struct type, memory to hold the structure has now to be explicitly allocated because we are passing raw pointers.
Finally, the structure contents can be extracted.
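The steps above can be sketched as follows, assuming Windows and the illustrative user32 wrapper. The variable names are arbitrary.

```tcl
package require cffi
cffi::Wrapper create user32 [file join $env(windir) system32 user32.dll]
cffi::Struct create RECT {left int top int right int bottom int}
user32 stdcall GetDesktopWindow pointer.HWND {}

# A two-element name: the C function name, then the Tcl command name.
# rect is a plain tagged pointer passed by value, not an out parameter.
user32 stdcall {GetWindowRect GetWindowRectAsPointer} int {
    hwnd pointer.HWND
    rect pointer.RECT
}

set win   [GetDesktopWindow]
set prect [RECT allocate]              ;# native storage for one RECT
GetWindowRectAsPointer $win $prect     ;# pointer passed as a value

# Decode the native storage into a dictionary, then release it.
set dimensions [RECT fromnative $prect]
puts $dimensions
RECT free $prect
```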
It is not strictly necessary to define a structure at all. Below is yet another way to get dimensions without making use of the RECT struct and using direct memory allocation.
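A sketch of the raw memory approach, assuming Windows, the illustrative user32 wrapper, and a RECT size of 16 bytes (four 32-bit integers):

```tcl
package require cffi
cffi::Wrapper create user32 [file join $env(windir) system32 user32.dll]
user32 stdcall GetDesktopWindow pointer.HWND {}
user32 stdcall {GetWindowRect GetWindowRectAsPointer} int {
    hwnd pointer.HWND
    rect pointer.RECT
}

# Allocate 16 bytes and tag the pointer as RECT. No struct named RECT
# needs to exist for the tag to be used.
set praw [cffi::memory allocate 16 RECT]
GetWindowRectAsPointer [GetDesktopWindow] $praw

# Copy the memory into a Tcl binary string and decode it by hand.
# The n specifier reads a 32-bit integer in native byte order.
set bin [cffi::memory tobinary $praw 16]
binary scan $bin nnnn left top right bottom
puts "left=$left top=$top right=$right bottom=$bottom"

cffi::memory free $praw
```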
In the above fragment, cffi::memory allocate is used to allocate memory tagged as RECT. Note that this does not require that the struct RECT have been previously defined. The cffi::memory tobinary command is then used to convert the allocated memory content to a Tcl binary string.
The use of struct definitions is to be preferred to this raw memory access for convenience and safety reasons. Still, there are cases, for example variable length structures in memory, where this is required.
As an example of the protection against errors offered by pointer tags, here is an attempt to retrieve the window dimensions where the arguments are passed in the wrong order.
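A sketch of such a mistaken call, assuming Windows and the illustrative user32 wrapper. The exact wording of the error message is not shown because it may vary between versions.

```tcl
package require cffi
cffi::Wrapper create user32 [file join $env(windir) system32 user32.dll]
cffi::Struct create RECT {left int top int right int bottom int}
user32 stdcall GetDesktopWindow pointer.HWND {}
user32 stdcall {GetWindowRect GetWindowRectAsPointer} int {
    hwnd pointer.HWND
    rect pointer.RECT
}

set win   [GetDesktopWindow]
set prect [RECT allocate]

# Arguments swapped: a RECT pointer is passed where HWND is expected.
# The tag mismatch is detected before the C function is called.
set failed [catch {GetWindowRectAsPointer $prect $win} msg]
puts "failed=$failed message=$msg"

RECT free $prect
```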
While tags offer protection against type mismatches, another mechanism guards against invalid pointers and double frees. For example, attempting to free memory that we already freed above results in an error being raised.
Similarly, explicitly allocated struct storage needs to be freed. Attempting to free multiple times will raise an error. The first call below succeeds but the second fails.
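A sketch of the double free check for struct storage. It needs no shared library and should behave the same on any platform.

```tcl
package require cffi
cffi::Struct create RECT {left int top int right int bottom int}

set prect [RECT allocate]

# The first free succeeds and unregisters the pointer.
RECT free $prect

# The second free is rejected because the pointer is no longer registered.
set failed [catch {RECT free $prect} msg]
puts "failed=$failed message=$msg"
```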
Needless to say, these protection mechanisms are far from foolproof.
The above mechanism for detecting invalid pointers can clearly work for memory allocated through the package where it can be internally tracked. But what about allocations done through calls to loaded shared libraries? For this, the package treats any pointer returned from a function or through an output parameter as a valid pointer and registers it internally. Any pointers that are passed as input to a function are by default checked to ensure they were previously registered and an error raised otherwise. The question then remains as to how and when the pointer is marked as invalid. The dispose type attribute is provided for this purpose.
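The following Windows calls to allocate heaps illustrate the use. The C functions are prototyped as
HANDLE HeapCreate(DWORD flOptions, SIZE_T dwInitialSize, SIZE_T dwMaximumSize);
BOOL HeapDestroy(HANDLE hHeap);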
The above C prototypes use Windows type definitions. Rather than translate them into appropriate (32- or 64-bit) native C types, the ::cffi::alias define command, detailed later, allows platform-specific type definitions to be predefined.
On Windows, loading the predefined aliases with the ::cffi::alias load command will define common types like DWORD so the above functions can be wrapped as
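A sketch of the wrappers, assuming Windows, the illustrative kernel wrapper, and that the predefined win32 alias set includes both DWORD and SIZE_T. The HEAP tag and the heap sizes are arbitrary choices.

```tcl
package require cffi
cffi::alias load win32     ;# assumed to predefine DWORD and SIZE_T
cffi::Wrapper create kernel [file join $env(windir) system32 kernel32.dll]

kernel stdcall HeapCreate pointer.HEAP {
    opts     DWORD
    initSize SIZE_T
    maxSize  SIZE_T
}
# BOOL is a C int. dispose unregisters the pointer once it is passed in.
kernel stdcall HeapDestroy {int nonzero lasterror} {
    heapPtr {pointer.HEAP dispose}
}

set heap [HeapCreate 0 1024 1048576]   ;# returned pointer is registered
HeapDestroy $heap                      ;# validated, then unregistered

# A second attempt is rejected without calling into Windows.
set failed [catch {HeapDestroy $heap} msg]
puts "failed=$failed message=$msg"
```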
The pointer value returned from HeapCreate is by default registered as a valid pointer. When the pointer is passed to HeapDestroy, it is first validated. The presence of the dispose attribute will then remove its registration causing any further attempts to use it, for example even to free it, to fail.
Here is another example, this time from Linux. Note the tag SOMETHING, chosen to reinforce that tags have no semantics in terms of the actual type that the pointer references. A tag of FILE would have of course been more reflective of the referenced type, but this is not mandated.
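A sketch using fopen and fclose, assuming a glibc-based Linux system where the C library can be loaded as libc.so.6. The wrapper name libc and the parameter names are illustrative.

```tcl
package require cffi
cffi::Wrapper create libc libc.so.6

# FILE *fopen(const char *path, const char *mode);
libc function fopen {pointer.SOMETHING errno} {path string mode string}
# int fclose(FILE *stream);
libc function fclose int {file_ptr {pointer.SOMETHING dispose}}

# Any readable file will do; the Tcl interpreter binary is used here.
set fp [fopen [info nameofexecutable] r]
fclose $fp

# The pointer was disposed of by fclose, so reusing it is rejected.
set failed [catch {fclose $fp} msg]
puts "failed=$failed message=$msg"
```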
As an aside, note the use of the errno annotation above. An attempt to open a non-existent file for reading will return a NULL pointer which by default causes cffi to raise an exception. The errno annotation retrieves the system error for a more readable error message.
There are a couple of situations where the pointer registration mechanism is a hindrance.
- One is when the pointer is acquired by some means other than through a call made through this package.
- Another case is when the pointer is to a resource that is reference counted and whose acquisition may return the same pointer value multiple times.
To deal with the first case, a pointer return type or parameter may have the unsafe attribute. This will result in bypassing of any pointer registration or checks. So for example, if the heap functions had been defined as below, multiple calls could be made to HeapDestroy. NOTE: Do not actually try it as your shell will crash as Windows does not itself check pointer validity for HeapDestroy.
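A sketch of what such unsafe definitions might look like, assuming Windows, the illustrative kernel wrapper, and that the predefined win32 aliases include DWORD and SIZE_T. Only the definitions are shown; the functions are deliberately not called.

```tcl
package require cffi
cffi::alias load win32     ;# assumed to predefine DWORD and SIZE_T
cffi::Wrapper create kernel [file join $env(windir) system32 kernel32.dll]

# unsafe on the return type: the returned pointer is not registered.
kernel stdcall HeapCreate {pointer.HEAP unsafe} {
    opts     DWORD
    initSize SIZE_T
    maxSize  SIZE_T
}
# unsafe on the parameter: no registration check is made on the way in.
kernel stdcall HeapDestroy {int nonzero lasterror} {
    heapPtr {pointer.HEAP unsafe}
}
```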
Note that unsafe pointers only bypass registration checks; the pointer tags are still verified if present.
The second case arises with API calls like LoadLibrary and its counterpart FreeLibrary.
When multiple calls are made to LoadLibraryA, it returns the same pointer while keeping an internal reference count. The expectation is for the application to make the same number of calls to FreeLibrary. However, the following sequence of calls fails because the first call to FreeLibrary unregisters the pointer resulting in failure of the second call to FreeLibrary.
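A sketch of the failing sequence, assuming Windows and the illustrative kernel wrapper. The HMODULE tag and the choice of advapi32.dll as the library to load are arbitrary.

```tcl
package require cffi
cffi::Wrapper create kernel [file join $env(windir) system32 kernel32.dll]

# HMODULE LoadLibraryA(LPCSTR lpLibFileName);
kernel stdcall LoadLibraryA pointer.HMODULE {path string}
# BOOL FreeLibrary(HMODULE hLibModule);
kernel stdcall FreeLibrary {int nonzero lasterror} {
    hmodule {pointer.HMODULE dispose}
}

# Both calls return the same address.
set lib1 [LoadLibraryA advapi32.dll]
set lib2 [LoadLibraryA advapi32.dll]

FreeLibrary $lib1     ;# succeeds and unregisters the pointer

# The matching second call is rejected by the registration check.
set failed [catch {FreeLibrary $lib2} msg]
puts "failed=$failed message=$msg"
```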
The counted attribute is provided for this use case. When specified, the pointer is still registered but permits multiple registrations and a corresponding number of disposals. To illustrate,
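a sketch assuming Windows and the illustrative kernel wrapper, with the HMODULE tag and advapi32.dll again chosen arbitrarily:

```tcl
package require cffi
cffi::Wrapper create kernel [file join $env(windir) system32 kernel32.dll]

# counted: each return of the same pointer adds one registration.
kernel stdcall LoadLibraryA {pointer.HMODULE counted} {path string}
kernel stdcall FreeLibrary {int nonzero lasterror} {
    hmodule {pointer.HMODULE dispose}
}

set lib1 [LoadLibraryA advapi32.dll]
set lib2 [LoadLibraryA advapi32.dll]

# Two registrations permit two disposals.
FreeLibrary $lib1
FreeLibrary $lib2

# A third disposal has no registration left to consume.
set failed [catch {FreeLibrary $lib1} msg]
puts "failed=$failed message=$msg"
```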
Now multiple registrations are allowed which stay valid until a corresponding number of disposals are done. An additional attempt to free will fail as desired.
Type aliases
Type aliases are a convenience feature to avoid repetition and improve readability. They may be added with the ::cffi::alias define command. As an example, consider our previous prototype for the SetCurrentDirectoryA function.
This return type declaration is very common for Windows API calls. Instead of repeating this triple for every such call, a new type can be defined for the purpose and used in all prototypes.
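A sketch of such an alias, assuming Windows and the illustrative kernel wrapper. The alias name WINSTATUS is hypothetical and chosen only for this example.

```tcl
package require cffi
cffi::Wrapper create kernel [file join $env(windir) system32 kernel32.dll]

# Name the common return type declaration once.
cffi::alias define WINSTATUS {int nonzero lasterror}

# The alias can then stand in for the full declaration in prototypes.
kernel stdcall SetCurrentDirectoryA WINSTATUS {path string}

set failed [catch {SetCurrentDirectoryA "C:/nosuchdir"} msg]
puts "failed=$failed message=$msg"
```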
This facility is also useful for abstracting platform differences. For example, many Windows allocation functions use the C typedef SIZE_T which translates to either a 32-bit or 64-bit C integer type depending on whether the program was built for 32- or 64-bit Windows. Instead of defining separate prototypes for every function using the type, a single type definition can be used.
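A sketch of a platform-dependent definition, assuming a fresh interpreter in which SIZE_T has not already been defined as an alias. It uses the pointer size reported by Tcl to pick the integer width.

```tcl
package require cffi

# tcl_platform(pointerSize) is 8 in a 64-bit build and 4 in a 32-bit one.
if {$tcl_platform(pointerSize) == 8} {
    cffi::alias define SIZE_T ulonglong
} else {
    cffi::alias define SIZE_T uint
}

# Prototypes can now use SIZE_T regardless of the build, for example
#   {opts uint initSize SIZE_T maxSize SIZE_T}
```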
As a further convenience, the ::cffi::alias load command defines commonly useful typedefs including cross-platform ones such as size_t as well as platform-specific ones such as HANDLE, DWORD etc. on Windows.
Memory utilities
There are times when an application is forced to drop down to raw C pointer and memory manipulation. The memory and pointer command ensembles implement functionality useful for this purpose. See the reference documentation for a description.
Introspection
The package includes a help command ensemble that is useful during interactive development. The ::cffi::help function command describes the syntax for a command wrapping a C function.
The ::cffi::help functions command lists wrapped functions matching a pattern.
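A sketch of both commands, assuming Windows and the illustrative kernel wrapper. The output is not reproduced here because its exact format may differ between versions.

```tcl
package require cffi
cffi::Wrapper create kernel [file join $env(windir) system32 kernel32.dll]
kernel stdcall SetCurrentDirectoryA {int nonzero lasterror} {path string}

# Describe how the wrapped command is to be called.
puts [cffi::help function SetCurrentDirectoryA]

# List wrapped functions whose names match a glob pattern.
set matches [cffi::help functions *Directory*]
puts $matches
```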
Limitations
Each backend has some limitations in terms of passing of structs by value.
- The libffi backend cannot pass or return structs by value if they or a nested field contains arrays.
- The dyncall backend only supports passing and return of structs by value on x64 platforms.