Concepts
This page describes some general concepts behind CFFI. Basic knowledge of the package as described in Quick start is assumed. For a more direct mapping of C declarations to CFFI declarations see Cookbook.
Scopes
To avoid conflicts arising from the same name being used in different packages layered on CFFI, program elements like type aliases, enumerations, prototypes and pointer tags are defined within an enclosing scope. The scope is named after the Tcl namespace in which the defining command is invoked.
For example, assuming the libgit2 and libzip namespaces are used for wrapping shared libraries of the same name, a separate definition of the STATUS alias in each namespace, used in two (imagined) functions, would not be in conflict as the definitions have different scopes.
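A minimal sketch of such definitions follows. The alias bodies and the two function signatures are illustrative assumptions; the functions are shown only as comments because the wrapper objects and the functions themselves are imagined.

```tcl
package require cffi

namespace eval ::libgit2 {
    # In this scope a negative STATUS is treated as an error (assumed)
    cffi::alias define STATUS {int nonnegative}
    # Imagined function using the alias:
    # libgit2 function libgit2_open STATUS {path string}
}

namespace eval ::libzip {
    # In this scope any non-zero STATUS is treated as an error (assumed)
    cffi::alias define STATUS {int zero}
    # Imagined function using the alias:
    # libzip function libzip_open STATUS {path string}
}
```

Both definitions coexist because each is recorded under the scope of the namespace in which cffi::alias define was invoked.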
When a program element name is referenced from another definition, if the name is not fully qualified it is first looked up in the scope of the definition. If not found, it is looked up in the global scope. For instance, in a definition of libzip_open within the libzip namespace, the STATUS alias is resolved in the libzip scope. If it had not been defined there, the global scope would be checked. To refer to a name in any other scope, it must be fully qualified, for example ::libgit2::STATUS.
Note that although CFFI scopes are named after Tcl namespaces, they are not directly tied to them. For example, deleting a Tcl namespace will not cause the scope of the same name to disappear.
Type declarations
A type declaration consists of a data type followed by zero or more annotations that further specify handling of values of the type. For example, the nonzero annotation on a function return type indicates a return value of zero should be treated as an error.
As another example, a type declaration for a parameter might consist of the base data type int, a default annotation specifying a value to be used if no argument is supplied, and a byref annotation indicating that the value is actually passed by reference.
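As a sketch of the syntax, a type declaration is written as a Tcl list with the data type first and the annotations after it; an annotation that takes an argument is itself a sublist. The parameter name count and the default value 0 below are illustrative assumptions only.

```tcl
# Return type declaration: an int for which zero signals an error
set rettype {int nonzero}

# Parameter list alternating parameter names and type declarations.
# The name "count" and the default value 0 are illustrative.
set params {
    count {int {default 0} byref}
}

# The first element of a type declaration is the base data type
puts [lindex [dict get $params count] 0]
```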
Type declarations appear in three different contexts:
- As the return type from a function
- As a parameter in a function declaration
- As a field in a struct
The permitted data types and annotations are dependent on the context in which the type declaration appears.
Type annotations
The table below summarizes the available type annotations and the types and contexts in which they are allowed.
bitmask | The parameter, function return or field value is treated as an integer formed by a bitwise-OR of a list of integer values. |
byref | The parameter or function return value is passed or returned by reference. |
counted | The parameter or function return is a reference counted pointer whose validity is checked. See Pointer safety. |
default | Specifies a default value to use for a parameter or field. |
discard | Specifies that a function return value be discarded and an empty result returned. |
dispose | The parameter is a pointer that is unregistered irrespective of function return status. See Pointer safety. |
disposeonsuccess | The parameter is a pointer that is disposed only if function returns successfully. See Pointer safety. |
enum | The parameter, return value or field is an enumeration. |
errno | If the function return value indicates an error condition, the error code is available via the C RTL errno variable. |
in | Marks a parameter passed to a function as input to the function. See Input and output parameters. |
inout | Marks a parameter passed to a function as both input and output. See Input and output parameters. |
lasterror | If the function return value indicates an error condition, the error code is available via the Windows GetLastError API. |
multisz | The value is a concatenation of multiple nul-terminated strings with an empty string indicating the end. (Windows MULTI_SZ format) |
nonnegative | Raise an exception if the function return value is negative. |
nonzero | Raise an exception if the function return value is zero. |
novaluechecks | Disable value validity checks. The checks depend on the value type. See discussion of specific types for its semantics. |
nullifempty | Treat an empty string or struct dictionary value passed as function argument or struct field as a NULL pointer. See Strings as NULL pointers. In the case of out or inout parameters, specifies that if the variable name passed as the argument is the empty string, a NULL pointer should be passed as the function argument. |
nullok | Deprecated. Alias for novaluechecks. |
onerror | Specifies an error handler if a function return value indicates an error condition. |
out | Marks a parameter as output-only from a function. See Input and output parameters. |
pinned | The parameter or function return is a pinned pointer whose validity is checked. See Pointer safety. |
positive | Raise an exception if a function return value is negative or zero. |
retval | Marks the parameter as an output parameter whose value is to be returned as the result of the function. See Output parameters as function result. |
storealways | Treat an output parameter as valid regardless of any error indications from the function call. |
storeonerror | Treat an output parameter as valid only in the presence of error indication from a function call. |
structsize | Default a field value to the size of the containing struct. |
unsafe | Do not do any pointer validation on a parameter, return value or field. See Pointer safety. |
winerror | Treat the function return value as a Windows status code. |
zero | Raise an exception if a function return value is not zero. |
Later sections will further detail usage of the above.
Data types
CFFI data types correspond to C types and may be
- the void type
- scalars, such as integers and pointers
- arrays and structs
- string, unistring, winstring types which are actually scalar pointers at the C level but treated as nul-terminated character strings by CFFI.
The type info, type size and type count commands may be used to obtain information about a type.
The void type
This corresponds to the C void type and is only permitted as the return type of a function. Note that the C void * type is declared as a pointer type.
Integer types
The following integer types are supported.
schar | C signed char |
uchar | C unsigned char |
short | C signed short |
ushort | C unsigned short |
int | C signed int |
uint | C unsigned int |
long | C signed long |
ulong | C unsigned long |
longlong | C signed long long |
ulonglong | C unsigned long long |
Floating point types
The following floating point types are supported.
float | C float |
double | C double |
Arrays
Arrays are declared as TYPE[N] where N is a positive integer indicating the number of elements in an array of values of type TYPE. At the script level, arrays are represented as Tcl lists.
Dynamically sized arrays
Additionally, within parameter declarations, N may also be the name of a parameter within the same function declaration. In this case, the array is sized dynamically depending on the value of the referenced parameter at the time the call is made. This is useful in the common case of a function writing to a buffer. For example, consider the Win32 API for generating random numbers, CryptGenRandom, whose C prototype is BOOL CryptGenRandom(HCRYPTPROV hProv, DWORD dwLen, BYTE *pbBuffer).
Here pbBuffer is really a pointer to an array that is of size dwLen. Assuming the ADVAPI32.DLL DLL has already been wrapped as advapi32 and Win32 type aliases loaded, one might define the CFFI wrapper with pbBuffer declared as a fixed-size output array of 512 elements.
However this can lead to corruption if the function is mistakenly called with the dwLen argument greater than 512. A safer way to define the function is to declare pbBuffer as an array sized by the dwLen parameter.
This ensures a buffer of the correct size is passed to the function based on the length passed in the call.
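The two definitions might be sketched as below. This assumes a Windows system, that the win32 alias set supplies BOOL, DWORD, BYTE and ULONG_PTR, and that HCRYPTPROV has to be defined by hand; the Tcl command names CryptGenRandomFixed and CryptGenRandomDyn are hypothetical and chosen only so both variants can coexist.

```tcl
package require cffi

# Assumptions: Windows, advapi32.dll on the default search path,
# win32 alias set providing BOOL, DWORD, BYTE and ULONG_PTR.
cffi::alias load win32
cffi::alias define HCRYPTPROV ULONG_PTR
cffi::Wrapper create advapi32 advapi32.dll

# Fixed-size buffer: memory corruption if called with dwLen > 512
advapi32 stdcall {CryptGenRandom CryptGenRandomFixed} BOOL {
    hProv    HCRYPTPROV
    dwLen    DWORD
    pbBuffer {BYTE[512] out}
}

# Dynamically sized buffer: always holds exactly dwLen elements
advapi32 stdcall {CryptGenRandom CryptGenRandomDyn} BOOL {
    hProv    HCRYPTPROV
    dwLen    DWORD
    pbBuffer {BYTE[dwLen] out}
}
```

Calling either command additionally requires a valid provider handle, which is outside the scope of this sketch.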
If the size argument used by a dynamic array type is passed as 0, an error is raised unless the type declaration includes a novaluechecks annotation. In that case, the array pointer argument is passed as NULL to the function.
In the above function, the dwLen parameter was an in parameter containing the size of the buffer and the function filled the entire buffer. In many cases, a function is instead passed a pointer to the size of the buffer. It then overwrites the location with the actual count stored in the buffer. In such cases, the size parameter should be declared with the inout annotation. When the function is called, the size of the buffer passed is that specified by the corresponding variable argument. On return, the number of elements returned is that actually returned by the wrapped function.
Arrays as strings
C arrays are generally represented as a list at the script level. So in the above example, the value stored in pbBuffer would be seen in Tcl as a list of unsigned 8-bit integer values. Sometimes this list representation is not the most appropriate or convenient.
For example, the returned data from CryptGenRandom might be better handled as a binary string. CFFI provides the types bytes, chars, unichars and winchars that are defined as arrays but treat C arrays of 8-bit values as strings instead. See Strings and Binary strings for more information.
Pointers
Pointers are declared in one of the following forms: pointer or pointer.TAG.
The first is the equivalent of a void* C pointer. The second form associates the pointer type with a tag.
Pointer values are currently represented in the form ADDRESS^TAG where the tag is optional as in declarations. Applications must not rely on this specific representation as it is subject to change. Instead the pointer ensemble command set should be used to manipulate pointers. In particular, the ::cffi::pointer make command constructs a pointer from a memory address and tag. The ::cffi::pointer address and ::cffi::pointer tag commands do the reverse.
Pointer tags
A pointer tag is used to provide for some measure of type safety. Tags can be associated with pointer values as well as pointer type declarations. The tag attached to a pointer value must match the tag for the struct field it is assigned to or the function parameter it is passed as. Otherwise an error is raised. Tags also provide a typing mechanism for function pointers. This is described in Prototypes and function pointers.
Note however that, although similar, pointer tags are orthogonal to the type system. Any tag may be associated with a pointer type or value, irrespective of the underlying C pointer type.
Tags for pointer types are defined in the corresponding struct field or function parameter declarations. Pointer values are associated with the tags of the type through which they are created, qualified with a scope. For example, the pointer returned by a function declared in the global namespace
LIB function get_path pointer.PATH {}
will be tagged with ::PATH. On the other hand, if the function was declared within a namespace ns, pointers returned from the function would be tagged with ::ns::PATH. Furthermore, the tag in the definition may be fully qualified, as in pointer.::ns2::PATH, in which case returned pointers have the same exact tag. Note the scope ns2 need not even correspond to a Tcl namespace.
Pointers can only be assigned to a struct field or passed as a parameter if the corresponding pointer type in the struct field or parameter definition has the same tag. If there is no tag specified for the pointer field or parameter, it will accept pointer values with any tag, analogous to a C void * pointer.
Casting pointers
Normally a pointer with a tag is not accepted as a function argument or struct field if it differs from the tag in the declaration. There are two exceptions to this:
- a pointer declaration with no tag is treated as a void* and will accept pointer values with any tag.
- a pointer declaration with a tag will accept pointers with tags that are declared as castable to it. This is similar to pointers to subclasses being accepted as pointers to superclasses.
The pointer castable command enables this second feature. For example, declaring the tag Rectangle as castable to Shape will result in any pointer value with tag Rectangle being accepted wherever the tag Shape is accepted. Note this implies transitivity.
A pointer may also be cast explicitly to one with a different tag with the pointer cast command. This requires that the existing tag is castable to the new tag. So given the above example,
- a pointer.Rectangle will be implicitly accepted as a pointer.Shape
- a pointer.Shape value can be explicitly cast to pointer.Rectangle. The reverse is also possible but not needed because of the first point.
- a pointer.Circle cannot be directly cast to pointer.Rectangle or vice versa.
In case the pointer is a safe (registered) pointer, explicit casts change the tag associated with the registered pointer.
For debugging and troubleshooting purposes, the pointer castables command may be used to list the tags that are castable and their mappings.
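The casting rules above might be exercised as in the following sketch. The memory addresses are arbitrary placeholders that are never dereferenced, the Circle tag is assumed to be castable to Shape in the same way as Rectangle, and the argument order of pointer castable (castable tag first, target tag second) is taken from the author's understanding of the command.

```tcl
package require cffi

# Declare Rectangle and Circle as castable to Shape
cffi::pointer castable ::Rectangle ::Shape
cffi::pointer castable ::Circle ::Shape

# Construct tagged pointers from placeholder addresses
set rect   [cffi::pointer make 0x1000 ::Rectangle]
set circle [cffi::pointer make 0x2000 ::Circle]

# Explicit casts between Rectangle and Shape
set shape [cffi::pointer cast $rect ::Shape]
set rect2 [cffi::pointer cast $shape ::Rectangle]

# A Circle cannot be cast directly to a Rectangle
set failed [catch {cffi::pointer cast $circle ::Rectangle} msg]
puts "Circle to Rectangle cast failed: $failed"

# List the castable tags and their mappings
puts [cffi::pointer castables]
```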
Pointer safety
Pointer type checking via tags does not protect against errors related to invalid pointers, double frees etc. To provide some level of protection against these types of errors, pointers returned from functions, either as return values or through output parameters, are by default registered in an internal table. These are referred to as safe pointers. Any pointer use is then checked for registration and an error raised if it is not found.
Pointers that have been registered are unregistered when they are passed to a C function as an argument for a parameter that has been annotated with the dispose or disposeonsuccess annotation.
As an illustration of safe pointers, assume a wrapper object crtl for the C runtime library has already been created and used to define the malloc and free functions, and that free is then called twice on the same pointer returned by malloc.
The pointer returned by malloc is automatically registered. When the free function is invoked, its argument is checked for registration. Moreover, because the free function's ptr parameter has the dispose annotation, it is unregistered before the function is called. The second call to free therefore fails as desired.
The disposeonsuccess annotation is similar to dispose except that if the function return type includes error check annotations, the pointer is unregistered only if the return value passes the error checks.
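The malloc and free scenario might look like the sketch below. It assumes a Linux system where the C runtime is available as libc.so.6 and that the size_t alias comes from the C alias set; on other platforms the library name will differ.

```tcl
package require cffi

# Assumptions: Linux with glibc; size_t is supplied by the C alias set
cffi::alias load C
cffi::Wrapper create crtl libc.so.6

crtl function malloc pointer {sz size_t}
crtl function free void {ptr {pointer dispose}}

set p [malloc 10]   ;# registered as a safe pointer on return
free $p             ;# unregistered, then passed to the C function

# The second call is rejected because $p is no longer registered
set rejected [catch {free $p} msg]
puts "second free rejected: $rejected ($msg)"
```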
Reference counted pointers
Some C APIs return the same resource pointer multiple times while internally maintaining a reference count. Examples are dlopen on Linux or LoadLibrary and COM APIs on Windows. Such pointers need to be declared with the counted annotation. This works similarly to the default safe pointers except that the same pointer value can be registered multiple times. Correspondingly, the pointer can be accessed until the same number of calls are made to a function that disposes of the pointer. On Linux, for example, two calls to dlopen for the same library return the same pointer value. dlclose can then be called multiple times but not more than the number of times dlopen was called.
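A sketch of the dlopen case follows. It assumes a Linux system with a recent glibc in which dlopen and dlclose are exported from libc.so.6 (older systems export them from libdl.so.2), that libm.so.6 is present, and that the flag value 1 corresponds to RTLD_LAZY.

```tcl
package require cffi

# Assumption: dlopen and dlclose are exported by libc.so.6
cffi::Wrapper create libc libc.so.6

libc function dlopen {pointer counted} {path string flags int}
libc function dlclose int {handle {pointer dispose}}

# Both calls return the same handle; it is now registered twice
set h1 [dlopen libm.so.6 1]
set h2 [dlopen libm.so.6 1]

dlclose $h1
dlclose $h1

# A third call fails because the handle is no longer registered
set rejected [catch {dlclose $h1} msg]
puts "third dlclose rejected: $rejected"
```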
Unsafe pointers
C being C, there are many situations where pointers are generated and passed around in a somewhat ad hoc manner with no clear ownership. For such situations where safe and counted pointers can raise exceptions that are false positives, pointer declarations can be annotated as unsafe. Return values from functions and output parameters with this annotation will not be registered as safe pointers. Conversely, input parameters with this designation will not be checked for registration.
In addition to the implicit registration of pointers, applications can explicitly control pointer registration with the ::cffi::pointer check, ::cffi::pointer safe, ::cffi::pointer counted and ::cffi::pointer dispose commands.
Pinned pointers
The ::cffi::pointer pin command may be used to permanently register a pointer as safe. The pointer will remain registered irrespective of any ::cffi::pointer dispose calls and can only be unregistered with ::cffi::pointer invalidate.
The pinned annotation on a pointer return declaration or output parameter has the same effect. The pointer is registered and will remain so until invalidated with ::cffi::pointer invalidate.
Pinned pointers have no associated tag and any pointers with the same address components as a pinned pointer will be considered registered.
The primary use of pinned pointers is for APIs that make use of "pseudo handles" that are always valid.
Invalid pointer values
At the script level, a null pointer is any pointer whose address component is 0. The token NULL may also be used for this purpose.
Null pointers have their own safety checks and are independent of the pointer registration mechanisms described above. By default, a function result that is a null pointer is treated as an error and triggers the function's error handling mechanisms. Similarly, an attempt to pass a null pointer to a function or store it as a field value in a C struct will raise an exception. This can be overridden by including the novaluechecks annotation on the function return, parameter or structure field's type definition. For return values of type string, unistring and winstring with this annotation, an empty string is returned when the called function returns NULL. In case of structs that are returned by reference, a novaluechecks annotation will map a NULL return value to a struct with default values for all fields. If any field does not have a default, an error is raised.
Note that when returned as output parameters from a function, either directly or embedded as a struct field, null pointers are permitted even without the novaluechecks annotation.
Memory operations
Pointers are ofttimes returned by functions but more often than not the referenced memory has to be allocated and passed in to functions. Some type constructs like strings and structs hide this at the script level but there are times when direct access to the memory content addressed by pointers is desired.
The memory command ensemble provides such functionality. The commands ::cffi::memory allocate and ::cffi::memory free provide memory management facilities. Access to the content is available through ::cffi::memory tobinary and ::cffi::memory frombinary commands which convert to and from Tcl binary strings. The ::cffi::memory get and ::cffi::memory set commands provide type-aware access to read and write memory.
As an alternative to the memory command, the arena command implements a memory arena in which frames can be allocated with ::cffi::arena pushframe. Memory blocks can then be allocated within the last allocated frame using ::cffi::arena allocate. These blocks are all freed when the frame is deallocated with ::cffi::arena popframe. Multiple calls can be made to ::cffi::arena pushframe with ::cffi::arena popframe freeing the last allocated frame. In effect this behaves like a software stack and is useful for short-lived storage as it is faster and results in less memory fragmentation than the heap based memory command.
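The commands above might be combined as in this sketch. The block sizes and the stored value are arbitrary, and the argument order shown for memory set and memory get (pointer, type, then value) reflects the author's understanding of those commands.

```tcl
package require cffi

# Heap based allocation: room for four ints
set p [cffi::memory allocate [expr {4 * [cffi::type size int]}]]
cffi::memory set $p int 42          ;# write an int at the start of the block
set val [cffi::memory get $p int]   ;# read it back
cffi::memory free $p

# Arena based allocation: blocks live until the frame is popped
cffi::arena pushframe
set tmp [cffi::arena allocate 64]
# ... pass $tmp to wrapped functions needing short-lived storage ...
cffi::arena popframe                ;# frees every block in the frame
```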
Strings
Strings in C are generally represented in memory as a sequence of null terminated bytes in some specific encoding. They may be declared either as a char * or as an array of char where the size of the array places a limit on the maximum length.
At the script level, these can be declared in multiple ways:
pointer | As discussed in the previous section, this is a pointer to raw memory. To access the underlying string, the memory referenced by the pointer has to be converted into a Tcl string value with the ::cffi::memory tostring command. |
string.ENCODING | Values declared using this type are still pointers at the C level but are converted to and from Tcl strings implicitly at the C API interface itself using the specified encoding. If .ENCODING is left off, the system encoding is used. |
unistring | This is similar to string.ENCODING except the values are Tcl_UniChar* at the C level and the encoding is implicitly the one used by Tcl for the Tcl_UniChar data type. |
winstring | This is similar to string.ENCODING except the values are WCHAR* at the C level and the encoding is implicitly UTF-16 as used by the Windows API. This type is only present on Windows. |
chars.ENCODING | The value is an array of characters at the C level. The type must always appear as an array, for example, chars.utf-8[10] and not as a scalar chars.utf-8. In this case as well, conversion to and from Tcl strings is implicit using the specified encoding, which again defaults to the system encoding. Following standard C rules, arrays are passed by reference as function arguments and thus a declaration of chars[10] would also be passed into a function as a char*. Within a struct definition on the other hand, it would be stored as an array. |
unichars | The value is an array of Tcl_UniChar characters and follows the same rules as chars except that the encoding is always that used by Tcl for the Tcl_UniChar type. |
winchars | The value is an array of WCHAR characters and follows the same rules as chars except that the encoding is UTF-16 as used in the Windows API. This type is only present on Windows. |
The choice of using pointer, string (or unistring, winstring), or chars (or unichars, winchars) depends on the C declaration and context as well as convenience.
- Function parameters of type char* that are purely input are best declared as string, unistring or winstring.
- Function parameters that are actually output buffers in which the called function stores the output string value are best declared as chars[], unichars[] or winchars[]. Generally these have an associated parameter which indicates the buffer size. In such cases the output parameter can be declared as (for example) chars[nchars] where nchars is the name of the parameter containing the buffer size.
- Function output parameters that are stored by the called function as pointers to strings can be declared as out parameters of type string, unistring or winstring in the limited case where the stored pointer does not need to be disposed of (e.g. a pointer to a statically allocated string is being returned). In the general case, these parameters have to be declared as pointers so they can be freed or otherwise disposed.
- Function return values cannot be declared as chars, unichars or winchars as C itself does not support array return values. Generally, functions typed as returning char * need to be declared as returning pointer as the pointers have to be explicitly managed. Only in the specific cases where the returned pointer is static or does not need to be disposed of for some other reason can the return value be typed as string, unistring or winstring.
The following discussion illustrates use cases for each of the above by wrapping three directory related functions from the C library: get_current_dir_name, getcwd and chdir.
The first function, get_current_dir_name, returns a pointer to malloc'ed memory that must be freed. We cannot use the string type for implicit conversion to strings because we need access to the raw pointer so it can be freed. We are thus forced to stick to the use of pointers. Assuming libc is a wrapper object for the C library, the CFFI wrapper would declare get_current_dir_name as returning a pointer, together with a wrapper for the free function.
We need the free function because, as stated by the get_current_dir_name man page, the returned pointer is malloc'ed and has to be freed by the application. (Note that the parameter of free is declared with the dispose annotation as described in Pointer safety.)
The actual use of the function would involve explicit pointer handling.
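A sketch of the wrapper and its use is shown below, assuming a Linux system where the C library is libc.so.6 and exports get_current_dir_name (a GNU extension).

```tcl
package require cffi

# Assumption: Linux with glibc
cffi::Wrapper create libc libc.so.6

libc function get_current_dir_name pointer {}
libc function free void {p {pointer dispose}}

set p [get_current_dir_name]         ;# raw pointer to malloc'ed memory
set dir [cffi::memory tostring $p]   ;# copy the C string into a Tcl string
free $p                              ;# release the memory, unregister $p
puts $dir
```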
The second function, getcwd, requires the caller to supply the buffer into which the directory path will be written. The buffer size the function expects is not a constant but rather given by the value of the size argument. While this function could also be wrapped using pointers and explicitly allocated memory, it is much simpler to use the chars type to supply a buffer, declared as a dynamic array sized by the size parameter, with string as the return type.
Two notable points about this definition: first, the use of dynamic arrays for parameters as described in Dynamically sized arrays. Second, the return type is string because the pointer returned by getcwd is the same as the pointer passed in and since CFFI is automatically managing that memory, there is no need to get hold of the raw pointer.
This simplifies the usage, for the return value as well as the output argument.
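Such a definition might be sketched as follows, again assuming Linux with glibc and that size_t is supplied by the C alias set. The buffer size of 512 is an arbitrary choice.

```tcl
package require cffi

# Assumptions: Linux with glibc; size_t comes from the C alias set
cffi::alias load C
cffi::Wrapper create libc libc.so.6

# buf is an output buffer whose length is given by the size parameter
libc function getcwd string {
    buf  {chars[size] out}
    size size_t
}

# The path is available both as the result and in the variable buf
set cwd [getcwd buf 512]
puts $cwd
```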
The final example only involves passing in a path to the chdir function. Since we are dealing with only passing a constant string, this is the simplest case. Just defining the parameter as string suffices.
Usage is also straightforward.
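A sketch of the chdir wrapper, under the same Linux and glibc assumption, with /tmp as an arbitrary target directory:

```tcl
package require cffi

# Assumption: Linux with glibc
cffi::Wrapper create libc libc.so.6

libc function chdir int {path string}

# The Tcl string is converted to a char* for the duration of the call
set rc [chdir /tmp]
puts "chdir returned $rc"
```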
MULTI_SZ strings
Some Windows APIs make use of the MULTI_SZ string type which is a string consisting of a sequence of nul-terminated strings in memory followed by an additional nul (i.e. an empty string indicates the end). This can be mapped to the winstring and winchars types by annotating the declaration with the multisz annotation. At the script level, these are represented as a list of strings.
Strings as NULL pointers
Some APIs allow for char* pointer parameters or fields to be NULL. If these are wrapped as one of the string types string, unistring, winstring or one of the character array types chars, unichars or winchars, the nullifempty annotation can be used to specify that empty string values should be passed or stored as NULL pointers as opposed to pointers to an empty string.
Binary strings
While the string, unistring, winstring, chars, unichars and winchars types deal with character strings, the types binary and bytes serve a similar purpose for dealing with binary data, that is, a sequence of bytes in memory. The binary type translates to a C unsigned char * type where the memory is treated as a Tcl binary string (byte array). Similarly, the bytes type is analogous to the chars type except it declares a sized array of bytes, not characters in an encoding. These types are converted between Tcl values and C values with the Tcl_GetByteArrayFromObj and Tcl_NewByteArrayObj functions.
Consider the wrapper for the CryptGenRandom function that we saw earlier, in which pbBuffer is declared as an output array of BYTE values sized by dwLen.
When this function is called with a length of 100 and data as the name of the output variable, the random bytes are returned in the variable data as a list of 100 integer values in the range 0-255.
Most applications of random data would probably prefer this be a binary string instead. The function wrapper would therefore be better defined with pbBuffer declared as an output bytes array sized by dwLen.
Now the above call to the function would result in variable data containing a binary string of length 100.
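The revised wrapper might be sketched as below, under the same assumptions as the earlier CryptGenRandom sketch (Windows, the win32 alias set, and a hand-defined HCRYPTPROV alias). The call is shown only as a comment since obtaining a valid provider handle is not covered here.

```tcl
package require cffi

# Assumptions: Windows, advapi32.dll on the default search path,
# win32 alias set providing BOOL, DWORD and ULONG_PTR.
cffi::alias load win32
cffi::alias define HCRYPTPROV ULONG_PTR
cffi::Wrapper create advapi32 advapi32.dll

# pbBuffer is returned as a binary string instead of a list of integers
advapi32 stdcall CryptGenRandom BOOL {
    hProv    HCRYPTPROV
    dwLen    DWORD
    pbBuffer {bytes[dwLen] out}
}

# With a valid provider handle in hProv:
#   CryptGenRandom $hProv 100 data
#   string length $data
```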
While the bytes type corresponds to chars, the binary type corresponds to string. The underlying C type is actually a pointer, not an array. Because there is no inherent length indicator as there is for the string type which is nul-terminated, binary can only be used in type declarations for input parameters to a function and in no other context. The function receives the data as retrieved by Tcl's Tcl_GetByteArrayFromObj function.
As for chars, the bytes type can also be annotated with nullifempty in which case binary strings of zero length are passed as NULL pointers. Without the annotation, a binary string of zero length will raise an error.
In the case of the binary type, the nullifempty annotation is superfluous. Zero length binary strings typed as binary are always passed as NULL pointers.
Structs
C structs are wrapped through the ::cffi::Struct class. This encapsulates the layout of the struct and provides various methods for manipulation. A structure layout is a list of alternating field names and type declarations, for example a field x of type int followed by a field y of type int.
A struct field may be of any type except void and binary. In addition, fields that are string, unistring or winstring impose certain limitations. They can be used in structs that are passed in and out of functions as arguments but cannot be allocated from the heap using methods like allocate, tonative etc.
As for function parameters, field types can have associated annotations. For example, a struct definition may assign default values to fields with the default annotation.
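As a sketch, the two forms of definition might look as follows. The struct names Point and PointD, the field names and the default values are illustrative assumptions; the tobinary and frombinary methods are used here only to show the defaults taking effect.

```tcl
package require cffi

# Plain definition: alternating field names and type declarations
cffi::Struct create Point {
    x int
    y int
}

# Same layout with a default value for each field
cffi::Struct create PointD {
    x {int {default 0}}
    y {int {default 0}}
}

# The y field is omitted and therefore takes its default
set pt [PointD frombinary [PointD tobinary {x 5}]]
puts $pt
```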
Annotations that may be applied to field type declarations include
- unsafe may be applied but is superfluous as pointers in struct fields are implicitly marked unsafe as these are more often than not internal pointers in C and not explicitly passed across interfaces. For the same reason the counted annotation is ignored if present.
- the novaluechecks annotation for pointer types may be present to indicate no value checks (e.g. for NULL) are applied to the field.
- enum and bitmask for integer types
- nullifempty for string, unistring and winstring types.
- default. This specifies a default field value if no value is supplied for the field.
- structsize. This annotation is specific to field type declarations and results in fields being automatically initialized to the size of the struct if no value is supplied in the dictionary value for the struct. This annotation cannot be used together with the default annotation. Note that when structs are used as parameters, fields with this annotation are initialized even when the parameter is an out parameter. This is commonly useful in Win32 APIs where output parameters still need a structure size field initialized before passing into the API.
- The errno, winerror, lasterror and onerror annotations may be specified for fields but are ignored. This is to allow sharing of type aliases between field declarations and function return type declarations.
Once defined, structs can be referenced in function prototypes and in other structs as struct.STRUCTNAME, for example struct.Point. Referencing is scope-based. If the struct name is not fully qualified, it is looked up in the current Tcl namespace and then in the global scope.
At the script level, C struct values are represented as dictionaries with field names as dictionary keys. An exception is raised if any field is missing unless the field declaration has a default annotation or the struct is defined with the -clear option which defaults all fields to a zero value.
Alternatively, structs can also be manipulated as native C structs in memory using raw pointers and explicit transforms such as the allocate and tonative methods.
NOTE: structs that are manipulated as raw structs in memory cannot contain fields of type string, unistring and winstring. They must use raw pointers and explicitly manage their target memory.
The package provides other methods to access fields and otherwise manipulate native structs in memory. See ::cffi::Struct.
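Manipulating a native struct in memory might look like the sketch below. The Point struct and its field values are illustrative, and the method argument orders (pointer first) reflect the author's understanding of the ::cffi::Struct methods.

```tcl
package require cffi

cffi::Struct create Point {
    x int
    y int
}

set p [Point allocate]         ;# pointer to an uninitialized native struct
Point tonative $p {x 1 y 2}    ;# store a dictionary value into native memory
Point setnative $p x 10        ;# update a single field in place
set pt [Point fromnative $p]   ;# read the whole struct back as a dictionary
Point free $p                  ;# release the native memory
puts $pt
```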
Packed structs
Compilers allow various means for changing the padding and alignment of fields in a struct; for example, the pack pragma in MSC. Correspondingly the -pack option may be used to define the equivalent structure in CFFI.
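The effect of the option can be observed by comparing struct sizes, as in this sketch. The struct names and fields are illustrative. On typical platforms with a 4-byte int aligned on a 4-byte boundary, the unpacked struct would occupy 8 bytes and the packed one 5, though the exact values are platform dependent.

```tcl
package require cffi

# Default layout: padding is inserted after c so that i is aligned
cffi::Struct create Unpacked {
    c uchar
    i int
}

# Packed on 1-byte boundaries: no padding between c and i
cffi::Struct create Packed {
    c uchar
    i int
} -pack 1

puts "unpacked size: [cffi::type size struct.Unpacked]"
puts "packed size:   [cffi::type size struct.Packed]"
```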
Note that packed structs cannot be passed by value to functions. This is a limitation of the underlying backends, libffi and dyncall. Passing packed structs by value is very unlikely to be needed in practice.
Variable sized structs
C allows the definition of structs of variable size where the last field in the struct is a Variable Length Array (VLA), an array whose length is not fixed. Using C99's syntax, an example would be a struct containing a count field followed by a values array declared without a size.
In the equivalent definition in CFFI, the values array is sized by the count field.
When converting to native form, the actual size of the values array will be as contained in the count field.
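A sketch of such a definition follows. The struct name VLA and the int element type are assumptions made for illustration; the C99 counterpart is shown as a comment for comparison.

```tcl
package require cffi

# C99 counterpart, for reference:
#   struct VLA {
#       int count;
#       int values[];
#   };
cffi::Struct create VLA {
    count  int
    values int[count]
}
```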
Variable size structs have the following restrictions some of which parallel C99:
- The VLA must be the last field in the struct (C99)
- The VLA must not be the only field in the struct (C99)
- Arrays of variable size structs are not supported (C99)
- The length specifier for a VLA must be a field in the same struct.
- Variable size structs can be nested provided the inner struct is the last field in the outer one. Nesting is not permitted in C99 but is supported by some compilers.
- Certain operations that modify variable length structs are not permitted. This is to prevent memory faults resulting from inadvertent size changes. See the documentation for each ::cffi::Struct method for details about these limitations.
Unions
C unions are wrapped through the ::cffi::Union class analogous to the ::cffi::Struct class for defining C structs. A union is defined in the same manner as a struct, as a list of alternating field names and type declarations. A union named U is then referenced as union.U.
Unlike structs, the content of a union is not well defined and depends on some discriminator outside of the union itself. Moreover, the underlying FFI libraries libffi and dyncall do not directly support passing of unions to and from functions. Thus the union type has certain limitations:
- The union type can only be used as the type of a field in a struct or another union or parameters passed by reference. It cannot be used for parameters passed by value or function return types.
- The -pack option is not available when defining a union.
- Unlike struct values which are dictionaries, the value of a union at the script level is opaque. The encode and decode methods should be used to convert to and from these opaque values.
Below is a usage example using the above union.
Note as above that garbage is returned if a field other than what was stored is retrieved.
If a struct S is defined as
allocating the struct and retrieving field values would be done as
Note the difference between how i is passed versus u. Modifying the union and storing a different field would look like
As in C, care must be taken that the same field is retrieved from a union as was stored in it.
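The role of the external discriminator can be sketched in C as follows. All names here (Number, Tagged and the helper functions) are hypothetical and are not taken from the CFFI examples; the point is only that the enclosing struct records which union member is valid.

```c
#include <assert.h>

/* Hypothetical union: the same storage viewed as an int or a double. */
union Number {
    int    i;
    double d;
};

/* The discriminator lives outside the union, here in the enclosing struct. */
enum NumberKind { KIND_INT, KIND_DOUBLE };

struct Tagged {
    enum NumberKind kind;   /* which member of `num` is currently valid */
    union Number    num;
};

void tagged_set_int(struct Tagged *t, int v)       { t->kind = KIND_INT;    t->num.i = v; }
void tagged_set_double(struct Tagged *t, double v) { t->kind = KIND_DOUBLE; t->num.d = v; }

/* Read the value as a double, consulting the discriminator so that
 * only the member that was last stored is ever read. */
double tagged_as_double(const struct Tagged *t)
{
    assert(t->kind == KIND_INT || t->kind == KIND_DOUBLE);
    return t->kind == KIND_INT ? (double)t->num.i : t->num.d;
}
```

A script that decodes a union value has to apply the same discipline, since nothing in the union itself says which field was stored.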
UUID's
The uuid type maps to the UUID structure on Windows and uuid_t on Unix-like platforms. Although the type could also have been modeled using a struct or bytes array, a separate type allows for a more natural string representation than either of those.
Caution: The string representation used is platform-dependent for compatibility with applications on that platform (COM in particular). When sharing UUID's between platforms, use of the binary form may be preferable. The type tobinary command can be used for this purpose.
Parameters of type uuid can only be passed by reference.
Type aliases
Type aliases provide a convenient way to bind a data type together with one or more annotations. They can then be used in type declarations in the same manner as the built-in types.
In addition to avoiding repetition, type aliases facilitate abstraction. For example, many Windows API's have an output parameter that is typed as a fixed size buffer of length MAX_PATH characters. A type alias OUTPUT_PATH defined as
can be used in function and struct field declarations.
Similarly, type aliases can be used to hide platform differences. For example, in the following function prototype,
SIZE_T is an alias that resolves to either uint or ulonglong depending on whether the platform is 32- or 64-bit.
Various points to note about type aliases:
- A type alias must begin with an alphabetic character, an underscore or a colon. Subsequent characters may be one of these or a digit.
- Type aliases can be nested, i.e. one alias may be defined in terms of another.
- When a type alias is used in a declaration, additional annotations may be specified. These are merged with those included in the type alias definition.
- Aliases in a declaration may also have an array size specified. This will override the array size (if any) specified in the alias itself.
- Type aliases are scoped. If the alias name in a definition is not fully qualified, it is qualified with the name of the current Tcl namespace. If an alias name is not fully qualified on use, it is looked up using the current Tcl namespace as the scope, the global scope and the ::cffi scope in that order.
For convenience, the package provides the ::cffi::alias load command which defines some standard C type aliases like size_t as well as some platform-specific type aliases such as HANDLE on Windows. These are all loaded in the ::cffi scope.
Currently defined type aliases can be listed with the ::cffi::alias list command and removed with ::cffi::alias delete.
Enumerations
Enumerations allow the use of symbolic constants in place of integral values passed as arguments to functions or on assignment to struct fields. Their primary purpose is similar to preprocessor #define constants and enum types in C. They are defined and otherwise managed through the cffi::enum command ensemble. The fragment below provides an example.
Alternatives to the ::cffi::enum define command used above include ::cffi::enum sequence and ::cffi::enum flags which are convenient for defining sequential values and bit masks respectively.
Enumerations can also be used in literal form, where they are directly expressed in the type definition. For example, the cmark_render_html function could also be defined as below without the CMARK_OPTS named enumeration.
When combined with the bitmask annotation, bitmasks can be symbolically represented as a list.
When enumeration types are returned from a function or through output parameters or struct fields, they are returned as integers, not mapped to enumeration member names. The ::cffi::enum name or ::cffi::enum unmask commands can be used to perform the mapping if desired.
For additional commands related to enumerations see ::cffi::enum.
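For readers less familiar with bit masks, the following C sketch shows what treating an integer as a list of flag names amounts to. The flag names and values are hypothetical (they are not the real cmark option values) and the function only models the idea; it is not how the package implements it.

```c
#include <assert.h>
#include <stddef.h>

/* Hypothetical bit flags; each member occupies a distinct bit. */
enum RenderOpt {
    OPT_SOURCEPOS  = 1 << 0,
    OPT_HARDBREAKS = 1 << 1,
    OPT_SAFE       = 1 << 2
};

struct FlagName { int bit; const char *name; };

static const struct FlagName flag_names[] = {
    { OPT_SOURCEPOS,  "OPT_SOURCEPOS"  },
    { OPT_HARDBREAKS, "OPT_HARDBREAKS" },
    { OPT_SAFE,       "OPT_SAFE"       }
};

/* Store in `out` the names of the flags set in `mask`, up to `max`
 * entries, and return how many were stored. */
size_t flags_to_names(int mask, const char **out, size_t max)
{
    size_t n = 0;
    assert(out != NULL);
    for (size_t i = 0; i < sizeof flag_names / sizeof flag_names[0]; i++) {
        if ((mask & flag_names[i].bit) && n < max)
            out[n++] = flag_names[i].name;
    }
    return n;
}
```

Going the other way, a list of names becomes a single integer by OR-ing the corresponding values, for example OPT_SOURCEPOS | OPT_SAFE.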
Functions
To invoke a function in a DLL or shared library, the library must first be loaded through the creation of a ::cffi::Wrapper object. The ::cffi::Wrapper.function and ::cffi::Wrapper.stdcall methods of the object can then be used to create Tcl commands that wrap individual functions implemented in the library.
Calling conventions
The 32-bit Windows platform uses two common calling conventions for functions: the default C calling convention and the stdcall calling convention which is used by most system libraries. These differ in terms of parameter and stack management and it is crucial that the correct convention be used when defining the corresponding FFI.
- The ::cffi::Wrapper.function method should be used for declaring C functions that use the default C calling convention.
- The ::cffi::Wrapper.stdcall method should be used for declaring C functions that use the stdcall calling convention.
Other than use of the two separate methods for definition, there is no difference in terms of the function prototype used for definition or the method of invocation.
Note that this difference in calling convention is only applicable to 32-bit Windows. For other platforms, including 64-bit Windows, stdcall behaves in identical fashion to function.
Function wrappers
The function wrapping methods function and stdcall have the following syntax:
where DLLOBJ is the object wrapping a shared library, FNNAME is the name of the function (and an optional Tcl alias) within the library, RETTYPE is the function return type declaration and PARAMS is a list of alternating parameter names and type declarations. The type declarations may include annotations that control behaviour and conversion between Tcl and C values.
The C function may then be invoked as FNNAME like any other Tcl command.
Return types
A function return declaration is a type or type alias followed by zero or more annotations. The resolved type must not be void or an array including chars, unichars, winchars, binary and bytes. Note pointers to these are permitted.
In the case of string, unistring and winstring types, the script level return values are constructed by dereferencing the returned pointer as character strings. Since the underlying pointer is not available at the script level, the storage it points to cannot be freed. These types should therefore only be used as the return type in cases where freeing is not needed (for example, when the function returns pointers to static strings).
Returning structs from functions is only supported by the libffi backend.
Return annotations
The following annotations may follow the type in a return type declaration.
- The enum annotation may be used for integer types. The integer return value from the function will be returned by the command as the corresponding enumeration member name and as the integer value itself if the enumeration does not have a matching member.
- The bitmask annotation may be used for integer types. This only has effect if the enum annotation is also present. In that case the returned value from the mapped command is a list of enumeration member names matching the bits set in the returned value followed by the original integer value.
- The error checking annotations zero, nonzero, nonnegative, positive may be specified for integer types. If present, any function return value that does not satisfy the annotation will be treated as an error. See Error handling for more.
- The error reporting annotations errno, lasterror, winerror and onerror may be specified for integer types and, with the exception of winerror, for pointer, string, unistring and winstring types. (Remember that string, unistring and winstring are all pointers under the covers.) For integer types they require one of the above error checking annotations to also be present to have effect.
- The unsafe and counted pointer safety annotations may be specified for pointer types. By default, pointers returned from functions are registered as safe pointers. The counted annotation registers them as reference counted safe pointers. Pointers returned with the unsafe annotation are not registered at all.
- The byref annotation can be used with any type when the function return value is a pointer to that type. If specified, the returned pointer is implicitly dereferenced and a value of the target type of the pointer is returned. Note however that the original pointer returned is not accessible at the script level and so this should only be used when that is acceptable, e.g. the pointer is to static or internal storage that does not need to be freed.
- The discard annotation indicates the result of a function be discarded. An empty string is returned instead. This is convenient in the case of functions that return a boolean value indicating success or failure. The discard annotation can then be used with the error checking annotations to either raise an exception in the case of failures or discard the result in case of success.
Parameters
The PARAMS argument in a function prototype is a list of alternating parameter name and parameter type declaration elements. A parameter type declaration may begin with any supported type except void and may be followed by a sequence of optional type annotations.
Input and output parameters
Parameters of a function may be used to pass data to the function (pure input parameters), get data back from the function (pure output parameters) or both. CFFI parameter type declarations denote these with the in, out and inout annotations respectively. If none of these annotations are present, the parameter defaults to an implicit in annotation.
In addition, arguments may be passed to the function either by value or by reference, where a pointer to the value is passed. Parameters that are pure input are normally passed by value. In some cases, functions take even pure input arguments by reference (for example, large structures). In such cases, the CFFI parameter declaration should have the byref annotation to indicate that a pointer to the value should be passed and not the value itself. Note that arrays are always passed by reference in C so array types do not need to be explicitly annotated with byref as they default to that in any case.
In the case of in parameters, at the time of calling the function the argument must be specified as a Tcl value even when the byref annotation is present. Passing a pointer to the value is handled implicitly.
NOTE: In the case of string, unistring and winstring, in parameters correspond to char * and Tcl_UniChar * respectively, while in byref parameters map to char ** and Tcl_UniChar **.
Parameters that are out or inout are always passed by reference irrespective of whether the byref annotation is present or not. The argument to the function must be specified as the name of a variable in the caller's context. For inout parameters, the variable must exist and contain a valid value for the parameter type. For out parameters, the variable need not exist. In both cases, on return from the function the output value stored in the parameter by the function will be stored in the variable. Note that inout cannot be used with string, unistring and winstring types while neither out nor inout can be used with binary.
There are some subtleties with respect to error handling that are relevant to output parameters and must be accounted for in declarations. See Errors and output parameters for more on this.
Output parameters as function result
Many functions return values as pairs with the function return value being a status or error code and the actual function result being returned as an output parameter. In such cases, the retval annotation on the output parameter can be used to return it as the result of the wrapped command.
The retval annotation is subject to the following rules:
- It implies the out and byref annotations and cannot be combined with the in or inout annotations.
- It can be placed on at most one parameter declaration for a function.
- The function return value must be void or an integral type.
- For integral return types, one of the error checking annotations for integer types must also be present. These are used for checking the original return value from the C function as always, and not the parameter output value.
The parameter annotated with retval does not appear in the wrapped command signature (i.e. it is not supplied as an argument in the invocation).
The return value will be the parameter output value only if the function's native return value passes the error checks. Otherwise, an exception is raised as usual.
See Delegating return values for an example of retval usage.
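The C idiom that retval targets can be sketched as below. Both functions are hypothetical examples written for this illustration; they show the status-plus-output-parameter pattern and the check a C caller must write by hand.

```c
#include <assert.h>
#include <stddef.h>

/* Hypothetical C function in the style described above: the return
 * value is a status (0 on success) and the real result is written
 * through the output parameter only on success. */
int parse_digit(char c, int *out)
{
    assert(out != NULL);
    if (c < '0' || c > '9')
        return -1;          /* error: *out is left untouched */
    *out = c - '0';
    return 0;
}

/* What a caller has to write by hand: check the status first and use
 * the output value only if the check passed. */
int digit_or_default(char c, int fallback)
{
    int value;
    if (parse_digit(c, &value) != 0)
        return fallback;
    return value;
}
```

With a zero error check on the return type and retval on the output parameter, the wrapped command would be expected to return the digit directly and raise an exception where this sketch returns a fallback.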
Parameter annotations
The following annotations may follow the type in a parameter type declaration:
- The in, out, retval and inout annotations as described in the previous section.
- The byref annotation specifies that the argument is to be passed by reference (the function actually takes a pointer to the actual value) and not by value. This only has effect for input parameters as parameters with out and inout annotations always have arguments passed by reference irrespective of whether the byref annotation is present or not. Arrays are also always passed by reference even if they are input only.
- The unsafe, counted, dispose and disposeonsuccess annotations may be specified for pointer types. By default, pointer values passed in for in and inout parameters are checked for validity. Conversely, by default out and inout pointers returned from the function are registered as valid safe pointers. Pointer types annotated with counted behave similarly except they are registered as reference counted safe pointers instead of normal safe pointers. On the other hand, the unsafe annotation disables all safety related mechanisms. The arguments are neither checked for validity, nor registered as safe pointers. The dispose and disposeonsuccess annotations are only valid for in and inout parameters. They mark the parameter as holding a pointer that will be freed by the function and cause CFFI to unregister the pointer (modulo reference counting if applicable). The difference between dispose and disposeonsuccess is that the latter will only unregister the pointer if the function returns without any error indication. For more on pointer safety mechanisms, see Pointer safety.
- The enum annotation may be used for integer types. It has an associated argument that specifies an Enum, either a defined name or a dictionary literal. For in and inout parameters, this allows enumeration member names to be used in lieu of integers though the latter are also accepted. For out and inout parameters, the integer value stored by the function is returned to script level as the enumeration member name if a mapping exists and as the original integer otherwise.
- The bitmask annotation may be used for integer types. For in and inout parameters with this annotation, the argument is a list of integer values which are combined with a bit-wise OR operation, the result being passed to the function. If the parameter also has the enum annotation the list may contain enumeration member names as well. Correspondingly, the output values for out and inout are converted to a list of enumeration member names with the last element being the integer value itself. This annotation should be used with enumerations whose values are bit flags.
- The default annotation may be used for pure input parameters. The associated value is passed to the function if an argument is not explicitly supplied. The annotation consists of a list of two elements, the first being the annotation default and the second being the value to use. As for Tcl procs, if a default is specified for a parameter, all subsequent parameters must also have a default specified.
- For in parameters, the nullifempty annotation is available only for types string, unistring, winstring, binary and struct. If present, a NULL pointer is passed into the C function if the passed argument is an empty string in the case of string, unistring and winstring, and an empty dictionary in the case of struct. This facility is useful for API's where NULL pointers signify default options. Note that the binary type always has nullifempty implied even if not explicitly specified. For out and inout parameters, nullifempty may be specified for other types as well. In this case, if the name of the variable passed to hold the output is the empty string, the corresponding function argument pointer is passed as NULL.
- The storeonerror and storealways annotations are only applicable when either out or inout annotations are present. These control storage of output parameters in the presence of errors. See Errors and output parameters.
Structs as parameters
In the case of parameters that are structs, the input argument for the parameter when the function is called should be a dictionary value. Conversely, output parameter results are returned as a dictionary of the same form.
Variable size structs have some restrictions as function return types and parameters. They cannot be the return type for a function and cannot be passed by value. Additionally, when passed by reference they must be in or inout parameters. They cannot be out parameters as the field containing the size of the VLA is part of the struct and must be passed in to the function.
Error handling
C functions generally indicate errors through their return value. Details of the error are either in the return value itself or intended to be retrieved by some other mechanism such as errno.
One way to deal with this at the script level is to simply check the return value (generally an integer or pointer) and take appropriate action. This has two downsides. The first is that error conditions in Tcl are almost always signalled by raising an exception rather than through a return status mechanism so checking status on every call is not very idiomatic. The second, perhaps more important, downside is that the detail behind the error, stored in errno or available via GetLastError() on Windows, is more often than not lost by the time the Tcl interpreter returns to the script level.
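The second downside is easiest to see from the C side, where the rule is to read errno immediately after the failing call. The sketch below uses hypothetical helper names and only standard C library calls.

```c
#include <assert.h>
#include <errno.h>
#include <stdio.h>
#include <string.h>

/* Try to open `path` for reading. On failure, copy errno into *saved
 * immediately, before any other library call can overwrite it.
 * Returns 0 on success and -1 on failure. */
int try_open(const char *path, int *saved)
{
    assert(path != NULL && saved != NULL);
    errno = 0;
    FILE *f = fopen(path, "r");
    if (f == NULL) {
        *saved = errno;
        return -1;
    }
    fclose(f);
    *saved = 0;
    return 0;
}

/* Message for a saved error code, as strerror would report it. */
const char *describe(int saved) { return strerror(saved); }
```

A script cannot perform the equivalent of the `*saved = errno` step itself, because many other C library calls run between the wrapped function returning and the next script command executing.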
Error annotations
Additional sets of type annotations are provided to solve these issues. The first set of annotations is used to define the error check conditions to be applied to function return values. The second set is used to specify how the error detail is to be retrieved.
The following annotations for error checking can be used for integer return types.
zero | The value must be zero. |
nonzero | The value must be non-zero. |
nonnegative | The value must be zero or greater. |
positive | The value must be greater than zero. |
The return value from every call to the function is then checked as to whether it satisfies the condition. Failure to do so is treated as an error condition.
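The four conditions in the table can be modeled in C as a simple predicate. This is only a sketch of the semantics described above, with hypothetical names; it is not the package's implementation.

```c
#include <assert.h>
#include <stdbool.h>

/* Hypothetical names for the four checks described above. */
enum Check { CHECK_ZERO, CHECK_NONZERO, CHECK_NONNEGATIVE, CHECK_POSITIVE };

/* Return true if `value` satisfies `check`; false is the error condition. */
bool check_passes(enum Check check, long long value)
{
    switch (check) {
    case CHECK_ZERO:        return value == 0;
    case CHECK_NONZERO:     return value != 0;
    case CHECK_NONNEGATIVE: return value >= 0;
    case CHECK_POSITIVE:    return value > 0;
    }
    assert(!"unknown check");
    return false;
}
```

Note that zero and nonzero are exact opposites, while nonnegative and positive differ only in how a return value of 0 is treated.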
An error condition is also generated when a function returning a pointer returns a null pointer or, on Windows only, the INVALID_HANDLE_VALUE preprocessor constant. This is also true for string, unistring and winstring return types as well as struct types that are returned by reference since those are all pointers beneath the covers. This treatment of null pointers as errors can be overridden with the novaluechecks annotation. If this annotation is specified and the function returns a NULL pointer,
- for pointer types, the NULL pointer is returned to the caller
- for string, unistring and winstring types, an empty string is returned to the caller
- for struct byref types, a dictionary with default field values is returned. If any field does not have a default specified in the struct definition, an error is raised.
An error condition arising from one of the error checking annotations or a null pointer results in an exception being generated unless the onerror annotation is specified (see below). However, the default error message generated is generic and does not provide detail about why the error occurred. The following error retrieval annotations specify how detail about the error is to be obtained.
errno | The POSIX error is stored in errno. The error message is generated using the C runtime strerror function. Note: This annotation should only be used if the wrapped function uses the same C runtime as the cffi extension. The Tcl errorCode variable is set to a list consisting of CFFI, ERRNO, the POSIX name for the error (e.g. ENOENT), the numeric error code, and the error message. |
lasterror | (Windows only). The error code and message are retrieved using the Windows GetLastError and FormatMessage functions. The Tcl errorCode variable is set to a list consisting of CFFI, WIN32, the numeric error code, and the error message. |
winerror | (Windows only). The numeric return value is itself the Windows error code and the error message is generated with FormatMessage. The Tcl errorCode variable is set to a list consisting of CFFI, WIN32, the numeric error code, and the error message. This annotation can only be used with the zero error checking annotation. |
These annotations can be applied only to integer types except that the errno and lasterror annotations can be used with pointer types as well.
In addition to the above built-in error handlers, the onerror annotation provides a means for customizing error handling when the error is from a library and not a system error. The annotation takes an additional argument which is a command prefix to be invoked when an error checking annotation is triggered. When this command prefix is invoked, a dictionary with the call information is passed. The dictionary contains the following keys:
Result | The return value from the function that triggered the error handler. |
In | A nested dictionary mapping all in and inout parameter names to the values passed in to the called function. |
Out | A dictionary mapping all inout and out parameter names to the values returned on output by the function. These only include output parameters marked as storealways or storeonerror . |
Command | The Tcl command for which the error handler was triggered. This key will not be present if the function was invoked with an address through the ::cffi::call command. |
The result of the handler execution is returned as the function call result and may be a normal result or a raised exception. The handler may use upvar for access to the calling script's context including any input or output arguments to the original function call.
This onerror facility may be used to ignore errors, provide default values as well as raise exceptions with more detailed library-specific information. Note that the use of an onerror handler that returns normally is not the same as not specifying any error checking annotations because the function return is still treated as an error condition in terms of the output variables as described in Errors and output parameters.
NOTE: Although the errno, lasterror, winerror and onerror annotations have effect only with respect to function return values, they can also be specified for parameters and struct fields where they are silently ignored. This is to permit the same type alias (e.g. status codes) to be used in all three declaration contexts.
One final annotation related to error handling is saveerrors. This is provided to deal with functions that do not return an error status but rely on the caller checking errno or GetLastError() after the call. This annotation can be added to any return type. If present, the errno value and GetLastError() value on Windows are saved internally after the CFFI call and can then be retrieved with the ::cffi::savederrors command.
Errors and output parameters
An important consideration in the presence of errors is how the called function deals with output (including input-output) parameters. There are three possibilities:
- The function only writes to the output parameter on success
- The function always writes to the output parameter
- The function only writes to the output parameter on error, for example an error code.
The distinction is particularly crucial for non-scalar output. Output parameters that have not been written to may result in corruption or crashes if the memory is accessed for conversion to Tcl script level values.
By default, script level output variables are only written to when the error checks pass (including the case where none are specified). This is the first case above. If the storealways
annotation is specified for a parameter, it is stored irrespective of whether an error check failed or not. This is the second case. Finally, the storeonerror
annotation targets the third case. The output parameter is stored only if an error check fails.
Note that an error checking annotation must be present for any of these to have an effect.
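The three storage behaviours reduce to a small decision table, modeled in the C sketch below. The enum and function names are hypothetical and the code illustrates the rules as stated in this section rather than the package's internals.

```c
#include <assert.h>
#include <stdbool.h>

/* Hypothetical names for the three storage behaviours described above. */
enum StorePolicy {
    STORE_ON_SUCCESS,   /* the default */
    STORE_ALWAYS,       /* storealways */
    STORE_ON_ERROR      /* storeonerror */
};

/* Decide whether an output variable should be written, given the policy
 * and whether the error check on the return value failed. */
bool should_store(enum StorePolicy policy, bool check_failed)
{
    switch (policy) {
    case STORE_ON_SUCCESS: return !check_failed;
    case STORE_ALWAYS:     return true;
    case STORE_ON_ERROR:   return check_failed;
    }
    assert(!"unknown policy");
    return false;
}
```

Choosing the policy that matches what the C function really does matters most for non-scalar outputs, where converting memory that was never written can crash.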
Varargs functions
Some C functions take a variable number of arguments. Wrappers for these are defined as for normal functions except that the last parameter definition should be the string ... (three periods). In addition, when the wrapped command is invoked, each varargs argument should be passed as a pair consisting of a type declaration and a value. This is illustrated in the example below wrapping the snprintf function.
Note how the varargs arguments are passed as type, value pairs. Just as when programming in C, extreme care has to be taken to pass the right argument types.
Varargs arguments have certain restrictions:
- They can only be input arguments so the annotations out, inout and retval are not allowed.
- Integer types must be of size at least that of int. Floating point types must be double. These come from C type promotion rules for varargs functions.
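The promotion rules behind the last restriction can be seen in plain C. In this sketch the wrapper function name is hypothetical; snprintf is the standard C library function.

```c
#include <assert.h>
#include <stdio.h>

/* Format a short, an int and a float into `buf`. In the variadic part
 * of the call, `s` is promoted to int and `f` to double by the default
 * argument promotions, so %d and %f are the matching conversions.
 * Returns the number of characters that the full output requires. */
int format_values(char *buf, size_t size, short s, int i, float f)
{
    assert(buf != NULL && size > 0);
    return snprintf(buf, size, "%d %d %.1f", s, i, f);
}
```

Since the callee only ever receives int-sized or larger integers and doubles, a wrapper that tried to pass a narrower integer or a float through the variadic part would not match what the function reads.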
Prototypes and function pointers
The function wrapping methods function and stdcall described earlier bind a function type definition consisting of the return type and parameters with the address of a function as specified by its name. For some uses, it is useful to be able to specify the function type information independently of the function address. The ::cffi::prototype function and ::cffi::prototype stdcall commands are provided for this purpose. They take a very similar form to the corresponding methods:
where RETTYPE and PARAMS are as described in Function wrappers. The commands result in the creation of a function prototype NAME which can be used as a tag for pointers to functions. The ::cffi::call command can then be used to invoke the pointer target.
For example, consider the following C fragment
This would be translated into CFFI as
Interfaces
Interfaces in CFFI are intended to wrap COM interfaces in Windows and are described in that context below. However, they can be used for similar structures outside of COM and on other platforms. The underlying C type is a structure whose first field is a pointer to a table of functions, the vtable (virtual function table). Each of these functions accepts a pointer to the structure as its first parameter. The methods are invoked through this virtual table. A limited form of inheritance is supported in this model and also exposed through CFFI.
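The C layout just described can be sketched as follows. The Counter type and its functions are hypothetical and deliberately simpler than a real COM interface (no QueryInterface, no stdcall); the method names AddRef and Release are borrowed only to show reference counting through a vtable.

```c
#include <assert.h>
#include <stdlib.h>

struct Counter;   /* forward declaration of the object type */

/* Table of function pointers; every method takes the object first. */
struct CounterVtbl {
    unsigned (*AddRef)(struct Counter *self);
    unsigned (*Release)(struct Counter *self);
};

/* The object: its first field points to the vtable. */
struct Counter {
    const struct CounterVtbl *vtbl;
    unsigned refs;
};

static unsigned counter_addref(struct Counter *self) { return ++self->refs; }

static unsigned counter_release(struct Counter *self)
{
    assert(self->refs > 0);
    unsigned remaining = --self->refs;
    if (remaining == 0)
        free(self);         /* last reference gone */
    return remaining;
}

static const struct CounterVtbl counter_vtbl = { counter_addref, counter_release };

/* Create an object holding one reference, or NULL on allocation failure. */
struct Counter *counter_new(void)
{
    struct Counter *c = malloc(sizeof *c);
    if (c != NULL) {
        c->vtbl = &counter_vtbl;
        c->refs = 1;
    }
    return c;
}
```

A call such as `c->vtbl->AddRef(c)` shows why the order of entries in the table is significant: methods are located by position in the vtable, not by name.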
An interface in CFFI is defined with ::cffi::Interface. The COM IUnknown interface for example would be defined as
This creates the IUnknown interface as an ensemble command to which methods can be added with either methods for functions that use the _cdecl calling convention or stdmethods for functions that use the _stdcall calling convention. COM uses the latter, so in CFFI the methods can be defined as
Points to note:
- The order of method definitions must be exactly the same as in the corresponding COM interface definition.
- The first parameter to the method is not explicitly declared in each method. It is implicitly typed as a pointer with a tag being the interface name, pointer.IUnknown in the above example.
- The name of the method for releasing the COM object should be passed with the -disposemethod option. COM interface pointers are generally reference counted and should be annotated with counted in CFFI. Correspondingly, the implicit pointer in the method indicated by -disposemethod will be annotated with dispose indicating the reference count on the pointer registration should be decremented.
The above will result in definition of Tcl commands of the form InterfaceName.MethodName, e.g. IUnknown.QueryInterface, IUnknown.AddRef and so on. The first parameter to any method should be a pointer to the COM object, the remaining being the parameters as defined in the method definition.
Interface inheritance
In some cases, one interface may inherit from another. In CFFI, this is indicated using the -inherit option to Interface. Additional methods can be defined on the derived interface in the usual fashion.
Methods defined in the inherited interface, IUnknown.AddRef etc., as well as methods defined in IDispatch can be invoked on objects supporting the IDispatch interface.
Example:
TBD
Callbacks
Some C functions take a parameter that is a pointer to a function that is then invoked by the called outer function, often in iterative fashion passing elements of some data set in turn. Wrapping such functions involves the following steps:
- Definition of a prototype as described in the previous section. This must match the declaration of the callback function. There are certain restrictions placed on the parameter types that can be used with callbacks. These are listed in the ::cffi::callback reference.
- Definition of the outer function with the callback parameter type set as a pointer to the function
- Creation of the callback function pointer via the ::cffi::callback command that wraps a Tcl command that should be invoked as the callback
- Invoking the outer function
- Freeing the callback function pointer with ::cffi::callback free when no longer needed. Note it may be used multiple times before freeing.
Callbacks cannot take a variable number of arguments.
Warning: CFFI callbacks can only be used when the called function invokes them before returning. They are not suitable in cases where the callback is called at a later time after the function returns. Doing so will likely result in a crash.
Use of callbacks is illustrated below for the ftw function available on some platforms to iterate through files and directories. The C declaration of the function is
The second argument fn to this function is a pointer to a callback function that will be called for every file under the directory specified by the first argument.
To wrap this function with CFFI, first a prototype is defined that matches the declaration for the fn parameter.
Then the ftw function is itself wrapped with the callback argument referencing the prototype.
Next the callback function pointer is created.
The ftw function can then be invoked with this callback function pointer.
Finally, the callback pointer can be freed assuming we will not need it again.
It is useful to know that the callback command is invoked in the Tcl context from which the outer function was invoked. For example, if we wanted to collect file names instead of printing them out, we could collect them in a variable.
The above example also shows that the second argument to cffi::callback is a command prefix, not necessarily a single-word command, to which the arguments from the callback invocation itself are appended.