Justification for usecase grouping

Axes

When Is Link Time

static linking, load time, dlopen.

All load-time use cases can be “promoted” to dlopen. For the table, I will not consider static linking, as this is already possible.

Symbol Presence

Required: all expected symbols are assumed to be present, allowed to crash otherwise.
Optional: Behavior changes depending on the presence of symbols.

TODO: Consider merging this with pairing trust, as they are highly correlated.

The implementation of weak behavior depends on the linking time. Symbol optionality implies that the linked binary may vary, but it may vary only within a fixed set of versions (such as glibc versions). For the combination of dlopen and strong symbols, the loaded module can assume that its environment will provide the symbols; the symbols provided by the loaded module are mediated by the handle anyway.
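The difference between required and optional symbols can be sketched without a real dynamic loader. The `SymbolTable` type, the `Transform` signature, and the symbol name `fast_square` are all hypothetical stand-ins, chosen only to show how optional presence turns into a value that the program has to branch on.

```rust
use std::collections::HashMap;

/// Hypothetical signature shared by every symbol in this sketch.
pub type Transform = fn(u32) -> u32;

/// Stand-in for the symbol table of the linked binaries (an assumption,
/// not a real loader API).
pub struct SymbolTable {
    symbols: HashMap<&'static str, Transform>,
}

impl SymbolTable {
    pub fn from_entries(entries: &[(&'static str, Transform)]) -> Self {
        SymbolTable { symbols: entries.iter().copied().collect() }
    }

    /// Required: the symbol is assumed to be present; absence panics.
    pub fn required(&self, name: &str) -> Transform {
        match self.symbols.get(name) {
            Some(f) => *f,
            None => panic!("missing required symbol `{name}`"),
        }
    }

    /// Optional: absence is an ordinary value the caller has to handle.
    pub fn optional(&self, name: &str) -> Option<Transform> {
        self.symbols.get(name).copied()
    }
}

/// Behavior changes with symbol presence: prefer `fast_square` when the
/// environment provides it, otherwise fall back to a local implementation.
pub fn square(table: &SymbolTable, x: u32) -> u32 {
    match table.optional("fast_square") {
        Some(f) => f(x),
        None => x * x,
    }
}
```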

Pairing trust

[!IMPORTANT] See next section for components, this section is being reworked

Different deployments have different relationships with their environment. Some applications fully control all the attendant binaries, and can rely on the binaries to be the expected ones. Others, like plugins or game mods, have no guarantees on which binaries are referenced, and need a verification method. Linux package managers fall in between: they have a full view of installed binaries, library dependencies, and versions, but often want to support externally packaged programs (to a point). This has safety implications, as a trusted environment can canonize one compiler version and precise library versions.

In the table, this is the “pairing” column.

An untrusted pairing means that the compiler cannot guarantee anything about the binaries, other than that they are compiled with a Rust compiler. In the case of an untrusted pairing, some validation must be performed at link-time. Which binaries are compatible in the untrusted case depends on the stability and extensiveness of the ABI.

Exact

All the modules are compiled together, and the compiler has a full view of all the code in every module. If a module is swapped out for a different version, it can be assumed that every module depending on it is also replaced. This also means that all modules use the same compiler version, and a stable ABI is not needed.

Safety can be solved easily. The global view means that the compiler can use the existing safety checks. As all modules are compiled together, the symbols can be supplemented by a build ID or source code hash to ensure that the loaded module is the expected version. This also means that compiler unchecked contracts are consistent with programmer expectation.
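A minimal sketch of the build ID check, assuming each module carries an 8-byte identifier next to its entry point. `BuildId`, `Module`, and `link_exact` are hypothetical names; a real scheme would more likely fold the identifier into the mangled symbols.

```rust
/// Hypothetical identifier that every module of one build would embed.
#[derive(Debug, Clone, Copy, PartialEq, Eq)]
pub struct BuildId(pub [u8; 8]);

/// Error reported when a module comes from a different build.
#[derive(Debug, PartialEq, Eq)]
pub struct Mismatch {
    pub expected: BuildId,
    pub found: BuildId,
}

/// Stand-in for a loaded module: its embedded build ID plus one entry point.
pub struct Module {
    pub build_id: BuildId,
    pub entry: fn() -> u32,
}

/// Hands out the entry point only if the module was produced by the same
/// build as the caller.
pub fn link_exact(expected: BuildId, module: &Module) -> Result<fn() -> u32, Mismatch> {
    if module.build_id == expected {
        Ok(module.entry)
    } else {
        Err(Mismatch { expected, found: module.build_id })
    }
}
```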

Bounded

Often the runtime environment is versioned such that programs are backwards compatible between (minor) versions. In this case, the programmer can assume that loading libpng will yield a set of functions compatible with the expected signatures. There is additionally the desire to update the dependencies to compatible versions without recompiling all dependents.

This creates more of a challenge than exact matching. The symbols should contain enough information to ensure soundness, but ideally no information that would differ between compatible versions. Such a balance is challenging to strike; this document will assume that full soundness should be maintained, and compatibility is best-effort with potential warnings when breaking compatibility.

A more detailed explanation of what various package managers and distributions consider secure will follow.

There are a variety of techniques that can be employed in this case.

Unbounded

In the unbounded case, no environmental guarantee can be assumed. This happens when components may be interchanged arbitrarily without a mechanistic versioning system. This is the case with game mods, libraries as configuration, and many more.

By loosening the environmental guarantees, it becomes harder to preserve safety while allowing all desirable behavior. Namely, the compiler version cannot be pinned, and multiple versions of a library may be loaded at the same time.

The solution to the compiler version issue is to use a stable ABI, such as repr(C) or crABI.

Multiple versions of a library being present is already a problem that the Rust compiler deals with, namely when a program depends on two libraries that depend on the same library, but with incompatible version bounds. In these cases, all the types originating from those libraries are tagged with the crate version that defines them, and are considered incompatible (even if they are structurally the same). This is such a common occurrence that the semver trick was developed. For dynamically loaded dependencies, all types resulting from a different loaded unit should be considered incompatible by default. Some libraries may commit to stable internal ABIs for some types, and some types may be considered ABI stable by default (such as structs with only public fields that are not marked unsafe). This becomes tricky in the dlopen case, as the type should be parameterized by the handle.
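One way to parameterize a type by its originating handle is an invariant "brand" lifetime, a technique borrowed from existing Rust libraries rather than something this document prescribes. In the sketch below, `with_module` stands in for dlopen, and `Handle`, `Widget`, and `Brand` are hypothetical names.

```rust
use std::marker::PhantomData;

/// Invariant lifetime used purely as a compile-time brand.
type Brand<'id> = PhantomData<fn(&'id ()) -> &'id ()>;

/// Stand-in for a `dlopen` handle; `version` only makes the demo observable.
pub struct Handle<'id> {
    version: u32,
    _brand: Brand<'id>,
}

/// A type defined by the loaded module, tied to the handle that produced it.
pub struct Widget<'id> {
    value: u32,
    _brand: Brand<'id>,
}

/// Simulates loading a module. Every call gives the closure a handle with a
/// fresh brand, so values from two loads have distinct types.
pub fn with_module<R>(version: u32, body: impl for<'id> FnOnce(Handle<'id>) -> R) -> R {
    body(Handle { version, _brand: PhantomData })
}

impl<'id> Handle<'id> {
    pub fn make(&self, value: u32) -> Widget<'id> {
        Widget { value, _brand: PhantomData }
    }

    /// Accepts only widgets created through this same handle.
    pub fn read(&self, widget: &Widget<'id>) -> (u32, u32) {
        (self.version, widget.value)
    }
}
```

Inside nested `with_module` calls, passing a widget from the outer handle to `read` on the inner handle is rejected by the compiler, because the two brand lifetimes cannot be unified.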

It may be tempting to include all such type information in the symbol, and refuse to match functions with a different ABI, but this would go against prior expectations. The concept of an “opaque” (pointer) type is commonly understood in the C/C++ communities, and often used for forwards-compatibility. The idea is that a type is described only nominally, and can only be interacted with by passing it to functions of the library. As such, the client code does not need any knowledge of the internals of the type, and can use any version of the library, even if the internal representation of the type changes radically. The danger is that if multiple versions of the library are loaded, the opaque type is not interchangeable between them.
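The opaque-type idiom can be sketched in Rust as follows. The `library` module plays the role of the shared object; `Counter`, `CounterImpl`, and the three `counter_*` functions are hypothetical. The client sees only the zero-sized `Counter` declaration, so `CounterImpl` can change freely between library versions.

```rust
mod library {
    /// Private representation; free to change between library versions.
    struct CounterImpl {
        count: u64,
        step: u64,
    }

    /// Client-visible declaration: a name without a usable layout.
    #[repr(C)]
    pub struct Counter {
        _opaque: [u8; 0],
    }

    pub extern "C" fn counter_new(step: u64) -> *mut Counter {
        Box::into_raw(Box::new(CounterImpl { count: 0, step })).cast()
    }

    /// # Safety
    /// `counter` must come from `counter_new` and not have been freed.
    pub unsafe extern "C" fn counter_bump(counter: *mut Counter) -> u64 {
        // SAFETY: per the contract, this points to a live `CounterImpl`.
        let inner = unsafe { &mut *counter.cast::<CounterImpl>() };
        inner.count += inner.step;
        inner.count
    }

    /// # Safety
    /// `counter` must come from `counter_new` and must not be used afterwards.
    pub unsafe extern "C" fn counter_free(counter: *mut Counter) {
        // SAFETY: ownership is handed back to the library that allocated it.
        drop(unsafe { Box::from_raw(counter.cast::<CounterImpl>()) });
    }
}
```

If two versions of such a library were loaded at once, a pointer created by one `counter_new` and passed to the other version's `counter_bump` would be reinterpreted with the wrong layout, which is the danger described above.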

This danger also presents itself in the case of normal dynamic loading when not all symbols are guaranteed to be present. If both versions of the library export the exact same symbols, one version would shadow the other and soundness would be preserved. If the client uses a symbol exported by only one version of the library, then the ordering of loading symbols matters. If both versions export symbols that the other does not (as is the case with some crate features), then ordering is not enough to disambiguate the opaque types. The trouble is that the symbols can neither be versioned nor unversioned. Versioning the symbols would force all clients to choose a specific version of the library, and lose the utility of opaque types. Not versioning the symbols would cause ambiguity between different implementations of the opaque type.

Insecure

Programs may load modules from an untrusted source, while maintaining the security of the environment. As such, the client cannot assume that the module was compiled correctly or does not perform undesirable behavior. This category is known as “secure linking” and the effect can be achieved in multiple ways.

If the runtime has access to the source code, the client can guarantee that it is compiled correctly, namely by compiling it itself. This technique is used in interpreted languages (where the source code is available anyway) such as a web-browser loading arbitrary JavaScript. The Java and C# runtimes both check and compile any module that is loaded, although the intermediate bytecode is used, rather than the original source code. Both of these runtimes use their verification knowledge to guarantee sandboxing and security of loaded modules. This process is slow, and puts restrictions on the optimizations that can be performed, as time spent optimizing adds latency to starting the program.

The compilation and verification steps can also be split by using proof-carrying code (PCC). With PCC, the compiled code is complemented by a proof that the compiled code is “correct”. The client then only needs to confirm the given proof, and then the code can be executed unmodified. This technique has seen some academic interest, and some industrial use in speeding up WASM verification.

Rather than verify that the loaded module satisfies some bounds, the actions can be restricted by the runtime. Operating system processes are the biggest example of this. A process is allowed to perform arbitrary actions on its own memory, but is restricted from damaging other processes. For this, the operating system uses hardware support (such as the memory protection unit) to efficiently restrain user code. A userspace application can also make use of this protection by forking untrusted code to another process with fewer privileges.

Pairing issues

The basic principle with pairing is that a program should fail to link if it would have failed to compile. This means that any type error or borrowchecker violation should be prohibited. In principle, this can be solved by compiling the programs together or by providing the complete library code when compiling programs. Such a regime may be acceptable to some—but would prohibit loading newer versions of a library or loading novel modules. This section enumerates the fault lines along which loaded modules may want to differ from the strict expectation.

Compiler version

If the same compiler version can be guaranteed, many problems that are out of scope for my thesis are avoided, most notably the need for a stable ABI.

Lifetimes

Although the types should match exactly (in accordance with nominal typing), the lifetimes of functions may differ slightly between implementations. A newer version of a library may relax the lifetime restrictions on functions. Such relaxations are backwards-compatible changes in both the API and ABI, so they would ideally also result in compatible linking.

This can be solved in three ways.

No lifetime changes: exact matching in code version.
Common happy medium: a header file (or other format) provides a maximal expectation and minimal obligation. May cause problems in edge cases.
Link-time verification: every program that dynamically loads libraries ships with a (minimal version of) borrowck.
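The claim that relaxing lifetimes is backwards-compatible can be checked with plain function pointers. In this sketch, `StrictPick` plays the role of the signature an old client was compiled against and `RelaxedPick` the signature of a newer library; both aliases and `pick_first` are hypothetical. The compiler accepts the relaxed pointer where the strict one is expected, but not the reverse.

```rust
/// Old signature: both inputs must share one lifetime.
pub type StrictPick = for<'a> fn(&'a str, &'a str) -> &'a str;

/// New signature: the second input no longer constrains the result.
pub type RelaxedPick = for<'a, 'b> fn(&'a str, &'b str) -> &'a str;

/// Newer implementation with the relaxed lifetimes.
pub fn pick_first<'a, 'b>(first: &'a str, _second: &'b str) -> &'a str {
    first
}

/// A client built against the old signature can still use the new function:
/// the relaxed pointer type is a subtype of the strict one.
pub fn as_strict(relaxed: RelaxedPick) -> StrictPick {
    relaxed
}
```

This corresponds to the "maximal expectation and minimal obligation" option: the client may assume no more than `StrictPick`, while the library has to provide at least `RelaxedPick`.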

Internal representation of a type

Many aspects of a type may change in API-compatible ways; we will term such changes “implementation details”. These can include changing a private field or changing the alignment (unless explicitly specified). Although these changes require no modification to the source code using them, the binary instructions are usually incompatible.

In some cases these changes are a loosening of the contract, like requiring a smaller alignment; others require different handling of the types. Such incompatibilities require special preparation if a binary wants to be compatible with both. Many of these aspects are described in this Swift ABI post. Primarily, the more opaque the pointer used, the more compatible the ABI can be.

Module matching

Even if the program itself never relies on any implementation details and handles only opaque pointers, incompatibilities may still leak through the namespace. A program may have two versions of the library loaded that are API compatible (and should compile together) but expect different internal representations. Such an issue is likely to arise when loading modules that vendor their own dependencies. As such, unsafe types used in different modules should be considered incompatible by default. This becomes especially problematic when a global has an unsafe type.

The matching problem can be solved by parameterizing an unsafe type by the handle that originated it when using dlopen, but load-time linking is a more problematic situation. As all functions are put into the global namespace, the function signatures lose their module provenance. One may hope that all functions related to a type originate from the same binary, but this is a slim hope. Libraries may statically include their functions on said type, or a new version of the library may have added new functions not shadowed by an old version.

Generics

If monomorphization is used, the body of a generic function becomes part of the interface. This may seem overly restrictive, but drawing a looser line would be highly error-prone. As a generic function often calls non-public functions, those functions cannot change between versions. The implementation of trait-defined functions may also rely on implementation details, as described above. If this problem is circumvented by using dynamic dispatch on those functions, the benefit of monomorphization largely disappears, and the whole function can be dynamically dispatched.
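A small sketch of why the body of a generic function leaks into the interface. `Shape`, `Square`, and `round_to_cents` are hypothetical. With `total_mono`, each client crate compiles its own copy of the body, including the call to the private helper; with `total_dyn`, only the library contains the body and the client merely calls a symbol.

```rust
pub trait Shape {
    fn area(&self) -> f64;
}

pub struct Square(pub f64);

impl Shape for Square {
    fn area(&self) -> f64 {
        self.0 * self.0
    }
}

/// Private helper: invisible in the API, yet its behavior is baked into
/// every client that instantiates `total_mono`.
fn round_to_cents(x: f64) -> f64 {
    (x * 100.0).round() / 100.0
}

/// Monomorphized: the body is instantiated in the client for each `S`.
pub fn total_mono<S: Shape>(shapes: &[S]) -> f64 {
    round_to_cents(shapes.iter().map(S::area).sum::<f64>())
}

/// Dynamically dispatched: one copy of the body lives in the library.
pub fn total_dyn(shapes: &[&dyn Shape]) -> f64 {
    round_to_cents(shapes.iter().map(|s| s.area()).sum::<f64>())
}
```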

Traits

Most “non-major breaking” changes are safe with object code. The “possibly-breaking” changes are due to the compiler inferring different implementations of the function, but the object code has already done such disambiguation.

Dynamic dispatch

Other issues can creep in. Adding a function changes the vtable order, breaking compatibility unless the dynamic dispatch ABI matches on the function name. This can be solved in three ways. The first is to disallow new methods to be defined on traits, breaking ABI significantly more often than API. An alternative is that the dispatch table is treated as an opaque pointer, and is only dereferenced in the library (possibly through stubs). Lastly, the compiler can generate different functions and vtables for older versions, increasing binary size.
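The effect of vtable ordering can be illustrated with a hand-written dispatch table. This is not how rustc lays out vtables; `VTable`, `V1`, `V2`, and the method names are hypothetical and only contrast positional lookup with lookup by name.

```rust
pub type Method = fn(u32) -> u32;

/// Hypothetical dispatch table exported by a library: name/function pairs.
pub struct VTable {
    entries: &'static [(&'static str, Method)],
}

impl VTable {
    /// Positional lookup, as in a conventional vtable.
    pub fn by_index(&self, index: usize) -> Option<Method> {
        self.entries.get(index).map(|entry| entry.1)
    }

    /// Lookup by method name: unaffected by reordering or insertion.
    pub fn by_name(&self, name: &str) -> Option<Method> {
        self.entries.iter().find(|entry| entry.0 == name).map(|entry| entry.1)
    }
}

fn inc(x: u32) -> u32 { x + 1 }
fn double(x: u32) -> u32 { x * 2 }
fn reset(_x: u32) -> u32 { 0 }

/// Version 1 of the table.
pub static V1: VTable = VTable {
    entries: &[("inc", inc as Method), ("double", double as Method)],
};

/// Version 2 adds `reset` in front, shifting every index.
pub static V2: VTable = VTable {
    entries: &[
        ("reset", reset as Method),
        ("inc", inc as Method),
        ("double", double as Method),
    ],
};
```

A client that hard-coded index 0 for `inc` silently calls `reset` against `V2`, while the name-based lookup keeps resolving to `inc` at the cost of a search.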

Unsafe functions

Making an unsafe function safe is not an API-breaking change, but would change the symbol of the function. If unsafeness is encoded as a simple boolean, simply exporting safe functions with two symbols (a safe and unsafe one) would solve the problem. Otherwise, if unsafe function symbols are burdened with additional information, exporting a wrapper would suffice.
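The wrapper approach might look like the sketch below. `byte_at` and `byte_at_unsafe_compat` are hypothetical names, and how the two functions map to exported symbols is left open. The last function shows the API-level compatibility: a safe function can be used wherever an unsafe function pointer of the same signature is expected.

```rust
/// Current version: safe, because the bound is checked (out-of-range
/// indices panic instead of causing undefined behavior).
pub fn byte_at(data: &[u8], index: usize) -> u8 {
    data[index]
}

/// Wrapper kept for clients built against the older, `unsafe` declaration.
///
/// # Safety
/// The old contract required `index < data.len()`. The wrapper no longer
/// depends on it, so any call that was valid before remains valid.
pub unsafe fn byte_at_unsafe_compat(data: &[u8], index: usize) -> u8 {
    byte_at(data, index)
}

/// Pointer type matching the old, unsafe declaration.
pub type OldByteAt = unsafe fn(&[u8], usize) -> u8;

/// The safe function coerces to the old pointer type without a wrapper.
pub fn as_old_pointer() -> OldByteAt {
    byte_at
}
```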

Changing the unsafe requirements/guarantees. TODO: ?hash doc comment? TODO: contracts?

Unloading

Soundly unloading a module requires effort from both the loader and the loaded. TODO: loader and loaded are ambiguous phrasing. The running program must guarantee that it contains no references into the unloaded namespace. The module to be unloaded cannot assume that every part of the program will remain available.

A module that is to be unloaded must restrict itself in various ways. The 'static lifetime cannot be used, which includes “constant promotion” (a reference to a constant value is implicitly of 'static lifetime). Global variables can be destructed, but this requires modifying the compiler. Thread-local variables cannot have non-trivial destructors, as the destructors are expected to be called at the end of the thread, but at that point the destructors are no longer available.

When a program unloads a module, it must also restrict itself. The program cannot have any references into the module. This includes non-trivial destructors. The program should also not have any references into memory allocated by the module, unless the use of the same allocator can be guaranteed.
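The rule that no references into the module may survive unloading can be partly expressed with lifetimes. In this sketch, `Module` stands in for a dlopen handle and `Symbol` borrows from it; both are hypothetical, and the table of function pointers replaces real symbol lookup. It covers only function references, not destructors or allocator sharing.

```rust
use std::marker::PhantomData;

/// Stand-in for a `dlopen` handle; consuming it plays the role of `dlclose`.
pub struct Module {
    table: Vec<(&'static str, fn(u32) -> u32)>,
}

/// A function obtained from a module. The borrow of the module means a
/// symbol cannot outlive the handle it came from.
pub struct Symbol<'m> {
    func: fn(u32) -> u32,
    _module: PhantomData<&'m Module>,
}

impl Module {
    pub fn load(table: Vec<(&'static str, fn(u32) -> u32)>) -> Self {
        Module { table }
    }

    pub fn symbol<'m>(&'m self, name: &str) -> Option<Symbol<'m>> {
        self.table
            .iter()
            .find(|entry| entry.0 == name)
            .map(|entry| Symbol { func: entry.1, _module: PhantomData })
    }

    /// Taking `self` by value ensures no `Symbol` is still borrowed.
    pub fn unload(self) {}
}

impl Symbol<'_> {
    pub fn call(&self, x: u32) -> u32 {
        (self.func)(x)
    }
}
```

Calling `unload` while a `Symbol` is still in use is a compile error, since the module would be moved while borrowed.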

Old Table

Pairing	Link time	Symbols	Unloading	Use-case	Existing Solution
Exact	Load	Strong	No	Managed world (`rustup`)	Extant symbol mangling
Exact	Load	Strong	Yes	N/A: unloading requires `dlopen`	∅
Exact	Load	Weak	No	N/A: Exact incompatible with weak	∅
Exact	Load	Weak	Yes	N/A: unloading requires `dlopen`; Exact incompatible with weak	∅
Exact	`dlopen`	Strong	No	In-tree modules (tree-sitter grammars)	Extant symbol mangling
Exact	`dlopen`	Strong	Yes	In-tree modules (kernel modules)	Extant symbol mangling
Exact	`dlopen`	Weak	No	N/A: Exact incompatible with weak	∅
Exact	`dlopen`	Weak	Yes	N/A: Exact incompatible with weak	∅
Bounded	Load	Strong	No	Unmanaged binaries using system libraries, library multi-versioning	Export RFC
Bounded	Load	Strong	Yes	N/A: unloading requires `dlopen`	∅
Bounded	Load	Weak	No	Feature selection/detection	???
Bounded	Load	Weak	Yes	N/A: unloading requires `dlopen`	∅
Bounded	`dlopen`	Strong	No	Lazy loading dependencies, library multi-versioning	???
Bounded	`dlopen`	Strong	Yes	FreeBSD kernel modules	???
Bounded	`dlopen`	Weak	No	???	???
Bounded	`dlopen`	Weak	Yes	???	???
Unbounded	Load	Strong	No	Library interposition/replacement	???
Unbounded	Load	Strong	Yes	N/A: unloading requires `dlopen`	∅
Unbounded	Load	Weak	No	`LD_PRELOAD` as behavior selection	???
Unbounded	Load	Weak	Yes	N/A: unloading requires `dlopen`	∅
Unbounded	`dlopen`	Strong	No	N/A: a failure to load should not panic the whole process automatically	∅
Unbounded	`dlopen`	Strong	Yes	N/A: a failure to load should not panic the whole process automatically	∅
Unbounded	`dlopen`	Weak	No	Native plugins, no unloading	???
Unbounded	`dlopen`	Weak	Yes	Linux kernel modules	???
Insecure	Load	Strong	No	???	???
Insecure	Load	Strong	Yes	???	???
Insecure	Load	Weak	No	???	???
Insecure	Load	Weak	Yes	???	???
Insecure	`dlopen`	Strong	No	???	???
Insecure	`dlopen`	Strong	Yes	???	???
Insecure	`dlopen`	Weak	No	???	???
Insecure	`dlopen`	Weak	Yes	???	???

Unloading a module only makes sense with dlopen. If a module loaded at startup needs to be unloaded at some later point, an explicit handle is needed. Calling dlopen with a filename that does not contain slashes will use the same search behavior as standard loading.

Weak symbols only make sense with untrusted or bounded pairings. A change in which symbols are available would imply that the binaries are not matched exactly to each other.

The Linux kernel does not provide any stability guarantees for out-of-tree kernel modules; therefore it is placed with the exact binary matching requirement. Any kernel API (with the notable exception of syscalls) is allowed to change at any point, and kernel modules must be (re-)compiled against the headers of the current version. FreeBSD kernel modules are binary compatible within the same minor version.

Header files

TODO: include pub non-export bodies.

Header files are used to separate the interface and the implementation. The header file describes the functions and globals available to other translation units. When compiling a dependency, the header file serves to establish that the public interface matches what the module implements. When compiling a dependent, the header file provides enough information to compile the module, with the promise that the implementation will follow. Often, header files require the programmer to repeat information (namely the signatures of functions) that already exists in the source file. This is both an annoyance (a header file can often be derived from a source file, if the publicness of a symbol is known) and a guard (changing the API/ABI requires modifying two places). Some languages automatically derive header files from source code, with the risk of unexpected changes in the interface.

Header files offer three advantages for separate compilation. Firstly, by relegating the ABI spec to a separate (set of) file(s), the compilation of a dependent can happen without knowledge of the implementation source. Secondly, a header file is (often) a commitment to a certain ABI, with changes to it being readily apparent. Thirdly, a header can include a compromise of the interface. This is especially useful for lifetimes, which are brittle and a partial ordering rather than exact matching. A header file can specify the minimum lifetime that the dependency must uphold, and the maximum lifetime that the dependent may assume.

A breaking API change is always considered a breaking ABI change

Different API signatures may have the same binary representation. One example would be changing a u8 to an Option<NonZero<u8>>. These are guaranteed to have the same representation and can be used interchangeably in FFI. They are however not API compatible, as the type wrapper must be constructed explicitly. I would consider this change to also break ABI, as the expectation of usage has changed. This is a specific version of the structural/nominal typing debate, and as Rust uses nominal typing I believe that the ABI should too.
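The u8 example can be made concrete. The sketch uses `NonZeroU8`, the standard-library alias for `NonZero<u8>`; the function names are hypothetical. At the API level an explicit conversion is required in both directions, while at the binary level the same byte is merely reinterpreted.

```rust
use std::num::NonZeroU8;

/// API-level conversion: the wrapper has to be constructed explicitly.
pub fn to_wrapped(raw: u8) -> Option<NonZeroU8> {
    NonZeroU8::new(raw)
}

/// API-level conversion back to the plain integer (`None` maps to 0).
pub fn to_raw(wrapped: Option<NonZeroU8>) -> u8 {
    wrapped.map_or(0, NonZeroU8::get)
}

/// Binary-level view: the same byte, reinterpreted without any conversion.
pub fn reinterpret(raw: u8) -> Option<NonZeroU8> {
    // SAFETY: `Option<NonZeroU8>` has the same size as `u8`, every bit
    // pattern is valid, and 0 is the representation of `None`.
    unsafe { std::mem::transmute::<u8, Option<NonZeroU8>>(raw) }
}
```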

The reverse is not true. There is already a notion of breaking minor changes, and cargo-semver-checks only considers breaking “lint-clean” client code to be a breaking change. In the ABI world, all breaking changes can break running programs, not just annoy developers. Many (non-breaking) minor API changes are not ABI compatible (e.g. changing a private field in a struct). As such, an ABI version should be distinct from or layered on top of the API version. Some breaking, minor changes are ABI compatible (e.g. adding new inherent items) because all inferred parts are elaborated during compilation.

For some changes that would break the ABI, the breakage can be resolved mechanistically. As an example, loosening the lifetime requirements would change the lifetimes encoded in the symbol. It is desirable to be able to use the looser lifetime restrictions in newer code, but also keep older code compatible. By exporting the function under both its old and new lifetime signatures (as specified in the header file), the resulting binary remains backwards-compatible. The old symbol could then be removed later as a breaking ABI change. Another strategy would be to not encode lifetime information in the symbol, and rather commit to run- or link-time verification of lifetime compatibility.

The implementation of any public, generic, monomorphized function is part of the ABI, and should be included in the header file.

For the (currently unstable) contracts API, all contracts are part of the ABI, and should be included in the header file. The strengthening of a contract is an API change (and therefore an ABI change). The weakening of a contract may cause the symbol to change (if the contract is part of it), but compatibility can be recovered by exporting the function under both symbols.

The main complication in unsafe languages is that there is no guarantee that the header files being compiled against match the module being linked against.