The three levels of Rust types

Rust types can be considered at three levels of abstraction: raw, equipped, and semantic. * Raw: The runtime representation of the type * Equipped: The type together with its associated functions/methods * Semantic: The not-checked-by-compiler parts.

In dependently typed systems, the semantic types are (often) expressed as explicit proof objects attached to types. Two types with different proofs but the same data are considered runtime-identical, and runtime identity functions serve to “weaken” a type without overhead. In these cases, the raw type is the same, but with different explicit semantics.

Rust does not have dependent types, instead relying on the programmer and safety comments to maintain soundness of unsafe operations. This means that the compiler does not have access to the implicitly stated safety contracts. Instead, the unsafe operations are encapsulated by an environment that guarantees their safety.

Through any combination of explicit checking, SAFETY comments, and private/unsafe fields

Of interest to this project is then how these (invisible to the compiler) contracts and invariants should be expressed to the linker.

In general, a library user (hereafter “programmer”) does not care about the implementation details of a library system. And in the case of private fields, the programmer is not allowed to care. Implementation details (such as struct layout) would be needed for direct/inlined field accesses and splitting borrows. Moreover, most user-defined types in Rust are nominal, and are therefore distinguished not by their contents but by their name and namespace. For types with no private/unsafe fields, a standard name mangling scheme can be used.

Specifying a specific implementation (i.e. working on the semantic level) also leaves much to be desired. For one, it ties the application to a specific library version. Moreover, the compiler does not know the semantic interface, and must therefore rely on arbitrary identifiers.

The troubles with linking at the equipped level

This leaves linking on the equipped level. Here, we consider a type only through the functions/methods that operate on it. A type itself is completely opaque except for specifically guaranteed characteristics. We essentially treat it as C would an opaque pointer. This does leave one problem, namely guaranteeing that all the functions have the same semantic conceptualization of the type.

Below the two usages of dynamic libraries are explicated. Both have native support from the system linker, although that may not be the best way to use them.

Statically defined dynamic dependencies (`LD_PRELOAD`)

In this case, the dependencies are statically known, but dynamically loaded/resolved. The program should not change semantics compared to statically including a dependency, if the same version is loaded. As the dependencies are statically known, the types can be decorated with their import origin and can be prevented from crossing into different sos. The problem of ensuring that the functions imported from a module do in fact come out of the same module does still remain.

The Windows system linker/loader identifies a symbol by both the DLL and the symbol name, solving the above problem.

Unix-like systems only use the symbol for resolution, but the guarantee may be recovered by using the dynamic loading facilities. Using dlopen(3) without a fully qualified path will use the same search path for the object as the normal dynamic loader. The advantage is that dlopen(3) returns a handle through which the symbols in the object can be accessed. Calling dlsym with the handle guarantees that it will resolve into the object. Therefore, by intermediating the symbol resolution through the handles the guarantee that the functions share a semantic conceptualization can be guaranteed without binding to a specific version. The resolution only needs to be performed once per function.

Run-time defined dynamic linking (`dlopen(3)`)

In this case, the programmer loads arbitrary shared objects using dlopen or similar. The trouble here is that, in the absence of dynamic typing, all handles are the same type, and a resolved symbol cannot be decorated from where it came.

Can add dynamic typing layer. Can make all calls to functions with arguments that are possible semantic types unsafe. Can export functions with two symbols, a semantic-specific (safe) version, and a semantic-generic (unsafe) version.

Adding dynamic types

Have .so export function that returns the arbitrary/random semantic-specific prefix. All opaque pointers out of a handle are dynamically decorated with that prefix. Before calling a function with a semantic argument, check the prefixes against eachother.

Calling dynamically loaded unsafe functions

The programmer must have some way of recovering the safety invariants that the function requires. Including if other functions need to be called before.

The three levels of Rust types

The troubles with linking at the equipped level

Statically defined dynamic dependencies (LD_PRELOAD)

Run-time defined dynamic linking (dlopen(3))

Adding dynamic types

Calling dynamically loaded unsafe functions

Statically defined dynamic dependencies (`LD_PRELOAD`)

Run-time defined dynamic linking (`dlopen(3)`)