When designing APIs in Rust which are exposed to other languages, there are some important design principles which are contrary to normal Rust API design:
1. All Encapsulated types should be *owned* by Rust, *managed* by the user, and *opaque*.
2. All Transactional data types should be *owned* by the user, and *transparent*.
3. All library behavior should be functions acting upon Encapsulated types.
4. All library behavior should be encapsulated into types not based on structure, but *provenance/lifetime*.
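As a minimal sketch of these rules, consider a small counter library. The names `Counter`, `CounterSnapshot`, and the `counter_*` functions are hypothetical and exist only for illustration. A real library would also export each function under a stable symbol name (for example with `#[no_mangle]`), which is omitted here.

```rust
use std::os::raw::c_int;

/// Encapsulated type (hypothetical): owned by Rust and opaque to the caller,
/// who only ever holds a `*mut Counter` and never sees the fields.
pub struct Counter {
    value: i64,
    step: i64,
}

/// Transactional type (hypothetical): owned by the caller and transparent,
/// so its layout is fixed with `#[repr(C)]`.
#[repr(C)]
#[derive(Debug, Clone, Copy, PartialEq)]
pub struct CounterSnapshot {
    pub value: i64,
    pub step: i64,
}

/// Rust allocates and owns the object; the caller manages its lifetime and
/// must eventually hand the pointer back to `counter_free`.
pub extern "C" fn counter_new(step: i64) -> *mut Counter {
    Box::into_raw(Box::new(Counter { value: 0, step }))
}

/// Library behavior is a function acting on the encapsulated type.
/// Returns 0 on success and -1 if `counter` is null.
///
/// # Safety
/// `counter` must be null or a live pointer obtained from `counter_new`.
pub unsafe extern "C" fn counter_advance(counter: *mut Counter) -> c_int {
    match unsafe { counter.as_mut() } {
        Some(c) => {
            // Wrapping arithmetic avoids a panic crossing the FFI boundary.
            c.value = c.value.wrapping_add(c.step);
            0
        }
        None => -1,
    }
}

/// Copies the current state into caller-owned memory.
/// Returns 0 on success and -1 if either pointer is null.
///
/// # Safety
/// `counter` must be null or live; `out` must be null or writable.
pub unsafe extern "C" fn counter_snapshot(
    counter: *const Counter,
    out: *mut CounterSnapshot,
) -> c_int {
    if out.is_null() {
        return -1;
    }
    match unsafe { counter.as_ref() } {
        Some(c) => {
            unsafe { out.write(CounterSnapshot { value: c.value, step: c.step }) };
            0
        }
        None => -1,
    }
}

/// Ends the object's lifetime. Passing null is a no-op.
///
/// # Safety
/// `counter` must be null or a live pointer that is not used afterwards.
pub unsafe extern "C" fn counter_free(counter: *mut Counter) {
    if !counter.is_null() {
        drop(unsafe { Box::from_raw(counter) });
    }
}
```

In this sketch the single `Counter` handle governs the lifetime of all internal state, while `CounterSnapshot` is plain data that the caller can copy and discard freely.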
## Motivation
Rust has built-in FFI support for other languages.
It lets crate authors expose C-compatible APIs through different ABIs (though the choice of ABI is unimportant to this practice).
2. Avoid the API dictating internal unsafety on the Rust side as much as possible.
3. Keep the potential for memory unsafety and Rust `undefined behaviour` as small as possible.
Rust code must trust the memory safety of the foreign language beyond a certain point.
However, every bit of `unsafe` code on the Rust side is an opportunity for bugs, or to exacerbate `undefined behaviour`.
For example, if a pointer's provenance is wrong, the result may be a segfault due to an invalid memory access.
But if that same pointer is then manipulated by `unsafe` code, it could become full-blown heap corruption.
The Object-Based API design allows for writing shims that have good memory safety characteristics, and a clean boundary of what is safe and what is `unsafe`.
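One way to picture that boundary is a shim in which all real logic lives in safe Rust and the exported function only validates and converts raw pointers. The sketch below is an illustration under assumptions: `Store`, `Bytes`, and `store_put` are hypothetical names, and the symbol-export attribute is again omitted.

```rust
use std::collections::HashMap;
use std::os::raw::c_int;
use std::slice;

/// Safe core (hypothetical): ordinary Rust with no `unsafe` at all.
pub struct Store {
    entries: HashMap<Vec<u8>, Vec<u8>>,
}

impl Store {
    pub fn new() -> Self {
        Store { entries: HashMap::new() }
    }

    pub fn put(&mut self, key: &[u8], value: &[u8]) {
        self.entries.insert(key.to_vec(), value.to_vec());
    }

    pub fn len_of(&self, key: &[u8]) -> Option<usize> {
        self.entries.get(key).map(|v| v.len())
    }
}

/// Caller-owned, transparent view of a byte buffer (hypothetical).
#[repr(C)]
#[derive(Clone, Copy)]
pub struct Bytes {
    pub ptr: *const u8,
    pub len: usize,
}

/// The one place where a raw buffer becomes a slice.
/// A null pointer is rejected, even when `len` is zero.
///
/// # Safety
/// A non-null `b.ptr` must point to `b.len` readable bytes.
unsafe fn as_slice<'a>(b: &'a Bytes) -> Option<&'a [u8]> {
    if b.ptr.is_null() {
        None
    } else {
        Some(unsafe { slice::from_raw_parts(b.ptr, b.len) })
    }
}

/// Thin shim: validate, convert, then delegate to the safe method.
/// Returns 0 on success and -1 on a null pointer.
///
/// # Safety
/// `store` must be null or a live `Store`; the buffers must be valid
/// for the duration of the call.
pub unsafe extern "C" fn store_put(store: *mut Store, key: Bytes, value: Bytes) -> c_int {
    let store = match unsafe { store.as_mut() } {
        Some(s) => s,
        None => return -1,
    };
    match unsafe { (as_slice(&key), as_slice(&value)) } {
        (Some(k), Some(v)) => {
            store.put(k, v); // everything past this point is safe Rust
            0
        }
        _ => -1,
    }
}
```

Because the `unsafe` operations are confined to the pointer conversions, a review for memory safety only needs to cover `as_slice` and the first lines of `store_put`.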
The POSIX standard defines the API to access an on-file database, known as [DBM](https://web.archive.org/web/20210105035602/https://www.mankier.com/0p/ndbm.h).
It is an excellent example of an "object-based" API.
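By contrast, consider a hypothetical API that exposes key iteration as a separate iterator object:
```rust
// ...
    /* THIS API IS A BAD IDEA! For real applications, use object-based design instead. */
}
```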
This API loses a key piece of information: the lifetime of the iterator must not exceed the lifetime of the `Dbm` object that owns it.
A user of the library could use it in a way which causes the iterator to outlive the data it is iterating on, resulting in reading uninitialized memory.
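For comparison, safe Rust can record this relationship in the type system, which is exactly the information a raw pointer cannot carry across the FFI boundary. The sketch below is illustrative only: this `Dbm` is a stand-in backed by a `Vec`, and `DbmKeysIter` is a hypothetical name for the iterator type.

```rust
/// Stand-in for the database object (hypothetical, backed by a `Vec`).
pub struct Dbm {
    keys: Vec<String>,
}

/// The iterator borrows from its owner. The lifetime `'a` records that
/// it must not outlive the `Dbm` it was created from.
pub struct DbmKeysIter<'a> {
    inner: std::slice::Iter<'a, String>,
}

impl Dbm {
    pub fn new(keys: Vec<String>) -> Self {
        Dbm { keys }
    }

    pub fn keys<'a>(&'a self) -> DbmKeysIter<'a> {
        DbmKeysIter { inner: self.keys.iter() }
    }
}

impl<'a> Iterator for DbmKeysIter<'a> {
    type Item = &'a str;

    fn next(&mut self) -> Option<&'a str> {
        self.inner.next().map(|s| s.as_str())
    }
}

/// Safe counterpart of a key-size counter: the borrow checker guarantees
/// the database is alive for the whole iteration.
pub fn count_key_sizes(db: &Dbm) -> usize {
    db.keys().map(|k| k.len()).sum()
}

// The borrow checker rejects code like the following, because `db` is
// dropped while the iterator still borrows it:
//
//     let iter = {
//         let db = Dbm::new(vec![]);
//         db.keys()
//     };
```

Once the iterator is handed out as a bare `*mut` pointer, the `'a` parameter has no equivalent on the foreign side, so nothing stops the caller from closing the database first.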
This example written in C contains a bug that will be explained afterwards:
```C
int count_key_sizes(DBM *db) {
    /* DO NOT USE THIS FUNCTION. IT HAS A SUBTLE BUT SERIOUS BUG! */
    datum key;
    int len = 0;
    if (!dbm_iter_new(db)) {
        dbm_close(db);
        return -1;
    }
    int l;
    while ((l = dbm_iter_next(db, &key)) >= 0) { // an error is indicated by -1
        free(key.dptr);
        len += key.dsize;
        if (l == 0) { // end of the iterator
            dbm_close(db);
        }
    }
    if (l < 0) {
        return -1;
    } else {
        return len;
    }
}
```
This bug is a classic. Here's what happens when the iterator returns the end-of-iteration marker:
1. The loop condition sets `l` to zero, and enters the loop because `0 >= 0`.
2. The length is incremented, in this case by zero.
3. The if statement is true, so the database is closed. There should be a break statement here.
4. The loop condition executes again, causing a `next` call on the closed object.
The worst part about this bug?
If the Rust implementation is careful, this code will appear to work most of the time!