<p>When developing programs, we have to solve many problems. A program can be viewed as a solution to a problem. It can also be viewed as a collection of solutions to many different problems. All of these solutions work together to solve a bigger problem.</p>
<p>There are many problems that share the same form. Because Rust is not object-oriented, its design patterns differ from those of object-oriented programming languages. While the details are different, problems that share the same form can be solved using the same fundamental methods.</p>
<p>If you have a question or an idea regarding certain content, want feedback from fellow community members, and think it may not be appropriate to file an issue, open a discussion in our <a href="https://github.com/rust-unofficial/patterns/discussions">discussion board</a>.</p>
<h2><aclass="header"href="#writing-a-new-article"id="writing-a-new-article">Writing a new article</a></h2>
<p>Before writing a new article please check in one of the following resources if there is an existing discussion or if someone is already working on that topic:</p>
<p>If you don't find an issue regarding your topic, and you are sure that a thread in the <a href="https://github.com/rust-unofficial/patterns/discussions">discussion board</a> would not be a better fit,
please open a new issue so we can discuss the ideas and future content of the article together and maybe
give some feedback/input on it.</p>
<p>When writing a new article, it's recommended to copy the <a href="https://github.com/rust-unofficial/patterns/blob/master/template.md">pattern template</a> into the
appropriate directory and start editing it. You may not need every section, in which case you can remove the ones you leave empty, or you might want to add extra sections.</p>
<p>Consider writing your article in a way that has a low barrier of entry, so that <a href="https://github.com/rust-lang/rustlings">Rustlings</a> can also follow
and understand the thought process behind it. That way we can encourage people to use these patterns early on.</p>
<p>We encourage you to write idiomatic Rust code that builds in the <ahref="https://play.rust-lang.org/">playground</a>.</p>
<p>If you link to blog posts or other content that may no longer exist in a few years (e.g. PDFs), please take a snapshot
with the <a href="https://web.archive.org/">Wayback Machine</a> and use the link to that snapshot in your article.</p>
<p>Don't forget to add your new article to <code>SUMMARY.md</code> so that it is rendered in the book.</p>
<p>Please make <code>Draft Pull requests</code> early so we can follow your progress and can give early feedback (see the following section).</p>
<h2><aclass="header"href="#check-the-article-locally"id="check-the-article-locally">Check the article locally</a></h2>
<p>Before submitting the PR, launch the commands <code>mdbook build</code> to make sure that the book builds and <code>mdbook test</code> to make sure that code examples are correct.</p>
<p>To make sure the files comply with our Markdown style we use <ahref="https://github.com/igorshubovych/markdownlint-cli">markdownlint-cli</a>.
To spare you some manual work to get through the CI test you can use the following commands to automatically fix most of the emerging problems when writing Markdown files.</p>
<p>Please <strong>don't force push</strong> commits in your branch, in order to keep commit history and make it easier for us to see changes between reviews.</p>
<p>Make sure to <code>Allow edits of maintainers</code> (under the text box) in the PR so people can actually collaborate on things or fix smaller issues themselves.</p>
<p><ahref="https://en.wikipedia.org/wiki/Programming_idiom">Idioms</a> are commonly used styles and patterns largely agreed upon by a community. They are guidelines. Writing idiomatic code allows other developers to understand what is happening because they are familiar with the form that it has.</p>
<p>The computer understands the machine code that is generated by the compiler. The language is therefore mostly beneficial to the developer. So, since we have this abstraction layer, why not put it to good use and make it simple?</p>
<p>Remember the <ahref="https://en.wikipedia.org/wiki/KISS_principle">KISS principle</a>: "Keep It Simple, Stupid". It claims that "most systems work best if they are kept simple rather than made complicated; therefore, simplicity should be a key goal in design, and unnecessary complexity should be avoided".</p>
<blockquote>
<p>Code is there for humans, not computers, to understand.</p>
</blockquote>
<p>Using a target of a deref coercion can increase the flexibility of your code when you are deciding which argument type to use for a function argument.
In this way, the function will accept more input types.</p>
<p>This is not limited to slice-able or fat pointer types. In fact you should always prefer using the <strong>borrowed type</strong> over <strong>borrowing the owned type</strong>. E.g., <code>&str</code> over <code>&String</code>, <code>&[T]</code> over <code>&Vec<T></code>, or <code>&T</code> over <code>&Box<T></code>.</p>
<p>Using borrowed types you can avoid layers of indirection for those instances where the owned type already provides a layer of indirection. For instance, a <code>String</code> has a layer of indirection, so a <code>&String</code> will have two layers of indirection.
We can avoid this by using <code>&str</code> instead, and letting <code>&String</code> coerce to a <code>&str</code> whenever the function is invoked.</p>
<p>For this example, we will illustrate some differences for using <code>&String</code> as a function argument versus using a <code>&str</code>, but the ideas apply as well to using <code>&Vec<T></code> versus using a <code>&[T]</code> or using a <code>&T</code> versus a <code>&Box<T></code>.</p>
<p>Consider an example where we wish to determine if a word contains three consecutive vowels.
We don't need to own the string to determine this, so we will take a reference.</p>
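<p>A minimal sketch of such a function, taking <code>&str</code> rather than <code>&String</code>. The name <code>three_vowels</code> and the sample words are illustrative assumptions, and only lowercase ASCII vowels are counted.</p>

```rust
/// Returns true if `word` contains three consecutive (lowercase ASCII) vowels.
fn three_vowels(word: &str) -> bool {
    let mut vowel_count = 0;
    for c in word.chars() {
        match c {
            'a' | 'e' | 'i' | 'o' | 'u' => {
                vowel_count += 1;
                if vowel_count >= 3 {
                    return true;
                }
            }
            // Any other character breaks the run of vowels.
            _ => vowel_count = 0,
        }
    }
    false
}

fn main() {
    let ferris = "Ferris".to_string();
    let curious = "Curious".to_string();

    // `&String` coerces to `&str` at the call site.
    println!("{}: {}", ferris, three_vowels(&ferris));
    println!("{}: {}", curious, three_vowels(&curious));

    // A string literal (`&'static str`) is accepted as well.
    println!("Ferris: {}", three_vowels("Ferris"));
}
```

<p>Had the parameter been declared as <code>&String</code>, the last call with a string literal would not compile.</p>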
<p>You may say to yourself: that doesn't matter, I will never use a <code>&'static str</code> as an input anyway (as we did when we used <code>"Ferris"</code>).
Even ignoring this special example, you may still find that using <code>&str</code> will give you more flexibility than using a <code>&String</code>.</p>
<p>Let's now take an example where someone gives us a sentence, and we want to determine whether any of the words in the sentence contains three consecutive vowels.
We should probably make use of the function we have already defined and simply feed in each word from the sentence.</p>
<p>This is because string slices are <code>&str</code>, not <code>&String</code>. Converting a slice to a <code>&String</code> would require an allocation, which is never implicit, whereas converting from <code>String</code> to <code>&str</code> is cheap and implicit.</p>
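<p>As a rough illustration, the sketch below feeds each word of a sentence into a <code>&str</code>-taking function. The helper name <code>any_word_has_three_vowels</code> and the sample sentence are assumptions made for this example.</p>

```rust
/// Returns true if `word` contains three consecutive (lowercase ASCII) vowels.
fn three_vowels(word: &str) -> bool {
    let mut vowel_count = 0;
    for c in word.chars() {
        match c {
            'a' | 'e' | 'i' | 'o' | 'u' => {
                vowel_count += 1;
                if vowel_count >= 3 {
                    return true;
                }
            }
            _ => vowel_count = 0,
        }
    }
    false
}

/// `split` yields `&str` slices borrowed from `sentence`; no `String` is allocated.
fn any_word_has_three_vowels(sentence: &str) -> bool {
    sentence.split(' ').any(|word| three_vowels(word))
}

fn main() {
    let sentence = "Once upon a time, there was a friendly curious crab named Ferris".to_string();
    for word in sentence.split(' ') {
        if three_vowels(word) {
            println!("{} has three consecutive vowels!", word);
        }
    }
}
```

<p>With a <code>&String</code> parameter, each slice would first have to be copied into a new <code>String</code> just to satisfy the signature.</p>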
<ul>
<li><a href="https://doc.rust-lang.org/reference/type-coercions.html">Rust Language Reference on Type Coercions</a></li>
<li>For more discussion on how to handle <code>String</code> and <code>&str</code> see <ahref="https://web.archive.org/web/20201112023149/https://hermanradtke.com/2015/05/03/string-vs-str-in-rust-functions.html">this blog series (2015)</a> by Herman J. Radtke III.</li>
</ul>
<h1><aclass="header"href="#concatenating-strings-with-format"id="concatenating-strings-with-format">Concatenating strings with <code>format!</code></a></h1>
<p>Many types in Rust have a <ahref="idioms/ctor.html">constructor</a>. However, this is <em>specific</em> to the
type; Rust cannot abstract over "everything that has a <code>new()</code> method". To
allow this, the <ahref="https://doc.rust-lang.org/stable/std/default/trait.Default.html"><code>Default</code></a> trait was conceived, which can be used with
containers and other generic types (e.g. see <ahref="https://doc.rust-lang.org/stable/std/option/enum.Option.html#method.unwrap_or_default"><code>Option::unwrap_or_default()</code></a>).
Notably, some containers already implement it where applicable.</p>
<p>Not only do one-element containers like <code>Cow</code>, <code>Box</code> or <code>Arc</code> implement
<code>Default</code> for contained <code>Default</code> types, one can automatically
<code>#[derive(Default)]</code> for structs whose fields all implement it, so the more
types implement <code>Default</code>, the more useful it becomes.</p>
<p>On the other hand, constructors can take multiple arguments, while the
<code>default()</code> method does not. There can even be multiple constructors with
different names, but there can only be one <code>Default</code> implementation per type.</p>
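<p>A small sketch of deriving and using <code>Default</code>. The struct <code>MyConfiguration</code> and its fields are hypothetical names chosen for illustration; every field type here already implements <code>Default</code>.</p>

```rust
use std::path::PathBuf;
use std::time::Duration;

// All fields implement `Default`, so the trait can be derived.
#[derive(Default, Debug, PartialEq)]
struct MyConfiguration {
    output: Option<PathBuf>,
    search_path: Vec<PathBuf>,
    timeout: Duration,
    check: bool,
}

fn main() {
    // Start from the default value and change a single field.
    let mut conf = MyConfiguration::default();
    conf.check = true;
    println!("conf = {:#?}", conf);

    // Struct update syntax fills the remaining fields from the default.
    let conf1 = MyConfiguration {
        check: true,
        ..Default::default()
    };
    assert_eq!(conf, conf1);

    // Generic helpers such as `Option::unwrap_or_default` rely on the trait.
    let missing: Option<u32> = None;
    println!("fallback = {}", missing.unwrap_or_default());
}
```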
<ul>
<li>The <a href="idioms/ctor.html">constructor</a> idiom is another way to generate instances that may or may
not be "default"</li>
<li>The <a href="https://doc.rust-lang.org/stable/std/default/trait.Default.html"><code>Default</code></a> documentation (scroll down for the list of implementors)</li>
</ul>
<h1><aclass="header"href="#memreplace-to-keep-owned-values-in-changed-enums"id="memreplace-to-keep-owned-values-in-changed-enums"><code>mem::replace</code> to keep owned values in changed enums</a></h1>
<ahref="idioms/../patterns/RAII.html">RAII guards</a> can benefit from tight control over lifetimes.</li>
<li>For conditionally filled <code>Option<&T></code>s of (mutable) references, one can
initialize an <code>Option<T></code> directly and use its <ahref="https://doc.rust-lang.org/std/option/enum.Option.html#method.as_ref"><code>.as_ref()</code></a> method to get an
<p>This code is inferior to the original in two respects:</p>
<ol>
<li>There is much more <code>unsafe</code> code, and more importantly, more invariants it must uphold.</li>
<li>Due to the extensive arithmetic required, there is a bug in this version that causes Rust <code>undefined behaviour</code>.</li>
</ol>
<p>The bug here is a simple mistake in pointer arithmetic: the string was copied, all <code>msg_len</code> bytes of it.
However, the <code>NUL</code> terminator at the end was not.</p>
<p>The Vector then had its size <em>set</em> to the length of the <em>zero padded string</em> -- rather than <em>resized</em> to it, which could have added a zero at the end. As a result, the last byte in the Vector is uninitialized memory.
When the <code>CString</code> is created at the bottom of the block, its read of the Vector will cause <code>undefined behaviour</code>!</p>
<p>Like many such issues, this would be a difficult issue to track down.
Sometimes it would panic because the string was not <code>UTF-8</code>, sometimes it would put a weird character at the end of the string, sometimes it would just completely crash.</p>
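<p>To make the difference between <em>setting</em> a length and <em>resizing</em> concrete, here is a safe sketch, not the code discussed above. The helper name <code>to_nul_terminated</code> is an assumption for illustration. <code>Vec::resize</code> writes the fill value into every newly added element, so the terminator is always initialized.</p>

```rust
use std::ffi::CStr;

/// Copies `msg` into a fresh buffer and appends the NUL terminator.
fn to_nul_terminated(msg: &str) -> Vec<u8> {
    let mut buffer = Vec::with_capacity(msg.len() + 1);
    buffer.extend_from_slice(msg.as_bytes());
    // `resize` initializes the one extra byte to zero; no `unsafe` is needed.
    buffer.resize(msg.len() + 1, 0);
    buffer
}

fn main() {
    let buffer = to_nul_terminated("hello");
    // Validation succeeds because the only NUL byte is the final one.
    let c_str = CStr::from_bytes_with_nul(&buffer).expect("exactly one trailing NUL");
    println!("{:?}", c_str);
}
```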
<p>Rust has built-in support for C-style strings with its <code>CString</code> and <code>CStr</code> types.
However, there are different approaches one can take with strings that are being sent to a foreign function call from a Rust function.</p>
<p>The best practice is simple: use <code>CString</code> in such a way as to minimize <code>unsafe</code> code.
However, a secondary caveat is that <em>the object must live long enough</em>, meaning the lifetime should be maximized.
In addition, the documentation explains that "round-tripping" a <code>CString</code> after modification is UB, so additional work is necessary in that case.</p>
<p>This code will result in a dangling pointer, because the lifetime of the <code>CString</code> is not extended by the pointer creation, unlike if a reference were created.</p>
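<p>A hedged sketch of keeping the <code>CString</code> alive for the duration of the call. <code>foreign_strlen</code> is a hypothetical stand-in written in Rust so the example runs on its own; a real foreign function would be declared in an <code>extern "C"</code> block. <code>report_len</code> is likewise an illustrative name.</p>

```rust
use std::ffi::{CStr, CString};
use std::os::raw::c_char;

/// Hypothetical stand-in for a foreign function that reads a C string.
///
/// # Safety
/// `ptr` must point to a valid NUL-terminated string.
unsafe fn foreign_strlen(ptr: *const c_char) -> usize {
    unsafe { CStr::from_ptr(ptr).to_bytes().len() }
}

fn report_len(msg: &str) -> Option<usize> {
    // Bind the `CString` to a variable so it lives until the end of this function.
    // `CString::new` fails if `msg` contains an interior NUL byte.
    let c_msg = CString::new(msg).ok()?;

    // SAFETY: `c_msg` outlives the call and is NUL-terminated.
    let len = unsafe { foreign_strlen(c_msg.as_ptr()) };

    // By contrast, `CString::new(msg).unwrap().as_ptr()` would drop the temporary
    // `CString` at the end of that statement and leave the pointer dangling.
    Some(len)
}

fn main() {
    println!("{:?}", report_len("hello"));
}
```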
<p>Another issue frequently raised is that the initialization of a 1k vector of zeroes is "slow".
However, recent versions of Rust actually optimize that particular macro to a zeroed allocation request (<code>alloc_zeroed</code>, typically backed by <code>calloc</code>), meaning it is as fast as the operating system's ability to return zeroed memory (which is quite fast).</p>
<p><code>Option</code> can be viewed as a container that contains either zero or one elements. In particular, it implements the <code>IntoIterator</code> trait, and as such can be used with generic code that needs such a type.</p>
<p>Since <code>Option</code> implements <code>IntoIterator</code>, it can be used as an argument to <ahref="https://doc.rust-lang.org/std/iter/trait.Extend.html#tymethod.extend"><code>.extend()</code></a>:</p>
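<pre><code class="language-rust">fn main() {
    let turing = Some("Turing");
    let mut logicians = vec!["Curry", "Kleene", "Markov"];

    // `Option` is an `IntoIterator`, so `extend` pushes its zero or one element.
    logicians.extend(turing);

    // equivalent to
    if let Some(turing_inner) = turing {
        logicians.push(turing_inner);
    }
}
</code></pre>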
<p>If you need to tack an <code>Option</code> to the end of an existing iterator, you can pass it to <ahref="https://doc.rust-lang.org/std/iter/trait.Iterator.html#method.chain"><code>.chain()</code></a>:</p>
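<pre><code class="language-rust">fn main() {
    let turing = Some("Turing");
    let logicians = vec!["Curry", "Kleene", "Markov"];

    // `turing.iter()` yields zero or one reference, matching `logicians.iter()`.
    for logician in logicians.iter().chain(turing.iter()) {
        println!("{} is a logician", logician);
    }
}
</code></pre>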
<p>Note that if the <code>Option</code> is always <code>Some</code>, then it is more idiomatic to use <ahref="https://doc.rust-lang.org/std/iter/fn.once.html"><code>std::iter::once</code></a> on the element instead.</p>
<p>Also, since <code>Option</code> implements <code>IntoIterator</code>, it's possible to iterate over it using a <code>for</code> loop. This is equivalent to matching it with <code>if let Some(..)</code>, and in most cases you should prefer the latter.</p>
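<p>For the always-<code>Some</code> case, a brief sketch using <code>std::iter::once</code>. The helper name <code>with_turing</code> is an assumption for illustration.</p>

```rust
use std::iter;

/// Appends one fixed element to the sequence using `iter::once`.
fn with_turing(logicians: &[&'static str]) -> Vec<&'static str> {
    logicians
        .iter()
        .copied()
        // The extra value is always present, so there is no `Option` to wrap it in.
        .chain(iter::once("Turing"))
        .collect()
}

fn main() {
    for logician in with_turing(&["Curry", "Kleene", "Markov"]) {
        println!("{} is a logician", logician);
    }
}
```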
<ul>
<li>
<p><a href="https://doc.rust-lang.org/std/iter/fn.once.html"><code>std::iter::once</code></a> is an iterator which yields exactly one element. It's a more readable alternative to <code>Some(foo).into_iter()</code>.</p>
</li>
<li>
<p><ahref="https://doc.rust-lang.org/std/iter/trait.Iterator.html#method.filter_map"><code>Iterator::filter_map</code></a> is a version of <ahref="https://doc.rust-lang.org/std/iter/trait.Iterator.html#method.flat_map"><code>Iterator::flat_map</code></a>, specialized to mapping functions which return <code>Option</code>.</p>
</li>
<li>
<p>The <ahref="https://crates.io/crates/ref_slice"><code>ref_slice</code></a> crate provides functions for converting an <code>Option</code> to a zero- or one-element slice.</p>
</li>
<li>
<p><ahref="https://doc.rust-lang.org/std/option/enum.Option.html">Documentation for <code>Option<T></code></a></p>
</li>
</ul>
<h1><aclass="header"href="#pass-variables-to-closure"id="pass-variables-to-closure">Pass variables to closure</a></h1>
<p>Adding a field to a struct is a mostly backwards compatible change.
However, if a client uses a pattern to deconstruct a struct instance, they might name all the fields in the struct and adding a new one would break that pattern.
The client could name some of the fields and use <code>..</code> in the pattern, in which case adding another field is backwards compatible.
Making at least one of the struct's fields private forces clients to use the latter form of patterns, ensuring that the struct is future-proof.</p>
<p>The downside of this approach is that you might need to add an otherwise unneeded field to the struct.
You can use the <code>()</code> type so that there is no runtime overhead and prepend <code>_</code> to the field name to avoid the unused field warning.</p>
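<p>A minimal sketch of the private-field technique. The module <code>config</code>, the struct <code>Settings</code> and its fields are hypothetical names used only for illustration.</p>

```rust
mod config {
    #[derive(Debug)]
    pub struct Settings {
        pub retries: u32,
        pub verbose: bool,
        // Private, zero-sized field: code outside this module can neither
        // construct `Settings` literally nor match it without `..`.
        _private: (),
    }

    impl Settings {
        pub fn new(retries: u32, verbose: bool) -> Settings {
            Settings { retries, verbose, _private: () }
        }
    }
}

/// Outside the module, the `..` is mandatory because `_private` is not visible.
fn retries_of(settings: &config::Settings) -> u32 {
    let config::Settings { retries, .. } = settings;
    *retries
}

fn main() {
    let settings = config::Settings::new(3, true);
    println!("{:?} has {} retries", settings, retries_of(&settings));
}
```

<p>Adding another public field to <code>Settings</code> later would leave <code>retries_of</code> compiling unchanged.</p>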
<p>If Rust allowed private variants of enums, we could use the same trick to make adding a variant to an enum backwards compatible.
The problem there is exhaustive match expressions.
A private variant would force clients to have a <code>_</code> wildcard pattern.
A common way to implement this instead is using the <ahref="https://doc.rust-lang.org/reference/attributes/type_system.html">#[non_exhaustive]</a> attribute.</p>
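<p>A short sketch of the attribute on an enum. <code>Event</code> and <code>describe</code> are hypothetical names. Note that <code>#[non_exhaustive]</code> only restricts <em>other</em> crates; inside the defining crate, as in this single-file example, the wildcard arm is optional.</p>

```rust
// Downstream crates matching on `Event` are forced to include a wildcard arm,
// so adding a variant later is not a breaking change for them.
#[non_exhaustive]
#[derive(Debug)]
pub enum Event {
    Started,
    Stopped,
}

/// Written the way a downstream crate would have to write it.
fn describe(event: &Event) -> &'static str {
    match event {
        Event::Started => "started",
        // Covers `Stopped` and any variant added in the future.
        _ => "other",
    }
}

fn main() {
    println!("{:?} -> {}", Event::Started, describe(&Event::Started));
    println!("{:?} -> {}", Event::Stopped, describe(&Event::Stopped));
}
```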
<p>Instead of typing all of this boilerplate to create a <code>Connection</code> and <code>Request</code>, it is easier to just create a wrapping dummy function which takes them as arguments:</p>
<p><strong>Note</strong> in the above example the line <code>assert!(response.is_ok());</code> will not actually run while testing because it is inside of a function which is never invoked.</p>
<p>Because the example is inside a function, the code will not be run as a test (though <code>cargo test</code> will still check that it compiles).
So this pattern is most useful when you would otherwise need <code>no_run</code>. With this, you do not need to add <code>no_run</code>.</p>
<p>If they are, an alternative can be to create a public method to create a dummy instance which is annotated with <code>#[doc(hidden)]</code> (so that users won't see it).
Then this method can be called inside of rustdoc because it is part of the crate's public API.</p>
<p><ahref="https://en.wikipedia.org/wiki/Software_design_pattern">Design patterns</a> are "general reusable solutions to a commonly occurring problem within a given context in software design".
Design patterns are a great way to describe some of the culture and 'tribal knowledge' of programming in a language.
Design patterns are very language-specific - what is a pattern in one language may be unnecessary in another due to a language feature, or impossible to express due to a missing feature.</p>
<p>If overused, design patterns can add unnecessary complexity to programs. However, they are a great way to share intermediate and advanced level knowledge about a programming language.</p>
<h2><aclass="header"href="#design-patterns-in-rust-1"id="design-patterns-in-rust-1">Design patterns in Rust</a></h2>
<p>Rust has many unique features. These features give us great benefit by removing whole classes of problems. Some of them are also patterns that are <em>unique</em> to Rust.</p>
<p>If you're not familiar with it, YAGNI is an acronym that stands for <code>You Aren't Going to Need It</code>. It's an important software design principle to apply as you write code.</p>
<blockquote>
<p>The best code I ever wrote was code I never wrote.</p>
</blockquote>
<p>If we apply YAGNI to design patterns, we see that the features of Rust allow us to throw out many patterns. For instance, there is no need for the <ahref="https://en.wikipedia.org/wiki/Strategy_pattern">strategy pattern</a> in Rust because we can just use <ahref="https://doc.rust-lang.org/book/traits.html">traits</a>.</p>
<p>TODO: Maybe include some code to illustrate the traits.</p>
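<p>One possible illustration, offered as a sketch rather than a canonical implementation: each "strategy" is a type implementing a trait, and the caller picks one through a generic parameter. The names <code>Formatter</code>, <code>Text</code>, <code>Json</code> and <code>report</code> are assumptions made for this example.</p>

```rust
/// The interchangeable behaviour is expressed as a trait.
trait Formatter {
    fn format(&self, entries: &[(&str, i32)]) -> String;
}

struct Text;
struct Json;

impl Formatter for Text {
    fn format(&self, entries: &[(&str, i32)]) -> String {
        // One "key value" pair per line.
        entries.iter().map(|(k, v)| format!("{} {}\n", k, v)).collect()
    }
}

impl Formatter for Json {
    fn format(&self, entries: &[(&str, i32)]) -> String {
        let fields: Vec<String> = entries
            .iter()
            .map(|(k, v)| format!("\"{}\":{}", k, v))
            .collect();
        format!("{{{}}}", fields.join(","))
    }
}

/// The algorithm is selected by the caller via the type parameter.
fn report<F: Formatter>(formatter: &F, entries: &[(&str, i32)]) -> String {
    formatter.format(entries)
}

fn main() {
    let entries = [("one", 1), ("two", 2)];
    print!("{}", report(&Text, &entries));
    println!("{}", report(&Json, &entries));
}
```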
<ul>
<li><a href="https://web.archive.org/web/20210104103100/https://doc.rust-lang.org/1.12.0/style/ownership/builders.html">Description in the style guide</a></li>
<li><a href="https://crates.io/crates/derive_builder">derive_builder</a>, a crate for automatically implementing this pattern while avoiding the boilerplate.</li>
<li><a href="patterns/../idioms/ctor.html">Constructor pattern</a> for when construction is simpler.</li>
<li><a href="https://web.archive.org/web/20210104103000/https://rust-lang.github.io/api-guidelines/type-safety.html#c-builder">Builders enable construction of complex values (C-BUILDER)</a> from the Rust API guidelines</li>
</ul>
<h1><aclass="header"href="#compose-structs-together-for-better-borrowing"id="compose-structs-together-for-better-borrowing">Compose structs together for better borrowing</a></h1>
<p>Writing FFI code is an entire course in itself.
However, there are several idioms here that can act as pointers, and avoid traps for inexperienced users of unsafe Rust.</p>
<p>This section contains design patterns that may be useful when doing FFI.</p>
<ol>
<li>
<p><ahref="patterns/./ffi-export.html">Object-Based API</a> design that has good memory safety characteristics, and a clean boundary of what is safe and what is unsafe</p>
</li>
<li>
<p><a href="patterns/./ffi-wrappers.html">Type Consolidation into Wrappers</a> - group multiple Rust types together into an opaque "object"</p>
</li>
</ol>
<p>When designing APIs in Rust which are exposed to other languages, there are some important design principles which are contrary to normal Rust API design:</p>
<ol>
<li>All Encapsulated types should be <em>owned</em> by Rust, <em>managed</em> by the user, and <em>opaque</em>.</li>
<li>All Transactional data types should be <em>owned</em> by the user, and <em>transparent</em>.</li>
<li>All library behavior should be functions acting upon Encapsulated types.</li>
<li>All library behavior should be encapsulated into types not based on structure, but <em>provenance/lifetime</em>.</li>
</ol>
<p>Rust has built-in FFI support to other languages.
It does this by providing a way for crate authors to provide C-compatible APIs through different ABIs (though that is unimportant to this practice).</p>
<p>Well-designed Rust FFI follows C API design principles, while compromising the design in Rust as little as possible. There are three goals with any foreign API:</p>
<ol>
<li>Make it easy to use in the target language.</li>
<li>Avoid the API dictating internal unsafety on the Rust side as much as possible.</li>
<li>Keep the potential for memory unsafety and Rust <code>undefined behaviour</code> as small as possible.</li>
</ol>
<p>Rust code must trust the memory safety of the foreign language beyond a certain point.
However, every bit of <code>unsafe</code> code on the Rust side is an opportunity for bugs, or to exacerbate <code>undefined behaviour</code>.</p>
<p>For example, if a pointer provenance is wrong, that may be a segfault due to invalid memory access.
But if it is manipulated by unsafe code, it could become full-blown heap corruption.</p>
<p>The Object-Based API design allows for writing shims that have good memory safety characteristics, and a clean boundary of what is safe and what is <code>unsafe</code>.</p>
<p>The POSIX standard defines the API to access an on-file database, known as <ahref="https://web.archive.org/web/20210105035602/https://www.mankier.com/0p/ndbm.h">DBM</a>. It is an excellent example of an "object-based" API.</p>
<p>Here is the definition in C, which hopefully should be easy to read for those involved in FFI.
The commentary below should help explain it for those who miss the subtleties.</p>
<p>This API defines two types: <code>DBM</code> and <code>datum</code>.</p>
<p>The <code>DBM</code> type was called an "encapsulated" type above.
It is designed to contain internal state, and acts as an entry point for the library's behavior.</p>
<p>It is completely opaque to the user, who cannot create a <code>DBM</code> themselves since they don't know its size or layout.
Instead, they must call <code>dbm_open</code>, and that only gives them <em>a pointer to one</em>.</p>
<p>This means all <code>DBM</code>s are "owned" by the library in a Rust sense. The internal state of unknown size is kept in memory controlled by the library, not the user.
The user can only manage its life cycle with <code>open</code> and <code>close</code>, and perform operations on it with the other functions.</p>
<p>The <code>datum</code> type was called a "transactional" type above. It is designed to facilitate the exchange of information between the library and its user.</p>
<p>The database is designed to store "unstructured data", with no pre-defined length or meaning.
As a result, the <code>datum</code> is the C equivalent of a Rust slice: a bunch of bytes, and a count of how many there are.
The main difference is that there is no type information, which is what <code>void</code> indicates.</p>
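<p>The slice analogy can be sketched from the Rust side as follows. The struct <code>Datum</code> mirrors the two C fields (<code>dptr</code>, <code>dsize</code>); the helper methods <code>from_slice</code> and <code>as_bytes</code> are assumptions added for illustration and are not part of the POSIX API.</p>

```rust
use std::os::raw::c_void;
use std::slice;

/// Rust mirror of the C `datum`: an untyped pointer plus a byte count.
#[repr(C)]
#[derive(Clone, Copy)]
pub struct Datum {
    pub dptr: *mut c_void,
    pub dsize: usize,
}

impl Datum {
    /// Describes a Rust byte slice as a `Datum`. The type information (`u8`)
    /// is erased, and the borrow is no longer tracked by the compiler.
    pub fn from_slice(bytes: &[u8]) -> Datum {
        Datum {
            dptr: bytes.as_ptr() as *mut c_void,
            dsize: bytes.len(),
        }
    }

    /// Reinterprets the `Datum` as a byte slice.
    ///
    /// # Safety
    /// `dptr` must be null or point to `dsize` initialized bytes that stay
    /// valid while the returned slice is in use.
    pub unsafe fn as_bytes(&self) -> &[u8] {
        if self.dptr.is_null() {
            &[]
        } else {
            unsafe { slice::from_raw_parts(self.dptr as *const u8, self.dsize) }
        }
    }
}

fn main() {
    let data = [1u8, 2, 3];
    let datum = Datum::from_slice(&data);
    // SAFETY: `data` is still alive and `datum` describes exactly its bytes.
    println!("{:?}", unsafe { datum.as_bytes() });
}
```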
<p>Keep in mind that this header is written from the library's point of view.
The user likely has some type they are using, which has a known size.
But the library does not care, and by the rules of C casting, a pointer to any type can be cast to <code>void *</code>.</p>
<p>As noted earlier, this type is <em>transparent</em> to the user. But also, this type is <em>owned</em> by the user.
This has subtle ramifications, due to that pointer inside it.
The question is, who owns the memory that pointer points to?</p>
<p>The answer for best memory safety is, "the user".
But in cases such as retrieving a value, the user does not know how to allocate it correctly (since they don't know how long the value is).
In this case, the library code is expected to use the heap that the user has access to -- such as the C library <code>malloc</code> and <code>free</code> -- and then <em>transfer ownership</em> in the Rust sense.</p>
<p>This may all seem speculative, but this is what a pointer means in C.
It means the same thing as in Rust: "user defined lifetime."
The user of the library needs to read the documentation in order to use it correctly.
That said, there are some decisions that have lesser or greater consequences if users do it wrong.
Minimizing those is what this best practice is about, and the key is to <em>transfer ownership of everything that is transparent</em>.</p>
<pre><code class="language-rust ignore">/* THIS API IS A BAD IDEA! For real applications, use object-based design instead. */
}
</code></pre>
<p>This API loses a key piece of information: the lifetime of the iterator must not exceed the lifetime of the <code>Dbm</code> object that owns it.
A user of the library could use it in a way which causes the iterator to outlive the data it is iterating on, resulting in reading uninitialized memory.</p>
<p>This example written in C contains a bug that will be explained afterwards:</p>
<pre><code class="language-C">int count_key_sizes(DBM *db) {
    /* DO NOT USE THIS FUNCTION. IT HAS A SUBTLE BUT SERIOUS BUG! */
    datum key;
    int len = 0;

    if (!dbm_iter_new(db)) {
        dbm_close(db);
        return -1;
    }

    int l;
    while ((l = dbm_iter_next(db, &key)) >= 0) { // an error is indicated by -1
        free(key.dptr);
        len += key.dsize;
        if (l == 0) { // end of the iterator
            dbm_close(db);
        }
    }
    if (l >= 0) {
        return -1;
    } else {
        return len;
    }
}
</code></pre>
<p>This bug is a classic. Here's what happens when the iterator returns the end-of-iteration marker:</p>
<ol>
<li>The loop condition sets <code>l</code> to zero, and enters the loop because <code>0 >= 0</code>.</li>
<li>The length is incremented, in this case by zero.</li>
<li>The if statement is true, so the database is closed. There should be a break statement here.</li>
<li>The loop condition executes again, causing a <code>next</code> call on the closed object.</li>
</ol>
<p>The worst part about this bug?
If the Rust implementation was careful, this code will work most of the time!
If the memory for the <code>Dbm</code> object is not immediately reused, an internal check will almost certainly fail, resulting in the iterator returning a <code>-1</code> indicating an error.
But occasionally, it will cause a segmentation fault, or even worse, nonsensical memory corruption!</p>
<p>None of this can be avoided by Rust.
From its perspective, it put those objects on its heap, returned pointers to them, and gave up control of their lifetimes. The C code simply must "play nice".</p>
<p>The programmer must read and understand the API documentation.
While some consider that par for the course in C, a good API design can mitigate this risk.
The POSIX API for <code>DBM</code> did this by <em>consolidating the ownership</em> of the iterator with its parent:</p>
<p>However, this design choice also has a number of drawbacks, which should be considered as well.</p>
<p>First, the API itself becomes less expressive.
With POSIX DBM, there is only one iterator per object, and every call changes its state.
This is much more restrictive than iterators in almost any language, even though it is safe.
Perhaps with other related objects, whose lifetimes are less hierarchical, this limitation is more of a cost than the safety.</p>
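<p>The consolidated design can be sketched on the Rust side roughly as below. This is an in-memory stand-in under stated assumptions, not the POSIX implementation: the <code>Dbm</code> struct, its <code>BTreeMap</code> storage and the method names <code>first_key</code> and <code>next_key</code> are hypothetical. The point is that the iteration cursor is a field of the parent object, so it cannot outlive it.</p>

```rust
use std::collections::BTreeMap;
use std::ops::Bound;

/// Hypothetical in-memory stand-in for a database handle.
pub struct Dbm {
    entries: BTreeMap<Vec<u8>, Vec<u8>>,
    // The single iteration cursor is owned by the database itself.
    cursor: Option<Vec<u8>>,
}

impl Dbm {
    pub fn new() -> Self {
        Dbm { entries: BTreeMap::new(), cursor: None }
    }

    pub fn store(&mut self, key: &[u8], value: &[u8]) {
        self.entries.insert(key.to_vec(), value.to_vec());
    }

    /// Restarts iteration and returns the first key, if any.
    pub fn first_key(&mut self) -> Option<Vec<u8>> {
        self.cursor = self.entries.keys().next().cloned();
        self.cursor.clone()
    }

    /// Advances the shared cursor; every call changes the object's state.
    pub fn next_key(&mut self) -> Option<Vec<u8>> {
        let current = self.cursor.take()?;
        self.cursor = self
            .entries
            .range::<Vec<u8>, _>((Bound::Excluded(&current), Bound::Unbounded))
            .next()
            .map(|(key, _)| key.clone());
        self.cursor.clone()
    }
}

fn main() {
    let mut db = Dbm::new();
    db.store(b"a", b"1");
    db.store(b"b", b"2");

    let mut key = db.first_key();
    while let Some(k) = key {
        println!("{:?}", k);
        key = db.next_key();
    }
}
```

<p>Because there is only one cursor per object, two independent traversals of the same <code>Dbm</code> are not possible with this shape, which is the loss of expressiveness described above.</p>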
<p>Second, depending on the relationships of the API's parts, significant design effort may be involved.
Many of the easier design points have other patterns associated with them:</p>
<ul>
<li>
<p><ahref="patterns/./ffi-wrappers.html">Wrapper Type Consolidation</a> groups multiple Rust types together into an opaque "object"</p>
</li>
<li>
<p><ahref="patterns/../idioms/ffi-errors.html">FFI Error Passing</a> explains error handling with integer codes and sentinel return values (such as <code>NULL</code> pointers)</p>
</li>
<li>
<p><ahref="patterns/../idioms/ffi-accepting-strings.html">Accepting Foreign Strings</a> allows accepting strings with minimal unsafe code, and is easier to get right than <ahref="patterns/../idioms/ffi-passing-strings.html">Passing Strings to FFI</a></p>
</li>
</ul>
<p>However, not every API can be done this way.
It is up to the best judgement of the programmer as to who their audience is.</p>
<h1><aclass="header"href="#type-consolidation-into-wrappers"id="type-consolidation-into-wrappers">Type Consolidation into Wrappers</a></h1>
<p>This pattern is designed to allow gracefully handling multiple related types, while minimizing the surface area for memory unsafety.</p>
<p>One of the cornerstones of Rust's aliasing rules is lifetimes.
This ensures that many patterns of access between types can be memory safe, data race safety included.</p>
<p>However, when Rust types are exported to other languages, they are usually transformed into pointers.
In Rust, a pointer means "the user manages the lifetime of the pointee." It is their responsibility to avoid memory unsafety.</p>
<p>Some level of trust in the user code is thus required, notably around use-after-free which Rust can do nothing about.
However, some API designs place higher burdens than others on the code written in the other language.</p>
<p>The lowest risk API is the "consolidated wrapper", where all possible interactions with an object are folded into a "wrapper type", while keeping the Rust API clean.</p>
<p>Often, wrapping types is quite difficult, and sometimes a Rust API compromise would make things easier.</p>
<p>As an example, consider an iterator which does not efficiently implement <code>nth()</code>.
It would definitely be worth putting in special logic to make the object handle iteration internally, or to support a different access pattern efficiently that only the Foreign Function API will use.</p>
<h3><aclass="header"href="#trying-to-wrap-iterators-and-failing"id="trying-to-wrap-iterators-and-failing">Trying to Wrap Iterators (and Failing)</a></h3>
<p>To wrap any type of iterator into the API correctly, the wrapper would need to do what a C version of the code would do: erase the lifetime of the iterator, and manage it manually.</p>
<p>Suffice it to say, this is <em>incredibly</em> difficult.</p>
<p>Here is an illustration of just <em>one</em> pitfall.</p>
<p>A first version of <code>MySetWrapper</code> would look like this:</p>
<p>With <code>transmute</code> being used to extend a lifetime, and a pointer to hide it, it's ugly already.
But it gets even worse: <em>any other operation can cause Rust <code>undefined behaviour</code></em>.</p>
<p>Consider that the <code>MySet</code> in the wrapper could be manipulated by other functions during iteration, such as storing a new value to the key it was iterating over.
The API doesn't discourage this, and in fact some similar C libraries expect it.</p>
<p>A simple implementation of <code>myset_store</code> would be:</p>
<pre><code class="language-rust ignore">pub mod unsafe_module {
    // other module content
    pub fn myset_store(
        myset: *mut MySetWrapper,
        key: datum,
        value: datum) -> libc::c_int {
        /* DO NOT USE THIS CODE. IT IS UNSAFE TO DEMONSTRATE A PROBLEM. */
        let myset: &mut MySet = unsafe { // SAFETY: whoops, UB occurs in here!
            &mut (*myset).myset
        };
        /* ...check and cast key and value data... */
        match myset.store(casted_key, casted_value) {
            Ok(_) => 0,
            Err(e) => e.into()
        }
    }
}
</code></pre>
<p>If the iterator exists when this function is called, we have violated one of Rust's aliasing rules.
According to Rust, the mutable reference in this block must have <em>exclusive</em> access to the object.
If the iterator simply exists, it's not exclusive, so we have <code>undefined behaviour</code>! <supclass="footnote-reference"><ahref="#1">1</a></sup></p>
<p>To avoid this, we must have a way of ensuring that mutable reference really is exclusive.
That basically means clearing out the iterator's shared reference while it exists, and then reconstructing it.
In most cases, that will still be less efficient than the C version.</p>
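<p>One way to picture "clearing out the iterator and reconstructing it" is to store only a cursor between calls and rebuild a short-lived iterator each time. The following is a minimal sketch, not the wrapper from this chapter: it assumes a <code>BTreeMap&lt;u32, String&gt;</code> as a stand-in for <code>MySet</code>, and the method names are hypothetical. Each <code>next_key</code> call costs a tree lookup, which is the kind of overhead the C version avoids.</p>

```rust
use std::collections::BTreeMap;
use std::ops::Bound;

/// Hypothetical stand-in for the wrapper: it keeps a cursor, never a live iterator.
pub struct MySetWrapper {
    // Assumption: an ordered map stands in for the real `MySet`.
    myset: BTreeMap<u32, String>,
    // Last key handed out; `None` means iteration has not started yet.
    cursor: Option<u32>,
}

impl MySetWrapper {
    pub fn new() -> Self {
        MySetWrapper { myset: BTreeMap::new(), cursor: None }
    }

    /// Mutation takes `&mut self`. No iterator outlives a call,
    /// so this mutable borrow really is exclusive.
    pub fn store(&mut self, key: u32, value: &str) {
        self.myset.insert(key, value.to_owned());
    }

    /// Restart iteration from the smallest key.
    pub fn first_key(&mut self) -> Option<u32> {
        self.cursor = None;
        self.next_key()
    }

    /// Rebuild a temporary iterator positioned just past the cursor.
    pub fn next_key(&mut self) -> Option<u32> {
        let next = match self.cursor {
            None => self.myset.keys().next().copied(),
            Some(last) => self
                .myset
                .range::<u32, _>((Bound::Excluded(last), Bound::Unbounded))
                .next()
                .map(|(key, _)| *key),
        };
        // Only advance the cursor when there was a next key.
        if next.is_some() {
            self.cursor = next;
        }
        next
    }
}
```

<p>With this shape, calling <code>store</code> between two <code>next_key</code> calls is fine: the borrow checker accepts it because no shared reference into the map survives across calls.</p>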
<p>Some may ask: how can C do this more efficiently?
The answer is, it cheats. Rust's aliasing rules are the problem, and C simply ignores them for its pointers.
In exchange, it is common to see code that is declared in the manual as "not thread safe" under some or all circumstances.
In fact, <a href="https://manpages.debian.org/buster/manpages/attributes.7.en.html">The GNU C library has an entire lexicon dedicated to concurrent behavior!</a></p>
<p>Rust would rather make everything memory safe all the time, for both safety and optimizations that C code cannot attain.
Being denied access to certain shortcuts is the price Rust programmers need to pay.</p>
<p>For the C programmers out there scratching their heads, the iterator need not be read <em>during</em> this code to cause the UB.
The exclusivity rule also enables compiler optimizations which may cause inconsistent observations by the iterator's shared reference (e.g. stack spills or reordering instructions for efficiency).
These observations may happen <em>any time after</em> the mutable reference is created.</p>
<p>Here, <code>Bar</code> might be some public, generic type and <code>T1</code> and <code>T2</code> are some internal types. Users of our module shouldn't know that we implement <code>Foo</code> by using a <code>Bar</code>, but what we're really hiding here is the types <code>T1</code> and <code>T2</code>, and how they are used with <code>Bar</code>.</p>
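<p>The arrangement described here can be sketched as follows. This is an illustration under assumptions: the fields of <code>Bar</code>, the contents of <code>T1</code> and <code>T2</code>, and the methods on <code>Foo</code> are all hypothetical.</p>

```rust
/// A public, generic type (stand-in for `Bar`).
pub struct Bar<A, B> {
    pub left: A,
    pub right: B,
}

// Internal types (stand-ins for `T1` and `T2`); deliberately not `pub`.
struct T1(u32);
struct T2(String);

/// Public newtype. The field is private, so `Bar<T1, T2>` never
/// appears in the public API and can be swapped out later.
pub struct Foo(Bar<T1, T2>);

impl Foo {
    pub fn new(id: u32, label: &str) -> Self {
        Foo(Bar { left: T1(id), right: T2(label.to_owned()) })
    }

    /// Users see only plain types here, never `T1`, `T2`, or `Bar`.
    pub fn describe(&self) -> String {
        let Bar { left: T1(id), right: T2(label) } = &self.0;
        format!("{}:{}", id, label)
    }
}
```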
<li><a href="https://doc.rust-lang.org/book/ch19-04-advanced-types.html?highlight=newtype#using-the-newtype-pattern-for-type-safety-and-abstraction">Advanced Types in the book</a></li>
<p><a href="https://en.wikipedia.org/wiki/Resource_Acquisition_Is_Initialization">RAII</a> stands for "Resource Acquisition is Initialisation" which is a terrible
name. The essence of the pattern is that resource initialisation is done in the
constructor of an object and finalisation in the destructor. This pattern is
extended in Rust by using an RAII object as a guard of some resource and relying
on the type system to ensure that access is always mediated by the guard object.</p>
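<p>A minimal sketch of a guard object follows. All names here (<code>Resource</code>, <code>ResourceGuard</code>, <code>acquire</code>, <code>is_held</code>) are hypothetical, and a <code>Cell&lt;bool&gt;</code> flag stands in for whatever real acquisition and release would involve.</p>

```rust
use std::cell::Cell;
use std::ops::Deref;

/// Hypothetical resource that tracks whether it is currently held.
pub struct Resource {
    held: Cell<bool>,
    data: String,
}

/// RAII guard: created on acquisition, releases the resource in `drop`.
pub struct ResourceGuard<'a> {
    resource: &'a Resource,
}

impl Resource {
    pub fn new(data: &str) -> Self {
        Resource { held: Cell::new(false), data: data.to_owned() }
    }

    /// Acquire the resource; returns `None` if it is already held.
    pub fn acquire(&self) -> Option<ResourceGuard<'_>> {
        if self.held.replace(true) {
            None
        } else {
            Some(ResourceGuard { resource: self })
        }
    }

    pub fn is_held(&self) -> bool {
        self.held.get()
    }
}

// The data is reachable only through the guard.
impl Deref for ResourceGuard<'_> {
    type Target = str;
    fn deref(&self) -> &str {
        &self.resource.data
    }
}

// Finalisation happens in the destructor, even on early return.
impl Drop for ResourceGuard<'_> {
    fn drop(&mut self) {
        self.resource.held.set(false);
    }
}
```

<p>Because <code>data</code> is private and only exposed through <code>Deref</code> on the guard, code outside this module cannot read it without first acquiring the resource, and the borrow checker prevents a reference obtained from the guard from outliving it.</p>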
<li>This can lead to "dependency hell", when a project depends on multiple conflicting versions of a crate at the same time.
For example, the <code>url</code> crate has both versions 1.0 and 0.5.
Since the <code>Url</code> from <code>url:1.0</code> and the <code>Url</code> from <code>url:0.5</code> are different types, an HTTP client that uses <code>url:0.5</code> would not accept <code>Url</code> values from a web scraper that uses <code>url:1.0</code>.</li>
<p>The <a href="https://crates.io/crates/ref_slice"><code>ref_slice</code></a> crate provides functions for converting <code>&T</code> to <code>&[T]</code>.</p>
<p>The <a href="https://crates.io/crates/url"><code>url</code></a> crate provides tools for working with URLs.</p>
<p>The <a href="https://crates.io/crates/num_cpus"><code>num_cpus</code></a> crate provides a function to query the number of CPUs on a machine.</p>
<p>If you have <code>unsafe</code> code, create the smallest possible module that can uphold the needed invariants to build a minimal safe interface upon the unsafety.
Embed this into a larger module that contains only safe code and presents an ergonomic interface.
Note that the outer module can contain unsafe functions and methods that call directly into the unsafe code.
<li>The <a href="https://docs.rs/toolshed"><code>toolshed</code></a> crate contains its unsafe operations in submodules, presenting a safe interface to users.</li>
<li><code>std</code>'s <code>String</code> type is a wrapper over <code>Vec&lt;u8&gt;</code> with the added invariant that the contents must be valid UTF-8.
The operations on <code>String</code> ensure this behavior.
However, users have the option of using an <code>unsafe</code> method to create a <code>String</code>, in which case the onus is on them to guarantee the validity of the contents.</li>
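<p>The same layering can be sketched on a smaller scale. The module below is a hypothetical example, not taken from <code>std</code> or <code>toolshed</code>: it keeps one invariant (all bytes are ASCII), offers a checked safe constructor, and exposes an <code>unsafe</code> constructor for callers who can guarantee the invariant themselves.</p>

```rust
pub mod ascii {
    /// Invariant: `bytes` contains only ASCII, which is always valid UTF-8.
    pub struct AsciiString {
        bytes: Vec<u8>,
    }

    impl AsciiString {
        /// Safe constructor: checks the invariant up front.
        pub fn new(bytes: Vec<u8>) -> Option<Self> {
            if bytes.is_ascii() {
                Some(AsciiString { bytes })
            } else {
                None
            }
        }

        /// Skips the check.
        ///
        /// # Safety
        /// The caller must guarantee that every byte is ASCII.
        pub unsafe fn new_unchecked(bytes: Vec<u8>) -> Self {
            AsciiString { bytes }
        }

        /// Safe accessor built on top of the invariant.
        pub fn as_str(&self) -> &str {
            // SAFETY: both constructors require ASCII-only bytes,
            // and ASCII is a subset of UTF-8.
            unsafe { std::str::from_utf8_unchecked(&self.bytes) }
        }
    }
}
```

<p>Because the field is private, the only places that can break the invariant are inside this small module or behind an explicit <code>unsafe</code> call.</p>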
<p>An <a href="https://en.wikipedia.org/wiki/Anti-pattern">anti-pattern</a> is a solution to a "recurring problem that is usually ineffective and risks being highly counterproductive".
<a href="https://doc.rust-lang.org/rustc/lints/levels.html#capping-lints">--cap-lints</a>. The <code>--cap-lints=warn</code> command line argument turns all <code>deny</code>
lint errors into warnings. But be aware that <code>forbid</code> lints are stronger than
<code>deny</code> hence the 'forbid' level cannot be overridden to be anything lower than
an error. As a result <code>forbid</code> lints will still stop compilation.</p>
<p>Whoa! This is really different! What's going on here? Remember that with declarative programs we are describing <strong>what</strong> to do, rather than <strong>how</strong> to do it.
<code>fold</code> is a function that <a href="https://en.wikipedia.org/wiki/Function_composition">composes</a> functions. The name is a convention from Haskell.</p>
<p>Here, we are composing functions of addition (this closure: <code>|a, b| a + b</code>) with a range from 1 to 10.
The <code>0</code> is the starting point, so <code>a</code> is <code>0</code> at first.
<code>b</code> is the first element of the range, <code>1</code>. <code>0 + 1 = 1</code> is the result.
So now we <code>fold</code> again, with <code>a = 1</code>, <code>b = 2</code> and so <code>1 + 2 = 3</code> is the next result.
This process continues until we get to the last element in the range, <code>10</code>.</p>
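<p>The walk-through above can be checked with a short sketch. The function names are hypothetical; <code>fold_steps</code> uses <code>scan</code> purely to expose the intermediate values of <code>a</code> that <code>fold</code> passes along internally.</p>

```rust
/// Imperative version: spells out how to accumulate the sum.
pub fn sum_imperative() -> i32 {
    let mut sum = 0;
    for i in 1..=10 {
        sum += i;
    }
    sum
}

/// Declarative version: states what to compute.
/// `0` is the starting value of `a`; `b` takes each element of the range.
pub fn sum_declarative() -> i32 {
    (1..=10).fold(0, |a, b| a + b)
}

/// The value of `a` after each step of the fold.
pub fn fold_steps() -> Vec<i32> {
    (1..=10)
        .scan(0, |a, b| {
            *a += b;
            Some(*a)
        })
        .collect()
}
```

<p>Both versions return the same total; the first two entries of <code>fold_steps()</code> are <code>1</code> and <code>3</code>, matching the steps described above.</p>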
<h2><a class="header" href="#a-brief-overview-over-common-design-principles" id="a-brief-overview-over-common-design-principles">A brief overview of common design principles</a></h2>
<p>most systems work best if they are kept simple rather than made complicated; therefore, simplicity should be a key goal in design, and unnecessary complexity should be avoided</p>
<h2><a class="header" href="#a-hrefhttpsenwikipediaorgwikilaw_of_demeterlaw-of-demeter-loda" id="a-hrefhttpsenwikipediaorgwikilaw_of_demeterlaw-of-demeter-loda"><a href="https://en.wikipedia.org/wiki/Law_of_Demeter">Law of Demeter (LoD)</a></a></h2>
<p>a given object should assume as little as possible about the structure or properties of anything else (including its subcomponents), in accordance with the principle of "information hiding"</p>
<h2><a class="header" href="#a-hrefhttpsenwikipediaorgwikidesign_by_contractdesign-by-contract-dbca" id="a-hrefhttpsenwikipediaorgwikidesign_by_contractdesign-by-contract-dbca"><a href="https://en.wikipedia.org/wiki/Design_by_contract">Design by contract (DbC)</a></a></h2>
<p>software designers should define formal, precise and verifiable interface specifications for software components, which extend the ordinary definition of abstract data types with preconditions, postconditions and invariants</p>
<p>bundling of data with the methods that operate on that data, or the restricting of direct access to some of an object's components. Encapsulation is used to hide the values or state of a structured data object inside a class, preventing unauthorized parties' direct access to them.</p>
<p>“Functions should not produce abstract side effects...only commands (procedures) will be permitted to produce side effects.” - Bertrand Meyer: Object Oriented Software Construction</p>
<h2><a class="header" href="#a-hrefhttpsenwikipediaorgwikiprinciple_of_least_astonishmentprinciple-of-least-astonishment-polaa" id="a-hrefhttpsenwikipediaorgwikiprinciple_of_least_astonishmentprinciple-of-least-astonishment-polaa"><a href="https://en.wikipedia.org/wiki/Principle_of_least_astonishment">Principle of least astonishment (POLA)</a></a></h2>
<p>a component of a system should behave in a way that most users will expect it to behave. The behavior should not astonish or surprise users</p>
<p>“The designer of a module should strive to make all information about the module part of the module itself.” - Bertrand Meyer: Object Oriented Software Construction</p>
<p>“All services offered by a module should be available through a uniform notation, which does not betray whether they are implemented through storage or through computation.” - Bertrand Meyer: Object Oriented Software Construction</p>
<p>“Whenever a software system must support a set of alternatives, one and only one module in the system should know their exhaustive list.” - Bertrand Meyer: Object Oriented Software Construction</p>
<p>“Whenever a storage mechanism stores an object, it must store with it the dependents of that object. Whenever a retrieval mechanism retrieves a previously stored object, it must also retrieve any dependent of that object that has not yet been retrieved.” - Bertrand Meyer: Object Oriented Software Construction</p>