Change fence type, add more elaborate description

pull/4/head
Andre Richter 7 years ago
parent 135a72ed98
commit 036d6f0893

@ -51,10 +51,26 @@ and adding the dependency to `Cargo.toml`.
When signaling the GPU about a new mailbox message, we need to take care that
mailbox buffer setup has really finished. Both setting up mailbox contents and
- signaling the GPU is done with store operations to memory (RAM and MMIO). There
- is an unlikely chance that the compiler reorders instructions, resulting in
- signaling the GPU _before_ all of the contents have been written to the mailbox
- buffer. We prevent this by inserting a [compiler fence].
+ signaling the GPU are done with store operations to independent memory locations
+ (RAM and MMIO). Since compilers are free, for optimization purposes, to reorder
+ instructions that have no control-flow or data dependencies between them, we
+ need to take care that signaling the GPU really takes place _after_ all of the
+ contents have been written to the mailbox buffer.
+ One way to do this would be to define the whole mailbox buffer as `volatile`, as
+ well as the location that we write to in order to signal the GPU. The compiler
+ is not allowed to reorder memory operations tagged with the `volatile` keyword
+ with each other. But this is not needed here. We don't care if the compiler
+ optimizes the buffer setup code as long as signaling the GPU takes place afterwards.
+ Therefore, we prevent premature signaling by inserting an explicit [compiler
+ fence] after the buffer preparation code. Since we signal the GPU by calling
+ another function, the fence would only be effective if that function was a)
+ inlined and b) the inlined instructions were then reordered with the buffer
+ setup code. Otherwise, the compiler has to assume that the called function
+ depends on previous memory operations and must not reorder here. Although there
+ is little chance that the reordering scenario happens, I'll leave the fence
+ there nonetheless for academic purposes :-)
Please note that such reordering might also be done by CPUs that feature
[out-of-order execution]. Lucky us, although the Raspberry Pi 3 features
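
The ordering described in this hunk can be illustrated with a minimal sketch that runs on a host machine. It is not the tutorial's actual mailbox code: `FakeMailbox`, its `buffer` and `doorbell` fields, and `send` are hypothetical names, and the doorbell is an ordinary word in RAM standing in for the MMIO register. The buffer is filled with plain stores, a `Release` compiler fence follows, and only then is the signaling store issued.

```rust
use std::sync::atomic::{compiler_fence, Ordering};

/// Hypothetical stand-in for the mailbox: a message buffer in RAM plus a
/// "doorbell" word that plays the role of the MMIO register.
pub struct FakeMailbox {
    pub buffer: [u32; 8],
    pub doorbell: u32,
}

impl FakeMailbox {
    pub fn new() -> Self {
        FakeMailbox {
            buffer: [0; 8],
            doorbell: 0,
        }
    }

    /// Copy `payload` into the buffer, then signal the (imaginary) GPU.
    pub fn send(&mut self, payload: &[u32]) {
        // Plain stores: the compiler may optimize and reorder these freely
        // among themselves.
        for (slot, word) in self.buffer.iter_mut().zip(payload) {
            *slot = *word;
        }

        // Ask the compiler not to move the preceding stores past the
        // stores that follow. This emits no CPU barrier instruction.
        compiler_fence(Ordering::Release);

        // Signaling store. On real hardware this would target an MMIO
        // address; here it is a volatile write to a normal field.
        unsafe { std::ptr::write_volatile(&mut self.doorbell, 1) };
    }
}
```

Calling `FakeMailbox::new()` followed by `send(&[1, 2, 3])` leaves the first three buffer words set and `doorbell` equal to 1. The fence only constrains the compiler, so this sketch says nothing about reordering performed by the CPU itself.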

Binary file not shown.

@ -57,7 +57,7 @@ fn main() {
// Insert a compiler fence that ensures that all stores to the
// mbox buffer are finished before the GPU is signaled (which is
// done by a store operation as well).
- compiler_fence(Ordering::SeqCst);
+ compiler_fence(Ordering::Release);
// send the message to the GPU and receive answer
let serial_avail = match mbox.call(mbox::channel::PROP) {

Binary file not shown.

@ -59,7 +59,7 @@ fn main() {
// Insert a compiler fence that ensures that all stores to the
// mbox buffer are finished before the GPU is signaled (which is
// done by a store operation as well).
- compiler_fence(Ordering::SeqCst);
+ compiler_fence(Ordering::Release);
// send the message to the GPU and receive answer
let serial_avail = match mbox.call(mbox::channel::PROP) {

@ -81,7 +81,7 @@ impl Uart {
// Insert a compiler fence that ensures that all stores to the
// mbox buffer are finished before the GPU is signaled (which
// is done by a store operation as well).
- compiler_fence(Ordering::SeqCst);
+ compiler_fence(Ordering::Release);
if mbox.call(mbox::channel::PROP).is_err() {
return Err(UartError::MailboxError); // Abort if UART clocks couldn't be set

Binary file not shown.

@ -81,7 +81,7 @@ impl Uart {
// Insert a compiler fence that ensures that all stores to the
// mbox buffer are finished before the GPU is signaled (which
// is done by a store operation as well).
- compiler_fence(Ordering::SeqCst);
+ compiler_fence(Ordering::Release);
if mbox.call(mbox::channel::PROP).is_err() {
return Err(UartError::MailboxError); // Abort if UART clocks couldn't be set
