.\" $NetBSD: membar_ops.3,v 1.10 2022/04/09 23:32:52 riastradh Exp $
.\"
.\" Copyright (c) 2007, 2008 The NetBSD Foundation, Inc.
.\" All rights reserved.
.\"
.\" This code is derived from software contributed to The NetBSD Foundation
.\" by Jason R. Thorpe.
.\"
.\" Redistribution and use in source and binary forms, with or without
.\" modification, are permitted provided that the following conditions
.\" are met:
.\" 1. Redistributions of source code must retain the above copyright
.\" notice, this list of conditions and the following disclaimer.
.\" 2. Redistributions in binary form must reproduce the above copyright
.\" notice, this list of conditions and the following disclaimer in the
.\" documentation and/or other materials provided with the distribution.
.\"
.\" THIS SOFTWARE IS PROVIDED BY THE NETBSD FOUNDATION, INC. AND CONTRIBUTORS
.\" ``AS IS'' AND ANY EXPRESS OR IMPLIED WARRANTIES, INCLUDING, BUT NOT LIMITED
.\" TO, THE IMPLIED WARRANTIES OF MERCHANTABILITY AND FITNESS FOR A PARTICULAR
.\" PURPOSE ARE DISCLAIMED. IN NO EVENT SHALL THE FOUNDATION OR CONTRIBUTORS
.\" BE LIABLE FOR ANY DIRECT, INDIRECT, INCIDENTAL, SPECIAL, EXEMPLARY, OR
.\" CONSEQUENTIAL DAMAGES (INCLUDING, BUT NOT LIMITED TO, PROCUREMENT OF
.\" SUBSTITUTE GOODS OR SERVICES; LOSS OF USE, DATA, OR PROFITS; OR BUSINESS
.\" INTERRUPTION) HOWEVER CAUSED AND ON ANY THEORY OF LIABILITY, WHETHER IN
.\" CONTRACT, STRICT LIABILITY, OR TORT (INCLUDING NEGLIGENCE OR OTHERWISE)
.\" ARISING IN ANY WAY OUT OF THE USE OF THIS SOFTWARE, EVEN IF ADVISED OF THE
.\" POSSIBILITY OF SUCH DAMAGE.
.\"
.Dd March 30, 2022
.Dt MEMBAR_OPS 3
.Os
.Sh NAME
.Nm membar_ops ,
.Nm membar_acquire ,
.Nm membar_release ,
.Nm membar_producer ,
.Nm membar_consumer ,
.Nm membar_datadep_consumer ,
.Nm membar_sync
.Nd memory ordering barriers
.\" .Sh LIBRARY
.\" .Lb libc
.Sh SYNOPSIS
.In sys/atomic.h
.\"
.Ft void
.Fn membar_acquire "void"
.Ft void
.Fn membar_release "void"
.Ft void
.Fn membar_producer "void"
.Ft void
.Fn membar_consumer "void"
.Ft void
.Fn membar_datadep_consumer "void"
.Ft void
.Fn membar_sync "void"
.Sh DESCRIPTION
The
.Nm
family of functions prevents reordering of memory operations, as needed
for synchronization in multiprocessor execution environments that have
relaxed load and store order.
.Pp
In general, memory barriers must come in pairs \(em a barrier on one
CPU, such as
.Fn membar_release ,
must pair with a barrier on another CPU, such as
.Fn membar_acquire ,
in order to synchronize anything between the two CPUs.
Code using
.Nm
should generally be annotated with comments identifying how the
barriers are paired.
.Pp
.Nm
affect only operations on regular memory, not on device
memory; see
.Xr bus_space 9
and
.Xr bus_dma 9
for machine-independent interfaces to handling device memory and DMA
operations for device drivers.
.Pp
Unlike C11,
.Em all
memory operations \(em that is, all loads and stores on regular
memory \(em are affected by
.Nm ,
not just C11 atomic operations on
.Vt _Atomic\^ Ns -qualified
objects.
.Bl -tag -width abcd
.It Fn membar_acquire
Any load preceding
.Fn membar_acquire
will happen before all memory operations following it.
.Pp
A load followed by a
.Fn membar_acquire
implies a
.Em load-acquire
operation in the language of C11.
.Fn membar_acquire
should only be used after atomic read/modify/write, such as
.Xr atomic_cas_uint 3 .
For regular loads, instead of
.Li "x = *p; membar_acquire()" ,
you should use
.Li "x = atomic_load_acquire(p)" .
.Pp
.Fn membar_acquire
is typically used in code that implements locking primitives to ensure
that a lock protects its data, and is typically paired with
.Fn membar_release ;
see below for an example.
.It Fn membar_release
All memory operations preceding
.Fn membar_release
will happen before any store that follows it.
.Pp
A
.Fn membar_release
followed by a store implies a
.Em store-release
operation in the language of C11.
.Fn membar_release
should only be used before atomic read/modify/write, such as
.Xr atomic_inc_uint 3 .
For regular stores, instead of
.Li "membar_release(); *p = x" ,
you should use
.Li "atomic_store_release(p, x)" .
.Pp
.Fn membar_release
is typically paired with
.Fn membar_acquire ,
and is typically used in code that implements locking or reference
counting primitives.
Releasing a lock or reference count should use
.Fn membar_release ,
and acquiring a lock or handling an object after draining references
should use
.Fn membar_acquire ,
so that whatever happened before releasing will also have happened
before acquiring.
For example:
.Bd -literal -offset abcdefgh
/* thread A -- release a reference */
obj->state.mumblefrotz = 42;
KASSERT(valid(&obj->state));
membar_release();
atomic_dec_uint(&obj->refcnt);

/*
 * thread B -- busy-wait until last reference is released,
 * then lock it by setting refcnt to UINT_MAX
 */
while (atomic_cas_uint(&obj->refcnt, 0, -1) != 0)
	continue;
membar_acquire();
KASSERT(valid(&obj->state));
obj->state.mumblefrotz--;
.Ed
.Pp
In this example,
.Em if
the load in
.Fn atomic_cas_uint
in thread B witnesses the store in
.Fn atomic_dec_uint
in thread A setting the reference count to zero,
.Em then
everything in thread A before the
.Fn membar_release
is guaranteed to happen before everything in thread B after the
.Fn membar_acquire ,
as if the machine had sequentially executed:
.Bd -literal -offset abcdefgh
obj->state.mumblefrotz = 42;	/* from thread A */
KASSERT(valid(&obj->state));
\&...
KASSERT(valid(&obj->state));	/* from thread B */
obj->state.mumblefrotz--;
.Ed
.Pp
.Fn membar_release
followed by a store, serving as a
.Em store-release
operation, may also be paired with a subsequent load followed by
.Fn membar_acquire ,
serving as the corresponding
.Em load-acquire
operation.
However, you should use
.Xr atomic_store_release 9
and
.Xr atomic_load_acquire 9
instead in that situation, unless the store is an atomic
read/modify/write which requires a separate
.Fn membar_release .
.It Fn membar_producer
All stores preceding
.Fn membar_producer
will happen before any stores following it.
.Pp
.Fn membar_producer
has no analogue in C11.
.Pp
.Fn membar_producer
is typically used in code that produces data for read-only consumers
which use
.Fn membar_consumer ,
such as
.Sq seqlocked
snapshots of statistics; see below for an example.
.It Fn membar_consumer
All loads preceding
.Fn membar_consumer
will complete before any loads after it.
.Pp
.Fn membar_consumer
has no analogue in C11.
.Pp
.Fn membar_consumer
is typically used in code that reads data from producers which use
.Fn membar_producer ,
such as
.Sq seqlocked
snapshots of statistics.
For example:
.Bd -literal
struct {
	/* version number and in-progress bit */
	unsigned	seq;

	/* read-only statistics, too large for atomic load */
	unsigned	foo;
	int		bar;
	uint64_t	baz;
} stats;

	/* producer (must be serialized, e.g.
	   with mutex(9)) */
	stats.seq |= 1;	/* mark update in progress */
	membar_producer();
	stats.foo = count_foo();
	stats.bar = measure_bar();
	stats.baz = enumerate_baz();
	membar_producer();
	stats.seq++;	/* bump version number */

	/* consumer (in parallel w/ producer, other consumers) */
restart:
	while ((seq = stats.seq) & 1)	/* wait for update */
		SPINLOCK_BACKOFF_HOOK;
	membar_consumer();
	foo = stats.foo;	/* read out a candidate snapshot */
	bar = stats.bar;
	baz = stats.baz;
	membar_consumer();
	if (seq != stats.seq)	/* try again if version changed */
		goto restart;
.Ed
.It Fn membar_datadep_consumer
Same as
.Fn membar_consumer ,
but limited to loads of addresses dependent on prior loads, or
.Sq data-dependent
loads:
.Bd -literal -offset indent
int **pp, *p, v;

p = *pp;
membar_datadep_consumer();
v = *p;
consume(v);
.Ed
.Pp
.Fn membar_datadep_consumer
is typically paired with
.Fn membar_release
by code that initializes an object before publishing it.
However, you should use
.Xr atomic_store_release 9
and
.Xr atomic_load_consume 9
instead, to avoid obscure edge cases in case the consumer is not
read-only.
.Pp
.Fn membar_datadep_consumer
does not guarantee ordering of loads in branches, or
.Sq control-dependent
loads \(em you must use
.Fn membar_consumer
instead:
.Bd -literal -offset indent
int *ok, *p, v;

if (*ok) {
	membar_consumer();
	v = *p;
	consume(v);
}
.Ed
.Pp
Most CPUs do not reorder data-dependent loads (i.e., most CPUs
guarantee that cached values are not stale in that case), so
.Fn membar_datadep_consumer
is a no-op on those CPUs.
.It Fn membar_sync
All memory operations preceding
.Fn membar_sync
will happen before any memory operations following it.
.Pp
.Fn membar_sync
is a sequential consistency acquire/release barrier, analogous to
.Li "atomic_thread_fence(memory_order_seq_cst)"
in C11.
.Pp
.Fn membar_sync
is typically paired with
.Fn membar_sync .
.Pp
.Fn membar_sync
is typically not needed except in exotic synchronization schemes like
Dekker's algorithm that require store-before-load ordering.
If you are tempted to reach for it, first see whether there is another
way to do what you are trying to do.
.El
.Sh DEPRECATED MEMORY BARRIERS
The following memory barriers are deprecated.
They were imported from Solaris, which describes them as providing
ordering relative to
.Sq lock acquisition ,
but the documentation in
.Nx
disagreed with both the implementation and actual use about their
semantics.
.Bl -tag -width abcd
.It Fn membar_enter
Originally documented as store-before-load/store, this was instead
implemented as load-before-load/store on some platforms, which is what
essentially all uses relied on.
Now this is implemented as an alias for
.Fn membar_sync
everywhere, meaning a full load/store-before-load/store sequential
consistency barrier, in order to guarantee what the documentation
claimed
.Em and
what the implementation actually did.
.Pp
New code should use
.Fn membar_acquire
for load-before-load/store ordering, which is what most uses need, or
.Fn membar_sync
for store-before-load/store ordering, which typically only appears in
exotic synchronization schemes like Dekker's algorithm.
.It Fn membar_exit
Alias for
.Fn membar_release .
This was originally meant to be paired with
.Fn membar_enter .
.Pp
New code should use
.Fn membar_release
instead.
.El
.Sh SEE ALSO
.Xr atomic_ops 3 ,
.Xr atomic_loadstore 9 ,
.Xr bus_dma 9 ,
.Xr bus_space 9
.Sh HISTORY
The
.Nm membar_ops
functions first appeared in
.Nx 5.0 .
.Pp
The data-dependent load barrier,
.Fn membar_datadep_consumer ,
first appeared in
.Nx 7.0 .
.Pp
The
.Fn membar_acquire
and
.Fn membar_release
functions first appeared, and the
.Fn membar_enter
and
.Fn membar_exit
functions were deprecated, in
.Nx 10.0 .
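.Sh EXAMPLES
The reference-count pairing of
.Fn membar_release
and
.Fn membar_acquire
described above can be tried outside the kernel with a portable C11
analogue.
The following is a minimal sketch under stated assumptions, not the
.Nx
implementation: it substitutes
.Li "atomic_thread_fence(memory_order_release)"
and
.Li "atomic_thread_fence(memory_order_acquire)"
for the two barriers, uses relaxed C11 read/modify/write operations in
place of
.Xr atomic_dec_uint 3
and
.Xr atomic_cas_uint 3 ,
and assumes POSIX threads are available.
The names
.Vt struct obj ,
.Fn release_ref ,
.Fn drain_and_lock ,
and
.Fn run_demo
are hypothetical.

```c
#include <assert.h>
#include <limits.h>
#include <pthread.h>
#include <stdatomic.h>
#include <stddef.h>

/* Hypothetical refcounted object; all names here are illustrative. */
struct obj {
	int		mumblefrotz;	/* plain, non-atomic data */
	atomic_uint	refcnt;
};

/* Thread A: update the object, then drop the last reference. */
static void *
release_ref(void *arg)
{
	struct obj *obj = arg;

	obj->mumblefrotz = 42;
	/* assumed stand-in for membar_release() */
	atomic_thread_fence(memory_order_release);
	/* assumed stand-in for atomic_dec_uint(&obj->refcnt) */
	atomic_fetch_sub_explicit(&obj->refcnt, 1, memory_order_relaxed);
	return NULL;
}

/* Thread B: spin until refcnt is 0, then claim it with UINT_MAX. */
static int
drain_and_lock(struct obj *obj)
{
	unsigned expected;

	do {
		expected = 0;	/* a failed CAS overwrites expected */
	} while (!atomic_compare_exchange_weak_explicit(&obj->refcnt,
	    &expected, UINT_MAX, memory_order_relaxed, memory_order_relaxed));
	/* assumed stand-in for membar_acquire() */
	atomic_thread_fence(memory_order_acquire);
	/* thread A's plain store is ordered before everything below */
	assert(obj->mumblefrotz == 42);
	return --obj->mumblefrotz;
}

/* Run both sides once; returns the value thread B ends up with. */
int
run_demo(void)
{
	struct obj obj = { .mumblefrotz = 0 };
	pthread_t a;
	int v;

	atomic_init(&obj.refcnt, 1);	/* one outstanding reference */
	if (pthread_create(&a, NULL, release_ref, &obj) != 0)
		return -1;
	v = drain_and_lock(&obj);
	pthread_join(a, NULL);
	return v;
}
```

Calling
.Fn run_demo
from a small
.Fn main
should return 41 on every run, because the compare-and-swap can only
succeed after it observes the decrement to zero, and the two fences then
order the plain store of 42 before the decrement in thread B.
Building the sketch typically needs a C11 compiler with thread support
enabled, for example
.Li "cc -std=c11 -pthread" .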