tracing_mutex/lib.rs
//! Mutexes can deadlock each other, but you can avoid this by always acquiring your locks in a
//! consistent order. This crate provides tracing to ensure that you do.
//!
//! This crate tracks a virtual "stack" of locks that the current thread holds, and whenever a new
//! lock is acquired, a dependency is created from the last lock to the new one. These dependencies
//! together form a graph. As long as that graph does not contain any cycles, your program is
//! guaranteed to never deadlock.
//!
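//! For example, the sketch below acquires two mutexes in opposite orders. This is a minimal
//! sketch that assumes the std-style API of the [`stdsync`] wrappers, which perform tracing when
//! debug assertions are enabled (as they are when running tests):
//!
//! ```should_panic
//! use tracing_mutex::stdsync::Mutex;
//!
//! let a = Mutex::new(());
//! let b = Mutex::new(());
//!
//! {
//!     // Establishes the dependency a → b.
//!     let _a = a.lock().unwrap();
//!     let _b = b.lock().unwrap();
//! }
//!
//! // Locking in the reverse order would add the dependency b → a, closing a cycle, so the
//! // second `lock` call panics instead of risking a deadlock.
//! let _b = b.lock().unwrap();
//! let _a = a.lock().unwrap();
//! ```
//!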
//! # Panics
//!
//! The primary method by which this crate signals an invalid lock acquisition order is by
//! panicking. When acquiring a lock would create a cycle in the dependency graph, the thread
//! panics instead. This panic does not poison the underlying mutex.
//!
//! The conflicting dependency is not added to the graph, so future attempts at locking should
//! succeed as normal.
//!
//! # Structure
//!
//! Each module in this crate exposes wrappers for a specific base mutex with dependency tracking
//! added. This includes [`stdsync`], which provides wrappers for the base locks in the standard
//! library, and more depending on enabled compile-time features. More back-ends may be added as
//! features in the future.
//!
//! # Feature flags
//!
//! `tracing-mutex` uses feature flags to reduce the impact of this crate on both your compile time
//! and runtime overhead. Below are the available flags. Modules are annotated with the features
//! they require.
//!
//! - `backtraces`: Enables capturing backtraces of mutex dependencies, to make it easier to
//! determine what sequence of events would trigger a deadlock. This is enabled by default, but if
//! the performance overhead is unacceptable, it can be disabled by disabling default features.
//!
//! - `lock_api`: Enables the wrapper lock for [`lock_api`][lock_api] locks
//!
//! - `parking_lot`: Enables wrapper types for [`parking_lot`][parking_lot] mutexes
//!
//! - `experimental`: Enables experimental features. Experimental features are intended for testing
//! and playing with new APIs before committing to them. As such, breaking changes may be
//! introduced between otherwise semver-compatible versions, and the MSRV does not apply to
//! experimental features.
//!
//! # Performance considerations
//!
//! Tracing a mutex adds overhead to certain mutex operations in order to do the required
//! bookkeeping. The overhead of each operation is listed below.
//!
//! - **Acquiring a lock** locks the global dependency graph temporarily to check if the new lock
//! would introduce a cyclic dependency. This crate uses the algorithm proposed in ["A Dynamic
//! Topological Sort Algorithm for Directed Acyclic Graphs" by David J. Pearce and Paul H.J.
//! Kelly][paper] to detect cycles as efficiently as possible. In addition, a thread-local lock set
//! is updated with the new lock.
//!
//! - **Releasing a lock** updates a thread-local lock set to remove the released lock.
//!
//! - **Allocating a lock** performs an atomic update to a shared counter.
//!
//! - **Deallocating a mutex** temporarily locks the global dependency graph to remove the lock
//! entry in the dependency graph.
//!
//! These operations have been reasonably optimized, but the performance penalty may still be too
//! high for production use. In those cases, it may be beneficial to instead use debug-only versions
//! (such as [`stdsync::Mutex`]) which evaluate to a tracing mutex when debug assertions are
//! enabled, and to the underlying mutex when they're not.
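//!
//! A minimal sketch of that pattern, again assuming the std-style API of [`stdsync::Mutex`]: the
//! same code checks lock ordering when debug assertions are enabled and uses the underlying
//! `std::sync::Mutex` when they are not.
//!
//! ```
//! use tracing_mutex::stdsync::Mutex;
//!
//! let counter = Mutex::new(0u32);
//! *counter.lock().unwrap() += 1;
//! assert_eq!(*counter.lock().unwrap(), 1);
//! ```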
//!
//! For ease of debugging, this crate will, by default, capture a backtrace when establishing a new
//! dependency between two mutexes. This has an additional overhead of over 60%. If this debugging
//! aid is not required, it can be disabled by disabling default features.
//!
//! [paper]: https://whileydave.com/publications/pk07_jea/
//! [lock_api]: https://docs.rs/lock_api/0.4/lock_api/index.html
//! [parking_lot]: https://docs.rs/parking_lot/0.12.1/parking_lot/
#![cfg_attr(docsrs, feature(doc_cfg))]
use std::cell::RefCell;
use std::fmt;
use std::marker::PhantomData;
use std::ops::Deref;
use std::ops::DerefMut;
use std::sync::Mutex;
use std::sync::MutexGuard;
use std::sync::OnceLock;
use std::sync::PoisonError;
use std::sync::atomic::AtomicUsize;
use std::sync::atomic::Ordering;

#[cfg(feature = "lock_api")]
#[cfg_attr(docsrs, doc(cfg(feature = "lock_api")))]
#[deprecated = "The top-level re-export `lock_api` is deprecated. Use `tracing_mutex::lockapi::raw` instead"]
pub use lock_api;
#[cfg(feature = "parking_lot")]
#[cfg_attr(docsrs, doc(cfg(feature = "parking_lot")))]
#[deprecated = "The top-level re-export `parking_lot` is deprecated. Use `tracing_mutex::parkinglot::raw` instead"]
pub use parking_lot;

use graph::DiGraph;
use reporting::Dep;
use reporting::Reportable;

mod graph;
#[cfg(any(feature = "lock_api", feature = "lockapi"))]
#[cfg_attr(docsrs, doc(cfg(feature = "lock_api")))]
#[cfg_attr(
    all(not(docsrs), feature = "lockapi", not(feature = "lock_api")),
    deprecated = "The `lockapi` feature has been renamed `lock_api`"
)]
pub mod lockapi;
#[cfg(any(feature = "parking_lot", feature = "parkinglot"))]
#[cfg_attr(docsrs, doc(cfg(feature = "parking_lot")))]
#[cfg_attr(
    all(not(docsrs), feature = "parkinglot", not(feature = "parking_lot")),
    deprecated = "The `parkinglot` feature has been renamed `parking_lot`"
)]
pub mod parkinglot;
mod reporting;
pub mod stdsync;
pub mod util;

thread_local! {
    /// Stack to track which locks are held
    ///
    /// Assuming that locks are roughly released in the reverse order in which they were acquired,
    /// a stack should be more efficient to keep track of the current state than a set would be.
    static HELD_LOCKS: RefCell<Vec<usize>> = const { RefCell::new(Vec::new()) };
}

/// Dedicated ID type for Mutexes
///
/// # Unstable
///
/// This type is currently private to prevent usage while the exact implementation is figured out,
/// but it will likely be public in the future.
struct MutexId(usize);

impl MutexId {
    /// Get a new, unique mutex ID.
    ///
    /// This ID is guaranteed to be unique within the runtime of the program.
    ///
    /// # Panics
    ///
    /// This function may panic when there are no more mutex IDs available. The number of mutex IDs
    /// is `usize::MAX - 1`, which should be plenty for most practical applications.
    pub fn new() -> Self {
        // Counter for Mutex IDs. Atomic avoids the need for locking.
        static ID_SEQUENCE: AtomicUsize = AtomicUsize::new(0);

        ID_SEQUENCE
            .fetch_update(Ordering::SeqCst, Ordering::SeqCst, |id| id.checked_add(1))
            .map(Self)
            .expect("Mutex ID wraparound happened, results unreliable")
    }

    pub fn value(&self) -> usize {
        self.0
    }

    /// Get a borrowed guard for this lock.
    ///
    /// This method adds this Mutex ID to the dependency graph as needed, and adds the lock to the
    /// list of locks currently held by this thread.
    ///
    /// # Panics
    ///
    /// This method panics if the new dependency would introduce a cycle.
    pub fn get_borrowed(&self) -> BorrowedMutex {
        self.mark_held();
        BorrowedMutex {
            id: self,
            _not_send: PhantomData,
        }
    }

    /// Mark this lock as held for the purposes of dependency tracking.
    ///
    /// # Panics
    ///
    /// This method panics if the new dependency would introduce a cycle.
    pub fn mark_held(&self) {
        let opt_cycle = HELD_LOCKS.with(|locks| {
            if let Some(&previous) = locks.borrow().last() {
                let mut graph = get_dependency_graph();

                graph.add_edge(previous, self.value(), Dep::capture).err()
            } else {
                None
            }
        });

        if let Some(cycle) = opt_cycle {
            panic!("{}", Dep::panic_message(&cycle))
        }

        HELD_LOCKS.with(|locks| locks.borrow_mut().push(self.value()));
    }

    /// Mark this lock as released for the purposes of dependency tracking.
    ///
    /// # Safety
    ///
    /// This lock must currently be marked as held by this thread, i.e. this call must be paired
    /// with an earlier call to `mark_held` on the same thread.
    pub unsafe fn mark_released(&self) {
        HELD_LOCKS.with(|locks| {
            let mut locks = locks.borrow_mut();

            for (i, &lock) in locks.iter().enumerate().rev() {
                if lock == self.value() {
                    locks.remove(i);
                    return;
                }
            }

            // Drop impls shouldn't panic but if this happens something is seriously broken.
            unreachable!("Tried to drop lock for mutex {:?} but it wasn't held", self)
        });
    }

    /// Execute the given closure while the guard is held.
    pub fn with_held<T>(&self, f: impl FnOnce() -> T) -> T {
        // Note: we MUST construct the RAII guard, we cannot simply mark held + mark released, as
        // f() may panic and corrupt our state.
        let _guard = self.get_borrowed();
        f()
    }
}

impl Default for MutexId {
    fn default() -> Self {
        Self::new()
    }
}

impl fmt::Debug for MutexId {
    fn fmt(&self, f: &mut fmt::Formatter<'_>) -> fmt::Result {
        write!(f, "MutexID({:?})", self.0)
    }
}

impl Drop for MutexId {
    fn drop(&mut self) {
        get_dependency_graph().remove_node(self.value());
    }
}

/// `const`-compatible version of [`crate::MutexId`].
///
/// This struct can be used similarly to the normal mutex ID, but to be const-compatible its ID is
/// generated on first use. This allows it to be used as the mutex ID for mutexes with a `const`
/// constructor.
///
/// This type can largely be replaced by `std::sync::LazyLock` once that is available in the
/// crate's MSRV.
struct LazyMutexId {
    inner: OnceLock<MutexId>,
}

impl LazyMutexId {
    pub const fn new() -> Self {
        Self {
            inner: OnceLock::new(),
        }
    }
}

impl fmt::Debug for LazyMutexId {
    fn fmt(&self, f: &mut fmt::Formatter<'_>) -> fmt::Result {
        write!(f, "{:?}", self.deref())
    }
}

impl Default for LazyMutexId {
    fn default() -> Self {
        Self::new()
    }
}

impl Deref for LazyMutexId {
    type Target = MutexId;

    fn deref(&self) -> &Self::Target {
        self.inner.get_or_init(MutexId::new)
    }
}

/// Borrowed mutex ID
///
/// This type should be used as part of a mutex guard wrapper. It can be acquired through
/// [`MutexId::get_borrowed`] and will automatically mark the mutex as not borrowed when it is
/// dropped.
///
/// This type is intentionally [`!Send`](std::marker::Send) because the ownership tracking is
/// based on a thread-local stack, which doesn't work if a guard is released on a different thread
/// from the one where it was acquired.
#[derive(Debug)]
struct BorrowedMutex<'a> {
    /// Reference to the mutex we're borrowing from
    id: &'a MutexId,
    /// This value serves no purpose but to make the type [`!Send`](std::marker::Send)
    _not_send: PhantomData<MutexGuard<'static, ()>>,
}

/// Drop a lock held by the current thread.
///
/// # Panics
///
/// This function panics if the lock does not appear to be held by this thread. If that happens,
/// that is an indication of a serious design flaw in this library.
impl Drop for BorrowedMutex<'_> {
    fn drop(&mut self) {
        // Safety: the only way to get a BorrowedMutex is by locking the mutex.
        unsafe { self.id.mark_released() };
    }
}

/// Get a reference to the current dependency graph
fn get_dependency_graph() -> impl DerefMut<Target = DiGraph<usize, Dep>> {
    static DEPENDENCY_GRAPH: OnceLock<Mutex<DiGraph<usize, Dep>>> = OnceLock::new();

    DEPENDENCY_GRAPH
        .get_or_init(Default::default)
        .lock()
        .unwrap_or_else(PoisonError::into_inner)
}

#[cfg(test)]
mod tests {
    use rand::seq::SliceRandom;
    use rand::thread_rng;

    use super::*;

    #[test]
    fn test_next_mutex_id() {
        let initial = MutexId::new();
        let next = MutexId::new();

        // Can't assert N + 1 because multiple threads may be running tests
        assert!(initial.0 < next.0);
    }

    #[test]
    fn test_lazy_mutex_id() {
        let a = LazyMutexId::new();
        let b = LazyMutexId::new();
        let c = LazyMutexId::new();

        let mut graph = get_dependency_graph();
        assert!(graph.add_edge(a.value(), b.value(), Dep::capture).is_ok());
        assert!(graph.add_edge(b.value(), c.value(), Dep::capture).is_ok());

        // Creating an edge c → a should fail as it introduces a cycle.
        assert!(graph.add_edge(c.value(), a.value(), Dep::capture).is_err());

        // Drop graph handle so we can drop vertices without deadlocking
        drop(graph);

        drop(b);

        // If b's destructor ran correctly, we can now add an edge from c to a.
        assert!(
            get_dependency_graph()
                .add_edge(c.value(), a.value(), Dep::capture)
                .is_ok()
        );
    }

    /// Test creating a cycle, then panicking.
    #[test]
    #[should_panic]
    fn test_mutex_id_conflict() {
        let ids = [MutexId::new(), MutexId::new(), MutexId::new()];

        for i in 0..3 {
            let _first_lock = ids[i].get_borrowed();
            let _second_lock = ids[(i + 1) % 3].get_borrowed();
        }
    }

    /// Fuzz the global dependency graph by fake-acquiring lots of mutexes in a valid order.
    ///
    /// This test generates all possible forward edges in a 100-node graph consisting of natural
    /// numbers, shuffles them, then adds them to the graph. This will always be a valid directed
    /// acyclic graph because there is a trivial order (the natural numbers), but because the edges
    /// are added in a random order the DiGraph will still occasionally need to reorder nodes.
    #[test]
    fn fuzz_mutex_id() {
        const NUM_NODES: usize = 100;

        let ids: Vec<MutexId> = (0..NUM_NODES).map(|_| Default::default()).collect();

        let mut edges = Vec::with_capacity(NUM_NODES * NUM_NODES);
        for i in 0..NUM_NODES {
            for j in i..NUM_NODES {
                if i != j {
                    edges.push((i, j));
                }
            }
        }

        edges.shuffle(&mut thread_rng());

        for (x, y) in edges {
            // Acquire the mutexes, smallest first to ensure a cycle-free graph
            let _ignored = ids[x].get_borrowed();
            let _ = ids[y].get_borrowed();
        }
    }
}