Have to be careful with bit fields - when subtracting, this was
overflowing into the write_locking bit.
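As a user-space illustration of the hazard (a toy model - the field layout
and masks here are made up, this is not the six lock code): subtracting on
the whole word borrows out of the counter and clobbers the adjacent flag
bit unless the update is confined to the counter's bits.

  /* Toy model: a counter sharing one word with an adjacent flag bit. */
  #include <stdint.h>
  #include <stdio.h>

  #define COUNT_MASK    0x7fffffffu  /* low 31 bits: counter */
  #define WRITE_LOCKING 0x80000000u  /* adjacent flag bit */

  int main(void)
  {
      uint32_t v = WRITE_LOCKING;    /* flag set, counter == 0 */

      /* Buggy: the subtraction borrows into the flag bit. */
      uint32_t buggy = v - 1;

      /* Safe: only the counter field is modified. */
      uint32_t safe = (v & ~COUNT_MASK) | ((v - 1) & COUNT_MASK);

      printf("buggy: flag %s\n", buggy & WRITE_LOCKING ? "set" : "cleared");
      printf("safe:  flag %s\n", safe & WRITE_LOCKING ? "set" : "cleared");
      return 0;
  }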
Signed-off-by: Kent Overstreet <kent.overstreet@linux.dev>
With the new cycle detector, taking a write lock will be able to fail -
unless we pass it nofail, which is possible but not preferred.
Signed-off-by: Kent Overstreet <kent.overstreet@linux.dev>
In the future, with the new deadlock cycle detector, we won't be using
bare six_lock_* anymore: lock wait entries will all be embedded in
btree_trans, and we will need a btree_trans context whenever locking a
btree node.
This patch plumbs a btree_trans to the few places that need it, and adds
two new locking functions (see the sketch below):
- btree_node_lock_nopath, which may fail, returning a transaction
restart, and
- btree_node_lock_nopath_nofail, to be used in places where we know we
cannot deadlock (i.e. because we're holding no other locks).
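A rough user-space sketch of the convention the two variants follow (the
real signatures, lock types and error codes differ - this is just the
shape): the plain call may fail with a transaction restart the caller must
handle, while the _nofail variant is only used where deadlock is
impossible and asserts success.

  /* Simplified model of the two locking variants; not the real API. */
  #include <assert.h>
  #include <stdbool.h>
  #include <stddef.h>

  #define TRANSACTION_RESTART 1  /* stand-in for -BCH_ERR_transaction_restart_* */

  struct trans;                  /* stands in for struct btree_trans */
  struct node;                   /* stands in for a btree node */

  /* Pretend cycle detector: would taking this lock deadlock? */
  static bool would_deadlock(struct trans *trans, struct node *b)
  {
      (void) trans; (void) b;
      return false;
  }

  static void do_lock(struct node *b) { (void) b; }

  /* May fail: the caller must check for a transaction restart and unwind. */
  static int node_lock_nopath(struct trans *trans, struct node *b)
  {
      if (would_deadlock(trans, b))
          return -TRANSACTION_RESTART;
      do_lock(b);
      return 0;
  }

  /* Only valid when we hold no other locks, so deadlock is impossible. */
  static void node_lock_nopath_nofail(struct trans *trans, struct node *b)
  {
      int ret = node_lock_nopath(trans, b);

      assert(!ret);              /* a failure here would be a bug */
  }

  int main(void)
  {
      node_lock_nopath_nofail(NULL, NULL);
      return node_lock_nopath(NULL, NULL);
  }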
Signed-off-by: Kent Overstreet <kent.overstreet@gmail.com>
six locks are unfair: while a thread is blocked trying to take a write
lock, new read locks will fail. The new deadlock cycle detector makes
use of our existing lock tracing, so we need to tell it we're holding a
write lock before we take the lock for it to work correctly.
Signed-off-by: Kent Overstreet <kent.overstreet@linux.dev>
Since we've now got time_stats for lock hold times (per btree
transaction), we don't need this anymore.
Signed-off-by: Kent Overstreet <kent.overstreet@linux.dev>
This fixes a small memory leak.
Signed-off-by: Kent Overstreet <kent.overstreet@linux.dev>
Also, do some reorganizing/renaming, convert atomic counters in bch_fs
to persistent counters, and add a few missing counters.
Signed-off-by: Kent Overstreet <kent.overstreet@linux.dev>
This adds a new error code, -BCH_ERR_journal_reclaim_would_deadlock: on
failure to get a journal pre-reservation because we're called from
journal reclaim, we're not supposed to return a transaction restart
error - this fixes a livelock.
Signed-off-by: Kent Overstreet <kent.overstreet@linux.dev>
This moves the IS_ERR_OR_NULL() check to the inline part, since that's
the fast path.
Signed-off-by: Kent Overstreet <kent.overstreet@linux.dev>
It now includes journal_flags.
Signed-off-by: Kent Overstreet <kent.overstreet@linux.dev>
It now prints the error name when the btree node is an error pointer;
also, don't trace failures when the btree node is
BCH_ERR_no_btree_node_up.
Signed-off-by: Kent Overstreet <kent.overstreet@linux.dev>
- Don't decrease BTREE_ITER_MAX when building with CONFIG_LOCKDEP
anymore: the lockdep table sizes are configurable now, so we no longer
need this.
- btree_trans_too_many_iters() is less conservative now. Previously it
caused a transaction restart once more than BTREE_ITER_MAX / 2 paths were
in use; the threshold is now BTREE_ITER_MAX - 8 (see the sketch below).
This helps with excessive transaction restarts/livelocks in the bucket
allocator path.
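For illustration, a minimal model of the new check (the names, the bitmap
representation and the value of BTREE_ITER_MAX are assumptions for this
sketch): paths in use are bits in a word, and a restart is only requested
once the count is within 8 of the limit.

  /* Toy model of the "too many paths" threshold; not bcachefs code. */
  #include <stdbool.h>
  #include <stdint.h>
  #include <stdio.h>

  #define BTREE_ITER_MAX 64  /* assumed limit, for the sketch only */

  static int popcount64(uint64_t x)
  {
      int n = 0;

      for (; x; x &= x - 1)  /* clear the lowest set bit */
          n++;
      return n;
  }

  /* Old rule restarted past BTREE_ITER_MAX / 2; new rule waits for MAX - 8. */
  static bool too_many_paths(uint64_t paths_allocated)
  {
      return popcount64(paths_allocated) > BTREE_ITER_MAX - 8;
  }

  int main(void)
  {
      uint64_t paths = (1ULL << 40) - 1;  /* 40 paths allocated */

      printf("40 paths -> %s\n", too_many_paths(paths) ? "restart" : "ok");
      return 0;
  }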
Signed-off-by: Kent Overstreet <kent.overstreet@linux.dev>
We need to use the right class for some assertions to work correctly.
Signed-off-by: Kent Overstreet <kent.overstreet@linux.dev>
The upcoming lock cycle detection code will need to know precisely which
locks every btree_trans is holding, including write locks - this patch
updates btree_node_locked_type to include write locks.
Signed-off-by: Kent Overstreet <kent.overstreet@linux.dev>
Improve our debugfs output, to help in debugging deadlocks: for every
btree node we print, this now shows the current number of read locks,
intent locks and write locks held.
Signed-off-by: Kent Overstreet <kent.overstreet@linux.dev>
This is just some type safety cleanup.
Signed-off-by: Kent Overstreet <kent.overstreet@gmail.com>
This patch
- tracks maximum bch2_trans_kmalloc() memory used in btree_transaction_stats
- makes it available in debugfs
- switches bch2_trans_init() to using that for the amount of memory to
preallocate, instead of the parameter passed in
This drastically reduces transaction restarts, and means we no longer
need to track this in the source code.
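A minimal sketch of the idea (the structure and function names are
invented for this example): record the high-water mark of what
transactions actually allocate, then size the preallocation from that
instead of from a hardcoded hint.

  /* Toy model of tracking a per-transaction allocation high-water mark. */
  #include <stddef.h>
  #include <stdio.h>

  struct trans_stats {
      size_t max_mem;       /* most memory any one transaction used */
  };

  struct trans {
      struct trans_stats *stats;
      size_t mem_used;
  };

  static void trans_alloc_account(struct trans *trans, size_t size)
  {
      trans->mem_used += size;
      if (trans->mem_used > trans->stats->max_mem)
          trans->stats->max_mem = trans->mem_used;
  }

  /* The next transaction preallocates what past transactions really needed. */
  static size_t trans_prealloc_size(const struct trans_stats *stats)
  {
      return stats->max_mem;
  }

  int main(void)
  {
      struct trans_stats stats = { 0 };
      struct trans t = { .stats = &stats };

      trans_alloc_account(&t, 512);
      trans_alloc_account(&t, 2048);
      printf("preallocate %zu bytes next time\n", trans_prealloc_size(&stats));
      return 0;
  }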
Signed-off-by: Kent Overstreet <kent.overstreet@linux.dev>
six_lock_count now counts whether a write lock is held, and this patch
now also correctly counts six_lock->intent_lock_recurse.
Signed-off-by: Kent Overstreet <kent.overstreet@gmail.com>
Previously, we used two different bit arrays for tracking held btree
node locks. This patch switches to an array of two-bit integers, which
will let us track, in a future patch, when we hold a write lock (see the
sketch below).
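A user-space sketch of the new representation (the constants and helpers
are assumptions, not the real ones): a single integer holding two bits of
lock state per btree level, with room to record write locks later.

  /* Model of two bits of lock state per btree level; not the real helpers. */
  #include <stdint.h>
  #include <stdio.h>

  enum lock_type {              /* two bits per level */
      LEVEL_UNLOCKED = 0,
      LEVEL_READ     = 1,
      LEVEL_INTENT   = 2,
      LEVEL_WRITE    = 3,       /* room for write locks in a later patch */
  };

  static enum lock_type level_locked_type(uint16_t nodes_locked, unsigned level)
  {
      return (nodes_locked >> (level * 2)) & 3;
  }

  static uint16_t mark_level_locked(uint16_t nodes_locked, unsigned level,
                                    enum lock_type type)
  {
      nodes_locked &= ~(3U << (level * 2));  /* clear the old state */
      return nodes_locked | (type << (level * 2));
  }

  int main(void)
  {
      uint16_t locked = 0;

      locked = mark_level_locked(locked, 0, LEVEL_INTENT);
      locked = mark_level_locked(locked, 1, LEVEL_READ);
      printf("level 0: %d, level 1: %d\n",
             level_locked_type(locked, 0), level_locked_type(locked, 1));
      return 0;
  }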
Signed-off-by: Kent Overstreet <kent.overstreet@gmail.com>
Held btree locks are tracked in btree_path->nodes_locked and
btree_path->nodes_intent_locked. Upcoming patches are going to change
the representation in struct btree_path, so this patch switches to
proper helpers instead of direct access to these fields.
Signed-off-by: Kent Overstreet <kent.overstreet@gmail.com>
Tidy things up a bit before doing more work in this file.
Signed-off-by: Kent Overstreet <kent.overstreet@gmail.com>
Start to centralize some of the locking code in a new file; more locking
code will be moving here in the future.
Signed-off-by: Kent Overstreet <kent.overstreet@gmail.com>
Device labels are represented as pointers in the member info section: we
need to get and then set the label for it to be kept correctly.
Signed-off-by: Kent Overstreet <kent.overstreet@gmail.com>
This message fires when the data update path races with a foreground
write that overwrote the data that was being moved - this isn't a
concerning event as long as it's not happening too often, so switch it
to a tracepoint.
Signed-off-by: Kent Overstreet <kent.overstreet@gmail.com>
Signed-off-by: Kent Overstreet <kent.overstreet@gmail.com>
This adds a new error code, -BCH_ERR_transaction_restart_nested. The new
convention is that functions that handle transaction restarts within an
existing transaction context should return
-BCH_ERR_transaction_restart_nested when they did so, since they
invalidated the outer transaction context (see the sketch below).
This also means bch2_btree_delete_range_trans() is changed to only call
bch2_trans_begin() after a transaction restart, not on every loop
iteration.
This is to fix a bug in fsck, in check_inode() when we truncate an inode
with BCH_INODE_I_SIZE_DIRTY set.
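A toy sketch of that convention (names and the error value are
assumptions): snapshot the transaction's restart counter on entry, and if
it has changed by the time we return successfully, report the
nested-restart error so the caller knows its context was invalidated.

  /* Toy model of the "nested restart" return convention. */
  #include <stdio.h>

  #define ERR_TRANSACTION_RESTART_NESTED 2  /* stand-in error code */

  struct trans {
      unsigned restart_count;   /* bumped every time the transaction restarts */
  };

  /* Pretend inner loop that hit - and handled - a transaction restart. */
  static int do_inner_work(struct trans *trans)
  {
      trans->restart_count++;
      return 0;
  }

  static int helper(struct trans *trans)
  {
      unsigned restart_count = trans->restart_count;
      int ret = do_inner_work(trans);

      if (ret)
          return ret;
      /* We restarted internally: tell the caller its context is stale. */
      return trans->restart_count != restart_count
          ? -ERR_TRANSACTION_RESTART_NESTED
          : 0;
  }

  int main(void)
  {
      struct trans t = { 0 };

      printf("helper() returned %d\n", helper(&t));
      return 0;
  }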
Signed-off-by: Kent Overstreet <kent.overstreet@gmail.com>
- fsck_inode_rm() wasn't returning BCH_ERR_transaction_restart_nested
- change bch2_trans_verify_not_restarted() to call panic() - we don't
want these errors to be missed
Signed-off-by: Kent Overstreet <kent.overstreet@gmail.com>
iter->k needs to be consistent with iter->pos - required for
bch2_btree_iter_(rewind|advance) to work correctly.
Signed-off-by: Kent Overstreet <kent.overstreet@gmail.com>
When returning a key from the key cache, in BTREE_ITER_WITH_KEY_CACHE
mode, we don't want to set should_be_locked on iter->path: we're not
returning a key from that path, so we don't need to. Also, since we
traversed the key cache iterator before setting should_be_locked on that
path, it might be unlocked (if we unlocked, bch2_trans_relock() won't
have relocked it).
Signed-off-by: Kent Overstreet <kent.overstreet@gmail.com>
For debugging the eytzinger search tree code, and low level bkey packing
code, it can be helpful to see things in binary: this patch improves our
helpers for doing so.
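For example, a generic helper along these lines (plain C, not the actual
printbuf helpers) is enough to dump a value in binary, most significant
bit first:

  #include <stdint.h>
  #include <stdio.h>

  /* Print the low nr_bits of v in binary, most significant bit first. */
  static void print_u64_binary(uint64_t v, unsigned nr_bits)
  {
      for (unsigned i = nr_bits; i-- > 0;)
          putchar(v & (1ULL << i) ? '1' : '0');
      putchar('\n');
  }

  int main(void)
  {
      print_u64_binary(0xdead, 16);  /* prints 1101111010101101 */
      return 0;
  }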
Signed-off-by: Kent Overstreet <kent.overstreet@gmail.com>
This is to fix a livelock in the btree split path described by the
previous patch.
Signed-off-by: Kent Overstreet <kent.overstreet@gmail.com>
This is prep work for fixing a livelock involving the btree split path.
Currently, bch2_btree_update_start() unconditionally drops btree locks
before allocating btree nodes, then calls bch2_trans_relock() - which
may fail and cause a transaction restart.
It appears multiple threads are attempting to split at the same time,
and then when we call bch2_trans_relock() another thread is holding an
intent lock on the root node, preparing to split, causing us to
restart... then it drops locks, allocates, and relocks, but then we're
holding the root lock... oops
This patch does not fix the bug, but it plumbs btree_trans all the way
through the allocator path so that we can use the same transaction
context and don't have to drop locks. The next patch will be reworking
bch2_btree_update_start.
Signed-off-by: Kent Overstreet <kent.overstreet@gmail.com>
Signed-off-by: Kent Overstreet <kent.overstreet@gmail.com>
We should be calling btree_node_mem_ptr_set() before path_level_init(),
since we already touched the key that btree_node_mem_ptr_set() will
modify and path_level_init() will be doing the lookup in the child btree
node we're recursing to.
Signed-off-by: Kent Overstreet <kent.overstreet@gmail.com>
New btree nodes don't have allocation information written until the
node has been written, which means there's a race where the backpointer
for the old node points to a node that no longer exists.
bch2_backpointer_get_node() checks for this, but fsck uses
bch2_backpointer_get_key(); this patch updates
bch2_backpointer_get_key() to fall back to bch2_backpointer_get_node()
when it doesn't find the key, so the same check is applied.
Signed-off-by: Kent Overstreet <kent.overstreet@gmail.com>
Topology repair may change btree node min/max keys: when it does so, we
need to always rebuild eytzinger search trees because nodes directly
depend on those values.
This fixes a bug found by the 'kill_btree_node' test, where we'd pop an
assertion in bch2_bset_search_linear().
Signed-off-by: Kent Overstreet <kent.overstreet@gmail.com>
This is a no brainer, and makes the output of some of our tests easier
to manage.
Signed-off-by: Kent Overstreet <kent.overstreet@gmail.com>
For now this is just a BUG_ON() - we may want to change this to return
an error in the future.
Signed-off-by: Kent Overstreet <kent.overstreet@gmail.com>
Signed-off-by: Olexa Bilaniuk <obilaniu@gmail.com>
This improves flush_buf() so that it always returns nonzero when we're
done reading and ready to return to userspace, and so that it returns
the value we want to return to userspace (number of bytes read, if there
wasn't an error).
In the future we'll be abstracting this mechanism better, pulling it out
of bcachefs, and using it to replace seq_file.
Signed-off-by: Kent Overstreet <kent.overstreet@gmail.com>
- Add a flag, has_indent_or_tabstops, that is set if the indent level or
tabstops are set.
- Tabstops can no longer be set by modifying the tabstop array directly:
instead, new functions are provided (see the sketch below):
  printbuf_tabstop_push()   - add a new tabstop, n spaces after the
                              previous tabstop
  printbuf_tabstop_pop()    - remove the previous tabstop
  printbuf_tabstops_reset() - remove all tabstops
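A small user-space model of the tabstop stack semantics described above
(the real printbuf API works on struct printbuf and tracks more state;
this only shows the push/pop/reset behaviour):

  /* Toy model of a printbuf-style tabstop stack: push, pop, reset. */
  #include <stdio.h>

  #define MAX_TABSTOPS 8

  struct tabstops {
      unsigned nr;
      unsigned pos[MAX_TABSTOPS];  /* absolute column of each tabstop */
  };

  /* Add a new tabstop, n spaces after the previous tabstop. */
  static void tabstop_push(struct tabstops *t, unsigned spaces)
  {
      unsigned prev = t->nr ? t->pos[t->nr - 1] : 0;

      if (t->nr < MAX_TABSTOPS)
          t->pos[t->nr++] = prev + spaces;
  }

  /* Remove the previous (most recently added) tabstop. */
  static void tabstop_pop(struct tabstops *t)
  {
      if (t->nr)
          t->nr--;
  }

  /* Remove all tabstops. */
  static void tabstops_reset(struct tabstops *t)
  {
      t->nr = 0;
  }

  int main(void)
  {
      struct tabstops t = { 0 };

      tabstop_push(&t, 16);  /* tabstop at column 16 */
      tabstop_push(&t, 8);   /* tabstop at column 24 */
      printf("%u tabstops, at columns %u and %u\n", t.nr, t.pos[0], t.pos[1]);
      tabstop_pop(&t);
      tabstops_reset(&t);
      return 0;
  }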
Signed-off-by: Kent Overstreet <kent.overstreet@gmail.com>
We were iterating starting at BCACHEFS_ROOT_INO, but snapshots start at
POS_MIN - meaning this code was never getting run.
Signed-off-by: Kent Overstreet <kent.overstreet@gmail.com>
Reported-by: Olexa Bilaniuk <obilaniu@gmail.com>
Instead of counting transaction restarts, count when the transaction is
restarted: if bch2_trans_begin() was called when the transaction wasn't
restarted we need to ensure restart_count is still incremented.
Signed-off-by: Kent Overstreet <kent.overstreet@gmail.com>
Turns out this assertion was something we could legitimately hit - add a
comment describing what's going on, and handle it.
Signed-off-by: Kent Overstreet <kent.overstreet@gmail.com>
We need to turn the flush_buf() thing into a proper API, to replace
seq_file.
Signed-off-by: Kent Overstreet <kent.overstreet@gmail.com>
We need a way to check if the machinery for handling btree_paths within
a transaction is behaving reasonably, as it often has not been - we've
had bugs with transaction path overflows caused by duplicate paths and
plenty of other things.
This patch tracks, per transaction fn, the maximum number of btree paths
ever allocated by that transaction, and makes it available in debugfs.
Signed-off-by: Kent Overstreet <kent.overstreet@gmail.com>
Going to be adding more things to this in the next patch.
Signed-off-by: Kent Overstreet <kent.overstreet@gmail.com>
This fixes an assertion about unexpected transaction restarts -
bch2_delete_range_trans() handles transaction restarts.
Signed-off-by: Kent Overstreet <kent.overstreet@gmail.com>
This fixes an assertion in bch2_btree_path_peek_slot().
Signed-off-by: Kent Overstreet <kent.overstreet@gmail.com>
bch2_path_put() is supposed to drop paths that aren't needed on
transaction restart, or needed to hold locks that we're supposed to keep
until transaction commit; but it was failing to free paths in some cases
when it should have, leading to transaction path overflows with lots of
duplicate paths.
Signed-off-by: Kent Overstreet <kent.overstreet@gmail.com>