Age | Commit message (Collapse) | Author |
|
bcachefs-tools recently started putting a backup superblock at the end
of the device. This causes a problem if the bucket size doesn't divide
the device size - but we can fix it by just skipping marking that part.
Signed-off-by: Kent Overstreet <kent.overstreet@gmail.com>
|
|
When we switched to using bch2_btree_bset_insert_key() for extents it
turned out it started leaving invalid keys around - of type deleted but
nonzero size - but this is fine (if ugly) because they're never written
out.
Signed-off-by: Kent Overstreet <kent.overstreet@gmail.com>
|
|
|
|
Do not compile the acl.o target if BCACHEFS_POSIX_ACL is not enabled.
Signed-off-by: Dan Robertson <dan@dlrobertson.com>
|
|
Add a field to struct btree_iter for tracking whether it should be
locked - this fixes spurious transaction restarts in
bch2_trans_relock().
Signed-off-by: Kent Overstreet <kent.overstreet@gmail.com>
|
|
This patch adds some new tracepoints to the btree iterator code, and
adds new fields to the existing tracepoints - primarily for the iterator
position.
Signed-off-by: Kent Overstreet <kent.overstreet@gmail.com>
|
|
This helps avoid transaction restarts.
Signed-off-by: Kent Overstreet <kent.overstreet@gmail.com>
|
|
|
|
|
|
Upcoming refactoring is going to change bch2_trans_update() to start
returning transaction restarts.
Signed-off-by: Kent Overstreet <kent.overstreet@gmail.com>
|
|
We were missing a kthread_should_stop() check in the loop in
bch2_invalidate_buckets(), very occasionally leading to us getting stuck
while shutting down.
Signed-off-by: Kent Overstreet <kent.overstreet@gmail.com>
|
|
When devices have different bucket sizes, we may accumulate a journal
write that doesn't fit on some of our devices - previously, we'd
underflow when calculating space on that device and then everything
would get weird.
Signed-off-by: Kent Overstreet <kent.overstreet@gmail.com>
|
|
This fixes a "disk usage increased without a reservation" bug, when
reflinking compressed extents. Also, there's no good reason for reflink
to be fragmenting extents anyways.
Signed-off-by: Kent Overstreet <kent.overstreet@gmail.com>
|
|
Signed-off-by: Kent Overstreet <kent.overstreet@gmail.com>
|
|
Found by sparse
Signed-off-by: Kent Overstreet <kent.overstreet@gmail.com>
|
|
Waiting on a btree node write with btree locks held can deadlock, if the
write errors: the write error path has to do do a btree update to drop
the pointer to the replica that errored.
The interior update path has to wait on in flight btree writes before
freeing nodes on disk. Previously, this was done in
bch2_btree_interior_update_will_free_node(), and could deadlock; now, we
just stash a pointer to the node and do it in
btree_update_nodes_written(), just prior to the transactional part of
the update.
|
|
We can't use btree_update_wq becuase btree updates may be waiting on
btree writes to complete.
Signed-off-by: Kent Overstreet <kent.overstreet@gmail.com>
|
|
If the transactior restarts on a different CPU, it could end up needing
to read in a different btree node, which makes another transaction
restart more likely...
Signed-off-by: Kent Overstreet <kent.overstreet@gmail.com>
|
|
|
|
|
|
Journal write errors were racing with the submission path - potentially
causing writes to other replicas to not get submitted.
Signed-off-by: Kent Overstreet <kent.overstreet@gmail.com>
|
|
__bch2_trans_mark_reflink_p wasn't always correctly returning the number
of sectors processed - the new logic is a bit more straightforward
overall too.
Signed-off-by: Kent Overstreet <kent.overstreet@gmail.com>
|
|
We're seeing a bug where inode creates end up spinning in
bch2_inode_create - disabling sharding will simplify what we're testing.
Signed-off-by: Kent Overstreet <kent.overstreet@gmail.com>
|
|
%pU for printing out pointers to uuids doesn't work in perf trace
Signed-off-by: Kent Overstreet <kent.overstreet@gmail.com>
|
|
Signed-off-by: Kent Overstreet <kent.overstreet@gmail.com>
|
|
We seem to have a bug where the copygc thread ends up spinning and
making the system unusable - this will at least prevent it from locking
up the machine, and it's a good thing to have anyways.
Signed-off-by: Kent Overstreet <kent.overstreet@gmail.com>
|
|
bch2_btree_iter_peek() won't always return a key - whoops.
Signed-off-by: Kent Overstreet <kent.overstreet@gmail.com>
|
|
|
|
|
|
After unclean shutdown, btree writes may have completed on one device
and not others - and this inconsistency could lead us to writing new
bsets with a gap in our btree node in one of our replicas.
Fortunately, this is only an issue with bsets that are newer than the
most recent journal flush, and we already have a mechanism for detecting
and blacklisting those. We just need to make sure to start new btree
writes after the most recent _non_ blacklisted bset.
Signed-off-by: Kent Overstreet <kent.overstreet@gmail.com>
|
|
We weren't interpreting the flags argument at all.
Signed-off-by: Kent Overstreet <kent.overstreet@gmail.com>
|
|
Also, clean up workqueue usage - we shouldn't be using system
workqueues, pretty much everything we do needs to be on our own
WQ_MEM_RECLAIM workqueues.
Signed-off-by: Kent Overstreet <kent.overstreet@gmail.com>
|
|
|
|
|
|
|
|
There's a new module parameter, verify_all_btree_replicas, that enables
reading from every btree replica when reading in btree nodes and
comparing them against each other. We've been seeing some strange btree
corruption - this will hopefully aid in tracking it down and catching it
more often.
Signed-off-by: Kent Overstreet <kent.overstreet@gmail.com>
|
|
We need the btree to be in a consistent state before we can rewrite
btree nodes.
Signed-off-by: Kent Overstreet <kent.overstreet@gmail.com>
|
|
this fixes a valgrind complaint
Signed-off-by: Kent Overstreet <kent.overstreet@gmail.com>
|
|
Signed-off-by: Kent Overstreet <kent.overstreet@gmail.com>
|
|
Buffered writes may have to increase their disk reservation at btree
update time, due to compression and erasure coding being unpredictable:
O_DIRECT writes should be checking for -ENOSPC, but buffered writes have
already been accepted and should not.
Signed-off-by: Kent Overstreet <kent.overstreet@gmail.com>
|
|
When we delete the dirent an inode points to, we need to zero out the
backpointer fields - this was missed in the RENAME_OVERWRITE case.
Signed-off-by: Kent Overstreet <kent.overstreet@gmail.com>
|
|
Caught by xfstest generic/628
Signed-off-by: Kent Overstreet <kent.overstreet@gmail.com>
|
|
Currently, we handle multiple overlapping extents in the same
transaction commit by doing fixups in bch2_trans_update() - this patch
extents that to split updates when necessary. The next patch that
changes the reflink code to not fragment extents when making them
indirect will require this.
Signed-off-by: Kent Overstreet <kent.overstreet@gmail.com>
|
|
This fixes a bug where an async O_DIRECT write that required multiple
bch2_write calls could deadlock, because bch2_write runs out of the same
workqueue used for index updates and would block on the io_in_flight
semaphore.
Signed-off-by: Kent Overstreet <kent.overstreet@gmail.com>
|
|
The current implementation of bch_statfs does not scale the number of
available blocks provided in f_bavail by the reserve factor. This causes
an allocation of a file of this size to fail.
Signed-off-by: Dan Robertson <dan@dlrobertson.com>
|
|
This bug led to push_whiteout() generating whiteouts that failed
bch2_bkey_invalid() due to nonzero length fields - oops.
Signed-off-by: Kent Overstreet <kent.overstreet@gmail.com>
|
|
Not supposed to pass a null ptr to memcpy (even if the size is 0).
Signed-off-by: Kent Overstreet <kent.overstreet@gmail.com>
|
|
It was being skipped when hole punching, leading to problems when
splitting compressed extents.
Signed-off-by: Kent Overstreet <kent.overstreet@gmail.com>
|
|
It's needed when we split an existing compressed extent - we get a null
ptr deref without it.
Signed-off-by: Kent Overstreet <kent.overstreet@gmail.com>
|
|
fs/bcachefs/bset.c edited prefetch macro to add clang support
fs/bcachefs/btree_iter.c bugfix: initialize iter->real_pos in bch2_btree_iter_init for later use
fs/bcachefs/io.c bugfix: eliminated undefined behavior (negative bitshift)
fs/bcachefs/buckets.c bugfix: invert sign to handle 64bit abs()
|