summaryrefslogtreecommitdiff
path: root/fs/bcachefs/alloc_background.c
AgeCommit message (Collapse)Author
2024-04-13bcachefs: bch2_trans_unlock() must always be followed by relock() or begin()Kent Overstreet
We're about to add new asserts for btree_trans locking consistency, and part of that requires that aren't using the btree_trans while it's unlocked. Signed-off-by: Kent Overstreet <kent.overstreet@linux.dev>
2024-04-13bcachefs: member helper cleanupsKent Overstreet
Some renaming for better consistency bch2_member_exists -> bch2_member_alive bch2_dev_exists -> bch2_member_exists bch2_dev_exsits2 -> bch2_dev_exists bch_dev_locked -> bch2_dev_locked bch_dev_bkey_exists -> bch2_dev_bkey_exists new helper - bch2_dev_safe Signed-off-by: Kent Overstreet <kent.overstreet@linux.dev>
2024-04-13bcachefs: iter/update/trigger/str_hash flag cleanupKent Overstreet
Combine iter/update/trigger/str_hash flags into a single enum, and x-macroize them for a to_text() function later. These flags are all for a specific iter/key/update context, so it makes sense to group them together - iter/update/trigger flags were already given distinct bits, this cleans up and unifies that handling. Signed-off-by: Kent Overstreet <kent.overstreet@linux.dev>
2024-04-13bcachefs: prt_printf() now respects \r\n\tKent Overstreet
Signed-off-by: Kent Overstreet <kent.overstreet@linux.dev>
2024-04-02bcachefs: Check for bad needs_discard before doing discardKent Overstreet
In the discard worker, we were failing to validate the bucket state - meaning a corrupt needs_discard btree could cause us to discard a bucket that we shouldn't. If check_alloc_info hasn't run yet we just want to bail out, otherwise it's a filesystem inconsistent error. Signed-off-by: Kent Overstreet <kent.overstreet@linux.dev>
2024-03-18bcachefs: Improve bch2_fatal_error()Kent Overstreet
error messages should always include __func__ Signed-off-by: Kent Overstreet <kent.overstreet@linux.dev>
2024-03-17bcachefs: Fix nested transaction restart handling in bch2_bucket_gens_init()Kent Overstreet
Nested transaction restart handling is typically best avoided; when the inner context handles a transaction restart it invalidates the outer transaction context, so we need to make sure to return a transaction_restart_nested error. This code wasn't doing that, and hit the assertion in for_each_btree_key() that checks for that via trans->restart_count. Signed-off-by: Kent Overstreet <kent.overstreet@linux.dev>
2024-03-13bcachefs: reconstruct_alloc cleanupKent Overstreet
Now that we've got the errors_silent mechanism, we don't have to check if the reconstruct_alloc option is set all over the place. Also - users no longer have to explicitly select fsck and fix_errors. Signed-off-by: Kent Overstreet <kent.overstreet@linux.dev>
2024-03-13bcachefs: Split out discard fastpathKent Overstreet
Buckets usually can't be discarded until the transaction that made them empty has been committed in the journal. Tracing has indicated that we're queuing the discard worker excessively, only for it to skip over many buckets that are still waiting on a journal commit, discarding only one or two buckets per iteration. We want to switch to only queuing the discard worker after a journal flush write, but there's an important optimization we need to preserve: if a bucket becomes empty and it was never committed in the journal while it was in use, we want to discard it and reuse it right away - since overwriting it before the previous writes are flushed from the device cache eans those writes only cost bus bandwidth. So, this patch implements a fast path for buckets that can be discarded right away. We need new locking between the two discard workers; the new list of buckets being discarded provides that locking. Signed-off-by: Kent Overstreet <kent.overstreet@linux.dev>
2024-03-13bcachefs: bch2_trigger_alloc() handles state changes betterKent Overstreet
bch2_trigger_alloc() kicks off certain tasks on bucket state changes; e.g. triggering the bucket discard worker and the invalidate worker. We've observed the discard worker running too often - most runs it doesn't do any work, according to the tracepoint - so clearly, we're kicking it off too often. This adds an explicit statechange() macro to make these checks more precise. Signed-off-by: Kent Overstreet <kent.overstreet@linux.dev>
2024-01-24bcachefs: discard path uses unlock_long()Kent Overstreet
Some (bad) devices can have really terrible discard latency; we don't want them blocking memory reclaim and causing warnings. Signed-off-by: Kent Overstreet <kent.overstreet@linux.dev>
2024-01-21bcachefs: Avoid flushing the journal in the discard pathKent Overstreet
When issuing discards, we may need to flush the journal if there's too many buckets that can't be discarded until a journal flush. But the heuristic was bad; we should be comparing the number of buckets that need to flushes against the number of free buckets, not the number of buckets we saw. Signed-off-by: Kent Overstreet <kent.overstreet@linux.dev>
2024-01-21bcachefs: helpers for printing data typesKent Overstreet
We need bounds checking since new versions may introduce new data types. Signed-off-by: Kent Overstreet <kent.overstreet@linux.dev>
2024-01-21bcachefs: BTREE_TRIGGER_ATOMICKent Overstreet
Add a new flag to be explicit about when we're running atomic triggers. Signed-off-by: Kent Overstreet <kent.overstreet@linux.dev>
2024-01-21bcachefs: drop to_text code for obsolete bps in alloc keysKent Overstreet
Signed-off-by: Kent Overstreet <kent.overstreet@linux.dev>
2024-01-05bcachefs: fsck_err()s don't need to manually check c->sb.version anymoreKent Overstreet
Signed-off-by: Kent Overstreet <kent.overstreet@linux.dev>
2024-01-05bcachefs: unify alloc triggerKent Overstreet
Signed-off-by: Kent Overstreet <kent.overstreet@linux.dev>
2024-01-05bcachefs: move bch2_mark_alloc() to alloc_background.cKent Overstreet
Signed-off-by: Kent Overstreet <kent.overstreet@linux.dev>
2024-01-05bcachefs: trans_mark now takes bkey_sKent Overstreet
Prep work for disk space accounting rewrite: we're going to want to use a single callback for both of our current triggers, so we need to change them to have the same type signature first. Signed-off-by: Kent Overstreet <kent.overstreet@linux.dev>
2024-01-01bcachefs: btree_iter -> btree_path_idx_tKent Overstreet
Signed-off-by: Kent Overstreet <kent.overstreet@linux.dev>
2024-01-01bcachefs: for_each_member_device_rcu() now declares loop iterKent Overstreet
Signed-off-by: Kent Overstreet <kent.overstreet@linux.dev>
2024-01-01bcachefs: for_each_member_device() now declares loop iterKent Overstreet
Signed-off-by: Kent Overstreet <kent.overstreet@linux.dev>
2024-01-01bcachefs: for_each_btree_key() now declares loop iterKent Overstreet
Signed-off-by: Kent Overstreet <kent.overstreet@linux.dev>
2024-01-01bcachefs: for_each_btree_key_upto() -> for_each_btree_key_old_upto()Kent Overstreet
And for_each_btree_key2_upto -> for_each_btree_key_upto Signed-off-by: Kent Overstreet <kent.overstreet@linux.dev>
2024-01-01bcachefs: bch_err_(fn|msg) check if should printKent Overstreet
Signed-off-by: Kent Overstreet <kent.overstreet@linux.dev>
2024-01-01bcachefs: Rename for_each_btree_key2() -> for_each_btree_key()Kent Overstreet
Signed-off-by: Kent Overstreet <kent.overstreet@linux.dev>
2024-01-01bcachefs: Kill for_each_btree_key()Kent Overstreet
for_each_btree_key() handles transaction restarts, like for_each_btree_key2(), but only calls bch2_trans_begin() after a transaction restart - for_each_btree_key2() wraps every loop iteration in a transaction. The for_each_btree_key() behaviour is problematic when it leads to holding the SRCU lock that prevents key cache reclaim for an unbounded amount of time - there's no real need to keep it around. Signed-off-by: Kent Overstreet <kent.overstreet@linux.dev>
2024-01-01bcachefs: Explicity go RW for fsckKent Overstreet
This eliminates a lot of BCH_TRANS_COMMIT_lazy_rw flags, and is less error prone. Signed-off-by: Kent Overstreet <kent.overstreet@linux.dev>
2024-01-01bcachefs: count_event()Kent Overstreet
Small helper for event counters. Signed-off-by: Kent Overstreet <kent.overstreet@linux.dev>
2024-01-01bcachefs: bch2_btree_write_buffer_flush() -> bch2_btree_write_buffer_tryflush()Kent Overstreet
More accurate naming. Signed-off-by: Kent Overstreet <kent.overstreet@linux.dev>
2024-01-01bcachefs: Refactor bch2_check_alloc_to_lru_ref()Kent Overstreet
This code was somewhat convoluted - because originally bch2_lru_set() could modify the LRU index if there was a collision. That's no longer the case, so the "create LRU entry" path has no reason to update the alloc key, so we can separate the handling of the two fsck errors. Signed-off-by: Kent Overstreet <kent.overstreet@linux.dev>
2024-01-01bcachefs: New bucket sector count helpersKent Overstreet
This introduces bch2_bucket_sectors() and bch2_bucket_sectors_dirty(), prep work for separately accounting stripe sectors. Signed-off-by: Kent Overstreet <kent.overstreet@linux.dev>
2024-01-01bcachefs: Don't use update_cached_sectors() in bch2_mark_alloc()Kent Overstreet
bch2_update_cached_sectors_list() is closer to how the new disk space accounting works, called from trans_mark(). Signed-off-by: Kent Overstreet <kent.overstreet@linux.dev>
2024-01-01bcachefs: Rename BTREE_INSERT flagsKent Overstreet
BTREE_INSERT flags are actually transaction commit flags - rename them for clarity. Signed-off-by: Kent Overstreet <kent.overstreet@linux.dev>
2024-01-01bcachefs: Fix locking when checking freespace btreeKent Overstreet
On transaction restart, we weren't re-validating the hole we saw. Signed-off-by: Kent Overstreet <kent.overstreet@linux.dev>
2023-11-04bcachefs: use swab40 for bch_backpointer.bucket_offset bitfieldBrian Foster
The bucket_offset field of bch_backpointer is a 40-bit bitfield, but the bch2_backpointer_swab() helper uses swab32. This leads to inconsistency when an on-disk fs is accessed from an opposite endian machine. As it turns out, we already have an internal swab40() helper that is used from the bch_alloc_v4 swab callback. Lift it into the backpointers header file and use it consistently in both places. Signed-off-by: Brian Foster <bfoster@redhat.com> Signed-off-by: Kent Overstreet <kent.overstreet@linux.dev>
2023-11-04bcachefs: byte order swap bch_alloc_v4.fragmentation_lru fieldBrian Foster
A simple test to populate a filesystem on one CPU architecture and fsck on an arch of the opposite byte order produces errors related to the fragmentation LRU. This occurs because the 64-bit fragmentation_lru field is not byte-order swapped when reads detect that the on-disk/bset key values were written in opposite byte-order of the current CPU. Update the bch2_alloc_v4 swab callback to handle fragmentation_lru as is done for other multi-byte fields. This doesn't affect existing filesystems when accessed by CPUs of the same endianness because the ->swab() callback is only called when the bset flags indicate an endianness mismatch between the CPU and on-disk data. Signed-off-by: Brian Foster <bfoster@redhat.com> Signed-off-by: Kent Overstreet <kent.overstreet@linux.dev>
2023-11-04bcachefs: Ensure copygc does not spinKent Overstreet
If copygc does no work - finds no fragmented buckets - wait for a bit of IO to happen. Signed-off-by: Kent Overstreet <kent.overstreet@linux.dev>
2023-11-01bcachefs: Enumerate fsck errorsKent Overstreet
This patch adds a superblock error counter for every distinct fsck error; this means that when analyzing filesystems out in the wild we'll be able to see what sorts of inconsistencies are being found and repair, and hence what bugs to look for. Errors validating bkeys are not yet considered distinct fsck errors, but this patch adds a new helper, bkey_fsck_err(), in order to add distinct error types for them as well. Signed-off-by: Kent Overstreet <kent.overstreet@linux.dev>
2023-10-31bcachefs: bch2_btree_id_str()Kent Overstreet
Since we can run with unknown btree IDs, we can't directly index btree IDs into fixed size arrays. Signed-off-by: Kent Overstreet <kent.overstreet@linux.dev>
2023-10-22bcachefs: Correctly initialize new buckets on device resizeKent Overstreet
bch2_dev_resize() was never updated for the allocator rewrite with persistent freelists, and it wasn't noticed because the tests weren't running fsck - oops. Fix this by running bch2_dev_freespace_init() for the new buckets. Signed-off-by: Kent Overstreet <kent.overstreet@linux.dev>
2023-10-22bcachefs: New superblock section members_v2Hunter Shaffer
members_v2 has dynamically resizable entries so that we can extend bch_member. The members can no longer be accessed with simple array indexing Instead members_v2_get is used to find a member's exact location within the array and returns a copy of that member. Alternatively member_v2_get_mut retrieves a mutable point to a member. Signed-off-by: Hunter Shaffer <huntershaffer182456@gmail.com> Signed-off-by: Kent Overstreet <kent.overstreet@linux.dev>
2023-10-22bcachefs: Heap allocate btree_transKent Overstreet
We're using more stack than we'd like in a number of functions, and btree_trans is the biggest object that we stack allocate. But we have to do a heap allocatation to initialize it anyways, so there's no real downside to heap allocating the entire thing. Signed-off-by: Kent Overstreet <kent.overstreet@linux.dev>
2023-10-22bcachefs: Fix W=12 build errorsKent Overstreet
Signed-off-by: Kent Overstreet <kent.overstreet@linux.dev>
2023-10-22bcachefs: Fix -Wformat in bch2_bucket_gens_invalid()Nathan Chancellor
When building bcachefs for 32-bit ARM, there is a compiler warning in bch2_bucket_gens_invalid() due to use of an incorrect format specifier: fs/bcachefs/alloc_background.c:530:10: error: format specifies type 'unsigned long' but the argument has type 'size_t' (aka 'unsigned int') [-Werror,-Wformat] 529 | prt_printf(err, "bad val size (%lu != %zu)", | ~~~ | %zu 530 | bkey_val_bytes(k.k), sizeof(struct bch_bucket_gens)); | ^~~~~~~~~~~~~~~~~~~ fs/bcachefs/util.h:223:54: note: expanded from macro 'prt_printf' 223 | #define prt_printf(_out, ...) bch2_prt_printf(_out, __VA_ARGS__) | ^~~~~~~~~~~ On 64-bit architectures, size_t is 'unsigned long', so there is no warning when using %lu but on 32-bit architectures, size_t is 'unsigned int'. Use '%zu', the format specifier for 'size_t', to eliminate the warning. Fixes: 4be0d766a7e9 ("bcachefs: bucket_gens btree") Signed-off-by: Nathan Chancellor <nathan@kernel.org> Signed-off-by: Kent Overstreet <kent.overstreet@linux.dev>
2023-10-22bcachefs: Fix -Wformat in bch2_alloc_v4_invalid()Nathan Chancellor
When building bcachefs for 32-bit ARM, there is a compiler warning in bch2_alloc_v4_invalid() due to use of an incorrect format specifier: fs/bcachefs/alloc_background.c:246:30: error: format specifies type 'unsigned long' but the argument has type 'unsigned int' [-Werror,-Wformat] 245 | prt_printf(err, "bad val size (%u > %lu)", | ~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~ | %u 246 | alloc_v4_u64s(a.v), bkey_val_u64s(k.k)); | ~~~~~~~~~~~~~~~~~~~~^~~~~~~~~~~~~~~~~~~ fs/bcachefs/bkey.h:58:27: note: expanded from macro 'bkey_val_u64s' 58 | #define bkey_val_u64s(_k) ((_k)->u64s - BKEY_U64s) | ^ fs/bcachefs/util.h:223:54: note: expanded from macro 'prt_printf' 223 | #define prt_printf(_out, ...) bch2_prt_printf(_out, __VA_ARGS__) | ^~~~~~~~~~~ This expression is of type 'size_t'. On 64-bit architectures, size_t is 'unsigned long', so there is no warning when using %lu but on 32-bit architectures, size_t is 'unsigned int'. Use '%zu', the format specifier for 'size_t' to eliminate the warning. Fixes: 11be8e8db283 ("bcachefs: New on disk format: Backpointers") Signed-off-by: Nathan Chancellor <nathan@kernel.org> Signed-off-by: Kent Overstreet <kent.overstreet@linux.dev>
2023-10-22bcachefs: __bch2_btree_insert() -> bch2_btree_insert_trans()Kent Overstreet
Signed-off-by: Kent Overstreet <kent.overstreet@linux.dev>
2023-10-22bcachefs: Convert more code to bch_err_msg()Kent Overstreet
Signed-off-by: Kent Overstreet <kent.overstreet@linux.dev>
2023-10-22bcachefs: Kill stripe check in bch2_alloc_v4_invalid()Kent Overstreet
Since we set bucket data type to BCH_DATA_stripe based on the data pointer, not just the stripe pointer, it doesn't make sense to check for no stripe in the .key_invalid method - this is a situation that shouldn't happen, but our other fsck/repair code handles it. Signed-off-by: Kent Overstreet <kent.overstreet@linux.dev>
2023-10-22bcachefs: Always check alloc data typeKent Overstreet
Signed-off-by: Kent Overstreet <kent.overstreet@linux.dev>