summaryrefslogtreecommitdiff
AgeCommit message (Collapse)Author
2023-12-23bcachefs: Zone support for journal codezonesKent Overstreet
For zoned devices, we need to ensure we close buckets we're no longer writing to, since some (flash) devices have a limit on the number of active zones. Also, on startup, if we're going to continue appending to a partially-written bucket we need to query the zone's write pointer. This patch updates the journal code to: - On startup, we now query the write pointer for the bucket cur_idx points to - On startup, we ensure all journal buckets except that the one cur_idx points to are closed - In the journal write path, we factor out journal_close_buckets(), which now increments cur_idx when a bucket fills up so that we can start allocating from the next one - it now also issues the appropriate zone command to close the previous bucket. Signed-off-by: Kent Overstreet <kent.overstreet@linux.dev>
2023-12-23bcachefs: New superblock layout for zoned devices XXX todoKent Overstreet
On zoned devices that don't have any random-write capable zones, we need a different strategy for writing the superblock. This patch implements a path that uses the first two zones for sequentially logging superblock writes. XXX: We still need to do something with the sb_layout struct and make sure it points to the right place, so that superblock buckets get marked correctly Signed-off-by: Kent Overstreet <kent.overstreet@linux.dev>
2023-12-23bcachefs: bucket_capacity()Kent Overstreet
On zoned devices, zone capacity is variable. This patch implements a new data structure (eytzinger search tree) for getting a bucket's capacity. Signed-off-by: Kent Overstreet <kent.overstreet@linux.dev>
2023-12-23bcachefs: Initial support for zonesKent Overstreet
- On zoned devices, bucket size must match zone size - Factor out bch2_bucket_discard(), which now also issues a zone reset on zoned devices. Signed-off-by: Kent Overstreet <kent.overstreet@linux.dev>
2023-12-23delalloc btree nodes, wipKent Overstreet
Signed-off-by: Kent Overstreet <kent.overstreet@linux.dev>
2023-12-23bcachefs: bch2_alloc_sectors_trans()Kent Overstreet
New helper, to be used for delayed allocation of btree nodes. Signed-off-by: Kent Overstreet <kent.overstreet@linux.dev>
2023-12-23bcachefs: Record keys for new btree nodes from write pathKent Overstreet
Signed-off-by: Kent Overstreet <kent.overstreet@linux.dev>
2023-12-23bcachefs: Split brain detectionKent Overstreet
Use the new bch_member->seq, sb->write_time fields to detect split brain and kick out devices when necessary. Signed-off-by: Kent Overstreet <kent.overstreet@linux.dev>
2023-12-23bcachefs: bch_member->seqKent Overstreet
Add new fields for split brain detection: - bch_member->seq, which tracks the sequence number of the last superblock write that happened to each member device - bch_sb->write_time, which tracks the time of the last superblock write, to allow detection of when two members have diverged but had the same number of superblock writes. Signed-off-by: Kent Overstreet <kent.overstreet@linux.dev>
2023-12-23bcachefs: Version upgrade doesn't require fsckKent Overstreet
We explicitly track which fsck passes need to run on version upgrade, no need to run all of fsck. Signed-off-by: Kent Overstreet <kent.overstreet@linux.dev>
2023-12-23bcachefs: check for failure to downgradeKent Overstreet
With the upcoming member seq patch, it's now critical that we don't ever write to a superblock that hasn't been version downgraded - failure to update member seq fields will cause split brain detection to fire erroniously. Signed-off-by: Kent Overstreet <kent.overstreet@linux.dev>
2023-12-23bcachefs: Fix nochanges/read_only interactionKent Overstreet
nochanges means "we cannot issue writes at all"; it's possible to go into a pseudo read-write mode where we pin dirty metadata in memory, which is used for fsck in dry run mode and doing journal replay on a read only mount, but we do not want to allow an actual read-write mount in nochanges mode. But we do always want to allow early read-write, during recovery - this patch clarifies that. Signed-off-by: Kent Overstreet <kent.overstreet@linux.dev>
2023-12-23bcachefs: Convert split_devs() to darrayKent Overstreet
Bit of cleanup & modernization: also moving this code to util.c, it'll be used by userspace as well. Signed-off-by: Kent Overstreet <kent.overstreet@linux.dev>
2023-12-23MAINTAINERS: Update my email addressKent Overstreet
Signed-off-by: Kent Overstreet <kent.overstreet@linux.dev>
2023-12-23bcachefs: for_each_keylist_key() declares loop iterKent Overstreet
Signed-off-by: Kent Overstreet <kent.overstreet@linux.dev>
2023-12-23bcachefs: bkey_for_each_ptr() now declares loop iterKent Overstreet
Signed-off-by: Kent Overstreet <kent.overstreet@linux.dev>
2023-12-23bcachefs: Check journal entries for invalid keys in trans commit pathKent Overstreet
Signed-off-by: Kent Overstreet <kent.overstreet@linux.dev>
2023-12-23bcachefs: Fixes for rust bindgenKent Overstreet
bindgen doesn't seem to like u128 or DECLARE_FLEX_ARRAY(), but we can hack around them. Signed-off-by: Kent Overstreet <kent.overstreet@linux.dev>
2023-12-23bcachefs: check_directory_structure() can now be run onlineKent Overstreet
Now that we have dynamically resizable btree paths, check_directory_structure() can check one path - inode up to the root - in a single transaction. Signed-off-by: Kent Overstreet <kent.overstreet@linux.dev>
2023-12-23bcachefs: Fix reattach_inode() for snapshotsKent Overstreet
reattach_inode() was broken w.r.t. snapshots - we'd lookup the subvolume to look up lost+found, but if we're in an interior node snapshot that didn't make any sense. Instead, this adds a dirent path for creating in a specific snapshot, skipping the subvolume; and we also make sure to create lost+found in the root snapshot, to avoid conflicts with lost+found being created in overlapping snapshots. Signed-off-by: Kent Overstreet <kent.overstreet@linux.dev>
2023-12-23bcachefs: bch2_btree_trans_peek_slot_updatesKent Overstreet
refactoring the BTREE_ITER_WITH_UPDATES code, prep for removing the flag and making it always-on Signed-off-by: Kent Overstreet <kent.overstreet@linux.dev>
2023-12-23bcachefs: bch2_btree_trans_peek_prev_updatesKent Overstreet
bch2_btree_iter_peek_prev() now supports BTREE_ITER_WITH_UPDATES Signed-off-by: Kent Overstreet <kent.overstreet@linux.dev>
2023-12-23bcachefs: bch2_btree_trans_peek_updatesKent Overstreet
refactoring the BTREE_ITER_WITH_UPDATES code, prep for removing the flag and making it always-on Signed-off-by: Kent Overstreet <kent.overstreet@linux.dev>
2023-12-23bcachefs: Kill GFP_NOFAIL usage in readahead pathKent Overstreet
Signed-off-by: Kent Overstreet <kent.overstreet@linux.dev>
2023-12-23bcachefs: Improve the nopromote tracepointKent Overstreet
Signed-off-by: Kent Overstreet <kent.overstreet@linux.dev>
2023-12-23bcachefs: mean and variance: fix kernel-doc for function paramsRandy Dunlap
Add missing function parameter descriptions in mean_and_variance.c. The also eliminates the "Excess function parameter" warnings. Prevents these kernel-doc warnings: mean_and_variance.c:67: warning: Function parameter or member 's' not described in 'mean_and_variance_get_mean' mean_and_variance.c:78: warning: Function parameter or member 's1' not described in 'mean_and_variance_get_variance' mean_and_variance.c:94: warning: Function parameter or member 's' not described in 'mean_and_variance_get_stddev' mean_and_variance.c:108: warning: Function parameter or member 's' not described in 'mean_and_variance_weighted_update' mean_and_variance.c:108: warning: Function parameter or member 'x' not described in 'mean_and_variance_weighted_update' mean_and_variance.c:108: warning: Excess function parameter 's1' description in 'mean_and_variance_weighted_update' mean_and_variance.c:108: warning: Excess function parameter 's2' description in 'mean_and_variance_weighted_update' mean_and_variance.c:134: warning: Function parameter or member 's' not described in 'mean_and_variance_weighted_get_mean' mean_and_variance.c:143: warning: Function parameter or member 's' not described in 'mean_and_variance_weighted_get_variance' mean_and_variance.c:153: warning: Function parameter or member 's' not described in 'mean_and_variance_weighted_get_stddev' Signed-off-by: Randy Dunlap <rdunlap@infradead.org> Cc: Kent Overstreet <kent.overstreet@linux.dev> Cc: Brian Foster <bfoster@redhat.com> Cc: linux-bcachefs@vger.kernel.org Signed-off-by: Kent Overstreet <kent.overstreet@linux.dev>
2023-12-23bcachefs: Use GFP_KERNEL for promote allocationsKent Overstreet
We already have btree locks dropped here - no need for GFP_NOFS. Signed-off-by: Kent Overstreet <kent.overstreet@linux.dev>
2023-12-23bcachefs: Delete dio read alignment checkKent Overstreet
We'll typically fomat devices with the physical blocksize supported, but the logical blocksize will be smaller. There's no real need to be checking the blocksize at the filesystem level, anyways - the block layer has to check this anyways. Signed-off-by: Kent Overstreet <kent.overstreet@linux.dev>
2023-12-23bcachefs: skip journal more often in key cache reclaimKent Overstreet
Signed-off-by: Kent Overstreet <kent.overstreet@linux.dev>
2023-12-23bcachefs: clean up some dead fallocate codeBrian Foster
The have_reservation local variable in bch2_extent_fallocate() is initialized to false and set to true further down in the function. Between this two points, one branch of code checks for negative value and one for positive, and nothing ever checks the variable after it is set to true. Clean up some of the unnecessary logic and code. Signed-off-by: Brian Foster <bfoster@redhat.com> Signed-off-by: Kent Overstreet <kent.overstreet@linux.dev>
2023-12-23bcachefs: Make sure allocation failure errors are loggedKent Overstreet
The previous patch fixed a bug in allocation path error handling, and it would've been noticed sooner had it been logged properly. Generally speaking, errors that shouldn't happen in normal operation and are being returned up the stack should be logged: the write path was already logging IO errors, but non IO errors were missed. Signed-off-by: Kent Overstreet <kent.overstreet@linux.dev>
2023-12-23bcachefs: drop extra semicolonKent Overstreet
Signed-off-by: Kent Overstreet <kent.overstreet@linux.dev>
2023-12-23bcachefs: Replace zero-length array with flex-array member and use __counted_byGustavo A. R. Silva
Fake flexible arrays (zero-length and one-element arrays) are deprecated, and should be replaced by flexible-array members. So, replace zero-length array with a flexible-array member in `struct bch_ioctl_fsck_offline`. Also annotate array `devs` with `__counted_by()` to prepare for the coming implementation by GCC and Clang of the `__counted_by` attribute. Flexible array members annotated with `__counted_by` can have their accesses bounds-checked at run-time via `CONFIG_UBSAN_BOUNDS` (for array indexing) and `CONFIG_FORTIFY_SOURCE` (for strcpy/memcpy-family functions). This fixes the following -Warray-bounds warnings: fs/bcachefs/chardev.c: In function 'bch2_ioctl_fsck_offline': fs/bcachefs/chardev.c:363:34: warning: array subscript 0 is outside array bounds of '__u64[0]' {aka 'long long unsigned int[]'} [-Warray-bounds=] 363 | if (copy_from_user(devs, &user_arg->devs[0], sizeof(user_arg->devs[0]) * arg.nr_devs)) { | ^~~~~~~~~~~~~~~~~~ In file included from fs/bcachefs/chardev.c:5: fs/bcachefs/bcachefs_ioctl.h:400:33: note: while referencing 'devs' 400 | __u64 devs[0]; This results in no differences in binary output. Signed-off-by: Gustavo A. R. Silva <gustavoars@kernel.org> Signed-off-by: Kent Overstreet <kent.overstreet@linux.dev>
2023-12-23bcachefs: Use array_size() in call to copy_from_user()Gustavo A. R. Silva
Use array_size() helper, instead of the open-coded version in call to copy_from_user(). Signed-off-by: Gustavo A. R. Silva <gustavoars@kernel.org> Signed-off-by: Kent Overstreet <kent.overstreet@linux.dev>
2023-12-23bcachefs: kill __bch2_btree_iter_peek_upto_and_restart()Kent Overstreet
dead code Signed-off-by: Kent Overstreet <kent.overstreet@linux.dev>
2023-12-23bcachefs: fsck -> bch2_trans_run()Kent Overstreet
Signed-off-by: Kent Overstreet <kent.overstreet@linux.dev>
2023-12-23bcachefs: bch2_dirent_lookup() -> lockrestart_do()Kent Overstreet
Signed-off-by: Kent Overstreet <kent.overstreet@linux.dev>
2023-12-23bcachefs: vstruct_for_each() now declares loop iterKent Overstreet
Signed-off-by: Kent Overstreet <kent.overstreet@linux.dev>
2023-12-23bcachefs: for_each_member_device_rcu() now declares loop iterKent Overstreet
Signed-off-by: Kent Overstreet <kent.overstreet@linux.dev>
2023-12-23bcachefs: for_each_member_device() now declares loop iterKent Overstreet
Signed-off-by: Kent Overstreet <kent.overstreet@linux.dev>
2023-12-23bcachefs: for_each_btree_key() now declares loop iterKent Overstreet
Signed-off-by: Kent Overstreet <kent.overstreet@linux.dev>
2023-12-23bcachefs: kill for_each_btree_key_norestart()Kent Overstreet
Signed-off-by: Kent Overstreet <kent.overstreet@linux.dev>
2023-12-23bcachefs: kill for_each_btree_key_old_upto()Kent Overstreet
Signed-off-by: Kent Overstreet <kent.overstreet@linux.dev>
2023-12-23bcachefs: for_each_btree_key_upto() -> for_each_btree_key_old_upto()Kent Overstreet
And for_each_btree_key2_upto -> for_each_btree_key_upto Signed-off-by: Kent Overstreet <kent.overstreet@linux.dev>
2023-12-23bcachefs: darray_for_each() now declares loop iterKent Overstreet
Signed-off-by: Kent Overstreet <kent.overstreet@linux.dev>
2023-12-23bcachefs: trans_for_each_update() now declares loop iterKent Overstreet
Signed-off-by: Kent Overstreet <kent.overstreet@linux.dev>
2023-12-23bcachefs: qstr_eq()Kent Overstreet
Signed-off-by: Kent Overstreet <kent.overstreet@linux.dev>
2023-12-23bcachefs: bch_err_(fn|msg) check if should printKent Overstreet
Signed-off-by: Kent Overstreet <kent.overstreet@linux.dev>
2023-12-23bcachefs: fix userspace build errorsKent Overstreet
Signed-off-by: Kent Overstreet <kent.overstreet@linux.dev>
2023-12-23bcachefs: growable btree_pathsKent Overstreet
XXX: we're allocating memory with btree locks held - bad We need to plumb through an error path so we can do allocate_dropping_locks() - but we're merging this now because it fixes a transaction path overflow caused by indirect extent fragmentation, and the resize path is rare. Signed-off-by: Kent Overstreet <kent.overstreet@linux.dev>