Age         Commit message                                              Author
2025-03-24  trace_sched_switch -> trace_sched_waking  [latency_debug]  Kent Overstreet
2025-03-24  sched_wakeup_backtrace debugfs  Kent Overstreet
Signed-off-by: Kent Overstreet <kent.overstreet@linux.dev>
2025-03-24  trace_sched_wakeup_backtrace  Kent Overstreet
Signed-off-by: Kent Overstreet <kent.overstreet@linux.dev>
2025-03-24  time_stats: report information in json format  Darrick J. Wong
Export json versions of time statistics information. Given the tabular nature of the numbers exposed, this will make it a lot easier for higher (than C) level languages (e.g. python) to import information without needing to write yet another clumsy string parser.
Signed-off-by: Darrick J. Wong <djwong@kernel.org>
Signed-off-by: Kent Overstreet <kent.overstreet@linux.dev>
2025-03-24  time_stats: Promote to lib/  Kent Overstreet
Library code from bcachefs for tracking latency measurements. The main interface is time_stats_update(stats, start_time), which collects a new event with an end time of the current time. It features percpu buffering of input values, making it very low overhead, and nicely formatted output to printbufs or seq_buf.

Sample output, from the bcache conversion:

root@moria-kvm:/sys/fs/bcache/bdaedb8c-4554-4dd2-87e4-276e51eb47cc# cat internal/btree_sort_times
count:               6414
                            since mount    recent
duration of events
  min:                    440 ns
  max:                   1102 us
  total:                  674 ms
  mean:                   105 us          102 us
  stddev:                 101 us           88 us
time between events
  min:                    881 ns
  max:                       3 s
  mean:                      7 ms           6 ms
  stddev:                   52 ms           6 ms

Cc: Darrick J. Wong <djwong@kernel.org>
Cc: Dave Chinner <david@fromorbit.com>
Cc: Theodore Ts'o <tytso@mit.edu>
Cc: Coly Li <colyli@suse.de>
Signed-off-by: Kent Overstreet <kent.overstreet@linux.dev>
Reviewed-by: Darrick J. Wong <djwong@kernel.org>
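A minimal usage sketch of the interface named above; the header path, the clock source, and the declaration details are illustrative assumptions, not taken from this log:

    /* Hypothetical caller: only time_stats_update() is named in the commit
     * message above, everything else here is an assumption for illustration. */
    #include <linux/time_stats.h>

    static struct time_stats btree_sort_times;

    static void timed_btree_sort(void)
    {
        u64 start = local_clock();      /* clock source is an assumption */

        /* ... the work being timed ... */

        /* record one event ending "now"; inputs are buffered percpu,
         * so this is cheap enough for hot paths */
        time_stats_update(&btree_sort_times, start);
    }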
2025-03-07  eytzinger: Promote to include/linux/  Kent Overstreet
eytzinger trees are a faster alternative to binary search. They're a bit more expensive to set up, but lookups perform much better assuming the tree isn't entirely in cache.

Binary search is a worst case scenario for branch prediction and prefetching, but eytzinger trees have children adjacent in memory and thus we can prefetch before knowing the result of a comparison.

An eytzinger tree is a binary tree laid out in an array, with the same geometry as the usual binary heap construction, but used as a search tree instead.
Reviewed-by: Darrick J. Wong <djwong@kernel.org>
Signed-off-by: Kent Overstreet <kent.overstreet@linux.dev>
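The layout and prefetch idea can be illustrated with a small sketch; this is not the include/linux/eytzinger.h API, just the 1-based, heap-style index arithmetic the description above relies on:

    /* Children of node i in a 1-based eytzinger array: same geometry as a
     * binary heap, but the array is used as a search tree. */
    static inline unsigned eytzinger1_left_child(unsigned i)  { return 2 * i; }
    static inline unsigned eytzinger1_right_child(unsigned i) { return 2 * i + 1; }

    /* Lookup sketch: both children sit next to each other at 2i and 2i + 1,
     * so they can be prefetched before the comparison at i resolves. */
    static unsigned eytzinger1_descend(const int *base, unsigned size, int search)
    {
        unsigned i = 1;

        while (i <= size) {
            __builtin_prefetch(&base[2 * i]);   /* hint only; may point past the end */
            i = search < base[i]
                ? eytzinger1_left_child(i)
                : eytzinger1_right_child(i);
        }
        /* Recovering the matching index from the final value of i is omitted;
         * this sketch only shows the traversal and prefetch pattern. */
        return i;
    }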
2025-03-07  mean and variance: Promote to lib/math  Kent Overstreet
Small statistics library, for taking in a series of values and computing mean, weighted mean, standard deviation and weighted deviation. The main use case is for statistics on latency measurements.
Signed-off-by: Kent Overstreet <kent.overstreet@linux.dev>
Cc: Daniel Hill <daniel@gluo.nz>
Cc: Darrick J. Wong <djwong@kernel.org>
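As a rough illustration of what "mean" versus "weighted mean" amounts to here (this is not the lib/math API itself, just the general shape of a plain running update alongside an exponentially weighted one):

    struct running_stats {
        u64 n;
        s64 mean;           /* plain running mean */
        u64 m2;             /* sum of squared deviations; variance = m2 / (n - 1) */
        s64 weighted_mean;  /* exponentially decayed mean */
    };

    static void running_stats_update(struct running_stats *s, s64 v, unsigned weight)
    {
        s64 d = v - s->mean;

        s->n++;
        s->mean += d / (s64) s->n;          /* Welford-style mean update */
        s->m2   += d * (v - s->mean);       /* Welford-style M2 update */

        /* weighted mean decays old samples by 2^-weight per new sample
         * (assumes arithmetic right shift of signed values, as the kernel does) */
        s->weighted_mean += (v - s->weighted_mean) >> weight;
    }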
2025-03-07  time_stats: allow custom epoch names  Darrick J. Wong
Let callers of time_stats_to_seq_buf define the epoch name; "mount" doesn't make sense generally.
Signed-off-by: Darrick J. Wong <djwong@kernel.org>
Signed-off-by: Kent Overstreet <kent.overstreet@linux.dev>
2025-03-07  time_stats: don't print any output if event count is zero  Darrick J. Wong
There's no point in printing an empty report when there is no data, so add a flag that lets us skip it.
Signed-off-by: Darrick J. Wong <djwong@kernel.org>
Signed-off-by: Kent Overstreet <kent.overstreet@linux.dev>
2025-03-07  time_stats: report lifetime of the stats object  Darrick J. Wong
Capture the initialization time of the time_stats object so that we can report how long the counter has been observing data.
Signed-off-by: Darrick J. Wong <djwong@kernel.org>
Signed-off-by: Kent Overstreet <kent.overstreet@linux.dev>
2025-03-07  bcachefs: bch2_time_stats_to_seq_buf()  Kent Overstreet
Signed-off-by: Kent Overstreet <kent.overstreet@linux.dev>
2025-03-07  bcachefs: eytzinger1_{next,prev} cleanup  Andreas Gruenbacher
The eytzinger code was previously relying on the following wrap-around properties and their "eytzinger0" equivalents:

  eytzinger1_prev(0, size) == eytzinger1_last(size)
  eytzinger1_next(0, size) == eytzinger1_first(size)

However, these properties are no longer relied upon and no longer necessary, so remove the corresponding asserts and forbid the use of eytzinger1_prev(0, size) and eytzinger1_next(0, size).

This allows us to further simplify the code in eytzinger1_next() and eytzinger1_prev(): where the left shifting happens, eytzinger1_next() is trying to move i to the lowest child on the left, which is equivalent to doubling i until the next doubling would cause it to be greater than size. This is implemented by shifting i to the left so that the most significant bits align and then shifting i to the right by one if the result is greater than size. Likewise, eytzinger1_prev() is trying to move to the lowest child on the right; the same applies here.

The 1-offset in (size - 1) in eytzinger1_next() isn't needed at all, but the equivalent offset in eytzinger1_prev() is surprisingly needed to preserve the 'eytzinger1_prev(0, size) == eytzinger1_last(size)' property. However, since we no longer support that property, we can get rid of these offsets as well. This saves one addition in each function and makes the code less confusing.

Signed-off-by: Andreas Gruenbacher <agruenba@redhat.com>
Signed-off-by: Kent Overstreet <kent.overstreet@linux.dev>
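The "shift left so the most significant bits align, then back off by one if the result is too big" step can be sketched as follows; this is illustrative only, not the actual eytzinger1_next() code, and assumes 0 < i <= size:

    static inline unsigned lowest_left_descendant(unsigned i, unsigned size)
    {
        /* double i as many times as possible: shift it left until its
         * top bit lines up with size's top bit */
        i <<= __builtin_clz(i) - __builtin_clz(size);

        /* if that overshot size, undo one doubling */
        if (i > size)
            i >>= 1;
        return i;
    }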
2025-03-07  bcachefs: convert eytzinger sort to be 1-based (2)  Andreas Gruenbacher
In this second step, transform the eytzinger indexes i, j, and k in eytzinger1_sort_r() from 0-based to 1-based. This step looks a bit messy, but the resulting code is slightly better.
Signed-off-by: Andreas Gruenbacher <agruenba@redhat.com>
Signed-off-by: Kent Overstreet <kent.overstreet@linux.dev>
2025-03-07  bcachefs: convert eytzinger sort to be 1-based (1)  Andreas Gruenbacher
In this first step, convert the eytzinger sort functions to use 1-based primitives.
Signed-off-by: Andreas Gruenbacher <agruenba@redhat.com>
Signed-off-by: Kent Overstreet <kent.overstreet@linux.dev>
2025-03-07  bcachefs: convert eytzinger0_find to be 1-based  Andreas Gruenbacher
Several of the algorithms on eytzinger trees are implemented in terms of the eytzinger0 primitives. However, those algorithms can just as easily be expressed in terms of the eytzinger1 primitives, and that leads to better and easier to understand code. Start by converting eytzinger0_find().
Signed-off-by: Andreas Gruenbacher <agruenba@redhat.com>
Signed-off-by: Kent Overstreet <kent.overstreet@linux.dev>
2025-03-07  bcachefs: Add eytzinger0_find self test  Andreas Gruenbacher
Function eytzinger0_find() isn't currently covered, so add a self test. We can rely on eytzinger0_find_le() here because it is being tested independently.
Signed-off-by: Andreas Gruenbacher <agruenba@redhat.com>
Signed-off-by: Kent Overstreet <kent.overstreet@linux.dev>
2025-03-07  bcachefs: add eytzinger0_find_ge self test  Andreas Gruenbacher
Add an eytzinger0_find_ge() self test similar to eytzinger0_find_gt(). Note that this test requires eytzinger0_find_ge() to return the first matching element in the array in case of duplicates. To prevent bisection errors, we only add this test after strengthening the original implementation (see the previous commit).
Signed-off-by: Andreas Gruenbacher <agruenba@redhat.com>
Signed-off-by: Kent Overstreet <kent.overstreet@linux.dev>
2025-03-07  bcachefs: implement eytzinger0_find_ge directly  Andreas Gruenbacher
Implement eytzinger0_find_ge() directly instead of implementing it in terms of eytzinger0_find_le() and adjusting the result. This turns eytzinger0_find_ge() into a minimum search, so when there are duplicate elements, the result of eytzinger0_find_ge() will now always point at the first matching element.
Signed-off-by: Andreas Gruenbacher <agruenba@redhat.com>
Signed-off-by: Kent Overstreet <kent.overstreet@linux.dev>
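The "minimum search" shape described here looks roughly like the following sketch (1-based indexing, plain int keys); it is illustrative only, not the actual bcachefs implementation, and the real code's index recovery and tie-breaking details are not reproduced:

    static unsigned eytzinger1_find_ge_sketch(const int *base, unsigned size, int search)
    {
        unsigned i = 1, ret = 0;    /* 0: no element >= search */

        while (i <= size) {
            if (base[i] >= search) {
                /* candidate; a smaller matching element may still
                 * exist in the left subtree, so keep descending left */
                ret = i;
                i = 2 * i;
            } else {
                i = 2 * i + 1;
            }
        }
        return ret;
    }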
2025-03-07  bcachefs: implement eytzinger0_find_gt directly  Andreas Gruenbacher
Instead of implementing eytzinger0_find_gt() in terms of eytzinger0_find_le() and adjusting the result, implement it directly.
Signed-off-by: Andreas Gruenbacher <agruenba@redhat.com>
Signed-off-by: Kent Overstreet <kent.overstreet@linux.dev>
2025-03-07  bcachefs: add eytzinger0_find_gt self test  Andreas Gruenbacher
Add an eytzinger0_find_gt() self test similar to eytzinger0_find_le().
Signed-off-by: Andreas Gruenbacher <agruenba@redhat.com>
Signed-off-by: Kent Overstreet <kent.overstreet@linux.dev>
2025-03-07  bcachefs: simplify eytzinger0_find_le  Andreas Gruenbacher
Replace the over-complicated implementation of eytzinger0_find_le() by an equivalent, simpler version.
Signed-off-by: Andreas Gruenbacher <agruenba@redhat.com>
Signed-off-by: Kent Overstreet <kent.overstreet@linux.dev>
2025-03-07  bcachefs: convert eytzinger0_find_le to be 1-based  Andreas Gruenbacher
eytzinger0_find_le() is also easy to convert to 1-based eytzinger (but see the next commit).
Signed-off-by: Andreas Gruenbacher <agruenba@redhat.com>
Signed-off-by: Kent Overstreet <kent.overstreet@linux.dev>
2025-03-07  bcachefs: improve eytzinger0_find_le self test  Andreas Gruenbacher
Rename eytzinger0_find_test_val() to eytzinger0_find_test_le() and add a new eytzinger0_find_test_val() wrapper that calls it. We have already established that the array is sorted in eytzinger order, so we can use the eytzinger iterator functions and check the boundary conditions to verify the result of eytzinger0_find_le(). Only scan the entire array if we get an incorrect result. When we need to scan, use eytzinger0_for_each_prev() so that we'll stop at the highest matching element in the array in case there are duplicates; going through the array linearly wouldn't give us that.
Signed-off-by: Andreas Gruenbacher <agruenba@redhat.com>
Signed-off-by: Kent Overstreet <kent.overstreet@linux.dev>
2025-03-07  bcachefs: add eytzinger0_for_each_prev  Andreas Gruenbacher
Add an eytzinger0_for_each_prev() macro for iterating through an eytzinger array in reverse.
Signed-off-by: Andreas Gruenbacher <agruenba@redhat.com>
Signed-off-by: Kent Overstreet <kent.overstreet@linux.dev>
2025-03-07  bcachefs: eytzinger0_find_test improvement  Andreas Gruenbacher
In eytzinger0_find_test(), remember the smallest element seen so far instead of comparing adjacent array elements.
Signed-off-by: Andreas Gruenbacher <agruenba@redhat.com>
Signed-off-by: Kent Overstreet <kent.overstreet@linux.dev>
2025-03-07  bcachefs: eytzinger[01]_test improvement  Andreas Gruenbacher
In eytzinger[01]_test(), make sure that eytzinger[01]_for_each() iterates over all array elements.
Signed-off-by: Andreas Gruenbacher <agruenba@redhat.com>
Signed-off-by: Kent Overstreet <kent.overstreet@linux.dev>
2025-03-07  bcachefs: eytzinger self tests: fix cmp_u16 typo  Andreas Gruenbacher
Fix an obvious typo in cmp_u16().
Signed-off-by: Andreas Gruenbacher <agruenba@redhat.com>
Signed-off-by: Kent Overstreet <kent.overstreet@linux.dev>
2025-03-07  bcachefs: eytzinger self tests: missing newline termination  Andreas Gruenbacher
pr_info() format strings need to be newline terminated.
Signed-off-by: Andreas Gruenbacher <agruenba@redhat.com>
Signed-off-by: Kent Overstreet <kent.overstreet@linux.dev>
2025-03-07  bcachefs: eytzinger self tests: loop cleanups  Andreas Gruenbacher
The iterator variable of eytzinger0_for_each() loops has been changed to be locally scoped at some point, so remove variables defined outside the loop that are now unused. In addition and for clarity, use a different variable inside those loops where an outside variable would be shadowed.
Signed-off-by: Andreas Gruenbacher <agruenba@redhat.com>
Signed-off-by: Kent Overstreet <kent.overstreet@linux.dev>
2025-03-07  bcachefs: EYTZINGER_DEBUG fix  Andreas Gruenbacher
When EYTZINGER_DEBUG is defined, <linux/bug.h> needs to be included.
Signed-off-by: Andreas Gruenbacher <agruenba@redhat.com>
Signed-off-by: Kent Overstreet <kent.overstreet@linux.dev>
2025-03-06  bcachefs: Don't start promotes from bch2_rbio_free()  Kent Overstreet
We don't want to block completion of the read - starting a promote calls into the write path, which will block.
Signed-off-by: Kent Overstreet <kent.overstreet@linux.dev>
2025-03-06  bcachefs: Bail out early on alloc_nowait data updates  Kent Overstreet
If a data update doesn't want to block on allocations (promotes, self healing on read error) - check if the allocation would fail before kicking off the data update and calling into the write path.
Signed-off-by: Kent Overstreet <kent.overstreet@linux.dev>
2025-03-06  bcachefs: Rework init order in bch2_data_update_init()  Kent Overstreet
Initialize the write op first, so that in the next patch we can check if the allocator would block (for BCH_WRITE_alloc_nowait ops) and bail out before taking nocow locks/dev refs.
Signed-off-by: Kent Overstreet <kent.overstreet@linux.dev>
2025-03-06  bcachefs: Self healing writes are BCH_WRITE_alloc_nowait  Kent Overstreet
If a drive is failing and we're moving data off of it, we can't necessarily depend on capacity/disk reservation calculations to avoid deadlocking/blocking on the allocator. And, we don't want to queue up infinite self healing moves anyway.
Signed-off-by: Kent Overstreet <kent.overstreet@linux.dev>
2025-03-06  bcachefs: Promotes should use BCH_WRITE_only_specified_devs  Kent Overstreet
Promotes, like most other internal moves, should only go to the specified target and not fall back to allocating from the full filesystem.
Signed-off-by: Kent Overstreet <kent.overstreet@linux.dev>
2025-03-06  bcachefs: Be stricter in bch2_read_retry_nodecode()  Kent Overstreet
Now that data_update embeds bch_read_bio, BCH_READ_NODECODE means that the read is embedded in a data_update - and we can check in the retry path if the extent has changed and bail out. This likely fixes some subtle bugs with read errors and data moves.
Signed-off-by: Kent Overstreet <kent.overstreet@linux.dev>
2025-03-06  bcachefs: cleanup redundant code around data_update_op initialization  Kent Overstreet
Signed-off-by: Kent Overstreet <kent.overstreet@linux.dev>
2025-03-06  bcachefs: bch2_update_unwritten_extent() no longer depends on wbio  Kent Overstreet
Prep work for improving bch2_data_update_init().
Signed-off-by: Kent Overstreet <kent.overstreet@linux.dev>
2025-03-06  bcachefs: promote_op uses embedded bch_read_bio  Kent Overstreet
Signed-off-by: Kent Overstreet <kent.overstreet@linux.dev>
2025-03-06  bcachefs: data_update now embeds bch_read_bio  Kent Overstreet
Signed-off-by: Kent Overstreet <kent.overstreet@linux.dev>
2025-03-06  bcachefs: rbio_init() cleanup  Kent Overstreet
Move more initialization to rbio_init(), to assist in further cleanups.
Signed-off-by: Kent Overstreet <kent.overstreet@linux.dev>
2025-03-06  bcachefs: rbio_init_fragment()  Kent Overstreet
Signed-off-by: Kent Overstreet <kent.overstreet@linux.dev>
2025-03-06  bcachefs: Rename BCH_WRITE flags for consistency with other x-macros enums  Kent Overstreet
The uppercase/lowercase style is nice for making the namespace explicit.
Signed-off-by: Kent Overstreet <kent.overstreet@linux.dev>
2025-03-06  bcachefs: x-macroize BCH_READ flags  Kent Overstreet
Will be adding a bch2_read_bio_to_text().
Signed-off-by: Kent Overstreet <kent.overstreet@linux.dev>
2025-03-06  bcachefs: Avoid holding btree locks when blocking on IO  Kent Overstreet
Read retries are done synchronously, so we definitely shouldn't be holding any locks (even the srcu lock for btree key cache reclaim) when submitting the IO.
Signed-off-by: Kent Overstreet <kent.overstreet@linux.dev>
2025-03-06  bcachefs: kill bch_read_bio.devs_have  Kent Overstreet
Dead code.
Signed-off-by: Kent Overstreet <kent.overstreet@linux.dev>
2025-03-06  bcachefs: bch2_data_update_inflight_to_text()  Kent Overstreet
Add a new helper for bch2_moving_ctxt_to_text(), which may be used to debug if moving_ios are getting stuck.
Signed-off-by: Kent Overstreet <kent.overstreet@linux.dev>
2025-03-06  bcachefs: BCH_IOCTL_QUERY_COUNTERS  Kent Overstreet
Add an ioctl for querying counters, the same ones provided in /sys/fs/bcachefs/<uuid>/counters/, but more suitable for a 'bcachefs top' command.
Signed-off-by: Kent Overstreet <kent.overstreet@linux.dev>
2025-03-06  bcachefs: BCH_COUNTER_bucket_discard_fast  Kent Overstreet
Add a separate counter for fastpath bucket discards, which don't require a journal flush.
Signed-off-by: Kent Overstreet <kent.overstreet@linux.dev>
2025-03-06  bcachefs: enum bch_persistent_counters_stable  Kent Overstreet
Persistent counters, like recovery passes, include a stable enum in their definition - but this was never correctly plumbed. This allows us to add new counters and properly organize them with a non-stable "presentation order", which can also be used in userspace by the new 'bcachefs fs top' tool.

Fortunately, since we haven't yet added any new counters where presentation order ID doesn't match stable ID, this won't cause any reordering issues.
Signed-off-by: Kent Overstreet <kent.overstreet@linux.dev>
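A hedged sketch of the stable vs. presentation split described here, in the x-macro style bcachefs commonly uses; the counter names, macro name, and numbers below are purely illustrative, not the actual definitions:

    #define BCH_PERSISTENT_COUNTERS_SKETCH()    \
        x(io_read,          0)                  \
        x(io_write,         1)                  \
        x(bucket_discard,   2)

    /* presentation order: may be reorganized freely for sysfs/tools */
    enum counters_sketch {
    #define x(name, stable_nr) COUNTER_##name,
        BCH_PERSISTENT_COUNTERS_SKETCH()
    #undef x
        COUNTER_NR
    };

    /* stable order: the number in the list is what identifies the
     * counter on disk, so it must never change */
    enum counters_sketch_stable {
    #define x(name, stable_nr) COUNTER_STABLE_##name = stable_nr,
        BCH_PERSISTENT_COUNTERS_SKETCH()
    #undef x
    };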
Persistent counters, like recovery passes, include a stable enum in their definition - but this was never correctly plumbed. This allows us to add new counters and properly organize them with a non-stable "presentation order", which can also be used in userspace by the new 'bcachefs fs top' tool. Fortunatel, since we haven't yet added any new counters where presentation order ID doesn't match stable ID, this won't cause any reordering issues. Signed-off-by: Kent Overstreet <kent.overstreet@linux.dev>