Age | Commit message (Collapse) | Author |
|
This factors pr_hex_bytes(), a new printbuf-style helper, out from
hex_string and adds it to pretty-printers.c.
Signed-off-by: Kent Overstreet <kent.overstreet@gmail.com>
|
|
Signed-off-by: Kent Overstreet <kent.overstreet@gmail.com>
|
|
This adds options to printbuf for specifying whether units should be
printed raw (default) or with human readable units, and for controlling
whether human-readable units should be base 2 (default), or base 10.
This also adds new helpers that obey these options:
- pr_human_readable_u64
- pr_human_readable_s64
These obey printbuf->si_units
- pr_units_u64
- pr_units_s64
These obey both printbuf-human_readable_units and printbuf->si_units
Signed-off-by: Kent Overstreet <kent.overstreet@gmail.com>
|
|
This patch adds two new features to printbuf for structured formatting:
- Indent level: the indent level, as a number of spaces, may be
increased with pr_indent_add() and decreased with pr_indent_sub().
Subsequent lines, when started with pr_newline() (not "\n", although
that may change) will then be intended according to the current
indent level. This helps with pretty-printers that structure a large
amonut of data across multiple lines and multiple functions.
- Tabstops: Tabstops may be set by assigning to the printbuf->tabstops
array.
Then, pr_tab() may be used to advance to the next tabstop, printing
as many spaces as required - leaving previous output left justified
to the previous tabstop. pr_tab_rjust() advances to the next tabstop
but inserts the spaces just after the previous tabstop - right
justifying the previously-outputted text to the next tabstop.
Signed-off-by: Kent Overstreet <kent.overstreet@gmail.com>
|
|
This makes printbufs optionally heap allocated: a printbuf initialized
with the PRINTBUF initializer will automatically heap allocate and
resize as needed.
Allocations are done with GFP_KERNEL: code should use e.g.
memalloc_nofs_save()/restore() as needed. Since we do not currently have
memalloc_nowait_save()/restore(), in contexts where it is not safe to
block we provide the helpers
printbuf_atomic_inc()
printbuf_atomic_dec()
When the atomic count is nonzero, memory allocations will be done with
GFP_NOWAIT.
On memory allocation failure, output will be truncated. Code that wishes
to check for memory allocation failure (in contexts where we should
return -ENOMEM) should check if printbuf->allocation_failure is set.
Since printbufs are expected to be typically used for log messages and
on a best effort basis, we don't return errors directly.
Other helpers provided by this patch:
- printbuf_make_room(buf, extra)
Reallocates if necessary to make room for @extra bytes (not including
terminating null).
- printbuf_str(buf)
Returns a null terminated string equivalent to the contents of @buf.
If @buf was never allocated (or allocation failed), returns a
constant empty string.
- printbuf_exit(buf)
Releases memory allocated by a printbuf.
Signed-off-by: Kent Overstreet <kent.overstreet@gmail.com>
|
|
printbuf now needs to know the number of characters that would have been
written if the buffer was too small, like snprintf(); this changes
string_get_size() to return the the return value of snprintf().
Signed-off-by: Kent Overstreet <kent.overstreet@gmail.com>
|
|
This converts vsnprintf() to printbufs: instead of passing around raw
char * pointers for current buf position and end of buf, we have a real
type!
This makes the calling convention for our existing pretty printers a lot
saner and less error prone, plus printbufs add some new helpers that
make the code smaller and more readable, with a lot less crazy pointer
arithmetic.
There are a lot more refactorings to be done: this patch tries to stick
to just converting the calling conventions, as that needs to be done all
at once in order to avoid introducing a ton of wrappers that will just
be deleted.
Thankfully we have good unit tests for printf, and they have been run
and are all passing with this patch.
We have two new exported functions with this patch:
- pr_buf(), which is like snprintf but outputs to a printbuf
- vpr_buf, like vsnprintf
These are the actual core print routines now - vsnprintf() is a wrapper
around vpr_buf().
Signed-off-by: Kent Overstreet <kent.overstreet@gmail.com>
|
|
This adds printbufs: a printbuf points to a char * buffer and knows the
size of the output buffer as well as the current output position.
Future patches will be adding more features to printbuf, but initially
printbufs are targeted at refactoring and improving our existing code in
lib/vsprintf.c - so this initial printbuf patch has the features
required for that.
Signed-off-by: Kent Overstreet <kent.overstreet@gmail.com>
Reviewed-by: Matthew Wilcox (Oracle) <willy@infradead.org>
|
|
Delete some obsolete tracepoints, organize alloc tracepoints better,
make a few tracepoints more consistent.
Signed-off-by: Kent Overstreet <kent.overstreet@gmail.com>
|
|
Now that the bucket_alloc_fail tracepoint includes the error code, the
open_bucket_alloc_fail tracepoint is redundant.
Signed-off-by: Kent Overstreet <kent.overstreet@gmail.com>
|
|
This behavior dates from the early, early days of bcache, and upon
further delving appears to not make any sense. The shrinker only works
in terms of 'objects' of unknown size; normalizing to pages only had the
effect of changing the batch size, which we could do directly - if we
wanted; we probably don't. Normalizing to pages meant our batch size was
very small, which seems to have been keeping us from doing as much
shrinking as we should be under heavy memory pressure; this patch
appears to alleviate some OOMs we've been seeing.
Signed-off-by: Kent Overstreet <kent.overstreet@gmail.com>
|
|
Signed-off-by: Kent Overstreet <kent.overstreet@gmail.com>
|
|
- bch2_clear_need_discard() was using bch2_trans_relock() incorrectly,
and always bailing out before doing any work - ouch.
- Add a tracepoint that fires every time bch2_do_discards() runs, and
tells us about the work it did
- When too many buckets aren't able to be discarded because they need a
journal commit, bch2_do_discards now flushes the journal.
Signed-off-by: Kent Overstreet <kent.overstreet@gmail.com>
|
|
For backpointers, we'll need to delete old backpointers before adding
new backpointers - otherwise we'll run into spurious duplicate
backpointer errors.
Signed-off-by: Kent Overstreet <kent.overstreet@gmail.com>
|
|
This adds counters for each of the reasons we may skip allocating a
bucket - we're seeing a bug where we loop endlessly trying to allocate
when we should have plenty of buckets available, so hopefully this will
help us track down why.
Signed-off-by: Kent Overstreet <kent.overstreet@gmail.com>
|
|
Previously, we'd go into an infinite loop when attempting to cache a
bkey in the key cache larger than 128 u64s - since we were only using a
u8 for the size field, it'd get rounded up to 256 then truncated to 0.
Oops.
Signed-off-by: Kent Overstreet <kent.overstreet@gmail.com>
|
|
- bucket_alloc_fail now indicates whether allocation was nonblocking
- we now return strings, not integers, for alloc reserve.
Signed-off-by: Kent Overstreet <kent.overstreet@gmail.com>
|
|
Also include the number of buckets available, and the number of buckets
awaiting journal commit - and add a sysfs counter, too.
Signed-off-by: Kent Overstreet <kent.overstreet@gmail.com>
|
|
This fixes a regression from "bcachefs: Stash a copy of key being
overwritten in btree_insert_entry". In btree_key_can_insert_cached(), we
may reallocate the key cache key, invalidating pointers previously
returned by peek() - fix it by issuing a transaction restart.
Signed-off-by: Kent Overstreet <kent.overstreet@gmail.com>
|
|
The error code when we fail to allocate a node in the btree node cache
doesn't make it to bch2_btree_path_traverse_all(). Instead, we need to
stash a flag in btree_trans so we know we have to take the cannibalize
lock.
Signed-off-by: Kent Overstreet <kent.overstreet@gmail.com>
|
|
Some of our tracepoints were calling snprintf("pS") - which does symbol
table lookups - in TP_fast_assign(), which turns out to be a really bad
idea.
This was done because perf trace wasn't correctly printing tracepoints
that use %pS anymore - but it turns out trace-cmd does handle it
correctly.
Signed-off-by: Kent Overstreet <kent.overstreet@gmail.com>
|
|
Updates to non key cache iterators will now be transparently redirected
to the key cache for cached btrees.
Signed-off-by: Kent Overstreet <kent.overstreet@gmail.com>
|
|
This improves the transaction restart tracepoints - adding distinct
tracepoints for all the locations and reasons a transaction might have
been restarted, and ensures that there's a tracepoint for every
transaction restart.
Signed-off-by: Kent Overstreet <kent.overstreet@gmail.com>
|
|
Symbol decoding, via %ps, isn't supported in userspace - this will also
be faster when we're using trans->fn in the fast path, as with the new
BCH_JSET_ENTRY_log journal messages.
Signed-off-by: Kent Overstreet <kent.overstreet@gmail.com>
|
|
Add a field to bch_dev for the dev_t of the underlying block device -
this fixes a null ptr deref in tracepoints.
Signed-off-by: Kent Overstreet <kent.overstreet@gmail.com>
|
|
This is to help with diagnosing why the btree node can doesn't seem to
be shrinking - we've had issues in the past with granularity/batch size,
since btree nodes are so big.
Signed-off-by: Kent Overstreet <kent.overstreet@gmail.com>
|
|
When support for snapshots was merged, export operations weren't
updated yet. This patch adds new filehandle types for bcachefs that
include the subvolume ID and updates export operations for subvolumes -
and also .get_parent, support for which was added just prior to
snapshots.
Signed-off-by: Kent Overstreet <kent.overstreet@gmail.com>
|
|
Instead of unconditionally upgrading read locks to intent locks in
do_bch2_trans_commit(), this patch changes the path that takes write
locks to first trylock, and then if trylock fails check if we have a
conflicting read lock, and restart the transaction if necessary.
Signed-off-by: Kent Overstreet <kent.overstreet@gmail.com>
|
|
These haven't turned out to be useful
Signed-off-by: Kent Overstreet <kent.overstreet@gmail.com>
|
|
This splits btree_iter into two components: btree_iter is now the
externally visible componont, and it points to a btree_path which is now
reference counted.
This means we no longer have to clone iterators up front if they might
be mutated - btree_path can be shared by multiple iterators, and cloned
if an iterator would mutate a shared btree_path. This will help us use
iterators more efficiently, as well as slimming down the main long lived
state in btree_trans, and significantly cleans up the logic for iterator
lifetimes.
Signed-off-by: Kent Overstreet <kent.overstreet@gmail.com>
|
|
Btree iterator tracepoints should print whether they're for the key
cache.
Signed-off-by: Kent Overstreet <kent.overstreet@gmail.com>
|
|
This patch adds some new tracepoints to the btree iterator code, and
adds new fields to the existing tracepoints - primarily for the iterator
position.
Signed-off-by: Kent Overstreet <kent.overstreet@gmail.com>
|
|
%pU for printing out pointers to uuids doesn't work in perf trace
Signed-off-by: Kent Overstreet <kent.overstreet@gmail.com>
|
|
Signed-off-by: Kent Overstreet <kent.overstreet@gmail.com>
|
|
Signed-off-by: Kent Overstreet <kent.overstreet@gmail.com>
|
|
Trying to debug an issue where after traverse_all() we shouldn't have to
traverse any iterators... yet we are
Signed-off-by: Kent Overstreet <kent.overstreet@gmail.com>
|
|
This uses the kthread_wait_freezable() macro to simplify a lot of the
allocator thread code, along with cleaning up bch2_invalidate_bucket2().
Signed-off-by: Kent Overstreet <kent.overstreet@gmail.com>
|
|
Signed-off-by: Kent Overstreet <kent.overstreet@gmail.com>
|
|
By changing it to upgrade iterators to intent locks to avoid lock
restarts we can simplify __bch2_btree_node_lock() quite a bit - this
fixes a probable bug where it could potentially drop a lock on an
unrelated error but still succeed instead of causing a transaction
restart.
Signed-off-by: Kent Overstreet <kent.overstreet@gmail.com>
|
|
We have foreground btree node merging now, and any future btree node
merging improvements are going to be based off of that code.
Signed-off-by: Kent Overstreet <kent.overstreet@gmail.com>
|
|
We need to flush the btree key cache when it's too dirty, because
otherwise the shrinker won't be able to reclaim memory - this is done by
journal reclaim. But journal reclaim also kicks btree node writes: this
meant that btree node writes were getting kicked much too often just
because we needed to flush btree key cache keys.
This patch splits journal pins into two different lists, and teaches
journal reclaim to not flush btree node writes when it only needs to
flush key cache keys.
Signed-off-by: Kent Overstreet <kent.overstreet@gmail.com>
|
|
This is needed to ensure we don't deadlock because journal reclaim and
thus memory reclaim isn't making forward progress.
Signed-off-by: Kent Overstreet <kent.overstreet@gmail.com>
|
|
The transaction restart path traverses all iterators, we don't need to
do it here.
Signed-off-by: Kent Overstreet <kent.overstreet@gmail.com>
|
|
Ensuring the key cache isn't too dirty is critical for ensuring that the
shrinker can reclaim memory.
Signed-off-by: Kent Overstreet <kent.overstreet@gmail.com>
|
|
Symbol decoding was changed from %pf to %ps
Signed-off-by: Kent Overstreet <kent.overstreet@gmail.com>
|
|
We have a bug where we can get stuck with a process spinning in
transaction restarts - need more information.
Signed-off-by: Kent Overstreet <kent.overstreet@gmail.com>
|
|
Per device copygc threads don't move data to different devices and they
make fragmentation works - they don't make much sense anymore.
Signed-off-by: Kent Overstreet <kent.overstreet@gmail.com>
|
|
This is prep work for the btree key cache: btree iterators will point to
either struct btree, or a new struct bkey_cached.
Signed-off-by: Kent Overstreet <kent.overstreet@gmail.com>
|
|
Transaction restart tracing should probably be overhaulled at some
point.
Signed-off-by: Kent Overstreet <kent.overstreet@gmail.com>
|
|
Forked from drivers/md/bcache, now a full blown COW multi device
filesystem with a long list of features - https://bcachefs.org
Signed-off-by: Kent Overstreet <kent.overstreet@gmail.com>
|