Age | Commit message | Author |
|
In journal replay, we weren't immediately dropping journal pins when we
start doing updates that weren't from journal replay - leading to
journal reclaim getting stuck.
Signed-off-by: Kent Overstreet <kent.overstreet@gmail.com>
|
|
Now that the bucket_alloc_fail tracepoint includes the error code, the
open_bucket_alloc_fail tracepoint is redundant.
Signed-off-by: Kent Overstreet <kent.overstreet@gmail.com>
|
|
This adds error logging to a bunch of functions in fsck.c - in fsck,
redundant error messages are probably better than not enough.
Signed-off-by: Kent Overstreet <kent.overstreet@gmail.com>
|
|
If the dirent an inode points to doesn't exist, we shouldn't be
returning an error - just 0/false.
Signed-off-by: Kent Overstreet <kent.overstreet@gmail.com>
|
|
When we detect a filesystem inconsistency, we should include the
relevant keys in the error message. This patch adds a parameter to pass
the key with the lru entry to bch2_lru_delete(), so that it can be
printed.
Signed-off-by: Kent Overstreet <kent.overstreet@gmail.com>
|
|
When many journal replay keys have been overwritten,
bch2_journal_keys_peek() was taking excessively long to scan before it
found a key to return.
Fix this by introducing bch2_journal_keys_peek_upto() which takes a
parameter for the end of the range we want, so that we can terminate the
search much sooner, and replace all uses of bch2_journal_keys_peek()
with peek_upto() or peek_slot().
Signed-off-by: Kent Overstreet <kent.overstreet@gmail.com>
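A minimal sketch of the bounded search described above, assuming journal
keys reduced to a single integer position in a sorted array with an
"overwritten" flag. struct jkey and the jkeys_*() functions are
hypothetical stand-ins for illustration, not the bcachefs code:

```c
#include <stdbool.h>
#include <stddef.h>

/* Hypothetical, simplified journal key: one integer position. */
struct jkey {
	unsigned long	pos;
	bool		overwritten;
};

/* First index in the sorted array keys[0..nr) with pos >= search_pos. */
size_t jkeys_lower_bound(const struct jkey *keys, size_t nr,
			 unsigned long search_pos)
{
	size_t l = 0, r = nr;

	while (l < r) {
		size_t m = l + (r - l) / 2;

		if (keys[m].pos < search_pos)
			l = m + 1;
		else
			r = m;
	}
	return l;
}

/*
 * Return the first live key in [pos, end], or NULL.
 *
 * Without the end bound, a long run of overwritten keys would be
 * scanned to the end of the array; with it, the scan stops at the
 * first key past @end.
 */
const struct jkey *jkeys_peek_upto(const struct jkey *keys, size_t nr,
				   unsigned long pos, unsigned long end)
{
	size_t i;

	for (i = jkeys_lower_bound(keys, nr, pos); i < nr; i++) {
		if (keys[i].pos > end)
			return NULL;
		if (!keys[i].overwritten)
			return &keys[i];
	}
	return NULL;
}

/* A slot lookup is the degenerate range [pos, pos]. */
const struct jkey *jkeys_peek_slot(const struct jkey *keys, size_t nr,
				   unsigned long pos)
{
	return jkeys_peek_upto(keys, nr, pos, pos);
}
```

In this sketch the cost of a lookup is bounded by the number of
overwritten keys inside the requested range rather than by the total
number of overwritten keys after the search position.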
|
|
Error messages should always print out the full key when available -
this gives us a starting point when looking through the journal to debug
what went wrong.
Signed-off-by: Kent Overstreet <kent.overstreet@gmail.com>
|
|
It's an error if a bucket is in state BCH_DATA_cached but not on the LRU
btree - i.e. io_time[READ] == 0 - so make sure it's set before adding
it.
Also, make some of the LRU code a bit clearer and more direct.
Signed-off-by: Kent Overstreet <kent.overstreet@gmail.com>
|
|
This gets us better error messages.
Signed-off-by: Kent Overstreet <kent.overstreet@gmail.com>
|
|
This patch updates bch2_open_buckets_to_text() to include the device and
bucket the open_bucket owns.
Signed-off-by: Kent Overstreet <kent.overstreet@gmail.com>
|
|
In journal_entry_add(), we were repeatedly scanning the journal entries
radix tree to look for old entries that can be freed, with O(n^2)
behaviour. This patch tweaks things to remember the previous last_seq,
so we don't have to scan for entries to free from the start.
Signed-off-by: Kent Overstreet <kent.overstreet@gmail.com>
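A rough sketch of the idea of remembering where the previous scan
stopped, assuming a plain array indexed by sequence number in place of
the radix tree. struct jlist, its fields and jlist_free_old() are
hypothetical names for illustration only:

```c
#include <stdlib.h>

#define MAX_SEQ	1024	/* arbitrary capacity for the sketch */

struct jlist {
	void		*entries[MAX_SEQ];	/* stand-in for the radix tree */
	unsigned long	freed_upto;		/* every seq below this is already freed */
	unsigned long	slots_scanned;		/* instrumentation only */
};

/* Free all entries with seq < last_seq. */
void jlist_free_old(struct jlist *j, unsigned long last_seq)
{
	/*
	 * Resume from where the previous call stopped instead of
	 * rescanning from seq 0 each time.
	 */
	while (j->freed_upto < last_seq && j->freed_upto < MAX_SEQ) {
		free(j->entries[j->freed_upto]);
		j->entries[j->freed_upto] = NULL;
		j->freed_upto++;
		j->slots_scanned++;
	}
}
```

Because freed_upto only moves forward, each slot is visited at most
once across all calls, so n calls cost O(n) in total instead of O(n^2).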
|
|
We start doing allocations before the GC thread is created, which means
we need to check for that to avoid a null ptr deref.
Signed-off-by: Kent Overstreet <kent.overstreet@gmail.com>
|
|
We now pass a rw argument to .key_invalid methods so they can trigger
assertions for updates but not on existing keys. We shouldn't trigger
these extra assertions in journal replay - this patch changes the
transaction commit path accordingly.
Signed-off-by: Kent Overstreet <kent.overstreet@gmail.com>
|
|
- We weren't clearing the LRU btree
- bch2_alloc_read() runs before bch2_check_alloc_key() deletes alloc
keys for devices/buckets that don't exist, so it needs to check for
that
- bch2_check_lrus() needs to check that buckets exist
- improve some error messages
Signed-off-by: Kent Overstreet <kent.overstreet@gmail.com>
|
|
These showed up when building for mips.
Signed-off-by: Kent Overstreet <kent.overstreet@gmail.com>
|
|
New helper, for deleting extents.
Signed-off-by: Kent Overstreet <kent.overstreet@gmail.com>
|
|
With backpointers this doesn't work anymore - backpointers always need
to be updated to point to the new extent position.
Signed-off-by: Kent Overstreet <kent.overstreet@gmail.com>
|
|
We need to ensure that work structs in bch_fs always get initialized -
otherwise an error in filesystem initialization can pop a warning in the
workqueue code when we try to cancel a work struct that wasn't
initialized.
Signed-off-by: Kent Overstreet <kent.overstreet@gmail.com>
|
|
Previously, the journal read path used a linked list for storing the
journal entries we read from disk. But there's been a bug that's been
causing journal_flush_delay to incorrectly be set to 0, leading to far
more journal entries than is normal being written out, which then means
filesystems are no longer able to start due to the O(n^2) behaviour of
inserting into/searching that linked list.
Fix this by switching to a radix tree.
Signed-off-by: Kent Overstreet <kent.overstreet@gmail.com>
|
|
When there aren't any keys in the journal there's no need to allocate
the buffer - and attempting the allocation anyway caused a spurious
-ENOMEM.
Signed-off-by: Kent Overstreet <kent.overstreet@gmail.com>
|
|
Previously, we were missing accounting for buckets in need_gc_gens and
need_discard states. This matters because buckets in those states need
other btree operations done before they can be used, so they can't be
counted when checking the current number of free buckets against the
allocation watermark.
Also, we weren't directly counting free buckets at all. Now, data type 0
== BCH_DATA_free, and free buckets are counted; this means we can get
rid of the separate (poorly defined) count of unavailable buckets.
This is a new on disk format version, with upgrade and fsck required for
the accounting changes.
Signed-off-by: Kent Overstreet <kent.overstreet@gmail.com>
|
|
We're currently debugging an issue with discards not getting run; this
patch adds a manual trigger so we can then watch the tracepoint while it
runs.
Signed-off-by: Kent Overstreet <kent.overstreet@gmail.com>
|
|
- We were failing to start topology repair, because we hadn't set the
superblock flag indicating it needed to run
- set_node_min() forgot to update the btree node's key
- bch2_gc_alloc_reset() didn't reset data type, leading to inserting an
invalid key that was empty but had nonzero data type
Signed-off-by: Kent Overstreet <kent.overstreet@gmail.com>
|
|
This gets us better error messages.
Signed-off-by: Kent Overstreet <kent.overstreet@gmail.com>
|
|
.key_invalid is a better place for this assertion.
Signed-off-by: Kent Overstreet <kent.overstreet@gmail.com>
|
|
In the future printbufs will be mempool-ified, so we shouldn't be using
more than one at a time if we don't have to.
This also fixes an extra trailing newline.
Signed-off-by: Kent Overstreet <kent.overstreet@gmail.com>
|
|
We've been seeing this error in fsck and we weren't able to track down
where it came from - but now that .key_invalid methods take a rw
argument, we can safely check for this.
Signed-off-by: Kent Overstreet <kent.overstreet@gmail.com>
|
|
In check_extents() and check_dirents(), we're working towards only
handling transaction restarts in one place, at the top level - but we're
not there yet. check_i_sectors() and check_subdir_count() handle
transaction restarts locally, which means the iterator for the
dirent/extent is left unlocked (should_be_locked == 0), leading to
asserts popping when we go to do updates.
This patch hacks around this for now, until we can delete the offending
code.
Signed-off-by: Kent Overstreet <kent.overstreet@gmail.com>
|
|
This adds a new parameter to .key_invalid() methods for whether the key
is being read or written; the idea being that methods can do more
aggressive checks when a key is newly created and being written, when we
wouldn't want to delete the key because of those checks.
Signed-off-by: Kent Overstreet <kent.overstreet@gmail.com>
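A small sketch of a validation method that is stricter on the write
path than on the read path. The key layout, the enum and
demo_key_invalid() are hypothetical, and the real .key_invalid
signature is not reproduced here:

```c
#include <stddef.h>

enum demo_rw { DEMO_READ, DEMO_WRITE };

#define DEMO_DATA_TYPE_NR	8

/* Hypothetical key: data_type 0 means "free" in this sketch. */
struct demo_key {
	unsigned	data_type;
	unsigned	sectors;
};

/* Returns NULL if the key is acceptable, else a description of the problem. */
const char *demo_key_invalid(const struct demo_key *k, enum demo_rw rw)
{
	/* Checked in both directions: such a key cannot be interpreted. */
	if (k->data_type >= DEMO_DATA_TYPE_NR)
		return "invalid data type";

	/*
	 * Checked only for keys being written. An existing key that
	 * fails this is accepted on read, so that it is not dropped
	 * because of the stricter check.
	 */
	if (rw == DEMO_WRITE && !k->sectors && k->data_type)
		return "empty key with nonzero data type";

	return NULL;
}
```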
|
|
- Move checks for whether the device & bucket are valid from the
.key_invalid method to bch2_check_alloc_key(). This is because
.key_invalid() is called on keys that may no longer exist (post
journal replay), which is a problem when removing/resizing devices.
- We weren't checking the need_discard btree to ensure that every set
bucket has a corresponding alloc key. This refactors the code for
checking the freespace btree, so that it now checks both.
Signed-off-by: Kent Overstreet <kent.overstreet@gmail.com>
|
|
Signed-off-by: Kent Overstreet <kent.overstreet@gmail.com>
|
|
Signed-off-by: Kent Overstreet <kent.overstreet@gmail.com>
|
|
Signed-off-by: Kent Overstreet <kent.overstreet@gmail.com>
|
|
Btree updates before we go RW work by inserting into the array of keys
that journal replay will insert - but inserting into a flat array is
O(n), meaning if btree_gc needs to update many alloc keys, we're O(n^2).
Fortunately, the updates btree_gc does happen in sequential order,
which means a gap buffer works nicely here - this patch implements a gap
buffer for journal keys.
Signed-off-by: Kent Overstreet <kent.overstreet@gmail.com>
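A minimal gap buffer sketch to illustrate why sequential inserts are
cheap, assuming a fixed-capacity array of ints rather than journal
keys. struct gap_buf and the gap_buf_*() helpers are hypothetical and
not taken from the patch:

```c
#include <stddef.h>
#include <string.h>

#define GAP_BUF_SIZE	16	/* fixed capacity keeps the sketch short */

struct gap_buf {
	int	d[GAP_BUF_SIZE];
	size_t	gap;		/* index where the gap starts */
	size_t	gap_len;	/* number of free slots in the gap */
};

void gap_buf_init(struct gap_buf *b)
{
	b->gap = 0;
	b->gap_len = GAP_BUF_SIZE;
}

size_t gap_buf_nr(const struct gap_buf *b)
{
	return GAP_BUF_SIZE - b->gap_len;
}

/* Map a logical index to its element, skipping over the gap. */
int gap_buf_get(const struct gap_buf *b, size_t idx)
{
	return b->d[idx < b->gap ? idx : idx + b->gap_len];
}

/* Move the gap so that it starts at logical index @pos. */
void gap_buf_move_gap(struct gap_buf *b, size_t pos)
{
	if (pos < b->gap)
		/* Shift elements [pos, gap) up to the far side of the gap. */
		memmove(&b->d[pos + b->gap_len], &b->d[pos],
			(b->gap - pos) * sizeof(b->d[0]));
	else if (pos > b->gap)
		/* Shift the elements just after the gap down into it. */
		memmove(&b->d[b->gap], &b->d[b->gap + b->gap_len],
			(pos - b->gap) * sizeof(b->d[0]));
	b->gap = pos;
}

/* Insert @v at logical index @pos; returns 0 on success, -1 on failure. */
int gap_buf_insert(struct gap_buf *b, size_t pos, int v)
{
	if (!b->gap_len || pos > gap_buf_nr(b))
		return -1;

	gap_buf_move_gap(b, pos);
	b->d[b->gap++] = v;
	b->gap_len--;
	return 0;
}
```

When each insert lands at or just after the previous one, the gap is
already in place and gap_buf_move_gap() moves few or no elements; only
an insert far from the gap pays for a large memmove().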
|
|
This behavior dates from the early, early days of bcache, and upon
further delving appears to not make any sense. The shrinker only works
in terms of 'objects' of unknown size; normalizing to pages only had the
effect of changing the batch size, which we could do directly - if we
wanted; we probably don't. Normalizing to pages meant our batch size was
very small, which seems to have been keeping us from doing as much
shrinking as we should be under heavy memory pressure; this patch
appears to alleviate some OOMs we've been seeing.
Signed-off-by: Kent Overstreet <kent.overstreet@gmail.com>
|
|
Signed-off-by: Kent Overstreet <kent.overstreet@gmail.com>
|
|
mark_stripe_bucket() was busted; it was using @new uninitialized.
Also, clean up all the gc mark functions, and convert them to the same
style.
Signed-off-by: Kent Overstreet <kent.overstreet@gmail.com>
|
|
This neatly avoids bugs where we fail partway through initializing a new
filesystem, if we just don't write out partly-initialized state.
Signed-off-by: Kent Overstreet <kent.overstreet@gmail.com>
|
|
With printbufs, it's now easy to build up multi-line log messages and
emit them with one call, which is good because it prevents multiple
multi-line log messages from getting interspersed in the log buffer;
this patch also improves the formatting and converts it to latest style.
Signed-off-by: Kent Overstreet <kent.overstreet@gmail.com>
|
|
Trivial cleanup.
Signed-off-by: Kent Overstreet <kent.overstreet@gmail.com>
|
|
In a few places we were passing a variable to pr_buf() for the format
string - oops.
Signed-off-by: Kent Overstreet <kent.overstreet@gmail.com>
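A short illustration of the class of bug described above, using a
hypothetical userspace helper in place of pr_buf(). struct outbuf,
out_printf() and out_str() are made up for this sketch, and the format
attribute is the GCC/Clang spelling:

```c
#include <stdarg.h>
#include <stddef.h>
#include <stdio.h>

/* Hypothetical stand-in for a printbuf-style output buffer. */
struct outbuf {
	char	buf[128];
	size_t	pos;	/* invariant: pos < sizeof(buf) */
};

/* The attribute lets the compiler check format strings at call sites. */
void out_printf(struct outbuf *out, const char *fmt, ...)
	__attribute__((format(printf, 2, 3)));

void out_printf(struct outbuf *out, const char *fmt, ...)
{
	size_t space = sizeof(out->buf) - out->pos;
	va_list args;
	int n;

	va_start(args, fmt);
	n = vsnprintf(out->buf + out->pos, space, fmt, args);
	va_end(args);

	/* On truncation, stop just before the terminating NUL. */
	if (n > 0)
		out->pos += (size_t) n < space ? (size_t) n : space - 1;
}

/* Append an arbitrary string. */
void out_str(struct outbuf *out, const char *s)
{
	/*
	 * Wrong: out_printf(out, s);
	 * Any '%' in @s would be parsed as a conversion specifier.
	 */
	out_printf(out, "%s", s);
}
```

Passing the string as an argument to a constant "%s" format means its
contents are copied verbatim, whatever characters it contains.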
|
|
- bch2_clear_need_discard() was using bch2_trans_relock() incorrectly,
and always bailing out before doing any work - ouch.
- Add a tracepoint that fires every time bch2_do_discards() runs, and
tells us about the work it did
- When too many buckets aren't able to be discarded because they need a
journal commit, bch2_do_discards now flushes the journal.
Signed-off-by: Kent Overstreet <kent.overstreet@gmail.com>
|
|
This patch introduces bch2_alloc_to_v4_mut() which returns a
bkey_i_alloc_v4 *, which then can be passed to bch2_trans_update()
directly.
Signed-off-by: Kent Overstreet <kent.overstreet@gmail.com>
|
|
This introduces a new alloc key which doesn't use varints. Soon we'll be
adding backpointers and storing them in alloc keys, which means our
pack/unpack workflow for alloc keys won't really work - we'll need to be
mutating alloc keys in place.
Instead of bch2_alloc_unpack(), we now have bch2_alloc_to_v4() that
converts older types of alloc keys to v4 if needed.
Signed-off-by: Kent Overstreet <kent.overstreet@gmail.com>
|
|
For backpointers, we'll need to delete old backpointers before adding
new backpointers - otherwise we'll run into spurious duplicate
backpointer errors.
Signed-off-by: Kent Overstreet <kent.overstreet@gmail.com>
|
|
For backpointers, we need to switch the order triggers are run in: we
need to run triggers for deletions/overwrites before triggers for
inserts.
To avoid breaking the reflink triggers, this patch moves deleting of
indirect extents with refcount=0 to their triggers, instead of doing it
when we update those keys.
Signed-off-by: Kent Overstreet <kent.overstreet@gmail.com>
|
|
Print bucket:offset when the filesystem is online; this makes debugging
easier when correlating with alloc updates.
Signed-off-by: Kent Overstreet <kent.overstreet@gmail.com>
|
|
Add a new helper for logging messages to the journal - a new debugging
tool, an alternative to trace_printk().
Signed-off-by: Kent Overstreet <kent.overstreet@gmail.com>
|
|
Signed-off-by: Kent Overstreet <kent.overstreet@gmail.com>
|
|
This fixes a bug where __bch2_btree_node_update_key() wasn't clearing
should_be_locked, leading to bch2_btree_path_traverse() always failing -
all callers of btree_path_make_mut() want should_be_locked cleared, so
do it there.
Signed-off-by: Kent Overstreet <kent.overstreet@gmail.com>
|