path: root/fs
Age Commit message Author
2022-06-17bcachefs: Redo data_update interfacedata_updateKent Overstreet
This patch significantly cleans up and simplifies the data_update interface. Instead of only being able to specify a single pointer by device to rewrite, we're now able to specify any or all of the pointers in the original extent to be rewritten, as a bitmask. data_cmd is no more: the various pred functions now just return true if the extent should be moved/updated. All the data_update path does is rewrite existing replicas, or add new ones. This fixes a bug with background compression on replicated filesystems, where rebalance -> data_update would drop the wrong old replica and keep trying to recompress an extent pointer, each time failing to drop the right replica. Oops. Now, the data update path doesn't look at the io options to decide which pointers to keep and which to drop - it only goes off of the data_update_options passed to it. Signed-off-by: Kent Overstreet <kent.overstreet@gmail.com>
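The bitmask idea described above can be illustrated with a small user-space sketch. Everything here is a hypothetical stand-in (the struct names, the field names, and both helpers are assumptions made for illustration, not the bcachefs definitions): bit i of the mask selects pointer i of the extent, so a caller can request any subset of replicas to be rewritten.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>

/* Hypothetical stand-ins for illustration; not the bcachefs structures. */
struct extent_ptr {
	unsigned dev;
	bool     cached;
};

struct update_opts {
	unsigned rewrite_ptrs;   /* bit i set => rewrite pointer i of the extent */
	unsigned extra_replicas; /* additional replicas to add */
};

/* Build a mask selecting every pointer that lives on device `dev`. */
static unsigned ptrs_on_dev(const struct extent_ptr *ptrs, size_t nr, unsigned dev)
{
	unsigned mask = 0;

	/* The mask cannot describe more pointers than it has bits. */
	assert(nr <= sizeof(mask) * 8);

	for (size_t i = 0; i < nr; i++)
		if (ptrs[i].dev == dev)
			mask |= 1U << i;
	return mask;
}

/* Replicas the update would write: rewritten ones plus newly added ones. */
static unsigned replicas_to_write(const struct update_opts *opts)
{
	/* __builtin_popcount is a GCC/Clang builtin. */
	return (unsigned) __builtin_popcount(opts->rewrite_ptrs) +
		opts->extra_replicas;
}
```

With a mask, "rewrite the copy on one device" and "rewrite every copy" are the same call with different bits set, which is presumably what lets the predicate functions shrink to a yes/no answer.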
2022-06-17bcachefs: Improve an error messageKent Overstreet
When inserting a key type that's not valid for a given btree, we should print out which btree we were inserting into. Signed-off-by: Kent Overstreet <kent.overstreet@gmail.com>
2022-06-17bcachefs: Fix assertion in bch2_dev_list_add_dev()Kent Overstreet
We were only allowing 4 devices in a dev_list, not 16. Signed-off-by: Kent Overstreet <kent.overstreet@gmail.com>
2022-06-17bcachefs: Improve "copygc requested to run" error messageKent Overstreet
This improves the "copygc requested to run but no buckets found" error message to show the device that copygc needs to run on - we'll definitely need to improve this more. Signed-off-by: Kent Overstreet <kent.overstreet@gmail.com>
2022-06-16bcachefs: Pull out data_update.cKent Overstreet
This is the start of reorganizing the data IO paths. The plan is to also break apart io.c into data_read.c and data_write.c, and migrate_write will be renamed to the data_update path. Signed-off-by: Kent Overstreet <kent.overstreet@gmail.com>
2022-06-16bcachefs: Split out dev_buckets_free()Kent Overstreet
Previously, dev_buckets_available() only counted buckets that are eligible to be allocated right now - i.e. buckets that don't have cached data, or need discard, or need gc gens, etc. But most users of this function want to know how many buckets are eligible to be allocated from without moving data around - copygc, allocator striping, which means we should be including cached data buckets etc. Signed-off-by: Kent Overstreet <kent.overstreet@gmail.com>
2022-06-16bcachefs: Copygc now uses backpointersKent Overstreet
Previously, copygc needed to walk the entire extents & reflink btrees to find extents that needed to be moved. Now that we have backpointers, this patch implements bch2_evacuate_bucket() in the move code, which copygc now uses for evacuating mostly empty buckets. Also, thanks to the new backpointers code, copygc can now move btree nodes. Signed-off-by: Kent Overstreet <kent.overstreet@gmail.com>
2022-06-16bcachefs: New on disk format: BackpointersKent Overstreet
This patch adds backpointers: we now have a reverse index from device and offset on that device (specifically, offset within a bucket) back to btree nodes and (non cached) data extents. The first 40 backpointers within a bucket are stored in the alloc key; after that backpointers spill over to the next backpointers btree. This is to help avoid performance regressions from additional btree updates on large streaming workloads. This patch adds all the code for creating, checking and repairing backpointers. The next patch in the series is going to use backpointers for copygc - finally getting rid of the need to scan all extents to do copygc. Signed-off-by: Kent Overstreet <kent.overstreet@gmail.com>
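A minimal sketch of the "inline first, then spill" layout described above. The limit of 40 comes from the commit message; the struct layout, the fixed-size spill array (standing in for the separate backpointers btree), and the function name are assumptions made for illustration only.

```c
#include <assert.h>
#include <stddef.h>

#define INLINE_BACKPOINTERS_MAX 40   /* figure taken from the commit message */
#define SPILL_MAX               256  /* arbitrary size for this sketch */

/* Hypothetical reverse-index entry: where in the bucket, and what points there. */
struct backpointer {
	unsigned long long bucket_offset;
	unsigned long long target;
};

struct bucket_index {
	struct backpointer inline_bps[INLINE_BACKPOINTERS_MAX]; /* "in the alloc key" */
	size_t             nr_inline;
	struct backpointer spill[SPILL_MAX];                    /* stands in for the btree */
	size_t             nr_spill;
};

/*
 * Add a backpointer. Returns 0 if it was stored inline, 1 if it spilled
 * over, and -1 if the sketch's spill area is full.
 */
static int backpointer_add(struct bucket_index *b, struct backpointer bp)
{
	assert(b != NULL);

	if (b->nr_inline < INLINE_BACKPOINTERS_MAX) {
		b->inline_bps[b->nr_inline++] = bp;
		return 0;
	}
	if (b->nr_spill < SPILL_MAX) {
		b->spill[b->nr_spill++] = bp;
		return 1;
	}
	return -1;
}
```

The point of the inline area is that the common case touches only a structure that is already being updated, so the extra index does not add a second update for every write.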
2022-06-16bcachefs: Increase max size for btree_trans bump allocatorKent Overstreet
With backpointers, alloc keys have gotten bigger, so we're needing more memory here. We're probably going to need to go with something more sophisticated than a bump allocator, but - let's see if we can avoid doing that just yet. Signed-off-by: Kent Overstreet <kent.overstreet@gmail.com>
2022-06-16bcachefs: Reimplement repair for overlapping extentsDaniel Hill
Repair now checks if overlapping extents exist in the same snapshot and calls update_trans_update_extent to do the repair work. Signed-off-by: Daniel Hill <daniel@gluo.nz>
2022-06-16bcachefs: Add a persistent counter for bucket discardsKent Overstreet
Like the previous patch for bucket invalidates, add another counter for a core allocator path. Signed-off-by: Kent Overstreet <kent.overstreet@gmail.com>
2022-06-16bcachefs: Fix btree node read retriesKent Overstreet
b->written wasn't being reset to 0 in the btree node read retry path, causing decrypting & validation of previously read bsets to not be re-run - ouch. Signed-off-by: Kent Overstreet <kent.overstreet@gmail.com>
2022-06-16bcachefs: Add a persistent counter for bucket invalidationKent Overstreet
Signed-off-by: Kent Overstreet <kent.overstreet@gmail.com>
2022-06-16bcachefs: Call bch2_do_invalidates() when going read writeKent Overstreet
Like bch2_do_discards(), we should check if this needs to be done when going rw. Also, add some sysfs code for debugging bucket invalidation. Signed-off-by: Kent Overstreet <kent.overstreet@gmail.com>
2022-06-16bcachefs: Improved human readable integer parsingKent Overstreet
Printbufs recently switched to using string_get_size() for printing integers in human readable units. This updates __bch2_strtoh() to parse numbers printed by string_get_size() - we now have to handle floating point numbers, and new unit suffixes. Signed-off-by: Kent Overstreet <kent.overstreet@gmail.com>
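As a rough illustration of parsing a number with a fractional part and a unit suffix, here is an integer-only user-space sketch. It is not `__bch2_strtoh()`: the function name, the accepted suffix set ("kMGT"), and the choice of 1024 as the unit base are all assumptions made for the example.

```c
#include <assert.h>
#include <ctype.h>
#include <stdint.h>
#include <string.h>

/*
 * Parse strings such as "4096", "1.5k" or "2M" into an integer.
 * Returns 0 on success, -1 on malformed input or overflow.
 * Assumed for this sketch: suffixes k/M/G/T, each a factor of 1024.
 */
static int parse_human(const char *s, uint64_t *res)
{
	static const char units[] = "kMGT";
	uint64_t whole = 0, frac = 0, frac_div = 1, mult = 1;

	assert(s != NULL && res != NULL);

	if (!isdigit((unsigned char) *s))
		return -1;
	while (isdigit((unsigned char) *s)) {
		if (whole > (UINT64_MAX - 9) / 10)
			return -1;
		whole = whole * 10 + (uint64_t) (*s++ - '0');
	}

	if (*s == '.') {
		s++;
		if (!isdigit((unsigned char) *s))
			return -1;
		while (isdigit((unsigned char) *s)) {
			/* Keep at most six fractional digits; ignore the rest. */
			if (frac_div < 1000000) {
				frac = frac * 10 + (uint64_t) (*s - '0');
				frac_div *= 10;
			}
			s++;
		}
	}

	if (*s) {
		const char *u = strchr(units, *s);

		if (!u || s[1] != '\0')
			return -1;
		for (int i = 0; i <= u - units; i++)
			mult *= 1024;
	}

	/* Leave room for the fractional contribution, which is below mult. */
	if (whole > UINT64_MAX / mult - 1)
		return -1;

	*res = whole * mult + frac * mult / frac_div;
	return 0;
}
```

The fraction is kept as a numerator and a power-of-ten denominator so that no floating point arithmetic is needed, which matters in kernel-style code.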
2022-06-16bcachefs: Fix freespace initializationKent Overstreet
bch2_dev_freespace_init() was using __bch2_trans_do() incorrectly, and calling bch2_bucket_do_index() with a stale alloc key. Signed-off-by: Kent Overstreet <kent.overstreet@gmail.com>
2022-06-16bcachefs: shrinker.to_text() methodsKent Overstreet
Signed-off-by: Kent Overstreet <kent.overstreet@gmail.com>
2022-06-16bcachefs: Convert to lib/printbuf.cKent Overstreet
Signed-off-by: Kent Overstreet <kent.overstreet@gmail.com>
2022-06-16d_path: prt_path()Kent Overstreet
This implements a new printbuf version of d_path()/mangle_path(), which will replace the seq_buf version. Signed-off-by: Kent Overstreet <kent.overstreet@gmail.com>
2022-06-07bcachefs: Fix btree node read error pathKent Overstreet
We were forgetting to clear the read_in_flight flag - oops. This also fixes it to not call bch2_fatal_error() before topology repair has had a chance to do its thing. Signed-off-by: Kent Overstreet <kent.overstreet@gmail.com>
2022-06-07bcachefs: Fix btree_and_journal_iterKent Overstreet
We had a bug where btree_and_journal_iter would return the same key twice - after deleting it (perhaps because it was present in both the btree and the journal?) This reworks btree_and_journal_iter to track the current position, much like btree_paths, which makes the logic considerably simpler and more robust. Signed-off-by: Kent Overstreet <kent.overstreet@gmail.com>
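The position-tracking approach can be sketched as a merge over two sorted key sources. This is an illustrative user-space model, not the bcachefs iterator: keys are plain integers, the journal entry type and all names are hypothetical, and the linear scans are kept deliberately simple. The iterator remembers the next position to consider, so a key can never be returned twice, and a journal entry at the same position overrides (or deletes) the btree key.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>

/* Hypothetical journal entry: a key position, possibly marked deleted. */
struct journal_key {
	int  pos;
	bool deleted;
};

struct merged_iter {
	const int                *btree;    /* sorted key positions */
	size_t                    nr_btree;
	const struct journal_key *journal;  /* sorted by pos */
	size_t                    nr_journal;
	int                       pos;      /* next position to consider */
};

/* Store the next live key at or after it->pos in *out; false when done. */
static bool merged_iter_next(struct merged_iter *it, int *out)
{
	assert(it != NULL && out != NULL);

	for (;;) {
		size_t b = 0, j = 0;

		/* Skip everything strictly before the current position. */
		while (b < it->nr_btree && it->btree[b] < it->pos)
			b++;
		while (j < it->nr_journal && it->journal[j].pos < it->pos)
			j++;

		bool have_b = b < it->nr_btree;
		bool have_j = j < it->nr_journal;

		if (!have_b && !have_j)
			return false;

		/* On a tie the journal wins: it holds the newer version. */
		if (have_j && (!have_b || it->journal[j].pos <= it->btree[b])) {
			it->pos = it->journal[j].pos + 1;
			if (it->journal[j].deleted)
				continue;   /* hides any btree key at this position */
			*out = it->journal[j].pos;
			return true;
		}

		it->pos = it->btree[b] + 1;
		*out = it->btree[b];
		return true;
	}
}
```

Because the position always advances past whatever was just considered, a key that was deleted in the journal cannot resurface from the btree side.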
2022-06-07fixup! bcachefs: Gap buffer for journal keysKent Overstreet
2022-06-06bcachefs: Fix for cmd_list_journalKent Overstreet
cmd_list_journal wasn't correctly listing the most recent journal entries as blacklisted - because in the recovery path when just reading the journal, we were failing to add those to the blacklist table. Signed-off-by: Kent Overstreet <kent.overstreet@gmail.com>
2022-06-05bcachefs: Also log overwrites in journalKent Overstreet
Lately we've been doing a lot of debugging by looking at the journal to see what was changed, and by what code path. This patch adds a new journal entry type for recording overwrites, so that we don't have to search backwards through the journal to see what was being overwritten in order to work out what the triggers were supposed to be doing. Signed-off-by: Kent Overstreet <kent.overstreet@gmail.com>
2022-06-05bcachefs: Refactor journal entry addingKent Overstreet
This takes copying the payload out of bch2_journal_add_entry(), which means we can use it for journal_transaction_name() - also prep work for journalling overwrites. Signed-off-by: Kent Overstreet <kent.overstreet@gmail.com>
2022-06-05bcachefs: Btree key cache coherencyKent Overstreet
This is the last piece for btree key cache coherency: We already have:
- btree iterator code checks the key cache when iterating over a cached btree
- update path ensures that updates go to the key cache when updating a cached btree
But for iterating over a cached btree to work, we need to ensure that if a key exists in the key cache, it also exists in the btree - otherwise the iterator code will skip past it and not check the key cache. This patch implements that last piece: on a key cache update, if creating a new key, we now also update the underlying btree. This fixes a device removal bug, where deleting alloc info wasn't correctly deleting all keys associated with a given device. It also means we should be able to re-enable the key cache for inodes. Signed-off-by: Kent Overstreet <kent.overstreet@gmail.com>
2022-06-04bcachefs: Add some missing error messagesKent Overstreet
bch2_opt_parse() was failing to generate error messages in its error paths. Signed-off-by: Kent Overstreet <kent.overstreet@gmail.com>
2022-06-03bcachefs: Fix memory corruption in encryption pathKent Overstreet
When do_encrypt() was passed a vmalloc address and the buffer spanned more than a single page, we were encrypting/decrypting completely different pages than the ones intended. Signed-off-by: Kent Overstreet <kent.overstreet@gmail.com>
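The underlying issue is that vmalloc memory is contiguous only in virtual address space, so a buffer that crosses a page boundary has to be handled one page at a time. The sketch below shows just the splitting arithmetic in user space; the page size constant, the struct, and the function name are assumptions for illustration, and the real per-page lookup (for example via `vmalloc_to_page()`) is not modelled.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

#define SKETCH_PAGE_SIZE 4096u  /* assumed page size for this sketch */

struct chunk {
	uintptr_t addr;
	size_t    len;
};

/*
 * Split [addr, addr + len) into chunks that never cross a page boundary.
 * Returns the number of chunks written to `out`, or -1 if `max` is too small.
 */
static int split_by_page(uintptr_t addr, size_t len, struct chunk *out, size_t max)
{
	size_t n = 0;

	assert(out != NULL);

	while (len) {
		size_t offset   = addr & (SKETCH_PAGE_SIZE - 1);
		size_t this_len = SKETCH_PAGE_SIZE - offset;

		if (this_len > len)
			this_len = len;
		if (n == max)
			return -1;

		out[n].addr = addr;
		out[n].len  = this_len;
		n++;

		addr += this_len;
		len  -= this_len;
	}
	return (int) n;
}
```

Each resulting chunk lies within a single page, so it can be translated to its backing page independently instead of assuming that the pages following the first one are physically adjacent.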
2022-05-30fixup! bcachefs: Add persistent countersKent Overstreet
2022-05-30bcachefs: bch2_trans_reset_updates()Kent Overstreet
Factor out a new helper. Signed-off-by: Kent Overstreet <kent.overstreet@gmail.com>
2022-05-30bcachefs: Fix error checking in bch2_fs_alloc()Kent Overstreet
One of the init calls had a ; instead of a ?:, and errors after that got dropped - oops. Signed-off-by: Kent Overstreet <kent.overstreet@gmail.com>
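This class of bug is easy to reproduce in isolation. The sketch below uses the GNU C `a ?: b` extension (supported by GCC and Clang), which evaluates to `a` if it is nonzero and otherwise to `b`. The init functions are hypothetical stand-ins and the error value is arbitrary.

```c
#include <assert.h>

static int calls;

/* Hypothetical init steps: one succeeds, one fails with a negative code. */
static int init_ok(void)   { calls++; return 0; }
static int init_fail(void) { calls++; return -12; }

/* Correct: the first nonzero result ends the chain and is returned. */
static int alloc_chained(void)
{
	return init_ok() ?:
	       init_fail() ?:
	       init_ok();
}

/* Broken: a stray ';' ends the chain early. */
static int alloc_broken(void)
{
	int ret = init_ok() ?:
		  init_ok();   /* ';' here instead of '?:' */
	init_fail();           /* still runs, but its error is discarded */
	return ret;
}
```

Both versions compile cleanly because a bare function call is a valid statement, which is why the mistake is hard to spot by eye: `alloc_chained()` reports the failure while `alloc_broken()` reports success.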
2022-05-30bcachefs: Print message on btree node read retry successKent Overstreet
Right now, we print an error message on btree node read error, and we print that we're retrying, but we don't explicitly say if the retry succeeded - this makes things a little clearer. Signed-off-by: Kent Overstreet <kent.overstreet@gmail.com>
2022-05-30bcachefs: journal_transaction_name is now always onKent Overstreet
We want this option always enabled - it costs practically nothing and it's an essential debugging tool. It's enabled by default, but old filesystems are still out there that need debugging, so let's just force it on and kill the option. Signed-off-by: Kent Overstreet <kent.overstreet@gmail.com>
2022-05-30bcachefs: Fix journal_keys_search() overheadKent Overstreet
Previously, on every btree_iter_peek() operation we were searching the journal keys, doing a full binary search - which was slow. This patch fixes that by saving our position in the journal keys, so that we only do a full binary search when moving our position backwards or a large jump forwards. Signed-off-by: Kent Overstreet <kent.overstreet@gmail.com>
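One way to picture the saved-position optimisation is a cursor over a sorted array. This is a sketch under stated assumptions, not `journal_keys_search()`: the keys are plain integers, the names are hypothetical, and the threshold of four linear steps for "a large jump forwards" is arbitrary.

```c
#include <assert.h>
#include <stddef.h>

#define MAX_LINEAR_STEPS 4  /* arbitrary threshold for this sketch */

struct key_cursor {
	const int *keys;  /* sorted ascending */
	size_t     nr;
	size_t     idx;   /* saved result of the previous search */
};

/* Full binary search: index of the first key >= pos (nr if none). */
static size_t lower_bound(const int *keys, size_t nr, int pos)
{
	size_t lo = 0, hi = nr;

	while (lo < hi) {
		size_t mid = lo + (hi - lo) / 2;

		if (keys[mid] < pos)
			lo = mid + 1;
		else
			hi = mid;
	}
	return lo;
}

/* Index of the first key >= pos, reusing the saved position when possible. */
static size_t cursor_seek(struct key_cursor *c, int pos)
{
	assert(c != NULL && c->idx <= c->nr);

	/* The saved index is a valid starting point only when moving forwards. */
	if (c->idx == 0 || c->keys[c->idx - 1] < pos) {
		size_t i = c->idx;

		for (unsigned steps = 0; steps < MAX_LINEAR_STEPS; steps++, i++)
			if (i == c->nr || c->keys[i] >= pos) {
				c->idx = i;
				return i;
			}
	}

	/* Moving backwards, or jumping far ahead: fall back to a full search. */
	c->idx = lower_bound(c->keys, c->nr, pos);
	return c->idx;
}
```

For an iterator that mostly walks forward key by key, nearly every lookup resolves within the first step or two of the linear scan.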
2022-05-30bcachefs: Always print when doing journal replay in fsckKent Overstreet
This logging improvement helps see when the previous fsck pass has completed. Signed-off-by: Kent Overstreet <kent.overstreet@gmail.com>
2022-05-30bcachefs: Rename group to label for remaining strings.Daniel Hill
Signed-off-by: Daniel Hill <daniel@gluo.nz>
2022-05-30bcachefs: Fix encryption path on armKent Overstreet
flush_dcache_page() is not a noop on arm, but we were using virt_to_page() instead of vmalloc_to_page() for an address on the kernel stack - vmalloc memory, leading to an oops in flush_dcache_page(). Signed-off-by: Kent Overstreet <kent.overstreet@gmail.com>
2022-05-30bcachefs: Switch to key_type_user, not logonKent Overstreet
The only difference between key_type_logon and key_type_user is that key_type_logon keys can't be read by userspace. However, userspace has actually been adding keys to both the logon and user keychains, because userspace fsck requires the keychain interface - so we might as well just use user and drop the logon keychain. Signed-off-by: Kent Overstreet <kent.overstreet@gmail.com>
2022-05-30bcachefs: LRU repair tweaksKent Overstreet
- Drop old unneeded parameter for whether we're in initial GC - which was from when btree updates had to be done differently before we went RW. Signed-off-by: Kent Overstreet <kent.overstreet@gmail.com>
2022-05-30bcachefs: Delete bch_writepageKent Overstreet
Per Dave Chinner and the xfs folks, .writepage is no longer needed, and it's better not to define it if .writepages is the intended path. Signed-off-by: Kent Overstreet <kent.overstreet@gmail.com>
2022-05-30bcachefs: Make bch_option compatible with Rust ffiBrett Holman
Rust FFI lacks support for unnamed structs and unions. The space saved in bch_option is not enough to be significant. Signed-off-by: Brett Holman <bholman.devel@gmail.com>
2022-05-30bcachefs: Put btree_trans_verify_sorted() behind debug_check_iteratorsKent Overstreet
This is pretty expensive, and we've tested sufficiently with it now that it doesn't need to be on by default. Signed-off-by: Kent Overstreet <kent.overstreet@gmail.com>
2022-05-30bcachefs: Fix extent mergingKent Overstreet
When merging extents, we have to check that we won't overflow size fields in any CRC entries - but the check for this was wrong, because in the loop it was in we weren't keeping a pointer to the (packed, encoded) CRC field. Fix this by moving it to its own loop. Signed-off-by: Kent Overstreet <kent.overstreet@gmail.com>
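The kind of check described above can be sketched as follows. The entry layout and field widths are hypothetical (the real CRC entries are packed and come in several encodings); the sketch only shows the idea that every entry's size field must be able to hold the merged size before a merge is allowed.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>
#include <stdint.h>

/* Hypothetical CRC entry: a size stored in a field that is `bits` wide. */
struct crc_entry {
	unsigned bits;
	uint64_t size;
};

/*
 * Return true if growing every entry's size by `extra` still fits in that
 * entry's size field.
 */
static bool merge_fits(const struct crc_entry *crcs, size_t nr, uint64_t extra)
{
	for (size_t i = 0; i < nr; i++) {
		assert(crcs[i].bits > 0 && crcs[i].bits < 64);

		uint64_t max = (UINT64_C(1) << crcs[i].bits) - 1;

		/* Written to avoid overflowing the addition itself. */
		if (crcs[i].size > max || extra > max - crcs[i].size)
			return false;
	}
	return true;
}
```

Running the check in its own loop over all entries, before any merging work starts, mirrors the fix described in the commit message.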
2022-05-30bcachefs: Improve invalid bkey error messageKent Overstreet
Bkeys have gotten a lot bigger since this code was written and now are often formatted across multiple lines - while the reason a bkey is invalid will still be short and fit on a single line. This patch prints the error before the bkey, making it a bit more readable. Signed-off-by: Kent Overstreet <kent.overstreet@gmail.com>
2022-05-30bcachefs: Fix journal_iters_fix()Kent Overstreet
journal_iters_fix() was incorrectly rewinding iterators past keys they had already returned, leading to those keys being double counted in the bch2_gc() path - oops. Signed-off-by: Kent Overstreet <kent.overstreet@gmail.com>
2022-05-30bcachefs: Go RW before bch2_check_lrus()Kent Overstreet
btree updates before going RW are expensive if they're in random order, since they use the list of keys for journal replay to insert, which is just a gap buffer. This patch improves the bucket invalidate path so that if bch2_check_lrus() hasn't finished it only prints warnings instead of doing an emergency shutdown, which means we can now set BCH_FS_MAY_GO_RW before bch2_check_lrus(). Also, the filesystem state bits are reorganized a bit. Signed-off-by: Kent Overstreet <kent.overstreet@gmail.com>
2022-05-30bcachefs: Add persistent countersDaniel Hill
This adds a new superblock field for persisting counters and adds a sysfs interface in counters/ exposing these counters. The superblock field is ignored by older versions letting us avoid an on disk version bump. Each sysfs file outputs a counter that tracks since filesystem creation and a counter for the current mount session. Signed-off-by: Daniel Hill <daniel@gluo.nz>
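A minimal sketch of how one stored value can serve both outputs, assuming the "current mount session" figure is derived from a snapshot taken at mount time. The struct and function names are hypothetical, and reading and writing the superblock field is not modelled.

```c
#include <assert.h>
#include <stdint.h>

/* Hypothetical counter: one persisted total plus a mount-time snapshot. */
struct persistent_counter {
	uint64_t total;     /* events since filesystem creation (persisted) */
	uint64_t at_mount;  /* value of `total` when the filesystem was mounted */
};

/* Called at mount with the value read back from persistent storage. */
static void counter_mount(struct persistent_counter *c, uint64_t persisted)
{
	c->total    = persisted;
	c->at_mount = persisted;
}

static void counter_inc(struct persistent_counter *c)
{
	c->total++;
}

static uint64_t counter_since_creation(const struct persistent_counter *c)
{
	return c->total;
}

static uint64_t counter_since_mount(const struct persistent_counter *c)
{
	assert(c->total >= c->at_mount);
	return c->total - c->at_mount;
}
```

Only `total` needs to be written back, so the persisted data stays a flat array of integers that software unaware of the field can skip.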
2022-05-30bcachefs: Tracepoint improvementsKent Overstreet
Delete some obsolete tracepoints, organize alloc tracepoints better, make a few tracepoints more consistent. Signed-off-by: Kent Overstreet <kent.overstreet@gmail.com>
2022-05-30bcachefs: Don't kick journal reclaim unless low on spaceKent Overstreet
We shouldn't kick journal reclaim unnecessarily; it's got its own timer for that. Signed-off-by: Kent Overstreet <kent.overstreet@gmail.com>
2022-05-30bcachefs: Lock ordering fixKent Overstreet
Can't take btree node locks while holding btree_reserve_cache_lock - it would be nice if we could check this with lockdep. Signed-off-by: Kent Overstreet <kent.overstreet@gmail.com>