summaryrefslogtreecommitdiff
path: root/fs
AgeCommit message (Collapse)Author
2022-03-12bcachefs: Journal seq now incremented at entry open, not closeKent Overstreet
This patch changes journal_entry_open() to initialize the new journal entry, not __journal_entry_close(). This also means that journal_cur_seq() refers to the sequence number of the last journal entry when we don't have an open journal entry, not the next one. Signed-off-by: Kent Overstreet <kent.overstreet@gmail.com>
2022-03-12bcachefs: Drop unneeded journal pin in bch2_btree_update_start()Kent Overstreet
When we do an interior btree update, we create new btree nodes and link them into the btree in memory, but they don't become reachable on disk until later, when btree_update_nodes_written_trans() runs. Updates to the new nodes can thus happen before they're reachable on disk, and if the updates to those new nodes are written before the nodes become reachable, we would then drop the journal pin for those updates before the btree has them. This is what the journal pin in bch2_btree_update_start() was protecting against. However, it's not actually needed because we don't allow subsequent append writes to btree nodes until the node is reachable on disk. Dropping this unneeded pin also fixes a bug introduced by "bcachefs: Journal seq now incremented at entry open, not close" - in the new code, if the journal is completely empty a journal pin list for journal_cur_seq() won't exist. Signed-off-by: Kent Overstreet <kent.overstreet@gmail.com>
2022-03-12bcachefs: bch2_journal_halt() now takes journal lockKent Overstreet
This change is prep work for moving some work from __journal_entry_close() to journal_entry_open(): without this change, journal_entry_open() doesn't know if it's going to be able to open a new journal entry until the cmpxchg loop, meaning it can't create the new journal pin entry and update other global state because those have to be done prior to the cmpxchg opening the new journal entry. Fortunately, we don't call bch2_journal_halt() from interrupt context. Signed-off-by: Kent Overstreet <kent.overstreet@gmail.com>
2022-03-12bcachefs: Kill JOURNAL_NEED_WRITEKent Overstreet
This replaces the journal flag JOURNAL_NEED_WRITE with per-journal buf state - more explicit, and solving a race in the old code that would lead to entries being opened and written unnecessarily. Signed-off-by: Kent Overstreet <kent.overstreet@gmail.com>
2022-03-12bcachefs: Delete some dead journal codeKent Overstreet
Signed-off-by: Kent Overstreet <kent.overstreet@gmail.com>
2022-03-12bcachefs: Fix a use after freeKent Overstreet
This fixes a regression from "bcachefs: Stash a copy of key being overwritten in btree_insert_entry". In btree_key_can_insert_cached(), we may reallocate the key cache key, invalidating pointers previously returned by peek() - fix it by issuing a transaction restart. Signed-off-by: Kent Overstreet <kent.overstreet@gmail.com>
2022-03-12bcachefs: Fix a memory leakKent Overstreet
This fixes a regression from "bcachefs: Heap allocate printbufs" - bch2_sb_field_validate() was leaking an error string. Signed-off-by: Kent Overstreet <kent.overstreet@gmail.com>
2022-03-12bcachefs: Fix race leading to btree node write getting stuckKent Overstreet
Checking btree_node_may_write() isn't atomic with the other btree flags, dirty and need_write in particular. There was a rare race where we'd unblock a node from writing while __btree_node_flush() was setting need_write, and no thread would notice that the node was now both able to write and needed to be written. Fix this by adding btree node flags for will_make_reachable and write_blocked that can be checked in the cmpxchg loop in __bch2_btree_node_write. Signed-off-by: Kent Overstreet <kent.overstreet@gmail.com>
2022-03-12bcachefs: Kill bch2_btree_node_write_cond()Kent Overstreet
bch2_btree_node_write_cond() was only used in one place - this inlines it into __btree_node_flush() and makes the cmpxchg loop actually correct. Signed-off-by: Kent Overstreet <kent.overstreet@gmail.com>
2022-03-12bcachefs: Improve btree_node_write_if_need()Kent Overstreet
btree_node_write_if_need() kicks off a btree node write only if need_write is set; this makes the locking easier to reason about by moving the check into the cmpxchg loop in __bch2_btree_node_write(). Signed-off-by: Kent Overstreet <kent.overstreet@gmail.com>
2022-03-12bcachefs: Fix locking in btree_node_write_done()Kent Overstreet
There was a rare recursive locking bug, in __bch2_btree_node_write() nowrite path -> btree_node_write_done(), in the path that kicks off another write. This splits out an inner __btree_node_write_done() that expects to be run with the btree node lock held. Signed-off-by: Kent Overstreet <kent.overstreet@gmail.com>
2022-03-12bcachefs: Start moving debug info from sysfs to debugfsKent Overstreet
In sysfs, files can only output at most PAGE_SIZE. This is a problem for debug info that needs to list an arbitrary number of times, and because of this limit some of our debug info has been terser and harder to read than we'd like. This patch moves info about journal pins and cached btree nodes to debugfs, and greatly expands and improves the output we return. Signed-off-by: Kent Overstreet <kent.overstreet@gmail.com>
2022-03-12bcachefs: Use x-macros for btree node flagsKent Overstreet
This is for adding an array of strings for btree node flag names. Signed-off-by: Kent Overstreet <kent.overstreet@gmail.com>
2022-03-12bcachefs: Kill BCH_FS_HOLD_BTREE_WRITESKent Overstreet
This was just dead code. Signed-off-by: Kent Overstreet <kent.overstreet@gmail.com>
2022-03-12bcachefs: Fix an assert in the initial GC pathKent Overstreet
With lockdep enabled, we (rarely) hit an assertion in bch2_gc_init_recurse() -> bch2_gc_mark_key() -> bch2_trans_unlock(): this is because initial GC doesn't use btree iterators for btree walking, it does its own thing for various (complicated) reasons - but we don't have to worry about deadlocks because we're only taking read locks. Signed-off-by: Kent Overstreet <kent.overstreet@gmail.com>
2022-03-12bcachefs: Don't spin in journal reclaimKent Overstreet
If we're not able to flush anything, we shouldn't keep looping. Signed-off-by: Kent Overstreet <kent.overstreet@gmail.com>
2022-03-12bcachefs: Fix btree path sortingKent Overstreet
In btree_update_interior.c, we were changing a path's level directly - which affects path sort order - without re-sorting paths, leading to assertions when bch2_path_get() verified paths were sorted correctly. Signed-off-by: Kent Overstreet <kent.overstreet@gmail.com>
2022-03-12bcachefs: Fix journal_flush_done()Kent Overstreet
journal_flush_done() was overwriting did_work, thus occasionally returning false when it did do work and occasional assertions in the shutdown sequence because we didn't completely flush the key cache. Signed-off-by: Kent Overstreet <kent.overstreet@gmail.com>
2022-03-12bcachefs: Heap allocate printbufsKent Overstreet
This patch changes printbufs dynamically allocate and reallocate a buffer as needed. Stack usage has become a bit of a problem, and a major cause of that has been static size string buffers on the stack. The most involved part of this refactoring is that printbufs must now be exited with printbuf_exit(). Signed-off-by: Kent Overstreet <kent.overstreet@gmail.com>
2022-03-12bcachefs: Convert bch2_pd_controller_print_debug() to a printbufKent Overstreet
Fewer random on-stack char arrays. Signed-off-by: Kent Overstreet <kent.overstreet@gmail.com>
2022-03-12bcachefs: Improve debug assertionKent Overstreet
We're hitting a strange bug with transaction paths not being sorted correctly - this dumps transaction paths in the order we thought was sorted, which will hopefully shed some light as to what's going on. Signed-off-by: Kent Overstreet <kent.overstreet@gmail.com>
2022-03-12bcachefs: Fix bch2_journal_pins_to_text()Kent Overstreet
When key cache pins were put onto their own list, we neglected to update bch2_journal_pins_to_text() to print them. Signed-off-by: Kent Overstreet <kent.overstreet@gmail.com>
2022-03-12bcachefs: Always clear should_be_locked in bch2_trans_begin()Kent Overstreet
bch2_trans_begin() invalidates all iterators, until they're revalidated by calling peek() or traverse(). Signed-off-by: Kent Overstreet <kent.overstreet@gmail.com>
2022-03-12bcachefs: Run alloc triggers lastKent Overstreet
Triggers can generate additional btree updates - we need to run alloc triggers after all other triggers have run, because they generate updates for the alloc btree. Signed-off-by: Kent Overstreet <kent.overstreet@gmail.com>
2022-03-12bcachefs: run_one_trigger() now checks journal keysKent Overstreet
Previously, when doing updates and running triggers before journal replay completes, triggers would see the incorrect key for the old key being overwritten - this patch updates the trigger code to check the journal keys when necessary, needed for the upcoming allocator rewrite. Signed-off-by: Kent Overstreet <kent.overstreet@gmail.com>
2022-03-12bcachefs: Stash a copy of key being overwritten in btree_insert_entryKent Overstreet
We currently need to call bch2_btree_path_peek_slot() multiple times in the transaction commit path - and some of those need to be updated to also check the keys from journal replay, too. Let's consolidate this and stash the key being overwritten in btree_insert_entry. Signed-off-by: Kent Overstreet <kent.overstreet@gmail.com>
2022-03-12bcachefs: Consolidate trigger code a bitKent Overstreet
Upcoming patches are doing more work on the triggers code, this patch just moves code around. Signed-off-by: Kent Overstreet <kent.overstreet@gmail.com>
2022-03-12bcachefs: bch2_trans_mark_key() now takes a bkey_i *Kent Overstreet
We're now coming up with triggers that modify the update being done. A bkey_s_c is const - bkey_i is the correct type to be using here. Signed-off-by: Kent Overstreet <kent.overstreet@gmail.com>
2022-03-12bcachefs: Fix 32 bit buildKent Overstreet
vstruct_bytes() was returning a u64 - it should be a size_t, the corect type for the size of anything that fits in memory. Also replace a 64 bit divide with div_u64(). Signed-off-by: Kent Overstreet <kent.overstreet@gmail.com>
2022-03-12bcachefs: Improve some btree node read error messagesKent Overstreet
On btree node read error, it's helpful to see what we were trying to read - was it all zeroes? Signed-off-by: Kent Overstreet <kent.overstreet@gmail.com>
2022-03-12bcachefs: Use unlikely() in err_on() macrosKent Overstreet
Should be obviously a good thing. Signed-off-by: Kent Overstreet <kent.overstreet@gmail.com>
2022-03-12bcachefs: Improve reflink repair codeKent Overstreet
When a reflink pointer points to a missing indirect extent, we replace it with an error key. Instead of replacing the entire reflink pointer with an error key, this patch replaces only the missing range with an error key. Signed-off-by: Kent Overstreet <kent.overstreet@gmail.com>
2022-03-12bcachefs: Normal update/commit path now works before going RWKent Overstreet
This improves __bch2_trans_commit - early in the recovery process, when we're running btree_gc and before we want to go RW, it now uses bch2_journal_key_insert() to add the update to the list of updates for journal replay to do, instead of btree_gc having to use separate interfaces depending on whether we're running at bringup or, later, runtime. Signed-off-by: Kent Overstreet <kent.overstreet@gmail.com>
2022-03-12bcachefs: Revert "Ensure journal doesn't get stuck in nochanges mode"Kent Overstreet
This patch was originally to work around the journal geting stuck in nochanges mode - but that was just a hack, we needed to fix the actual bug. It should be fixed now, so revert it. Signed-off-by: Kent Overstreet <kent.overstreet@gmail.com>
2022-03-12bcachefs: Fix for journal getting stuckKent Overstreet
The journal can get stuck if we need to get a journal reservation for something we have a pre-reservation for, but aren't able to reclaim space, or if the pin fifo is full - it's impractical to resize the pin fifo at runtime. Previously, we reserved 8 entries in the pin fifo for pre-reservations, but that seems small - we're seeing the journal occasionally get stuck. Let's reserve a quarter of it. Signed-off-by: Kent Overstreet <kent.overstreet@gmail.com>
2022-03-12bcachefs: Set BTREE_NODE_SEQ() correctly in merge pathKent Overstreet
BTREE_NODE_SEQ() is supposed to give us a time ordering of btree nodes on disk, so that we can tell which btree node is newer if we ever have to scan the entire device to find btree nodes. The btree node merge path wasn't setting it correctly on the new node - oops. Signed-off-by: Kent Overstreet <kent.overstreet@gmail.com>
2022-03-12bcachefs: Drop journal_write_compact()Kent Overstreet
Long ago it was possible to get a journal reservation and not use it, but that's no longer allowed, which means journal_write_compact() has very little work to do, and isn't really worth the code anymore. Signed-off-by: Kent Overstreet <kent.overstreet@gmail.com>
2022-03-12bcachefs: Add tabstops to printbufsKent Overstreet
Now, when outputting to printbufs, we can set tabstops and left or right justify text to them - this is to be used by the userspace 'bcachefs fs usage' command. Signed-off-by: Kent Overstreet <kent.overstreet@gmail.com>
2022-03-12bcachefs: Fix a use after freeKent Overstreet
In move_read_endio, we were checking if the next pending write has its read completed - but this can turn after a use after free (and we were accessing the list without a lock), so instead just better to just unconditionally do the wakeup. Signed-off-by: Kent Overstreet <kent.overstreet@gmail.com>
2022-03-12bcachefs: Add .to_text() methods for all superblock sectionsKent Overstreet
This patch improves the superblock .to_text() methods and adds methods for all types that were missing them. It also improves printbufs by allowing them to specfiy what units we want to be printing in, and adds new wrapper methods for unifying our kernel and userspace environments. Signed-off-by: Kent Overstreet <kent.overstreet@gmail.com>
2022-03-12bcachefs: Kill bch_scnmemcpy()Kent Overstreet
bch_scnmemcpy was for printing length-limited strings that might not have a terminating null - turns out sprintf & pr_buf can do this with %.*s. Signed-off-by: Kent Overstreet <kent.overstreet@gmail.com>
2022-03-12bcachefs: Don't issue discards when in nochanges modeKent Overstreet
When the nochanges option is selected, we're supposed to never issue writes. Unfortunately, it seems discards were missed when implemnting this, leading to some painful filesystem corruption. Signed-off-by: Kent Overstreet <kent.overstreet@gmail.com>
2022-03-12bcachefs: opts.read_journal_onlyKent Overstreet
Add an option that tells recovery to only read the journal, to be used by the list_journal command. Signed-off-by: Kent Overstreet <kent.overstreet@gmail.com>
2022-03-12bcachefs: Change __bch2_trans_commit() to run triggers then get RWKent Overstreet
This is prep work for the next patch, which is going to change __bch2_trans_commit() to use bch2_journal_key_insert() when very early in the recovery process, so that we have a unified interface for doing btree updates. Signed-off-by: Kent Overstreet <kent.overstreet@gmail.com>
2022-03-12bcachefs: Delete some flag bits that are no longer usedKent Overstreet
Signed-off-by: Kent Overstreet <kent.overstreet@gmail.com>
2022-03-12bcachefs: Store logical location of journal entriesKent Overstreet
When viewing what's in the journal, it's more useful to have the logical location - journal bucket and offset within that bucket - than just the offset on that device. Signed-off-by: Kent Overstreet <kent.overstreet@gmail.com>
2022-03-12bcachefs: Check for errors from crypto_skcipher_encrypt()Kent Overstreet
Apparently it actually is possible for crypto_skcipher_encrypt() to return an error - not sure why that would be - but we need to replace our assertion with actual error handling. Signed-off-by: Kent Overstreet <kent.overstreet@gmail.com>
2022-03-12bcachefs: Fix failure to allocate btree node in cacheKent Overstreet
The error code when we fail to allocate a node in the btree node cache doesn't make it to bch2_btree_path_traverse_all(). Instead, we need to stash a flag in btree_trans so we know we have to take the cannibalize lock. Signed-off-by: Kent Overstreet <kent.overstreet@gmail.com>
2022-03-12bcachefs: Change bch2_dev_lookup() to not use lookup_bdev()Kent Overstreet
bch2_dev_lookup() is used from the extended attribute set methods, for setting the target options, where we're already holding an inode lock - it turns out pathname lookups also take inode locks, so that was susceptible to deadlocks. Fortunately we already stash the device name in ca->name. This does change user-visible behaviour though: instead of specifying e.g. /dev/sda1, user must now specify sda1. Signed-off-by: Kent Overstreet <kent.overstreet@gmail.com>
2022-03-12bcachefs: Only allocate buckets_nouse when requestedKent Overstreet
It's only needed by the migrate tool - this patch adds an option to enable allocating it. Signed-off-by: Kent Overstreet <kent.overstreet@gmail.com>