author Kent Overstreet <kent.overstreet@gmail.com> 2016-02-15 17:06:53 -0900
committer Kent Overstreet <kent.overstreet@gmail.com> 2016-02-15 17:06:53 -0900
commit 703ebbdf1954ceabbf84f45df0cc812f58ae7b0e
tree 0607f58579b55bc4b9e28c973160a6d3717760ee
parent 36893bac7f2418bbaa8466eb363ee3d99262e9b8
todo
-rw-r--r-- Todo.mdwn 27
1 files changed, 27 insertions, 0 deletions
diff --git a/Todo.mdwn b/Todo.mdwn
index e343aef..0710301 100644
--- a/Todo.mdwn
+++ b/Todo.mdwn
@@ -57,3 +57,30 @@ bcache/bcachefs todo list:
as buckets get emptied out. should figure out where to hook this in.
* bcachefs - journal unused inode hint, or otherwise improve inode allocation
+
+ * Idea for improving the situation with btree nodes + large buckets:
+
+ This is probably only needed for running on raw flash, but might be worth
+ doing even without raw flash - it'd reduce internal fragmentation on disk
+ from the btree when we're using large buckets:
+
+ The current situation is that each btree node is a contiguous chunk of disk
+ space, used as a log - when the log fills up, the btree node is full and we
+ split or rewrite it. Each btree node is, on disk, its own individual log.
+ When the btree node size is smaller than the bucket size we allocate multiple
+ btree nodes per bucket - but this means that the entire bucket is not written
+ to sequentially (as the different btree nodes within the bucket will be
+ appended to in different orders), and the entire bucket won't be reused until
+ every btree node within the bucket has been freed.
+
+ Idea: have a second journal for the btree. The different bsets (log entries)
+ for a particular btree node would no longer be contiguous on disk - whenever
+ any btree node write happens, it goes to the tail of the journal. This would
+ not affect anything about how btree nodes work in memory - only the on disk
+ layout would change.
+
+ We'd have to maintain a map of where the bsets currently live on disk for any
+ given btree node. At runtime this map would presumably just be pinned in
+ memory - it wouldn't be very big, but then we'd have to have provisions for
+ regenerating it at startup (either just scan the entire btree journal, or
+ maintain this map in the primary journal).
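
The second-journal idea in the added text can be sketched as a toy model. This is only an illustration in Python, not bcache code: the class `BtreeJournal`, its method names, and the in-memory list standing in for the on-disk log are all hypothetical and chosen for the example. It shows bsets from different btree nodes being appended to one shared tail, a per-node map of where those bsets live, and regeneration of that map by scanning the whole btree journal (the first of the two startup options mentioned above).

```python
from collections import defaultdict


class BtreeJournal:
    """Toy model of one append-only journal shared by all btree nodes.

    All names here are hypothetical; a Python list stands in for the disk.
    """

    def __init__(self):
        self.log = []                      # "on disk": list of (node_id, bset)
        self.bset_map = defaultdict(list)  # node_id -> offsets of its bsets

    def write_bset(self, node_id, bset):
        """Append a bset at the journal tail and record where it landed."""
        offset = len(self.log)
        self.log.append((node_id, bset))
        self.bset_map[node_id].append(offset)
        return offset

    def read_node(self, node_id):
        """Collect a node's bsets, oldest first, from wherever they sit."""
        return [self.log[off][1] for off in self.bset_map.get(node_id, [])]

    def rebuild_map(self):
        """Regenerate the map at startup by scanning the entire journal."""
        rebuilt = defaultdict(list)
        for offset, (node_id, _bset) in enumerate(self.log):
            rebuilt[node_id].append(offset)
        return rebuilt


journal = BtreeJournal()
journal.write_bset("node_a", ["key1"])
journal.write_bset("node_b", ["key2"])
journal.write_bset("node_a", ["key3"])  # not contiguous with node_a's first bset
print(journal.read_node("node_a"))
```

In this sketch every write is sequential regardless of which node it belongs to, and the cost is the extra indirection through `bset_map`. Keeping that map in the primary journal instead of rescanning would replace `rebuild_map` with a replay of logged map updates; that variant is not shown here.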