todo

author: Kent Overstreet <kent.overstreet@gmail.com> 2016-02-15 17:06:53 -0900
committer: Kent Overstreet <kent.overstreet@gmail.com> 2016-02-15 17:06:53 -0900
commit: 703ebbdf1954ceabbf84f45df0cc812f58ae7b0e (patch)
tree: 0607f58579b55bc4b9e28c973160a6d3717760ee
parent: 36893bac7f2418bbaa8466eb363ee3d99262e9b8 (diff)
1 files changed, 27 insertions, 0 deletions
diff --git a/Todo.mdwn b/Todo.mdwn
index e343aef..0710301 100644
--- a/Todo.mdwn
+++ b/Todo.mdwn
@@ -57,3 +57,30 @@ bcache/bcachefs todo list:
    as buckets get emptied out. should figure out where to hook this in.
 
  * bcachefs - journal unused inode hint, or otherwise improve inode allocation
+
+ * Idea for improving the situation with btree nodes + large buckets:
+
+   This is probably only needed for running on raw flash, but might be worth
+   doing even without raw flash - it'd reduce internal fragmentation on disk
+   from the btree when we're using large buckets:
+
+   The current situation is that each btree node is a contiguous chunk of disk
+   space, used as a log - when the log fills up, the btree node is full and we
+   split or rewrite it. Each btree node is, on disk, its own individual log.
+   When the btree node size is smaller than the bucket size we allocate multiple
+   btree nodes per bucket - but this means that the entire bucket is not written
+   to sequentially (as the different btree nodes within the bucket will be
+   appended to in different orders), and the entire bucket won't be reused until
+   every btree node within the bucket has been freed.
+
+   Idea: have a second journal for the btree. The different bsets (log entries)
+   for a particular btree node would no longer be contiguous on disk - whenever
+   any btree node write happens, it goes to the tail of the journal. This would
+   not affect anything about how btree nodes work in memory - only the on disk
+   layout would change.
+
+   We'd have to maintain a map of where the bsets currently live on disk for any
+   given btree node. At runtime this map would presumably just be pinned in
+   memory - it wouldn't be very big, but then we'd have to have provisions for
+   regenerating it at startup (either just scan the entire btree journal, or
+   maintain this map in the primary journal).
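
A minimal sketch of the map described in the last paragraph of the patch, assuming a fixed-size in-memory journal stands in for the on-disk btree journal. Every name and constant here (`btree_journal`, `bset_map`, `journal_write_bset`, `map_rebuild`, `MAX_NODES`, and so on) is hypothetical and chosen for illustration; none of it is taken from the bcache/bcachefs source. It shows the two operations the idea needs: appending a bset at the journal tail while recording its location, and regenerating the map at startup by scanning the whole journal.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>
#include <string.h>

/* All sizes below are hypothetical, chosen only to keep the sketch small. */
#define MAX_NODES   8   /* btree nodes tracked by the map */
#define MAX_BSETS   4   /* bsets per node before it would be rewritten */
#define JOURNAL_LEN 64  /* journal capacity, in entries */

/* One bset (log entry) as it would sit in the btree journal. */
struct bset_entry {
	uint32_t node_id;  /* btree node this bset belongs to */
	uint32_t seq;      /* stand-in for the real bset payload */
};

/* Append-only journal shared by every btree node. */
struct btree_journal {
	struct bset_entry entries[JOURNAL_LEN];
	size_t tail;       /* next free slot */
};

/* Pinned in-memory map: node id -> journal offsets of its bsets. */
struct bset_map {
	size_t nr[MAX_NODES];
	size_t offset[MAX_NODES][MAX_BSETS];
};

/* Record that a bset for node_id lives at journal offset off. */
static void map_add(struct bset_map *m, uint32_t node_id, size_t off)
{
	assert(node_id < MAX_NODES && m->nr[node_id] < MAX_BSETS);
	m->offset[node_id][m->nr[node_id]++] = off;
}

/* Any btree node write goes to the tail of the journal; returns its offset. */
static size_t journal_write_bset(struct btree_journal *j, struct bset_map *m,
				 uint32_t node_id, uint32_t seq)
{
	assert(j->tail < JOURNAL_LEN);
	size_t off = j->tail++;

	j->entries[off] = (struct bset_entry){ .node_id = node_id, .seq = seq };
	map_add(m, node_id, off);
	return off;
}

/* Regenerate the map at startup by scanning the entire btree journal. */
static void map_rebuild(const struct btree_journal *j, struct bset_map *m)
{
	memset(m, 0, sizeof(*m));
	for (size_t off = 0; off < j->tail; off++)
		map_add(m, j->entries[off].node_id, off);
}
```

The sketch takes the "scan the entire btree journal" option for startup. The alternative mentioned in the patch, maintaining the map in the primary journal, would replace `map_rebuild` with a replay of map updates and is not shown.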