author     Kent Overstreet <kent.overstreet@linux.dev>    2023-11-22 18:15:33 -0500
committer  Kent Overstreet <kent.overstreet@linux.dev>    2025-07-03 01:20:20 -0400
commit     593bcea988dabd2052ab74f6a5650c64bf90fcc6
tree       8b421fbd363aee850adc9de640f6205a7a786308
parent     197ff6ce1e17687ba79c14cd17abad9731c0097e
mm: shrinker: Add a .to_text() method for shrinkers
This adds a new callback method to shrinkers which they can use to describe
anything relevant to memory reclaim about their internal state, for example
object dirtiness.

This patch also adds shrinkers_to_text(), which reports on the top 10
shrinkers - by object count - in sorted order, to be used in OOM reporting.

Cc: Andrew Morton <akpm@linux-foundation.org>
Cc: Qi Zheng <zhengqi.arch@bytedance.com>
Cc: Roman Gushchin <roman.gushchin@linux.dev>
Cc: linux-mm@kvack.org
Signed-off-by: Kent Overstreet <kent.overstreet@linux.dev>

From david@fromorbit.com Tue Aug 27 23:32:26 2024

> > +	if (!mutex_trylock(&shrinker_mutex)) {
> > +		seq_buf_puts(out, "(couldn't take shrinker lock)");
> > +		return;
> > +	}
>
> Please don't use the shrinker_mutex like this. There can be tens of
> thousands of entries in the shrinker list (because memcgs) and
> holding the shrinker_mutex for long running traversals like this is
> known to cause latency problems for memcg reaping. If we are at
> ENOMEM, the last thing we want to be doing is preventing memcgs from
> being reaped.
>
> > +	list_for_each_entry(shrinker, &shrinker_list, list) {
> > +		struct shrink_control sc = { .gfp_mask = GFP_KERNEL, };
>
> This iteration and counting setup is neither node nor memcg aware.
> For node-aware shrinkers, this will only count the items freeable
> on node 0 and ignore all the other memory in the system. On memcg
> systems, it will also only scan the root memcg and so miss counting
> any memory in memcg-owned caches.
>
> IOWs, the shrinker iteration mechanism needs to iterate both by NUMA
> node and by memcg. On large machines with multiple nodes and hosting
> thousands of memcgs, a total shrinker state iteration has to walk
> a -lot- of structures.
>
> An example of this is drop_slab() - called from
> /proc/sys/vm/drop_caches. It does this to iterate all the
> shrinkers for all the nodes and memcgs in the system:
>
> static unsigned long drop_slab_node(int nid)
> {
> 	unsigned long freed = 0;
> 	struct mem_cgroup *memcg = NULL;
>
> 	memcg = mem_cgroup_iter(NULL, NULL, NULL);
> 	do {
> 		freed += shrink_slab(GFP_KERNEL, nid, memcg, 0);
> 	} while ((memcg = mem_cgroup_iter(NULL, memcg, NULL)) != NULL);
>
> 	return freed;
> }
>
> void drop_slab(void)
> {
> 	int nid;
> 	int shift = 0;
> 	unsigned long freed;
>
> 	do {
> 		freed = 0;
> 		for_each_online_node(nid) {
> 			if (fatal_signal_pending(current))
> 				return;
>
> 			freed += drop_slab_node(nid);
> 		}
> 	} while ((freed >> shift++) > 1);
> }
>
> Hence any iteration for finding the 10 largest shrinkable caches in
> the system needs to do something similar. Only, it needs to iterate
> memcgs first and then aggregate object counts across all nodes for
> shrinkers that are NUMA aware.
>
> Because it needs direct access to the shrinkers, it will need to use
> the RCU lock + refcount method of traversal because that's the only
> safe way to go from memcg to shrinker instance. IOWs, it
> needs to mirror the code in shrink_slab/shrink_slab_memcg to obtain
> a safe reference to the relevant shrinker so it can call
> ->count_objects() and store a refcounted pointer to the shrinker(s)
> that will get printed out after the scan is done....
>
> Once the shrinker iteration is sorted out, I'll look further at the
> rest of the code in this patch...
>
> -Dave.
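
For illustration, a minimal sketch of how a cache might implement the
proposed .to_text() callback. The my_cache structure and its counters are
hypothetical, as is the use of shrinker->private_data as the back-pointer;
only seq_buf_printf() and struct shrinker are existing kernel API here:

static void my_cache_to_text(struct seq_buf *out, struct shrinker *shrinker)
{
	/* my_cache and its fields are illustrative, not from this patch */
	struct my_cache *c = shrinker->private_data;

	seq_buf_printf(out, "objects:  %lu\n", c->nr_objects);
	seq_buf_printf(out, "dirty:    %lu\n", c->nr_dirty);
	seq_buf_printf(out, "pinned:   %lu\n", c->nr_pinned);
}

Printing into a caller-supplied, fixed-size seq_buf fits the OOM-report use
case: output is bounded and no allocation is needed at report time.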
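
On Dave's point, a sketch of the shape the counting would need to take:
iterate memcgs first, then aggregate ->count_objects() across online nodes
for NUMA-aware shrinkers. count_one_shrinker() and shrinkers_count_all()
are hypothetical helpers; a real version must take rcu_read_lock() and a
shrinker refcount as shrink_slab_memcg() does, rather than walk
shrinker_list under shrinker_mutex:

static unsigned long count_one_shrinker(struct shrinker *shrinker,
					struct mem_cgroup *memcg)
{
	unsigned long count = 0;
	int nid;

	for_each_online_node(nid) {
		struct shrink_control sc = {
			.gfp_mask = GFP_KERNEL,
			/* mirror shrink_slab(): non-NUMA-aware count on node 0 */
			.nid	  = (shrinker->flags & SHRINKER_NUMA_AWARE)
					? nid : 0,
			.memcg	  = memcg,
		};

		count += shrinker->count_objects(shrinker, &sc);

		if (!(shrinker->flags & SHRINKER_NUMA_AWARE))
			break;
	}

	return count;
}

static void shrinkers_count_all(void)
{
	struct mem_cgroup *memcg = mem_cgroup_iter(NULL, NULL, NULL);

	do {
		/*
		 * For each shrinker visible to this memcg - found via the
		 * per-memcg shrinker info under rcu_read_lock(), with a
		 * refcount held, as in shrink_slab_memcg() - call
		 * count_one_shrinker(shrinker, memcg) and feed the result
		 * into the top-10 table that gets printed after the scan.
		 */
	} while ((memcg = mem_cgroup_iter(NULL, memcg, NULL)) != NULL);
}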