summaryrefslogtreecommitdiff
path: root/include/linux
AgeCommit message (Collapse)Author
2024-02-14fix missing vmalloc.h includesKent Overstreet
Signed-off-by: Kent Overstreet <kent.overstreet@linux.dev>
2024-02-11codetag: debug: introduce OBJEXTS_ALLOC_FAIL to mark failed slab_ext allocationsSuren Baghdasaryan
If slabobj_ext vector allocation for a slab object fails and later on it succeeds for another object in the same slab, the slabobj_ext for the original object will be NULL and will be flagged in case when CONFIG_MEM_ALLOC_PROFILING_DEBUG is enabled. Mark failed slabobj_ext vector allocations using a new objext_flags flag stored in the lower bits of slab->obj_exts. When new allocation succeeds it marks all tag references in the same slabobj_ext vector as empty to avoid warnings implemented by CONFIG_MEM_ALLOC_PROFILING_DEBUG checks. Signed-off-by: Suren Baghdasaryan <surenb@google.com>
2024-02-11codetag: debug: mark codetags for reserved pages as emptySuren Baghdasaryan
To avoid debug warnings while freeing reserved pages which were not allocated with usual allocators, mark their codetags as empty before freeing. Signed-off-by: Suren Baghdasaryan <surenb@google.com>
2024-02-11codetag: debug: skip objext checking when it's for objext itselfSuren Baghdasaryan
objext objects are created with __GFP_NO_OBJ_EXT flag and therefore have no corresponding objext themselves (otherwise we would get an infinite recursion). When freeing these objects their codetag will be empty and when CONFIG_MEM_ALLOC_PROFILING_DEBUG is enabled this will lead to false warnings. Introduce CODETAG_EMPTY special codetag value to mark allocations which intentionally lack codetag to avoid these warnings. Set objext codetags to CODETAG_EMPTY before freeing to indicate that the codetag is expected to be empty. Signed-off-by: Suren Baghdasaryan <surenb@google.com>
2024-02-11lib: add memory allocations report in show_mem()Suren Baghdasaryan
Include allocations in show_mem reports. Signed-off-by: Kent Overstreet <kent.overstreet@linux.dev> Signed-off-by: Suren Baghdasaryan <surenb@google.com>
2024-02-11rhashtable: Plumb through alloc tagKent Overstreet
This gives better memory allocation profiling results; rhashtable allocations will be accounted to the code that initialized the rhashtable. Signed-off-by: Kent Overstreet <kent.overstreet@linux.dev> Signed-off-by: Suren Baghdasaryan <surenb@google.com>
2024-02-11mm: vmalloc: Enable memory allocation profilingKent Overstreet
This wrapps all external vmalloc allocation functions with the alloc_hooks() wrapper, and switches internal allocations to _noprof variants where appropriate, for the new memory allocation profiling feature. Signed-off-by: Kent Overstreet <kent.overstreet@linux.dev> Signed-off-by: Suren Baghdasaryan <surenb@google.com>
2024-02-11mm: percpu: enable per-cpu allocation taggingSuren Baghdasaryan
Redefine __alloc_percpu, __alloc_percpu_gfp and __alloc_reserved_percpu to record allocations and deallocations done by these functions. Signed-off-by: Kent Overstreet <kent.overstreet@linux.dev> Signed-off-by: Suren Baghdasaryan <surenb@google.com>
2024-02-11mempool: Hook up to memory allocation profilingKent Overstreet
This adds hooks to mempools for correctly annotating mempool-backed allocations at the correct source line, so they show up correctly in /sys/kernel/debug/allocations. Various inline functions are converted to wrappers so that we can invoke alloc_hooks() in fewer places. Signed-off-by: Kent Overstreet <kent.overstreet@linux.dev> Signed-off-by: Suren Baghdasaryan <surenb@google.com>
2024-02-11mm/slab: enable slab allocation tagging for kmalloc and friendsSuren Baghdasaryan
Redefine kmalloc, krealloc, kzalloc, kcalloc, etc. to record allocations and deallocations done by these functions. Signed-off-by: Suren Baghdasaryan <surenb@google.com> Co-developed-by: Kent Overstreet <kent.overstreet@linux.dev> Signed-off-by: Kent Overstreet <kent.overstreet@linux.dev>
2024-02-11lib: add codetag reference into slabobj_extSuren Baghdasaryan
To store code tag for every slab object, a codetag reference is embedded into slabobj_ext when CONFIG_MEM_ALLOC_PROFILING=y. Signed-off-by: Suren Baghdasaryan <surenb@google.com> Co-developed-by: Kent Overstreet <kent.overstreet@linux.dev> Signed-off-by: Kent Overstreet <kent.overstreet@linux.dev>
2024-02-11mm: create new codetag references during page splittingSuren Baghdasaryan
When a high-order page is split into smaller ones, each newly split page should get its codetag. The original codetag is reused for these pages but it's recorded as 0-byte allocation because original codetag already accounts for the original high-order allocated page. Signed-off-by: Suren Baghdasaryan <surenb@google.com>
2024-02-11mm: enable page allocation taggingSuren Baghdasaryan
Redefine page allocators to record allocation tags upon their invocation. Instrument post_alloc_hook and free_pages_prepare to modify current allocation tag. Signed-off-by: Suren Baghdasaryan <surenb@google.com> Co-developed-by: Kent Overstreet <kent.overstreet@linux.dev> Signed-off-by: Kent Overstreet <kent.overstreet@linux.dev>
2024-02-11change alloc_pages name in dma_map_ops to avoid name conflictsSuren Baghdasaryan
After redefining alloc_pages, all uses of that name are being replaced. Change the conflicting names to prevent preprocessor from replacing them when it's not intended. Signed-off-by: Suren Baghdasaryan <surenb@google.com>
2024-02-10mm: percpu: increase PERCPU_MODULE_RESERVE to accommodate allocation tagsSuren Baghdasaryan
As each allocation tag generates a per-cpu variable, more space is required to store them. Increase PERCPU_MODULE_RESERVE to provide enough area. A better long-term solution would be to allocate this memory dynamically. Signed-off-by: Suren Baghdasaryan <surenb@google.com> Signed-off-by: Kent Overstreet <kent.overstreet@linux.dev> Cc: Peter Zijlstra <peterz@infradead.org> Cc: Tejun Heo <tj@kernel.org>
2024-02-10lib: introduce support for page allocation taggingSuren Baghdasaryan
Introduce helper functions to easily instrument page allocators by storing a pointer to the allocation tag associated with the code that allocated the page in a page_ext field. Signed-off-by: Suren Baghdasaryan <surenb@google.com> Co-developed-by: Kent Overstreet <kent.overstreet@linux.dev> Signed-off-by: Kent Overstreet <kent.overstreet@linux.dev>
2024-02-10lib: add allocation tagging support for memory allocation profilingSuren Baghdasaryan
Introduce CONFIG_MEM_ALLOC_PROFILING which provides definitions to easily instrument memory allocators. It registers an "alloc_tags" codetag type with /proc/allocinfo interface to output allocation tag information when the feature is enabled. CONFIG_MEM_ALLOC_PROFILING_DEBUG is provided for debugging the memory allocation profiling instrumentation. Memory allocation profiling can be enabled or disabled at runtime using /proc/sys/vm/mem_profiling sysctl when CONFIG_MEM_ALLOC_PROFILING_DEBUG=n. CONFIG_MEM_ALLOC_PROFILING_ENABLED_BY_DEFAULT enables memory allocation profiling by default. Signed-off-by: Suren Baghdasaryan <surenb@google.com> Co-developed-by: Kent Overstreet <kent.overstreet@linux.dev> Signed-off-by: Kent Overstreet <kent.overstreet@linux.dev>
2024-02-10lib: prevent module unloading if memory is not freedSuren Baghdasaryan
Skip freeing module's data section if there are non-zero allocation tags because otherwise, once these allocations are freed, the access to their code tag would cause UAF. Signed-off-by: Suren Baghdasaryan <surenb@google.com>
2024-02-10lib: code tagging module supportSuren Baghdasaryan
Add support for code tagging from dynamically loaded modules. Signed-off-by: Suren Baghdasaryan <surenb@google.com> Co-developed-by: Kent Overstreet <kent.overstreet@linux.dev> Signed-off-by: Kent Overstreet <kent.overstreet@linux.dev>
2024-02-10lib: code tagging frameworkSuren Baghdasaryan
Add basic infrastructure to support code tagging which stores tag common information consisting of the module name, function, file name and line number. Provide functions to register a new code tag type and navigate between code tags. Co-developed-by: Kent Overstreet <kent.overstreet@linux.dev> Signed-off-by: Kent Overstreet <kent.overstreet@linux.dev> Signed-off-by: Suren Baghdasaryan <surenb@google.com>
2024-02-10slab: objext: introduce objext_flags as extension to page_memcg_data_flagsSuren Baghdasaryan
Introduce objext_flags to store additional objext flags unrelated to memcg. Signed-off-by: Suren Baghdasaryan <surenb@google.com>
2024-02-10mm/slab: introduce SLAB_NO_OBJ_EXT to avoid obj_ext creationSuren Baghdasaryan
Slab extension objects can't be allocated before slab infrastructure is initialized. Some caches, like kmem_cache and kmem_cache_node, are created before slab infrastructure is initialized. Objects from these caches can't have extension objects. Introduce SLAB_NO_OBJ_EXT slab flag to mark these caches and avoid creating extensions for objects allocated from these slabs. Signed-off-by: Suren Baghdasaryan <surenb@google.com>
2024-02-10mm: introduce __GFP_NO_OBJ_EXT flag to selectively prevent slabobj_ext creationSuren Baghdasaryan
Introduce __GFP_NO_OBJ_EXT flag in order to prevent recursive allocations when allocating slabobj_ext on a slab. Signed-off-by: Suren Baghdasaryan <surenb@google.com>
2024-02-10mm: introduce slabobj_ext to support slab object extensionsSuren Baghdasaryan
Currently slab pages can store only vectors of obj_cgroup pointers in page->memcg_data. Introduce slabobj_ext structure to allow more data to be stored for each slab object. Wrap obj_cgroup into slabobj_ext to support current functionality while allowing to extend slabobj_ext in the future. Signed-off-by: Suren Baghdasaryan <surenb@google.com>
2024-02-10mm: enumerate all gfp flagsSuren Baghdasaryan
Introduce GFP bits enumeration to let compiler track the number of used bits (which depends on the config options) instead of hardcoding them. That simplifies __GFP_BITS_SHIFT calculation. Suggested-by: Petr Tesařík <petr@tesarici.cz> Signed-off-by: Suren Baghdasaryan <surenb@google.com>
2024-02-10fs: Convert alloc_inode_sb() to a macroKent Overstreet
We're introducing alloc tagging, which tracks memory allocations by callsite. Converting alloc_inode_sb() to a macro means allocations will be tracked by its caller, which is a bit more useful. Signed-off-by: Kent Overstreet <kent.overstreet@linux.dev> Signed-off-by: Suren Baghdasaryan <surenb@google.com> Cc: Alexander Viro <viro@zeniv.linux.org.uk>
2024-02-10lib/string_helpers: Add flags param to string_get_size()Kent Overstreet
The new flags parameter allows controlling - Whether or not the units suffix is separated by a space, for compatibility with sort -h - Whether or not to append a B suffix - we're not always printing bytes. Signed-off-by: Kent Overstreet <kent.overstreet@linux.dev> Signed-off-by: Suren Baghdasaryan <surenb@google.com> Cc: Andy Shevchenko <andy@kernel.org> Cc: Michael Ellerman <mpe@ellerman.id.au> Cc: Benjamin Herrenschmidt <benh@kernel.crashing.org> Cc: Paul Mackerras <paulus@samba.org> Cc: "Michael S. Tsirkin" <mst@redhat.com> Cc: Jason Wang <jasowang@redhat.com> Cc: "Noralf Trønnes" <noralf@tronnes.org> Cc: Jens Axboe <axboe@kernel.dk>
2024-02-10Merge tag 'block-6.8-2024-02-10' of git://git.kernel.dk/linuxLinus Torvalds
Pull block fixes from Jens Axboe: - NVMe pull request via Keith: - Update a potentially stale firmware attribute (Maurizio) - Fixes for the recent verbose error logging (Keith, Chaitanya) - Protection information payload size fix for passthrough (Francis) - Fix for a queue freezing issue in virtblk (Yi) - blk-iocost underflow fix (Tejun) - blk-wbt task detection fix (Jan) * tag 'block-6.8-2024-02-10' of git://git.kernel.dk/linux: virtio-blk: Ensure no requests in virtqueues before deleting vqs. blk-iocost: Fix an UBSAN shift-out-of-bounds warning nvme: use ns->head->pi_size instead of t10_pi_tuple structure size nvme-core: fix comment to reflect right functions nvme: move passthrough logging attribute to head blk-wbt: Fix detection of dirty-throttled tasks nvme-host: fix the updating of the firmware version
2024-02-09Merge tag 'ceph-for-6.8-rc4' of https://github.com/ceph/ceph-clientLinus Torvalds
Pull ceph fixes from Ilya Dryomov: "Some fscrypt-related fixups (sparse reads are used only for encrypted files) and two cap handling fixes from Xiubo and Rishabh" * tag 'ceph-for-6.8-rc4' of https://github.com/ceph/ceph-client: ceph: always check dir caps asynchronously ceph: prevent use-after-free in encode_cap_msg() ceph: always set initial i_blkbits to CEPH_FSCRYPT_BLOCK_SHIFT libceph: just wait for more data to be available on the socket libceph: rename read_sparse_msg_*() to read_partial_sparse_msg_*() libceph: fail sparse-read if the data length doesn't match
2024-02-09work around gcc bugs with 'asm goto' with outputsLinus Torvalds
We've had issues with gcc and 'asm goto' before, and we created a 'asm_volatile_goto()' macro for that in the past: see commits 3f0116c3238a ("compiler/gcc4: Add quirk for 'asm goto' miscompilation bug") and a9f180345f53 ("compiler/gcc4: Make quirk for asm_volatile_goto() unconditional"). Then, much later, we ended up removing the workaround in commit 43c249ea0b1e ("compiler-gcc.h: remove ancient workaround for gcc PR 58670") because we no longer supported building the kernel with the affected gcc versions, but we left the macro uses around. Now, Sean Christopherson reports a new version of a very similar problem, which is fixed by re-applying that ancient workaround. But the problem in question is limited to only the 'asm goto with outputs' cases, so instead of re-introducing the old workaround as-is, let's rename and limit the workaround to just that much less common case. It looks like there are at least two separate issues that all hit in this area: (a) some versions of gcc don't mark the asm goto as 'volatile' when it has outputs: https://gcc.gnu.org/bugzilla/show_bug.cgi?id=98619 https://gcc.gnu.org/bugzilla/show_bug.cgi?id=110420 which is easy to work around by just adding the 'volatile' by hand. (b) Internal compiler errors: https://gcc.gnu.org/bugzilla/show_bug.cgi?id=110422 which are worked around by adding the extra empty 'asm' as a barrier, as in the original workaround. but the problem Sean sees may be a third thing since it involves bad code generation (not an ICE) even with the manually added 'volatile'. but the same old workaround works for this case, even if this feels a bit like voodoo programming and may only be hiding the issue. Reported-and-tested-by: Sean Christopherson <seanjc@google.com> Link: https://lore.kernel.org/all/20240208220604.140859-1-seanjc@google.com/ Cc: Nick Desaulniers <ndesaulniers@google.com> Cc: Uros Bizjak <ubizjak@gmail.com> Cc: Jakub Jelinek <jakub@redhat.com> Cc: Andrew Pinski <quic_apinski@quicinc.com> Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
2024-02-09Merge tag 'efi-fixes-for-v6.8-1' of ↵Linus Torvalds
git://git.kernel.org/pub/scm/linux/kernel/git/efi/efi Pull EFI fixes from Ard Biesheuvel: "The only notable change here is the patch that changes the way we deal with spurious errors from the EFI memory attribute protocol. This will be backported to v6.6, and is intended to ensure that we will not paint ourselves into a corner when we tighten this further in order to comply with MS requirements on signed EFI code. Note that this protocol does not currently exist in x86 production systems in the field, only in Microsoft's fork of OVMF, but it will be mandatory for Windows logo certification for x86 PCs in the future. - Tighten ELF relocation checks on the RISC-V EFI stub - Give up if the new EFI memory attributes protocol fails spuriously on x86 - Take care not to place the kernel in the lowest 16 MB of DRAM on x86 - Omit special purpose EFI memory from memblock - Some fixes for the CXL CPER reporting code - Make the PE/COFF layout of mixed-mode capable images comply with a strict interpretation of the spec" * tag 'efi-fixes-for-v6.8-1' of git://git.kernel.org/pub/scm/linux/kernel/git/efi/efi: x86/efistub: Use 1:1 file:memory mapping for PE/COFF .compat section cxl/trace: Remove unnecessary memcpy's cxl/cper: Fix errant CPER prints for CXL events efi: Don't add memblocks for soft-reserved memory efi: runtime: Fix potential overflow of soft-reserved region size efi/libstub: Add one kernel-doc comment x86/efistub: Avoid placing the kernel below LOAD_PHYSICAL_ADDR x86/efistub: Give up if memory attribute protocol returns an error riscv/efistub: Tighten ELF relocation check riscv/efistub: Ensure GP-relative addressing is not used
2024-02-07libceph: just wait for more data to be available on the socketXiubo Li
A short read may occur while reading the message footer from the socket. Later, when the socket is ready for another read, the messenger invokes all read_partial_*() handlers, including read_partial_sparse_msg_data(). The expectation is that read_partial_sparse_msg_data() would bail, allowing the messenger to invoke read_partial() for the footer and pick up where it left off. However read_partial_sparse_msg_data() violates that and ends up calling into the state machine in the OSD client. The sparse-read state machine assumes that it's a new op and interprets some piece of the footer as the sparse-read header and returns bogus extents/data length, etc. To determine whether read_partial_sparse_msg_data() should bail, let's reuse cursor->total_resid. Because once it reaches to zero that means all the extents and data have been successfully received in last read, else it could break out when partially reading any of the extents and data. And then osd_sparse_read() could continue where it left off. [ idryomov: changelog ] Link: https://tracker.ceph.com/issues/63586 Fixes: d396f89db39a ("libceph: add sparse read support to msgr1") Signed-off-by: Xiubo Li <xiubli@redhat.com> Reviewed-by: Jeff Layton <jlayton@kernel.org> Signed-off-by: Ilya Dryomov <idryomov@gmail.com>
2024-02-07libceph: fail sparse-read if the data length doesn't matchXiubo Li
Once this happens that means there have bugs. Signed-off-by: Xiubo Li <xiubli@redhat.com> Reviewed-by: Jeff Layton <jlayton@kernel.org> Signed-off-by: Ilya Dryomov <idryomov@gmail.com>
2024-02-06blk-wbt: Fix detection of dirty-throttled tasksJan Kara
The detection of dirty-throttled tasks in blk-wbt has been subtly broken since its beginning in 2016. Namely if we are doing cgroup writeback and the throttled task is not in the root cgroup, balance_dirty_pages() will set dirty_sleep for the non-root bdi_writeback structure. However blk-wbt checks dirty_sleep only in the root cgroup bdi_writeback structure. Thus detection of recently throttled tasks is not working in this case (we noticed this when we switched to cgroup v2 and suddently writeback was slow). Since blk-wbt has no easy way to get to proper bdi_writeback and furthermore its intention has always been to work on the whole device rather than on individual cgroups, just move the dirty_sleep timestamp from bdi_writeback to backing_dev_info. That fixes the checking for recently throttled task and saves memory for everybody as a bonus. CC: stable@vger.kernel.org Fixes: b57d74aff9ab ("writeback: track if we're sleeping on progress in balance_dirty_pages()") Signed-off-by: Jan Kara <jack@suse.cz> Link: https://lore.kernel.org/r/20240123175826.21452-1-jack@suse.cz [axboe: fixup indentation errors] Signed-off-by: Jens Axboe <axboe@kernel.dk>
2024-02-04Merge tag 'dmaengine-fix-6.8' of ↵Linus Torvalds
git://git.kernel.org/pub/scm/linux/kernel/git/vkoul/dmaengine Pull dmaengine fixes from Vinod Koul: "Core: - fix return value of is_slave_direction() for D2D dma Driver fixes for: - Documentaion fixes to resolve warnings for at_hdmac driver - bunch of fsl driver fixes for memory leaks, and useless kfree - TI edma and k3 fixes for packet error and null pointer checks" * tag 'dmaengine-fix-6.8' of git://git.kernel.org/pub/scm/linux/kernel/git/vkoul/dmaengine: dmaengine: at_hdmac: add missing kernel-doc style description dmaengine: fix is_slave_direction() return false when DMA_DEV_TO_DEV dmaengine: fsl-qdma: Remove a useless devm_kfree() dmaengine: fsl-qdma: Fix a memory leak related to the queue command DMA dmaengine: fsl-qdma: Fix a memory leak related to the status queue DMA dmaengine: ti: k3-udma: Report short packet errors dmaengine: ti: edma: Add some null pointer checks to the edma_probe dmaengine: fsl-dpaa2-qdma: Fix the size of dma pools dmaengine: at_hdmac: fix some kernel-doc warnings
2024-02-03cxl/cper: Fix errant CPER prints for CXL eventsIra Weiny
Jonathan reports that CXL CPER events dump an extra generic error message. {1}[Hardware Error]: Hardware error from APEI Generic Hardware Error Source: 1 {1}[Hardware Error]: event severity: recoverable {1}[Hardware Error]: Error 0, type: recoverable {1}[Hardware Error]: section type: unknown, fbcd0a77-c260-417f-85a9-088b1621eba6 {1}[Hardware Error]: section length: 0x90 {1}[Hardware Error]: 00000000: 00000090 00000007 00000000 0d938086 ................ {1}[Hardware Error]: 00000010: 00100000 00000000 00040000 00000000 ................ ... CXL events were rerouted though the CXL subsystem for additional processing. However, when that work was done it was missed that cper_estatus_print_section() continued with a generic error message which is confusing. Teach CPER print code to ignore printing details of some section types. Assign the CXL event GUIDs to this set to prevent confusing unknown prints. Reported-by: Jonathan Cameron <Jonathan.Cameron@huawei.com> Suggested-by: Jonathan Cameron <Jonathan.Cameron@huawei.com> Signed-off-by: Ira Weiny <ira.weiny@intel.com> Reviewed-by: Dave Jiang <dave.jiang@intel.com> Reviewed-by: Jonathan Cameron <Jonathan.Cameron@huawei.com> Reviewed-by: Alison Schofield <alison.schofield@intel.com> Signed-off-by: Ard Biesheuvel <ardb@kernel.org>
2024-02-02Merge tag 'pci-v6.8-fixes-1' of ↵Linus Torvalds
git://git.kernel.org/pub/scm/linux/kernel/git/pci/pci Pull pci fixes from Bjorn Helgaas: - Fix a potential deadlock that was reintroduced by an ASPM revert merged for v6.8 (Johan Hovold) - Add Manivannan Sadhasivam as PCI Endpoint maintainer (Lorenzo Pieralisi) * tag 'pci-v6.8-fixes-1' of git://git.kernel.org/pub/scm/linux/kernel/git/pci/pci: MAINTAINERS: Add Manivannan Sadhasivam as PCI Endpoint maintainer PCI/ASPM: Fix deadlock when enabling ASPM
2024-02-02Merge tag 'block-6.8-2024-02-01' of git://git.kernel.dk/linuxLinus Torvalds
Pull block fixes from Jens Axboe: - NVMe pull request via Keith: - Remove duplicated enums (Guixen) - Use appropriate controller state accessors (Keith) - Retryable authentication (Hannes) - Add missing module descriptions (Chaitanya) - Fibre-channel fixes for blktests (Daniel) - Various type correctness updates (Caleb) - Improve fabrics connection debugging prints (Nitin) - Passthrough command verbose error logging (Adam) - Fix for where we set IO priority in the bio for drivers that use fops->submit_bio() to queue IO, like md/dm etc. * tag 'block-6.8-2024-02-01' of git://git.kernel.dk/linux: (32 commits) block: Fix where bio IO priority gets set nvme: allow passthru cmd error logging nvme-fc: show hostnqn when connecting to fc target nvme-rdma: show hostnqn when connecting to rdma target nvme-tcp: show hostnqn when connecting to tcp target nvmet-fc: use RCU list iterator for assoc_list nvmet-fc: take ref count on tgtport before delete assoc nvmet-fc: avoid deadlock on delete association path nvmet-fc: abort command when there is no binding nvmet-fc: do not tack refs on tgtports from assoc nvmet-fc: remove null hostport pointer check nvmet-fc: hold reference on hostport match nvmet-fc: free queue and assoc directly nvmet-fc: defer cleanup using RCU properly nvmet-fc: release reference on target port nvmet-fcloop: swap the list_add_tail arguments nvme-fc: do not wait in vain when unloading module nvme-fc: log human-readable opcode on timeout nvme: split out fabrics version of nvme_opcode_str() nvme: take const cmd pointer in read-only helpers ...
2024-02-01Merge tag 'net-6.8-rc3' of ↵Linus Torvalds
git://git.kernel.org/pub/scm/linux/kernel/git/netdev/net Pull networking fixes from Jakub Kicinski: "Including fixes from netfilter. As Paolo promised we continue to hammer out issues in our selftests. This is not the end but probably the peak. Current release - regressions: - smc: fix incorrect SMC-D link group matching logic Current release - new code bugs: - eth: bnxt: silence WARN() when device skips a timestamp, it happens Previous releases - regressions: - ipmr: fix null-deref when forwarding mcast packets - conntrack: evaluate window negotiation only for packets in the REPLY direction, otherwise SYN retransmissions trigger incorrect window scale negotiation - ipset: fix performance regression in swap operation Previous releases - always broken: - tcp: add sanity checks to types of pages getting into the rx zerocopy path, we only support basic NIC -> user, no page cache pages etc. - ip6_tunnel: make sure to pull inner header in __ip6_tnl_rcv() - nt_tables: more input sanitization changes - dsa: mt7530: fix 10M/100M speed on MediaTek MT7988 switch - bridge: mcast: fix loss of snooping after long uptime, jiffies do wrap on 32bit - xen-netback: properly sync TX responses, protect with locking - phy: mediatek-ge-soc: sync calibration values with MediaTek SDK, increase connection stability - eth: pds: fixes for various teardown, and reset races Misc: - hsr: silence WARN() if we can't alloc supervision frame, it happens" * tag 'net-6.8-rc3' of git://git.kernel.org/pub/scm/linux/kernel/git/netdev/net: (82 commits) doc/netlink/specs: Add missing attr in rt_link spec idpf: avoid compiler padding in virtchnl2_ptype struct selftests: mptcp: join: stop transfer when check is done (part 2) selftests: mptcp: join: stop transfer when check is done (part 1) selftests: mptcp: allow changing subtests prefix selftests: mptcp: decrease BW in simult flows selftests: mptcp: increase timeout to 30 min selftests: mptcp: add missing kconfig for NF Mangle selftests: mptcp: add missing kconfig for NF Filter in v6 selftests: mptcp: add missing kconfig for NF Filter mptcp: fix data re-injection from stale subflow selftests: net: enable some more knobs selftests: net: add missing config for NF_TARGET_TTL selftests: forwarding: List helper scripts in TEST_FILES Makefile variable selftests: net: List helper scripts in TEST_FILES Makefile variable selftests: net: Remove executable bits from library scripts selftests: bonding: Check initial state selftests: team: Add missing config options hv_netvsc: Fix race condition between netvsc_probe and netvsc_remove xen-netback: properly sync TX responses ...
2024-02-01Merge tag 'hid-for-linus-2024020101' of ↵Linus Torvalds
git://git.kernel.org/pub/scm/linux/kernel/git/hid/hid Pull HID fixes from Benjamin Tissoires: - cleanups in the error path in hid-steam (Dan Carpenter) - fixes for Wacom tablets selftests that sneaked in while the CI was taking a break during the year end holidays (Benjamin Tissoires) - null pointer check in nvidia-shield (Kunwu Chan) - memory leak fix in hidraw (Su Hui) - another null pointer fix in i2c-hid-of (Johan Hovold) - another memory leak fix in HID-BPF this time, as well as a double fdget() fix reported by Dan Carpenter (Benjamin Tissoires) - fix for Cirque touchpad when they go on suspend (Kai-Heng Feng) - new device ID in hid-logitech-hidpp: "Logitech G Pro X SuperLight 2" (Jiri Kosina) * tag 'hid-for-linus-2024020101' of git://git.kernel.org/pub/scm/linux/kernel/git/hid/hid: HID: bpf: use __bpf_kfunc instead of noinline HID: bpf: actually free hdev memory after attaching a HID-BPF program HID: bpf: remove double fdget() HID: i2c-hid-of: fix NULL-deref on failed power up HID: hidraw: fix a problem of memory leak in hidraw_release() HID: i2c-hid: Skip SET_POWER SLEEP for Cirque touchpad on system suspend HID: nvidia-shield: Add missing null pointer checks to LED initialization HID: logitech-hidpp: add support for Logitech G Pro X Superlight 2 selftests/hid: wacom: fix confidence tests HID: hid-steam: Fix cleanup in probe() HID: hid-steam: remove pointless error message
2024-02-01Merge tag 'lsm-pr-20240131' of ↵Linus Torvalds
git://git.kernel.org/pub/scm/linux/kernel/git/pcmoore/lsm Pull lsm fixes from Paul Moore: "Two small patches to fix some problems relating to LSM hook return values and how the individual LSMs interact" * tag 'lsm-pr-20240131' of git://git.kernel.org/pub/scm/linux/kernel/git/pcmoore/lsm: lsm: fix default return value of the socket_getpeersec_*() hooks lsm: fix the logic in security_inode_getsecctx()
2024-02-01Merge tag 'nf-24-01-31' of ↵Jakub Kicinski
git://git.kernel.org/pub/scm/linux/kernel/git/netfilter/nf Pablo Neira Ayuso says: ==================== Netfilter fixes for net 1) TCP conntrack now only evaluates window negotiation for packets in the REPLY direction, from Ryan Schaefer. Otherwise SYN retransmissions trigger incorrect window scale negotiation. From Ryan Schaefer. 2) Restrict tunnel objects to NFPROTO_NETDEV which is where it makes sense to use this object type. 3) Fix conntrack pick up from the middle of SCTP_CID_SHUTDOWN_ACK packets. From Xin Long. 4) Another attempt from Jozsef Kadlecsik to address the slow down of the swap command in ipset. 5) Replace a BUG_ON by WARN_ON_ONCE in nf_log, and consolidate check for the case that the logger is NULL from the read side lock section. 6) Address lack of sanitization for custom expectations. Restrict layer 3 and 4 families to what it is supported by userspace. * tag 'nf-24-01-31' of git://git.kernel.org/pub/scm/linux/kernel/git/netfilter/nf: netfilter: nft_ct: sanitize layer 3 and 4 protocol number in custom expectations netfilter: nf_log: replace BUG_ON by WARN_ON_ONCE when putting logger netfilter: ipset: fix performance regression in swap operation netfilter: conntrack: check SCTP_CID_SHUTDOWN_ACK for vtag setting in sctp_new netfilter: nf_tables: restrict tunnel object to NFPROTO_NETDEV netfilter: conntrack: correct window scaling with retransmitted SYN ==================== Link: https://lore.kernel.org/r/20240131225943.7536-1-pablo@netfilter.org Signed-off-by: Jakub Kicinski <kuba@kernel.org>
2024-01-31nvme: take const cmd pointer in read-only helpersCaleb Sander
nvme_is_fabrics() and nvme_is_write() only read struct nvme_command, so take it by const pointer. This allows callers to pass a const pointer and communicates that these functions don't modify the command. Signed-off-by: Caleb Sander <csander@purestorage.com> Reviewed-by: Chaitanya Kulkarni <kch@nvidia.com> Reviewed-by: Christoph Hellwig <hch@lst.de> Reviewed-by: Sagi Grimberg <sagi@grimberg.me> Signed-off-by: Keith Busch <kbusch@kernel.org>
2024-01-31netfilter: ipset: fix performance regression in swap operationJozsef Kadlecsik
The patch "netfilter: ipset: fix race condition between swap/destroy and kernel side add/del/test", commit 28628fa9 fixes a race condition. But the synchronize_rcu() added to the swap function unnecessarily slows it down: it can safely be moved to destroy and use call_rcu() instead. Eric Dumazet pointed out that simply calling the destroy functions as rcu callback does not work: sets with timeout use garbage collectors which need cancelling at destroy which can wait. Therefore the destroy functions are split into two: cancelling garbage collectors safely at executing the command received by netlink and moving the remaining part only into the rcu callback. Link: https://lore.kernel.org/lkml/C0829B10-EAA6-4809-874E-E1E9C05A8D84@automattic.com/ Fixes: 28628fa952fe ("netfilter: ipset: fix race condition between swap/destroy and kernel side add/del/test") Reported-by: Ale Crismani <ale.crismani@automattic.com> Reported-by: David Wang <00107082@163.com> Tested-by: David Wang <00107082@163.com> Signed-off-by: Jozsef Kadlecsik <kadlec@netfilter.org> Signed-off-by: Pablo Neira Ayuso <pablo@netfilter.org>
2024-01-31PCI/ASPM: Fix deadlock when enabling ASPMJohan Hovold
A last minute revert in 6.7-final introduced a potential deadlock when enabling ASPM during probe of Qualcomm PCIe controllers as reported by lockdep: ============================================ WARNING: possible recursive locking detected 6.7.0 #40 Not tainted -------------------------------------------- kworker/u16:5/90 is trying to acquire lock: ffffacfa78ced000 (pci_bus_sem){++++}-{3:3}, at: pcie_aspm_pm_state_change+0x58/0xdc but task is already holding lock: ffffacfa78ced000 (pci_bus_sem){++++}-{3:3}, at: pci_walk_bus+0x34/0xbc other info that might help us debug this: Possible unsafe locking scenario: CPU0 ---- lock(pci_bus_sem); lock(pci_bus_sem); *** DEADLOCK *** Call trace: print_deadlock_bug+0x25c/0x348 __lock_acquire+0x10a4/0x2064 lock_acquire+0x1e8/0x318 down_read+0x60/0x184 pcie_aspm_pm_state_change+0x58/0xdc pci_set_full_power_state+0xa8/0x114 pci_set_power_state+0xc4/0x120 qcom_pcie_enable_aspm+0x1c/0x3c [pcie_qcom] pci_walk_bus+0x64/0xbc qcom_pcie_host_post_init_2_7_0+0x28/0x34 [pcie_qcom] The deadlock can easily be reproduced on machines like the Lenovo ThinkPad X13s by adding a delay to increase the race window during asynchronous probe where another thread can take a write lock. Add a new pci_set_power_state_locked() and associated helper functions that can be called with the PCI bus semaphore held to avoid taking the read lock twice. Link: https://lore.kernel.org/r/ZZu0qx2cmn7IwTyQ@hovoldconsulting.com Link: https://lore.kernel.org/r/20240130100243.11011-1-johan+linaro@kernel.org Fixes: f93e71aea6c6 ("Revert "PCI/ASPM: Remove pcie_aspm_pm_state_change()"") Signed-off-by: Johan Hovold <johan+linaro@kernel.org> Signed-off-by: Bjorn Helgaas <bhelgaas@google.com> Cc: <stable@vger.kernel.org> # 6.7
2024-01-31HID: bpf: use __bpf_kfunc instead of noinlineBenjamin Tissoires
Follow the docs at Documentation/bpf/kfuncs.rst: - declare the function with `__bpf_kfunc` - disables missing prototype warnings, which allows to remove them from include/linux/hid-bpf.h Removing the prototypes is not an issue because we currently have to redeclare them when writing the BPF program. They will eventually be generated by bpftool directly AFAIU. Link: https://lore.kernel.org/r/20240124-b4-hid-bpf-fixes-v2-3-052520b1e5e6@kernel.org Signed-off-by: Benjamin Tissoires <bentiss@kernel.org>
2024-01-30lsm: fix default return value of the socket_getpeersec_*() hooksOndrej Mosnacek
For these hooks the true "neutral" value is -EOPNOTSUPP, which is currently what is returned when no LSM provides this hook and what LSMs return when there is no security context set on the socket. Correct the value in <linux/lsm_hooks.h> and adjust the dispatch functions in security/security.c to avoid issues when the BPF LSM is enabled. Cc: stable@vger.kernel.org Fixes: 98e828a0650f ("security: Refactor declaration of LSM hooks") Signed-off-by: Ondrej Mosnacek <omosnace@redhat.com> [PM: subject line tweak] Signed-off-by: Paul Moore <paul@paul-moore.com>
2024-01-30dmaengine: fix is_slave_direction() return false when DMA_DEV_TO_DEVFrank Li
is_slave_direction() should return true when direction is DMA_DEV_TO_DEV. Fixes: 49920bc66984 ("dmaengine: add new enum dma_transfer_direction") Signed-off-by: Frank Li <Frank.Li@nxp.com> Link: https://lore.kernel.org/r/20240123172842.3764529-1-Frank.Li@nxp.com Signed-off-by: Vinod Koul <vkoul@kernel.org>
2024-01-29Merge tag 'mm-hotfixes-stable-2024-01-28-23-21' of ↵Linus Torvalds
git://git.kernel.org/pub/scm/linux/kernel/git/akpm/mm Pull misc fixes from Andrew Morton: "22 hotfixes. 11 are cc:stable and the remainder address post-6.7 issues or aren't considered appropriate for backporting" * tag 'mm-hotfixes-stable-2024-01-28-23-21' of git://git.kernel.org/pub/scm/linux/kernel/git/akpm/mm: (22 commits) mm: thp_get_unmapped_area must honour topdown preference mm: huge_memory: don't force huge page alignment on 32 bit userfaultfd: fix mmap_changing checking in mfill_atomic_hugetlb selftests/mm: ksm_tests should only MADV_HUGEPAGE valid memory scs: add CONFIG_MMU dependency for vfree_atomic() mm/memory: fix folio_set_dirty() vs. folio_mark_dirty() in zap_pte_range() mm/huge_memory: fix folio_set_dirty() vs. folio_mark_dirty() selftests/mm: Update va_high_addr_switch.sh to check CPU for la57 flag selftests: mm: fix map_hugetlb failure on 64K page size systems MAINTAINERS: supplement of zswap maintainers update stackdepot: make fast paths lock-less again stackdepot: add stats counters exported via debugfs mm, kmsan: fix infinite recursion due to RCU critical section mm/writeback: fix possible divide-by-zero in wb_dirty_limits(), again selftests/mm: switch to bash from sh MAINTAINERS: add man-pages git trees mm: memcontrol: don't throttle dying tasks on memory.high mm: mmap: map MAP_STACK to VM_NOHUGEPAGE uprobes: use pagesize-aligned virtual address when replacing pages selftests/mm: mremap_test: fix build warning ...
2024-01-28Merge tag 'x86_urgent_for_v6.8_rc2' of ↵Linus Torvalds
git://git.kernel.org/pub/scm/linux/kernel/git/tip/tip Pull x86 fixes from Borislav Petkov: - Make sure 32-bit syscall registers are properly sign-extended - Add detection for AMD's Zen5 generation CPUs and Intel's Clearwater Forest CPU model number - Make a stub function export non-GPL because it is part of the paravirt alternatives and that can be used by non-GPL code * tag 'x86_urgent_for_v6.8_rc2' of git://git.kernel.org/pub/scm/linux/kernel/git/tip/tip: x86/CPU/AMD: Add more models to X86_FEATURE_ZEN5 x86/entry/ia32: Ensure s32 is sign extended to s64 x86/cpu: Add model number for Intel Clearwater Forest processor x86/CPU/AMD: Add X86_FEATURE_ZEN5 x86/paravirt: Make BUG_func() usable by non-GPL modules