summaryrefslogtreecommitdiff
path: root/arch/s390
AgeCommit message (Collapse)Author
2022-07-11s390/pci: stash dtsm and maxstblMatthew Rosato
Store information about what IOAT designation types are supported by underlying hardware as well as the largest store block size allowed. These values will be needed by passthrough. Reviewed-by: Niklas Schnelle <schnelle@linux.ibm.com> Reviewed-by: Pierre Morel <pmorel@linux.ibm.com> Reviewed-by: Christian Borntraeger <borntraeger@linux.ibm.com> Signed-off-by: Matthew Rosato <mjrosato@linux.ibm.com> Link: https://lore.kernel.org/r/20220606203325.110625-10-mjrosato@linux.ibm.com Signed-off-by: Christian Borntraeger <borntraeger@linux.ibm.com>
2022-07-11s390/pci: stash associated GISA designationMatthew Rosato
For passthrough devices, we will need to know the GISA designation of the guest if interpretation facilities are to be used. Setup to stash this in the zdev and set a default of 0 (no GISA designation) for now; a subsequent patch will set a valid GISA designation for passthrough devices. Also, extend mpcific routines to specify this stashed designation as part of the mpcific command. Reviewed-by: Pierre Morel <pmorel@linux.ibm.com> Reviewed-by: Niklas Schnelle <schnelle@linux.ibm.com> Reviewed-by: Christian Borntraeger <borntraeger@de.ibm.com> Signed-off-by: Matthew Rosato <mjrosato@linux.ibm.com> Link: https://lore.kernel.org/r/20220606203325.110625-9-mjrosato@linux.ibm.com Signed-off-by: Christian Borntraeger <borntraeger@linux.ibm.com>
2022-07-11s390/pci: externalize the SIC operation controls and routineMatthew Rosato
A subsequent patch will be issuing SIC from KVM -- export the necessary routine and make the operation control definitions available from a header. Because the routine will now be exported, let's rename __zpci_set_irq_ctrl to zpci_set_irq_ctrl and get rid of the zero'd iib wrapper function of the same name. Reviewed-by: Niklas Schnelle <schnelle@linux.ibm.com> Reviewed-by: Claudio Imbrenda <imbrenda@linux.ibm.com> Reviewed-by: Pierre Morel <pmorel@linux.ibm.com> Signed-off-by: Matthew Rosato <mjrosato@linux.ibm.com> Link: https://lore.kernel.org/r/20220606203325.110625-8-mjrosato@linux.ibm.com Signed-off-by: Christian Borntraeger <borntraeger@linux.ibm.com>
2022-07-11s390/airq: allow for airq structure that uses an input vectorMatthew Rosato
When doing device passthrough where interrupts are being forwarded from host to guest, we wish to use a pinned section of guest memory as the vector (the same memory used by the guest as the vector). To accomplish this, add a new parameter for airq_iv_create which allows passing an existing vector to be used instead of allocating a new one. The caller is responsible for ensuring the vector is pinned in memory as well as for unpinning the memory when the vector is no longer needed. A subsequent patch will use this new parameter for zPCI interpretation. Reviewed-by: Thomas Huth <thuth@redhat.com> Reviewed-by: Pierre Morel <pmorel@linux.ibm.com> Reviewed-by: Claudio Imbrenda <imbrenda@linux.ibm.com> Acked-by: Cornelia Huck <cohuck@redhat.com> Signed-off-by: Matthew Rosato <mjrosato@linux.ibm.com> Link: https://lore.kernel.org/r/20220606203325.110625-7-mjrosato@linux.ibm.com Signed-off-by: Christian Borntraeger <borntraeger@linux.ibm.com>
2022-07-11s390/airq: pass more TPI info to airq handlersMatthew Rosato
A subsequent patch will introduce an airq handler that requires additional TPI information beyond directed vs floating, so pass the entire tpi_info structure via the handler. Only pci actually uses this information today, for the other airq handlers this is effectively a no-op. Reviewed-by: Eric Farman <farman@linux.ibm.com> Reviewed-by: Claudio Imbrenda <imbrenda@linux.ibm.com> Reviewed-by: Pierre Morel <pmorel@linux.ibm.com> Reviewed-by: Thomas Huth <thuth@redhat.com> Acked-by: Christian Borntraeger <borntraeger@linux.ibm.com> Acked-by: Cornelia Huck <cohuck@redhat.com> Signed-off-by: Matthew Rosato <mjrosato@linux.ibm.com> Link: https://lore.kernel.org/r/20220606203325.110625-6-mjrosato@linux.ibm.com Signed-off-by: Christian Borntraeger <borntraeger@linux.ibm.com>
2022-07-11s390/sclp: detect the AISI facilityMatthew Rosato
Detect the Adapter Interruption Suppression Interpretation facility. Reviewed-by: Eric Farman <farman@linux.ibm.com> Reviewed-by: Christian Borntraeger <borntraeger@linux.ibm.com> Reviewed-by: Claudio Imbrenda <imbrenda@linux.ibm.com> Signed-off-by: Matthew Rosato <mjrosato@linux.ibm.com> Link: https://lore.kernel.org/r/20220606203325.110625-5-mjrosato@linux.ibm.com Signed-off-by: Christian Borntraeger <borntraeger@linux.ibm.com>
2022-07-11s390/sclp: detect the AENI facilityMatthew Rosato
Detect the Adapter Event Notification Interpretation facility. Reviewed-by: Eric Farman <farman@linux.ibm.com> Reviewed-by: Christian Borntraeger <borntraeger@linux.ibm.com> Reviewed-by: Claudio Imbrenda <imbrenda@linux.ibm.com> Signed-off-by: Matthew Rosato <mjrosato@linux.ibm.com> Link: https://lore.kernel.org/r/20220606203325.110625-4-mjrosato@linux.ibm.com Signed-off-by: Christian Borntraeger <borntraeger@linux.ibm.com>
2022-07-11s390/sclp: detect the AISII facilityMatthew Rosato
Detect the Adapter Interruption Source ID Interpretation facility. Reviewed-by: Eric Farman <farman@linux.ibm.com> Reviewed-by: Christian Borntraeger <borntraeger@linux.ibm.com> Reviewed-by: Claudio Imbrenda <imbrenda@linux.ibm.com> Signed-off-by: Matthew Rosato <mjrosato@linux.ibm.com> Link: https://lore.kernel.org/r/20220606203325.110625-3-mjrosato@linux.ibm.com Signed-off-by: Christian Borntraeger <borntraeger@linux.ibm.com>
2022-07-11s390/sclp: detect the zPCI load/store interpretation facilityMatthew Rosato
Detect the zPCI Load/Store Interpretation facility. Reviewed-by: Eric Farman <farman@linux.ibm.com> Reviewed-by: Christian Borntraeger <borntraeger@linux.ibm.com> Reviewed-by: Claudio Imbrenda <imbrenda@linux.ibm.com> Signed-off-by: Matthew Rosato <mjrosato@linux.ibm.com> Link: https://lore.kernel.org/r/20220606203325.110625-2-mjrosato@linux.ibm.com Signed-off-by: Christian Borntraeger <borntraeger@linux.ibm.com>
2022-06-30bitops: unify non-atomic bitops prototypes across architecturesAlexander Lobakin
Currently, there is a mess with the prototypes of the non-atomic bitops across the different architectures: ret bool, int, unsigned long nr int, long, unsigned int, unsigned long addr volatile unsigned long *, volatile void * Thankfully, it doesn't provoke any bugs, but can sometimes make the compiler angry when it's not handy at all. Adjust all the prototypes to the following standard: ret bool retval can be only 0 or 1 nr unsigned long native; signed makes no sense addr volatile unsigned long * bitmaps are arrays of ulongs Next, some architectures don't define 'arch_' versions as they don't support instrumentation, others do. To make sure there is always the same set of callables present and to ease any potential future changes, make them all follow the rule: * architecture-specific files define only 'arch_' versions; * non-prefixed versions can be defined only in asm-generic files; and place the non-prefixed definitions into a new file in asm-generic to be included by non-instrumented architectures. Finally, add some static assertions in order to prevent people from making a mess in this room again. I also used the %__always_inline attribute consistently, so that they always get resolved to the actual operations. Suggested-by: Andy Shevchenko <andriy.shevchenko@linux.intel.com> Signed-off-by: Alexander Lobakin <alexandr.lobakin@intel.com> Acked-by: Mark Rutland <mark.rutland@arm.com> Reviewed-by: Yury Norov <yury.norov@gmail.com> Reviewed-by: Andy Shevchenko <andriy.shevchenko@linux.intel.com> Signed-off-by: Yury Norov <yury.norov@gmail.com>
2022-06-30s390/qdio: Fix spelling mistakeZhang Jiaming
Change 'defineable' to 'definable'. Change 'paramater' to 'parameter'. Signed-off-by: Zhang Jiaming <jiaming@nfschina.com> Reviewed-by: Benjamin Block <bblock@linux.ibm.com> Link: https://lore.kernel.org/r/20220623060543.12870-1-jiaming@nfschina.com Signed-off-by: Alexander Gordeev <agordeev@linux.ibm.com>
2022-06-30s390/archrandom: simplify back to earlier design and initialize earlierJason A. Donenfeld
s390x appears to present two RNG interfaces: - a "TRNG" that gathers entropy using some hardware function; and - a "DRBG" that takes in a seed and expands it. Previously, the TRNG was wired up to arch_get_random_{long,int}(), but it was observed that this was being called really frequently, resulting in high overhead. So it was changed to be wired up to arch_get_random_ seed_{long,int}(), which was a reasonable decision. Later on, the DRBG was then wired up to arch_get_random_{long,int}(), with a complicated buffer filling thread, to control overhead and rate. Fortunately, none of the performance issues matter much now. The RNG always attempts to use arch_get_random_seed_{long,int}() first, which means a complicated implementation of arch_get_random_{long,int}() isn't really valuable or useful to have around. And it's only used when reseeding, which means it won't hit the high throughput complications that were faced before. So this commit returns to an earlier design of just calling the TRNG in arch_get_random_seed_{long,int}(), and returning false in arch_get_ random_{long,int}(). Part of what makes the simplification possible is that the RNG now seeds itself using the TRNG at bootup. But this only works if the TRNG is detected early in boot, before random_init() is called. So this commit also causes that check to happen in setup_arch(). Cc: stable@vger.kernel.org Cc: Harald Freudenberger <freude@linux.ibm.com> Cc: Ingo Franzki <ifranzki@linux.ibm.com> Cc: Juergen Christ <jchrist@linux.ibm.com> Cc: Heiko Carstens <hca@linux.ibm.com> Signed-off-by: Jason A. Donenfeld <Jason@zx2c4.com> Link: https://lore.kernel.org/r/20220610222023.378448-1-Jason@zx2c4.com Reviewed-by: Harald Freudenberger <freude@linux.ibm.com> Acked-by: Heiko Carstens <hca@linux.ibm.com> Signed-off-by: Alexander Gordeev <agordeev@linux.ibm.com>
2022-06-30s390/purgatory: remove duplicated build rule of kexec-purgatory.oMasahiro Yamada
This is equivalent to the pattern rule in scripts/Makefile.build. Having the dependency on $(obj)/purgatory.ro is enough. Signed-off-by: Masahiro Yamada <masahiroy@kernel.org> Link: https://lore.kernel.org/r/20220613170902.1775211-3-masahiroy@kernel.org Signed-off-by: Alexander Gordeev <agordeev@linux.ibm.com>
2022-06-30s390/purgatory: hard-code obj-y in MakefileMasahiro Yamada
The purgatory/ directory is entirely guarded in arch/s390/Kbuild. CONFIG_ARCH_HAS_KEXEC_PURGATORY is bool type. $(CONFIG_ARCH_HAS_KEXEC_PURGATORY) is always 'y' when Kbuild visits this Makefile for building. Signed-off-by: Masahiro Yamada <masahiroy@kernel.org> Link: https://lore.kernel.org/r/20220613170902.1775211-2-masahiroy@kernel.org Signed-off-by: Alexander Gordeev <agordeev@linux.ibm.com>
2022-06-30s390: remove unneeded 'select BUILD_BIN2C'Masahiro Yamada
Since commit 4c0f032d4963 ("s390/purgatory: Omit use of bin2c"), s390 builds the purgatory without using bin2c. Remove 'select BUILD_BIN2C' to avoid the unneeded build of bin2c. Fixes: 4c0f032d4963 ("s390/purgatory: Omit use of bin2c") Signed-off-by: Masahiro Yamada <masahiroy@kernel.org> Link: https://lore.kernel.org/r/20220613170902.1775211-1-masahiroy@kernel.org Signed-off-by: Alexander Gordeev <agordeev@linux.ibm.com>
2022-06-28treewide: uapi: Replace zero-length arrays with flexible-array membersGustavo A. R. Silva
There is a regular need in the kernel to provide a way to declare having a dynamically sized set of trailing elements in a structure. Kernel code should always use “flexible array members”[1] for these cases. The older style of one-element or zero-length arrays should no longer be used[2]. This code was transformed with the help of Coccinelle: (linux-5.19-rc2$ spatch --jobs $(getconf _NPROCESSORS_ONLN) --sp-file script.cocci --include-headers --dir . > output.patch) @@ identifier S, member, array; type T1, T2; @@ struct S { ... T1 member; T2 array[ - 0 ]; }; -fstrict-flex-arrays=3 is coming and we need to land these changes to prevent issues like these in the short future: ../fs/minix/dir.c:337:3: warning: 'strcpy' will always overflow; destination buffer has size 0, but the source string has length 2 (including NUL byte) [-Wfortify-source] strcpy(de3->name, "."); ^ Since these are all [0] to [] changes, the risk to UAPI is nearly zero. If this breaks anything, we can use a union with a new member name. [1] https://en.wikipedia.org/wiki/Flexible_array_member [2] https://www.kernel.org/doc/html/v5.16/process/deprecated.html#zero-length-and-one-element-arrays Link: https://github.com/KSPP/linux/issues/78 Build-tested-by: kernel test robot <lkp@intel.com> Link: https://lore.kernel.org/lkml/62b675ec.wKX6AOZ6cbE71vtF%25lkp@intel.com/ Acked-by: Dan Williams <dan.j.williams@intel.com> # For ndctl.h Signed-off-by: Gustavo A. R. Silva <gustavoars@kernel.org>
2022-06-27Merge branch 'master' into mm-stableakpm
2022-06-24jump_label: make initial NOP patching the special caseArd Biesheuvel
Instead of defaulting to patching NOP opcodes at init time, and leaving it to the architectures to override this if this is not needed, switch to a model where doing nothing is the default. This is the common case by far, as only MIPS requires NOP patching at init time. On all other architectures, the correct encodings are emitted by the compiler and so no initial patching is needed. Signed-off-by: Ard Biesheuvel <ardb@kernel.org> Signed-off-by: Peter Zijlstra (Intel) <peterz@infradead.org> Acked-by: Mark Rutland <mark.rutland@arm.com> Acked-by: Peter Zijlstra (Intel) <peterz@infradead.org> Link: https://lore.kernel.org/r/20220615154142.1574619-4-ardb@kernel.org
2022-06-24jump_label: mips: move module NOP patching into arch codeArd Biesheuvel
MIPS is the only remaining architecture that needs to patch jump label NOP encodings to initialize them at load time. So let's move the module patching part of that from generic code into arch/mips, and drop it from the others. Signed-off-by: Ard Biesheuvel <ardb@kernel.org> Signed-off-by: Peter Zijlstra (Intel) <peterz@infradead.org> Link: https://lore.kernel.org/r/20220615154142.1574619-3-ardb@kernel.org
2022-06-24jump_label: s390: avoid pointless initial NOP patchingArd Biesheuvel
Patching NOPs into other NOPs at boot time serves no purpose, so let's use the same NOP encodings at compile time and runtime. Signed-off-by: Ard Biesheuvel <ardb@kernel.org> Signed-off-by: Peter Zijlstra (Intel) <peterz@infradead.org> Link: https://lore.kernel.org/r/20220615154142.1574619-2-ardb@kernel.org
2022-06-23s390/pai: Fix multiple concurrent event installationThomas Richter
Two different events such as pai_crypto/KM_AES_128/ and pai_crypto/KM_AES_192/ can be installed multiple times on the same CPU and the events are executed concurrently: # perf stat -e pai_crypto/KM_AES_128/ -C0 -a -- sleep 5 & # sleep 2 # perf stat -e pai_crypto/KM_AES_192/ -C0 -a -- true This results in the first event being installed two times with two seconds delay. The kernel does install the second event after the first event has been deleted and re-added, as can be seen in the traces: 13:48:47.600350 paicrypt_start event 0x1007 (event KM_AES_128) 13:48:49.599359 paicrypt_stop event 0x1007 (event KM_AES_128) 13:48:49.599198 paicrypt_start event 0x1007 13:48:49.599199 paicrypt_start event 0x1008 13:48:49.599921 paicrypt_event_destroy event 0x1008 13:48:52.601507 paicrypt_event_destroy event 0x1007 This is caused by functions event_sched_in() and event_sched_out() which call the PMU's add() and start() functions on schedule_in and the PMU's stop() and del() functions on schedule_out. This is correct for events attached to processes. The pai_crypto events are system-wide events and not attached to processes. Since the kernel common code can not be changed easily, fix this issue and do not reset the event count value to zero each time the event is added and started. Instead use a flag and zero the event count value only when called immediately after the event has been initialized. Therefore only the first invocation of the the event's add() function initializes the event count value to zero. The following invocations of the event's add() function leave the current event count value untouched. Fixes: 39d62336f5c1 ("s390/pai: add support for cryptography counters") Reported-by: Sumanth Korikkar <sumanthk@linux.ibm.com> Signed-off-by: Thomas Richter <tmricht@linux.ibm.com> Acked-by: Sumanth Korikkar <sumanthk@linux.ibm.com> Signed-off-by: Alexander Gordeev <agordeev@linux.ibm.com>
2022-06-23s390/pai: Prevent invalid event number for pai_crypto PMUThomas Richter
The pai_crypto PMU has to check the event number. It has to be in the supported range. This is not the case, the lower limit is not checked. Fix this and obey the lower limit. Fixes: 39d62336f5c1 ("s390/pai: add support for cryptography counters") Signed-off-by: Thomas Richter <tmricht@linux.ibm.com> Suggested-by: Sumanth Korikkar <sumanthk@linux.ibm.com> Reviewed-by: Sumanth Korikkar <sumanthk@linux.ibm.com> Signed-off-by: Alexander Gordeev <agordeev@linux.ibm.com>
2022-06-23s390/cpumf: Handle events cycles and instructions identicalThomas Richter
Events CPU_CYCLES and INSTRUCTIONS can be submitted with two different perf_event attribute::type values: - PERF_TYPE_HARDWARE: when invoked via perf tool predefined events name cycles or cpu-cycles or instructions. - pmu->type: when invoked via perf tool event name cpu_cf/CPU_CYLCES/ or cpu_cf/INSTRUCTIONS/. This invocation also selects the PMU to which the event belongs. Handle both type of invocations identical for events CPU_CYLCES and INSTRUCTIONS. They address the same hardware. The result is different when event modifier exclude_kernel is also set. Invocation with event modifier for user space event counting fails. Output before: # perf stat -e cpum_cf/cpu_cycles/u -- true Performance counter stats for 'true': <not supported> cpum_cf/cpu_cycles/u 0.000761033 seconds time elapsed 0.000076000 seconds user 0.000725000 seconds sys # Output after: # perf stat -e cpum_cf/cpu_cycles/u -- true Performance counter stats for 'true': 349,613 cpum_cf/cpu_cycles/u 0.000844143 seconds time elapsed 0.000079000 seconds user 0.000800000 seconds sys # Fixes: 6a82e23f45fe ("s390/cpumf: Adjust registration of s390 PMU device drivers") Signed-off-by: Thomas Richter <tmricht@linux.ibm.com> Acked-by: Sumanth Korikkar <sumanthk@linux.ibm.com> [agordeev@linux.ibm.com corrected commit ID of Fixes commit] Signed-off-by: Alexander Gordeev <agordeev@linux.ibm.com>
2022-06-23s390/crash: make copy_oldmem_page() return number of bytes copiedAlexander Gordeev
Callback copy_oldmem_page() returns either error code or zero. Instead, it should return the error code or number of bytes copied. Fixes: df9694c7975f ("s390/dump: streamline oldmem copy functions") Reviewed-by: Alexander Egorenkov <egorenar@linux.ibm.com> Tested-by: Alexander Egorenkov <egorenar@linux.ibm.com> Signed-off-by: Alexander Gordeev <agordeev@linux.ibm.com>
2022-06-23s390/crash: add missing iterator advance in copy_oldmem_page()Alexander Gordeev
In case old memory was successfully copied the passed iterator should be advanced as well. Currently copy_oldmem_page() is always called with single-segment iterator. Should that ever change - copy_oldmem_user and copy_oldmem_kernel() functions would need a rework to deal with multi-segment iterators. Fixes: 5d8de293c224 ("vmcore: convert copy_oldmem_page() to take an iov_iter") Reviewed-by: Alexander Egorenkov <egorenar@linux.ibm.com> Tested-by: Alexander Egorenkov <egorenar@linux.ibm.com> Signed-off-by: Alexander Gordeev <agordeev@linux.ibm.com>
2022-06-16mm: avoid unnecessary page fault retires on shared memory typesPeter Xu
I observed that for each of the shared file-backed page faults, we're very likely to retry one more time for the 1st write fault upon no page. It's because we'll need to release the mmap lock for dirty rate limit purpose with balance_dirty_pages_ratelimited() (in fault_dirty_shared_page()). Then after that throttling we return VM_FAULT_RETRY. We did that probably because VM_FAULT_RETRY is the only way we can return to the fault handler at that time telling it we've released the mmap lock. However that's not ideal because it's very likely the fault does not need to be retried at all since the pgtable was well installed before the throttling, so the next continuous fault (including taking mmap read lock, walk the pgtable, etc.) could be in most cases unnecessary. It's not only slowing down page faults for shared file-backed, but also add more mmap lock contention which is in most cases not needed at all. To observe this, one could try to write to some shmem page and look at "pgfault" value in /proc/vmstat, then we should expect 2 counts for each shmem write simply because we retried, and vm event "pgfault" will capture that. To make it more efficient, add a new VM_FAULT_COMPLETED return code just to show that we've completed the whole fault and released the lock. It's also a hint that we should very possibly not need another fault immediately on this page because we've just completed it. This patch provides a ~12% perf boost on my aarch64 test VM with a simple program sequentially dirtying 400MB shmem file being mmap()ed and these are the time it needs: Before: 650.980 ms (+-1.94%) After: 569.396 ms (+-1.38%) I believe it could help more than that. We need some special care on GUP and the s390 pgfault handler (for gmap code before returning from pgfault), the rest changes in the page fault handlers should be relatively straightforward. Another thing to mention is that mm_account_fault() does take this new fault as a generic fault to be accounted, unlike VM_FAULT_RETRY. I explicitly didn't touch hmm_vma_fault() and break_ksm() because they do not handle VM_FAULT_RETRY even with existing code, so I'm literally keeping them as-is. Link: https://lkml.kernel.org/r/20220530183450.42886-1-peterx@redhat.com Signed-off-by: Peter Xu <peterx@redhat.com> Acked-by: Geert Uytterhoeven <geert@linux-m68k.org> Acked-by: Peter Zijlstra (Intel) <peterz@infradead.org> Acked-by: Johannes Weiner <hannes@cmpxchg.org> Acked-by: Vineet Gupta <vgupta@kernel.org> Acked-by: Guo Ren <guoren@kernel.org> Acked-by: Max Filippov <jcmvbkbc@gmail.com> Acked-by: Christian Borntraeger <borntraeger@linux.ibm.com> Acked-by: Michael Ellerman <mpe@ellerman.id.au> (powerpc) Acked-by: Catalin Marinas <catalin.marinas@arm.com> Reviewed-by: Alistair Popple <apopple@nvidia.com> Reviewed-by: Ingo Molnar <mingo@kernel.org> Acked-by: Russell King (Oracle) <rmk+kernel@armlinux.org.uk> [arm part] Acked-by: Heiko Carstens <hca@linux.ibm.com> Cc: Vasily Gorbik <gor@linux.ibm.com> Cc: Stafford Horne <shorne@gmail.com> Cc: David S. Miller <davem@davemloft.net> Cc: Johannes Berg <johannes@sipsolutions.net> Cc: Brian Cain <bcain@quicinc.com> Cc: Richard Henderson <rth@twiddle.net> Cc: Richard Weinberger <richard@nod.at> Cc: Benjamin Herrenschmidt <benh@kernel.crashing.org> Cc: Thomas Gleixner <tglx@linutronix.de> Cc: Janosch Frank <frankja@linux.ibm.com> Cc: Albert Ou <aou@eecs.berkeley.edu> Cc: Anton Ivanov <anton.ivanov@cambridgegreys.com> Cc: Dave Hansen <dave.hansen@linux.intel.com> Cc: Borislav Petkov <bp@alien8.de> Cc: Sven Schnelle <svens@linux.ibm.com> Cc: Andrea Arcangeli <aarcange@redhat.com> Cc: James Bottomley <James.Bottomley@HansenPartnership.com> Cc: Al Viro <viro@zeniv.linux.org.uk> Cc: Alexander Gordeev <agordeev@linux.ibm.com> Cc: Jonas Bonn <jonas@southpole.se> Cc: Will Deacon <will@kernel.org> Cc: Vlastimil Babka <vbabka@suse.cz> Cc: Michal Simek <monstr@monstr.eu> Cc: Matt Turner <mattst88@gmail.com> Cc: Paul Mackerras <paulus@samba.org> Cc: David Hildenbrand <david@redhat.com> Cc: Nicholas Piggin <npiggin@gmail.com> Cc: Palmer Dabbelt <palmer@dabbelt.com> Cc: Stefan Kristiansson <stefan.kristiansson@saunalahti.fi> Cc: Paul Walmsley <paul.walmsley@sifive.com> Cc: Ivan Kokshaysky <ink@jurassic.park.msu.ru> Cc: Chris Zankel <chris@zankel.net> Cc: Hugh Dickins <hughd@google.com> Cc: Dinh Nguyen <dinguyen@kernel.org> Cc: Rich Felker <dalias@libc.org> Cc: H. Peter Anvin <hpa@zytor.com> Cc: Andy Lutomirski <luto@kernel.org> Cc: Thomas Bogendoerfer <tsbogend@alpha.franken.de> Cc: Helge Deller <deller@gmx.de> Cc: Yoshinori Sato <ysato@users.osdn.me> Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
2022-06-15arch/*: Disable softirq stacks on PREEMPT_RT.Sebastian Andrzej Siewior
PREEMPT_RT preempts softirqs and the current implementation avoids do_softirq_own_stack() and only uses __do_softirq(). Disable the unused softirqs stacks on PREEMPT_RT to save some memory and ensure that do_softirq_own_stack() is not used bwcause it is not expected. Signed-off-by: Sebastian Andrzej Siewior <bigeasy@linutronix.de> Signed-off-by: Arnd Bergmann <arnd@arndb.de>
2022-06-10Merge tag 'for-linus-5.19a-rc2-tag' of ↵Linus Torvalds
git://git.kernel.org/pub/scm/linux/kernel/git/xen/tip Pull xen updates from Juergen Gross: - a small cleanup removing "export" of an __init function - a small series adding a new infrastructure for platform flags - a series adding generic virtio support for Xen guests (frontend side) * tag 'for-linus-5.19a-rc2-tag' of git://git.kernel.org/pub/scm/linux/kernel/git/xen/tip: xen: unexport __init-annotated xen_xlate_map_ballooned_pages() arm/xen: Assign xen-grant DMA ops for xen-grant DMA devices xen/grant-dma-ops: Retrieve the ID of backend's domain for DT devices xen/grant-dma-iommu: Introduce stub IOMMU driver dt-bindings: Add xen,grant-dma IOMMU description for xen-grant DMA ops xen/virtio: Enable restricted memory access using Xen grant mappings xen/grant-dma-ops: Add option to restrict memory access under Xen xen/grants: support allocating consecutive grants arm/xen: Introduce xen_setup_dma_ops() virtio: replace arch_has_restricted_virtio_memory_access() kernel: add platform_has() infrastructure
2022-06-09gcc-12: disable '-Warray-bounds' universally for nowLinus Torvalds
In commit 8b202ee21839 ("s390: disable -Warray-bounds") the s390 people disabled the '-Warray-bounds' warning for gcc-12, because the new logic in gcc would cause warnings for their use of the S390_lowcore macro, which accesses absolute pointers. It turns out gcc-12 has many other issues in this area, so this takes that s390 warning disable logic, and turns it into a kernel build config entry instead. Part of the intent is that we can make this all much more targeted, and use this conflig flag to disable it in only particular configurations that cause problems, with the s390 case as an example: select GCC12_NO_ARRAY_BOUNDS and we could do that for other configuration cases that cause issues. Or we could possibly use the CONFIG_CC_NO_ARRAY_BOUNDS thing in a more targeted way, and disable the warning only for particular uses: again the s390 case as an example: KBUILD_CFLAGS_DECOMPRESSOR += $(if $(CONFIG_CC_NO_ARRAY_BOUNDS),-Wno-array-bounds) but this ends up just doing it globally in the top-level Makefile, since the current issues are spread fairly widely all over: KBUILD_CFLAGS-$(CONFIG_CC_NO_ARRAY_BOUNDS) += -Wno-array-bounds We'll try to limit this later, since the gcc-12 problems are rare enough that *much* of the kernel can be built with it without disabling this warning. Cc: Kees Cook <keescook@chromium.org> Cc: Nathan Chancellor <nathan@kernel.org> Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
2022-06-08KVM: Move kvm_arch_vcpu_precreate() under kvm->lockZeng Guang
kvm_arch_vcpu_precreate() targets to handle arch specific VM resource to be prepared prior to the actual creation of vCPU. For example, x86 platform may need do per-VM allocation based on max_vcpu_ids at the first vCPU creation. It probably leads to concurrency control on this allocation as multiple vCPU creation could happen simultaneously. From the architectual point of view, it's necessary to execute kvm_arch_vcpu_precreate() under protect of kvm->lock. Currently only arm64, x86 and s390 have non-nop implementations at the stage of vCPU pre-creation. Remove the lock acquiring in s390's design and make sure all architecture can run kvm_arch_vcpu_precreate() safely under kvm->lock without recrusive lock issue. Suggested-by: Sean Christopherson <seanjc@google.com> Signed-off-by: Zeng Guang <guang.zeng@intel.com> Message-Id: <20220419154409.11842-1-guang.zeng@intel.com> Signed-off-by: Paolo Bonzini <pbonzini@redhat.com>
2022-06-07No need of likely/unlikely on calls of check_copy_size()Al Viro
it's inline and unlikely() inside of it (including the implicit one in WARN_ON_ONCE()) suffice to convince the compiler that getting false from check_copy_size() is unlikely. Spotted-by: Jens Axboe <axboe@kernel.dk> Reviewed-by: Christoph Hellwig <hch@lst.de> Reviewed-by: Christian Brauner (Microsoft) <brauner@kernel.org> Signed-off-by: Al Viro <viro@zeniv.linux.org.uk>
2022-06-07Merge tag 'kvm-s390-next-5.19-2' of ↵Paolo Bonzini
git://git.kernel.org/pub/scm/linux/kernel/git/kvms390/linux into HEAD KVM: s390: pvdump and selftest improvements - add an interface to provide a hypervisor dump for secure guests - improve selftests to show tests
2022-06-06virtio: replace arch_has_restricted_virtio_memory_access()Juergen Gross
Instead of using arch_has_restricted_virtio_memory_access() together with CONFIG_ARCH_HAS_RESTRICTED_VIRTIO_MEMORY_ACCESS, replace those with platform_has() and a new platform feature PLATFORM_VIRTIO_RESTRICTED_MEM_ACCESS. Signed-off-by: Juergen Gross <jgross@suse.com> Reviewed-by: Oleksandr Tyshchenko <oleksandr_tyshchenko@epam.com> Tested-by: Oleksandr Tyshchenko <oleksandr_tyshchenko@epam.com> # Arm64 only Reviewed-by: Christoph Hellwig <hch@lst.de> Acked-by: Borislav Petkov <bp@suse.de>
2022-06-04Merge tag 'bitmap-for-5.19-rc1' of https://github.com/norov/linuxLinus Torvalds
Pull bitmap updates from Yury Norov: - bitmap: optimize bitmap_weight() usage, from me - lib/bitmap.c make bitmap_print_bitmask_to_buf parseable, from Mauro Carvalho Chehab - include/linux/find: Fix documentation, from Anna-Maria Behnsen - bitmap: fix conversion from/to fix-sized arrays, from me - bitmap: Fix return values to be unsigned, from Kees Cook It has been in linux-next for at least a week with no problems. * tag 'bitmap-for-5.19-rc1' of https://github.com/norov/linux: (31 commits) nodemask: Fix return values to be unsigned bitmap: Fix return values to be unsigned KVM: x86: hyper-v: replace bitmap_weight() with hweight64() KVM: x86: hyper-v: fix type of valid_bank_mask ia64: cleanup remove_siblinginfo() drm/amd/pm: use bitmap_{from,to}_arr32 where appropriate KVM: s390: replace bitmap_copy with bitmap_{from,to}_arr64 where appropriate lib/bitmap: add test for bitmap_{from,to}_arr64 lib: add bitmap_{from,to}_arr64 lib/bitmap: extend comment for bitmap_(from,to)_arr32() include/linux/find: Fix documentation lib/bitmap.c make bitmap_print_bitmask_to_buf parseable MAINTAINERS: add cpumask and nodemask files to BITMAP_API arch/x86: replace nodes_weight with nodes_empty where appropriate mm/vmstat: replace cpumask_weight with cpumask_empty where appropriate clocksource: replace cpumask_weight with cpumask_empty in clocksource.c genirq/affinity: replace cpumask_weight with cpumask_empty where appropriate irq: mips: replace cpumask_weight with cpumask_empty where appropriate drm/i915/pmu: replace cpumask_weight with cpumask_empty where appropriate arch/x86: replace cpumask_weight with cpumask_empty where appropriate ...
2022-06-03Merge tag 'kthread-cleanups-for-v5.19' of ↵Linus Torvalds
git://git.kernel.org/pub/scm/linux/kernel/git/ebiederm/user-namespace Pull kthread updates from Eric Biederman: "This updates init and user mode helper tasks to be ordinary user mode tasks. Commit 40966e316f86 ("kthread: Ensure struct kthread is present for all kthreads") caused init and the user mode helper threads that call kernel_execve to have struct kthread allocated for them. This struct kthread going away during execve in turned made a use after free of struct kthread possible. Here, commit 343f4c49f243 ("kthread: Don't allocate kthread_struct for init and umh") is enough to fix the use after free and is simple enough to be backportable. The rest of the changes pass struct kernel_clone_args to clean things up and cause the code to make sense. In making init and the user mode helpers tasks purely user mode tasks I ran into two complications. The function task_tick_numa was detecting tasks without an mm by testing for the presence of PF_KTHREAD. The initramfs code in populate_initrd_image was using flush_delayed_fput to ensuere the closing of all it's file descriptors was complete, and flush_delayed_fput does not work in a userspace thread. I have looked and looked and more complications and in my code review I have not found any, and neither has anyone else with the code sitting in linux-next" * tag 'kthread-cleanups-for-v5.19' of git://git.kernel.org/pub/scm/linux/kernel/git/ebiederm/user-namespace: sched: Update task_tick_numa to ignore tasks without an mm fork: Stop allowing kthreads to call execve fork: Explicitly set PF_KTHREAD init: Deal with the init process being a user mode process fork: Generalize PF_IO_WORKER handling fork: Explicity test for idle tasks in copy_thread fork: Pass struct kernel_clone_args into copy_thread kthread: Don't allocate kthread_struct for init and umh
2022-06-03Merge tag 's390-5.19-2' of ↵Linus Torvalds
git://git.kernel.org/pub/scm/linux/kernel/git/s390/linux Pull more s390 updates from Heiko Carstens: "Just a couple of small improvements, bug fixes and cleanups: - Add Eric Farman as maintainer for s390 virtio drivers. - Improve machine check handling, and avoid incorrectly injecting a machine check into a kvm guest. - Add cond_resched() call to gmap page table walker in order to avoid possible huge latencies. Also use non-quiesing sske instruction to speed up storage key handling. - Add __GFP_NORETRY to KEXEC_CONTROL_MEMORY_GFP so s390 behaves similar like common code. - Get sie control block address from correct stack slot in perf event code. This fixes potential random memory accesses. - Change uaccess code so that the exception handler sets the result of get_user() and __get_kernel_nofault() to zero in case of a fault. Until now this was done via input parameters for inline assemblies. Doing it via fault handling is what most or even all other architectures are doing. - Couple of other small cleanups and fixes" * tag 's390-5.19-2' of git://git.kernel.org/pub/scm/linux/kernel/git/s390/linux: s390/stack: add union to reflect kvm stack slot usages s390/stack: merge empty stack frame slots s390/uaccess: whitespace cleanup s390/uaccess: use __noreturn instead of __attribute__((noreturn)) s390/uaccess: use exception handler to zero result on get_user() failure s390/uaccess: use symbolic names for inline assembler operands s390/mcck: isolate SIE instruction when setting CIF_MCCK_GUEST flag s390/mm: use non-quiescing sske for KVM switch to keyed guest s390/gmap: voluntarily schedule during key setting MAINTAINERS: Update s390 virtio-ccw s390/kexec: add __GFP_NORETRY to KEXEC_CONTROL_MEMORY_GFP s390/Kconfig.debug: fix indentation s390/Kconfig: fix indentation s390/perf: obtain sie_block from the right address s390: generate register offsets into pt_regs automatically s390: simplify early program check handler s390/crypto: fix scatterwalk_unmap() callers in AES-GCM
2022-06-03KVM: s390: replace bitmap_copy with bitmap_{from,to}_arr64 where appropriateYury Norov
Copying bitmaps from/to 64-bit arrays with bitmap_copy is not safe on 32-bit BE machines. Use designated functions instead. CC: Alexander Gordeev <agordeev@linux.ibm.com> CC: Andy Shevchenko <andriy.shevchenko@linux.intel.com> CC: Christian Borntraeger <borntraeger@linux.ibm.com> CC: Claudio Imbrenda <imbrenda@linux.ibm.com> CC: David Hildenbrand <david@redhat.com> CC: Heiko Carstens <hca@linux.ibm.com> CC: Janosch Frank <frankja@linux.ibm.com> CC: Rasmus Villemoes <linux@rasmusvillemoes.dk> CC: Sven Schnelle <svens@linux.ibm.com> CC: Vasily Gorbik <gor@linux.ibm.com> Signed-off-by: Yury Norov <yury.norov@gmail.com> Reviewed-by: David Hildenbrand <david@redhat.com>
2022-06-02Merge tag 'livepatching-for-5.19' of ↵Linus Torvalds
git://git.kernel.org/pub/scm/linux/kernel/git/livepatching/livepatching Pull livepatching cleanup from Petr Mladek: - Remove duplicated livepatch code [Christophe] * tag 'livepatching-for-5.19' of git://git.kernel.org/pub/scm/linux/kernel/git/livepatching/livepatching: livepatch: Remove klp_arch_set_pc() and asm/livepatch.h
2022-06-01KVM: s390: Add KVM_CAP_S390_PROTECTED_DUMPJanosch Frank
The capability indicates dump support for protected VMs. Signed-off-by: Janosch Frank <frankja@linux.ibm.com> Reviewed-by: Claudio Imbrenda <imbrenda@linux.ibm.com> Link: https://lore.kernel.org/r/20220517163629.3443-9-frankja@linux.ibm.com Message-Id: <20220517163629.3443-9-frankja@linux.ibm.com> Signed-off-by: Christian Borntraeger <borntraeger@linux.ibm.com>
2022-06-01KVM: s390: Add CPU dump functionalityJanosch Frank
The previous patch introduced the per-VM dump functions now let's focus on dumping the VCPU state via the newly introduced KVM_S390_PV_CPU_COMMAND ioctl which mirrors the VM UV ioctl and can be extended with new commands later. Signed-off-by: Janosch Frank <frankja@linux.ibm.com> Reviewed-by: Claudio Imbrenda <imbrenda@linux.ibm.com> Link: https://lore.kernel.org/r/20220517163629.3443-8-frankja@linux.ibm.com Message-Id: <20220517163629.3443-8-frankja@linux.ibm.com> Signed-off-by: Christian Borntraeger <borntraeger@linux.ibm.com>
2022-06-01KVM: s390: Add configuration dump functionalityJanosch Frank
Sometimes dumping inside of a VM fails, is unavailable or doesn't yield the required data. For these occasions we dump the VM from the outside, writing memory and cpu data to a file. Up to now PV guests only supported dumping from the inside of the guest through dumpers like KDUMP. A PV guest can be dumped from the hypervisor but the data will be stale and / or encrypted. To get the actual state of the PV VM we need the help of the Ultravisor who safeguards the VM state. New UV calls have been added to initialize the dump, dump storage state data, dump cpu data and complete the dump process. We expose these calls in this patch via a new UV ioctl command. The sensitive parts of the dump data are encrypted, the dump key is derived from the Customer Communication Key (CCK). This ensures that only the owner of the VM who has the CCK can decrypt the dump data. The memory is dumped / read via a normal export call and a re-import after the dump initialization is not needed (no re-encryption with a dump key). Signed-off-by: Janosch Frank <frankja@linux.ibm.com> Reviewed-by: Claudio Imbrenda <imbrenda@linux.ibm.com> Link: https://lore.kernel.org/r/20220517163629.3443-7-frankja@linux.ibm.com Message-Id: <20220517163629.3443-7-frankja@linux.ibm.com> Signed-off-by: Christian Borntraeger <borntraeger@linux.ibm.com>
2022-06-01KVM: s390: pv: Add query dump informationJanosch Frank
The dump API requires userspace to provide buffers into which we will store data. The dump information added in this patch tells userspace how big those buffers need to be. Signed-off-by: Janosch Frank <frankja@linux.ibm.com> Reviewed-by: Claudio Imbrenda <imbrenda@linux.ibm.com> Reviewed-by: Steffen Eiden <seiden@linux.ibm.com> Link: https://lore.kernel.org/r/20220517163629.3443-6-frankja@linux.ibm.com Message-Id: <20220517163629.3443-6-frankja@linux.ibm.com> Signed-off-by: Christian Borntraeger <borntraeger@linux.ibm.com>
2022-06-01KVM: s390: pv: Add dump support definitionsJanosch Frank
Let's add the constants and structure definitions needed for the dump support. Signed-off-by: Janosch Frank <frankja@linux.ibm.com> Reviewed-by: Claudio Imbrenda <imbrenda@linux.ibm.com> Reviewed-by: Steffen Eiden <seiden@linux.ibm.com> Link: https://lore.kernel.org/r/20220517163629.3443-5-frankja@linux.ibm.com Message-Id: <20220517163629.3443-5-frankja@linux.ibm.com> Signed-off-by: Christian Borntraeger <borntraeger@linux.ibm.com>
2022-06-01KVM: s390: pv: Add query interfaceJanosch Frank
Some of the query information is already available via sysfs but having a IOCTL makes the information easier to retrieve. Signed-off-by: Janosch Frank <frankja@linux.ibm.com> Reviewed-by: Claudio Imbrenda <imbrenda@linux.ibm.com> Reviewed-by: Steffen Eiden <seiden@linux.ibm.com> Link: https://lore.kernel.org/r/20220517163629.3443-4-frankja@linux.ibm.com Message-Id: <20220517163629.3443-4-frankja@linux.ibm.com> Signed-off-by: Christian Borntraeger <borntraeger@linux.ibm.com>
2022-06-01s390/uv: Add dump fields to queryJanosch Frank
The new dump feature requires us to know how much memory is needed for the "dump storage state" and "dump finalize" ultravisor call. These values are reported via the UV query call. Signed-off-by: Janosch Frank <frankja@linux.ibm.com> Reviewed-by: Claudio Imbrenda <imbrenda@linux.ibm.com> Reviewed-by: Steffen Eiden <seiden@linux.ibm.com> Link: https://lore.kernel.org/r/20220517163629.3443-3-frankja@linux.ibm.com Message-Id: <20220517163629.3443-3-frankja@linux.ibm.com> Signed-off-by: Christian Borntraeger <borntraeger@linux.ibm.com>
2022-06-01s390/uv: Add SE hdr query informationJanosch Frank
We have information about the supported se header version and pcf bits so let's expose it via the sysfs files. Signed-off-by: Janosch Frank <frankja@linux.ibm.com> Reviewed-by: Claudio Imbrenda <imbrenda@linux.ibm.com> Reviewed-by: Steffen Eiden <seiden@linux.ibm.com> Link: https://lore.kernel.org/r/20220517163629.3443-2-frankja@linux.ibm.com Message-Id: <20220517163629.3443-2-frankja@linux.ibm.com> Signed-off-by: Christian Borntraeger <borntraeger@linux.ibm.com>
2022-06-01s390/stack: add union to reflect kvm stack slot usagesHeiko Carstens
Add a union which describes how the empty stack slots are being used by kvm and perf. This should help to avoid another bug like the one which was fixed with commit c9bfb460c3e4 ("s390/perf: obtain sie_block from the right address"). Reviewed-by: Nico Boehr <nrb@linux.ibm.com> Tested-by: Nico Boehr <nrb@linux.ibm.com> Signed-off-by: Heiko Carstens <hca@linux.ibm.com>
2022-06-01s390/stack: merge empty stack frame slotsHeiko Carstens
Merge empty1 and empty2 arrays within the stack frame to one single array. This is possible since with commit 42b01a553a56 ("s390: always use the packed stack layout") the alternative stack frame layout is gone. Reviewed-by: Nico Boehr <nrb@linux.ibm.com> Reviewed-by: Alexander Gordeev <agordeev@linux.ibm.com> Signed-off-by: Heiko Carstens <hca@linux.ibm.com>
2022-06-01s390/uaccess: whitespace cleanupHeiko Carstens
Whitespace cleanup to get rid if some checkpatch findings, but mainly to have consistent coding style within the header file again. Signed-off-by: Heiko Carstens <hca@linux.ibm.com>
2022-06-01s390/uaccess: use __noreturn instead of __attribute__((noreturn))Heiko Carstens
Signed-off-by: Heiko Carstens <hca@linux.ibm.com>