summaryrefslogtreecommitdiff
path: root/arch/powerpc/include/asm
AgeCommit message (Collapse)Author
2014-08-08Merge branch 'akpm' (second patchbomb from Andrew Morton)Linus Torvalds
Merge more incoming from Andrew Morton: "Two new syscalls: memfd_create in "shm: add memfd_create() syscall" kexec_file_load in "kexec: implementation of new syscall kexec_file_load" And: - Most (all?) of the rest of MM - Lots of the usual misc bits - fs/autofs4 - drivers/rtc - fs/nilfs - procfs - fork.c, exec.c - more in lib/ - rapidio - Janitorial work in filesystems: fs/ufs, fs/reiserfs, fs/adfs, fs/cramfs, fs/romfs, fs/qnx6. - initrd/initramfs work - "file sealing" and the memfd_create() syscall, in tmpfs - add pci_zalloc_consistent, use it in lots of places - MAINTAINERS maintenance - kexec feature work" * emailed patches from Andrew Morton <akpm@linux-foundation.org: (193 commits) MAINTAINERS: update nomadik patterns MAINTAINERS: update usb/gadget patterns MAINTAINERS: update DMA BUFFER SHARING patterns kexec: verify the signature of signed PE bzImage kexec: support kexec/kdump on EFI systems kexec: support for kexec on panic using new system call kexec-bzImage64: support for loading bzImage using 64bit entry kexec: load and relocate purgatory at kernel load time purgatory: core purgatory functionality purgatory/sha256: provide implementation of sha256 in purgaotory context kexec: implementation of new syscall kexec_file_load kexec: new syscall kexec_file_load() declaration kexec: make kexec_segment user buffer pointer a union resource: provide new functions to walk through resources kexec: use common function for kimage_normal_alloc() and kimage_crash_alloc() kexec: move segment verification code in a separate function kexec: rename unusebale_pages to unusable_pages kernel: build bin2c based on config option CONFIG_BUILD_BIN2C bin2c: move bin2c in scripts/basic shm: wait for pins to be released when sealing ...
2014-08-08arm64,ia64,ppc,s390,sh,tile,um,x86,mm: remove default gate areaAndy Lutomirski
The core mm code will provide a default gate area based on FIXADDR_USER_START and FIXADDR_USER_END if !defined(__HAVE_ARCH_GATE_AREA) && defined(AT_SYSINFO_EHDR). This default is only useful for ia64. arm64, ppc, s390, sh, tile, 64-bit UML, and x86_32 have their own code just to disable it. arm, 32-bit UML, and x86_64 have gate areas, but they have their own implementations. This gets rid of the default and moves the code into ia64. This should save some code on architectures without a gate area: it's now possible to inline the gate_area functions in the default case. Signed-off-by: Andy Lutomirski <luto@amacapital.net> Acked-by: Nathan Lynch <nathan_lynch@mentor.com> Acked-by: H. Peter Anvin <hpa@linux.intel.com> Acked-by: Benjamin Herrenschmidt <benh@kernel.crashing.org> [in principle] Acked-by: Richard Weinberger <richard@nod.at> [for um] Acked-by: Will Deacon <will.deacon@arm.com> [for arm64] Cc: Catalin Marinas <catalin.marinas@arm.com> Cc: Will Deacon <will.deacon@arm.com> Cc: Tony Luck <tony.luck@intel.com> Cc: Fenghua Yu <fenghua.yu@intel.com> Cc: Benjamin Herrenschmidt <benh@kernel.crashing.org> Cc: Paul Mackerras <paulus@samba.org> Cc: Martin Schwidefsky <schwidefsky@de.ibm.com> Cc: Heiko Carstens <heiko.carstens@de.ibm.com> Cc: Chris Metcalf <cmetcalf@tilera.com> Cc: Jeff Dike <jdike@addtoit.com> Cc: Richard Weinberger <richard@nod.at> Cc: Thomas Gleixner <tglx@linutronix.de> Cc: Ingo Molnar <mingo@redhat.com> Cc: "H. Peter Anvin" <hpa@zytor.com> Cc: Nathan Lynch <Nathan_Lynch@mentor.com> Signed-off-by: Andrew Morton <akpm@linux-foundation.org> Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
2014-08-08lib/scatterlist: make ARCH_HAS_SG_CHAIN an actual KconfigLaura Abbott
Rather than have architectures #define ARCH_HAS_SG_CHAIN in an architecture specific scatterlist.h, make it a proper Kconfig option and use that instead. At same time, remove the header files are are now mostly useless and just include asm-generic/scatterlist.h. [sfr@canb.auug.org.au: powerpc files now need asm/dma.h] Signed-off-by: Laura Abbott <lauraa@codeaurora.org> Acked-by: Thomas Gleixner <tglx@linutronix.de> [x86] Acked-by: Benjamin Herrenschmidt <benh@kernel.crashing.org> [powerpc] Acked-by: Heiko Carstens <heiko.carstens@de.ibm.com> Cc: Russell King <linux@arm.linux.org.uk> Cc: Tony Luck <tony.luck@intel.com> Cc: Fenghua Yu <fenghua.yu@intel.com> Cc: Paul Mackerras <paulus@samba.org> Cc: Ingo Molnar <mingo@redhat.com> Cc: "H. Peter Anvin" <hpa@zytor.com> Cc: "James E.J. Bottomley" <JBottomley@parallels.com> Cc: Martin Schwidefsky <schwidefsky@de.ibm.com> Signed-off-by: Stephen Rothwell <sfr@canb.auug.org.au> Signed-off-by: Andrew Morton <akpm@linux-foundation.org> Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
2014-08-07Merge tag 'for-linus' of git://git.kernel.org/pub/scm/virt/kvm/kvmLinus Torvalds
Pull second round of KVM changes from Paolo Bonzini: "Here are the PPC and ARM changes for KVM, which I separated because they had small conflicts (respectively within KVM documentation, and with 3.16-rc changes). Since they were all within the subsystem, I took care of them. Stephen Rothwell reported some snags in PPC builds, but they are all fixed now; the latest linux-next report was clean. New features for ARM include: - KVM VGIC v2 emulation on GICv3 hardware - Big-Endian support for arm/arm64 (guest and host) - Debug Architecture support for arm64 (arm32 is on Christoffer's todo list) And for PPC: - Book3S: Good number of LE host fixes, enable HV on LE - Book3S HV: Add in-guest debug support This release drops support for KVM on the PPC440. As a result, the PPC merge removes more lines than it adds. :) I also included an x86 change, since Davidlohr tied it to an independent bug report and the reporter quickly provided a Tested-by; there was no reason to wait for -rc2" * tag 'for-linus' of git://git.kernel.org/pub/scm/virt/kvm/kvm: (122 commits) KVM: Move more code under CONFIG_HAVE_KVM_IRQFD KVM: nVMX: fix "acknowledge interrupt on exit" when APICv is in use KVM: nVMX: Fix nested vmexit ack intr before load vmcs01 KVM: PPC: Enable IRQFD support for the XICS interrupt controller KVM: Give IRQFD its own separate enabling Kconfig option KVM: Move irq notifier implementation into eventfd.c KVM: Move all accesses to kvm::irq_routing into irqchip.c KVM: irqchip: Provide and use accessors for irq routing table KVM: Don't keep reference to irq routing table in irqfd struct KVM: PPC: drop duplicate tracepoint arm64: KVM: fix 64bit CP15 VM access for 32bit guests KVM: arm64: GICv3: mandate page-aligned GICV region arm64: KVM: GICv3: move system register access to msr_s/mrs_s KVM: PPC: PR: Handle FSCR feature deselects KVM: PPC: HV: Remove generic instruction emulation KVM: PPC: BOOKEHV: rename e500hv_spr to bookehv_spr KVM: PPC: Remove DCR handling KVM: PPC: Expose helper functions for data/inst faults KVM: PPC: Separate loadstore emulation from priv emulation KVM: PPC: Handle magic page in kvmppc_ld/st ...
2014-08-07Merge branch 'next' of ↵Linus Torvalds
git://git.kernel.org/pub/scm/linux/kernel/git/benh/powerpc Pull powerpc updates from Ben Herrenschmidt: "This is the powerpc new goodies for 3.17. The short story: The biggest bit is Michael removing all of pre-POWER4 processor support from the 64-bit kernel. POWER3 and rs64. This gets rid of a ton of old cruft that has been bitrotting in a long while. It was broken for quite a few versions already and nobody noticed. Nobody uses those machines anymore. While at it, he cleaned up a bunch of old dusty cabinets, getting rid of a skeletton or two. Then, we have some base VFIO support for KVM, which allows assigning of PCI devices to KVM guests, support for large 64-bit BARs on "powernv" platforms, support for HMI (Hardware Management Interrupts) on those same platforms, some sparse-vmemmap improvements (for memory hotplug), There is the usual batch of Freescale embedded updates (summary in the merge commit) and fixes here or there, I think that's it for the highlights" * 'next' of git://git.kernel.org/pub/scm/linux/kernel/git/benh/powerpc: (102 commits) powerpc/eeh: Export eeh_iommu_group_to_pe() powerpc/eeh: Add missing #ifdef CONFIG_IOMMU_API powerpc: Reduce scariness of interrupt frames in stack traces powerpc: start loop at section start of start in vmemmap_populated() powerpc: implement vmemmap_free() powerpc: implement vmemmap_remove_mapping() for BOOK3S powerpc: implement vmemmap_list_free() powerpc: Fail remap_4k_pfn() if PFN doesn't fit inside PTE powerpc/book3s: Fix endianess issue for HMI handling on napping cpus. powerpc/book3s: handle HMIs for cpus in nap mode. powerpc/powernv: Invoke opal call to handle hmi. powerpc/book3s: Add basic infrastructure to handle HMI in Linux. powerpc/iommu: Fix comments with it_page_shift powerpc/powernv: Handle compound PE in config accessors powerpc/powernv: Handle compound PE for EEH powerpc/powernv: Handle compound PE powerpc/powernv: Split ioda_eeh_get_state() powerpc/powernv: Allow to freeze PE powerpc/powernv: Enable M64 aperatus for PHB3 powerpc/eeh: Aux PE data for error log ...
2014-08-05Merge tag 'signed-kvm-ppc-next' of git://github.com/agraf/linux-2.6 into kvmPaolo Bonzini
Patch queue for ppc - 2014-08-01 Highlights in this release include: - BookE: Rework instruction fetch, not racy anymore now - BookE HV: Fix ONE_REG accessors for some in-hardware registers - Book3S: Good number of LE host fixes, enable HV on LE - Book3S: Some misc bug fixes - Book3S HV: Add in-guest debug support - Book3S HV: Preload cache lines on context switch - Remove 440 support Alexander Graf (31): KVM: PPC: Book3s PR: Disable AIL mode with OPAL KVM: PPC: Book3s HV: Fix tlbie compile error KVM: PPC: Book3S PR: Handle hyp doorbell exits KVM: PPC: Book3S PR: Fix ABIv2 on LE KVM: PPC: Book3S PR: Fix sparse endian checks PPC: Add asm helpers for BE 32bit load/store KVM: PPC: Book3S HV: Make HTAB code LE host aware KVM: PPC: Book3S HV: Access guest VPA in BE KVM: PPC: Book3S HV: Access host lppaca and shadow slb in BE KVM: PPC: Book3S HV: Access XICS in BE KVM: PPC: Book3S HV: Fix ABIv2 on LE KVM: PPC: Book3S HV: Enable for little endian hosts KVM: PPC: Book3S: Move vcore definition to end of kvm_arch struct KVM: PPC: Deflect page write faults properly in kvmppc_st KVM: PPC: Book3S: Stop PTE lookup on write errors KVM: PPC: Book3S: Add hack for split real mode KVM: PPC: Book3S: Make magic page properly 4k mappable KVM: PPC: Remove 440 support KVM: Rename and add argument to check_extension KVM: Allow KVM_CHECK_EXTENSION on the vm fd KVM: PPC: Book3S: Provide different CAPs based on HV or PR mode KVM: PPC: Implement kvmppc_xlate for all targets KVM: PPC: Move kvmppc_ld/st to common code KVM: PPC: Remove kvmppc_bad_hva() KVM: PPC: Use kvm_read_guest in kvmppc_ld KVM: PPC: Handle magic page in kvmppc_ld/st KVM: PPC: Separate loadstore emulation from priv emulation KVM: PPC: Expose helper functions for data/inst faults KVM: PPC: Remove DCR handling KVM: PPC: HV: Remove generic instruction emulation KVM: PPC: PR: Handle FSCR feature deselects Alexey Kardashevskiy (1): KVM: PPC: Book3S: Fix LPCR one_reg interface Aneesh Kumar K.V (4): KVM: PPC: BOOK3S: PR: Fix PURR and SPURR emulation KVM: PPC: BOOK3S: PR: Emulate virtual timebase register KVM: PPC: BOOK3S: PR: Emulate instruction counter KVM: PPC: BOOK3S: HV: Update compute_tlbie_rb to handle 16MB base page Anton Blanchard (2): KVM: PPC: Book3S HV: Fix ABIv2 indirect branch issue KVM: PPC: Assembly functions exported to modules need _GLOBAL_TOC() Bharat Bhushan (10): kvm: ppc: bookehv: Added wrapper macros for shadow registers kvm: ppc: booke: Use the shared struct helpers of SRR0 and SRR1 kvm: ppc: booke: Use the shared struct helpers of SPRN_DEAR kvm: ppc: booke: Add shared struct helpers of SPRN_ESR kvm: ppc: booke: Use the shared struct helpers for SPRN_SPRG0-7 kvm: ppc: Add SPRN_EPR get helper function kvm: ppc: bookehv: Save restore SPRN_SPRG9 on guest entry exit KVM: PPC: Booke-hv: Add one reg interface for SPRG9 KVM: PPC: Remove comment saying SPRG1 is used for vcpu pointer KVM: PPC: BOOKEHV: rename e500hv_spr to bookehv_spr Michael Neuling (1): KVM: PPC: Book3S HV: Add H_SET_MODE hcall handling Mihai Caraman (8): KVM: PPC: e500mc: Enhance tlb invalidation condition on vcpu schedule KVM: PPC: e500: Fix default tlb for victim hint KVM: PPC: e500: Emulate power management control SPR KVM: PPC: e500mc: Revert "add load inst fixup" KVM: PPC: Book3e: Add TLBSEL/TSIZE defines for MAS0/1 KVM: PPC: Book3s: Remove kvmppc_read_inst() function KVM: PPC: Allow kvmppc_get_last_inst() to fail KVM: PPC: Bookehv: Get vcpu's last instruction for emulation Paul Mackerras (4): KVM: PPC: Book3S: Controls for in-kernel sPAPR hypercall handling KVM: PPC: Book3S: Allow only implemented hcalls to be enabled or disabled KVM: PPC: Book3S PR: Take SRCU read lock around RTAS kvm_read_guest() call KVM: PPC: Book3S: Make kvmppc_ld return a more accurate error indication Stewart Smith (2): Split out struct kvmppc_vcore creation to separate function Use the POWER8 Micro Partition Prefetch Engine in KVM HV on POWER8 Conflicts: Documentation/virtual/kvm/api.txt
2014-08-05powerpc: Fail remap_4k_pfn() if PFN doesn't fit inside PTEMadhusudanan Kandasamy
remap_4k_pfn() silently truncates upper bits of input 4K PFN if it cannot be contained in PTE. This leads invalid memory mapping and could result in a system crash when the memory is accessed. This patch fails remap_4k_pfn() and returns -EINVAL if the input 4K PFN cannot be contained in PTE. V3 : Added parentheses to protect 'pfn' and entire macro as suggested by Brian. V2 : Rewritten to avoid helper function as suggested by Stephen Rothwell. Signed-off-by: Madhusudanan Kandasamy <kmadhu@linux.vnet.ibm.com> Signed-off-by: Benjamin Herrenschmidt <benh@kernel.crashing.org>
2014-08-05powerpc/powernv: Invoke opal call to handle hmi.Mahesh Salgaonkar
When we hit the HMI in Linux, invoke opal call to handle/recover from HMI errors in real mode and then in virtual mode during check_irq_replay() invoke opal_poll_events()/opal_do_notifier() to retrieve HMI event from OPAL and act accordingly. Now that we are ready to handle HMI interrupt directly in linux, remove the HMI interrupt registration with firmware. Signed-off-by: Mahesh Salgaonkar <mahesh@linux.vnet.ibm.com> Signed-off-by: Benjamin Herrenschmidt <benh@kernel.crashing.org>
2014-08-05powerpc/book3s: Add basic infrastructure to handle HMI in Linux.Mahesh Salgaonkar
Handle Hypervisor Maintenance Interrupt (HMI) in Linux. This patch implements basic infrastructure to handle HMI in Linux host. The design is to invoke opal handle hmi in real mode for recovery and set irq_pending when we hit HMI. During check_irq_replay pull opal hmi event and print hmi info on console. Signed-off-by: Mahesh Salgaonkar <mahesh@linux.vnet.ibm.com> Signed-off-by: Benjamin Herrenschmidt <benh@kernel.crashing.org>
2014-08-05powerpc/powernv: Allow to freeze PEGavin Shan
The patch synchronizes header file with firmware to have new OPAL API opal_pci_eeh_freeze_set(), which is used to freeze the specified PE in order to support "compound" PE. Signed-off-by: Gavin Shan <gwshan@linux.vnet.ibm.com> Signed-off-by: Benjamin Herrenschmidt <benh@kernel.crashing.org>
2014-08-05powerpc/powernv: Enable M64 aperatus for PHB3Guo Chao
This patch enables M64 aperatus for PHB3. We already had platform hook (ppc_md.pcibios_window_alignment) to affect the PCI resource assignment done in PCI core so that each PE's M32 resource was built on basis of M32 segment size. Similarly, we're using that for M64 assignment on basis of M64 segment size. * We're using last M64 BAR to cover M64 aperatus, and it's shared by all 256 PEs. * We don't support P7IOC yet. However, some function callbacks are added to (struct pnv_phb) so that we can reuse them on P7IOC in future. * PE, corresponding to PCI bus with large M64 BAR device attached, might span multiple M64 segments. We introduce "compound" PE to cover the case. The compound PE is a list of PEs and the master PE is used as before. The slave PEs are just for MMIO isolation. Signed-off-by: Guo Chao <yan@linux.vnet.ibm.com> Signed-off-by: Benjamin Herrenschmidt <benh@kernel.crashing.org>
2014-08-05powerpc/eeh: Aux PE data for error logGavin Shan
The patch allows PE (struct eeh_pe) instance to have auxillary data, whose size is configurable on basis of platform. For PowerNV, the auxillary data will be used to cache PHB diag-data for that PE (frozen PE or fenced PHB). In turn, we can retrieve the diag-data at any later points. It's useful for the case of VFIO PCI devices where the error log should be cached, and then be retrieved by the guest at later point. Also, it can avoid PHB diag-data overwritting if another frozen PE reported and the previous diag-data isn't fetched by guest. Signed-off-by: Gavin Shan <gwshan@linux.vnet.ibm.com> Signed-off-by: Benjamin Herrenschmidt <benh@kernel.crashing.org>
2014-08-05powerpc/eeh: Make diag-data not endian dependentGavin Shan
It's followup of commit ddf0322a ("powerpc/powernv: Fix endianness problems in EEH"). The patch helps to get non-endian-dependent diag-data. Cc: Guo Chao <yan@linux.vnet.ibm.com> Signed-off-by: Gavin Shan <gwshan@linux.vnet.ibm.com> Signed-off-by: Benjamin Herrenschmidt <benh@kernel.crashing.org>
2014-08-05powerpc/eeh: Selectively enable IO for error logGavin Shan
According to the experiment I did, PCI config access is blocked on P7IOC frozen PE by hardware, but PHB3 doesn't do that. That means we always get 0xFF's while dumping PCI config space of the frozen PE on P7IOC. We don't have the problem on PHB3. So we have to enable I/O prioir to collecting error log. Otherwise, meaningless 0xFF's are always returned. The patch fixes it by EEH flag (EEH_ENABLE_IO_FOR_LOG), which is selectively set to indicate the case for: P7IOC on PowerNV platform, pSeries platform. Signed-off-by: Gavin Shan <gwshan@linux.vnet.ibm.com> Signed-off-by: Benjamin Herrenschmidt <benh@kernel.crashing.org>
2014-08-05powerpc/eeh: Refactor EEH flag accessorsGavin Shan
There are multiple global EEH flags. Almost each flag has its own accessor, which doesn't make sense. The patch refactors EEH flag accessors so that they look unified: eeh_add_flag(): Add EEH flag eeh_clear_flag(): Clear EEH flag eeh_has_flag(): Check if one specific flag has been set eeh_enabled(): Check if EEH functionality has been enabled Signed-off-by: Gavin Shan <gwshan@linux.vnet.ibm.com> Signed-off-by: Benjamin Herrenschmidt <benh@kernel.crashing.org>
2014-08-05powerpc/eeh: EEH support for VFIO PCI deviceGavin Shan
The patch exports functions to be used by new VFIO ioctl command, which will be introduced in subsequent patch, to support EEH functinality for VFIO PCI devices. Signed-off-by: Gavin Shan <gwshan@linux.vnet.ibm.com> Acked-by: Alexander Graf <agraf@suse.de> Signed-off-by: Benjamin Herrenschmidt <benh@kernel.crashing.org>
2014-08-05powerpc/eeh: Avoid event on passed PEGavin Shan
We must not handle EEH error on devices which are passed to somebody else. Instead, we expect that the frozen device owner detects an EEH error and recovers from it. This avoids EEH error handling on passed through devices so the device owner gets a chance to handle them. Signed-off-by: Gavin Shan <gwshan@linux.vnet.ibm.com> Acked-by: Alexander Graf <agraf@suse.de> Signed-off-by: Benjamin Herrenschmidt <benh@kernel.crashing.org>
2014-08-05Merge remote-tracking branch 'scott/next' into nextBenjamin Herrenschmidt
Scott writes: Highlights include e6500 hardware threading support, an e6500 TLB erratum workaround, corenet error reporting, support for a new board, and some minor fixes.
2014-08-04Merge branch 'locking-core-for-linus' of ↵Linus Torvalds
git://git.kernel.org/pub/scm/linux/kernel/git/tip/tip Pull locking updates from Ingo Molnar: "The main changes in this cycle are: - big rtmutex and futex cleanup and robustification from Thomas Gleixner - mutex optimizations and refinements from Jason Low - arch_mutex_cpu_relax() removal and related cleanups - smaller lockdep tweaks" * 'locking-core-for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/tip/tip: (23 commits) arch, locking: Ciao arch_mutex_cpu_relax() locking/lockdep: Only ask for /proc/lock_stat output when available locking/mutexes: Optimize mutex trylock slowpath locking/mutexes: Try to acquire mutex only if it is unlocked locking/mutexes: Delete the MUTEX_SHOW_NO_WAITER macro locking/mutexes: Correct documentation on mutex optimistic spinning rtmutex: Make the rtmutex tester depend on BROKEN futex: Simplify futex_lock_pi_atomic() and make it more robust futex: Split out the first waiter attachment from lookup_pi_state() futex: Split out the waiter check from lookup_pi_state() futex: Use futex_top_waiter() in lookup_pi_state() futex: Make unlock_pi more robust rtmutex: Avoid pointless requeueing in the deadlock detection chain walk rtmutex: Cleanup deadlock detector debug logic rtmutex: Confine deadlock logic to futex rtmutex: Simplify remove_waiter() rtmutex: Document pi chain walk rtmutex: Clarify the boost/deboost part rtmutex: No need to keep task ref for lock owner check rtmutex: Simplify and document try_to_take_rtmutex() ...
2014-07-31KVM: PPC: PR: Handle FSCR feature deselectsAlexander Graf
We handle FSCR feature bits (well, TAR only really today) lazily when the guest starts using them. So when a guest activates the bit and later uses that feature we enable it for real in hardware. However, when the guest stops using that bit we don't stop setting it in hardware. That means we can potentially lose a trap that the guest expects to happen because it thinks a feature is not active. This patch adds support to drop TAR when then guest turns it off in FSCR. While at it it also restricts FSCR access to 64bit systems - 32bit ones don't have it. Signed-off-by: Alexander Graf <agraf@suse.de>
2014-07-30KVM: PPC: BOOKEHV: rename e500hv_spr to bookehv_sprBharat Bhushan
This are not specific to e500hv but applicable for bookehv (As per comment from Scott Wood on my patch "kvm: ppc: bookehv: Added wrapper macros for shadow registers") Signed-off-by: Bharat Bhushan <Bharat.Bhushan@freescale.com> Signed-off-by: Alexander Graf <agraf@suse.de>
2014-07-29powerpc/e6500: Add support for hardware threadsAndy Fleming
The general idea is that each core will release all of its threads into the secondary thread startup code, which will eventually wait in the secondary core holding area, for the appropriate bit in the PACA to be set. The kick_cpu function pointer will set that bit in the PACA, and thus "release" the core/thread to boot. We also need to do a few things that U-Boot normally does for CPUs (like enable branch prediction). Signed-off-by: Andy Fleming <afleming@freescale.com> [scottwood@freescale.com: various changes, including only enabling threads if Linux wants to kick them] Signed-off-by: Scott Wood <scottwood@freescale.com>
2014-07-29powerpc/booke: Define MSR bits the same way as reg.hScott Wood
This ensures that all MSR definitions are consistently unsigned long, and that MSR_CM does not become 0xffffffff80000000 (this is usually harmless because MSR is 32-bit on booke and is mainly noticeable when debugging, but still I'd rather avoid it). Signed-off-by: Scott Wood <scottwood@freescale.com>
2014-07-28KVM: PPC: Remove DCR handlingAlexander Graf
DCR handling was only needed for 440 KVM. Since we removed it, we can also remove handling of DCR accesses. Signed-off-by: Alexander Graf <agraf@suse.de>
2014-07-28KVM: PPC: Expose helper functions for data/inst faultsAlexander Graf
We're going to implement guest code interpretation in KVM for some rare corner cases. This code needs to be able to inject data and instruction faults into the guest when it encounters them. Expose generic APIs to do this in a reasonably subarch agnostic fashion. Signed-off-by: Alexander Graf <agraf@suse.de>
2014-07-28KVM: PPC: Separate loadstore emulation from priv emulationAlexander Graf
Today the instruction emulator can get called via 2 separate code paths. It can either be called by MMIO emulation detection code or by privileged instruction traps. This is bad, as both code paths prepare the environment differently. For MMIO emulation we already know the virtual address we faulted on, so instructions there don't have to actually fetch that information. Split out the two separate use cases into separate files. Signed-off-by: Alexander Graf <agraf@suse.de>
2014-07-28KVM: PPC: Handle magic page in kvmppc_ld/stAlexander Graf
We use kvmppc_ld and kvmppc_st to emulate load/store instructions that may as well access the magic page. Special case it out so that we can properly access it. Signed-off-by: Alexander Graf <agraf@suse.de>
2014-07-28KVM: PPC: Move kvmppc_ld/st to common codeAlexander Graf
We have enough common infrastructure now to resolve GVA->GPA mappings at runtime. With this we can move our book3s specific helpers to load / store in guest virtual address space to common code as well. Signed-off-by: Alexander Graf <agraf@suse.de>
2014-07-28KVM: PPC: Implement kvmppc_xlate for all targetsAlexander Graf
We have a nice API to find the translated GPAs of a GVA including protection flags. So far we only use it on Book3S, but there's no reason the same shouldn't be used on BookE as well. Implement a kvmppc_xlate() version for BookE and clean it up to make it more readable in general. Signed-off-by: Alexander Graf <agraf@suse.de>
2014-07-28KVM: PPC: BOOK3S: HV: Update compute_tlbie_rb to handle 16MB base pageAneesh Kumar K.V
When calculating the lower bits of AVA field, use the shift count based on the base page size. Also add the missing segment size and remove stale comment. Signed-off-by: Aneesh Kumar K.V <aneesh.kumar@linux.vnet.ibm.com> Acked-by: Paul Mackerras <paulus@samba.org> Signed-off-by: Alexander Graf <agraf@suse.de>
2014-07-28Use the POWER8 Micro Partition Prefetch Engine in KVM HV on POWER8Stewart Smith
The POWER8 processor has a Micro Partition Prefetch Engine, which is a fancy way of saying "has way to store and load contents of L2 or L2+MRU way of L3 cache". We initiate the storing of the log (list of addresses) using the logmpp instruction and start restore by writing to a SPR. The logmpp instruction takes parameters in a single 64bit register: - starting address of the table to store log of L2/L2+L3 cache contents - 32kb for L2 - 128kb for L2+L3 - Aligned relative to maximum size of the table (32kb or 128kb) - Log control (no-op, L2 only, L2 and L3, abort logout) We should abort any ongoing logging before initiating one. To initiate restore, we write to the MPPR SPR. The format of what to write to the SPR is similar to the logmpp instruction parameter: - starting address of the table to read from (same alignment requirements) - table size (no data, until end of table) - prefetch rate (from fastest possible to slower. about every 8, 16, 24 or 32 cycles) The idea behind loading and storing the contents of L2/L3 cache is to reduce memory latency in a system that is frequently swapping vcores on a physical CPU. The best case scenario for doing this is when some vcores are doing very cache heavy workloads. The worst case is when they have about 0 cache hits, so we just generate needless memory operations. This implementation just does L2 store/load. In my benchmarks this proves to be useful. Benchmark 1: - 16 core POWER8 - 3x Ubuntu 14.04LTS guests (LE) with 8 VCPUs each - No split core/SMT - two guests running sysbench memory test. sysbench --test=memory --num-threads=8 run - one guest running apache bench (of default HTML page) ab -n 490000 -c 400 http://localhost/ This benchmark aims to measure performance of real world application (apache) where other guests are cache hot with their own workloads. The sysbench memory benchmark does pointer sized writes to a (small) memory buffer in a loop. In this benchmark with this patch I can see an improvement both in requests per second (~5%) and in mean and median response times (again, about 5%). The spread of minimum and maximum response times were largely unchanged. benchmark 2: - Same VM config as benchmark 1 - all three guests running sysbench memory benchmark This benchmark aims to see if there is a positive or negative affect to this cache heavy benchmark. Although due to the nature of the benchmark (stores) we may not see a difference in performance, but rather hopefully an improvement in consistency of performance (when vcore switched in, don't have to wait many times for cachelines to be pulled in) The results of this benchmark are improvements in consistency of performance rather than performance itself. With this patch, the few outliers in duration go away and we get more consistent performance in each guest. benchmark 3: - same 3 guests and CPU configuration as benchmark 1 and 2. - two idle guests - 1 guest running STREAM benchmark This scenario also saw performance improvement with this patch. On Copy and Scale workloads from STREAM, I got 5-6% improvement with this patch. For Add and triad, it was around 10% (or more). benchmark 4: - same 3 guests as previous benchmarks - two guests running sysbench --memory, distinctly different cache heavy workload - one guest running STREAM benchmark. Similar improvements to benchmark 3. benchmark 5: - 1 guest, 8 VCPUs, Ubuntu 14.04 - Host configured with split core (SMT8, subcores-per-core=4) - STREAM benchmark In this benchmark, we see a 10-20% performance improvement across the board of STREAM benchmark results with this patch. Based on preliminary investigation and microbenchmarks by Prerna Saxena <prerna@linux.vnet.ibm.com> Signed-off-by: Stewart Smith <stewart@linux.vnet.ibm.com> Acked-by: Paul Mackerras <paulus@samba.org> Signed-off-by: Alexander Graf <agraf@suse.de>
2014-07-28KVM: PPC: Remove 440 supportAlexander Graf
The 440 target hasn't been properly functioning for a few releases and before I was the only one who fixes a very serious bug that indicates to me that nobody used it before either. Furthermore KVM on 440 is slow to the extent of unusable. We don't have to carry along completely unused code. Remove 440 and give us one less thing to worry about. Signed-off-by: Alexander Graf <agraf@suse.de>
2014-07-28KVM: PPC: Remove comment saying SPRG1 is used for vcpu pointerBharat Bhushan
Scott Wood pointed out that We are no longer using SPRG1 for vcpu pointer, but using SPRN_SPRG_THREAD <=> SPRG3 (thread->vcpu). So this comment is not valid now. Note: SPRN_SPRG3R is not supported (do not see any need as of now), and if we want to support this in future then we have to shift to using SPRG1 for VCPU pointer. Signed-off-by: Bharat Bhushan <Bharat.Bhushan@freescale.com> Signed-off-by: Alexander Graf <agraf@suse.de>
2014-07-28kvm: ppc: bookehv: Save restore SPRN_SPRG9 on guest entry exitBharat Bhushan
SPRN_SPRG is used by debug interrupt handler, so this is required for debug support. Signed-off-by: Bharat Bhushan <Bharat.Bhushan@freescale.com> Signed-off-by: Alexander Graf <agraf@suse.de>
2014-07-28KVM: PPC: Allow kvmppc_get_last_inst() to failMihai Caraman
On book3e, guest last instruction is read on the exit path using load external pid (lwepx) dedicated instruction. This load operation may fail due to TLB eviction and execute-but-not-read entries. This patch lay down the path for an alternative solution to read the guest last instruction, by allowing kvmppc_get_lat_inst() function to fail. Architecture specific implmentations of kvmppc_load_last_inst() may read last guest instruction and instruct the emulation layer to re-execute the guest in case of failure. Make kvmppc_get_last_inst() definition common between architectures. Signed-off-by: Mihai Caraman <mihai.caraman@freescale.com> Signed-off-by: Alexander Graf <agraf@suse.de>
2014-07-28KVM: PPC: Book3e: Add TLBSEL/TSIZE defines for MAS0/1Mihai Caraman
Add mising defines MAS0_GET_TLBSEL() and MAS1_GET_TSIZE() for Book3E. Signed-off-by: Mihai Caraman <mihai.caraman@freescale.com> Signed-off-by: Alexander Graf <agraf@suse.de>
2014-07-28kvm: ppc: Add SPRN_EPR get helper functionBharat Bhushan
kvmppc_set_epr() is already defined in asm/kvm_ppc.h, So rename and move get_epr helper function to same file. Signed-off-by: Bharat Bhushan <Bharat.Bhushan@freescale.com> [agraf: remove duplicate return] Signed-off-by: Alexander Graf <agraf@suse.de>
2014-07-28kvm: ppc: booke: Add shared struct helpers of SPRN_ESRBharat Bhushan
Add and use kvmppc_set_esr() and kvmppc_get_esr() helper functions Signed-off-by: Bharat Bhushan <Bharat.Bhushan@freescale.com> Signed-off-by: Alexander Graf <agraf@suse.de>
2014-07-28kvm: ppc: bookehv: Added wrapper macros for shadow registersBharat Bhushan
There are shadow registers like, GSPRG[0-3], GSRR0, GSRR1 etc on BOOKE-HV and these shadow registers are guest accessible. So these shadow registers needs to be updated on BOOKE-HV. This patch adds new macro for get/set helper of shadow register . Signed-off-by: Bharat Bhushan <Bharat.Bhushan@freescale.com> Signed-off-by: Alexander Graf <agraf@suse.de>
2014-07-28KVM: PPC: Book3S: Make magic page properly 4k mappableAlexander Graf
The magic page is defined as a 4k page of per-vCPU data that is shared between the guest and the host to accelerate accesses to privileged registers. However, when the host is using 64k page size granularity we weren't quite as strict about that rule anymore. Instead, we partially treated all of the upper 64k as magic page and mapped only the uppermost 4k with the actual magic contents. This works well enough for Linux which doesn't use any memory in kernel space in the upper 64k, but Mac OS X got upset. So this patch makes magic page actually stay in a 4k range even on 64k page size hosts. This patch fixes magic page usage with Mac OS X (using MOL) on 64k PAGE_SIZE hosts for me. Signed-off-by: Alexander Graf <agraf@suse.de>
2014-07-28KVM: PPC: Book3S: Add hack for split real modeAlexander Graf
Today we handle split real mode by mapping both instruction and data faults into a special virtual address space that only exists during the split mode phase. This is good enough to catch 32bit Linux guests that use split real mode for copy_from/to_user. In this case we're always prefixed with 0xc0000000 for our instruction pointer and can map the user space process freely below there. However, that approach fails when we're running KVM inside of KVM. Here the 1st level last_inst reader may well be in the same virtual page as a 2nd level interrupt handler. It also fails when running Mac OS X guests. Here we have a 4G/4G split, so a kernel copy_from/to_user implementation can easily overlap with user space addresses. The architecturally correct way to fix this would be to implement an instruction interpreter in KVM that kicks in whenever we go into split real mode. This interpreter however would not receive a great amount of testing and be a lot of bloat for a reasonably isolated corner case. So I went back to the drawing board and tried to come up with a way to make split real mode work with a single flat address space. And then I realized that we could get away with the same trick that makes it work for Linux: Whenever we see an instruction address during split real mode that may collide, we just move it higher up the virtual address space to a place that hopefully does not collide (keep your fingers crossed!). That approach does work surprisingly well. I am able to successfully run Mac OS X guests with KVM and QEMU (no split real mode hacks like MOL) when I apply a tiny timing probe hack to QEMU. I'd say this is a win over even more broken split real mode :). Signed-off-by: Alexander Graf <agraf@suse.de>
2014-07-28KVM: PPC: Book3S: Move vcore definition to end of kvm_arch structAlexander Graf
When building KVM with a lot of vcores (NR_CPUS is big), we can potentially get out of the ld immediate range for dereferences inside that struct. Move the array to the end of our kvm_arch struct. This fixes compilation issues with NR_CPUS=2048 for me. Signed-off-by: Alexander Graf <agraf@suse.de>
2014-07-28KVM: PPC: e500: Emulate power management control SPRMihai Caraman
For FSL e6500 core the kernel uses power management SPR register (PWRMGTCR0) to enable idle power down for cores and devices by setting up the idle count period at boot time. With the host already controlling the power management configuration the guest could simply benefit from it, so emulate guest request as a general store. Signed-off-by: Mihai Caraman <mihai.caraman@freescale.com> Signed-off-by: Alexander Graf <agraf@suse.de>
2014-07-28KVM: PPC: Book3S HV: Make HTAB code LE host awareAlexander Graf
When running on an LE host all data structures are kept in little endian byte order. However, the HTAB still needs to be maintained in big endian. So every time we access any HTAB we need to make sure we do so in the right byte order. Fix up all accesses to manually byte swap. Signed-off-by: Alexander Graf <agraf@suse.de>
2014-07-28PPC: Add asm helpers for BE 32bit load/storeAlexander Graf
From assembly code we might not only have to explicitly BE access 64bit values, but sometimes also 32bit ones. Add helpers that allow for easy use of lwzx/stwx in their respective byte-reverse or native form. Signed-off-by: Alexander Graf <agraf@suse.de> CC: Benjamin Herrenschmidt <benh@kernel.crashing.org>
2014-07-28KVM: PPC: e500: Fix default tlb for victim hintMihai Caraman
Tlb search operation used for victim hint relies on the default tlb set by the host. When hardware tablewalk support is enabled in the host, the default tlb is TLB1 which leads KVM to evict the bolted entry. Set and restore the default tlb when searching for victim hint. Signed-off-by: Mihai Caraman <mihai.caraman@freescale.com> Reviewed-by: Scott Wood <scottwood@freescale.com> Signed-off-by: Alexander Graf <agraf@suse.de>
2014-07-28KVM: PPC: Book3S HV: Add H_SET_MODE hcall handlingMichael Neuling
This adds support for the H_SET_MODE hcall. This hcall is a multiplexer that has several functions, some of which are called rarely, and some which are potentially called very frequently. Here we add support for the functions that set the debug registers CIABR (Completed Instruction Address Breakpoint Register) and DAWR/DAWRX (Data Address Watchpoint Register and eXtension), since they could be updated by the guest as often as every context switch. This also adds a kvmppc_power8_compatible() function to test to see if a guest is compatible with POWER8 or not. The CIABR and DAWR/X only exist on POWER8. Signed-off-by: Michael Neuling <mikey@neuling.org> Signed-off-by: Paul Mackerras <paulus@samba.org> Signed-off-by: Alexander Graf <agraf@suse.de>
2014-07-28KVM: PPC: Book3S: Allow only implemented hcalls to be enabled or disabledPaul Mackerras
This adds code to check that when the KVM_CAP_PPC_ENABLE_HCALL capability is used to enable or disable in-kernel handling of an hcall, that the hcall is actually implemented by the kernel. If not an EINVAL error is returned. This also checks the default-enabled list of hcalls and prints a warning if any hcall there is not actually implemented. Signed-off-by: Paul Mackerras <paulus@samba.org> Signed-off-by: Alexander Graf <agraf@suse.de>
2014-07-28KVM: PPC: Book3S: Controls for in-kernel sPAPR hypercall handlingPaul Mackerras
This provides a way for userspace controls which sPAPR hcalls get handled in the kernel. Each hcall can be individually enabled or disabled for in-kernel handling, except for H_RTAS. The exception for H_RTAS is because userspace can already control whether individual RTAS functions are handled in-kernel or not via the KVM_PPC_RTAS_DEFINE_TOKEN ioctl, and because the numeric value for H_RTAS is out of the normal sequence of hcall numbers. Hcalls are enabled or disabled using the KVM_ENABLE_CAP ioctl for the KVM_CAP_PPC_ENABLE_HCALL capability on the file descriptor for the VM. The args field of the struct kvm_enable_cap specifies the hcall number in args[0] and the enable/disable flag in args[1]; 0 means disable in-kernel handling (so that the hcall will always cause an exit to userspace) and 1 means enable. Enabling or disabling in-kernel handling of an hcall is effective across the whole VM. The ability for KVM_ENABLE_CAP to be used on a VM file descriptor on PowerPC is new, added by this commit. The KVM_CAP_ENABLE_CAP_VM capability advertises that this ability exists. When a VM is created, an initial set of hcalls are enabled for in-kernel handling. The set that is enabled is the set that have an in-kernel implementation at this point. Any new hcall implementations from this point onwards should not be added to the default set without a good reason. No distinction is made between real-mode and virtual-mode hcall implementations; the one setting controls them both. Signed-off-by: Paul Mackerras <paulus@samba.org> Signed-off-by: Alexander Graf <agraf@suse.de>
2014-07-28KVM: PPC: Book3s HV: Fix tlbie compile errorAlexander Graf
Some compilers complain about uninitialized variables in the compute_tlbie_rb function. When you follow the code path you'll realize that we'll never get to that point, but the compiler isn't all that smart. So just default to 4k page sizes for everything, making the compiler happy and the code slightly easier to read. Signed-off-by: Alexander Graf <agraf@suse.de> Acked-by: Paul Mackerras <paulus@samba.org>