path: root/arch
Age | Commit message | Author
2012-09-10 | powerpc/eeh: Introduce eeh_pe struct | Gavin Shan
As defined in PAPR 2.4, a Partitionable Endpoint (PE) is an I/O subtree that can be treated as a unit for the purposes of partitioning and error recovery. Therefore, the eeh core should be aware of PEs. With the eeh_pe struct, we can support PEs explicitly. Furthermore, it makes the data much more centralized.

Another important reason is for the eeh core to support multiple platforms. Some of them, like pSeries, figure out PEs through OF nodes, while others, like powernv, have to do that through the PCI bus/device tree. With explicit PE support, the eeh core will be implemented based on the centralized data, and the platform-dependent implementations figure it out in whatever way is feasible for them.

When the struct was designed, the following factors were taken into account:

* Reflecting the relationships of PEs. A PE might have a parent as well as children.
* Reflecting the association of PE and (eeh) devices.
* PEs have PHB boundary.
* A PE should have a unique address assigned in the corresponding PHB domain.

Signed-off-by: Gavin Shan <shangw@linux.vnet.ibm.com>
Signed-off-by: Benjamin Herrenschmidt <benh@kernel.crashing.org>
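The design factors in this commit message can be pictured with a small userspace sketch. This is not the kernel's actual struct eeh_pe: every name below (pe_node, pe_add_child, pe_count_devs and the field names) is a hypothetical stand-in chosen only to mirror the four factors.

```c
#include <assert.h>
#include <stddef.h>

/* Hypothetical stand-in for a PE; NOT the kernel's real struct eeh_pe. */
struct pe_node {
    int phb_id;                   /* PHB domain the PE lives in (PEs do not cross PHBs) */
    int addr;                     /* PE address, meant to be unique within phb_id */
    struct pe_node *parent;       /* NULL for a top-level PE */
    struct pe_node *first_child;  /* head of the child list */
    struct pe_node *next_sibling; /* next PE under the same parent */
    int ndevs;                    /* number of (eeh) devices attached to this PE */
};

/* Link child under parent; both must belong to the same PHB. */
void pe_add_child(struct pe_node *parent, struct pe_node *child)
{
    assert(parent->phb_id == child->phb_id);
    child->parent = parent;
    child->next_sibling = parent->first_child;
    parent->first_child = child;
}

/* Count the devices attached to a PE and to every PE below it. */
int pe_count_devs(const struct pe_node *pe)
{
    int total = pe->ndevs;
    for (const struct pe_node *c = pe->first_child; c != NULL; c = c->next_sibling)
        total += pe_count_devs(c);
    return total;
}
```

Keeping the parent and child links in the PE itself is what lets a recovery path treat a whole subtree as one unit, which is the point the message makes about PAPR.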
2012-09-10 | powerpc/eeh: More logs for EEH initialization | Gavin Shan
The patch adds more logs to the EEH initialization functions for debugging purposes. Also, the machine type (pSeries) is checked in the platform initialization to ensure that it is invoked on the correct platform. Signed-off-by: Gavin Shan <shangw@linux.vnet.ibm.com> Signed-off-by: Benjamin Herrenschmidt <benh@kernel.crashing.org>
2012-09-10 | powerpc/eeh: Use slab to allocate eeh devices | Gavin Shan
The EEH initialization functions have been postponed until slab/slub are ready, so we use slab/slub to allocate the memory chunks for newly created EEH devices. That saves a lot of memory. The patch also cleans up by replacing "kmalloc" with "kzalloc" so that we needn't clear the allocated memory chunk explicitly. Signed-off-by: Gavin Shan <shangw@linux.vnet.ibm.com> Signed-off-by: Benjamin Herrenschmidt <benh@kernel.crashing.org>
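The kmalloc-to-kzalloc cleanup has a close userspace analogue, sketched here with malloc/memset versus calloc because the kernel allocators are not available outside the kernel. The struct and function names are hypothetical and do not match the real struct eeh_dev.

```c
#include <assert.h>
#include <stdlib.h>
#include <string.h>

/* Hypothetical device record; the real struct eeh_dev has different fields. */
struct fake_eeh_dev {
    int mode;
    int config_addr;
    void *pe;
};

/* Old pattern: allocate, then clear explicitly (like kmalloc + memset). */
struct fake_eeh_dev *dev_alloc_then_clear(void)
{
    struct fake_eeh_dev *d = malloc(sizeof(*d));
    if (d != NULL)
        memset(d, 0, sizeof(*d));
    return d;
}

/* New pattern: a single call that returns zeroed memory (like kzalloc). */
struct fake_eeh_dev *dev_alloc_zeroed(void)
{
    return calloc(1, sizeof(struct fake_eeh_dev));
}
```

Both functions hand back an all-zero record; the second simply cannot forget the clearing step.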
2012-09-10 | powerpc/eeh: Move EEH initialization around | Gavin Shan
Currently, we have 3 phases for EEH initialization on pSeries platform. All of them are done through builtin functions: platform initialization, EEH device creation, and EEH subsystem enablement. All of them are done no later than ppc_md.setup_arch. That means that the slab/slub isn't ready yet, so we have to allocate memory chunks on basis of PAGE_SIZE for those dynamically created EEH devices. That's pretty expensive. In order to utilize slab/slub for memory allocation, we have to move the EEH initialization functions around, but all of them should be called after slab is ready. Signed-off-by: Gavin Shan <shangw@linux.vnet.ibm.com> Signed-off-by: Benjamin Herrenschmidt <benh@kernel.crashing.org>
2012-09-10 | powerpc: Initialise paca.data_offset with poison | Michael Ellerman
It's possible for the cpu_possible_mask to change between the time we initialise the pacas and the time we setup per_cpu areas. Obviously impossible cpus shouldn't ever be running, but stranger things have happened. So be paranoid and initialise data_offset with a poison value in case we don't set it up later. Based on a patch from Anton Blanchard. Signed-off-by: Michael Ellerman <michael@ellerman.id.au> Signed-off-by: Benjamin Herrenschmidt <benh@kernel.crashing.org>
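The poisoning idea can be sketched in plain C as a two-step setup. The poison constant, the struct and the function names below are assumptions made for illustration; the commit message does not state which value the kernel uses.

```c
#include <assert.h>
#include <stdint.h>

#define NCPUS 4
/* Hypothetical poison pattern; the value used by the actual patch is not given here. */
#define OFFSET_POISON UINT64_C(0xfeeeeeeeeeeeeeee)

struct fake_paca {
    uint64_t data_offset;
};

/* Step 1: when the pacas are initialised, fill every slot with poison. */
void pacas_init(struct fake_paca *p, int n)
{
    for (int i = 0; i < n; i++)
        p[i].data_offset = OFFSET_POISON;
}

/* Step 2: later, assign real offsets only to CPUs marked possible. */
void percpu_setup(struct fake_paca *p, const int *possible, int n)
{
    for (int i = 0; i < n; i++)
        if (possible[i])
            p[i].data_offset = (uint64_t)i * 0x10000;
}
```

A CPU that was skipped in step 2 keeps the poison, so any stray use of its offset is far more likely to fault visibly than to corrupt another CPU's data silently.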
2012-09-09 | powerpc: Option FB_FSL_DIU is not really optional for mpc512x | Paul Gortmaker
In powerpc randconfig builds, this keeps showing up:

CC arch/powerpc/platforms/512x/mpc512x_shared.o
arch/powerpc/platforms/512x/mpc512x_shared.c:70:9: warning: 'enum fsl_diu_monitor_port' declared inside parameter list
arch/powerpc/platforms/512x/mpc512x_shared.c:70:9: warning: its scope is only this definition or declaration, which is probably not what you want
arch/powerpc/platforms/512x/mpc512x_shared.c:69:56: error: parameter 1 ('port') has incomplete type
arch/powerpc/platforms/512x/mpc512x_shared.c:69:5: warning: function declaration isn't a prototype
arch/powerpc/platforms/512x/mpc512x_shared.c:84:9: warning: 'enum fsl_diu_monitor_port' declared inside parameter list
arch/powerpc/platforms/512x/mpc512x_shared.c:83:56: error: parameter 1 ('port') has incomplete type
arch/powerpc/platforms/512x/mpc512x_shared.c:83:6: warning: function declaration isn't a prototype
arch/powerpc/platforms/512x/mpc512x_shared.c:88:36: warning: 'enum fsl_diu_monitor_port' declared inside parameter list
arch/powerpc/platforms/512x/mpc512x_shared.c:88:57: error: parameter 1 ('port') has incomplete type
arch/powerpc/platforms/512x/mpc512x_shared.c:88:6: warning: function declaration isn't a prototype
arch/powerpc/platforms/512x/mpc512x_shared.c:187:54: error: parameter 1 ('port') has incomplete type
arch/powerpc/platforms/512x/mpc512x_shared.c:187:1: error: return type is an incomplete type
arch/powerpc/platforms/512x/mpc512x_shared.c:187:1: warning: function declaration isn't a prototype
arch/powerpc/platforms/512x/mpc512x_shared.c: In function 'mpc512x_valid_monitor_port':
arch/powerpc/platforms/512x/mpc512x_shared.c:189:9: error: 'FSL_DIU_PORT_DVI' undeclared (first use in this function)
arch/powerpc/platforms/512x/mpc512x_shared.c:189:9: note: each undeclared identifier is reported only once for each function it appears in
arch/powerpc/platforms/512x/mpc512x_shared.c:189:2: warning: 'return' with a value, in function returning void
make[2]: *** [arch/powerpc/platforms/512x/mpc512x_shared.o] Error 1

The reason is that mpc512x_shared.c has a couple token #ifdef on FB_FSL_DIU/FB_FSL_DIU_MODULE, but they don't come close to masking all the DIU dependencies, as the above fail shows. Rather than sprinkle more pointless #ifdef in this file, just remove the existing two, and make FB_FSL_DIU part of the dependency. The mpc512x_defconfig already has the line "CONFIG_FB_FSL_DIU=y" so this change should be zero impact on real world configs.

Signed-off-by: Paul Gortmaker <paul.gortmaker@windriver.com>
Signed-off-by: Anatolij Gustschin <agust@denx.de>
2012-09-09 | powerpc: 512x: Fix mpc5121_clk_get() | Richard Weinberger
If try_module_get() fails, mpc5121_clk_get() might return a wrong clock. Signed-off-by: Richard Weinberger <richard@nod.at> Signed-off-by: Anatolij Gustschin <agust@denx.de>
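The message does not show the code, so the following is only an assumed illustration of this class of bug: a lookup that finds a matching entry must not return it if taking a reference on its owner fails. All names are hypothetical, fake_module_get() merely stands in for try_module_get(), and the sketch reports failure with NULL without claiming that this is what the real function returns.

```c
#include <assert.h>
#include <stddef.h>
#include <string.h>

struct fake_clk {
    const char *name;
    int owner_alive;   /* stands in for try_module_get() succeeding */
};

/* Stand-in for try_module_get(): nonzero on success. */
int fake_module_get(const struct fake_clk *c)
{
    return c->owner_alive;
}

/* Return the named clock, or NULL if it is missing or its owner cannot be pinned. */
struct fake_clk *fake_clk_get(struct fake_clk *tbl, size_t n, const char *name)
{
    for (size_t i = 0; i < n; i++) {
        if (strcmp(tbl[i].name, name) != 0)
            continue;
        if (!fake_module_get(&tbl[i]))
            return NULL;   /* never hand back a clock we failed to reference */
        return &tbl[i];
    }
    return NULL;
}
```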
2012-09-08 | Merge branch 'fixes-for-3.6' of git://git.linaro.org/people/mszyprowski/linux-dma-mapping | Linus Torvalds
Pull DMA-mapping fixes from Marek Szyprowski:

"Another set of fixes for ARM dma-mapping subsystem. Commit e9da6e9905e6 replaced custom consistent buffer remapping code with generic vmalloc areas. It however introduced some regressions caused by limited support for allocations in atomic context. This series contains fixes for those regressions. For some subplatforms the default, pre-allocated pool for atomic allocations turned out to be too small, so a function for setting its size has been added. Another set of patches adds support for atomic allocations to IOMMU-aware DMA-mapping implementation. The last part of this pull request contains two fixes for Contiguous Memory Allocator, which relax too strict requirements."

* 'fixes-for-3.6' of git://git.linaro.org/people/mszyprowski/linux-dma-mapping:
  ARM: dma-mapping: IOMMU allocates pages from atomic_pool with GFP_ATOMIC
  ARM: dma-mapping: Introduce __atomic_get_pages() for __iommu_get_pages()
  ARM: dma-mapping: Refactor out to introduce __in_atomic_pool
  ARM: dma-mapping: atomic_pool with struct page **pages
  ARM: Kirkwood: increase atomic coherent pool size
  ARM: DMA-Mapping: print warning when atomic coherent allocation fails
  ARM: DMA-Mapping: add function for setting coherent pool size from platform code
  ARM: relax conditions required for enabling Contiguous Memory Allocator
  mm: cma: fix alignment requirements for contiguous regions
2012-09-07 | powerpc: Use the XDABR hcall | Michael Neuling
We never use the XDABR hcall since we check for the DABR hcall first. The XDABR hcall is better since it also allows us to set the DABRX. Signed-off-by: Michael Neuling <mikey@neuling.org> Signed-off-by: Benjamin Herrenschmidt <benh@kernel.crashing.org>
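The ordering problem is easy to see in a small selection function. The feature-flag struct and the enum below are hypothetical stand-ins, not the kernel's firmware-feature API; the only point is that the more capable option has to be tested first or it can never be chosen.

```c
#include <assert.h>
#include <stdbool.h>

enum dabr_method { DABR_NONE, DABR_HCALL, XDABR_HCALL };

/* Hypothetical firmware feature flags. */
struct fw_features {
    bool has_dabr;
    bool has_xdabr;
};

/* Prefer XDABR because it can also set DABRX; fall back to plain DABR. */
enum dabr_method pick_dabr_method(struct fw_features f)
{
    if (f.has_xdabr)
        return XDABR_HCALL;
    if (f.has_dabr)
        return DABR_HCALL;
    return DABR_NONE;
}
```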
2012-09-07 | powerpc: Use consistent name info for arch_hw_breakpoint | Michael Neuling
Change bp_info to info to be consistent with the rest of this file. Signed-off-by: Michael Neuling <mikey@neuling.org> Signed-off-by: Benjamin Herrenschmidt <benh@kernel.crashing.org>
2012-09-07 | powerpc: Pack arch_hw_breakpoint to avoid holes in struct | Michael Neuling
No functional change Signed-off-by: Michael Neuling <mikey@neuling.org> Signed-off-by: Benjamin Herrenschmidt <benh@kernel.crashing.org>
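As a general illustration of what "holes" means here, the sketch below reorders the members of a made-up struct. The field names are hypothetical and are not the real layout of arch_hw_breakpoint. On typical 64-bit ABIs the first layout occupies 24 bytes and the second 16, although the C standard does not guarantee exact sizes.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

/* Small fields on either side of a 64-bit field: padding is likely after each one. */
struct bp_loose {
    uint8_t  type;
    uint64_t address;
    uint8_t  len;
};

/* Same members, largest first, so the small fields can share the tail of the struct. */
struct bp_tight {
    uint64_t address;
    uint8_t  type;
    uint8_t  len;
};
```

The two structs hold the same data, which is why such a change can be described as having no functional effect.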
2012-09-07 | powerpc: Export memory limit via device tree | Suzuki Poulose
The powerpc kernel doesn't export the memory limit enforced by the 'mem=' kernel parameter. This is required for building the ELF header in kexec-tools so that the vmcore captures only the used memory. On powerpc, kexec-tools depends on the device tree for memory-related information, unlike x86, where it uses /proc/iomem. Without this information, kexec-tools assumes the entire System RAM and creates an unnecessarily large vmcore. This patch exports the memory limit, if present, via the chosen/linux,memory-limit property, so that the vmcore can be limited to the memory limit. The prom_init code seems to export this value in the same node, but it doesn't actually appear there. Also, the memory_limit gets adjusted during the processing of the crashkernel= parameter. This patch makes sure we get the actual limit. kexec-tools will use the value to limit the 'end' of the memory regions. Tested this patch on ppc64 and ppc32 (ppc440) with a kexec-tools patch by Mahesh. Signed-off-by: Suzuki K. Poulose <suzuki@in.ibm.com> Tested-by: Mahesh J. Salgaonkar <mahesh@linux.vnet.ibm.com> Signed-off-by: Benjamin Herrenschmidt <benh@kernel.crashing.org>
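How a consumer might use such a limit to trim the 'end' of a memory region can be sketched as follows. This is not kexec-tools code: the struct, the function name, and the convention that a limit of 0 means "no limit" are all assumptions made for the example.

```c
#include <assert.h>
#include <stdint.h>

/* A physical memory range; 'end' is exclusive in this sketch. */
struct mem_region {
    uint64_t start;
    uint64_t end;
};

/*
 * Trim a region so that it does not extend past 'limit'.
 * Returns 1 if any part of the region survives and 0 if it lies
 * entirely at or above the limit. A limit of 0 is treated as "no limit".
 */
int clamp_region(struct mem_region *r, uint64_t limit)
{
    if (limit == 0)
        return 1;
    if (r->start >= limit)
        return 0;
    if (r->end > limit)
        r->end = limit;
    return 1;
}
```

Using a fixed 64-bit type for the limit in the sketch echoes the follow-up commit, which moves memory_limit to unsigned long long so the exported value has one size on both ppc32 and ppc64.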
2012-09-07 | powerpc: Change memory_limit from phys_addr_t to unsigned long long | Suzuki Poulose
There are some device-tree nodes whose values are of type phys_addr_t. The size of phys_addr_t varies depending on CONFIG_PHYS_ADDR_T_64BIT. Change these to a fixed unsigned long long for consistency. This patch makes the change only for memory_limit. The following variables still need the change: 1) kernel_end, crashk_size - in arch/powerpc/kernel/machine_kexec.c 2) (struct resource *)crashk_res.start - We could export a local static variable from machine_kexec.c. Changing the above values might break kexec-tools. So, I will fix kexec-tools first to handle the different sized values and then change the above. Suggested-by: Benjamin Herrenschmidt <benh@kernel.crashing.org> Signed-off-by: Suzuki K. Poulose <suzuki@in.ibm.com> Signed-off-by: Benjamin Herrenschmidt <benh@kernel.crashing.org>
2012-09-07 | powerpc: Fix build dependencies for c files requiring libfdt.h | Matthew McClintock
Several files in obj-plat depend on libfdt header file. Sometimes when building one can see the following issue. This patch adds libfdt as dependency to those object files

| In file included from arch/powerpc/boot/treeboot-iss4xx.c:33:0:
| arch/powerpc/boot/libfdt.h:854:1: error: unterminated comment
| In file included from arch/powerpc/boot/treeboot-iss4xx.c:33:0:
| arch/powerpc/boot/libfdt.h:1:0: error: unterminated #ifndef
| BOOTCC arch/powerpc/boot/inffast.o
| make[1]: *** [arch/powerpc/boot/treeboot-iss4xx.o] Error 1
| make[1]: *** Waiting for unfinished jobs....
| BOOTCC arch/powerpc/boot/inflate.o
| make: *** [uImage] Error 2
| ERROR: oe_runmake failed
| ERROR: Function failed: do_compile (see /srv/home/pokybuild/yocto-autobuilder/yocto-slave/p1022ds/build/build/tmp/work/p1022ds-poky-linux-gnuspe/linux-qoriq-sdk-3.0.34-r5/temp/log.do_compile.2167 for further information)
NOTE: recipe linux-qoriq-sdk-3.0.34-r5: task do_compile: Failed

Signed-off-by: Matthew McClintock <msm@freescale.com>
Signed-off-by: Benjamin Herrenschmidt <benh@kernel.crashing.org>
2012-09-07 | powerpc/oprofile: Fix marked events support on Power7+ not set. | Carl E. Love
Starting with Power 7+, we need to check for marked events whether the SIAR register is valid, i.e. whether it contains the correct address of the instruction at the time the performance counter overflowed. The mmcra register on Power 7+ contains a new bit to indicate that the contents of the SIAR are valid. If the event is not marked, then the sample is recorded independently of the SIAR valid bit setting. For older processors, there is no SIAR valid bit to check, so the samples are always recorded. This is done by forcing the cntr_marked_events bit mask to zero. The code will always record the sample in this case since the bit mask says the event is not a marked event even if it really is a marked event. Signed-off-by: Carl Love <cel@us.ibm.com> Signed-off-by: Benjamin Herrenschmidt <benh@kernel.crashing.org>
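The recording rule described in this message reduces to a small predicate. The sketch below borrows the name cntr_marked_events from the text; the function name, the parameters and the one-bit-per-counter layout are assumptions for illustration rather than the oprofile code itself.

```c
#include <assert.h>
#include <stdbool.h>
#include <stdint.h>

/*
 * cntr_marked_events: bit i set means counter i counts a marked event.
 * On processors without a SIAR-valid bit the mask is forced to zero,
 * so every sample takes the "not marked" path.
 */
bool should_record_sample(uint32_t cntr_marked_events, int counter, bool siar_valid)
{
    assert(counter >= 0 && counter < 32);
    bool marked = (cntr_marked_events >> counter) & 1u;
    if (!marked)
        return true;      /* unmarked: record regardless of the SIAR valid bit */
    return siar_valid;    /* marked: record only if SIAR holds a valid address */
}
```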
2012-09-07 | powerpc: Define Power7+ PV constant PV_POWER7p | sukadev@linux.vnet.ibm.com
This definition will be used by subsequent perf and oprofile patches Signed-off-by: Sukadev Bhattiprolu <sukadev@linux.vnet.ibm.com> Signed-off-by: Benjamin Herrenschmidt <benh@kernel.crashing.org>
2012-09-07 | powerpc/pseries: Round up MSI-X requests | Anton Blanchard
The pseries firmware currently refuses any non power of two MSI-X request. Unfortunately most network drivers end up asking for that because they want a power of two for RX queues and one or two extra for everything else. This patch rounds up the firmware request to the next power of two if the quota allows it. If this fails we fall back to using the original request size. Signed-off-by: Anton Blanchard <anton@samba.org> Signed-off-by: Benjamin Herrenschmidt <benh@kernel.crashing.org>
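The rounding rule can be sketched as follows. The function names are hypothetical, and the sketch models only the quota check; in the patch as described, the fallback to the original size also applies when the rounded-up firmware request itself fails, which is not modelled here.

```c
#include <assert.h>

/* Smallest power of two that is >= n, for 1 <= n <= 2^31. */
unsigned int roundup_pow2(unsigned int n)
{
    assert(n >= 1 && n <= 0x80000000u);
    unsigned int p = 1;
    while (p < n)
        p <<= 1;
    return p;
}

/*
 * Number of vectors to ask firmware for: round up to a power of two
 * when the quota allows it, otherwise keep the original request.
 */
unsigned int msix_request_size(unsigned int nvec, unsigned int quota)
{
    unsigned int rounded = roundup_pow2(nvec);
    return rounded <= quota ? rounded : nvec;
}
```

For example, a driver wanting 4 RX queues plus 1 extra vector asks for 5; with a quota of 8 the request becomes 8, and with a quota of 6 it stays at 5.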
2012-09-07 | powerpc/pci: Save P2P bridge resource if possible | Gavin Shan
When the PCI probe flag PCI_REASSIGN_ALL_RSRC has been passed into the PCI core, the expectation is that all resources will be reassigned by the PCI core. For a particular P2P (PCI-to-PCI) bridge, the size of the corresponding BAR (I/O, MMIO, prefetchable MMIO) is calculated from the resources required by the PCI devices behind the P2P bridge. That means that information such as the start/end address retrieved from the hardware registers of the P2P bridge is meaningless in this case. However, we still count that in, and the BARs might have been configured by firmware with non-zero size. That leads to wasted space. The patch explicitly sets the size of P2P bridge BARs to zero when resource reassignment is expected with the PCI probe flag PCI_REASSIGN_ALL_RSRC. As a result, it reduces the overall resources required by the system. Signed-off-by: Gavin Shan <shangw@linux.vnet.ibm.com> Signed-off-by: Benjamin Herrenschmidt <benh@kernel.crashing.org>
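A minimal sketch of "set the window size to zero when everything will be reassigned" is shown below. The struct, the flag value and the encoding of an empty window as end = start - 1 are assumptions chosen for the example; they are not taken from the patch.

```c
#include <assert.h>

#define FAKE_REASSIGN_ALL 0x1u   /* hypothetical stand-in for PCI_REASSIGN_ALL_RSRC */

/* A bridge window with an inclusive end address. */
struct fake_res {
    unsigned long start;
    unsigned long end;
};

/* Size of an inclusive range; an empty window (end == start - 1) yields 0. */
unsigned long res_size(const struct fake_res *r)
{
    return r->end - r->start + 1;
}

/* Discard the firmware-programmed window if all resources get reassigned anyway. */
void fixup_bridge_window(struct fake_res *r, unsigned int probe_flags)
{
    if (probe_flags & FAKE_REASSIGN_ALL) {
        r->start = 0;
        r->end = (unsigned long)-1;   /* start - 1 with unsigned wrap-around */
    }
}
```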
2012-09-06 | Merge tag 'stable/for-linus-3.6-rc4-tag' of git://git.kernel.org/pub/scm/linux/kernel/git/konrad/xen | Linus Torvalds
Pull Xen bug-fixes from Konrad Rzeszutek Wilk:

* Fix for TLB flushing introduced in v3.6
* Fix Xen-SWIOTLB not using proper DMA mask - device had 64bit but in a 32-bit kernel we need to allocate for coherent pages from a 32-bit pool.
* When trying to re-use P2M nodes we had a one-off error and triggered a BUG_ON check with specific CONFIG_ option.
* When doing FLR in Xen-PCI-backend we would first do FLR then save the PCI configuration space. We needed to do it the other way around.

* tag 'stable/for-linus-3.6-rc4-tag' of git://git.kernel.org/pub/scm/linux/kernel/git/konrad/xen:
  xen/pciback: Fix proper FLR steps.
  xen: Use correct masking in xen_swiotlb_alloc_coherent.
  xen: fix logical error in tlb flushing
  xen/p2m: Fix one-off error in checking the P2M tree directory.
2012-09-07 | Merge branch 'merge' into next | Benjamin Herrenschmidt
Brings in various bug fixes from 3.6-rcX
2012-09-07 | powerpc/kprobes: Rename opcode_t in probes.h to ppc_opcode_t | Ananth N Mavinakayanahalli
commit: 8b7b80b9ebb46dd88fbb94e918297295cf312b59 [24/29] powerpc: Uprobes port to powerpc

Caused a clash with the fore200e driver:

In file included from drivers/atm/fore200e.c:70:0:
drivers/atm/fore200e.h:263:3: error: redefinition of typedef 'opcode_t' with different type
arch/powerpc/include/asm/probes.h:25:13: note: previous declaration of 'opcode_t' was here

Fix the namespace clash by renaming opcode_t in probes.h to ppc_opcode_t.

Signed-off-by: Ananth N Mavinakayanahalli <ananth@in.ibm.com>
Signed-off-by: Benjamin Herrenschmidt <benh@kernel.crashing.org>
2012-09-07 | powerpc: Restore VDSO information on critical exception on BookE | Mihai Caraman
Critical exception on 64-bit booke uses user-visible SPRG3 as scratch. Restore VDSO information in SPRG3 on exception prolog. Use a common sprg3 field in PACA for all powerpc64 architectures. Signed-off-by: Mihai Caraman <mihai.caraman@freescale.com> Signed-off-by: Benjamin Herrenschmidt <benh@kernel.crashing.org>
2012-09-06 | Merge tag 'fixes-for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/arm/arm-soc | Linus Torvalds
Pull ARM SoC bug fixes from Olof Johansson:

"Mostly Renesas and Atmel bugfixes this time, targeting boot and build problems. A couple of patches for gemini and kirkwood as well. On a whole nothing very controversial."

* tag 'fixes-for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/arm/arm-soc:
  ARM: gemini: fix the gemini build
  ARM: shmobile: armadillo800eva: enable rw rootfs mount
  ARM: Kirkwood: Fix 'SZ_1M' undeclared here for db88f6281-bp-setup.c
  ARM: shmobile: mackerel: fixup usb module order
  ARM: shmobile: armadillo800eva: fixup: sound card detection order
  ARM: shmobile: marzen: fixup smsc911x id for regulator
  ARM: at91/feature-removal-schedule: delay at91_mci removal
  ARM: mach-shmobile: armadillo800eva: Enable power button as wakeup source
  ARM: mach-shmobile: armadillo800eva: Fix GPIO buttons descriptions
  ARM: at91/dts: remove partial parameter in at91sam9g25ek.dts
  ARM: at91/clock: fix PLLA overclock warning
  ARM: at91: fix rtc-at91sam9 irq issue due to sparse irq support
  ARM: at91: fix system timer irq issue due to sparse irq support
  ARM: shmobile: sh73a0: fixup RELOC_BASE of intca_irq_pins_desc
2012-09-05 | uml: fix compile error in deliver_alarm() | Miklos Szeredi
Fix the following compile error on UML. arch/um/os-Linux/time.c: In function 'deliver_alarm': arch/um/os-Linux/time.c:117:3: error: too few arguments to function 'alarm_handler' arch/um/os-Linux/internal.h:1:6: note: declared here The error was introduced by commit d3c1cfcd ("um: pass siginfo to guest process") in 3.6-rc1. Signed-off-by: Miklos Szeredi <mszeredi@suse.cz> CC: Martin Pärtel <martin.partel@gmail.com> Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
2012-09-05 | xen: fix logical error in tlb flushing | Alex Shi
While TLB_FLUSH_ALL gets passed as the 'end' argument to flush_tlb_others(), the Xen code was made to check its 'start' parameter. That may set op.cmd incorrectly to MMUEXT_INVLPG_MULTI instead of MMUEXT_TLB_FLUSH_MULTI, so some pages may not be flushed from the TLB. This patch fixes the issue. Reported-by: Jan Beulich <jbeulich@suse.com> Signed-off-by: Alex Shi <alex.shi@intel.com> Acked-by: Jan Beulich <jbeulich@suse.com> Tested-by: Yongjie Ren <yongjie.ren@intel.com> Signed-off-by: Konrad Rzeszutek Wilk <konrad.wilk@oracle.com>
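The logic error comes down to testing the wrong parameter for a sentinel value. The following simplified sketch uses made-up names and a made-up sentinel; the real Xen code has more conditions (for example on the range size) that are deliberately left out.

```c
#include <assert.h>

#define FAKE_TLB_FLUSH_ALL (~0UL)   /* stand-in sentinel for TLB_FLUSH_ALL */

enum fake_cmd { FAKE_INVLPG_MULTI, FAKE_TLB_FLUSH_MULTI };

/* The sentinel arrives in 'end', so 'end' is the parameter that must be tested. */
enum fake_cmd pick_flush_cmd(unsigned long start, unsigned long end)
{
    (void)start;   /* the pre-fix logic compared this parameter instead */
    return end == FAKE_TLB_FLUSH_ALL ? FAKE_TLB_FLUSH_MULTI : FAKE_INVLPG_MULTI;
}
```

With the pre-fix comparison against 'start', a full-flush request would have been downgraded to a single-page invalidation, which matches the symptom described in the message.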
2012-09-05 | Merge commit '4cb38750d49010ae72e718d46605ac9ba5a851b4' into stable/for-linus-3.6 | Konrad Rzeszutek Wilk
* commit '4cb38750d49010ae72e718d46605ac9ba5a851b4': (6849 commits)
  bcma: fix invalid PMU chip control masks
  [libata] pata_cmd64x: whitespace cleanup
  libata-acpi: fix up for acpi_pm_device_sleep_state API
  sata_dwc_460ex: device tree may specify dma_channel
  ahci, trivial: fixed coding style issues related to braces
  ahci_platform: add hibernation callbacks
  libata-eh.c: local functions should not be exposed globally
  libata-transport.c: local functions should not be exposed globally
  sata_dwc_460ex: support hardreset
  ata: use module_pci_driver
  drivers/ata/pata_pcmcia.c: adjust suspicious bit operation
  pata_imx: Convert to clk_prepare_enable/clk_disable_unprepare
  ahci: Enable SB600 64bit DMA on MSI K9AGM2 (MS-7327) v2
  [libata] Prevent interface errors with Seagate FreeAgent GoFlex
  drivers/acpi/glue: revert accidental license-related 6b66d95895c bits
  libata-acpi: add missing inlines in libata.h
  i2c-omap: Add support for I2C_M_STOP message flag
  i2c: Fall back to emulated SMBus if the operation isn't supported natively
  i2c: Add SCCB support
  i2c-tiny-usb: Add support for the Robofuzz OSIF USB/I2C converter
  ...
2012-09-05 | xen/p2m: Fix one-off error in checking the P2M tree directory. | Konrad Rzeszutek Wilk
We would traverse the full P2M top directory (from 0->MAX_DOMAIN_PAGES inclusive) when trying to figure out whether we can re-use some of the P2M middle leafs. Which meant that if the kernel was compiled with MAX_DOMAIN_PAGES=512 we would try to use the 512th entry. Fortunately for us the p2m_top_index has a check for this:

BUG_ON(pfn >= MAX_P2M_PFN);

which we hit and saw this:

(XEN) domain_crash_sync called from entry.S
(XEN) Domain 0 (vcpu#0) crashed on cpu#0:
(XEN) ----[ Xen-4.1.2-OVM x86_64 debug=n Tainted: C ]----
(XEN) CPU: 0
(XEN) RIP: e033:[<ffffffff819cadeb>]
(XEN) RFLAGS: 0000000000000212 EM: 1 CONTEXT: pv guest
(XEN) rax: ffffffff81db5000 rbx: ffffffff81db4000 rcx: 0000000000000000
(XEN) rdx: 0000000000480211 rsi: 0000000000000000 rdi: ffffffff81db4000
(XEN) rbp: ffffffff81793db8 rsp: ffffffff81793d38 r8: 0000000008000000
(XEN) r9: 4000000000000000 r10: 0000000000000000 r11: ffffffff81db7000
(XEN) r12: 0000000000000ff8 r13: ffffffff81df1ff8 r14: ffffffff81db6000
(XEN) r15: 0000000000000ff8 cr0: 000000008005003b cr4: 00000000000026f0
(XEN) cr3: 0000000661795000 cr2: 0000000000000000

Fixes-Oracle-Bug: 14570662
CC: stable@vger.kernel.org # only for v3.5
Signed-off-by: Konrad Rzeszutek Wilk <konrad.wilk@oracle.com>
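The inclusive-versus-exclusive bound can be shown with a generic directory walk. The names and the directory size of 512 are assumptions that mirror the example in the message; this is not the P2M code.

```c
#include <assert.h>

#define TOP_ENTRIES 512   /* assumed directory size, matching the example in the text */

/* Count nonzero entries, visiting indices 0 .. TOP_ENTRIES-1 only. */
int count_used(const int *top)
{
    int used = 0;
    for (int i = 0; i < TOP_ENTRIES; i++)   /* "<=" here would touch entry 512 */
        if (top[i])
            used++;
    return used;
}
```

Valid indices run from 0 to 511, so an inclusive upper bound visits one entry too many, which is the access the BUG_ON in the message caught.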
2012-09-05 | powerpc: Don't use __put_user() in patch_instruction | Benjamin Herrenschmidt
patch_instruction() can be called very early on ppc32, when the kernel isn't yet running at its linked address. That can cause the ! is_kernel_addr() test in __put_user() to trip and call might_sleep(), which is very bad at that point during boot. Use a lower level function instead for now, at least until we get to rework the ppc32 boot process to do the code patching later, like ppc64 does. Signed-off-by: Benjamin Herrenschmidt <benh@kernel.crashing.org>
2012-09-05 | powerpc: Make sure IPI handlers see data written by IPI senders | Paul Mackerras
We have been observing hangs, both of KVM guest vcpu tasks and more generally, where a process that is woken doesn't properly wake up and continue to run, but instead sticks in TASK_WAKING state. This happens because the update of rq->wake_list in ttwu_queue_remote() is not ordered with the update of ipi_message in smp_muxed_ipi_message_pass(), and the reading of rq->wake_list in scheduler_ipi() is not ordered with the reading of ipi_message in smp_ipi_demux(). Thus it is possible for the IPI receiver not to see the updated rq->wake_list and therefore conclude that there is nothing for it to do. In order to make sure that anything done before smp_send_reschedule() is ordered before anything done in the resulting call to scheduler_ipi(), this adds barriers in smp_muxed_message_pass() and smp_ipi_demux(). The barrier in smp_muxed_message_pass() is a full barrier to ensure that there is a full ordering between the smp_send_reschedule() caller and scheduler_ipi(). In smp_ipi_demux(), we use xchg() rather than xchg_local() because xchg() includes release and acquire barriers. Using xchg() rather than xchg_local() makes sense given that ipi_message is not just accessed locally. This moves the barrier between setting the message and calling the cause_ipi() function into the individual cause_ipi implementations. Most of them -- those that used outb, out_8 or similar -- already had a full barrier because out_8 etc. include a sync before the MMIO store. This adds an explicit barrier in the two remaining cases. These changes made no measurable difference to the speed of IPIs as measured using a simple ping-pong latency test across two CPUs on different cores of a POWER7 machine. The analysis of the reason why processes were not waking up properly is due to Milton Miller. Cc: stable@vger.kernel.org # v3.0+ Reported-by: Milton Miller <miltonm@bga.com> Signed-off-by: Paul Mackerras <paulus@samba.org> Signed-off-by: Benjamin Herrenschmidt <benh@kernel.crashing.org>
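The ordering requirement can be illustrated in userspace with C11 atomics. This is only an analogy: the kernel uses its own barrier primitives and xchg(), not <stdatomic.h>, and the function names below are hypothetical. The sketch shows the shape of the fix, which is that the payload is written before the message bit is published with full ordering, and the receiver takes the bits with an ordered exchange before reading the payload.

```c
#include <assert.h>
#include <stdatomic.h>

static int wake_list;             /* payload, written before the message bit */
static atomic_uint ipi_message;   /* bitmask of pending message types */

/* Sender: write the payload, then publish the message bit with full ordering. */
void send_msg(unsigned int bit, int payload)
{
    wake_list = payload;
    atomic_fetch_or_explicit(&ipi_message, 1u << bit, memory_order_seq_cst);
    /* a real implementation would trigger the interrupt at this point */
}

/* Receiver: atomically take all pending bits; ordered before later payload reads. */
unsigned int take_msgs(void)
{
    return atomic_exchange_explicit(&ipi_message, 0, memory_order_seq_cst);
}

/* Receiver: read the payload after take_msgs() has reported a pending bit. */
int read_payload(void)
{
    return wake_list;
}
```

An unordered local exchange on the receiver side would correspond to the xchg_local() case that the message replaces with xchg().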
2012-09-05 | powerpc: Restore correct DSCR in context switch | Anton Blanchard
During a context switch we always restore the per thread DSCR value. If we aren't doing explicit DSCR management (ie thread.dscr_inherit == 0) and the default DSCR changed while the process has been sleeping we end up with the wrong value. Check thread.dscr_inherit and select the default DSCR or per thread DSCR as required. This was found with the following test case, when running with more threads than CPUs (ie forcing context switching): http://ozlabs.org/~anton/junkcode/dscr_default_test.c With the four patches applied I can run a combination of all test cases successfully at the same time: http://ozlabs.org/~anton/junkcode/dscr_default_test.c http://ozlabs.org/~anton/junkcode/dscr_explicit_test.c http://ozlabs.org/~anton/junkcode/dscr_inherit_test.c Signed-off-by: Anton Blanchard <anton@samba.org> Cc: <stable@kernel.org> # 3.0+ Signed-off-by: Benjamin Herrenschmidt <benh@kernel.crashing.org>
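The selection rule from this message fits in one function. The struct and function names are hypothetical; only the field names dscr and dscr_inherit come from the text.

```c
#include <assert.h>

struct fake_thread {
    unsigned long dscr;
    int dscr_inherit;   /* nonzero: the thread manages its DSCR explicitly */
};

/* Value to load into the DSCR when switching to thread t. */
unsigned long dscr_for_switch(const struct fake_thread *t, unsigned long dscr_default)
{
    return t->dscr_inherit ? t->dscr : dscr_default;
}
```

A thread that never set its own DSCR therefore picks up a changed system default the next time it is scheduled, which is the case the test program in the message exercises.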
2012-09-05 | powerpc: Fix DSCR inheritance in copy_thread() | Anton Blanchard
If the default DSCR is non zero we set thread.dscr_inherit in copy_thread() meaning the new thread and all its children will ignore future updates to the default DSCR. This is not intended and is a change in behaviour that a number of our users have hit. We just need to inherit thread.dscr and thread.dscr_inherit from the parent which ends up being much simpler. This was found with the following test case: http://ozlabs.org/~anton/junkcode/dscr_default_test.c Signed-off-by: Anton Blanchard <anton@samba.org> Cc: <stable@kernel.org> # 3.0+ Signed-off-by: Benjamin Herrenschmidt <benh@kernel.crashing.org>
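The simpler inheritance rule is sketched below with hypothetical names. The child copies both fields from the parent instead of having dscr_inherit set just because the default happens to be non-zero.

```c
#include <assert.h>

struct thread_dscr {
    unsigned long dscr;
    int dscr_inherit;
};

/* On fork, the child takes both DSCR fields from its parent unchanged. */
void copy_dscr(struct thread_dscr *child, const struct thread_dscr *parent)
{
    child->dscr = parent->dscr;
    child->dscr_inherit = parent->dscr_inherit;
}
```

A child of a parent with dscr_inherit == 0 thus keeps following the system default rather than being pinned to whatever the default was at fork time.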
2012-09-05 | powerpc: Keep thread.dscr and thread.dscr_inherit in sync | Anton Blanchard
When we update the DSCR either via emulation of mtspr(DSCR) or via a change to dscr_default in sysfs we don't update thread.dscr. We will eventually update it at context switch time but there is a period where thread.dscr is incorrect. If we fork at this point we will copy the old value of thread.dscr into the child. To avoid this, always keep thread.dscr in sync with reality. This issue was found with the following testcase: http://ozlabs.org/~anton/junkcode/dscr_inherit_test.c Signed-off-by: Anton Blanchard <anton@samba.org> Cc: <stable@kernel.org> # 3.0+ Signed-off-by: Benjamin Herrenschmidt <benh@kernel.crashing.org>
2012-09-05 | powerpc: Update DSCR on all CPUs when writing sysfs dscr_default | Anton Blanchard
Writing to dscr_default in sysfs doesn't actually change the DSCR - we rely on a context switch on each CPU to do the work. There is no guarantee we will get a context switch in a reasonable amount of time so fire off an IPI to force an immediate change. This issue was found with the following test case: http://ozlabs.org/~anton/junkcode/dscr_explicit_test.c Signed-off-by: Anton Blanchard <anton@samba.org> Cc: <stable@kernel.org> # 3.0+ Signed-off-by: Benjamin Herrenschmidt <benh@kernel.crashing.org>
2012-09-05 | powerpc/powernv: Always go into nap mode when CPU is offline | Paul Mackerras
The CPU hotplug code for the powernv platform currently only puts offline CPUs into nap mode if the powersave_nap variable is set. However, HV-style KVM on this platform requires secondary CPU threads to be offline and in nap mode. Since we know nap mode works just fine on all POWER7 machines, and the only machines that support the powernv platform are POWER7 machines, this changes the code to always put offline CPUs into nap mode, regardless of powersave_nap. Powersave_nap still controls whether or not CPUs go into nap mode when idle, as before. Signed-off-by: Paul Mackerras <paulus@samba.org> Signed-off-by: Benjamin Herrenschmidt <benh@kernel.crashing.org>
2012-09-05 | powerpc: Give hypervisor decrementer interrupts their own handler | Paul Mackerras
At the moment the handler for hypervisor decrementer interrupts is the same as for decrementer interrupts, i.e. timer_interrupt(). This is bogus; if we ever do get a hypervisor decrementer interrupt it won't have anything to do with the next timer event. In fact the only time we get hypervisor decrementer interrupts is when one is left pending on exit from a KVM guest. When we get a hypervisor decrementer interrupt we don't need to do anything special to clear it, since they are edge-triggered on the transition of HDEC from 0 to -1. Thus this adds an empty handler function for them. We don't need to have them masked when interrupts are soft-disabled, so we use STD_EXCEPTION_HV instead of MASKABLE_EXCEPTION_HV. Signed-off-by: Paul Mackerras <paulus@samba.org> Signed-off-by: Benjamin Herrenschmidt <benh@kernel.crashing.org>
2012-09-05 | powerpc/vphn: Fix arch_update_cpu_topology() return value | Jesse Larrew
arch_update_cpu_topology() should only return 1 when the topology has actually changed, and should return 0 otherwise. This patch fixes a potential bug where rebuild_sched_domains() would reinitialize the sched domains even when the topology hasn't changed. Signed-off-by: Jesse Larrew <jlarrew@linux.vnet.ibm.com> Signed-off-by: Benjamin Herrenschmidt <benh@kernel.crashing.org>
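The return-value contract can be sketched as a function that reports a change only when something actually moved. The array-based representation and the names are assumptions for illustration; the real function works on the kernel's CPU and node data structures.

```c
#include <assert.h>

#define NCPU 4

/* Apply new node assignments; return 1 only if at least one CPU changed node. */
int update_topology(int *cur_node, const int *new_node)
{
    int changed = 0;
    for (int i = 0; i < NCPU; i++) {
        if (cur_node[i] != new_node[i]) {
            cur_node[i] = new_node[i];
            changed = 1;
        }
    }
    return changed;
}
```

A caller that rebuilds scheduling domains only on a return value of 1 then skips the rebuild whenever the update was a no-op.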
2012-09-05 | powerpc/booke64: Use SPRG0/3 scratch for bolted TLB miss & crit int | Mihai Caraman
Embedded.Hypervisor category defines GSPRG0..3 physical registers for guests. Avoid SPRG4-7 usage as scratch in host exception handlers, otherwise guest SPRG4-7 registers will be clobbered. For bolted TLB miss exception handlers, which is the version currently supported by KVM, use SPRN_SPRG_GEN_SCRATCH aka SPRG0 instead of SPRN_SPRG_TLB_SCRATCH aka SPRG6. Keep using TLB PACA slots to fit in one 64-byte cache line. For critical exception handlers use SPRG3 instead of SPRG7. Provide a routine to store and restore user-visible SPRGs. This will be subsequently used to restore VDSO information in SPRG3. Add EX_R13 to paca slots to free up SPRG3 and change the critical exception epilog to use it. Signed-off-by: Mihai Caraman <mihai.caraman@freescale.com> Signed-off-by: Benjamin Herrenschmidt <benh@kernel.crashing.org>
2012-09-05 | powerpc/booke64: Remove mfspr srr1 duplicate in exception prolog | Mihai Caraman
Refactor the exception prolog to get rid of the duplicate mfspr srr1. This was introduced by KVM integration, with the DO_KVM macro logic expecting the srr1 value earlier in r11. Reserve r11 to hold srr1's value, which is also required at the end of the prolog, and free up r10 to serve as a spare in additional macros. For the syscall case this change does not add any performance penalty. For the irq soft-disabled case the change adds a store/load of the condition register value to/from a paca slot. Paca slots fit in one 64-byte cache line, so these additional operations have little impact on performance. Signed-off-by: Mihai Caraman <mihai.caraman@freescale.com> Signed-off-by: Benjamin Herrenschmidt <benh@kernel.crashing.org>
2012-09-05 | powerpc/booke64: Add DO_KVM kernel hooks | Mihai Caraman
Hook the DO_KVM macro into 64-bit booke for KVM integration. Extend the interrupt handlers' parameter list with interrupt vector numbers to accommodate the macro. Only the bolted version of the TLB miss handlers is addressed now. Signed-off-by: Mihai Caraman <mihai.caraman@freescale.com> Signed-off-by: Benjamin Herrenschmidt <benh@kernel.crashing.org>
2012-09-05 | powerpc/booke64: Use GSRR registers in Guest Doorbell interrupts | Mihai Caraman
Guest Doorbell interrupts use guest save and restore registers. Add a new Guest Doorbell exception type to accommodate GSRR0/1 SPRs usage in exception prolog and fix the exception handler. Signed-off-by: Mihai Caraman <mihai.caraman@freescale.com> Signed-off-by: Benjamin Herrenschmidt <benh@kernel.crashing.org>
2012-09-05 | powerpc/booke64: Fix machine check handler to use the right prolog | Mihai Caraman
The machine check exception handler was using the wrong prolog. Hypervisors like KVM, which are called early from the exception handler, rely on the interrupt source. Signed-off-by: Mihai Caraman <mihai.caraman@freescale.com> Signed-off-by: Benjamin Herrenschmidt <benh@kernel.crashing.org>
2012-09-05powerpc: Uprobes port to powerpcAnanth N Mavinakayanahalli
This is the port of uprobes to powerpc. Usage is similar to x86.

[root@xxxx ~]# ./bin/perf probe -x /lib64/libc.so.6 malloc
Added new event:
  probe_libc:malloc (on 0xb4860)

You can now use it in all perf tools, such as:

    perf record -e probe_libc:malloc -aR sleep 1

[root@xxxx ~]# ./bin/perf record -e probe_libc:malloc -aR sleep 20
[ perf record: Woken up 22 times to write data ]
[ perf record: Captured and wrote 5.843 MB perf.data (~255302 samples) ]
[root@xxxx ~]# ./bin/perf report --stdio
...
    69.05%  tar           libc-2.12.so  [.] malloc
    28.57%  rm            libc-2.12.so  [.] malloc
     1.32%  avahi-daemon  libc-2.12.so  [.] malloc
     0.58%  bash          libc-2.12.so  [.] malloc
     0.28%  sshd          libc-2.12.so  [.] malloc
     0.08%  irqbalance    libc-2.12.so  [.] malloc
     0.05%  bzip2         libc-2.12.so  [.] malloc
     0.04%  sleep         libc-2.12.so  [.] malloc
     0.03%  multipathd    libc-2.12.so  [.] malloc
     0.01%  sendmail      libc-2.12.so  [.] malloc
     0.01%  automount     libc-2.12.so  [.] malloc

The trap_nr addition patch is a prereq.

Signed-off-by: Ananth N Mavinakayanahalli <ananth@in.ibm.com> Signed-off-by: Benjamin Herrenschmidt <benh@kernel.crashing.org>
2012-09-05powerpc: Add trap_nr to thread_structAnanth N Mavinakayanahalli
Add thread_struct.trap_nr and use it to store the last exception the thread experienced. In this patch, we populate the field at various places where we force_sig_info() to the process. This is also used in uprobes to determine if the probed instruction caused an exception. Signed-off-by: Ananth N Mavinakayanahalli <ananth@in.ibm.com> Signed-off-by: Benjamin Herrenschmidt <benh@kernel.crashing.org>
2012-09-05powerpc: Consolidate {k,u}probe definitionsAnanth N Mavinakayanahalli
Move is_trap() and relatives to a common file to be shared between kprobes and uprobes. Code movement only; no change in functionality. Suggested by Michael Ellerman. Signed-off-by: Ananth N Mavinakayanahalli <ananth@in.ibm.com> Signed-off-by: Benjamin Herrenschmidt <benh@kernel.crashing.org>
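For readers unfamiliar with what is_trap() checks, here is a minimal user-space sketch, not the kernel source. It assumes the helper matches the four Power ISA trap instructions (tw, td, twi, tdi) by opcode; the masks are derived from the ISA encodings (primary opcode 31 with extended opcode 4 or 68, and primary opcodes 3 and 2). The function name is_trap_sketch and the self-check routine are illustrative only.

```c
#include <assert.h>
#include <stdbool.h>
#include <stdint.h>

/* X-form traps: primary opcode 31, extended opcode 4 (tw) or 68 (td). */
#define IS_TW(instr)  (((instr) & 0xfc0007feu) == 0x7c000008u)
#define IS_TD(instr)  (((instr) & 0xfc0007feu) == 0x7c000088u)
/* D-form traps: primary opcode 3 (twi) or 2 (tdi). */
#define IS_TWI(instr) (((instr) & 0xfc000000u) == 0x0c000000u)
#define IS_TDI(instr) (((instr) & 0xfc000000u) == 0x08000000u)

/* Hypothetical stand-in for the shared helper: true for any trap variant. */
static bool is_trap_sketch(uint32_t instr)
{
    return IS_TW(instr) || IS_TD(instr) || IS_TWI(instr) || IS_TDI(instr);
}

/* Small self-check using hand-encoded instructions. */
static void is_trap_selfcheck(void)
{
    assert(is_trap_sketch(0x7fe00008u));   /* tw 31,0,0 (unconditional trap) */
    assert(is_trap_sketch(0x0fe00000u));   /* twi 31,0,0 */
    assert(!is_trap_sketch(0x60000000u));  /* ori 0,0,0 (nop) */
}
```

Both a kprobe and a uprobe have to recognise when the probed instruction is itself a trap, which is presumably why a single shared definition is preferable to two copies.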
2012-09-05powerpc: Rename 64-bit PVR constants to PVR_fooMichael Ellerman
We have an old FIXME in reg.h which points out that we should standardise on PVR_foo for our PVR #defines. Currently we use PVR_ on 32-bit and PV_ on 64-bit. So do that rename and remove the FIXME. Seeing as we're touching all but one usage of __is_processor(), rename it to something less ugly and more indicative of what it does, which is simply to check the PVR version. Signed-off-by: Michael Ellerman <michael@ellerman.id.au> Signed-off-by: Benjamin Herrenschmidt <benh@kernel.crashing.org>
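As a rough illustration of "checking the PVR version", the sketch below splits a 32-bit Processor Version Register value into its version (upper 16 bits) and revision (lower 16 bits) halves. It is a user-space approximation: the real code reads the register with mfspr, whereas here the value is passed in as a parameter. The helper name pvr_version_matches, the PVR_POWER7 value, and the sample register value are assumptions for illustration, not taken from the commit.

```c
#include <assert.h>
#include <stdbool.h>
#include <stdint.h>

/* Upper half of the PVR is the processor version, lower half the revision. */
#define PVR_VER(pvr) (((pvr) >> 16) & 0xFFFFu)
#define PVR_REV(pvr) ((pvr) & 0xFFFFu)

/* Assumed example constant following the PVR_foo naming convention. */
#define PVR_POWER7 0x003Fu

/* Hypothetical helper: compare only the version half, ignoring revision. */
static bool pvr_version_matches(uint32_t pvr, uint32_t version)
{
    return PVR_VER(pvr) == version;
}

/* Self-check with a made-up register value: version 0x003F, revision 0x0201. */
static void pvr_selfcheck(void)
{
    uint32_t sample_pvr = 0x003F0201u;

    assert(pvr_version_matches(sample_pvr, PVR_POWER7));
    assert(PVR_REV(sample_pvr) == 0x0201u);
    assert(!pvr_version_matches(0x004A0201u, PVR_POWER7));
}
```

Comparing only the version half means a check keeps working across chip revisions, which is the behaviour the renamed helper is described as providing.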
2012-09-05powerpc: Remove <asm/abs_addr.h>Michael Ellerman
It contains no code and is not included by anyone. Signed-off-by: Michael Ellerman <michael@ellerman.id.au> Signed-off-by: Benjamin Herrenschmidt <benh@kernel.crashing.org>
2012-09-05powerpc: Remove all includes of <asm/abs_addr.h>Michael Ellerman
It's empty now, apart from other includes. Fixup a few files that were getting things via this header. Signed-off-by: Michael Ellerman <michael@ellerman.id.au> Signed-off-by: Benjamin Herrenschmidt <benh@kernel.crashing.org>
2012-09-05powerpc: Remove virt_to_abs() now all users have been fixedMichael Ellerman
Signed-off-by: Michael Ellerman <michael@ellerman.id.au> Signed-off-by: Benjamin Herrenschmidt <benh@kernel.crashing.org>
2012-09-05powerpc: Remove abs_to_virt() now all users have been fixedMichael Ellerman
Signed-off-by: Michael Ellerman <michael@ellerman.id.au> Signed-off-by: Benjamin Herrenschmidt <benh@kernel.crashing.org>
2012-09-05powerpc: Remove phys_to_abs() now all users have been removedMichael Ellerman
Signed-off-by: Michael Ellerman <michael@ellerman.id.au> Signed-off-by: Benjamin Herrenschmidt <benh@kernel.crashing.org>