Age | Commit message (Collapse) | Author |
|
As part of the support for split core on POWER8, we want to be able to
block splitting of the core while KVM VMs are active.
The logic to do that would be exactly the same as the code we currently
have for inhibiting onlining of secondaries.
Instead of adding an identical mechanism to block split core, rework the
secondary inhibit code to be a "HV KVM is active" check. We can then use
that in both the cpu hotplug code and the upcoming split core code.
Signed-off-by: Michael Ellerman <mpe@ellerman.id.au>
Signed-off-by: Michael Neuling <mikey@neuling.org>
Acked-by: Alexander Graf <agraf@suse.de>
Acked-by: Paul Mackerras <paulus@samba.org>
Signed-off-by: Benjamin Herrenschmidt <benh@kernel.crashing.org>
|
|
Based off 3bccd996 for ia64, convert powerpc to use the generic per-CPU
topology tracking, specifically:
initialize per cpu numa_node entry in start_secondary
remove the powerpc cpu_to_node()
define CONFIG_USE_PERCPU_NUMA_NODE_ID if NUMA
Signed-off-by: Nishanth Aravamudan <nacc@linux.vnet.ibm.com>
Signed-off-by: Benjamin Herrenschmidt <benh@kernel.crashing.org>
|
|
Merge the binutils and kexec fixes.
|
|
With binutils 2.24, various 64 bit builds fail with relocation errors
such as
arch/powerpc/kernel/built-in.o: In function `exc_debug_crit_book3e':
(.text+0x165ee): relocation truncated to fit: R_PPC64_ADDR16_HI
against symbol `interrupt_base_book3e' defined in .text section
in arch/powerpc/kernel/built-in.o
arch/powerpc/kernel/built-in.o: In function `exc_debug_crit_book3e':
(.text+0x16602): relocation truncated to fit: R_PPC64_ADDR16_HI
against symbol `interrupt_end_book3e' defined in .text section
in arch/powerpc/kernel/built-in.o
The assembler maintainer says:
I changed the ABI, something that had to be done but unfortunately
happens to break the booke kernel code. When building up a 64-bit
value with lis, ori, shl, oris, ori or similar sequences, you now
should use @high and @higha in place of @h and @ha. @h and @ha
(and their associated relocs R_PPC64_ADDR16_HI and R_PPC64_ADDR16_HA)
now report overflow if the value is out of 32-bit signed range.
ie. @h and @ha assume you're building a 32-bit value. This is needed
to report out-of-range -mcmodel=medium toc pointer offsets in @toc@h
and @toc@ha expressions, and for consistency I did the same for all
other @h and @ha relocs.
Replacing @h with @high in one strategic location fixes the relocation
errors. This has to be done conditionally since the assembler either
supports @h or @high but not both.
Cc: <stable@vger.kernel.org>
Signed-off-by: Guenter Roeck <linux@roeck-us.net>
Signed-off-by: Benjamin Herrenschmidt <benh@kernel.crashing.org>
|
|
pcibios_penalize_isa_irq() is only implemented by x86 now, and legacy ISA
is not used by some architectures. Make pcibios_penalize_isa_irq() a
__weak function to simplify the code. This removes the need for new
platforms to add stub implementations of pcibios_penalize_isa_irq().
[bhelgaas: changelog, comments]
Signed-off-by: Hanjun Guo <hanjun.guo@linaro.org>
Signed-off-by: Bjorn Helgaas <bhelgaas@google.com>
Acked-by: Arnd Bergmann <arnd@arndb.de>
|
|
There is now a way to ensure all platform devices get a unique name when
populated from the device tree, and the DCR_NATIVE code path is broken
anyway. PowerPC Cell (PS3) is the only platform that actually uses this
path. Most likely nobody will notice if it is killed. Remove the code
and associated ugly #ifdef.
The user-visible impact of this patch is that any DCR device on Cell
will get a new name in the /sys/devices hierarchy.
Signed-off-by: Grant Likely <grant.likely@linaro.org>
Cc: Rob Herring <robh@kernel.org>
Cc: Benjamin Herrenschmidt <benh@kernel.crashing.org>
|
|
powerpc: Clear ELF personality flag if ELFv2 is not requested.
The POWER kernel uses a personality flag to determine whether it should
be setting up function descriptors or not (per the updated ABI). This
flag wasn't being cleared on a new process but instead was being
inherited. The visible effect was that an ELFv2 binary could not execve
to an ELFv1 binary.
Signed-off-by: Jeff Bailey <jeffbailey@google.com>
arch/powerpc/include/asm/elf.h | 2 ++
1 file changed, 2 insertions(+)
Signed-off-by: Benjamin Herrenschmidt <benh@kernel.crashing.org>
|
|
Currently, on 8641D, which doesn't set CONFIG_HAVE_HW_BREAKPOINT
we get the following splat:
BUG: using smp_processor_id() in preemptible [00000000] code: login/1382
caller is set_breakpoint+0x1c/0xa0
CPU: 0 PID: 1382 Comm: login Not tainted 3.15.0-rc3-00041-g2aafe1a4d451 #1
Call Trace:
[decd5d80] [c0008dc4] show_stack+0x50/0x158 (unreliable)
[decd5dc0] [c03c6fa0] dump_stack+0x7c/0xdc
[decd5de0] [c01f8818] check_preemption_disabled+0xf4/0x104
[decd5e00] [c00086b8] set_breakpoint+0x1c/0xa0
[decd5e10] [c00d4530] flush_old_exec+0x2bc/0x588
[decd5e40] [c011c468] load_elf_binary+0x2ac/0x1164
[decd5ec0] [c00d35f8] search_binary_handler+0xc4/0x1f8
[decd5ef0] [c00d4ee8] do_execve+0x3d8/0x4b8
[decd5f40] [c001185c] ret_from_syscall+0x0/0x38
--- Exception: c01 at 0xfeee554
LR = 0xfeee7d4
The call path in this case is:
flush_thread
--> set_debug_reg_defaults
--> set_breakpoint
--> __get_cpu_var
Since preemption is enabled in the cleanup of flush thread, and
there is no need to disable it, introduce the distinction between
set_breakpoint and __set_breakpoint, leaving only the flush_thread
instance as the current user of set_breakpoint.
Signed-off-by: Paul Gortmaker <paul.gortmaker@windriver.com>
Signed-off-by: Benjamin Herrenschmidt <benh@kernel.crashing.org>
|
|
None of the callers check the return value, so it might as
well not have one at all.
Signed-off-by: Paul Gortmaker <paul.gortmaker@windriver.com>
Signed-off-by: Benjamin Herrenschmidt <benh@kernel.crashing.org>
|
|
The arch/powerpc/include/asm/linkage.h is being unintentionally exported
in the kernel headers since commit e1b5bb6d1236 (consolidate
cond_syscall and SYSCALL_ALIAS declarations) when
arch/powerpc/include/uapi/asm/linkage.h was deleted but the header-y not
removed from the Kbuild file. This happens because Makefile.headersinst
still checks the old asm/ directory if the specified header doesn't
exist in the uapi directory.
The asm/linkage.h shouldn't ever have been exported anyway. No other
arch does and it doesn't contain anything useful to userland, so remove
the header-y line from the Kbuild file which triggers the export.
Signed-off-by: James Hogan <james.hogan@imgtec.com>
Cc: Benjamin Herrenschmidt <benh@kernel.crashing.org>
Cc: Paul Mackerras <paulus@samba.org>
Cc: linuxppc-dev@lists.ozlabs.org
Cc: Al Viro <viro@zeniv.linux.org.uk>
Cc: David Woodhouse <dwmw2@infradead.org>
Cc: David Howells <dhowells@redhat.com>
Signed-off-by: Benjamin Herrenschmidt <benh@kernel.crashing.org>
|
|
This reverts commit b2b5efcf208ddc9444aca77336627428782a39f4.
This code was way too board specific, there are quirks as to how
the PERST line is wired on different boards, we'll have to revisit
this using/creating appropriate firmware interfaces.
Signed-off-by: Benjamin Herrenschmidt <benh@kernel.crashing.org>
|
|
|
|
Patch queue for 3.15 - 2014-05-12
This request includes a few bug fixes that really shouldn't wait for the next
release.
It fixes KVM on 32bit PowerPC when built as module. It also fixes the PV KVM
acceleration when NX gets honored by the host. Furthermore we fix transactional
memory support and numa support on HV KVM.
|
|
This series adds support for building the powerpc 64-bit
LE kernel using the new ABI v2. We already supported
running ABI v2 userspace programs but this adds support
for building the kernel itself using the new ABI.
|
|
Now powerpc is the only user of struct boot_param_header and FDT defines,
so they can be moved into the powerpc architecture code.
Signed-off-by: Rob Herring <robh@kernel.org>
Acked-by: Benjamin Herrenschmidt <benh@kernel.crashing.org>
Cc: Paul Mackerras <paulus@samba.org>
Cc: linuxppc-dev@lists.ozlabs.org
Tested-by: Grant Likely <grant.likely@linaro.org>
Tested-by: Stephen Chivers <schivers@csc.com>
|
|
Unaligned stores take alignment exceptions on POWER7 running in little-endian.
This is a dumb little-endian base memcpy that prevents unaligned stores.
Once booted the feature fixup code switches over to the VMX copy loops
(which are already endian safe).
The question is what we do before that switch over. The base 64bit
memcpy takes alignment exceptions on POWER7 so we can't use it as is.
Fixing the causes of alignment exception would slow it down, because
we'd need to ensure all loads and stores are aligned either through
rotate tricks or bytewise loads and stores. Either would be bad for
all other 64bit platforms.
[ I simplified the loop a bit - Anton ]
Signed-off-by: Philippe Bergheaud <felix@linux.vnet.ibm.com>
Signed-off-by: Anton Blanchard <anton@samba.org>
Signed-off-by: Benjamin Herrenschmidt <benh@kernel.crashing.org>
|
|
Our PV guest patching code assembles chunks of instructions on the fly when it
encounters more complicated instructions to hijack. These instructions need
to live in a section that we don't mark as non-executable, as otherwise we
fault when jumping there.
Right now we put it into the .bss section where it automatically gets marked
as non-executable. Add a check to the NX setting function to ensure that we
leave these particular pages executable.
Signed-off-by: Alexander Graf <agraf@suse.de>
|
|
If we do a treclaim and we are not in TM suspend mode, it results in a TM bad
thing (ie. a 0x700 program check). Similarly if we do a trechkpt and we have
an active transaction or TEXASR Failure Summary (FS) is not set, we also take a
TM bad thing.
This should never happen, but if it does (ie. a kernel bug), the cause is
almost impossible to debug as the GPR state is mostly userspace and hence we
don't get a call chain.
This adds some checks in these cases case a BUG_ON() (in asm) in case we ever
hit these cases. It moves the register saving around to preserve r1 till later
also.
Signed-off-by: Michael Neuling <mikey@neuling.org>
Signed-off-by: Benjamin Herrenschmidt <benh@kernel.crashing.org>
|
|
Currently, the code in setup-common.c for powerpc assumes that all
clock rates are same in a smp system. This value is cached in the
variable named ppc_proc_freq and is the value that is reported in
/proc/cpuinfo.
However on the PowerNV platform, the clock rate is same only across
the threads of the same core. Hence the value that is reported in
/proc/cpuinfo is incorrect on PowerNV platforms. We need a better way
to query and report the correct value of the processor clock in
/proc/cpuinfo.
The patch achieves this by creating a machdep_call named
get_proc_freq() which is expected to returns the frequency in Hz. The
code in show_cpuinfo() can invoke this method to display the correct
clock rate on platforms that have implemented this method. On the
other powerpc platforms it can use the value cached in ppc_proc_freq.
Signed-off-by: Gautham R. Shenoy <ego@linux.vnet.ibm.com>
Signed-off-by: Benjamin Herrenschmidt <benh@kernel.crashing.org>
|
|
Firmware update on PowerNV platform takes several minutes. During
this time one CPU is stuck in FW and the kernel complains about "soft
lockups".
This patch returns all secondary CPUs to firmware before starting
firmware update process.
[ Reworked a bit and cleaned up -- BenH ]
Signed-off-by: Vasant Hegde <hegdevasant@linux.vnet.ibm.com>
Signed-off-by: Benjamin Herrenschmidt <benh@kernel.crashing.org>
|
|
The patch intends to support fundamental reset on PLX downstream
ports. If the PCI device matches any one of the internal table,
which includes PLX vendor ID, bridge device ID, register offset
for fundamental reset and bit, fundamental reset will be done
accordingly. Otherwise, it will fail back to hot reset.
Additional flag (EEH_DEV_FRESET) is introduced to record the last
reset type on the PCI bridge.
Signed-off-by: Gavin Shan <gwshan@linux.vnet.ibm.com>
Signed-off-by: Benjamin Herrenschmidt <benh@kernel.crashing.org>
|
|
The problem was initially reported by Wendy who tried pass through
IPR adapter, which was connected to PHB root port directly, to KVM
based guest. When doing that, pci_reset_bridge_secondary_bus() was
called by VFIO driver and linkDown was detected by the root port.
That caused all PEs to be frozen.
The patch fixes the issue by routing the reset for the secondary bus
of root port to underly firmware. For that, one more weak function
pci_reset_secondary_bus() is introduced so that the individual platforms
can override that and do specific reset for bridge's secondary bus.
Reported-by: Wendy Xiong <wenxiong@linux.vnet.ibm.com>
Signed-off-by: Gavin Shan <gwshan@linux.vnet.ibm.com>
Signed-off-by: Benjamin Herrenschmidt <benh@kernel.crashing.org>
|
|
Basically, we have 3 types of resets to fulfil PE reset: fundamental,
hot and PHB reset. For the later 2 cases, we need PCI bus reset hold
and settlement delay as specified by PCI spec. PowerNV and pSeries
platforms are running on top of different firmware and some of the
delays have been covered by underly firmware (PowerNV).
The patch makes the delays unified to be done in backend, instead of
EEH core.
Signed-off-by: Gavin Shan <gwshan@linux.vnet.ibm.com>
Signed-off-by: Benjamin Herrenschmidt <benh@kernel.crashing.org>
|
|
The issue was detected in a bit complicated test case where
we have multiple hierarchical PEs shown as following figure:
+-----------------+
| PE#3 p2p#0 |
| p2p#1 |
+-----------------+
|
+-----------------+
| PE#4 pdev#0 |
| pdev#1 |
+-----------------+
PE#4 (have 2 PCI devices) is the child of PE#3, which has 2 p2p
bridges. We accidentally had less-known scenario: PE#4 was removed
permanently from the system because of permanent failure (e.g.
exceeding the max allowd failure times in last hour), then we detects
EEH errors on PE#3 and tried to recover it. However, eeh_dev instances
for pdev#0/1 were not detached from PE#4, which was still connected to
PE#3. All of that was because of the fact that we rely on count-based
pcibios_release_device(), which isn't reliable enough. When doing
recovery for PE#3, we still apply hotplug on PE#4 and pdev#0/1, which
are not valid any more. Eventually, we run into kernel crash.
The patch fixes above issue from two aspects. For unplug, we simply
skip those permanently removed PE, whose state is (EEH_PE_STATE_ISOLATED
&& !EEH_PE_STATE_RECOVERING) and its frozen count should be greater
than EEH_MAX_ALLOWED_FREEZES. For plug, we marked all permanently
removed EEH devices with EEH_DEV_REMOVED and return 0xFF's on read
its PCI config so that PCI core will omit them.
Signed-off-by: Gavin Shan <gwshan@linux.vnet.ibm.com>
Signed-off-by: Benjamin Herrenschmidt <benh@kernel.crashing.org>
|
|
There're 2 EEH subsystem variables: eeh_subsystem_enabled and
eeh_probe_mode. We needn't maintain 2 variables and we can just
have one variable and introduce different flags. The patch also
introduces additional flag EEH_FORCE_DISABLE, which will be used
to disable EEH subsystem via boot parameter ("eeh=off") in future.
Besides, the patch also introduces flag EEH_ENABLED, which is
changed to disable or enable EEH functionality on the fly through
debugfs entry in future.
With the patch applied, the creteria to check the enabled EEH
functionality is changed to:
!EEH_FORCE_DISABLED && EEH_ENABLED : Enabled
Other cases : Disabled
Signed-off-by: Gavin Shan <gwshan@linux.vnet.ibm.com>
Signed-off-by: Benjamin Herrenschmidt <benh@kernel.crashing.org>
|
|
When calling into eeh_gather_pci_data() on pSeries platform, we
possiblly don't have pci_dev instance yet, but eeh_dev is always
ready. So we use cached capability from eeh_dev instead of pci_dev
for log dump there. In order to keep things unified, we also cache
PCI capability positions to eeh_dev for PowerNV as well.
Signed-off-by: Gavin Shan <gwshan@linux.vnet.ibm.com>
Signed-off-by: Benjamin Herrenschmidt <benh@kernel.crashing.org>
|
|
We've observed multiple PE reset failures because of PCI-CFG
access during that period. Potentially, some device drivers
can't support EEH very well and they can't put the device to
motionless state before PE reset. So those device drivers might
produce PCI-CFG accesses during PE reset. Also, we could have
PCI-CFG access from user space (e.g. "lspci"). Since access to
frozen PE should return 0xFF's, we can block PCI-CFG access
during the period of PE reset so that we won't get recrusive EEH
errors.
The patch adds flag EEH_PE_RESET, which is kept during PE reset.
The PowerNV/pSeries PCI-CFG accessors reuse the flag to block
PCI-CFG accordingly.
Signed-off-by: Gavin Shan <gwshan@linux.vnet.ibm.com>
Signed-off-by: Benjamin Herrenschmidt <benh@kernel.crashing.org>
|
|
The PE state (for eeh_pe instance) EEH_PE_PHB_DEAD is duplicate to
EEH_PE_ISOLATED. Originally, those PHBs (PHB PE) with EEH_PE_PHB_DEAD
would be removed from the system. However, it's safe to replace
that with EEH_PE_ISOLATED.
The patch also clear EEH_PE_RECOVERING after fenced PHB has been handled,
either failure or success. It makes the PHB PE state consistent with:
PHB functions normally NONE
PHB has been removed EEH_PE_ISOLATED
PHB fenced, recovery in progress EEH_PE_ISOLATED | RECOVERING
Signed-off-by: Gavin Shan <gwshan@linux.vnet.ibm.com>
Signed-off-by: Benjamin Herrenschmidt <benh@kernel.crashing.org>
|
|
I've had a report that the current limit is too small for
an automated network based installer. Bump it.
Signed-off-by: Anton Blanchard <anton@samba.org>
Signed-off-by: Benjamin Herrenschmidt <benh@kernel.crashing.org>
|
|
Signed-off-by: Anton Blanchard <anton@samba.org>
Signed-off-by: Benjamin Herrenschmidt <benh@kernel.crashing.org>
|
|
We have two copies of code that creates an OPAL sg list. Consolidate
these into a common set of helpers and fix the endian issues.
The flash interface embedded a version number in the num_entries
field, whereas the dump interface did did not. Since versioning
wasn't added to the flash interface and it is impossible to add
this in a backwards compatible way, just remove it.
Signed-off-by: Anton Blanchard <anton@samba.org>
Signed-off-by: Benjamin Herrenschmidt <benh@kernel.crashing.org>
|
|
Fix little endian issues with the OPAL error log code.
Signed-off-by: Anton Blanchard <anton@samba.org>
Reviewed-by: Stewart Smith <stewart@linux.vnet.ibm.com>
Signed-off-by: Benjamin Herrenschmidt <benh@kernel.crashing.org>
|
|
We had some duplication of the internal OPAL functions.
Signed-off-by: Anton Blanchard <anton@samba.org>
Signed-off-by: Benjamin Herrenschmidt <benh@kernel.crashing.org>
|
|
Using size_t in our APIs is asking for trouble, especially
when some OPAL calls use size_t pointers.
Signed-off-by: Anton Blanchard <anton@samba.org>
Reviewed-by: Stewart Smith <stewart@linux.vnet.ibm.com>
Signed-off-by: Benjamin Herrenschmidt <benh@kernel.crashing.org>
|
|
ftrace has way too much knowledge of our kernel module trampoline
layout hidden inside it. Create module_trampoline_target() that gives
the target address of a kernel module trampoline.
Signed-off-by: Anton Blanchard <anton@samba.org>
|
|
ftrace has way too much knowledge of our kernel module trampoline
layout hidden inside it. Create is_module_trampoline() that can
abstract this away inside the module loader code.
Signed-off-by: Anton Blanchard <anton@samba.org>
|
|
Use ppc_function_entry in places where we previously assumed
function descriptors exist.
Signed-off-by: Anton Blanchard <anton@samba.org>
|
|
If an assembly function that calls back into c code is exported to
modules, we need to ensure r2 is setup correctly. There are only
two places crazy enough to do it (two of which are my fault).
Signed-off-by: Anton Blanchard <anton@samba.org>
|
|
The new ELF ABI tends to use R_PPC64_REL16_LO and R_PPC64_REL16_HA
relocations (PC-relative), so implement them.
Signed-off-by: Rusty Russell <rusty@rustcorp.com.au>
|
|
The kernel resolved the '.TOC.' to a fake symbol, so we need to fix it up
to point to our .toc section plus 0x8000.
Signed-off-by: Rusty Russell <rusty@rustcorp.com.au>
|
|
Signed-off-by: Rusty Russell <rusty@rustcorp.com.au>
|
|
Don't try and dereference a function descriptor on ABIv2.
Signed-off-by: Anton Blanchard <anton@samba.org>
|
|
TRACE_WITH_FRAME_BUFFER creates 32 byte stack frames. On ppc64
ABIv1 this is too small and a callee could corrupt the stack by
writing to the parameter save area (starting at offset 48).
Signed-off-by: Anton Blanchard <anton@samba.org>
|
|
The r2 TOC/GOT save offset is 40 on ABIv1 and 24 on ABIv2.
Signed-off-by: Anton Blanchard <anton@samba.org>
|
|
Fix STK_PARAM and use it instead of hardcoding ABIv1 offsets.
Signed-off-by: Anton Blanchard <anton@samba.org>
|
|
Skip over the well known global entry point code for ABIv2.
Signed-off-by: Anton Blanchard <anton@samba.org>
|
|
ABIv2 doesn't have function descriptors or dot symbols. One
new thing it does add is a function global and a local entry
point, so add that to our _GLOBAL macro.
Signed-off-by: Anton Blanchard <anton@samba.org>
|
|
There are a few places we have to use dot symbols with the
current ABI - the syscall table and the kvm hcall table.
Wrap both of these with a new macro called DOTSYM so it will
be easy to transition away from dot symbols in a future ABI.
Signed-off-by: Anton Blanchard <anton@samba.org>
|
|
Now there are no users of _INIT_GLOBAL(), _STATIC() and _INIT_STATIC()
we can remove them.
Signed-off-by: Anton Blanchard <anton@samba.org>
|
|
binutils is smart enough to know that a branch to a function
descriptor is actually a branch to the functions text address.
Alan tells me that binutils has been doing this for 9 years.
Signed-off-by: Anton Blanchard <anton@samba.org>
|