summaryrefslogtreecommitdiff
path: root/arch/x86/kernel
AgeCommit message (Collapse)Author
2014-11-04x86-64: Use RIP-relative addressing for most per-CPU accessesJan Beulich
Observing that per-CPU data (in the SMP case) is reachable by exploiting 64-bit address wraparound (building on the default kernel load address being at 16Mb), the one byte shorter RIP-relative addressing form can be used for most per-CPU accesses. The one exception are the "stable" reads, where the use of the "P" operand modifier prevents the compiler from using RIP-relative addressing, but is unavoidable due to the use of the "p" constraint (side note: with gcc 4.9.x the intended effect of this isn't being achieved anymore, see gcc bug 63637). With the dependency on the minimum kernel load address, arbitrarily low values for CONFIG_PHYSICAL_START are now no longer possible. A link time assertion is being added, directing to the need to increase that value when it triggers. Signed-off-by: Jan Beulich <jbeulich@suse.com> Link: http://lkml.kernel.org/r/5458A1780200007800044A9D@mail.emea.novell.com Signed-off-by: Thomas Gleixner <tglx@linutronix.de>
2014-11-04x86: Convert a few more per-CPU items to read-mostly onesJan Beulich
Both this_cpu_off and cpu_info aren't getting modified post boot, yet are being accessed on enough code paths that grouping them with other frequently read items seems desirable. For cpu_info this at the same time implies removing the cache line alignment (which afaict became pointless when it got converted to per-CPU data years ago). Signed-off-by: Jan Beulich <jbeulich@suse.com> Link: http://lkml.kernel.org/r/54589BD20200007800044A84@mail.emea.novell.com Signed-off-by: Thomas Gleixner <tglx@linutronix.de>
2014-11-04x86: numachip: APIC driver cleanupsDaniel J Blueman
Drop printing that serves no purpose, as it's printing fixed or known values, and mark constant structure appropriately. Signed-off-by: Daniel J Blueman <daniel@numascale.com> Cc: Steffen Persvold <sp@numascale.com> Cc: Bjorn Helgaas <bhelgaas@google.com> Link: http://lkml.kernel.org/r/1415089784-28779-3-git-send-email-daniel@numascale.com Signed-off-by: Thomas Gleixner <tglx@linutronix.de>
2014-11-04x86: numachip: Elide self-IPI ICR pollingDaniel J Blueman
The default self-IPI path polls the ICR to delay sending the IPI until there is no IPI in progress. This is redundant on x86-86 APICs, since IPIs are queued. See the AMD64 Architecture Programmer's Manual, vol 2, p525. Signed-off-by: Daniel J Blueman <daniel@numascale.com> Cc: Steffen Persvold <sp@numascale.com> Cc: Bjorn Helgaas <bhelgaas@google.com> Link: http://lkml.kernel.org/r/1415089784-28779-2-git-send-email-daniel@numascale.com Signed-off-by: Thomas Gleixner <tglx@linutronix.de>
2014-11-04x86: numachip: Fix 16-bit APIC ID truncationDaniel J Blueman
Prevent 16-bit APIC IDs being truncated by using correct mask. This fixes booting large systems, where the wrong core would receive the startup and init IPIs, causing hanging. Signed-off-by: Daniel J Blueman <daniel@numascale.com> Cc: Steffen Persvold <sp@numascale.com> Cc: Bjorn Helgaas <bhelgaas@google.com> Link: http://lkml.kernel.org/r/1415089784-28779-1-git-send-email-daniel@numascale.com Signed-off-by: Thomas Gleixner <tglx@linutronix.de>
2014-11-03x86_64,vsyscall: Make vsyscall emulation configurableAndy Lutomirski
This adds CONFIG_X86_VSYSCALL_EMULATION, guarded by CONFIG_EXPERT. Turning it off completely disables vsyscall emulation, saving ~3.5k for vsyscall_64.c, 4k for vsyscall_emu_64.S (the fake vsyscall page), some tiny amount of core mm code that supports a gate area, and possibly 4k for a wasted pagetable. The latter is because the vsyscall addresses are misaligned and fit poorly in the fixmap. Signed-off-by: Andy Lutomirski <luto@amacapital.net> Reviewed-by: Josh Triplett <josh@joshtriplett.org> Cc: Konrad Rzeszutek Wilk <konrad.wilk@oracle.com> Link: http://lkml.kernel.org/r/406db88b8dd5f0cbbf38216d11be34bbb43c7eae.1414618407.git.luto@amacapital.net Signed-off-by: Thomas Gleixner <tglx@linutronix.de>
2014-11-03x86_64, vsyscall: Rewrite comment and clean up headers in vsyscall codeAndy Lutomirski
vsyscall_64.c is just vsyscall emulation. Tidy it up accordingly. [ tglx: Preserved the original copyright notices ] Signed-off-by: Andy Lutomirski <luto@amacapital.net> Reviewed-by: Josh Triplett <josh@joshtriplett.org> Cc: Konrad Rzeszutek Wilk <konrad.wilk@oracle.com> Link: http://lkml.kernel.org/r/9c448d5643d0fdb618f8cde9a54c21d2bcd486ce.1414618407.git.luto@amacapital.net Signed-off-by: Thomas Gleixner <tglx@linutronix.de>
2014-11-03x86_64, vsyscall: Turn vsyscalls all the way off when vsyscall==noneAndy Lutomirski
I see no point in having an unusable read-only page sitting at 0xffffffffff600000 when vsyscall=none. Instead, skip mapping it and remove it from /proc/PID/maps. I kept the ratelimited warning when programs try to use a vsyscall in this mode, since it may help admins avoid confusion. Signed-off-by: Andy Lutomirski <luto@amacapital.net> Reviewed-by: Josh Triplett <josh@joshtriplett.org> Cc: Konrad Rzeszutek Wilk <konrad.wilk@oracle.com> Link: http://lkml.kernel.org/r/0dddbadc1d4e3bfbaf887938ff42afc97a7cc1f2.1414618407.git.luto@amacapital.net Signed-off-by: Thomas Gleixner <tglx@linutronix.de>
2014-11-03x86,vdso: Use LSL unconditionally for vgetcpuAndy Lutomirski
LSL is faster than RDTSCP and works everywhere; there's no need to switch between them depending on CPU. Signed-off-by: Andy Lutomirski <luto@amacapital.net> Cc: Andi Kleen <andi@firstfloor.org> Link: http://lkml.kernel.org/r/72f73d5ec4514e02bba345b9759177ef03742efb.1414706021.git.luto@amacapital.net Signed-off-by: Thomas Gleixner <tglx@linutronix.de>
2014-11-03kvm: kvmclock: use get_cpu() and put_cpu()Tiejun Chen
We can use get_cpu() and put_cpu() to replace preempt_disable()/cpu = smp_processor_id() and preempt_enable() for slightly better code. Signed-off-by: Tiejun Chen <tiejun.chen@intel.com> Signed-off-by: Paolo Bonzini <pbonzini@redhat.com>
2014-11-01x86, microcode, AMD: Fix early ucode loading on 32-bitBorislav Petkov
Konrad triggered the following splat below in a 32-bit guest on an AMD box. As it turns out, in save_microcode_in_initrd_amd() we're using the *physical* address of the container *after* we have enabled paging and thus we #PF in load_microcode_amd() when trying to access the microcode container in the ramdisk range. Because the ramdisk is exactly there: [ 0.000000] RAMDISK: [mem 0x35e04000-0x36ef9fff] and we fault at 0x35e04304. And since this guest doesn't relocate the ramdisk, we don't do the computation which will give us the correct virtual address and we end up with the PA. So, we should actually be using virtual addresses on 32-bit too by the time we're freeing the initrd. Do that then! Unpacking initramfs... BUG: unable to handle kernel paging request at 35d4e304 IP: [<c042e905>] load_microcode_amd+0x25/0x4a0 *pde = 00000000 Oops: 0000 [#1] SMP Modules linked in: CPU: 0 PID: 1 Comm: swapper/0 Not tainted 3.17.1-302.fc21.i686 #1 Hardware name: Xen HVM domU, BIOS 4.4.1 10/01/2014 task: f5098000 ti: f50d0000 task.ti: f50d0000 EIP: 0060:[<c042e905>] EFLAGS: 00010246 CPU: 0 EIP is at load_microcode_amd+0x25/0x4a0 EAX: 00000000 EBX: f6e9ec4c ECX: 00001ec4 EDX: 00000000 ESI: f5d4e000 EDI: 35d4e2fc EBP: f50d1ed0 ESP: f50d1e94 DS: 007b ES: 007b FS: 00d8 GS: 00e0 SS: 0068 CR0: 8005003b CR2: 35d4e304 CR3: 00e33000 CR4: 000406d0 Stack: 00000000 00000000 f50d1ebc f50d1ec4 f5d4e000 c0d7735a f50d1ed0 15a3d17f f50d1ec4 00600f20 00001ec4 bfb83203 f6e9ec4c f5d4e000 c0d7735a f50d1ed8 c0d80861 f50d1ee0 c0d80429 f50d1ef0 c0d889a9 f5d4e000 c0000000 f50d1f04 Call Trace: ? unpack_to_rootfs ? unpack_to_rootfs save_microcode_in_initrd_amd save_microcode_in_initrd free_initrd_mem populate_rootfs ? unpack_to_rootfs do_one_initcall ? unpack_to_rootfs ? repair_env_string ? proc_mkdir kernel_init_freeable kernel_init ret_from_kernel_thread ? rest_init Reported-and-tested-by: Konrad Rzeszutek Wilk <konrad.wilk@oracle.com> References: https://bugzilla.redhat.com/show_bug.cgi?id=1158204 Fixes: 75a1ba5b2c52 ("x86, microcode, AMD: Unify valid container checks") Signed-off-by: Borislav Petkov <bp@suse.de> Cc: <stable@vger.kernel.org> # v3.14+ Link: http://lkml.kernel.org/r/20141101100100.GA4462@pd.tnic Signed-off-by: Thomas Gleixner <tglx@linutronix.de>
2014-11-01x86, MCE, AMD: Assign interrupt handler only when bank supports itChen Yucong
There are some AMD CPU models which have thresholding banks but which cannot generate a thresholding interrupt. This is denoted by the bit MCi_MISC[IntP]. Make sure to check that bit before assigning the thresholding interrupt handler. Signed-off-by: Chen Yucong <slaoub@gmail.com> [ Boris: save an indentation level and rewrite commit message. ] Link: http://lkml.kernel.org/r/1412662128.28440.18.camel@debian Signed-off-by: Borislav Petkov <bp@suse.de>
2014-10-31Merge branch 'x86-urgent-for-linus' of ↵Linus Torvalds
git://git.kernel.org/pub/scm/linux/kernel/git/tip/tip Pull x86 fixes from Ingo Molnar: "Fixes from all around the place: - hyper-V 32-bit PAE guest kernel fix - two IRQ allocation fixes on certain x86 boards - intel-mid boot crash fix - intel-quark quirk - /proc/interrupts duplicate irq chip name fix - cma boot crash fix - syscall audit fix - boot crash fix with certain TSC configurations (seen on Qemu) - smpboot.c build warning fix" * 'x86-urgent-for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/tip/tip: x86, pageattr: Prevent overflow in slow_virt_to_phys() for X86_PAE ACPI, irq, x86: Return IRQ instead of GSI in mp_register_gsi() x86, intel-mid: Create IRQs for APB timers and RTC timers x86: Don't enable F00F workaround on Intel Quark processors x86/irq: Fix XT-PIC-XT-PIC in /proc/interrupts x86, cma: Reserve DMA contiguous area after initmem_init() i386/audit: stop scribbling on the stack frame x86, apic: Handle a bad TSC more gracefully x86: ACPI: Do not translate GSI number if IOAPIC is disabled x86/smpboot: Move data structure to its primary usage scope
2014-10-31ftrace/x86: Show trampoline call function in enabled_functionsSteven Rostedt (Red Hat)
The file /sys/kernel/debug/tracing/eneabled_functions is used to debug ftrace function hooks. Add to the output what function is being called by the trampoline if the arch supports it. Add support for this feature in x86_64. Cc: H. Peter Anvin <hpa@linux.intel.com> Tested-by: Masami Hiramatsu <masami.hiramatsu.pt@hitachi.com> Tested-by: Jiri Kosina <jkosina@suse.cz> Signed-off-by: Steven Rostedt <rostedt@goodmis.org>
2014-10-31ftrace/x86: Add dynamic allocated trampoline for ftrace_opsSteven Rostedt (Red Hat)
The current method of handling multiple function callbacks is to register a list function callback that calls all the other callbacks based on their hash tables and compare it to the function that the callback was called on. But this is very inefficient. For example, if you are tracing all functions in the kernel and then add a kprobe to a function such that the kprobe uses ftrace, the mcount trampoline will switch from calling the function trace callback to calling the list callback that will iterate over all registered ftrace_ops (in this case, the function tracer and the kprobes callback). That means for every function being traced it checks the hash of the ftrace_ops for function tracing and kprobes, even though the kprobes is only set at a single function. The kprobes ftrace_ops is checked for every function being traced! Instead of calling the list function for functions that are only being traced by a single callback, we can call a dynamically allocated trampoline that calls the callback directly. The function graph tracer already uses a direct call trampoline when it is being traced by itself but it is not dynamically allocated. It's trampoline is static in the kernel core. The infrastructure that called the function graph trampoline can also be used to call a dynamically allocated one. For now, only ftrace_ops that are not dynamically allocated can have a trampoline. That is, users such as function tracer or stack tracer. kprobes and perf allocate their ftrace_ops, and until there's a safe way to free the trampoline, it can not be used. The dynamically allocated ftrace_ops may, although, use the trampoline if the kernel is not compiled with CONFIG_PREEMPT. But that will come later. Tested-by: Masami Hiramatsu <masami.hiramatsu.pt@hitachi.com> Tested-by: Jiri Kosina <jkosina@suse.cz> Signed-off-by: Steven Rostedt <rostedt@goodmis.org>
2014-10-29perf/x86/intel: Revert incomplete and undocumented Broadwell client supportIngo Molnar
These patches: 86a349a28b24 ("perf/x86/intel: Add Broadwell core support") c46e665f0377 ("perf/x86: Add INST_RETIRED.ALL workarounds") fdda3c4aacec ("perf/x86/intel: Use Broadwell cache event list for Haswell") introduced magic constants and unexplained changes: https://lkml.org/lkml/2014/10/28/1128 https://lkml.org/lkml/2014/10/27/325 https://lkml.org/lkml/2014/8/27/546 https://lkml.org/lkml/2014/10/28/546 Peter Zijlstra has attempted to help out, to clean up the mess: https://lkml.org/lkml/2014/10/28/543 But has not received helpful and constructive replies which makes me doubt wether it can all be finished in time until v3.18 is released. Despite various review feedback the author (Andi Kleen) has answered only few of the review questions and has generally been uncooperative, only giving replies when prompted repeatedly, and only giving minimal answers instead of constructively explaining and helping along the effort. That kind of behavior is not acceptable. There's also a boot crash on Intel E5-1630 v3 CPUs reported for another commit from Andi Kleen: e735b9db12d7 ("perf/x86/intel/uncore: Add Haswell-EP uncore support") https://lkml.org/lkml/2014/10/22/730 Which is not yet resolved. The uncore driver is independent in theory, but the crash makes me worry about how well all these patches were tested and makes me uneasy about the level of interminging that the Broadwell and Haswell code has received by the commits above. As a first step to resolve the mess revert the Broadwell client commits back to the v3.17 version, before we run out of time and problematic code hits a stable upstream kernel. ( If the Haswell-EP crash is not resolved via a simple fix then we'll have to revert the Haswell-EP uncore driver as well. ) The Broadwell client series has to be submitted in a clean fashion, with single, well documented changes per patch. If they are submitted in time and are accepted during review then they can possibly go into v3.19 but will need additional scrutiny due to the rocky history of this patch set. Cc: Andi Kleen <ak@linux.intel.com> Cc: Peter Zijlstra (Intel) <peterz@infradead.org> Cc: eranian@google.com Cc: Arnaldo Carvalho de Melo <acme@redhat.com> Cc: Linus Torvalds <torvalds@linux-foundation.org> Link: http://lkml.kernel.org/r/1409683455-29168-3-git-send-email-andi@firstfloor.org Signed-off-by: Ingo Molnar <mingo@kernel.org>
2014-10-29ACPI, irq, x86: Return IRQ instead of GSI in mp_register_gsi()Jiang Liu
Function mp_register_gsi() returns blindly the GSI number for the ACPI SCI interrupt. That causes a regression when the GSI for ACPI SCI is shared with other devices. The regression was caused by commit 84245af7297ced9e8fe "x86, irq, ACPI: Change __acpi_register_gsi to return IRQ number instead of GSI" and exposed on a SuperMicro system, which shares one GSI between ACPI SCI and PCI device, with following failure: http://sourceforge.net/p/linux1394/mailman/linux1394-user/?viewmonth=201410 [ 0.000000] ACPI: INT_SRC_OVR (bus 0 bus_irq 9 global_irq 20 low level) [ 2.699224] firewire_ohci 0000:06:00.0: failed to allocate interrupt 20 Return mp_map_gsi_to_irq(gsi, 0) instead of the GSI number. Reported-and-Tested-by: Daniel Robbins <drobbins@funtoo.org> Signed-off-by: Jiang Liu <jiang.liu@linux.intel.com> Cc: Konrad Rzeszutek Wilk <konrad.wilk@oracle.com> Cc: Tony Luck <tony.luck@intel.com> Cc: Joerg Roedel <joro@8bytes.org> Cc: Greg Kroah-Hartman <gregkh@linuxfoundation.org> Cc: Benjamin Herrenschmidt <benh@kernel.crashing.org> Cc: Rafael J. Wysocki <rjw@rjwysocki.net> Cc: Bjorn Helgaas <bhelgaas@google.com> Cc: Randy Dunlap <rdunlap@infradead.org> Cc: Yinghai Lu <yinghai@kernel.org> Cc: Borislav Petkov <bp@alien8.de> Cc: Len Brown <len.brown@intel.com> Cc: Pavel Machek <pavel@ucw.cz> Cc: <stable@vger.kernel.org> # 3.17 Link: http://lkml.kernel.org/r/1414387308-27148-4-git-send-email-jiang.liu@linux.intel.com Signed-off-by: Thomas Gleixner <tglx@linutronix.de> Signed-off-by: Ingo Molnar <mingo@kernel.org>
2014-10-29x86, intel-mid: Create IRQs for APB timers and RTC timersJiang Liu
Intel MID platforms has no legacy interrupts, so no IRQ descriptors preallocated. We need to call mp_map_gsi_to_irq() to create IRQ descriptors for APB timers and RTC timers, otherwise it may cause invalid memory access as: [ 0.116839] BUG: unable to handle kernel NULL pointer dereference at 0000003a [ 0.123803] IP: [<c1071c0e>] setup_irq+0xf/0x4d Tested-by: Andy Shevchenko <andriy.shevchenko@linux.intel.com> Signed-off-by: Jiang Liu <jiang.liu@linux.intel.com> Cc: Konrad Rzeszutek Wilk <konrad.wilk@oracle.com> Cc: Tony Luck <tony.luck@intel.com> Cc: Joerg Roedel <joro@8bytes.org> Cc: Greg Kroah-Hartman <gregkh@linuxfoundation.org> Cc: H. Peter Anvin <hpa@linux.intel.com> Cc: Benjamin Herrenschmidt <benh@kernel.crashing.org> Cc: Rafael J. Wysocki <rjw@rjwysocki.net> Cc: Bjorn Helgaas <bhelgaas@google.com> Cc: Randy Dunlap <rdunlap@infradead.org> Cc: Yinghai Lu <yinghai@kernel.org> Cc: Borislav Petkov <bp@alien8.de> Cc: David Cohen <david.a.cohen@linux.intel.com> Cc: <stable@vger.kernel.org> # 3.17 Link: http://lkml.kernel.org/r/1414387308-27148-3-git-send-email-jiang.liu@linux.intel.com Signed-off-by: Thomas Gleixner <tglx@linutronix.de> Signed-off-by: Ingo Molnar <mingo@kernel.org>
2014-10-29x86: Don't enable F00F workaround on Intel Quark processorsDave Jones
The Intel Quark processor is a part of family 5, but does not have the F00F bug present in Pentiums of the same family. Pentiums were models 0 through 8, Quark is model 9. Signed-off-by: Dave Jones <davej@redhat.com> Cc: Bryan O'Donoghue <pure.logic@nexus-software.ie> Link: http://lkml.kernel.org/r/20141028175753.GA12743@redhat.com Signed-off-by: Thomas Gleixner <tglx@linutronix.de> Signed-off-by: Ingo Molnar <mingo@kernel.org>
2014-10-28x86/irq: Fix XT-PIC-XT-PIC in /proc/interruptsMaciej W. Rozycki
Fix duplicate XT-PIC seen in /proc/interrupts on x86 systems that make use of 8259A Programmable Interrupt Controllers. Specifically convert output like this: CPU0 0: 76573 XT-PIC-XT-PIC timer 1: 11 XT-PIC-XT-PIC i8042 2: 0 XT-PIC-XT-PIC cascade 4: 8 XT-PIC-XT-PIC serial 6: 3 XT-PIC-XT-PIC floppy 7: 0 XT-PIC-XT-PIC parport0 8: 1 XT-PIC-XT-PIC rtc0 10: 448 XT-PIC-XT-PIC fddi0 12: 23 XT-PIC-XT-PIC eth0 14: 2464 XT-PIC-XT-PIC ide0 NMI: 0 Non-maskable interrupts ERR: 0 to one like this: CPU0 0: 122033 XT-PIC timer 1: 11 XT-PIC i8042 2: 0 XT-PIC cascade 4: 8 XT-PIC serial 6: 3 XT-PIC floppy 7: 0 XT-PIC parport0 8: 1 XT-PIC rtc0 10: 145 XT-PIC fddi0 12: 31 XT-PIC eth0 14: 2245 XT-PIC ide0 NMI: 0 Non-maskable interrupts ERR: 0 that is one like we used to have from ~2.2 till it was changed sometime. The rationale is there is no value in this duplicate information, it merely clutters output and looks ugly. We only have one handler for 8259A interrupts so there is no need to give it a name separate from the name already given to irq_chip. We could define meaningful names for handlers based on bits in the ELCR register on systems that have it or the value of the LTIM bit we use in ICW1 otherwise (hardcoded to 0 though with MCA support gone), to tell edge-triggered and level-triggered inputs apart. While that information does not affect 8259A interrupt handlers it could help people determine which lines are shareable and which are not. That is material for a separate change though. Any tools that parse /proc/interrupts are supposed not to be affected since it was many years we used the format this change converts back to. Signed-off-by: Maciej W. Rozycki <macro@linux-mips.org> Cc: Linus Torvalds <torvalds@linux-foundation.org> Link: http://lkml.kernel.org/r/alpine.LFD.2.11.1410260147190.21390@eddie.linux-mips.org Signed-off-by: Ingo Molnar <mingo@kernel.org>
2014-10-28x86/asm: Fix typo in arch/x86/kernel/asm_offset_64.cNicholas Mc Guire
Drop double entry for pt_regs_bx. This seems to be a typo - resulting in a double entry in the generated include/generated/asm-offsets.h, which is not necessary. Build-tested and booted on x86 64 box to make sure it was not doing any strange magic.... after all it was in the kernel in this form for almost 10 years. Signed-off-by: Nicholas Mc Guire <der.herr@hofr.at> Cc: Jan Beulich <JBeulich@suse.com> Cc: Andy Lutomirski <luto@amacapital.net> Cc: Linus Torvalds <torvalds@linux-foundation.org> Link: http://lkml.kernel.org/r/20141027172805.GA19760@opentech.at Signed-off-by: Ingo Molnar <mingo@kernel.org>
2014-10-28x86_64/vdso: Remove jiffies from the vvar pageAndy Lutomirski
I think that the jiffies vvar was once used for the vgetcpu cache. That code is long gone, so let's just make jiffies be a normal variable. Signed-off-by: Andy Lutomirski <luto@amacapital.net> Link: http://lkml.kernel.org/r/fcfee6f8749af14d96373a9e2656354ad0b95499.1411494540.git.luto@amacapital.net Signed-off-by: Ingo Molnar <mingo@kernel.org>
2014-10-28x86_64/vdso: Move getcpu code from vsyscall_64.c to vdso/vma.cAndy Lutomirski
This is pure cut-and-paste. At this point, vsyscall_64.c contains only code needed for vsyscall emulation, but some of the comments and function names are still confused. Signed-off-by: Andy Lutomirski <luto@amacapital.net> Link: http://lkml.kernel.org/r/a244daf7d3cbe71afc08ad09fdfe1866ca1f1978.1411494540.git.luto@amacapital.net Signed-off-by: Ingo Molnar <mingo@kernel.org>
2014-10-28x86_64/vsyscall: Move all of the gate_area code to vsyscall_64.cAndy Lutomirski
This code exists for the sole purpose of making the vsyscall page look sort of like real userspace memory. Move it so that it lives with the rest of the vsyscall code. Signed-off-by: Andy Lutomirski <luto@amacapital.net> Link: http://lkml.kernel.org/r/a7ee266773671a05f00b7175ca65a0dd812d2e4b.1411494540.git.luto@amacapital.net Signed-off-by: Ingo Molnar <mingo@kernel.org>
2014-10-28perf/x86: Fix compile warnings for intel_uncorePeter Zijlstra
The uncore drivers require PCI and generate compile time warnings when !CONFIG_PCI. Reported-by: Andrew Morton <akpm@linux-foundation.org> Signed-off-by: Peter Zijlstra (Intel) <peterz@infradead.org> Cc: Stephane Eranian <eranian@google.com> Cc: Andi Kleen <ak@linux.intel.com> Cc: Borislav Petkov <bp@suse.de> Cc: Josh Triplett <josh@joshtriplett.org> Cc: Stephane Eranian <eranian@google.com> Cc: Yan, Zheng <zheng.z.yan@intel.com> Signed-off-by: Ingo Molnar <mingo@kernel.org>
2014-10-28perf: Fix bogus kernel printkPeter Zijlstra (Intel)
Andy spotted the fail in what was intended as a conditional printk level. Reported-by: Andy Lutomirski <luto@amacapital.net> Fixes: cc6cd47e7395 ("perf/x86: Tone down kernel messages when the PMU check fails in a virtual environment") Signed-off-by: Peter Zijlstra (Intel) <peterz@infradead.org> Cc: Arnaldo Carvalho de Melo <acme@kernel.org> Cc: Linus Torvalds <torvalds@linux-foundation.org> Link: http://lkml.kernel.org/r/20141007124757.GH19379@twins.programming.kicks-ass.net Signed-off-by: Ingo Molnar <mingo@kernel.org>
2014-10-28x86, cma: Reserve DMA contiguous area after initmem_init()Weijie Yang
Fengguang Wu reported a boot crash on the x86 platform via the 0-day Linux Kernel Performance Test: cma: dma_contiguous_reserve: reserving 31 MiB for global area BUG: Int 6: CR2 (null) [<41850786>] dump_stack+0x16/0x18 [<41d2b1db>] early_idt_handler+0x6b/0x6b [<41072227>] ? __phys_addr+0x2e/0xca [<41d4ee4d>] cma_declare_contiguous+0x3c/0x2d7 [<41d6d359>] dma_contiguous_reserve_area+0x27/0x47 [<41d6d4d1>] dma_contiguous_reserve+0x158/0x163 [<41d33e0f>] setup_arch+0x79b/0xc68 [<41d2b7cf>] start_kernel+0x9c/0x456 [<41d2b2ca>] i386_start_kernel+0x79/0x7d (See details at: https://lkml.org/lkml/2014/10/8/708) It is because dma_contiguous_reserve() is called before initmem_init() in x86, the variable high_memory is not initialized but accessed by __pa(high_memory) in dma_contiguous_reserve(). This patch moves dma_contiguous_reserve() after initmem_init() so that high_memory is initialized before accessed. Reported-by: Fengguang Wu <fengguang.wu@intel.com> Signed-off-by: Weijie Yang <weijie.yang@samsung.com> Acked-by: Andrew Morton <akpm@linux-foundation.org> Acked-by: Marek Szyprowski <m.szyprowski@samsung.com> Acked-by: Michal Nazarewicz <mina86@mina86.com> Cc: iamjoonsoo.kim@lge.com Cc: 'Linux-MM' <linux-mm@kvack.org> Cc: 'Weijie Yang' <weijie.yang.kh@gmail.com> Link: http://lkml.kernel.org/r/000101cfef69%2431e528a0%2495af79e0%24%25yang@samsung.com Signed-off-by: Ingo Molnar <mingo@kernel.org>
2014-10-28Merge tag 'drm-intel-next-2014-10-03-no-ppgtt' of ↵Dave Airlie
git://anongit.freedesktop.org/drm-intel into drm-next Ok, new attempt, this time around with full ppgtt disabled again. drm-intel-next-2014-10-03: - first batch of skl stage 1 enabling - fixes from Rodrigo to the PSR, fbc and sink crc code - kerneldoc for the frontbuffer tracking code, runtime pm code and the basic interrupt enable/disable functions - smaller stuff all over drm-intel-next-2014-09-19: - bunch more i830M fixes from Ville - full ppgtt now again enabled by default - more ppgtt fixes from Michel Thierry and Chris Wilson - plane config work from Gustavo Padovan - spinlock clarifications - piles of smaller improvements all over, as usual * tag 'drm-intel-next-2014-10-03-no-ppgtt' of git://anongit.freedesktop.org/drm-intel: (114 commits) Revert "drm/i915: Enable full PPGTT on gen7" drm/i915: Update DRIVER_DATE to 20141003 drm/i915: Remove the duplicated logic between the two shrink phases drm/i915: kerneldoc for interrupt enable/disable functions drm/i915: Use dev_priv instead of dev in irq setup functions drm/i915: s/pm._irqs_disabled/pm.irqs_enabled/ drm/i915: Clear TX FIFO reset master override bits on chv drm/i915: Make sure hardware uses the correct swing margin/deemph bits on chv drm/i915: make sink_crc return -EIO on aux read/write failure drm/i915: Constify send buffer for intel_dp_aux_ch drm/i915: De-magic the PSR AUX message drm/i915: Reinstate error level message for non-simulated gpu hangs drm/i915: Kerneldoc for intel_runtime_pm.c drm/i915: Call runtime_pm_disable directly drm/i915: Move intel_display_set_init_power to intel_runtime_pm.c drm/i915: Bikeshed rpm functions name a bit. drm/i915: Extract intel_runtime_pm.c drm/i915: Remove intel_modeset_suspend_hw drm/i915: spelling fixes for frontbuffer tracking kerneldoc drm/i915: Tighting frontbuffer tracking around flips ...
2014-10-24i386/audit: stop scribbling on the stack frameEric Paris
git commit b4f0d3755c5e9cc86292d5fd78261903b4f23d4a was very very dumb. It was writing over %esp/pt_regs semi-randomly on i686 with the expected "system can't boot" results. As noted in: https://bugs.freedesktop.org/show_bug.cgi?id=85277 This patch stops fscking with pt_regs. Instead it sets up the registers for the call to __audit_syscall_entry in the most obvious conceivable way. It then does just a tiny tiny touch of magic. We need to get what started in PT_EDX into 0(%esp) and PT_ESI into 4(%esp). This is as easy as a pair of pushes. After the call to __audit_syscall_entry all we need to do is get that now useless junk off the stack (pair of pops) and reload %eax with the original syscall so other stuff can keep going about it's business. Reported-by: Paulo Zanoni <przanoni@gmail.com> Signed-off-by: Eric Paris <eparis@redhat.com> Link: http://lkml.kernel.org/r/1414037043-30647-1-git-send-email-eparis@redhat.com Cc: Richard Guy Briggs <rgb@redhat.com> Signed-off-by: H. Peter Anvin <hpa@linux.intel.com>
2014-10-24Merge tag 'v3.18-rc1' into x86/urgentH. Peter Anvin
Reason: Need to apply audit patch on top of v3.18-rc1. Signed-off-by: H. Peter Anvin <hpa@linux.intel.com>
2014-10-22x86, apic: Handle a bad TSC more gracefullyAndy Lutomirski
If the TSC is unusable or disabled, then this patch fixes: - Confusion while trying to clear old APIC interrupts. - Division by zero and incorrect programming of the TSC deadline timer. This fixes boot if the CPU has a TSC deadline timer but a missing or broken TSC. The failure to boot can be observed with qemu using -cpu qemu64,-tsc,+tsc-deadline This also happens to me in nested KVM for unknown reasons. With this patch, I can boot cleanly (although without a TSC). Signed-off-by: Andy Lutomirski <luto@amacapital.net> Cc: Bandan Das <bsd@redhat.com> Cc: stable@vger.kernel.org Link: http://lkml.kernel.org/r/e2fa274e498c33988efac0ba8b7e3120f7f92d78.1413393027.git.luto@amacapital.net Signed-off-by: Thomas Gleixner <tglx@linutronix.de>
2014-10-21x86, MCE, AMD: Drop software-defined bank in error thresholdingBorislav Petkov
Aravind had the good question about why we're assigning a software-defined bank when reporting error thresholding errors instead of simply using the bank which reports the last error causing the overflow. Digging through git history, it pointed to 95268664390b ("[PATCH] x86_64: mce_amd support for family 0x10 processors") which added that functionality. The problem with this, however, is that tools don't know about software-defined banks and get puzzled. So drop that K8_MCE_THRESHOLD_BASE and simply use the hw bank reporting the thresholding interrupt. Save us a couple of MSR reads while at it. Reported-by: Aravind Gopalakrishnan <aravind.gopalakrishnan@amd.com> Link: https://lkml.kernel.org/r/5435B206.60402@amd.com Signed-off-by: Borislav Petkov <bp@suse.de>
2014-10-21x86, MCE, AMD: Move invariant code out from loop bodyChen Yucong
Assigning to mce_threshold_vector is loop-invariant code in mce_amd_feature_init(). So do it only once, out of loop body. Signed-off-by: Chen Yucong <slaoub@gmail.com> Link: http://lkml.kernel.org/r/1412263212.8085.6.camel@debian [ Boris: commit message corrections. ] Signed-off-by: Borislav Petkov <bp@suse.de>
2014-10-21x86, MCE, AMD: Correct thresholding error loggingChen Yucong
mce_setup() does not gather the content of IA32_MCG_STATUS, so it should be read explicitly. Moreover, we need to clear IA32_MCx_STATUS to avoid that mce_log() logs the processed threshold event again at next time. But we do the logging ourselves and machine_check_poll() is completely useless there. So kill it. Signed-off-by: Chen Yucong <slaoub@gmail.com> Signed-off-by: Borislav Petkov <bp@suse.de>
2014-10-21x86, MCE, AMD: Use macros to compute bank MSRsChen Yucong
Avoid open coded calculations for bank MSRs by hiding the index of higher bank MSRs in well-defined macros. No semantic changes. Signed-off-by: Chen Yucong <slaoub@gmail.com> Link: http://lkml.kernel.org/r/1411438561-24319-1-git-send-email-slaoub@gmail.com Signed-off-by: Borislav Petkov <bp@suse.de>
2014-10-20x86: ACPI: Do not translate GSI number if IOAPIC is disabledJiang Liu
When IOAPIC is disabled, acpi_gsi_to_irq() should return gsi directly instead of calling mp_map_gsi_to_irq() to translate gsi to IRQ by IOAPIC. It fixes https://bugzilla.kernel.org/show_bug.cgi?id=84381. This regression was introduced with commit 6b9fb7082409 "x86, ACPI, irq: Consolidate algorithm of mapping (ioapic, pin) to IRQ number" Reported-and-Tested-by: Thomas Richter <thor@math.tu-berlin.de> Signed-off-by: Jiang Liu <jiang.liu@linux.intel.com> Cc: Tony Luck <tony.luck@intel.com> Cc: Thomas Richter <thor@math.tu-berlin.de> Cc: rui.zhang@intel.com Cc: Rafael J. Wysocki <rjw@rjwysocki.net> Cc: Bjorn Helgaas <bhelgaas@google.com> Cc: <stable@vger.kernel.org> # 3.17 Link: http://lkml.kernel.org/r/1413816327-12850-1-git-send-email-jiang.liu@linux.intel.com Signed-off-by: Thomas Gleixner <tglx@linutronix.de>
2014-10-20x86, amd_nb: Add device IDs to NB tables for F15h M60hAravind Gopalakrishnan
Add F3 and F4 PCI device IDs to amd_nb_misc_ids[] and amd_nb_link_ids[] respectively. Signed-off-by: Aravind Gopalakrishnan <Aravind.Gopalakrishnan@amd.com> Acked-by: Thomas Gleixner <tglx@linutronix.de> Link: http://lkml.kernel.org/r/1411070205-10217-1-git-send-email-Aravind.Gopalakrishnan@amd.com Signed-off-by: Borislav Petkov <bp@suse.de>
2014-10-19Merge git://git.infradead.org/users/eparis/auditLinus Torvalds
Pull audit updates from Eric Paris: "So this change across a whole bunch of arches really solves one basic problem. We want to audit when seccomp is killing a process. seccomp hooks in before the audit syscall entry code. audit_syscall_entry took as an argument the arch of the given syscall. Since the arch is part of what makes a syscall number meaningful it's an important part of the record, but it isn't available when seccomp shoots the syscall... For most arch's we have a better way to get the arch (syscall_get_arch) So the solution was two fold: Implement syscall_get_arch() everywhere there is audit which didn't have it. Use syscall_get_arch() in the seccomp audit code. Having syscall_get_arch() everywhere meant it was a useless flag on the stack and we could get rid of it for the typical syscall entry. The other changes inside the audit system aren't grand, fixed some records that had invalid spaces. Better locking around the task comm field. Removing some dead functions and structs. Make some things static. Really minor stuff" * git://git.infradead.org/users/eparis/audit: (31 commits) audit: rename audit_log_remove_rule to disambiguate for trees audit: cull redundancy in audit_rule_change audit: WARN if audit_rule_change called illegally audit: put rule existence check in canonical order next: openrisc: Fix build audit: get comm using lock to avoid race in string printing audit: remove open_arg() function that is never used audit: correct AUDIT_GET_FEATURE return message type audit: set nlmsg_len for multicast messages. audit: use union for audit_field values since they are mutually exclusive audit: invalid op= values for rules audit: use atomic_t to simplify audit_serial() kernel/audit.c: use ARRAY_SIZE instead of sizeof/sizeof[0] audit: reduce scope of audit_log_fcaps audit: reduce scope of audit_net_id audit: arm64: Remove the audit arch argument to audit_syscall_entry arm64: audit: Add audit hook in syscall_trace_enter/exit() audit: x86: drop arch from __audit_syscall_entry() interface sparc: implement is_32bit_task sparc: properly conditionalize use of TIF_32BIT ...
2014-10-19x86/smpboot: Move data structure to its primary usage scopeIngo Molnar
Makes the code more readable by moving variable and usage closer to each other, which also avoids this build warning in the !CONFIG_HOTPLUG_CPU case: arch/x86/kernel/smpboot.c:105:42: warning: ‘die_complete’ defined but not used [-Wunused-variable] Cc: Prarit Bhargava <prarit@redhat.com> Cc: Lan Tianyu <tianyu.lan@intel.com> Cc: Borislav Petkov <bp@suse.de> Cc: Peter Zijlstra <peterz@infradead.org> Cc: srostedt@redhat.com Cc: toshi.kani@hp.com Cc: imammedo@redhat.com Cc: Linus Torvalds <torvalds@linux-foundation.org> Link: http://lkml.kernel.org/r/1409039025-32310-1-git-send-email-tianyu.lan@intel.com Signed-off-by: Ingo Molnar <mingo@kernel.org>
2014-10-17x86, msr: Use seek definitions instead of hard-coded valuesFabian Frederick
Replace 0/1 by SEEK_SET/SEEK_CUR. Signed-off-by: Fabian Frederick <fabf@skynet.be> Link: http://lkml.kernel.org/r/1413576120-27147-1-git-send-email-fabf@skynet.be Signed-off-by: H. Peter Anvin <hpa@linux.intel.com>
2014-10-17x86, msr: Convert printk to pr_foo()Fabian Frederick
Also define pr_fmt. Signed-off-by: Fabian Frederick <fabf@skynet.be> Link: http://lkml.kernel.org/r/1413576110-27103-1-git-send-email-fabf@skynet.be Signed-off-by: H. Peter Anvin <hpa@linux.intel.com>
2014-10-17x86, msr: Use PTR_ERR_OR_ZEROFabian Frederick
Replace IS_ERR/PTR_ERR Signed-off-by: Fabian Frederick <fabf@skynet.be> Link: http://lkml.kernel.org/r/1413576099-27059-1-git-send-email-fabf@skynet.be Signed-off-by: H. Peter Anvin <hpa@linux.intel.com>
2014-10-17x86/simplefb: Use PTR_ERR_OR_ZEROFabian Frederick
Replace IS_ERR/PTR_ERR Signed-off-by: Fabian Frederick <fabf@skynet.be> Link: http://lkml.kernel.org/r/1413576066-26925-1-git-send-email-fabf@skynet.be Signed-off-by: H. Peter Anvin <hpa@linux.intel.com>
2014-10-17x86/sysfb: Use PTR_ERR_OR_ZEROFabian Frederick
Replace IS_ERR/PTR_ERR Signed-off-by: Fabian Frederick <fabf@skynet.be> Link: http://lkml.kernel.org/r/1413576053-26761-1-git-send-email-fabf@skynet.be Signed-off-by: H. Peter Anvin <hpa@linux.intel.com>
2014-10-17x86, cpuid: Use PTR_ERR_OR_ZEROFabian Frederick
Replace IS_ERR/PTR_ERR Signed-off-by: Fabian Frederick <fabf@skynet.be> Link: http://lkml.kernel.org/r/1413576077-26969-1-git-send-email-fabf@skynet.be Signed-off-by: H. Peter Anvin <hpa@linux.intel.com>
2014-10-15Merge branch 'for-3.18-consistent-ops' of ↵Linus Torvalds
git://git.kernel.org/pub/scm/linux/kernel/git/tj/percpu Pull percpu consistent-ops changes from Tejun Heo: "Way back, before the current percpu allocator was implemented, static and dynamic percpu memory areas were allocated and handled separately and had their own accessors. The distinction has been gone for many years now; however, the now duplicate two sets of accessors remained with the pointer based ones - this_cpu_*() - evolving various other operations over time. During the process, we also accumulated other inconsistent operations. This pull request contains Christoph's patches to clean up the duplicate accessor situation. __get_cpu_var() uses are replaced with with this_cpu_ptr() and __this_cpu_ptr() with raw_cpu_ptr(). Unfortunately, the former sometimes is tricky thanks to C being a bit messy with the distinction between lvalues and pointers, which led to a rather ugly solution for cpumask_var_t involving the introduction of this_cpu_cpumask_var_ptr(). This converts most of the uses but not all. Christoph will follow up with the remaining conversions in this merge window and hopefully remove the obsolete accessors" * 'for-3.18-consistent-ops' of git://git.kernel.org/pub/scm/linux/kernel/git/tj/percpu: (38 commits) irqchip: Properly fetch the per cpu offset percpu: Resolve ambiguities in __get_cpu_var/cpumask_var_t -fix ia64: sn_nodepda cannot be assigned to after this_cpu conversion. Use __this_cpu_write. percpu: Resolve ambiguities in __get_cpu_var/cpumask_var_t Revert "powerpc: Replace __get_cpu_var uses" percpu: Remove __this_cpu_ptr clocksource: Replace __this_cpu_ptr with raw_cpu_ptr sparc: Replace __get_cpu_var uses avr32: Replace __get_cpu_var with __this_cpu_write blackfin: Replace __get_cpu_var uses tile: Use this_cpu_ptr() for hardware counters tile: Replace __get_cpu_var uses powerpc: Replace __get_cpu_var uses alpha: Replace __get_cpu_var ia64: Replace __get_cpu_var uses s390: cio driver &__get_cpu_var replacements s390: Replace __get_cpu_var uses mips: Replace __get_cpu_var uses MIPS: Replace __get_cpu_var uses in FPU emulator. arm: Replace __this_cpu_ptr with raw_cpu_ptr ...
2014-10-14Merge branch 'akpm' (patches from Andrew Morton)Linus Torvalds
Merge second patch-bomb from Andrew Morton: - a few hotfixes - drivers/dma updates - MAINTAINERS updates - Quite a lot of lib/ updates - checkpatch updates - binfmt updates - autofs4 - drivers/rtc/ - various small tweaks to less used filesystems - ipc/ updates - kernel/watchdog.c changes * emailed patches from Andrew Morton <akpm@linux-foundation.org>: (135 commits) mm: softdirty: enable write notifications on VMAs after VM_SOFTDIRTY cleared kernel/param: consolidate __{start,stop}___param[] in <linux/moduleparam.h> ia64: remove duplicate declarations of __per_cpu_start[] and __per_cpu_end[] frv: remove unused declarations of __start___ex_table and __stop___ex_table kvm: ensure hard lockup detection is disabled by default kernel/watchdog.c: control hard lockup detection default staging: rtl8192u: use %*pEn to escape buffer staging: rtl8192e: use %*pEn to escape buffer staging: wlan-ng: use %*pEhp to print SN lib80211: remove unused print_ssid() wireless: hostap: proc: print properly escaped SSID wireless: ipw2x00: print SSID via %*pE wireless: libertas: print esaped string via %*pE lib/vsprintf: add %*pE[achnops] format specifier lib / string_helpers: introduce string_escape_mem() lib / string_helpers: refactoring the test suite lib / string_helpers: move documentation to c-file include/linux: remove strict_strto* definitions arch/x86/mm/numa.c: fix boot failure when all nodes are hotpluggable fs: check bh blocknr earlier when searching lru ...
2014-10-14Merge branches 'x86-ras-for-linus', 'x86-uv-for-linus' and ↵Linus Torvalds
'x86-vdso-for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/tip/tip Pull x86 ras, uv and vdso fixlets from Ingo Molnar: "ras: tone down a kernel message to only occur during initial bootup, not during suspend/resume cycles. uv: a cleanup commit vdso: a fix to error checking" * 'x86-ras-for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/tip/tip: x86/mce: Avoid showing repetitive message from intel_init_thermal() * 'x86-uv-for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/tip/tip: x86/apic/uv: Remove unnecessary #ifdef * 'x86-vdso-for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/tip/tip: x86/vdso: Fix vdso2c's special_pages[] error checking
2014-10-14Merge branch 'x86-urgent-for-linus' of ↵Linus Torvalds
git://git.kernel.org/pub/scm/linux/kernel/git/tip/tip Pull x86 fixes from Ingo Molnar: "Misc smaller fixes that missed the v3.17 cycle" * 'x86-urgent-for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/tip/tip: x86/build: Add arch/x86/purgatory/ make generated files to gitignore x86: Fix section conflict for numachip x86: Reject x32 executables if x32 ABI not supported x86_64, entry: Filter RFLAGS.NT on entry from userspace x86, boot, kaslr: Fix nuisance warning on 32-bit builds
2014-10-14Merge branch 'x86-seccomp-for-linus' of ↵Linus Torvalds
git://git.kernel.org/pub/scm/linux/kernel/git/tip/tip Pull x86 seccomp changes from Ingo Molnar: "This tree includes x86 seccomp filter speedups and related preparatory work, which touches core seccomp facilities as well. The main idea is to split seccomp into two phases, to be able to enter a simple fast path for syscalls with ptrace side effects. There's no substantial user-visible (and ABI) effects expected from this, except a change in how we emit a better audit record for SECCOMP_RET_TRACE events" * 'x86-seccomp-for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/tip/tip: x86_64, entry: Use split-phase syscall_trace_enter for 64-bit syscalls x86_64, entry: Treat regs->ax the same in fastpath and slowpath syscalls x86: Split syscall_trace_enter into two phases x86, entry: Only call user_exit if TIF_NOHZ x86, x32, audit: Fix x32's AUDIT_ARCH wrt audit seccomp: Document two-phase seccomp and arch-provided seccomp_data seccomp: Allow arch code to provide seccomp_data seccomp: Refactor the filter callback and the API seccomp,x86,arm,mips,s390: Remove nr parameter from secure_computing