path: root/kernel
Age | Commit message | Author
2009-04-21 | ring-buffer: only warn on wrap if buffer is bigger than two pages | Steven Rostedt
On boot up, to save memory, ftrace allocates the minimum buffer, which is two pages. Ftrace also goes through a series of tests (when configured) on boot up. These tests can fill up a page within a single interrupt. The ring buffer also has a WARN_ON when it detects that the buffer was completely filled within a single commit (other commits are allowed to be nested). Combining the small buffer at start up with tests that can fill more than a single page within an interrupt can trigger the WARN_ON. This patch makes the WARN_ON only happen when the ring buffer consists of more than two pages. [ Impact: prevent false WARN_ON in ftrace startup tests ] Reported-by: Ingo Molnar <mingo@elte.hu> LKML-Reference: <20090421094616.GA14561@elte.hu> Signed-off-by: Steven Rostedt <rostedt@goodmis.org> Signed-off-by: Ingo Molnar <mingo@elte.hu>
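The condition described above can be sketched in a few lines of userspace C. This is only an illustration of the rule "warn on a full wrap only when the buffer has more than two pages"; `struct rb_sketch` and `warn_on_full_wrap()` are hypothetical names and do not come from the kernel source.

```c
#include <stdbool.h>
#include <stddef.h>

/* Hypothetical stand-in for the ring buffer state; not the kernel struct. */
struct rb_sketch {
    size_t nr_pages; /* number of pages backing the buffer */
};

/*
 * Decide whether filling the whole buffer within one commit deserves a
 * warning. With only two pages, a single interrupt can legitimately wrap
 * the buffer, so the warning is reserved for larger buffers.
 */
static bool warn_on_full_wrap(const struct rb_sketch *rb)
{
    return rb->nr_pages > 2;
}
```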
2009-04-21 | tracing/filters: allow user-input to be integer-like string | Li Zefan
Suppose we would like to trace all tasks named '123'. Currently this fails:
    # echo 'parent_comm == 123' > events/sched/sched_process_fork/filter
    bash: echo: write error: Invalid argument
Don't guess the type of the filter pred in filter_parse(); instead, check it in __filter_add_pred(). [ Impact: extend allowed filter field string values ] Signed-off-by: Li Zefan <lizf@cn.fujitsu.com> Cc: Tom Zanussi <tzanussi@gmail.com> Cc: Steven Rostedt <rostedt@goodmis.org> Cc: Frederic Weisbecker <fweisbec@gmail.com> LKML-Reference: <49ED8DEB.6000700@cn.fujitsu.com> Signed-off-by: Ingo Molnar <mingo@elte.hu>
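One way to picture the change is a setter that interprets the user's text according to the type of the field being filtered, rather than according to what the text looks like. The sketch below is a userspace illustration under that assumption; `enum field_type`, `struct pred_sketch` and `pred_set_value()` are made-up names, not the kernel's filter code.

```c
#include <errno.h>
#include <stdlib.h>
#include <string.h>

enum field_type { FIELD_INT, FIELD_STRING };

/* Hypothetical predicate value; not the kernel's filter_pred. */
struct pred_sketch {
    enum field_type type;
    long long ival;
    char sval[16];
};

/*
 * Store the user-supplied text in the predicate. The field's declared type
 * decides how the text is interpreted, so "123" is a valid string value for
 * a string field such as a task name.
 */
static int pred_set_value(struct pred_sketch *p, enum field_type ftype,
                          const char *text)
{
    char *end;

    p->type = ftype;
    if (ftype == FIELD_STRING) {
        if (strlen(text) >= sizeof(p->sval))
            return -EINVAL;
        strcpy(p->sval, text);
        return 0;
    }

    /* Integer field: the whole text must parse as a number. */
    errno = 0;
    p->ival = strtoll(text, &end, 0);
    if (errno != 0 || end == text || *end != '\0')
        return -EINVAL;
    return 0;
}
```

With this shape, the same input `123` is accepted for both a string field and an integer field, while `abc` is rejected only for the integer field.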
2009-04-21 | tracing/filters: don't remove old filters when failed to write subsys->filter | Li Zefan
If writing subsys->filter returns EINVAL or ENOSPC, the original filters in subsys/ and subsys/events/ will be removed. This is definitely wrong. [ Impact: fix filter setting semantics on error condition ] Signed-off-by: Li Zefan <lizf@cn.fujitsu.com> Cc: Tom Zanussi <tzanussi@gmail.com> Cc: Steven Rostedt <rostedt@goodmis.org> Cc: Frederic Weisbecker <fweisbec@gmail.com> LKML-Reference: <49ED8DD2.2070700@cn.fujitsu.com> Signed-off-by: Ingo Molnar <mingo@elte.hu>
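The semantics being fixed here follow a common pattern: validate and build the new state on the side, and touch the existing state only once nothing can fail. The following is a minimal userspace sketch of that pattern, not the kernel's implementation; `struct filter_sketch`, `FILTER_MAX` and `filter_replace()` are hypothetical, and the validity check is a toy.

```c
#include <errno.h>
#include <string.h>

#define FILTER_MAX 32 /* illustrative size limit, not a kernel value */

/* Hypothetical filter holder; not the kernel's data structure. */
struct filter_sketch {
    char expr[FILTER_MAX];
};

/* Replace the filter expression; on any error the old one is kept. */
static int filter_replace(struct filter_sketch *f, const char *expr)
{
    struct filter_sketch tmp = { {0} };

    if (expr == NULL || strchr(expr, '=') == NULL)
        return -EINVAL;      /* toy validity check: needs a comparison */
    if (strlen(expr) >= sizeof(tmp.expr))
        return -ENOSPC;      /* does not fit */

    strcpy(tmp.expr, expr);  /* build the new state on the side */
    *f = tmp;                /* commit only after every check passed */
    return 0;
}
```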
2009-04-20 | No need for crossing to mountpoint in audit_tag_tree() | Al Viro
is_under() will DTRT anyway. And yes, is_subdir() behaviour is intentional. Signed-off-by: Al Viro <viro@zeniv.linux.org.uk>
2009-04-20 | tracing: use nowakeup version of commit for function event trace tests | Steven Rostedt
The startup tests for the event tracer also run with the function tracer enabled. The "wakeup" version of the trace commit was used, which can grab spinlocks. If a task was preempted by an NMI that called a function being traced, it could deadlock due to the function tracer trying to grab the same lock. Thanks to Frederic Weisbecker for pointing out where the bug was. Reported-by: Ingo Molnar <mingo@elte.hu> Reported-by: Frederic Weisbecker <fweisbec@gmail.com> Signed-off-by: Steven Rostedt <rostedt@goodmis.org>
2009-04-20 | tracing: use recursive counter over irq level | Steven Rostedt
Although using the irq level (hardirq_count, softirq_count and in_nmi) was nice for detecting bad recursion right away, the counters are not atomically updated with respect to the interrupts, so the function tracer might trigger the test from an interrupt handler before the hardirq_count is updated. This will trigger a false warning. This patch converts the recursive detection to a simple counter. If the depth is greater than 16 then the recursive detection will trigger. 16 is more than enough for any nested interrupts. [ Impact: fix false positive trace recursion detection ] Signed-off-by: Steven Rostedt <rostedt@goodmis.org>
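The counter approach can be sketched as follows. This is a userspace illustration only: `struct task_sketch`, `recursion_enter()` and `recursion_exit()` are hypothetical names, and the only detail taken from the message is the limit of 16.

```c
#include <stdbool.h>

#define MAX_TRACE_DEPTH 16 /* nesting limit quoted in the commit message */

/* Hypothetical per-task state; not the kernel's task_struct. */
struct task_sketch {
    int trace_recursion; /* current nesting depth */
};

/* Enter the tracer. Returns false when nesting would exceed the limit. */
static bool recursion_enter(struct task_sketch *t)
{
    if (t->trace_recursion >= MAX_TRACE_DEPTH)
        return false; /* deeper than 16: treat as runaway recursion */
    t->trace_recursion++;
    return true;
}

/* Leave the tracer, undoing one successful recursion_enter(). */
static void recursion_exit(struct task_sketch *t)
{
    if (t->trace_recursion > 0)
        t->trace_recursion--;
}
```

Because the depth is a plain counter owned by the task, it does not depend on when the interrupt accounting is updated, which is the source of the false warning described above.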
2009-04-20 | tracing: remove recursive test from ring_buffer_event_discard | Steven Rostedt
The ring_buffer_event_discard is not tied to ring_buffer_lock_reserve. It can be called inside or outside the reserve/commit. Even if it is called inside the reserve/commit the commit part must also be called. Only ring_buffer_discard_commit can be used as a replacement for ring_buffer_unlock_commit. This patch removes the trace_recursive_unlock from ring_buffer_event_discard since it would be the wrong place to do so. [Impact: prevent breakage in trace recursive testing ] Cc: Frederic Weisbecker <fweisbec@gmail.com> Signed-off-by: Steven Rostedt <rostedt@goodmis.org>
2009-04-20 | tracing: fix recursive test level calculation | Steven Rostedt
The recursive tests to detect same level recursion in the ring buffers did not account for the hard/softirq_counts being shifted. Thus the numbers could be larger than the mask to be tested. This patch includes the shift for the calculation of the irq depth. [ Impact: stop false positives in trace recursion detection ] Signed-off-by: Steven Rostedt <rostedt@goodmis.org>
2009-04-20 | tracing/events: call the correct event trace selftest init function | Steven Rostedt
The late_initcall calls a helper function instead of the proper init event selftest function. This update may have been lost due to conflicting merges. [ Impact: fix compiler warning and call extended event trace self tests ] Signed-off-by: Steven Rostedt <rostedt@goodmis.org>
2009-04-20 | tracing: rename EVENT_TRACER config to ENABLE_EVENT_TRACING | Steven Rostedt
Currently we have two configs: EVENT_TRACING and EVENT_TRACER. All tracers enable EVENT_TRACING. The EVENT_TRACER is only a convenience to enable the EVENT_TRACING when no other tracers are enabled. The names EVENT_TRACER and EVENT_TRACING are too similar and confusing. This patch renames EVENT_TRACER to ENABLE_EVENT_TRACING to be more appropriate to what it actually does, as well as add a comment in the help menu to explain the option's purpose. [ Impact: rename config option to reduce confusion ] Reported-by: Ingo Molnar <mingo@elte.hu> Signed-off-by: Steven Rostedt <rostedt@goodmis.org>
2009-04-20 | tracing: create menuconfig for tracing infrastructure | Steven Rostedt
During testing we often use randconfig to test various kernels. The current configuration set up does not give an easy way to disable all tracing with a single config. The case where randconfig would test all tracing disabled is very unlikely. This patch adds a config option to enable or disable all tracing. It is hooked into the tracing menu just like other submenus are done. [ Impact: allow randconfig to easily produce all traces disabled ] Reported-by: Ingo Molnar <mingo@elte.hu> Signed-off-by: Steven Rostedt <rostedt@goodmis.org>
2009-04-20 | tracing: change branch profiling to a choice selection | Steven Rostedt
This patch makes the branch profiling into a choice selection:
    None - no branch profiling
    likely/unlikely - only profile likely/unlikely branches
    all - profile all branches
The all profiler will also enable the likely/unlikely branches. This does not change the way the profiler works or the dependencies between the profilers. What this patch does is keep the branch profiling from being selected by an allyesconfig make. The branch profiler is very intrusive and it is known to break various architecture builds when selected as an allyesconfig. [ Impact: prevent branch profiler from being selected in allyesconfig ] Reported-by: Heiko Carstens <heiko.carstens@de.ibm.com> Reported-by: Al Viro <viro@zeniv.linux.org.uk> Reported-by: Stephen Rothwell <sfr@canb.auug.org.au> Reported-by: Andrew Morton <akpm@linux-foundation.org> Signed-off-by: Steven Rostedt <rostedt@goodmis.org>
2009-04-20 | tracing/ring-buffer: Add unlock recursion protection on discard | Frederic Weisbecker
The pair of helpers trace_recursive_lock() and trace_recursive_unlock() have been introduced recently to provide generic tracing recursion protection. They are used in a symmetric way:
    - trace_recursive_lock() on buffer reserve
    - trace_recursive_unlock() on buffer commit
However, sometimes we don't commit an entry to the buffer but discard it, e.g. in the case of filter checking. Then we must also unlock the recursion protection at discard time, otherwise tracing gets permanently deactivated and a spurious warning is raised, such as:
    [ 111.119821] ------------[ cut here ]------------
    [ 111.119829] WARNING: at kernel/trace/ring_buffer.c:1498 ring_buffer_lock_reserve+0x1b7/0x1d0()
    [ 111.119835] Hardware name: AMILO Li 2727
    [ 111.119839] Modules linked in:
    [ 111.119846] Pid: 5731, comm: Xorg Tainted: G W 2.6.30-rc1 #69
    [ 111.119851] Call Trace:
    [ 111.119863] [<ffffffff8025ce68>] warn_slowpath+0xd8/0x130
    [ 111.119873] [<ffffffff8028a30f>] ? __lock_acquire+0x19f/0x1ae0
    [ 111.119882] [<ffffffff8028a30f>] ? __lock_acquire+0x19f/0x1ae0
    [ 111.119891] [<ffffffff802199b0>] ? native_sched_clock+0x20/0x70
    [ 111.119899] [<ffffffff80286dee>] ? put_lock_stats+0xe/0x30
    [ 111.119906] [<ffffffff80286eb8>] ? lock_release_holdtime+0xa8/0x150
    [ 111.119913] [<ffffffff802c8ae7>] ring_buffer_lock_reserve+0x1b7/0x1d0
    [ 111.119921] [<ffffffff802cd110>] trace_buffer_lock_reserve+0x30/0x70
    [ 111.119930] [<ffffffff802ce000>] trace_current_buffer_lock_reserve+0x20/0x30
    [ 111.119939] [<ffffffff802474e8>] ftrace_raw_event_sched_switch+0x58/0x100
    [ 111.119948] [<ffffffff808103b7>] __schedule+0x3a7/0x4cd
    [ 111.119957] [<ffffffff80211b56>] ? ftrace_call+0x5/0x2b
    [ 111.119964] [<ffffffff80211b56>] ? ftrace_call+0x5/0x2b
    [ 111.119971] [<ffffffff80810c08>] schedule+0x18/0x40
    [ 111.119977] [<ffffffff80810e09>] preempt_schedule+0x39/0x60
    [ 111.119985] [<ffffffff80813bd3>] _read_unlock+0x53/0x60
    [ 111.119993] [<ffffffff807259d2>] sock_def_readable+0x72/0x80
    [ 111.120002] [<ffffffff807ad5ed>] unix_stream_sendmsg+0x24d/0x3d0
    [ 111.120011] [<ffffffff807219a3>] sock_aio_write+0x143/0x160
    [ 111.120019] [<ffffffff80211b56>] ? ftrace_call+0x5/0x2b
    [ 111.120026] [<ffffffff80721860>] ? sock_aio_write+0x0/0x160
    [ 111.120033] [<ffffffff80721860>] ? sock_aio_write+0x0/0x160
    [ 111.120042] [<ffffffff8031c283>] do_sync_readv_writev+0xf3/0x140
    [ 111.120049] [<ffffffff80211b56>] ? ftrace_call+0x5/0x2b
    [ 111.120057] [<ffffffff80276ff0>] ? autoremove_wake_function+0x0/0x40
    [ 111.120067] [<ffffffff8045d489>] ? cap_file_permission+0x9/0x10
    [ 111.120074] [<ffffffff8045c1e6>] ? security_file_permission+0x16/0x20
    [ 111.120082] [<ffffffff8031cab4>] do_readv_writev+0xd4/0x1f0
    [ 111.120089] [<ffffffff80211b56>] ? ftrace_call+0x5/0x2b
    [ 111.120097] [<ffffffff80211b56>] ? ftrace_call+0x5/0x2b
    [ 111.120105] [<ffffffff8031cc18>] vfs_writev+0x48/0x70
    [ 111.120111] [<ffffffff8031cd65>] sys_writev+0x55/0xc0
    [ 111.120119] [<ffffffff80211e32>] system_call_fastpath+0x16/0x1b
    [ 111.120125] ---[ end trace 15605f4e98d5ccb5 ]---
[ Impact: fix spurious warning triggering tracing shutdown ] Signed-off-by: Frederic Weisbecker <fweisbec@gmail.com>
2009-04-19 | tracing/core: Add current context on tracing recursion warning | Frederic Weisbecker
In case of tracing recursion detection, we only get the stacktrace. But the current context may be very useful to debug the issue. This patch adds the softirq/hardirq/nmi context with the warning using lockdep context display to have a familiar output. v2: Use printk_once() v3: drop {hardirq,softirq}_context which depend on lockdep, only keep what is part of current->trace_recursion, sufficient to debug the warning source. [ Impact: print context necessary to debug recursion ] Signed-off-by: Frederic Weisbecker <fweisbec@gmail.com>
2009-04-19 | PM/Suspend: Introduce two new platform callbacks to avoid breakage | Rafael J. Wysocki
Commit 900af0d973856d6feb6fc088c2d0d3fde57707d3 (PM: Change suspend code ordering) changed the ordering of suspend code in such a way that the platform .prepare() callback is now executed after the device drivers' late suspend callbacks have run. Unfortunately, this turns out to break ARM platforms that need to talk via I2C to power control devices during the .prepare() callback. For this reason introduce two new platform suspend callbacks, .prepare_late() and .wake(), that will be called just prior to disabling non-boot CPUs and right after bringing them back on line, respectively, and use them instead of .prepare() and .finish() for ACPI suspend. Make the PM core execute the .prepare() and .finish() platform suspend callbacks where they were executed previously (that is, right after calling the regular suspend methods provided by device drivers and right before executing their regular resume methods, respectively). It is not necessary to make analogous changes to the hibernation code and data structures at the moment, because they are only used by ACPI platforms. Signed-off-by: Rafael J. Wysocki <rjw@sisk.pl> Reported-by: Russell King <rmk+kernel@arm.linux.org.uk> Acked-by: Len Brown <len.brown@intel.com>
2009-04-18 | Remove 'recurse into child resources' logic from 'reserve_region_with_split()' | Linus Torvalds
This function is not actually used right now, since the original use case for it was done with insert_resource_expand_to_fit() instead. However, we now have another usage case that wants to basically do a "reserve IO resource, splitting around existing resources", however that one doesn't actually want the "recurse into the conflicting resource" logic at all. And since recursing into the conflicting resource was the most complex part, and isn't wanted, just remove it. Maybe we'll some day want both versions, but we can just resurrect the logic then. Tested-by: Yinghai Lu <yinghai@kernel.org> Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
2009-04-17 | tracing: protect trace_printk from recursion | Steven Rostedt
trace_printk can be called from any context, including NMIs. If this happens, then we must test for recursion before grabbing any spinlocks. This patch prevents trace_printk from being called recursively. [ Impact: prevent hard lockup in lockdep event tracer ] Cc: Peter Zijlstra <peterz@infradead.org> Cc: Frederic Weisbecker <fweisbec@gmail.com> Signed-off-by: Steven Rostedt <rostedt@goodmis.org>
2009-04-17 | tracing: add same level recursion detection | Steven Rostedt
The tracing infrastructure allows for recursion. That is, an interrupt may interrupt the act of tracing an event, and that interrupt may very well perform its own trace. This is a recursive trace, and is fine to do. The problem arises when there is a bug, and the utility doing the trace calls something that recurses back into the tracer. This recursion is not caused by an external event like an interrupt, but by code that is not expected to recurse. The result could be a lockup. This patch adds a bitmask to the task structure that keeps track of the trace recursion. To find the interrupt depth, the following algorithm is used:
    level = hardirq_count() + softirq_count() + in_nmi;
Here, level will be the depth of interrupts and softirqs, and even handles the nmi. Then the corresponding bit is set in the recursion bitmask. If the bit was already set, we know we had a recursion at the same level and we warn about it and fail the writing to the buffer. After the data has been committed to the buffer, we clear the bit. No atomics are needed. The only races are with interrupts and they reset the bitmask before returning anyway. [ Impact: detect same irq level trace recursion ] Signed-off-by: Steven Rostedt <rostedt@goodmis.org>
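The bitmask scheme described above can be sketched in userspace C as follows. All names (`struct ctx_sketch`, `trace_level()`, `same_level_lock()`, `same_level_unlock()`) are hypothetical, and the depth fields are assumed to be small integers, that is, already shifted down out of the kernel's packed preempt count.

```c
#include <stdbool.h>

/* Hypothetical per-task context; the depths are plain small integers. */
struct ctx_sketch {
    unsigned int hardirq_depth;
    unsigned int softirq_depth;
    unsigned int in_nmi;          /* 0 or 1 */
    unsigned long recursion_mask; /* one bit per nesting level */
};

/* level = hardirq depth + softirq depth + in_nmi, as in the message. */
static unsigned int trace_level(const struct ctx_sketch *c)
{
    return c->hardirq_depth + c->softirq_depth + c->in_nmi;
}

/* Returns false if the tracer is re-entered at the same level. */
static bool same_level_lock(struct ctx_sketch *c)
{
    unsigned int level = trace_level(c);
    unsigned long bit;

    if (level >= 8 * sizeof(c->recursion_mask))
        return false;             /* level does not fit in the mask */
    bit = 1UL << level;
    if (c->recursion_mask & bit)
        return false;             /* same-level recursion detected */
    c->recursion_mask |= bit;
    return true;
}

/* Clear the bit for the current level after the commit. */
static void same_level_unlock(struct ctx_sketch *c)
{
    unsigned int level = trace_level(c);

    if (level < 8 * sizeof(c->recursion_mask))
        c->recursion_mask &= ~(1UL << level);
}
```

A nested interrupt raises the level, so it sets a different bit and is allowed; a second entry at an unchanged level finds its bit already set and is refused.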
2009-04-17 | tracing: add EXPORT_SYMBOL_GPL for trace commits | Steven Rostedt
Not all the necessary symbols were exported to allow for tracing by modules. This patch adds them in. [ Impact: allow modules to commit data to the ring buffer ] Signed-off-by: Steven Rostedt <rostedt@goodmis.org>
2009-04-17 | tracing/filters: add filter_mutex to protect filter predicates | Tom Zanussi
This patch adds a filter_mutex to prevent the filter predicates from being accessed concurrently by various external functions. It's based on a previous patch by Li Zefan: "[PATCH 7/7] tracing/filters: make filter preds RCU safe" v2 changes: - fixed wrong value returned in a add_subsystem_pred() failure case noticed by Li Zefan. [ Impact: fix trace filter corruption/crashes on parallel access ] Signed-off-by: Tom Zanussi <tzanussi@gmail.com> Reviewed-by: Li Zefan <lizf@cn.fujitsu.com> Tested-by: Li Zefan <lizf@cn.fujitsu.com> Cc: Frederic Weisbecker <fweisbec@gmail.com> Cc: Steven Rostedt <rostedt@goodmis.org> Cc: paulmck@linux.vnet.ibm.com LKML-Reference: <1239946028.6639.13.camel@tropicana> Signed-off-by: Ingo Molnar <mingo@elte.hu>
2009-04-17 | tracing: fix file mode of trace and README | Li Zefan
trace is read-write and README is read-only. [ Impact: fix /debug/tracing/ file permissions. ] Signed-off-by: Li Zefan <lizf@cn.fujitsu.com> Acked-by: Frederic Weisbecker <fweisbec@gmail.com> Acked-by: Steven Rostedt <rostedt@goodmis.org> LKML-Reference: <49E7EAB6.4070605@cn.fujitsu.com> Signed-off-by: Ingo Molnar <mingo@elte.hu>
2009-04-17 | lockdep: more robust lockdep_map init sequence | Peter Zijlstra
Steven Rostedt reported:
> OK, I think I figured this bug out. This is a lockdep issue with respect
> to tracepoints.
>
> The trace points in lockdep are called all the time. Outside the lockdep
> logic. But if lockdep were to trigger an error / warning (which this run
> did) we might be in trouble. For new locks, like the dentry->d_lock, that
> are created, they will not get a name:
>
> void lockdep_init_map(struct lockdep_map *lock, const char *name,
>                       struct lock_class_key *key, int subclass)
> {
>         if (unlikely(!debug_locks))
>                 return;
>
> When a problem is found by lockdep, debug_locks becomes false. Thus we
> stop allocating names for locks. This dentry->d_lock I had, now has no
> name. Worse yet, I have CONFIG_DEBUG_VM set, that scrambles non
> initialized memory. Thus, when the trace point was hit, it had junk for
> the lock->name, and the machine crashed.
Ah, nice catch. I think we should put at least the name in regardless. Ensure we at least initialize the trivial entries of the depmap so that they can be relied upon, even when lockdep itself decided to pack up and go home. [ Impact: fix lock tracing after lockdep warnings. ] Reported-by: Steven Rostedt <rostedt@goodmis.org> Signed-off-by: Peter Zijlstra <a.p.zijlstra@chello.nl> Acked-by: Steven Rostedt <rostedt@goodmis.org> Cc: Andrew Morton <akpm@linux-foundation.org> Cc: Frederic Weisbecker <fweisbec@gmail.com> LKML-Reference: <1239954049.23397.4156.camel@laptop> Signed-off-by: Ingo Molnar <mingo@elte.hu>
2009-04-17 | tracing/events: perform function tracing in event selftests | Steven Rostedt
We can find some bugs in the trace events if we stress the writes as well. The function tracer is a good way to stress the events. [ Impact: extend scope of event tracer self-tests ] Signed-off-by: Steven Rostedt <rostedt@goodmis.org> Cc: Andrew Morton <akpm@linux-foundation.org> Cc: Peter Zijlstra <peterz@infradead.org> Cc: Frederic Weisbecker <fweisbec@gmail.com> LKML-Reference: <20090416161746.604786131@goodmis.org> Signed-off-by: Ingo Molnar <mingo@elte.hu>
2009-04-17 | tracing: add saved_cmdlines file to show cached task comms | Avadh Patel
Export the cached task comms to userspace. This allows user apps to translate the pids from a trace into their respective task command lines. [ Impact: let userspace apps reading binary buffer know comm's of pids ] Signed-off-by: Avadh Patel <avadh4all@gmail.com> [ added error checking and use of buf pointer to index file_buf ] Signed-off-by: Steven Rostedt <rostedt@goodmis.org>
2009-04-17 | tracing/events/ring-buffer: expose format of ring buffer headers to users | Steven Rostedt
Currently, everything needed to read the binary output from the ring buffers is available, with the exception of the way the ring buffer handles itself internally. This patch creates two special files in the debugfs/tracing/events directory:
    # cat /debug/tracing/events/header_page
    field: u64 timestamp; offset:0; size:8;
    field: local_t commit; offset:8; size:8;
    field: char data; offset:16; size:4080;

    # cat /debug/tracing/events/header_event
    type : 2 bits
    len : 3 bits
    time_delta : 27 bits
    array : 32 bits
    padding : type == 0
    time_extend : type == 1
    data : type == 3
This is to allow a userspace app to see if the ring buffer format changes or not. [ Impact: allow userspace apps to know of ringbuffer format changes ] Signed-off-by: Steven Rostedt <rostedt@goodmis.org>
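As an illustration of how a userspace reader might use the header_event description, the sketch below unpacks the first 32-bit word of an event into its three fields. The field widths (2, 3 and 27 bits) and the type values come from the listing above; the bit order (type in the lowest bits) is an assumption made for this sketch, and `decode_event_header()` and the `EVT_*` names are hypothetical.

```c
#include <stdint.h>

/* Type values as listed in header_event. */
enum { EVT_PADDING = 0, EVT_TIME_EXTEND = 1, EVT_DATA = 3 };

/* Hypothetical decoded form of the first 32-bit word of an event. */
struct event_header_sketch {
    unsigned int type;   /* 2 bits */
    unsigned int len;    /* 3 bits */
    uint32_t time_delta; /* 27 bits */
};

/*
 * Unpack the header word. ASSUMPTION: type occupies the lowest two bits,
 * followed by len, then time_delta; a real reader should confirm the
 * layout for its target instead of relying on this ordering.
 */
static struct event_header_sketch decode_event_header(uint32_t word)
{
    struct event_header_sketch h;

    h.type = word & 0x3u;
    h.len = (word >> 2) & 0x7u;
    h.time_delta = word >> 5;
    return h;
}
```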
2009-04-17 | tracing/events: add startup tests for events | Steven Rostedt
As events become popular as the new way to add tracing infrastructure to ftrace, it is important to catch any problems that might happen with a mistake in the TRACE_EVENT macro. This patch introduces a startup self test on the registered trace events. Note that it can only do a generic test; any testing that needs more involvement must be implemented by the tracepoint creators. The test goes through the events one by one, enabling a trace point and running some random tasks (random in the sense that I just made them up). Those tasks are creating threads, grabbing mutexes and spinlocks and using workqueues. After testing each event individually, it does the same test after enabling each system of trace points, like sched, irq, lockdep. Then finally it enables all tracepoints and performs the tasks again. The output to the console on bootup will look like this when everything works:
    Running tests on trace events:
    Testing event kfree_skb: OK
    Testing event kmalloc: OK
    Testing event kmem_cache_alloc: OK
    Testing event kmalloc_node: OK
    Testing event kmem_cache_alloc_node: OK
    Testing event kfree: OK
    Testing event kmem_cache_free: OK
    Testing event irq_handler_exit: OK
    Testing event irq_handler_entry: OK
    Testing event softirq_entry: OK
    Testing event softirq_exit: OK
    Testing event lock_acquire: OK
    Testing event lock_release: OK
    Testing event sched_kthread_stop: OK
    Testing event sched_kthread_stop_ret: OK
    Testing event sched_wait_task: OK
    Testing event sched_wakeup: OK
    Testing event sched_wakeup_new: OK
    Testing event sched_switch: OK
    Testing event sched_migrate_task: OK
    Testing event sched_process_free: OK
    Testing event sched_process_exit: OK
    Testing event sched_process_wait: OK
    Testing event sched_process_fork: OK
    Testing event sched_signal_send: OK
    Running tests on trace event systems:
    Testing event system skb: OK
    Testing event system kmem: OK
    Testing event system irq: OK
    Testing event system lockdep: OK
    Testing event system sched: OK
    Running tests on all trace events:
    Testing all events: OK
[ folded in: tracing: add #include <linux/delay.h> to fix build failure in test_work() This build failure occurred on a few rare configs:
    kernel/trace/trace_events.c: In function ‘test_work’:
    kernel/trace/trace_events.c:975: error: implicit declaration of function ‘udelay’
    kernel/trace/trace_events.c:980: error: implicit declaration of function ‘msleep’
delay.h is included in way too many other headers, hiding cases where new usage is added without header inclusion. [ Impact: build fix ] Signed-off-by: Ingo Molnar <mingo@elte.hu> ] [ Impact: add event tracer self-tests ] Signed-off-by: Steven Rostedt <rostedt@goodmis.org>
2009-04-17 | ftrace: use module notifier for function tracer | Steven Rostedt
The hooks in the module code for the function tracer must be called before any of that module code runs. The function tracer hooks modify the module (replacing calls to mcount with nops). If the code is executed while the change occurs, then the CPU can take a GPF. To handle the above with a bit of paranoia, I originally implemented the hooks as calls directly from the module code. After examining the notifier calls, it looks as though the start up notify is called before any of the module's code is executed. This makes the use of the notify safe with ftrace. Only the startup notify is required to be "safe". The shutdown simply removes the entries from the ftrace function list, and does not modify any code. This change has another benefit. It removes an issue with a reverse dependency in the mutexes of ftrace_lock and module_mutex. [ Impact: fix lock dependency bug, cleanup ] Cc: Rusty Russell <rusty@rustcorp.com.au> Signed-off-by: Steven Rostedt <rostedt@goodmis.org>
2009-04-16 | Merge branch 'tracing-fixes-for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/tip/linux-2.6-tip | Linus Torvalds
* 'tracing-fixes-for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/tip/linux-2.6-tip:
    tracing: Fix branch tracer header
    tracing: Fix power tracer header
2009-04-16 | Merge branch 'sched-fixes-for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/tip/linux-2.6-tip | Linus Torvalds
* 'sched-fixes-for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/tip/linux-2.6-tip:
    sched: Avoid printing sched_group::__cpu_power for default case
    tracing, sched: mark get_parent_ip() notrace
2009-04-16 | Merge branch 'core-fixes-for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/tip/linux-2.6-tip | Linus Torvalds
* 'core-fixes-for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/tip/linux-2.6-tip:
    kernel/softirq.c: fix sparse warning
    rcu: Make hierarchical RCU less IPI-happy
2009-04-17 | kernel/softirq.c: fix sparse warning | H Hartley Sweeten
Fix sparse warning in kernel/softirq.c. warning: do-while statement is not a compound statement Signed-off-by: H Hartley Sweeten <hsweeten@visionengravers.com> LKML-Reference: <BD79186B4FD85F4B8E60E381CAEE1909015F9033@mi8nycmail19.Mi8.com> Signed-off-by: Ingo Molnar <mingo@elte.hu>
2009-04-17 | sched: Avoid printing sched_group::__cpu_power for default case | Gautham R Shenoy
Commit 46e0bb9c12f4 ("sched: Print sched_group::__cpu_power in sched_domain_debug") produces a messy dmesg output while attempting to print the sched_group::__cpu_power for each group in the sched_domain hierarchy. Fix this by not printing the __cpu_power for default cases (i.e., __cpu_power == SCHED_LOAD_SCALE). [ Impact: reduce syslog clutter ] Reported-by: Tony Luck <tony.luck@intel.com> Signed-off-by: Gautham R Shenoy <ego@in.ibm.com> Fixed-by: Tony Luck <tony.luck@intel.com> Cc: a.p.zijlstra@chello.nl LKML-Reference: <20090414033936.GA534@in.ibm.com> Signed-off-by: Ingo Molnar <mingo@elte.hu>
2009-04-16 | blktrace: fix context-info when mixed-using blk tracer and trace events | Li Zefan
When current tracer is set to blk tracer, TRACE_ITER_CONTEXT_INFO is unset, but actually context-info is printed:
    pdflush-431 [000] 821.181576: 8,0 P N [pdflush]
And then if we enable TRACE_ITER_CONTEXT_INFO:
    # echo context-info > trace_options
We'll see context-info printed twice. What's worse, when we use blk tracer and trace events at the same time, we'll see no context-info for trace events at all:
    jbd2_commit_logging: dev dm-0:8 transaction 333227
    jbd2_end_commit: dev dm-0:8 transaction 333227 head 332814
    rm-25433 [001] 9578.307485: 8,18 m N cfq25433 slice expired t=0
    rm-25433 [001] 9578.307486: 8,18 m N cfq25433 put_queue
This patch adds blk_tracer->set_flags(), and the context-info flag is unset only when we set the output to classic mode. Note that after this patch, one should unset context-info explicitly to get binary output that can be parsed by blkparse:
    # echo nocontext-info > trace_options
    # echo bin > trace_options
    # echo blk > current_tracer
    # cat trace_pipe | blkparse -i -
Reported-by: Theodore Ts'o <tytso@mit.edu> Signed-off-by: Li Zefan <lizf@cn.fujitsu.com> Cc: Jens Axboe <jens.axboe@oracle.com> Cc: Arnaldo Carvalho de Melo <acme@redhat.com> Cc: Frederic Weisbecker <fweisbec@gmail.com> Cc: Steven Rostedt <rostedt@goodmis.org> LKML-Reference: <49E54E60.50408@cn.fujitsu.com> Signed-off-by: Ingo Molnar <mingo@elte.hu>
2009-04-16 | blktrace: add trace/ to /sys/block/sda | Li Zefan
Impact: allow ftrace-plugin blktrace to trace device-mapper devices
To trace a single partition:
    # echo 1 > /sys/block/sda/sda1/enable
To trace the whole sda instead:
    # echo 1 > /sys/block/sda/enable
Thus we also fix an issue reported by Ted, that ftrace-plugin blktrace can't be used to trace device-mapper devices. Now:
    # echo 1 > /sys/block/dm-0/trace/enable
    echo: write error: No such device or address
    # mount -t ext4 /dev/dm-0 /mnt
    # echo 1 > /sys/block/dm-0/trace/enable
    # echo blk > /debug/tracing/current_tracer
Reported-by: Theodore Tso <tytso@mit.edu> Signed-off-by: Li Zefan <lizf@cn.fujitsu.com> Acked-by: "Theodore Ts'o" <tytso@mit.edu> Cc: Arnaldo Carvalho de Melo <acme@redhat.com> Cc: Shawn Du <duyuyang@gmail.com> Cc: Jens Axboe <jens.axboe@oracle.com> LKML-Reference: <49E42665.6020506@cn.fujitsu.com> Signed-off-by: Ingo Molnar <mingo@elte.hu>
2009-04-16 | blktrace: support per-partition tracing for ftrace plugin | Li Zefan
The previous patch adds support to trace a single partition for relay+ioctl blktrace, and this patch is for ftrace plugin blktrace:
    # echo 1 > /sys/block/sda/sda7/enable
    # cat start_lba
    102398373
    # cat end_lba
    102703545
Signed-off-by: Li Zefan <lizf@cn.fujitsu.com> Acked-by: "Theodore Ts'o" <tytso@mit.edu> Cc: Arnaldo Carvalho de Melo <acme@redhat.com> Cc: Shawn Du <duyuyang@gmail.com> Cc: Jens Axboe <jens.axboe@oracle.com> LKML-Reference: <49E42646.4060608@cn.fujitsu.com> Signed-off-by: Ingo Molnar <mingo@elte.hu>
2009-04-16 | blktrace: support per-partition tracing | Shawn Du
Though one can specify '-d /dev/sda1' when using blktrace, it still traces the whole sda. To support per-partition tracing, when we start tracing, we initialize bt->start_lba and bt->end_lba to the start and end sector of that partition. Note some actions are per device, thus we don't filter 0-sector events. The original patch and discussion can be found here: http://marc.info/?l=linux-btrace&m=122949374214540&w=2 Signed-off-by: Shawn Du <duyuyang@gmail.com> Signed-off-by: Li Zefan <lizf@cn.fujitsu.com> Acked-by: "Theodore Ts'o" <tytso@mit.edu> Cc: Arnaldo Carvalho de Melo <acme@redhat.com> Cc: Jens Axboe <jens.axboe@oracle.com> LKML-Reference: <49E42620.4050701@cn.fujitsu.com> Signed-off-by: Ingo Molnar <mingo@elte.hu>
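The filtering rule described above (trace only sectors inside the partition, but let sector-0 events through because some actions are per device) can be sketched as below. `struct bt_sketch` and `bt_should_trace()` are hypothetical names, and treating `end_lba` as inclusive is an assumption of this sketch rather than something the message states.

```c
#include <stdbool.h>
#include <stdint.h>

/* Hypothetical subset of the blktrace state; not the kernel struct. */
struct bt_sketch {
    uint64_t start_lba; /* first sector of the traced partition */
    uint64_t end_lba;   /* last sector (assumed inclusive here) */
};

/* Decide whether an event at the given sector should be recorded. */
static bool bt_should_trace(const struct bt_sketch *bt, uint64_t sector)
{
    if (sector == 0)
        return true; /* per-device actions report sector 0: never filter */
    return sector >= bt->start_lba && sector <= bt->end_lba;
}
```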
2009-04-15 | RCU: Don't try and predeclare inline funcs as it upsets some versions of gcc | David Howells
Don't try and predeclare inline funcs like this:
    static inline void wait_migrated_callbacks(void)
    ...
    static void _rcu_barrier(enum rcu_barrier type)
    {
        ...
        wait_migrated_callbacks();
    }
    ...
    static inline void wait_migrated_callbacks(void)
    {
        wait_event(rcu_migrate_wq, !atomic_read(&rcu_migrate_type_count));
    }
as it upsets some versions of gcc under some circumstances:
    kernel/rcupdate.c: In function `_rcu_barrier':
    kernel/rcupdate.c:125: sorry, unimplemented: inlining failed in call to 'wait_migrated_callbacks': function body not available
    kernel/rcupdate.c:152: sorry, unimplemented: called from here
This can be dealt with by simply putting the static variables (rcu_migrate_*) at the top, and moving the implementation of the function up so that it replaces its forward declaration. Signed-off-by: David Howells <dhowells@redhat.com> Cc: Dipankar Sarma <dipankar@in.ibm.com> Cc: Paul E. McKenney <paulmck@linux.vnet.ibm.com> Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
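The resulting layout (state first, then the inline helper, then its caller) looks roughly like this in a standalone file. The names are hypothetical stand-ins for the RCU ones, chosen so the sketch compiles outside the kernel.

```c
/* Hypothetical counter standing in for rcu_migrate_type_count. */
static int pending_migrations;

/* Defined before its first caller, so no forward declaration is needed. */
static inline int migrations_done(void)
{
    return pending_migrations == 0;
}

/* The caller sees the full body and the compiler is free to inline it. */
static int barrier_sketch(void)
{
    return migrations_done();
}
```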
2009-04-15 | swap: Remove code handling bio_alloc failure with __GFP_WAIT | Nikanth Karthikesan
Remove code handling bio_alloc failure with __GFP_WAIT. Signed-off-by: Nikanth Karthikesan <knikanth@suse.de> Signed-off-by: Jens Axboe <jens.axboe@oracle.com>
2009-04-14 | tracing/events: move trace point headers into include/trace/events | Steven Rostedt
Impact: clean up
Create a sub directory in include/trace called events to keep the trace point headers in their own separate directory. Only headers that declare trace points should be defined in this directory. Cc: Peter Zijlstra <a.p.zijlstra@chello.nl> Cc: Thomas Gleixner <tglx@linutronix.de> Cc: Neil Horman <nhorman@tuxdriver.com> Cc: Zhao Lei <zhaolei@cn.fujitsu.com> Cc: Eduard - Gabriel Munteanu <eduard.munteanu@linux360.ro> Cc: Pekka Enberg <penberg@cs.helsinki.fi> Signed-off-by: Steven Rostedt <rostedt@goodmis.org>
2009-04-14 | tracing/events: fix compile for modules disabled | Steven Rostedt
Impact: compile fix
The addition of TRACE_EVENT for modules breaks the build for when modules are disabled. This code fixes that. Reported-by: Ingo Molnar <mingo@elte.hu> Signed-off-by: Steven Rostedt <rostedt@goodmis.org>
2009-04-14 | tracing/events: add support for modules to TRACE_EVENT | Steven Rostedt
Impact: allow modules to add TRACE_EVENTS on load
This patch adds the final hooks to allow modules to use the TRACE_EVENT macro. A notifier and a data structure are used to link the TRACE_EVENTs defined in the module to connect them with the ftrace event tracing system. It also adds the necessary automated clean ups to the trace events when a module is removed. Cc: Rusty Russell <rusty@rustcorp.com.au> Signed-off-by: Steven Rostedt <rostedt@goodmis.org>
2009-04-14 tracing/events: add export symbols for trace events in modules (Steven Rostedt)
Impact: let modules add trace events The trace event code requires some functions to be exported to allow modules to use TRACE_EVENT. This patch adds EXPORT_SYMBOL_GPL to the necessary functions. Signed-off-by: Steven Rostedt <rostedt@goodmis.org>
2009-04-14 tracing/events: convert event call sites to use a link list (Steven Rostedt)
Impact: makes it possible to define events in modules The events are created by reading down the section that they are linked in by the macros. But this is not scalable to modules. This patch converts the manipulations to use a global link list, and on boot up it adds the items in the section to the list. This change will allow modules to add their tracing events to the list as well. Note, this change alone does not permit modules to use the TRACE_EVENT macros, but the change is needed for them to eventually do so. Signed-off-by: Steven Rostedt <rostedt@goodmis.org>
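As a rough illustration of the approach described in the commit above, the sketch below keeps events on a global linked list that is populated from a static array at start-up and can be extended later. All names here (struct demo_event, demo_event_register(), demo_events_init(), demo_event_find()) are hypothetical, the array merely stands in for the linker section, and the event names are only illustrative labels. The kernel's actual data structures are not shown.

```c
#include <assert.h>
#include <stddef.h>
#include <string.h>

/* Hypothetical event descriptor; the real kernel structure is different. */
struct demo_event {
    const char *name;
    struct demo_event *next;
};

/* Global list head that built-in and late-registered events share. */
static struct demo_event *demo_events;

/* Push one event onto the front of the global list. */
static void demo_event_register(struct demo_event *ev)
{
    assert(ev != NULL);
    ev->next = demo_events;
    demo_events = ev;
}

/* Stand-in for the section that the macros link built-in events into. */
static struct demo_event demo_builtin_events[] = {
    { "sched_switch", NULL },
    { "irq_handler_entry", NULL },
};

/* At "boot", copy every built-in entry from the array onto the list. */
static void demo_events_init(void)
{
    size_t i;
    size_t n = sizeof(demo_builtin_events) / sizeof(demo_builtin_events[0]);

    for (i = 0; i < n; i++)
        demo_event_register(&demo_builtin_events[i]);
}

/* Lookup walks the list only, so it no longer cares where an event came from. */
static struct demo_event *demo_event_find(const char *name)
{
    struct demo_event *ev;

    for (ev = demo_events; ev != NULL; ev = ev->next)
        if (strcmp(ev->name, name) == 0)
            return ev;
    return NULL;
}
```

Because every consumer walks the list instead of the section, a module could in principle call the registration function at load time and have its events appear alongside the built-in ones.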
2009-04-14 tracing/events: move the ftrace event tracing code to core (Steven Rostedt)
This patch moves the ftrace creation into include/trace/ftrace.h and simplifies the work of developers in adding new tracepoints. Just the act of creating the trace points in include/trace and including define_trace.h will create the events in the debugfs/tracing/events directory. This patch removes the need of include/trace/trace_events.h Signed-off-by: Steven Rostedt <rostedt@goodmis.org>
2009-04-14 tracing/events: move declarations from trace directory to core include (Steven Rostedt)
In preparation to allowing trace events to happen in modules, we need to move some of the local declarations in the kernel/trace directory into include/linux. This patch simply moves the declarations and performs no context changes. Signed-off-by: Steven Rostedt <rostedt@goodmis.org>
2009-04-14 tracing: make trace_seq operations available for core kernel (Steven Rostedt)
In the process of making the TRACE_EVENT macro work for modules, the trace_seq operations must be available for core kernel code. These operations are quite useful and can be used for other implementations.

The main idea is that we create a trace_seq handle that acts very much like the seq_file handle.

    struct trace_seq *s = kmalloc(sizeof(*s), GFP_KERNEL);

    trace_seq_init(s);
    trace_seq_printf(s, "some data %d\n", variable);

    printk("%s", s->buffer);

The main use is to allow a top level function to call several other functions that may store printf-like data into the buffer. Then at the end, the top level function can process all the data with any method it would like to. It could be passed to userspace, output via printk or even use seq_file:

    trace_seq_to_user(s, ubuf, cnt);
    seq_puts(m, s->buffer);

Signed-off-by: Steven Rostedt <rostedt@goodmis.org>
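The usage pattern described in the commit above (a top level function hands one buffer to several helpers, then processes the result) can be tried out in userspace with a minimal sketch. struct demo_seq, demo_seq_init() and demo_seq_printf() are hypothetical imitations written for this example. They are not the kernel's trace_seq implementation, and the buffer size and return convention are assumptions.

```c
#include <assert.h>
#include <stdarg.h>
#include <stdio.h>
#include <string.h>

#define DEMO_SEQ_SIZE 256   /* assumed size, chosen only for the sketch */

/* Hypothetical imitation of a trace_seq-like handle. */
struct demo_seq {
    char buffer[DEMO_SEQ_SIZE];
    size_t len;
};

static void demo_seq_init(struct demo_seq *s)
{
    assert(s != NULL);
    s->len = 0;
    s->buffer[0] = '\0';
}

/* Append formatted text; returns 1 on success, 0 if it does not fit. */
static int demo_seq_printf(struct demo_seq *s, const char *fmt, ...)
{
    va_list ap;
    size_t room = sizeof(s->buffer) - s->len;
    int n;

    va_start(ap, fmt);
    n = vsnprintf(s->buffer + s->len, room, fmt, ap);
    va_end(ap);

    if (n < 0 || (size_t)n >= room) {
        s->buffer[s->len] = '\0';   /* discard the partial write */
        return 0;
    }
    s->len += (size_t)n;
    return 1;
}

/* Helpers that each contribute a fragment to the shared buffer. */
static void demo_describe_cpu(struct demo_seq *s, int cpu)
{
    demo_seq_printf(s, "cpu=%d ", cpu);
}

static void demo_describe_pid(struct demo_seq *s, int pid)
{
    demo_seq_printf(s, "pid=%d\n", pid);
}

/* Top level function: collects all fragments, caller decides the output. */
static void demo_report(struct demo_seq *s)
{
    demo_seq_init(s);
    demo_describe_cpu(s, 1);
    demo_describe_pid(s, 42);
}
```

After demo_report() returns, the caller owns the complete text in s->buffer and can print it, copy it elsewhere or discard it, which mirrors the "process all the data with any method" idea in the commit message.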
2009-04-14 tracing: create automated trace defines (Steven Rostedt)
This patch lowers the number of places a developer must modify to add new tracepoints. The current method to add a new tracepoint into an existing system is to write the trace point macro in the trace header with one of the macros TRACE_EVENT, TRACE_FORMAT or DECLARE_TRACE, then they must add the same named item into the C file with the macro DEFINE_TRACE(name) and then add the trace point.

This change removes the need to add the DEFINE_TRACE(name). Every file that uses the tracepoint must still include the trace/<type>.h file, but the one C file must also add a define before the including of that file.

    #define CREATE_TRACE_POINTS
    #include <trace/mytrace.h>

This will cause the trace/mytrace.h file to also produce the C code necessary to implement the trace point.

Note, if more than one trace/<type>.h is used to create the C code it is best to list them all together.

    #define CREATE_TRACE_POINTS
    #include <trace/foo.h>
    #include <trace/bar.h>
    #include <trace/fido.h>

Thanks to Mathieu Desnoyers and Christoph Hellwig for coming up with the cleaner solution of the define above the includes over my first design to have the C code include a "special" header.

This patch converts sched, irq, lockdep and skb to use this new method.

Cc: Peter Zijlstra <a.p.zijlstra@chello.nl>
Cc: Thomas Gleixner <tglx@linutronix.de>
Cc: Neil Horman <nhorman@tuxdriver.com>
Cc: Zhao Lei <zhaolei@cn.fujitsu.com>
Cc: Eduard - Gabriel Munteanu <eduard.munteanu@linux360.ro>
Cc: Pekka Enberg <penberg@cs.helsinki.fi>
Signed-off-by: Steven Rostedt <rostedt@goodmis.org>
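The "define before include" idea from the commit above can be imitated in a single file. The sketch below is an assumption-laden toy: DEMO_CREATE_POINTS, DEMO_TRACE() and the demo_* symbols are hypothetical, and the part marked as header contents would normally live in its own .h file. It only shows how one macro can expand to declarations everywhere and to definitions in the one file that sets the define first.

```c
#include <assert.h>

/* Only the one C file that owns the trace points sets this before including. */
#define DEMO_CREATE_POINTS

/* ---- begin: contents of a hypothetical demo_trace.h ---- */
#define DEMO_TRACE_PROTO(name) \
    extern int demo_##name##_hits; \
    void demo_trace_##name(void);

#ifdef DEMO_CREATE_POINTS
/* Owner file: emit the declarations and the definitions. */
#define DEMO_TRACE(name) \
    DEMO_TRACE_PROTO(name) \
    int demo_##name##_hits; \
    void demo_trace_##name(void) { demo_##name##_hits++; }
#else
/* Every other user: declarations only. */
#define DEMO_TRACE(name) DEMO_TRACE_PROTO(name)
#endif

DEMO_TRACE(wakeup)
/* ---- end: contents of the hypothetical header ---- */

/* Ordinary code just calls the generated trace function. */
static int demo_do_wakeup(void)
{
    demo_trace_wakeup();
    assert(demo_wakeup_hits > 0);
    return demo_wakeup_hits;
}
```

A second C file would include the same header without the define and would get only the extern declarations, so the definitions still exist exactly once.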
2009-04-14 tracing: make the trace clocks available generally (Ingo Molnar)
Jeremy Fitzhardinge reported this build failure:

    LD .tmp_vmlinux1
    arch/x86/kernel/built-in.o: In function `ds_take_timestamp':
    git/linux/arch/x86/kernel/ds.c:1380: undefined reference to `trace_clock_global'
    git/linux/arch/x86/kernel/ds.c:1380: undefined reference to `trace_clock_global'

Which is due to !CONFIG_TRACING && CONFIG_X86_DS=y.

Expose the trace clock code to CONFIG_X86_DS as well.

[ Unfortunately librarizing doesn't work well - ancient architectures with no raw_local_irq_save() primitive break the build. ]

Reported-by: Jeremy Fitzhardinge <jeremy@goop.org>
LKML-Reference: <49E4413F.7070700@goop.org>
Signed-off-by: Ingo Molnar <mingo@elte.hu>
2009-04-14 tracing: consolidate trace and trace_event headers (Steven Rostedt)
Impact: clean up Neil Horman (et al.) criticized the way the trace events were broken up into two files. The reason for that was that ftrace needed to separate out the declarations from where the #include <linux/tracepoint.h> was used. It then dawned on me that the tracepoint.h header only needs to define the TRACE_EVENT macro if it is not already defined. The solution is simply to test if TRACE_EVENT is defined, and if it is not then the linux/tracepoint.h header can define it. This change consolidates all the <traces>.h and <traces>_event_types.h into the <traces>.h file. Reported-by: Neil Horman <nhorman@tuxdriver.com> Reported-by: Theodore Tso <tytso@mit.edu> Reported-by: Jiaying Zhang <jiayingz@google.com> Cc: Zhaolei <zhaolei@cn.fujitsu.com> Cc: Frederic Weisbecker <fweisbec@gmail.com> Cc: Peter Zijlstra <peterz@infradead.org> Cc: Jason Baron <jbaron@redhat.com> Cc: Mathieu Desnoyers <mathieu.desnoyers@polymtl.ca> Signed-off-by: Steven Rostedt <rostedt@goodmis.org>
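The "define the macro only if it is not already defined" test from the commit above can be sketched in one file. Everything named DEMO_* or demo_* below is hypothetical and the header contents are inlined for the sake of a self-contained example. Here a consumer defines the macro first so that the same event list expands to an array of names instead of the default prototypes.

```c
#include <assert.h>
#include <string.h>

/*
 * A consumer that wants something other than the default expansion
 * defines the macro before pulling in the event list.
 */
#define DEMO_EVENT(name) #name,

static const char *demo_event_names[] = {
/* ---- begin: contents of a hypothetical demo_events.h ---- */
#ifndef DEMO_EVENT
/* Default used by ordinary includers: declare a trace function. */
#define DEMO_EVENT(name) void demo_trace_##name(void);
#endif

DEMO_EVENT(sched_wakeup)
DEMO_EVENT(sched_switch)
/* ---- end: contents of the hypothetical header ---- */
};

/* Number of events captured through the overriding definition. */
static size_t demo_event_count(void)
{
    size_t n = sizeof(demo_event_names) / sizeof(demo_event_names[0]);

    assert(n > 0);
    return n;
}
```

An ordinary includer that never defines DEMO_EVENT falls into the #ifndef branch and gets the default, so the event list itself only has to be written once.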
2009-04-14 x86, irq: Remove IRQ_DISABLED check in process context IRQ move (Pallipadi, Venkatesh)
As discussed in the thread here:

    http://marc.info/?l=linux-kernel&m=123964468521142&w=2

Eric W. Biederman observed:

> It looks like some additional bugs have slipped in since last I looked.
>
> set_irq_affinity does this:
> #ifdef CONFIG_GENERIC_PENDING_IRQ
>     if (desc->status & IRQ_MOVE_PCNTXT || desc->status & IRQ_DISABLED) {
>         cpumask_copy(desc->affinity, cpumask);
>         desc->chip->set_affinity(irq, cpumask);
>     } else {
>         desc->status |= IRQ_MOVE_PENDING;
>         cpumask_copy(desc->pending_mask, cpumask);
>     }
> #else
>
> That IRQ_DISABLED case is a software state and as such it has nothing to
> do with how safe it is to move an irq in process context.

[...]

> The only reason we migrate MSIs in interrupt context today is that there
> wasn't infrastructure for support migration both in interrupt context
> and outside of it.

Yes. The idea here was to force the MSI migration to happen in process context. One of the patches in the series did

    disable_irq(dev->irq);
    irq_set_affinity(dev->irq, cpumask_of(dev->cpu));
    enable_irq(dev->irq);

with the above patch adding irq/manage code check for interrupt disabled and moving the interrupt in process context.

IIRC, there was no IRQ_MOVE_PCNTXT when we were developing this HPET code and we ended up having this ugly hack. IRQ_MOVE_PCNTXT was there when we eventually submitted the patch upstream. But, looks like I did a blind rebasing instead of using IRQ_MOVE_PCNTXT in hpet MSI code.

Below patch fixes this. i.e., revert commit 932775a4ab622e3c99bd59f14cc and add PCNTXT to HPET MSI setup. Also removes copying of desc->affinity in generic code as set_affinity routines are doing it internally.

Reported-by: "Eric W. Biederman" <ebiederm@xmission.com>
Signed-off-by: Venkatesh Pallipadi <venkatesh.pallipadi@intel.com>
Acked-by: "Eric W. Biederman" <ebiederm@xmission.com>
Cc: "Li Shaohua" <shaohua.li@intel.com>
Cc: Gary Hade <garyhade@us.ibm.com>
Cc: "lcm@us.ibm.com" <lcm@us.ibm.com>
Cc: suresh.b.siddha@intel.com
LKML-Reference: <20090413222058.GB8211@linux-os.sc.intel.com>
Signed-off-by: Ingo Molnar <mingo@elte.hu>