summaryrefslogtreecommitdiff
path: root/kernel/trace
AgeCommit message (Collapse)Author
2016-04-26tracing: Move event_trigger_unlock_commit{_regs}() to local headerSteven Rostedt (Red Hat)
The functions event_trigger_unlock_commit() and event_trigger_unlock_commit_regs() are no longer used outside the tracing system. Move them out of the generic headers and into the local one. Along with __event_trigger_test_discard() that is only used by them. Signed-off-by: Steven Rostedt <rostedt@goodmis.org>
2016-04-27ftrace: Match dot symbols when searching functions on ppc64Thiago Jung Bauermann
In the ppc64 big endian ABI, function symbols point to function descriptors. The symbols which point to the function entry points have a dot in front of the function name. Consequently, when the ftrace filter mechanism searches for the symbol corresponding to an entry point address, it gets the dot symbol. As a result, ftrace filter users have to be aware of this ABI detail on ppc64 and prepend a dot to the function name when setting the filter. The perf probe command insulates the user from this by ignoring the dot in front of the symbol name when matching function names to symbols, but the sysfs interface does not. This patch makes the ftrace filter mechanism do the same when searching symbols. Fixes the following failure in ftracetest's kprobe_ftrace.tc: .../kprobe_ftrace.tc: line 9: echo: write error: Invalid argument That failure is on this line of kprobe_ftrace.tc: echo _do_fork > set_ftrace_filter This is because there's no _do_fork entry in the functions list: # cat available_filter_functions | grep _do_fork ._do_fork This change introduces no regressions on the perf and ftracetest testsuite results. Cc: Steven Rostedt <rostedt@goodmis.org> Cc: Ingo Molnar <mingo@redhat.com> Cc: Michael Ellerman <mpe@ellerman.id.au> Cc: linuxppc-dev@lists.ozlabs.org Signed-off-by: Thiago Jung Bauermann <bauerman@linux.vnet.ibm.com> Acked-by: Steven Rostedt <rostedt@goodmis.org> Signed-off-by: Michael Ellerman <mpe@ellerman.id.au>
2016-04-26tracing: Don't use the address of the buffer array name in copy_from_userWang Xiaoqiang
With the following code snippet: ... char buf[64]; ... if (copy_from_user(&buf, ubuf, cnt)) ... Even though the value of "&buf" equals "buf", but there is no need to get the address of the "buf" again. Use "buf" instead of "&buf". Link: http://lkml.kernel.org/r/20160418152329.18b72bea@debian Signed-off-by: Wang Xiaoqiang <wangxq10@lzu.edu.cn> Signed-off-by: Steven Rostedt <rostedt@goodmis.org>
2016-04-26tracing: Handle tracing_map_alloc_elts() error path correctlyTom Zanussi
If tracing_map_elt_alloc() fails, it will return ERR_PTR() instead of NULL, so change the check to IS_ERROR(). We also need to set the failed entry in the map->elts array to NULL instead of ERR_PTR() so tracing_map_free_elts() doesn't try freeing an ERR_PTR(). tracing_map_free_elts() should also zero out what it frees so a reentrant call won't find previously freed elements. Link: http://lkml.kernel.org/r/f29d03b00bce3aac8cf151a8a30e6c83e5fee66d.1461610073.git.tom.zanussi@linux.intel.com Signed-off-by: Tom Zanussi <tom.zanussi@linux.intel.com> Signed-off-by: Steven Rostedt <rostedt@goodmis.org>
2016-04-26tracing: Add check for NULL event field when creating hist fieldTom Zanussi
Smatch flagged create_hist_field() as possibly being able to dereference a NULL pointer, although the current code exits in all cases where the event field could be NULL, so it's not actually a problem. Still, to prevent future changes to the code from overlooking new cases, make the NULL pointer check explicit and warn once in that case. Link: http://lkml.kernel.org/r/cfbc003f534a3e441b4313272fd412310aba6336.1461610073.git.tom.zanussi@linux.intel.com Signed-off-by: Tom Zanussi <tom.zanussi@linux.intel.com> Signed-off-by: Steven Rostedt <rostedt@goodmis.org>
2016-04-26tracing: checking for NULL instead of IS_ERR()Dan Carpenter
tracing_map_elt_alloc() returns ERR_PTRs on error, never NULL. Fixes: 08d43a5fa063 ('tracing: Add lock-free tracing_map') Link: http://lkml.kernel.org/r/20160423102347.GA11136@mwanda Acked-by: Tom Zanussi <tom.zanussi@linux.intel.com> Signed-off-by: Dan Carpenter <dan.carpenter@oracle.com> Signed-off-by: Steven Rostedt <rostedt@goodmis.org>
2016-04-25tracing: Do not inherit event-fork option for instancesSteven Rostedt (Red Hat)
As the event-fork option requires doing work when enabled and disabled, it can not be passed down to created instances. The instance must clear this flag when it is created, and must clear it when its removed. As more options may be created with this need, a macro ZEROED_TRACE_FLAGS is created that holds the flags that must not be inherited by the top level instance, and must be cleared on removal of instances. Signed-off-by: Steven Rostedt <rostedt@goodmis.org>
2016-04-19bpf: add event output helper for notifications/sampling/loggingDaniel Borkmann
This patch adds a new helper for cls/act programs that can push events to user space applications. For networking, this can be f.e. for sampling, debugging, logging purposes or pushing of arbitrary wake-up events. The idea is similar to a43eec304259 ("bpf: introduce bpf_perf_event_output() helper") and 39111695b1b8 ("samples: bpf: add bpf_perf_event_output example"). The eBPF program utilizes a perf event array map that user space populates with fds from perf_event_open(), the eBPF program calls into the helper f.e. as skb_event_output(skb, &my_map, BPF_F_CURRENT_CPU, raw, sizeof(raw)) so that the raw data is pushed into the fd f.e. at the map index of the current CPU. User space can poll/mmap/etc on this and has a data channel for receiving events that can be post-processed. The nice thing is that since the eBPF program and user space application making use of it are tightly coupled, they can define their own arbitrary raw data format and what/when they want to push. While f.e. packet headers could be one part of the meta data that is being pushed, this is not a substitute for things like packet sockets as whole packet is not being pushed and push is only done in a single direction. Intention is more of a generically usable, efficient event pipe to applications. Workflow is that tc can pin the map and applications can attach themselves e.g. after cls/act setup to one or multiple map slots, demuxing is done by the eBPF program. Adding this facility is with minimal effort, it reuses the helper introduced in a43eec304259 ("bpf: introduce bpf_perf_event_output() helper") and we get its functionality for free by overloading its BPF_FUNC_ identifier for cls/act programs, ctx is currently unused, but will be made use of in future. Example will be added to iproute2's BPF example files. Signed-off-by: Daniel Borkmann <daniel@iogearbox.net> Signed-off-by: Alexei Starovoitov <ast@kernel.org> Signed-off-by: David S. Miller <davem@davemloft.net>
2016-04-19bpf, trace: add BPF_F_CURRENT_CPU flag for bpf_perf_event_outputDaniel Borkmann
Add a BPF_F_CURRENT_CPU flag to optimize the use-case where user space has per-CPU ring buffers and the eBPF program pushes the data into the current CPU's ring buffer which saves us an extra helper function call in eBPF. Also, make sure to properly reserve the remaining flags which are not used. Signed-off-by: Daniel Borkmann <daniel@iogearbox.net> Signed-off-by: Alexei Starovoitov <ast@kernel.org> Signed-off-by: David S. Miller <davem@davemloft.net>
2016-04-19tracing: Fix unsigned comparison to zero in hist trigger codeSteven Rostedt (Red Hat)
Fengguang Wu's bot found two comparisons of unsigned integers to zero. These were real bugs, as it would miss error conditions returned to zero. trace_events_hist.c:426:6-9: WARNING: Unsigned expression compared with zero: idx < 0 trace_events_hist.c:568:5-14: WARNING: Unsigned expression compared with zero: n_entries < 0 Reported-by: kbuild test robot <fengguang.wu@intel.com> Signed-off-by: Steven Rostedt <rostedt@goodmis.org>
2016-04-19tracing: Add hist trigger 'log2' modifierNamhyung Kim
Allow users to have numeric fields displayed as log2 values in case value range is very wide by appending '.log2' to field names. For example, # echo 'hist:key=bytes_req' > kmalloc/trigger # cat kmalloc/hist { bytes_req: 504 } hitcount: 1 { bytes_req: 11 } hitcount: 1 { bytes_req: 104 } hitcount: 1 { bytes_req: 48 } hitcount: 1 { bytes_req: 2048 } hitcount: 1 { bytes_req: 4096 } hitcount: 1 { bytes_req: 240 } hitcount: 1 { bytes_req: 392 } hitcount: 1 { bytes_req: 13 } hitcount: 1 { bytes_req: 28 } hitcount: 1 { bytes_req: 12 } hitcount: 1 { bytes_req: 64 } hitcount: 2 { bytes_req: 128 } hitcount: 2 { bytes_req: 32 } hitcount: 2 { bytes_req: 8 } hitcount: 11 { bytes_req: 10 } hitcount: 13 { bytes_req: 24 } hitcount: 25 { bytes_req: 160 } hitcount: 29 { bytes_req: 16 } hitcount: 33 { bytes_req: 80 } hitcount: 36 When using '.log2' modifier, the output looks like: # echo 'hist:key=bytes_req.log2' > kmalloc/trigger # cat kmalloc/hist { bytes_req: ~ 2^12 } hitcount: 1 { bytes_req: ~ 2^11 } hitcount: 1 { bytes_req: ~ 2^9 } hitcount: 2 { bytes_req: ~ 2^6 } hitcount: 3 { bytes_req: ~ 2^3 } hitcount: 13 { bytes_req: ~ 2^5 } hitcount: 19 { bytes_req: ~ 2^8 } hitcount: 49 { bytes_req: ~ 2^7 } hitcount: 57 { bytes_req: ~ 2^4 } hitcount: 74 Link: http://lkml.kernel.org/r/7ff396b246c6a881f46b979735fddf05a0d6c71a.1457029949.git.tom.zanussi@linux.intel.com Cc: Tom Zanussi <tom.zanussi@linux.intel.com> Cc: Masami Hiramatsu <masami.hiramatsu.pt@hitachi.com> Signed-off-by: Namhyung Kim <namhyung@kernel.org> Reviewed-by: Masami Hiramatsu <masami.hiramatsu.pt@hitachi.com> Signed-off-by: Tom Zanussi <tom.zanussi@linux.intel.com> Signed-off-by: Steven Rostedt <rostedt@goodmis.org>
2016-04-19tracing: Add support for named hist triggersTom Zanussi
Allow users to define 'named' hist triggers. All triggers created with the same 'name=xxx' option will update the same shared histogram data. This expands the hist trigger syntax from this: # echo hist:keys=xxx ... [ if filter] > event/trigger to this: # echo hist:name=xxx:keys=xxx ... [ if filter] > event/trigger Named histograms must use a 'compatible' set of keys and values, which means each event added to a set of named triggers must have the same names and types. Reading the 'hist' file of any of the participating events will produce the same output as any other participating event, which is to be expected since they share the same data. Link: http://lkml.kernel.org/r/1dbc84ee3322a75daaf5b3ef1d0cc0a2fb682fc7.1457029949.git.tom.zanussi@linux.intel.com Signed-off-by: Tom Zanussi <tom.zanussi@linux.intel.com> Tested-by: Masami Hiramatsu <masami.hiramatsu.pt@hitachi.com> Reviewed-by: Namhyung Kim <namhyung@kernel.org> Signed-off-by: Steven Rostedt <rostedt@goodmis.org>
2016-04-19tracing: Add support for named triggersTom Zanussi
Named triggers are sets of triggers that share a common set of trigger data. An example of functionality that could benefit from this type of capability would be a set of inlined probes that would each contribute event counts, for example, to a shared counter data structure. The first named trigger registered with a given name owns the common trigger data that the others subsequently registered with the same name will reference. The functions defined here allow users to add, delete, and find named triggers. It also adds functions to pause and unpause named triggers; since named triggers act upon common data, they should also be paused and unpaused as a group. Link: http://lkml.kernel.org/r/c09ff648360f65b10a3e321eddafe18060b4a04f.1457029949.git.tom.zanussi@linux.intel.com Signed-off-by: Tom Zanussi <tom.zanussi@linux.intel.com> Tested-by: Masami Hiramatsu <masami.hiramatsu.pt@hitachi.com> Reviewed-by: Namhyung Kim <namhyung@kernel.org> Signed-off-by: Steven Rostedt <rostedt@goodmis.org>
2016-04-19tracing: Add support for multiple hist triggers per eventTom Zanussi
Allow users to define any number of hist triggers per trace event. Any number of hist triggers may be added for a given event, which may differ by key, value, or filter. Reading the event's 'hist' file will display the output of all the hist triggers defined on an event concatenated in the order they were defined. Link: http://lkml.kernel.org/r/48a0c8dd34c344571de880fb35e211c6d9a28961.1457029949.git.tom.zanussi@linux.intel.com Signed-off-by: Tom Zanussi <tom.zanussi@linux.intel.com> Tested-by: Masami Hiramatsu <masami.hiramatsu.pt@hitachi.com> Reviewed-by: Namhyung Kim <namhyung@kernel.org> Signed-off-by: Steven Rostedt <rostedt@goodmis.org>
2016-04-19tracing: Add enable_hist/disable_hist triggersTom Zanussi
Similar to enable_event/disable_event triggers, these triggers enable and disable the aggregation of events into maps rather than enabling and disabling their writing into the trace buffer. They can be used to automatically start and stop hist triggers based on a matching filter condition. If there's a paused hist trigger on system:event, the following would start it when the filter condition was hit: # echo enable_hist:system:event [ if filter] > event/trigger And the following would disable a running system:event hist trigger: # echo disable_hist:system:event [ if filter] > event/trigger See Documentation/trace/events.txt for real examples. Link: http://lkml.kernel.org/r/f812f086e52c8b7c8ad5443487375e03c96a601f.1457029949.git.tom.zanussi@linux.intel.com Signed-off-by: Tom Zanussi <tom.zanussi@linux.intel.com> Tested-by: Masami Hiramatsu <masami.hiramatsu.pt@hitachi.com> Reviewed-by: Namhyung Kim <namhyung@kernel.org> Signed-off-by: Steven Rostedt <rostedt@goodmis.org>
2016-04-19tracing: Remove restriction on string position in hist trigger keysTom Zanussi
If we assume the maximum size for a string field, we don't have to worry about its position. Since we only allow two keys in a compound key and having more than one string key in a given compound key doesn't make much sense anyway, trading a bit of extra space instead of introducing an arbitrary restriction makes more sense. We also need to use the event field size for static strings when copying the contents, otherwise we get random garbage in the key. Also, cast string return values to avoid warnings on 32-bit compiles. Finally, rearrange the code without changing any functionality by moving the compound key updating code into a separate function. Link: http://lkml.kernel.org/r/8976e1ab04b66bc2700ad1ed0768a2de85ac1983.1457029949.git.tom.zanussi@linux.intel.com Signed-off-by: Tom Zanussi <tom.zanussi@linux.intel.com> Tested-by: Masami Hiramatsu <masami.hiramatsu.pt@hitachi.com> Reviewed-by: Namhyung Kim <namhyung@kernel.org> Signed-off-by: Steven Rostedt <rostedt@goodmis.org>
2016-04-19tracing: Support string type key properlyNamhyung Kim
The string in a trace event is usually recorded as dynamic array which is variable length. But current hist code only support fixed length array so it cannot support most strings. This patch fixes it by checking filter_type of the field and get proper pointer with it. With this, it can get a histogram of exec() based on filenames like below: # cd /sys/kernel/tracing/events/sched/sched_process_exec # cat 'hist:key=filename' > trigger # ps PID TTY TIME CMD 1 ? 00:00:00 init 29 ? 00:00:00 sh 38 ? 00:00:00 ps # ls enable filter format hist id trigger # cat hist # trigger info: hist:keys=filename:vals=hitcount:sort=hitcount:size=2048 [active] { filename: /usr/bin/ps } hitcount: 1 { filename: /usr/bin/ls } hitcount: 1 { filename: /usr/bin/cat } hitcount: 1 Totals: Hits: 3 Entries: 3 Dropped: 0 Link: http://lkml.kernel.org/r/610180d6df0cfdf11ee205452f3b241dea657233.1457029949.git.tom.zanussi@linux.intel.com Cc: Tom Zanussi <tom.zanussi@linux.intel.com> Cc: Masami Hiramatsu <masami.hiramatsu.pt@hitachi.com> Signed-off-by: Namhyung Kim <namhyung@kernel.org> Tested-by: Masami Hiramatsu <masami.hiramatsu.pt@hitachi.com> [ Added (unsigned long) typecast to fix compile warning ] Signed-off-by: Steven Rostedt <rostedt@goodmis.org>
2016-04-19tracing: Add hist trigger support for stacktraces as keysTom Zanussi
It's often useful to be able to use a stacktrace as a hash key, for keeping a count of the number of times a particular call path resulted in a trace event, for instance. Add a special key named 'stacktrace' which can be used as key in a 'keys=' param for this purpose: # echo hist:keys=stacktrace ... \ [ if filter] > event/trigger Link: http://lkml.kernel.org/r/87515e90b3785232a874a12156174635a348edb1.1457029949.git.tom.zanussi@linux.intel.com Signed-off-by: Tom Zanussi <tom.zanussi@linux.intel.com> Tested-by: Masami Hiramatsu <masami.hiramatsu.pt@hitachi.com> Reviewed-by: Namhyung Kim <namhyung@kernel.org> Signed-off-by: Steven Rostedt <rostedt@goodmis.org>
2016-04-19tracing: Add hist trigger 'syscall' modifierTom Zanussi
Allow users to have syscall id fields displayed as syscall names in the output by appending '.syscall' to field names: # echo hist:keys=aaa.syscall ... \ [ if filter] > event/trigger Link: http://lkml.kernel.org/r/2bab1e59933d76a14b545bd2e02f80b8b08ac4d3.1457029949.git.tom.zanussi@linux.intel.com Signed-off-by: Tom Zanussi <tom.zanussi@linux.intel.com> Tested-by: Masami Hiramatsu <masami.hiramatsu.pt@hitachi.com> Reviewed-by: Namhyung Kim <namhyung@kernel.org> Signed-off-by: Steven Rostedt <rostedt@goodmis.org>
2016-04-19tracing: Add hist trigger 'execname' modifierTom Zanussi
Allow users to have common_pid field values displayed as program names in the output by appending '.execname' to a common_pid field name: # echo hist:keys=common_pid.execname ... \ [ if filter] > event/trigger Link: http://lkml.kernel.org/r/e172e81f10f5b8d1f08450e3763c850f39fbf698.1457029949.git.tom.zanussi@linux.intel.com Signed-off-by: Tom Zanussi <tom.zanussi@linux.intel.com> Tested-by: Masami Hiramatsu <masami.hiramatsu.pt@hitachi.com> Reviewed-by: Namhyung Kim <namhyung@kernel.org> Signed-off-by: Steven Rostedt <rostedt@goodmis.org>
2016-04-19tracing: Add hist trigger 'sym' and 'sym-offset' modifiersTom Zanussi
Allow users to have address fields displayed as symbols in the output by appending '.sym' or 'sym-offset' to field names: # echo hist:keys=aaa.sym,bbb.sym-offset ... \ [ if filter] > event/trigger Link: http://lkml.kernel.org/r/87d4935821491c0275513f0fbfb9bab8d3d3f079.1457029949.git.tom.zanussi@linux.intel.com Signed-off-by: Tom Zanussi <tom.zanussi@linux.intel.com> Tested-by: Masami Hiramatsu <masami.hiramatsu.pt@hitachi.com> Reviewed-by: Namhyung Kim <namhyung@kernel.org> Signed-off-by: Steven Rostedt <rostedt@goodmis.org>
2016-04-19tracing: Add hist trigger 'hex' modifier for displaying numeric fieldsTom Zanussi
Allow users to have numeric fields displayed as hex values in the output by appending '.hex' to field names: # echo hist:keys=aaa,bbb.hex:vals=ccc.hex ... \ [ if filter] > event/trigger Link: http://lkml.kernel.org/r/67bd431edda2af5798d7694818f7e8d71b6b3463.1457029949.git.tom.zanussi@linux.intel.com Signed-off-by: Tom Zanussi <tom.zanussi@linux.intel.com> Tested-by: Masami Hiramatsu <masami.hiramatsu.pt@hitachi.com> Reviewed-by: Namhyung Kim <namhyung@kernel.org> Signed-off-by: Steven Rostedt <rostedt@goodmis.org>
2016-04-19tracing: Add hist trigger support for clearing a traceTom Zanussi
Allow users to append 'clear' to an existing trigger in order to have the hash table cleared. This expands the hist trigger syntax from this: # echo hist:keys=xxx:vals=yyy:sort=zzz.descending:pause/cont \ [ if filter] >> event/trigger to this: # echo hist:keys=xxx:vals=yyy:sort=zzz.descending:pause/cont/clear \ [ if filter] >> event/trigger Link: http://lkml.kernel.org/r/ae15dd0d9b2f7af07a37c1ff682063e2dbcdf160.1457029949.git.tom.zanussi@linux.intel.com Signed-off-by: Tom Zanussi <tom.zanussi@linux.intel.com> Tested-by: Masami Hiramatsu <masami.hiramatsu.pt@hitachi.com> Reviewed-by: Namhyung Kim <namhyung@kernel.org> Signed-off-by: Steven Rostedt <rostedt@goodmis.org>
2016-04-19tracing: Add hist trigger support for pausing and continuing a traceTom Zanussi
Allow users to append 'pause' or 'continue' to an existing trigger in order to have it paused or to have a paused trace continue. This expands the hist trigger syntax from this: # echo hist:keys=xxx:vals=yyy:sort=zzz.descending \ [ if filter] >> event/trigger to this: # echo hist:keys=xxx:vals=yyy:sort=zzz.descending:pause or cont \ [ if filter] >> event/trigger Link: http://lkml.kernel.org/r/b672a92c14702cb924cdf6fc27ea1809bed04907.1457029949.git.tom.zanussi@linux.intel.com Signed-off-by: Tom Zanussi <tom.zanussi@linux.intel.com> Tested-by: Masami Hiramatsu <masami.hiramatsu.pt@hitachi.com> Reviewed-by: Namhyung Kim <namhyung@kernel.org> Signed-off-by: Steven Rostedt <rostedt@goodmis.org>
2016-04-19tracing: Add hist trigger support for user-defined sorting ('sort=' param)Tom Zanussi
Allow users to specify keys and/or values to sort on. With this addition, keys and values specified using the 'keys=' and 'vals=' keywords can be used to sort the hist trigger output via a new 'sort=' keyword. If multiple sort keys are specified, the output will be sorted using the second key as a secondary sort key, etc. The default sort order is ascending; if the user wants a different sort order, '.descending' can be appended to the specific sort key. Before this addition, output was always sorted by 'hitcount' in ascending order. This expands the hist trigger syntax from this: # echo hist:keys=xxx:vals=yyy \ [ if filter] > event/trigger to this: # echo hist:keys=xxx:vals=yyy:sort=zzz.descending \ [ if filter] > event/trigger Link: http://lkml.kernel.org/r/b30a41db66ba486979c4f987aff5fab500ea53b3.1457029949.git.tom.zanussi@linux.intel.com Signed-off-by: Tom Zanussi <tom.zanussi@linux.intel.com> Tested-by: Masami Hiramatsu <masami.hiramatsu.pt@hitachi.com> Reviewed-by: Namhyung Kim <namhyung@kernel.org> Signed-off-by: Steven Rostedt <rostedt@goodmis.org>
2016-04-19tracing: Add hist trigger support for compound keysTom Zanussi
Allow users to specify multiple trace event fields to use in keys by allowing multiple fields in the 'keys=' keyword. With this addition, any unique combination of any of the fields named in the 'keys' keyword will result in a new entry being added to the hash table. Link: http://lkml.kernel.org/r/0cfa24e6ac3b0dcece7737d94aa1f322ae3afc4b.1457029949.git.tom.zanussi@linux.intel.com Signed-off-by: Tom Zanussi <tom.zanussi@linux.intel.com> Tested-by: Masami Hiramatsu <masami.hiramatsu.pt@hitachi.com> Reviewed-by: Namhyung Kim <namhyung@kernel.org> Signed-off-by: Steven Rostedt <rostedt@goodmis.org>
2016-04-19tracing: Add hist trigger support for multiple values ('vals=' param)Tom Zanussi
Allow users to specify trace event fields to use in aggregated sums via a new 'vals=' keyword. Before this addition, the only aggregated sum supported was the implied value 'hitcount'. With this addition, 'hitcount' is also supported as an explicit value field, as is any numeric trace event field. This expands the hist trigger syntax from this: # echo hist:keys=xxx [ if filter] > event/trigger to this: # echo hist:keys=xxx:vals=yyy [ if filter] > event/trigger Link: http://lkml.kernel.org/r/2a5d1adb5ba6c65d7bb2148e379f2fed47f29a68.1457029949.git.tom.zanussi@linux.intel.com Signed-off-by: Tom Zanussi <tom.zanussi@linux.intel.com> Tested-by: Masami Hiramatsu <masami.hiramatsu.pt@hitachi.com> Reviewed-by: Namhyung Kim <namhyung@kernel.org> Signed-off-by: Steven Rostedt <rostedt@goodmis.org>
2016-04-19tracing: Add 'hist' event trigger commandTom Zanussi
'hist' triggers allow users to continually aggregate trace events, which can then be viewed afterwards by simply reading a 'hist' file containing the aggregation in a human-readable format. The basic idea is very simple and boils down to a mechanism whereby trace events, rather than being exhaustively dumped in raw form and viewed directly, are automatically 'compressed' into meaningful tables completely defined by the user. This is done strictly via single-line command-line commands and without the aid of any kind of programming language or interpreter. A surprising number of typical use cases can be accomplished by users via this simple mechanism. In fact, a large number of the tasks that users typically do using the more complicated script-based tracing tools, at least during the initial stages of an investigation, can be accomplished by simply specifying a set of keys and values to be used in the creation of a hash table. The Linux kernel trace event subsystem happens to provide an extensive list of keys and values ready-made for such a purpose in the form of the event format files associated with each trace event. By simply consulting the format file for field names of interest and by plugging them into the hist trigger command, users can create an endless number of useful aggregations to help with investigating various properties of the system. See Documentation/trace/events.txt for examples. hist triggers are implemented on top of the existing event trigger infrastructure, and as such are consistent with the existing triggers from a user's perspective as well. The basic syntax follows the existing trigger syntax. Users start an aggregation by writing a 'hist' trigger to the event of interest's trigger file: # echo hist:keys=xxx [ if filter] > event/trigger Once a hist trigger has been set up, by default it continually aggregates every matching event into a hash table using the event key and a value field named 'hitcount'. To view the aggregation at any point in time, simply read the 'hist' file in the same directory as the 'trigger' file: # cat event/hist The detailed syntax provides additional options for user control, and is described exhaustively in Documentation/trace/events.txt and in the virtual tracing/README file in the tracing subsystem. Link: http://lkml.kernel.org/r/72d263b5e1853fe9c314953b65833c3aa75479f2.1457029949.git.tom.zanussi@linux.intel.com Signed-off-by: Tom Zanussi <tom.zanussi@linux.intel.com> Tested-by: Masami Hiramatsu <masami.hiramatsu.pt@hitachi.com> Reviewed-by: Namhyung Kim <namhyung@kernel.org> Signed-off-by: Steven Rostedt <rostedt@goodmis.org>
2016-04-19tracing: Update some tracing_map constants and commentsTom Zanussi
Make it clear exactly how many keys and values are supported through better defines, and add 1 to the vals count, since normally clients want support for at least a hitcount and two other values. Also, note the error return value for tracing_map_add_key/val_field() in the comments. Link: http://lkml.kernel.org/r/6696fa02ebc716aa344c27a571a2afaa25e5b4d4.1457029949.git.tom.zanussi@linux.intel.com Signed-off-by: Tom Zanussi <tom.zanussi@linux.intel.com> Signed-off-by: Steven Rostedt <rostedt@goodmis.org>
2016-04-19tracing: Fix TRACING_MAP KconfigSteven Rostedt (Red Hat)
The config option for TRACING_MAP has "default n", which is not needed because the default of configs is 'n'. Also, since the TRACING_MAP has no config prompt, there's no reason to include "If in doubt, say N" in the help text. Fixed a typo in the comments of tracing_map.h. Signed-off-by: Steven Rostedt <rostedt@goodmis.org>
2016-04-19tracing: Add lock-free tracing_mapTom Zanussi
Add tracing_map, a special-purpose lock-free map for tracing. tracing_map is designed to aggregate or 'sum' one or more values associated with a specific object of type tracing_map_elt, which is associated by the map to a given key. It provides various hooks allowing per-tracer customization and is separated out into a separate file in order to allow it to be shared between multiple tracers, but isn't meant to be generally used outside of that context. The tracing_map implementation was inspired by lock-free map algorithms originated by Dr. Cliff Click: http://www.azulsystems.com/blog/cliff/2007-03-26-non-blocking-hashtable http://www.azulsystems.com/events/javaone_2007/2007_LockFreeHash.pdf Link: http://lkml.kernel.org/r/b43d68d1add33582a396f553c8ef705a33a6a748.1449767187.git.tom.zanussi@linux.intel.com Signed-off-by: Tom Zanussi <tom.zanussi@linux.intel.com> Tested-by: Masami Hiramatsu <masami.hiramatsu.pt@hitachi.com> Reviewed-by: Namhyung Kim <namhyung@kernel.org> Signed-off-by: Steven Rostedt <rostedt@goodmis.org>
2016-04-19tracing: Add infrastructure to allow set_event_pid to follow childrenSteven Rostedt
Add the infrastructure needed to have the PIDs in set_event_pid to automatically add PIDs of the children of the tasks that have their PIDs in set_event_pid. This will also remove PIDs from set_event_pid when a task exits This is implemented by adding hooks into the fork and exit tracepoints. On fork, the PIDs are added to the list, and on exit, they are removed. Add a new option called event_fork that when set, PIDs in set_event_pid will automatically get their children PIDs added when they fork, as well as any task that exits will have its PID removed from set_event_pid. This works for instances as well. Signed-off-by: Steven Rostedt <rostedt@goodmis.org>
2016-04-19tracing: Use pid bitmap instead of a pid array for set_event_pidSteven Rostedt
In order to add the ability to let tasks that are filtered by the events have their children also be traced on fork (and then not traced on exit), convert the array into a pid bitmask. Most of the time the number of pids is only 32768 pids or a 4k bitmask, which is the same size as the default list currently is, and that list could grow if more pids are listed. This also greatly simplifies the code. Suggested-by: "H. Peter Anvin" <hpa@zytor.com> Signed-off-by: Steven Rostedt <rostedt@goodmis.org>
2016-04-19tracing: Rename check_ignore_pid() to ignore_this_task()Steven Rostedt
The name "check_ignore_pid" is confusing in trying to figure out if the pid should be ignored or not. Rename it to "ignore_this_task" which is pretty straight forward, as a task (not a pid) is passed in, and should if true should be ignored. Signed-off-by: Steven Rostedt <rostedt@goodmis.org>
2016-04-18bpf: avoid warning for wrong pointer castArnd Bergmann
Two new functions in bpf contain a cast from a 'u64' to a pointer. This works on 64-bit architectures but causes a warning on all 32-bit architectures: kernel/trace/bpf_trace.c: In function 'bpf_perf_event_output_tp': kernel/trace/bpf_trace.c:350:13: error: cast to pointer from integer of different size [-Werror=int-to-pointer-cast] u64 ctx = *(long *)r1; This changes the cast to first convert the u64 argument into a uintptr_t, which is guaranteed to be the same size as a pointer. Signed-off-by: Arnd Bergmann <arnd@arndb.de> Fixes: 9940d67c93b5 ("bpf: support bpf_get_stackid() and bpf_perf_event_output() in tracepoint programs") Acked-by: Alexei Starovoitov <ast@kernel.org> Signed-off-by: David S. Miller <davem@davemloft.net>
2016-04-18Merge branch 'topic/livepatch' into nextMichael Ellerman
Merge the support for live patching on ppc64le using mprofile-kernel. This branch has also been merged into the livepatching tree for v4.7.
2016-04-15Merge branch 'topic/livepatch' of ↵Jiri Kosina
git://git.kernel.org/pub/scm/linux/kernel/git/powerpc/linux into for-4.7/livepatching-ppc64le Pull livepatching support for ppc64 architecture from Michael Ellerman. Signed-off-by: Jiri Kosina <jkosina@suse.cz>
2016-04-14bpf: convert relevant helper args to ARG_PTR_TO_RAW_STACKDaniel Borkmann
This patch converts all helpers that can use ARG_PTR_TO_RAW_STACK as argument type. For tc programs this is bpf_skb_load_bytes(), bpf_skb_get_tunnel_key(), bpf_skb_get_tunnel_opt(). For tracing, this optimizes bpf_get_current_comm() and bpf_probe_read(). The check in bpf_skb_load_bytes() for MAX_BPF_STACK can also be removed since the verifier already makes sure we stay within bounds on stack buffers. Signed-off-by: Daniel Borkmann <daniel@iogearbox.net> Acked-by: Alexei Starovoitov <ast@kernel.org> Signed-off-by: David S. Miller <davem@davemloft.net>
2016-04-14ftrace: Make ftrace_location_range() globalMichael Ellerman
In order to support live patching on powerpc we would like to call ftrace_location_range(), so make it global. Signed-off-by: Torsten Duwe <duwe@suse.de> Signed-off-by: Balbir Singh <bsingharora@gmail.com> Signed-off-by: Michael Ellerman <mpe@ellerman.id.au>
2016-04-13Merge tag 'v4.6-rc3' into perf/core, to refresh the treeIngo Molnar
Signed-off-by: Ingo Molnar <mingo@kernel.org>
2016-04-07bpf: sanitize bpf tracepoint accessAlexei Starovoitov
during bpf program loading remember the last byte of ctx access and at the time of attaching the program to tracepoint check that the program doesn't access bytes beyond defined in tracepoint fields This also disallows access to __dynamic_array fields, but can be relaxed in the future. Signed-off-by: Alexei Starovoitov <ast@kernel.org> Signed-off-by: David S. Miller <davem@davemloft.net>
2016-04-07bpf: support bpf_get_stackid() and bpf_perf_event_output() in tracepoint ↵Alexei Starovoitov
programs needs two wrapper functions to fetch 'struct pt_regs *' to convert tracepoint bpf context into kprobe bpf context to reuse existing helper functions Signed-off-by: Alexei Starovoitov <ast@kernel.org> Signed-off-by: David S. Miller <davem@davemloft.net>
2016-04-07bpf: register BPF_PROG_TYPE_TRACEPOINT program typeAlexei Starovoitov
register tracepoint bpf program type and let it call the same set of helper functions as BPF_PROG_TYPE_KPROBE Signed-off-by: Alexei Starovoitov <ast@kernel.org> Signed-off-by: David S. Miller <davem@davemloft.net>
2016-04-07perf: split perf_trace_buf_prepare into alloc and update partsAlexei Starovoitov
split allows to move expensive update of 'struct trace_entry' to later phase. Repurpose unused 1st argument of perf_tp_event() to indicate event type. While splitting use temp variable 'rctx' instead of '*rctx' to avoid unnecessary loads done by the compiler due to -fno-strict-aliasing Signed-off-by: Alexei Starovoitov <ast@kernel.org> Acked-by: Peter Zijlstra (Intel) <peterz@infradead.org> Signed-off-by: David S. Miller <davem@davemloft.net>
2016-04-07perf: optimize perf_fetch_caller_regsAlexei Starovoitov
avoid memset in perf_fetch_caller_regs, since it's the critical path of all tracepoints. It's called from perf_sw_event_sched, perf_event_task_sched_in and all of perf_trace_##call with this_cpu_ptr(&__perf_regs[..]) which are zero initialized by perpcu init logic and subsequent call to perf_arch_fetch_caller_regs initializes the same fields on all archs, so we can safely drop memset from all of the above cases and move it into perf_ftrace_function_call that calls it with stack allocated pt_regs. Acked-by: Peter Zijlstra (Intel) <peterz@infradead.org> Signed-off-by: Alexei Starovoitov <ast@kernel.org> Signed-off-by: David S. Miller <davem@davemloft.net>
2016-04-02cpufreq: schedutil: New governor based on scheduler utilization dataRafael J. Wysocki
Add a new cpufreq scaling governor, called "schedutil", that uses scheduler-provided CPU utilization information as input for making its decisions. Doing that is possible after commit 34e2c555f3e1 (cpufreq: Add mechanism for registering utilization update callbacks) that introduced cpufreq_update_util() called by the scheduler on utilization changes (from CFS) and RT/DL task status updates. In particular, CPU frequency scaling decisions may be based on the the utilization data passed to cpufreq_update_util() by CFS. The new governor is relatively simple. The frequency selection formula used by it depends on whether or not the utilization is frequency-invariant. In the frequency-invariant case the new CPU frequency is given by next_freq = 1.25 * max_freq * util / max where util and max are the last two arguments of cpufreq_update_util(). In turn, if util is not frequency-invariant, the maximum frequency in the above formula is replaced with the current frequency of the CPU: next_freq = 1.25 * curr_freq * util / max The coefficient 1.25 corresponds to the frequency tipping point at (util / max) = 0.8. All of the computations are carried out in the utilization update handlers provided by the new governor. One of those handlers is used for cpufreq policies shared between multiple CPUs and the other one is for policies with one CPU only (and therefore it doesn't need to use any extra synchronization means). The governor supports fast frequency switching if that is supported by the cpufreq driver in use and possible for the given policy. In the fast switching case, all operations of the governor take place in its utilization update handlers. If fast switching cannot be used, the frequency switch operations are carried out with the help of a work item which only calls __cpufreq_driver_target() (under a mutex) to trigger a frequency update (to a value already computed beforehand in one of the utilization update handlers). Currently, the governor treats all of the RT and DL tasks as "unknown utilization" and sets the frequency to the allowed maximum when updated from the RT or DL sched classes. That heavy-handed approach should be replaced with something more subtle and specifically targeted at RT and DL tasks. The governor shares some tunables management code with the "ondemand" and "conservative" governors and uses some common definitions from cpufreq_governor.h, but apart from that it is stand-alone. Signed-off-by: Rafael J. Wysocki <rafael.j.wysocki@intel.com> Acked-by: Viresh Kumar <viresh.kumar@linaro.org> Acked-by: Peter Zijlstra (Intel) <peterz@infradead.org>
2016-03-31ftrace/perf: Check sample types only for sampling eventsJiri Olsa
Currently we check sample type for ftrace:function events even if it's not created as a sampling event. That prevents creating ftrace_function event in counting mode. Make sure we check sample types only for sampling events. Before: $ sudo perf stat -e ftrace:function ls ... Performance counter stats for 'ls': <not supported> ftrace:function 0.001983662 seconds time elapsed After: $ sudo perf stat -e ftrace:function ls ... Performance counter stats for 'ls': 44,498 ftrace:function 0.037534722 seconds time elapsed Suggested-by: Namhyung Kim <namhyung@kernel.org> Signed-off-by: Jiri Olsa <jolsa@kernel.org> Signed-off-by: Peter Zijlstra (Intel) <peterz@infradead.org> Acked-by: Steven Rostedt <rostedt@goodmis.org> Cc: Alexander Shishkin <alexander.shishkin@linux.intel.com> Cc: Arnaldo Carvalho de Melo <acme@kernel.org> Cc: Arnaldo Carvalho de Melo <acme@redhat.com> Cc: Jiri Olsa <jolsa@redhat.com> Cc: Linus Torvalds <torvalds@linux-foundation.org> Cc: Peter Zijlstra <peterz@infradead.org> Cc: Stephane Eranian <eranian@google.com> Cc: Thomas Gleixner <tglx@linutronix.de> Cc: Vince Weaver <vincent.weaver@maine.edu> Link: http://lkml.kernel.org/r/1458138873-1553-2-git-send-email-jolsa@kernel.org Signed-off-by: Ingo Molnar <mingo@kernel.org>
2016-03-25arch, ftrace: for KASAN put hard/soft IRQ entries into separate sectionsAlexander Potapenko
KASAN needs to know whether the allocation happens in an IRQ handler. This lets us strip everything below the IRQ entry point to reduce the number of unique stack traces needed to be stored. Move the definition of __irq_entry to <linux/interrupt.h> so that the users don't need to pull in <linux/ftrace.h>. Also introduce the __softirq_entry macro which is similar to __irq_entry, but puts the corresponding functions to the .softirqentry.text section. Signed-off-by: Alexander Potapenko <glider@google.com> Acked-by: Steven Rostedt <rostedt@goodmis.org> Cc: Christoph Lameter <cl@linux.com> Cc: Pekka Enberg <penberg@kernel.org> Cc: David Rientjes <rientjes@google.com> Cc: Joonsoo Kim <iamjoonsoo.kim@lge.com> Cc: Andrey Konovalov <adech.fo@gmail.com> Cc: Dmitry Vyukov <dvyukov@google.com> Cc: Andrey Ryabinin <ryabinin.a.a@gmail.com> Cc: Konstantin Serebryany <kcc@google.com> Cc: Dmitry Chernenkov <dmitryc@google.com> Signed-off-by: Andrew Morton <akpm@linux-foundation.org> Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
2016-03-24Merge tag 'trace-v4.6' of ↵Linus Torvalds
git://git.kernel.org/pub/scm/linux/kernel/git/rostedt/linux-trace Pull tracing updates from Steven Rostedt: "Nothing major this round. Mostly small clean ups and fixes. Some visible changes: - A new flag was added to distinguish traces done in NMI context. - Preempt tracer now shows functions where preemption is disabled but interrupts are still enabled. Other notes: - Updates were done to function tracing to allow better performance with perf. - Infrastructure code has been added to allow for a new histogram feature for recording live trace event histograms that can be configured by simple user commands. The feature itself was just finished, but needs a round in linux-next before being pulled. This only includes some infrastructure changes that will be needed" * tag 'trace-v4.6' of git://git.kernel.org/pub/scm/linux/kernel/git/rostedt/linux-trace: (22 commits) tracing: Record and show NMI state tracing: Fix trace_printk() to print when not using bprintk() tracing: Remove redundant reset per-CPU buff in irqsoff tracer x86: ftrace: Fix the misleading comment for arch/x86/kernel/ftrace.c tracing: Fix crash from reading trace_pipe with sendfile tracing: Have preempt(irqs)off trace preempt disabled functions tracing: Fix return while holding a lock in register_tracer() ftrace: Use kasprintf() in ftrace_profile_tracefs() ftrace: Update dynamic ftrace calls only if necessary ftrace: Make ftrace_hash_rec_enable return update bool tracing: Fix typoes in code comment and printk in trace_nop.c tracing, writeback: Replace cgroup path to cgroup ino tracing: Use flags instead of bool in trigger structure tracing: Add an unreg_all() callback to trigger commands tracing: Add needs_rec flag to event triggers tracing: Add a per-event-trigger 'paused' field tracing: Add get_syscall_name() tracing: Add event record param to trigger_ops.func() tracing: Make event trigger functions available tracing: Make ftrace_event_field checking functions available ...
2016-03-22kernel/...: convert pr_warning to pr_warnJoe Perches
Use the more common logging method with the eventual goal of removing pr_warning altogether. Miscellanea: - Realign arguments - Coalesce formats - Add missing space between a few coalesced formats Signed-off-by: Joe Perches <joe@perches.com> Acked-by: Rafael J. Wysocki <rafael.j.wysocki@intel.com> [kernel/power/suspend.c] Signed-off-by: Andrew Morton <akpm@linux-foundation.org> Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>