Age | Commit message (Collapse) | Author |
|
The -G/--cgroup-filter is to limit lock contention collection on the
tasks in the specific cgroups only.
$ sudo ./perf lock con -abt -G /user.slice/.../vte-spawn-52221fb8-b33f-4a52-b5c3-e35d1e6fc0e0.scope \
./perf bench sched messaging
# Running 'sched/messaging' benchmark:
# 20 sender and receiver processes per group
# 10 groups == 400 processes run
Total time: 0.174 [sec]
contended total wait max wait avg wait pid comm
4 114.45 us 60.06 us 28.61 us 214847 sched-messaging
2 111.40 us 60.84 us 55.70 us 214848 sched-messaging
2 106.09 us 59.42 us 53.04 us 214837 sched-messaging
1 81.70 us 81.70 us 81.70 us 214709 sched-messaging
68 78.44 us 6.83 us 1.15 us 214633 sched-messaging
69 73.71 us 2.69 us 1.07 us 214632 sched-messaging
4 72.62 us 60.83 us 18.15 us 214850 sched-messaging
2 71.75 us 67.60 us 35.88 us 214840 sched-messaging
2 69.29 us 67.53 us 34.65 us 214804 sched-messaging
2 69.00 us 68.23 us 34.50 us 214826 sched-messaging
...
Export cgroup__new() function as it's needed from outside.
Reviewed-by: Ian Rogers <irogers@google.com>
Signed-off-by: Namhyung Kim <namhyung@kernel.org>
Tested-by: Arnaldo Carvalho de Melo <acme@redhat.com>
Cc: Adrian Hunter <adrian.hunter@intel.com>
Cc: Hao Luo <haoluo@google.com>
Cc: Ingo Molnar <mingo@kernel.org>
Cc: Jiri Olsa <jolsa@kernel.org>
Cc: Peter Zijlstra <peterz@infradead.org>
Cc: Song Liu <song@kernel.org>
Cc: bpf@vger.kernel.org
Link: https://lore.kernel.org/r/20230906174903.346486-5-namhyung@kernel.org
Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
|
|
The --lock-cgroup option shows lock contention stats break down by
cgroups.
Add LOCK_AGGR_CGROUP mode and use it instead of use_cgroup field.
$ sudo ./perf lock con -ab --lock-cgroup sleep 1
contended total wait max wait avg wait cgroup
8 15.70 us 6.34 us 1.96 us /
2 1.48 us 747 ns 738 ns /user.slice/.../app.slice/app-gnome-google\x2dchrome-6442.scope
1 848 ns 848 ns 848 ns /user.slice/.../session.slice/org.gnome.Shell@x11.service
1 220 ns 220 ns 220 ns /user.slice/.../session.slice/pipewire-pulse.service
For now, the cgroup mode only works with BPF (-b).
Committer notes:
Remove -g as it is used in the other tools with a clear meaning of
collect/show callchains. As agreed with Namhyung off list.
Reviewed-by: Ian Rogers <irogers@google.com>
Signed-off-by: Namhyung Kim <namhyung@kernel.org>
Tested-by: Arnaldo Carvalho de Melo <acme@redhat.com>
Cc: Adrian Hunter <adrian.hunter@intel.com>
Cc: Hao Luo <haoluo@google.com>
Cc: Ingo Molnar <mingo@kernel.org>
Cc: Jiri Olsa <jolsa@kernel.org>
Cc: Peter Zijlstra <peterz@infradead.org>
Cc: Song Liu <song@kernel.org>
Cc: bpf@vger.kernel.org
Link: https://lore.kernel.org/r/20230906174903.346486-4-namhyung@kernel.org
Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
|
|
Save cgroup info and display cgroup names if requested. This is a
preparation for the next patch.
Reviewed-by: Ian Rogers <irogers@google.com>
Signed-off-by: Namhyung Kim <namhyung@kernel.org>
Cc: Adrian Hunter <adrian.hunter@intel.com>
Cc: Hao Luo <haoluo@google.com>
Cc: Ingo Molnar <mingo@kernel.org>
Cc: Jiri Olsa <jolsa@kernel.org>
Cc: Peter Zijlstra <peterz@infradead.org>
Cc: Song Liu <song@kernel.org>
Cc: bpf@vger.kernel.org
Link: https://lore.kernel.org/r/20230906174903.346486-3-namhyung@kernel.org
Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
|
|
The read_all_cgroups() is to build a tree of cgroups in the system and
users can look up a cgroup using __cgroup_find().
Committer notes:
Had to do this to cover that #else block:
-static inline u64 __read_cgroup_id(const char *path) { return -1ULL; }
+static inline u64 __read_cgroup_id(const char *path __maybe_unused) { return -1ULL; }
Reviewed-by: Ian Rogers <irogers@google.com>
Signed-off-by: Namhyung Kim <namhyung@kernel.org>
Cc: Adrian Hunter <adrian.hunter@intel.com>
Cc: Hao Luo <haoluo@google.com>
Cc: Ingo Molnar <mingo@kernel.org>
Cc: Jiri Olsa <jolsa@kernel.org>
Cc: Peter Zijlstra <peterz@infradead.org>
Cc: Song Liu <song@kernel.org>
Cc: bpf@vger.kernel.org
Link: https://lore.kernel.org/r/20230906174903.346486-2-namhyung@kernel.org
Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
|
|
Use BPF to collect statistics on softirq events based on perf BPF skeletons.
Example usage:
# perf kwork top -b
Starting trace, Hit <Ctrl+C> to stop and report
^C
Total : 135445.704 ms, 8 cpus
%Cpu(s): 28.35% id, 0.00% hi, 0.25% si
%Cpu0 [|||||||||||||||||||| 69.85%]
%Cpu1 [|||||||||||||||||||||| 74.10%]
%Cpu2 [||||||||||||||||||||| 71.18%]
%Cpu3 [|||||||||||||||||||| 69.61%]
%Cpu4 [|||||||||||||||||||||| 74.05%]
%Cpu5 [|||||||||||||||||||| 69.33%]
%Cpu6 [|||||||||||||||||||| 69.71%]
%Cpu7 [|||||||||||||||||||||| 73.77%]
PID SPID %CPU RUNTIME COMMMAND
-------------------------------------------------------------
0 0 30.43 5271.005 ms [swapper/5]
0 0 30.17 5226.644 ms [swapper/3]
0 0 30.08 5210.257 ms [swapper/6]
0 0 29.89 5177.177 ms [swapper/0]
0 0 28.51 4938.672 ms [swapper/2]
0 0 25.93 4223.464 ms [swapper/7]
0 0 25.69 4181.411 ms [swapper/4]
0 0 25.63 4173.804 ms [swapper/1]
16665 16265 2.16 360.600 ms sched-messaging
16537 16265 2.05 356.275 ms sched-messaging
16503 16265 2.01 343.063 ms sched-messaging
16424 16265 1.97 336.876 ms sched-messaging
16580 16265 1.94 323.658 ms sched-messaging
16515 16265 1.92 321.616 ms sched-messaging
16659 16265 1.91 325.538 ms sched-messaging
16634 16265 1.88 327.766 ms sched-messaging
16454 16265 1.87 326.843 ms sched-messaging
16382 16265 1.87 322.591 ms sched-messaging
16642 16265 1.86 320.506 ms sched-messaging
16582 16265 1.86 320.164 ms sched-messaging
16315 16265 1.86 326.872 ms sched-messaging
16637 16265 1.85 323.766 ms sched-messaging
16506 16265 1.82 311.688 ms sched-messaging
16512 16265 1.81 304.643 ms sched-messaging
16560 16265 1.80 314.751 ms sched-messaging
16320 16265 1.80 313.405 ms sched-messaging
16442 16265 1.80 314.403 ms sched-messaging
16626 16265 1.78 295.380 ms sched-messaging
16600 16265 1.77 309.444 ms sched-messaging
16550 16265 1.76 301.161 ms sched-messaging
16525 16265 1.75 296.560 ms sched-messaging
16314 16265 1.75 298.338 ms sched-messaging
16595 16265 1.74 304.390 ms sched-messaging
16555 16265 1.74 287.564 ms sched-messaging
16520 16265 1.74 295.734 ms sched-messaging
16507 16265 1.73 293.956 ms sched-messaging
16593 16265 1.72 296.443 ms sched-messaging
16531 16265 1.72 299.950 ms sched-messaging
16281 16265 1.72 301.339 ms sched-messaging
<SNIP>
Reviewed-by: Ian Rogers <irogers@google.com>
Signed-off-by: Yang Jihong <yangjihong1@huawei.com>
Cc: Adrian Hunter <adrian.hunter@intel.com>
Cc: Alexander Shishkin <alexander.shishkin@linux.intel.com>
Cc: Ingo Molnar <mingo@redhat.com>
Cc: Jiri Olsa <jolsa@kernel.org>
Cc: Kan Liang <kan.liang@linux.intel.com>
Cc: Mark Rutland <mark.rutland@arm.com>
Cc: Namhyung Kim <namhyung@kernel.org>
Cc: Peter Zijlstra <peterz@infradead.org>
Cc: Ravi Bangoria <ravi.bangoria@amd.com>
Cc: Sandipan Das <sandipan.das@amd.com>
Link: https://lore.kernel.org/r/20230812084917.169338-17-yangjihong1@huawei.com
Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
|
|
Use BPF to collect statistics on hardirq events based on perf BPF skeletons.
Example usage:
# perf kwork top -k sched,irq -b
Starting trace, Hit <Ctrl+C> to stop and report
^C
Total : 136717.945 ms, 8 cpus
%Cpu(s): 17.10% id, 0.01% hi, 0.00% si
%Cpu0 [||||||||||||||||||||||||| 84.26%]
%Cpu1 [||||||||||||||||||||||||| 84.77%]
%Cpu2 [|||||||||||||||||||||||| 83.22%]
%Cpu3 [|||||||||||||||||||||||| 80.37%]
%Cpu4 [|||||||||||||||||||||||| 81.49%]
%Cpu5 [||||||||||||||||||||||||| 84.68%]
%Cpu6 [||||||||||||||||||||||||| 84.48%]
%Cpu7 [|||||||||||||||||||||||| 80.21%]
PID SPID %CPU RUNTIME COMMMAND
-------------------------------------------------------------
0 0 19.78 3482.833 ms [swapper/7]
0 0 19.62 3454.219 ms [swapper/3]
0 0 18.50 3258.339 ms [swapper/4]
0 0 16.76 2842.749 ms [swapper/2]
0 0 15.71 2627.905 ms [swapper/0]
0 0 15.51 2598.206 ms [swapper/6]
0 0 15.31 2561.820 ms [swapper/5]
0 0 15.22 2548.708 ms [swapper/1]
13253 13018 2.95 513.108 ms sched-messaging
13092 13018 2.67 454.167 ms sched-messaging
13401 13018 2.66 454.790 ms sched-messaging
13240 13018 2.64 454.587 ms sched-messaging
13251 13018 2.61 442.273 ms sched-messaging
13075 13018 2.61 438.932 ms sched-messaging
13220 13018 2.60 443.245 ms sched-messaging
13235 13018 2.59 443.268 ms sched-messaging
13222 13018 2.50 426.344 ms sched-messaging
13410 13018 2.49 426.191 ms sched-messaging
13228 13018 2.46 425.121 ms sched-messaging
13379 13018 2.38 409.950 ms sched-messaging
13236 13018 2.37 413.159 ms sched-messaging
13095 13018 2.36 396.572 ms sched-messaging
13325 13018 2.35 408.089 ms sched-messaging
13242 13018 2.32 394.750 ms sched-messaging
13386 13018 2.31 396.997 ms sched-messaging
13046 13018 2.29 383.833 ms sched-messaging
13109 13018 2.28 388.482 ms sched-messaging
13388 13018 2.28 393.576 ms sched-messaging
13238 13018 2.26 388.487 ms sched-messaging
<SNIP>
Reviewed-by: Ian Rogers <irogers@google.com>
Signed-off-by: Yang Jihong <yangjihong1@huawei.com>
Cc: Adrian Hunter <adrian.hunter@intel.com>
Cc: Alexander Shishkin <alexander.shishkin@linux.intel.com>
Cc: Ingo Molnar <mingo@redhat.com>
Cc: Jiri Olsa <jolsa@kernel.org>
Cc: Kan Liang <kan.liang@linux.intel.com>
Cc: Mark Rutland <mark.rutland@arm.com>
Cc: Namhyung Kim <namhyung@kernel.org>
Cc: Peter Zijlstra <peterz@infradead.org>
Cc: Ravi Bangoria <ravi.bangoria@amd.com>
Cc: Sandipan Das <sandipan.das@amd.com>
Link: https://lore.kernel.org/r/20230812084917.169338-16-yangjihong1@huawei.com
Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
|
|
Use BPF to collect statistics on the CPU usage based on perf BPF skeletons.
Example usage:
# perf kwork top -h
Usage: perf kwork top [<options>]
-b, --use-bpf Use BPF to measure task cpu usage
-C, --cpu <cpu> list of cpus to profile
-i, --input <file> input file name
-n, --name <name> event name to profile
-s, --sort <key[,key2...]>
sort by key(s): rate, runtime, tid
--time <str> Time span for analysis (start,stop)
#
# perf kwork -k sched top -b
Starting trace, Hit <Ctrl+C> to stop and report
^C
Total : 160702.425 ms, 8 cpus
%Cpu(s): 36.00% id, 0.00% hi, 0.00% si
%Cpu0 [|||||||||||||||||| 61.66%]
%Cpu1 [|||||||||||||||||| 61.27%]
%Cpu2 [||||||||||||||||||| 66.40%]
%Cpu3 [|||||||||||||||||| 61.28%]
%Cpu4 [|||||||||||||||||| 61.82%]
%Cpu5 [||||||||||||||||||||||| 77.41%]
%Cpu6 [|||||||||||||||||| 61.73%]
%Cpu7 [|||||||||||||||||| 63.25%]
PID SPID %CPU RUNTIME COMMMAND
-------------------------------------------------------------
0 0 38.72 8089.463 ms [swapper/1]
0 0 38.71 8084.547 ms [swapper/3]
0 0 38.33 8007.532 ms [swapper/0]
0 0 38.26 7992.985 ms [swapper/6]
0 0 38.17 7971.865 ms [swapper/4]
0 0 36.74 7447.765 ms [swapper/7]
0 0 33.59 6486.942 ms [swapper/2]
0 0 22.58 3771.268 ms [swapper/5]
9545 9351 2.48 447.136 ms sched-messaging
9574 9351 2.09 418.583 ms sched-messaging
9724 9351 2.05 372.407 ms sched-messaging
9531 9351 2.01 368.804 ms sched-messaging
9512 9351 2.00 362.250 ms sched-messaging
9514 9351 1.95 357.767 ms sched-messaging
9538 9351 1.86 384.476 ms sched-messaging
9712 9351 1.84 386.490 ms sched-messaging
9723 9351 1.83 380.021 ms sched-messaging
9722 9351 1.82 382.738 ms sched-messaging
9517 9351 1.81 354.794 ms sched-messaging
9559 9351 1.79 344.305 ms sched-messaging
9725 9351 1.77 365.315 ms sched-messaging
<SNIP>
# perf kwork -k sched top -b -n perf
Starting trace, Hit <Ctrl+C> to stop and report
^C
Total : 151563.332 ms, 8 cpus
%Cpu(s): 26.49% id, 0.00% hi, 0.00% si
%Cpu0 [ 0.01%]
%Cpu1 [ 0.00%]
%Cpu2 [ 0.00%]
%Cpu3 [ 0.00%]
%Cpu4 [ 0.00%]
%Cpu5 [ 0.00%]
%Cpu6 [ 0.00%]
%Cpu7 [ 0.00%]
PID SPID %CPU RUNTIME COMMMAND
-------------------------------------------------------------
9754 9754 0.01 2.303 ms perf
#
# perf kwork -k sched top -b -C 2,3,4
Starting trace, Hit <Ctrl+C> to stop and report
^C
Total : 48016.721 ms, 3 cpus
%Cpu(s): 27.82% id, 0.00% hi, 0.00% si
%Cpu2 [|||||||||||||||||||||| 74.68%]
%Cpu3 [||||||||||||||||||||| 71.06%]
%Cpu4 [||||||||||||||||||||| 70.91%]
PID SPID %CPU RUNTIME COMMMAND
-------------------------------------------------------------
0 0 29.08 4734.998 ms [swapper/4]
0 0 28.93 4710.029 ms [swapper/3]
0 0 25.31 3912.363 ms [swapper/2]
10248 10158 1.62 264.931 ms sched-messaging
10253 10158 1.62 265.136 ms sched-messaging
10158 10158 1.60 263.013 ms bash
10360 10158 1.49 243.639 ms sched-messaging
10413 10158 1.48 238.604 ms sched-messaging
10531 10158 1.47 234.067 ms sched-messaging
10400 10158 1.47 240.631 ms sched-messaging
10355 10158 1.47 230.586 ms sched-messaging
10377 10158 1.43 234.835 ms sched-messaging
10526 10158 1.42 232.045 ms sched-messaging
10298 10158 1.41 222.396 ms sched-messaging
10410 10158 1.38 221.853 ms sched-messaging
10364 10158 1.38 226.042 ms sched-messaging
10480 10158 1.36 213.633 ms sched-messaging
10370 10158 1.36 223.620 ms sched-messaging
10553 10158 1.34 217.169 ms sched-messaging
10291 10158 1.34 211.516 ms sched-messaging
10251 10158 1.34 218.813 ms sched-messaging
10522 10158 1.33 218.498 ms sched-messaging
10288 10158 1.33 216.787 ms sched-messaging
<SNIP>
Reviewed-by: Ian Rogers <irogers@google.com>
Signed-off-by: Yang Jihong <yangjihong1@huawei.com>
Cc: Adrian Hunter <adrian.hunter@intel.com>
Cc: Alexander Shishkin <alexander.shishkin@linux.intel.com>
Cc: Ingo Molnar <mingo@redhat.com>
Cc: Jiri Olsa <jolsa@kernel.org>
Cc: Kan Liang <kan.liang@linux.intel.com>
Cc: Mark Rutland <mark.rutland@arm.com>
Cc: Namhyung Kim <namhyung@kernel.org>
Cc: Peter Zijlstra <peterz@infradead.org>
Cc: Ravi Bangoria <ravi.bangoria@amd.com>
Cc: Sandipan Das <sandipan.das@amd.com>
Link: https://lore.kernel.org/r/20230812084917.169338-15-yangjihong1@huawei.com
Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
|
|
Calculate the runtime of the softirq events and subtract it from
the corresponding task runtime to improve the precision.
Example usage:
# perf kwork -k sched,irq,softirq record -- perf record -e cpu-clock -o perf_record.data -a sleep 10
[ perf record: Woken up 1 times to write data ]
[ perf record: Captured and wrote 0.467 MB perf_record.data (7154 samples) ]
[ perf record: Woken up 1 times to write data ]
[ perf record: Captured and wrote 2.152 MB perf.data (22846 samples) ]
# perf kwork top
Total : 136601.588 ms, 8 cpus
%Cpu(s): 95.66% id, 0.04% hi, 0.05% si
%Cpu0 [ 0.02%]
%Cpu1 [ 0.01%]
%Cpu2 [| 4.61%]
%Cpu3 [ 0.04%]
%Cpu4 [ 0.01%]
%Cpu5 [||||| 17.31%]
%Cpu6 [ 0.51%]
%Cpu7 [||| 11.42%]
PID %CPU RUNTIME COMMMAND
----------------------------------------------------
0 99.98 17073.515 ms swapper/4
0 99.98 17072.173 ms swapper/1
0 99.93 17064.229 ms swapper/3
0 99.62 17011.013 ms swapper/0
0 99.47 16985.180 ms swapper/6
0 95.17 16250.874 ms swapper/2
0 88.51 15111.684 ms swapper/7
0 82.62 14108.577 ms swapper/5
4342 33.00 5644.045 ms perf
4344 0.43 74.351 ms perf
16 0.13 22.296 ms rcu_preempt
4345 0.05 10.093 ms perf
4343 0.05 8.769 ms perf
4341 0.02 4.882 ms perf
4095 0.02 4.605 ms kworker/7:1
75 0.02 4.261 ms kworker/2:1
120 0.01 1.909 ms systemd-journal
98 0.01 2.540 ms jbd2/sda-8
61 0.01 3.404 ms kcompactd0
667 0.01 2.542 ms kworker/u16:2
4340 0.00 1.052 ms kworker/7:2
97 0.00 0.489 ms kworker/7:1H
51 0.00 0.209 ms ksoftirqd/7
50 0.00 0.646 ms migration/7
76 0.00 0.753 ms kworker/6:1
45 0.00 0.572 ms migration/6
87 0.00 0.145 ms kworker/5:1H
73 0.00 0.596 ms kworker/5:1
41 0.00 0.041 ms ksoftirqd/5
40 0.00 0.718 ms migration/5
64 0.00 0.115 ms kworker/4:1
35 0.00 0.556 ms migration/4
353 0.00 2.600 ms sshd
74 0.00 0.205 ms kworker/3:1
33 0.00 1.576 ms kworker/3:0H
30 0.00 0.996 ms migration/3
26 0.00 1.665 ms ksoftirqd/2
25 0.00 0.662 ms migration/2
397 0.00 0.057 ms kworker/1:1
20 0.00 1.005 ms migration/1
2909 0.00 1.053 ms kworker/0:2
17 0.00 0.720 ms migration/0
15 0.00 0.039 ms ksoftirqd/0
Reviewed-by: Ian Rogers <irogers@google.com>
Signed-off-by: Yang Jihong <yangjihong1@huawei.com>
Cc: Adrian Hunter <adrian.hunter@intel.com>
Cc: Alexander Shishkin <alexander.shishkin@linux.intel.com>
Cc: Ingo Molnar <mingo@redhat.com>
Cc: Jiri Olsa <jolsa@kernel.org>
Cc: Kan Liang <kan.liang@linux.intel.com>
Cc: Mark Rutland <mark.rutland@arm.com>
Cc: Namhyung Kim <namhyung@kernel.org>
Cc: Peter Zijlstra <peterz@infradead.org>
Cc: Ravi Bangoria <ravi.bangoria@amd.com>
Cc: Sandipan Das <sandipan.das@amd.com>
Link: https://lore.kernel.org/r/20230812084917.169338-13-yangjihong1@huawei.com
Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
|
|
Calculate the runtime of the hardirq events and subtract it from
the corresponding task runtime to improve the precision.
Example usage:
# perf kwork -k sched,irq record -- perf record -o perf_record.data -a sleep 10
[ perf record: Woken up 2 times to write data ]
[ perf record: Captured and wrote 1.054 MB perf_record.data (18019 samples) ]
[ perf record: Woken up 1 times to write data ]
[ perf record: Captured and wrote 1.798 MB perf.data (16334 samples) ]
#
# perf kwork top
Total : 139240.869 ms, 8 cpus
%Cpu(s): 94.91% id, 0.05% hi
%Cpu0 [ 0.05%]
%Cpu1 [| 5.00%]
%Cpu2 [ 0.43%]
%Cpu3 [ 0.57%]
%Cpu4 [ 1.19%]
%Cpu5 [|||||| 20.46%]
%Cpu6 [ 0.48%]
%Cpu7 [||| 12.10%]
PID %CPU RUNTIME COMMMAND
----------------------------------------------------
0 99.54 17325.622 ms swapper/2
0 99.54 17327.527 ms swapper/0
0 99.51 17319.909 ms swapper/6
0 99.42 17304.934 ms swapper/3
0 98.80 17197.385 ms swapper/4
0 94.99 16534.991 ms swapper/1
0 87.89 15295.264 ms swapper/7
0 79.53 13843.182 ms swapper/5
4252 36.50 6361.768 ms perf
4256 1.17 205.215 ms bash
151 0.53 93.298 ms systemd-resolve
4254 0.39 69.468 ms perf
423 0.34 59.368 ms bash
412 0.29 51.204 ms sshd
249 0.20 35.288 ms sd-resolve
16 0.17 30.287 ms rcu_preempt
153 0.09 17.266 ms systemd-timesyn
1 0.09 17.078 ms systemd
4253 0.07 12.457 ms perf
4255 0.06 11.559 ms perf
4234 0.03 6.105 ms kworker/u16:1
69 0.03 6.259 ms kworker/1:1H
4251 0.02 4.615 ms perf
4095 0.02 4.890 ms kworker/7:1
61 0.02 4.005 ms kcompactd0
75 0.02 3.546 ms kworker/2:1
97 0.01 3.106 ms kworker/7:1H
98 0.01 1.995 ms jbd2/sda-8
4088 0.01 1.779 ms kworker/u16:3
2909 0.01 1.795 ms kworker/0:2
4246 0.00 1.117 ms kworker/7:2
51 0.00 0.327 ms ksoftirqd/7
50 0.00 0.369 ms migration/7
102 0.00 0.160 ms kworker/6:1H
76 0.00 0.609 ms kworker/6:1
45 0.00 0.779 ms migration/6
87 0.00 0.504 ms kworker/5:1H
73 0.00 1.130 ms kworker/5:1
41 0.00 0.152 ms ksoftirqd/5
40 0.00 0.702 ms migration/5
64 0.00 0.316 ms kworker/4:1
35 0.00 0.791 ms migration/4
353 0.00 2.211 ms sshd
74 0.00 0.272 ms kworker/3:1
30 0.00 0.819 ms migration/3
25 0.00 0.784 ms migration/2
397 0.00 0.539 ms kworker/1:1
21 0.00 1.600 ms ksoftirqd/1
20 0.00 0.773 ms migration/1
17 0.00 1.682 ms migration/0
15 0.00 0.076 ms ksoftirqd/0
Reviewed-by: Ian Rogers <irogers@google.com>
Signed-off-by: Yang Jihong <yangjihong1@huawei.com>
Cc: Adrian Hunter <adrian.hunter@intel.com>
Cc: Alexander Shishkin <alexander.shishkin@linux.intel.com>
Cc: Ingo Molnar <mingo@redhat.com>
Cc: Jiri Olsa <jolsa@kernel.org>
Cc: Kan Liang <kan.liang@linux.intel.com>
Cc: Mark Rutland <mark.rutland@arm.com>
Cc: Namhyung Kim <namhyung@kernel.org>
Cc: Peter Zijlstra <peterz@infradead.org>
Cc: Ravi Bangoria <ravi.bangoria@amd.com>
Cc: Sandipan Das <sandipan.das@amd.com>
Link: https://lore.kernel.org/r/20230812084917.169338-12-yangjihong1@huawei.com
Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
|
|
Add evsel__intval_common() helper to search for common_field in
tracepoint format.
Reviewed-by: Ian Rogers <irogers@google.com>
Signed-off-by: Yang Jihong <yangjihong1@huawei.com>
Cc: Adrian Hunter <adrian.hunter@intel.com>
Cc: Alexander Shishkin <alexander.shishkin@linux.intel.com>
Cc: Ingo Molnar <mingo@redhat.com>
Cc: Jiri Olsa <jolsa@kernel.org>
Cc: Kan Liang <kan.liang@linux.intel.com>
Cc: Mark Rutland <mark.rutland@arm.com>
Cc: Namhyung Kim <namhyung@kernel.org>
Cc: Peter Zijlstra <peterz@infradead.org>
Cc: Ravi Bangoria <ravi.bangoria@amd.com>
Cc: Sandipan Das <sandipan.das@amd.com>
Link: https://lore.kernel.org/r/20230812084917.169338-11-yangjihong1@huawei.com
Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
|
|
Some common tools for collecting statistics on CPU usage, such as top,
obtain statistics from timer interrupt sampling, and then periodically
read statistics from /proc/stat.
This method has some deviations:
1. In the tick interrupt, the time between the last tick and the current
tick is counted in the current task. However, the task may be running
only part of the time.
2. For each task, the top tool periodically reads the /proc/{PID}/status
information. For tasks with a short life cycle, it may be missed.
In conclusion, the top tool cannot accurately collect statistics on the
CPU usage and running time of tasks.
The statistical method based on sched_switch tracepoint can accurately
calculate the CPU usage of all tasks. This method is applicable to
scenarios where performance comparison data is of high precision.
Example usage:
# perf kwork
Usage: perf kwork [<options>] {record|report|latency|timehist|top}
-D, --dump-raw-trace dump raw trace in ASCII
-f, --force don't complain, do it
-k, --kwork <kwork> list of kwork to profile (irq, softirq, workqueue, sched, etc)
-v, --verbose be more verbose (show symbol address, etc)
# perf kwork -k sched record -- perf bench sched messaging -g 1 -l 10000
# Running 'sched/messaging' benchmark:
# 20 sender and receiver processes per group
# 1 groups == 40 processes run
Total time: 14.074 [sec]
[ perf record: Woken up 1 times to write data ]
[ perf record: Captured and wrote 15.886 MB perf.data (129472 samples) ]
# perf kwork top
Total : 115708.178 ms, 8 cpus
%Cpu(s): 9.78% id
%Cpu0 [||||||||||||||||||||||||||| 90.55%]
%Cpu1 [||||||||||||||||||||||||||| 90.51%]
%Cpu2 [|||||||||||||||||||||||||| 88.57%]
%Cpu3 [||||||||||||||||||||||||||| 91.18%]
%Cpu4 [||||||||||||||||||||||||||| 91.09%]
%Cpu5 [||||||||||||||||||||||||||| 90.88%]
%Cpu6 [|||||||||||||||||||||||||| 88.64%]
%Cpu7 [||||||||||||||||||||||||||| 90.28%]
PID %CPU RUNTIME COMMMAND
----------------------------------------------------
4113 22.23 3221.547 ms sched-messaging
4105 21.61 3131.495 ms sched-messaging
4119 21.53 3120.937 ms sched-messaging
4103 21.39 3101.614 ms sched-messaging
4106 21.37 3095.209 ms sched-messaging
4104 21.25 3077.269 ms sched-messaging
4115 21.21 3073.188 ms sched-messaging
4109 21.18 3069.022 ms sched-messaging
4111 20.78 3010.033 ms sched-messaging
4114 20.74 3007.073 ms sched-messaging
4108 20.73 3002.137 ms sched-messaging
4107 20.47 2967.292 ms sched-messaging
4117 20.39 2955.335 ms sched-messaging
4112 20.34 2947.080 ms sched-messaging
4118 20.32 2942.519 ms sched-messaging
4121 20.23 2929.865 ms sched-messaging
4110 20.22 2930.078 ms sched-messaging
4122 20.15 2919.542 ms sched-messaging
4120 19.77 2866.032 ms sched-messaging
4116 19.72 2857.660 ms sched-messaging
4127 16.19 2346.334 ms sched-messaging
4142 15.86 2297.600 ms sched-messaging
4141 15.62 2262.646 ms sched-messaging
4136 15.41 2231.408 ms sched-messaging
4130 15.38 2227.008 ms sched-messaging
4129 15.31 2217.692 ms sched-messaging
4126 15.21 2201.711 ms sched-messaging
4139 15.19 2200.722 ms sched-messaging
4137 15.10 2188.633 ms sched-messaging
4134 15.06 2182.082 ms sched-messaging
4132 15.02 2177.530 ms sched-messaging
4131 14.73 2131.973 ms sched-messaging
4125 14.68 2125.439 ms sched-messaging
4128 14.66 2122.255 ms sched-messaging
4123 14.65 2122.113 ms sched-messaging
4135 14.56 2107.144 ms sched-messaging
4133 14.51 2103.549 ms sched-messaging
4124 14.27 2066.671 ms sched-messaging
4140 14.17 2052.251 ms sched-messaging
4138 13.81 2000.361 ms sched-messaging
0 11.42 1652.009 ms swapper/2
0 11.35 1641.694 ms swapper/6
0 9.71 1405.108 ms swapper/7
0 9.48 1372.338 ms swapper/1
0 9.44 1366.013 ms swapper/0
0 9.11 1318.382 ms swapper/5
0 8.90 1287.582 ms swapper/4
0 8.81 1274.356 ms swapper/3
4100 2.61 379.328 ms perf
4101 1.16 169.487 ms perf-exec
151 0.65 94.741 ms systemd-resolve
249 0.36 53.030 ms sd-resolve
153 0.14 21.405 ms systemd-timesyn
1 0.10 16.200 ms systemd
16 0.09 15.785 ms rcu_preempt
4102 0.06 9.727 ms perf
4095 0.03 5.464 ms kworker/7:1
98 0.02 3.231 ms jbd2/sda-8
353 0.02 4.115 ms sshd
75 0.02 3.889 ms kworker/2:1
73 0.01 1.552 ms kworker/5:1
64 0.01 1.591 ms kworker/4:1
74 0.01 1.952 ms kworker/3:1
61 0.01 2.608 ms kcompactd0
397 0.01 1.602 ms kworker/1:1
69 0.01 1.817 ms kworker/1:1H
10 0.01 2.553 ms kworker/u16:0
2909 0.01 2.684 ms kworker/0:2
1211 0.00 0.426 ms kworker/7:0
97 0.00 0.153 ms kworker/7:1H
51 0.00 0.100 ms ksoftirqd/7
120 0.00 0.856 ms systemd-journal
76 0.00 1.414 ms kworker/6:1
46 0.00 0.246 ms ksoftirqd/6
45 0.00 0.164 ms migration/6
41 0.00 0.098 ms ksoftirqd/5
40 0.00 0.207 ms migration/5
86 0.00 1.339 ms kworker/4:1H
36 0.00 0.252 ms ksoftirqd/4
35 0.00 0.090 ms migration/4
31 0.00 0.156 ms ksoftirqd/3
30 0.00 0.073 ms migration/3
26 0.00 0.180 ms ksoftirqd/2
25 0.00 0.085 ms migration/2
21 0.00 0.106 ms ksoftirqd/1
20 0.00 0.118 ms migration/1
302 0.00 1.440 ms systemd-logind
17 0.00 0.132 ms migration/0
15 0.00 0.255 ms ksoftirqd/0
Reviewed-by: Ian Rogers <irogers@google.com>
Signed-off-by: Yang Jihong <yangjihong1@huawei.com>
Cc: Adrian Hunter <adrian.hunter@intel.com>
Cc: Alexander Shishkin <alexander.shishkin@linux.intel.com>
Cc: Ingo Molnar <mingo@redhat.com>
Cc: Jiri Olsa <jolsa@kernel.org>
Cc: Kan Liang <kan.liang@linux.intel.com>
Cc: Mark Rutland <mark.rutland@arm.com>
Cc: Namhyung Kim <namhyung@kernel.org>
Cc: Peter Zijlstra <peterz@infradead.org>
Cc: Ravi Bangoria <ravi.bangoria@amd.com>
Cc: Sandipan Das <sandipan.das@amd.com>
Link: https://lore.kernel.org/r/20230812084917.169338-10-yangjihong1@huawei.com
Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
|
|
The kwork_class type of sched is added to support recording and parsing of
sched_switch events.
As follows:
# perf kwork -h
Usage: perf kwork [<options>] {record|report|latency|timehist}
-D, --dump-raw-trace dump raw trace in ASCII
-f, --force don't complain, do it
-k, --kwork <kwork> list of kwork to profile (irq, softirq, workqueue, sched, etc)
-v, --verbose be more verbose (show symbol address, etc)
# perf kwork -k sched record true
[ perf record: Woken up 1 times to write data ]
[ perf record: Captured and wrote 0.083 MB perf.data (47 samples) ]
# perf evlist
sched:sched_switch
dummy:HG
# Tip: use 'perf evlist --trace-fields' to show fields for tracepoint events
Reviewed-by: Ian Rogers <irogers@google.com>
Signed-off-by: Yang Jihong <yangjihong1@huawei.com>
Cc: Adrian Hunter <adrian.hunter@intel.com>
Cc: Alexander Shishkin <alexander.shishkin@linux.intel.com>
Cc: Ingo Molnar <mingo@redhat.com>
Cc: Jiri Olsa <jolsa@kernel.org>
Cc: Kan Liang <kan.liang@linux.intel.com>
Cc: Mark Rutland <mark.rutland@arm.com>
Cc: Namhyung Kim <namhyung@kernel.org>
Cc: Peter Zijlstra <peterz@infradead.org>
Cc: Ravi Bangoria <ravi.bangoria@amd.com>
Cc: Sandipan Das <sandipan.das@amd.com>
Link: https://lore.kernel.org/r/20230812084917.169338-8-yangjihong1@huawei.com
Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
|
|
To support different types of reports, two parameters `struct perf_kwork
* kwork` and `enum kwork_trace_type src_type` are added to work_init()
of struct kwork_class for initialization in different scenarios.
No functional change intended.
Reviewed-by: Ian Rogers <irogers@google.com>
Signed-off-by: Yang Jihong <yangjihong1@huawei.com>
Cc: Adrian Hunter <adrian.hunter@intel.com>
Cc: Alexander Shishkin <alexander.shishkin@linux.intel.com>
Cc: Ingo Molnar <mingo@redhat.com>
Cc: Jiri Olsa <jolsa@kernel.org>
Cc: Kan Liang <kan.liang@linux.intel.com>
Cc: Mark Rutland <mark.rutland@arm.com>
Cc: Namhyung Kim <namhyung@kernel.org>
Cc: Peter Zijlstra <peterz@infradead.org>
Cc: Ravi Bangoria <ravi.bangoria@amd.com>
Cc: Sandipan Das <sandipan.das@amd.com>
Link: https://lore.kernel.org/r/20230812084917.169338-5-yangjihong1@huawei.com
Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
|
|
Currently, intel-bts, intel-pt, and arm-spe may add tracking event to the
evlist. We may need to search for the tracking event for some settings.
Therefore, add evlist__findnew_tracking_event() helper.
If system_wide is true, evlist__findnew_tracking_event() set the cpu map
of the evsel to all online CPUs.
Signed-off-by: Yang Jihong <yangjihong1@huawei.com>
Acked-by: Adrian Hunter <adrian.hunter@intel.com>
Tested-by: Ravi Bangoria <ravi.bangoria@amd.com>
Cc: Alexander Shishkin <alexander.shishkin@linux.intel.com>
Cc: Andi Kleen <ak@linux.intel.com>
Cc: Anshuman Khandual <anshuman.khandual@arm.com>
Cc: Ian Rogers <irogers@google.com>
Cc: Ingo Molnar <mingo@redhat.com>
Cc: James Clark <james.clark@arm.com>
Cc: Jiri Olsa <jolsa@kernel.org>
Cc: Kan Liang <kan.liang@linux.intel.com>
Cc: Mark Rutland <mark.rutland@arm.com>
Cc: Namhyung Kim <namhyung@kernel.org>
Cc: Peter Zijlstra <peterz@infradead.org>
Cc: Thomas Richter <tmricht@linux.ibm.com>
Link: https://lore.kernel.org/r/20230904023340.12707-3-yangjihong1@huawei.com
Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
|
|
To pick the changes in:
a3e7e6b17946f48b ("libbpf: Remove HASHMAP_INIT static initialization helper")
That don't entail any changes in tools/perf.
This addresses this perf build warning:
Warning: Kernel ABI header differences:
diff -u tools/perf/util/hashmap.h tools/lib/bpf/hashmap.h
Not a kernel ABI, its just that this uses the mechanism in place for
checking kernel ABI files drift.
Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
|
|
parse_events_terms() existed in function names but was passed a
'struct list_head'.
As many parse_events functions take an evsel_config list as well as a
parse_event_term list, and the naming head_terms and head_config is
inconsistent, there's a potential to switch the lists and get errors.
Introduce a 'struct parse_events_terms', that just wraps a list_head, to
avoid this. Add the regular init/exit functions and transition the code
to use them.
Reviewed-by: James Clark <james.clark@arm.com>
Signed-off-by: Ian Rogers <irogers@google.com>
Cc: Adrian Hunter <adrian.hunter@intel.com>
Cc: Alexander Shishkin <alexander.shishkin@linux.intel.com>
Cc: Ingo Molnar <mingo@redhat.com>
Cc: Jiri Olsa <jolsa@kernel.org>
Cc: Kan Liang <kan.liang@linux.intel.com>
Cc: Mark Rutland <mark.rutland@arm.com>
Cc: Namhyung Kim <namhyung@kernel.org>
Cc: Peter Zijlstra <peterz@infradead.org>
Cc: Rob Herring <robh@kernel.org>
Link: https://lore.kernel.org/r/20230901233949.2930562-6-irogers@google.com
Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
|
|
When trying to add events to multiple PMUs the term list is copied first
as adding the event will rewrite the event's name term into the sysfs
and/or json encoding terms (see perf_pmu__check_alias).
Change the parse events add API so the passed in term list is const,
then copy the list when modification is necessary.
Reviewed-by: James Clark <james.clark@arm.com>
Signed-off-by: Ian Rogers <irogers@google.com>
Cc: Adrian Hunter <adrian.hunter@intel.com>
Cc: Alexander Shishkin <alexander.shishkin@linux.intel.com>
Cc: Ingo Molnar <mingo@redhat.com>
Cc: Jiri Olsa <jolsa@kernel.org>
Cc: Kan Liang <kan.liang@linux.intel.com>
Cc: Mark Rutland <mark.rutland@arm.com>
Cc: Namhyung Kim <namhyung@kernel.org>
Cc: Peter Zijlstra <peterz@infradead.org>
Cc: Rob Herring <robh@kernel.org>
Link: https://lore.kernel.org/r/20230901233949.2930562-5-irogers@google.com
Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
|
|
Add term_type to union of values returned by the lexer to avoid casts
to and from an integer.
Reviewed-by: James Clark <james.clark@arm.com>
Signed-off-by: Ian Rogers <irogers@google.com>
Cc: Adrian Hunter <adrian.hunter@intel.com>
Cc: Alexander Shishkin <alexander.shishkin@linux.intel.com>
Cc: Ingo Molnar <mingo@redhat.com>
Cc: Jiri Olsa <jolsa@kernel.org>
Cc: Kan Liang <kan.liang@linux.intel.com>
Cc: Mark Rutland <mark.rutland@arm.com>
Cc: Namhyung Kim <namhyung@kernel.org>
Cc: Peter Zijlstra <peterz@infradead.org>
Cc: Rob Herring <robh@kernel.org>
Link: https://lore.kernel.org/r/20230901233949.2930562-4-irogers@google.com
Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
|
|
Add a const and rename str to event_name.
Reviewed-by: James Clark <james.clark@arm.com>
Signed-off-by: Ian Rogers <irogers@google.com>
Cc: Adrian Hunter <adrian.hunter@intel.com>
Cc: Alexander Shishkin <alexander.shishkin@linux.intel.com>
Cc: Ingo Molnar <mingo@redhat.com>
Cc: Jiri Olsa <jolsa@kernel.org>
Cc: Kan Liang <kan.liang@linux.intel.com>
Cc: Mark Rutland <mark.rutland@arm.com>
Cc: Namhyung Kim <namhyung@kernel.org>
Cc: Peter Zijlstra <peterz@infradead.org>
Cc: Rob Herring <robh@kernel.org>
Link: https://lore.kernel.org/r/20230901233949.2930562-3-irogers@google.com
Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
|
|
The parameter head_terms is always used in get_config_terms.
Reviewed-by: James Clark <james.clark@arm.com>
Signed-off-by: Ian Rogers <irogers@google.com>
Cc: Adrian Hunter <adrian.hunter@intel.com>
Cc: Alexander Shishkin <alexander.shishkin@linux.intel.com>
Cc: Ingo Molnar <mingo@redhat.com>
Cc: Jiri Olsa <jolsa@kernel.org>
Cc: Kan Liang <kan.liang@linux.intel.com>
Cc: Mark Rutland <mark.rutland@arm.com>
Cc: Namhyung Kim <namhyung@kernel.org>
Cc: Peter Zijlstra <peterz@infradead.org>
Cc: Rob Herring <robh@kernel.org>
Link: https://lore.kernel.org/r/20230901233949.2930562-2-irogers@google.com
Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
|
|
Fix the following coccicheck warnings:
./tools/perf/util/machine.c:2000:9-10: WARNING: return of 0/1 in
function 'symbol__match_regex' with return type bool.
Committer notes:
Found this in the pile, it was already returning bool, but this patch
simplifies it further, from 3 lines to just 1.
Reported-by: Abaci Robot <abaci@linux.alibaba.com>
Signed-off-by: Jiapeng Chong <jiapeng.chong@linux.alibaba.com>
Suggested-by: David Laight <David.Laight@ACULAB.COM>
Cc: Alexander Shishkin <alexander.shishkin@linux.intel.com>
Cc: Alexei Starovoitov <ast@kernel.org>
Cc: Andrii Nakryiko <andrii@kernel.org>
Cc: Daniel Borkmann <daniel@iogearbox.net>
Cc: Ingo Molnar <mingo@redhat.com>
Cc: Jiri Olsa <jolsa@kernel.org>
Cc: John Fastabend <john.fastabend@gmail.com>
Cc: KP Singh <kpsingh@kernel.org>
Cc: Mark Rutland <mark.rutland@arm.com>
Cc: Martin KaFai Lau <kafai@fb.com>
Cc: Namhyung Kim <namhyung@kernel.org>
Cc: Peter Zijlstra <peterz@infradead.org>
Cc: Song Liu <songliubraving@fb.com>
Cc: Yonghong Song <yhs@fb.com>
Cc: bpf@vger.kernel.org
Cc: netdev@vger.kernel.org
Link: https://lore.kernel.org/r/1614247483-102665-1-git-send-email-jiapeng.chong@linux.alibaba.com
Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
|
|
git://git.kernel.org/pub/scm/linux/kernel/git/perf/perf-tools
Pull perf tools updates from Arnaldo Carvalho de Melo:
"perf tools maintainership:
- Add git information for perf-tools and perf-tools-next trees and
branches to the MAINTAINERS file. That is where development now
takes place and myself and Namhyung Kim have write access, more
people to come as we emulate other maintainer groups.
perf record:
- Record kernel data maps when 'perf record --data' is used, so that
global variables can be resolved and used in tools that do data
profiling.
perf trace:
- Remove the old, experimental support for BPF events in which a .c
file was passed as an event: "perf trace -e hello.c" to then get
compiled and loaded.
The only known usage for that, that shipped with the kernel as an
example for such events, augmented the raw_syscalls tracepoints and
was converted to a libbpf skeleton, reusing all the user space
components and the BPF code connected to the syscalls.
In the end just the way to glue the BPF part and the user space
type beautifiers changed, now being performed by libbpf skeletons.
The next step is to use BTF to do pretty printing of all syscall
types, as discussed with Alan Maguire and others.
Now, on a perf built with BUILD_BPF_SKEL=1 we get most if not all
path/filenames/strings, some of the networking data structures,
perf_event_attr, etc, i.e. systemwide tracing of nanosleep calls
and perf_event_open syscalls while 'perf stat' runs 'sleep' for 5
seconds:
# perf trace -a -e *nanosleep,perf* perf stat -e cycles,instructions sleep 5
0.000 ( 9.034 ms): perf/327641 perf_event_open(attr_uptr: { type: 0 (PERF_TYPE_HARDWARE), size: 136, config: 0 (PERF_COUNT_HW_CPU_CYCLES), sample_type: IDENTIFIER, read_format: TOTAL_TIME_ENABLED|TOTAL_TIME_RUNNING, disabled: 1, inherit: 1, enable_on_exec: 1, exclude_guest: 1 }, pid: 327642 (perf), cpu: -1, group_fd: -1, flags: FD_CLOEXEC) = 3
9.039 ( 0.006 ms): perf/327641 perf_event_open(attr_uptr: { type: 0 (PERF_TYPE_HARDWARE), size: 136, config: 0x1 (PERF_COUNT_HW_INSTRUCTIONS), sample_type: IDENTIFIER, read_format: TOTAL_TIME_ENABLED|TOTAL_TIME_RUNNING, disabled: 1, inherit: 1, enable_on_exec: 1, exclude_guest: 1 }, pid: 327642 (perf-exec), cpu: -1, group_fd: -1, flags: FD_CLOEXEC) = 4
? ( ): gpm/991 ... [continued]: clock_nanosleep()) = 0
10.133 ( ): sleep/327642 clock_nanosleep(rqtp: { .tv_sec: 5, .tv_nsec: 0 }, rmtp: 0x7ffd36f83ed0) ...
? ( ): pool-gsd-smart/3051 ... [continued]: clock_nanosleep()) = 0
30.276 ( ): gpm/991 clock_nanosleep(rqtp: { .tv_sec: 2, .tv_nsec: 0 }, rmtp: 0x7ffcc6f73710) ...
223.215 (1000.430 ms): pool-gsd-smart/3051 clock_nanosleep(rqtp: { .tv_sec: 1, .tv_nsec: 0 }, rmtp: 0x7f6e7fffec90) = 0
30.276 (2000.394 ms): gpm/991 ... [continued]: clock_nanosleep()) = 0
1230.814 ( ): pool-gsd-smart/3051 clock_nanosleep(rqtp: { .tv_sec: 1, .tv_nsec: 0 }, rmtp: 0x7f6e7fffec90) ...
1230.814 (1000.404 ms): pool-gsd-smart/3051 ... [continued]: clock_nanosleep()) = 0
2030.886 ( ): gpm/991 clock_nanosleep(rqtp: { .tv_sec: 2, .tv_nsec: 0 }, rmtp: 0x7ffcc6f73710) ...
2237.709 (1000.153 ms): pool-gsd-smart/3051 clock_nanosleep(rqtp: { .tv_sec: 1, .tv_nsec: 0 }, rmtp: 0x7f6e7fffec90) = 0
? ( ): crond/1172 ... [continued]: clock_nanosleep()) = 0
3242.699 ( ): pool-gsd-smart/3051 clock_nanosleep(rqtp: { .tv_sec: 1, .tv_nsec: 0 }, rmtp: 0x7f6e7fffec90) ...
2030.886 (2000.385 ms): gpm/991 ... [continued]: clock_nanosleep()) = 0
3728.078 ( ): crond/1172 clock_nanosleep(rqtp: { .tv_sec: 60, .tv_nsec: 0 }, rmtp: 0x7ffe0971dcf0) ...
3242.699 (1000.158 ms): pool-gsd-smart/3051 ... [continued]: clock_nanosleep()) = 0
4031.409 ( ): gpm/991 clock_nanosleep(rqtp: { .tv_sec: 2, .tv_nsec: 0 }, rmtp: 0x7ffcc6f73710) ...
10.133 (5000.375 ms): sleep/327642 ... [continued]: clock_nanosleep()) = 0
Performance counter stats for 'sleep 5':
2,617,347 cycles
1,855,997 instructions # 0.71 insn per cycle
5.002282128 seconds time elapsed
0.000855000 seconds user
0.000852000 seconds sys
perf annotate:
- Building with binutils' libopcode now is opt-in (BUILD_NONDISTRO=1)
for licensing reasons, and we missed a build test on
tools/perf/tests makefile.
Since we now default to NDEBUG=1, we ended up segfaulting when
building with BUILD_NONDISTRO=1 because a needed initialization
routine was being "error checked" via an assert.
Fix it by explicitly checking the result and aborting instead if it
fails.
We better back propagate the error, but at least 'perf annotate' on
samples collected for a BPF program is back working when perf is
built with BUILD_NONDISTRO=1.
perf report/top:
- Add back TUI hierarchy mode header, that is seen when using 'perf
report/top --hierarchy'.
- Fix the number of entries for 'e' key in the TUI that was
preventing navigation of lines when expanding an entry.
perf report/script:
- Support cross platform register handling, allowing a perf.data file
collected on one architecture to have registers sampled correctly
displayed when analysis tools such as 'perf report' and 'perf
script' are used on a different architecture.
- Fix handling of event attributes in pipe mode, i.e. when one uses:
perf record -o - | perf report -i -
When no perf.data files are used.
- Handle files generated via pipe mode with a version of perf and
then read also via pipe mode with a different version of perf,
where the event attr record may have changed, use the record size
field to properly support this version mismatch.
perf probe:
- Accessing global variables from uprobes isn't supported, make the
error message state that instead of stating that some minimal
kernel version is needed to have that feature. This seems just a
tool limitation, the kernel probably has all that is needed.
perf tests:
- Fix a reference count related leak in the dlfilter v0 API where the
result of a thread__find_symbol_fb() is not matched with an
addr_location__exit() to drop the reference counts of the resolved
components (machine, thread, map, symbol, etc). Add a dlfilter test
to make sure that doesn't regresses.
- Lots of fixes for the 'perf test' written in shell script related
to problems found with the shellcheck utility.
- Fixes for 'perf test' shell scripts testing features enabled when
perf is built with BUILD_BPF_SKEL=1, such as 'perf stat' bpf
counters.
- Add perf record sample filtering test, things like the following
example, that gets implemented as a BPF filter attached to the
event:
# perf record -e task-clock -c 10000 --filter 'ip < 0xffffffff00000000'
- Improve the way the task_analyzer test checks if libtraceevent is
linked, using 'perf version --build-options' instead of the more
expensinve 'perf record -e "sched:sched_switch"'.
- Add support for riscv in the mmap-basic test. (This went as well
via the RiscV tree, same contents).
libperf:
- Implement riscv mmap support (This went as well via the RiscV tree,
same contents).
perf script:
- New tool that converts perf.data files to the firefox profiler
format so that one can use the visualizer at
https://profiler.firefox.com/. Done by Anup Sharma as part of this
year's Google Summer of Code.
One can generate the output and upload it to the web interface but
Anup also automated everything:
perf script gecko -F 99 -a sleep 60
- Support syscall name parsing on arm64.
- Print "cgroup" field on the same line as "comm".
perf bench:
- Add new 'uprobe' benchmark to measure the overhead of uprobes
with/without BPF programs attached to it.
- breakpoints are not available on power9, skip that test.
perf stat:
- Add #num_cpus_online literal to be used in 'perf stat' metrics, and
add this extra 'perf test' check that exemplifies its purpose:
TEST_ASSERT_VAL("#num_cpus_online",
expr__parse(&num_cpus_online, ctx, "#num_cpus_online") == 0);
TEST_ASSERT_VAL("#num_cpus", expr__parse(&num_cpus, ctx, "#num_cpus") == 0);
TEST_ASSERT_VAL("#num_cpus >= #num_cpus_online", num_cpus >= num_cpus_online);
Miscellaneous:
- Improve tool startup time by lazily reading PMU, JSON, sysfs data.
- Improve error reporting in the parsing of events, passing YYLTYPE
to error routines, so that the output can show were the parsing
error was found.
- Add 'perf test' entries to check the parsing of events
improvements.
- Fix various leak for things detected by -fsanitize=address, mostly
things that would be freed at tool exit, including:
- Free evsel->filter on the destructor.
- Allow tools to register a thread->priv destructor and use it in
'perf trace'.
- Free evsel->priv in 'perf trace'.
- Free string returned by synthesize_perf_probe_point() when the
caller fails to do all it needs.
- Adjust various compiler options to not consider errors some
warnings when building with broken headers found in things like
python, flex, bison, as we otherwise build with -Werror. Some for
gcc, some for clang, some for some specific version of those, some
for some specific version of flex or bison, or some specific
combination of these components, bah.
- Allow customization of clang options for BPF target, this helps
building on gentoo where there are other oddities where BPF targets
gets passed some compiler options intended for the native build, so
building with WERROR=0 helps while these oddities are fixed.
- Dont pass ERR_PTR() values to perf_session__delete() in 'perf top'
and 'perf lock', fixing some segfaults when handling some odd
failures.
- Add LTO build option.
- Fix format of unordered lists in the perf docs
(tools/perf/Documentation)
- Overhaul the bison files, using constructs such as YYNOMEM.
- Remove unused tokens from the bison .y files.
- Add more comments to various structs.
- A few LoongArch enablement patches.
Vendor events (JSON):
- Add JSON metrics for Yitian 710 DDR (aarch64). Things like:
EventName, BriefDescription
visible_window_limit_reached_rd, "At least one entry in read queue reaches the visible window limit.",
visible_window_limit_reached_wr, "At least one entry in write queue reaches the visible window limit.",
op_is_dqsosc_mpc , "A DQS Oscillator MPC command to DRAM.",
op_is_dqsosc_mrr , "A DQS Oscillator MRR command to DRAM.",
op_is_tcr_mrr , "A Temperature Compensated Refresh(TCR) MRR command to DRAM.",
- Add AmpereOne metrics (aarch64).
- Update N2 and V2 metrics (aarch64) and events using Arm telemetry
repo.
- Update scale units and descriptions of common topdown metrics on
aarch64. Things like:
- "MetricExpr": "stall_slot_frontend / (#slots * cpu_cycles)",
- "BriefDescription": "Frontend bound L1 topdown metric",
+ "MetricExpr": "100 * (stall_slot_frontend / (#slots * cpu_cycles))",
+ "BriefDescription": "This metric is the percentage of total slots that were stalled due to resource constraints in the frontend of the processor.",
- Update events for intel: meteorlake to 1.04, sapphirerapids to
1.15, Icelake+ metric constraints.
- Update files for the power10 platform"
* tag 'perf-tools-for-v6.6-1-2023-09-05' of git://git.kernel.org/pub/scm/linux/kernel/git/perf/perf-tools: (217 commits)
perf parse-events: Fix driver config term
perf parse-events: Fixes relating to no_value terms
perf parse-events: Fix propagation of term's no_value when cloning
perf parse-events: Name the two term enums
perf list: Don't print Unit for "default_core"
perf vendor events intel: Fix modifier in tma_info_system_mem_parallel_reads for skylake
perf dlfilter: Avoid leak in v0 API test use of resolve_address()
perf metric: Add #num_cpus_online literal
perf pmu: Remove str from perf_pmu_alias
perf parse-events: Make common term list to strbuf helper
perf parse-events: Minor help message improvements
perf pmu: Avoid uninitialized use of alias->str
perf jevents: Use "default_core" for events with no Unit
perf test stat_bpf_counters_cgrp: Enhance perf stat cgroup BPF counter test
perf test shell stat_bpf_counters: Fix test on Intel
perf test shell record_bpf_filter: Skip 6.2 kernel
libperf: Get rid of attr.id field
perf tools: Convert to perf_record_header_attr_id()
libperf: Add perf_record_header_attr_id()
perf tools: Handle old data in PERF_RECORD_ATTR
...
|
|
Inadvertently deleted in commit 30f4ade33d649aa0 ("perf tools: Revert
enable indices setting syntax for BPF map").
Fixes: 30f4ade33d649aa0 ("perf tools: Revert enable indices setting syntax for BPF map")
Reported-by: James Clark <james.clark@arm.com>
Reviewed-by: James Clark <james.clark@arm.com>
Signed-off-by: Ian Rogers <irogers@google.com>
Cc: Adrian Hunter <adrian.hunter@intel.com>
Cc: Alexander Shishkin <alexander.shishkin@linux.intel.com>
Cc: Jiri Olsa <jolsa@kernel.org>
Cc: Mark Rutland <mark.rutland@arm.com>
Cc: Namhyung Kim <namhyung@kernel.org>
Cc: Peter Zijlstra <peterz@infradead.org>
Link: http://lore.kernel.org/lkml/20230905033805.3094293-1-irogers@google.com
Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
|
|
A term may have no value in which case it is assumed to have a value
of 1. It doesn't just apply to alias/event terms so change the
parse_events_term__to_strbuf assert.
Commit 99e7138eb7897aa0 ("perf tools: Fail on using multiple bits long
terms without value") made it so that no_value terms could only be for a
single bit. Prior to commit 64199ae4b8a3 ("perf parse-events: Fix
propagation of term's no_value when cloning") this missed a test case
where config1 had no_value.
Fixes: 64199ae4b8a36038 ("perf parse-events: Fix propagation of term's no_value when cloning")
Signed-off-by: Ian Rogers <irogers@google.com>
Cc: Adrian Hunter <adrian.hunter@intel.com>
Cc: Alexander Shishkin <alexander.shishkin@linux.intel.com>
Cc: Ian Rogers <irogers@google.com>
Cc: Ingo Molnar <mingo@redhat.com>
Cc: James Clark <james.clark@arm.com>
Cc: Jiri Olsa <jolsa@kernel.org>
Cc: Kan Liang <kan.liang@linux.intel.com>
Cc: Mark Rutland <mark.rutland@arm.com>
Cc: Namhyung Kim <namhyung@kernel.org>
Cc: Peter Zijlstra <peterz@infradead.org>
Cc: Rob Herring <robh@kernel.org>
Link: https://lore.kernel.org/r/20230901233949.2930562-1-irogers@google.com
Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
|
|
The no_value field in 'struct parse_events_term' indicates that the val
variable isn't used, the case for an event name.
Cloning wasn't propagating this, making cloned event name terms
appearing to have a constant assinged to them.
Working around the bug would check for a value of 1 assigned to value,
but then this meant a user value of 1 couldn't be differentiated causing
the value to be lost in debug printing and perf list.
The change fixes the cloning and updates the "val.num ==/!= 1" tests to
use no_value instead.
To better check the no_value is set appropriately parameter comments are
added for constant values.
This found that no_value wasn't set correctly in parse_events_multi_pmu_add,
which matters now that no_value is used to indicate an event name.
Fixes: 7a6e91644708d514 ("perf parse-events: Make common term list to strbuf helper")
Fixes: 99e7138eb7897aa0 ("perf tools: Fail on using multiple bits long terms without value")
Signed-off-by: Ian Rogers <irogers@google.com>
Tested-by: Kan Liang <kan.liang@linux.intel.com>
Cc: Adrian Hunter <adrian.hunter@intel.com>
Cc: Alexander Shishkin <alexander.shishkin@linux.intel.com>
Cc: Ingo Molnar <mingo@redhat.com>
Cc: James Clark <james.clark@arm.com>
Cc: Jiri Olsa <jolsa@kernel.org>
Cc: Mark Rutland <mark.rutland@arm.com>
Cc: Namhyung Kim <namhyung@kernel.org>
Cc: Peter Zijlstra <peterz@infradead.org>
Cc: Rob Herring <robh@kernel.org>
Link: https://lore.kernel.org/r/20230831071421.2201358-4-irogers@google.com
Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
|
|
Name the enums used by 'struct parse_events_term' to
parse_events__term_val_type and parse_events__term_type.
This allows greater compile time error checking.
Fix -Wswitch related issues by explicitly listing all enum values prior
to default.
Add config_term_name to safely look up a parse_events__term_type name,
bounds checking the array access first.
Add documentation to 'struct parse_events_terms' and reorder to save
space.
Signed-off-by: Ian Rogers <irogers@google.com>
Tested-by: Kan Liang <kan.liang@linux.intel.com>
Cc: Adrian Hunter <adrian.hunter@intel.com>
Cc: Alexander Shishkin <alexander.shishkin@linux.intel.com>
Cc: Ingo Molnar <mingo@redhat.com>
Cc: James Clark <james.clark@arm.com>
Cc: Jiri Olsa <jolsa@kernel.org>
Cc: Mark Rutland <mark.rutland@arm.com>
Cc: Namhyung Kim <namhyung@kernel.org>
Cc: Peter Zijlstra <peterz@infradead.org>
Cc: Rob Herring <robh@kernel.org>
Link: https://lore.kernel.org/r/20230831071421.2201358-3-irogers@google.com
Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
|
|
The introduction of reference counting causes the v0 API
perf_dlfilter_fns.resolve_address() to leak.
v2 API introduced perf_dlfilter_fns.al_cleanup() to prevent that.
For the v0 API, avoid the leak by exiting the addr_location immediately,
since the documentation makes it clear that pointers obtained via
perf_dlfilter_fns are not necessarily valid (dereferenceable) after
'filter_event' and 'filter_event_early' return.
Reported-by: kernel test robot <oliver.sang@intel.com>
Signed-off-by: Adrian Hunter <adrian.hunter@intel.com>
Cc: Ian Rogers <irogers@google.com>
Cc: Jiri Olsa <jolsa@kernel.org>
Cc: Namhyung Kim <namhyung@kernel.org>
Closes: https://lore.kernel.org/oe-lkp/202308232146.94d82cb4-oliver.sang@intel.com
Link: http://lore.kernel.org/lkml/20230830090539.68206-1-adrian.hunter@intel.com
Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
|
|
Returns the number of CPUs online, unlike #num_cpus that returns the
number present.
Add a test of the property.
This will be used in future Intel metrics.
Reviewed-by: Kan Liang <kan.liang@linux.intel.com>
Signed-off-by: Ian Rogers <irogers@google.com>
Cc: Adrian Hunter <adrian.hunter@intel.com>
Cc: Alexander Shishkin <alexander.shishkin@linux.intel.com>
Cc: Eduard Zingerman <eddyz87@gmail.com>
Cc: James Clark <james.clark@arm.com>
Cc: Jing Zhang <renyu.zj@linux.alibaba.com>
Cc: Jiri Olsa <jolsa@kernel.org>
Cc: John Garry <john.g.garry@oracle.com>
Cc: Mark Rutland <mark.rutland@arm.com>
Cc: Namhyung Kim <namhyung@kernel.org>
Cc: Perry Taylor <perry.taylor@intel.com>
Cc: Peter Zijlstra <peterz@infradead.org>
Link: http://lore.kernel.org/lkml/20230830073026.1829912-1-irogers@google.com
Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
|
|
Currently the value is only used in perf list.
Compute the value just when needed to avoid unnecessary overhead.
Recycle the strbuf to avoid memory allocation overhead.
Signed-off-by: Ian Rogers <irogers@google.com>
Cc: Adrian Hunter <adrian.hunter@intel.com>
Cc: Alexander Shishkin <alexander.shishkin@linux.intel.com>
Cc: James Clark <james.clark@arm.com>
Cc: Jiri Olsa <jolsa@kernel.org>
Cc: Kan Liang <kan.liang@linux.intel.com>
Cc: Mark Rutland <mark.rutland@arm.com>
Cc: Namhyung Kim <namhyung@kernel.org>
Cc: Peter Zijlstra <peterz@infradead.org>
Link: http://lore.kernel.org/lkml/20230830070753.1821629-3-irogers@google.com
Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
|
|
A term list is turned into a string for debug output and for the str
value in the alias.
Add a helper to do this based on existing code, but then fix for
situations like events being identified.
Use strbuf to manage the dynamic memory allocation and remove the 256
byte limit.
Use in various places the string of the term list is required.
Before:
$ sudo perf stat -vv -e inst_retired.any true
Using CPUID GenuineIntel-6-8D-1
intel_pt default config: tsc,mtc,mtc_period=3,psb_period=3,pt,branch
Attempting to add event pmu 'cpu' with 'inst_retired.any,' that may result in non-fatal errors
After aliases, add event pmu 'cpu' with 'event,period,' that may result in non-fatal errors
inst_retired.any -> cpu/inst_retired.any/
...
After:
$ sudo perf stat -vv -e inst_retired.any true
Using CPUID GenuineIntel-6-8D-1
intel_pt default config: tsc,mtc,mtc_period=3,psb_period=3,pt,branch
Attempt to add: cpu/inst_retired.any/
..after resolving event: cpu/event=0xc0,period=0x1e8483/
inst_retired.any -> cpu/event=0xc0,period=0x1e8483/
...
Signed-off-by: Ian Rogers <irogers@google.com>
Cc: Adrian Hunter <adrian.hunter@intel.com>
Cc: Alexander Shishkin <alexander.shishkin@linux.intel.com>
Cc: James Clark <james.clark@arm.com>
Cc: Jiri Olsa <jolsa@kernel.org>
Cc: Kan Liang <kan.liang@linux.intel.com>
Cc: Mark Rutland <mark.rutland@arm.com>
Cc: Namhyung Kim <namhyung@kernel.org>
Cc: Peter Zijlstra <peterz@infradead.org>
Link: http://lore.kernel.org/lkml/20230830070753.1821629-2-irogers@google.com
Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
|
|
Be more specific and fix a typo.
Signed-off-by: Ian Rogers <irogers@google.com>
Cc: Adrian Hunter <adrian.hunter@intel.com>
Cc: Alexander Shishkin <alexander.shishkin@linux.intel.com>
Cc: James Clark <james.clark@arm.com>
Cc: Jiri Olsa <jolsa@kernel.org>
Cc: Kan Liang <kan.liang@linux.intel.com>
Cc: Mark Rutland <mark.rutland@arm.com>
Cc: Namhyung Kim <namhyung@kernel.org>
Cc: Peter Zijlstra <peterz@infradead.org>
Link: http://lore.kernel.org/lkml/20230830070753.1821629-1-irogers@google.com
Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
|
|
alias is allocated with malloc allowing uninitialized memory to be
accessed.
The initialization of str was moved late after it could have been
updated by a JSON event, however, this create a potential for an
uninitialized use.
Fix this by assigning str to NULL early.
Testing on ARM (Raspberry Pi) showed a memory leak in the same code so
add a zfree.
Fixes: f63a536f03a2f64f ("perf pmu: Merge JSON events with sysfs at load time")
Reported-by: Arnaldo Carvalho de Melo <acme@redhat.com>
Signed-off-by: Ian Rogers <irogers@google.com>
Acked-by: Namhyung Kim <namhyung@kernel.org>
Tested-by: Arnaldo Carvalho de Melo <acme@redhat.com>
Cc: Adrian Hunter <adrian.hunter@intel.com>
Cc: Alexander Shishkin <alexander.shishkin@linux.intel.com>
Cc: Ian Rogers <irogers@google.com>
Cc: Ingo Molnar <mingo@redhat.com>
Cc: James Clark <james.clark@arm.com>
Cc: Jing Zhang <renyu.zj@linux.alibaba.com>
Cc: Jiri Olsa <jolsa@kernel.org>
Cc: Kan Liang <kan.liang@linux.intel.com>
Cc: Mark Rutland <mark.rutland@arm.com>
Cc: Peter Zijlstra <peterz@infradead.org>
Cc: Sumanth Korikkar <sumanthk@linux.ibm.com>
Cc: Thomas Richter <tmricht@linux.ibm.com>
Link: https://lore.kernel.org/r/20230830000545.1638964-1-irogers@google.com
Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
|
|
The JSON Unit field encodes the name of the PMU to match the events
to. When no name is given it has meant the "cpu" core PMU except for
tests.
On ARM, Intel hybrid and s390 the core PMU is named differently which
means that using "cpu" for this case causes the events not to get
matched to the PMU.
Introduce a new "default_core" string for this case and in the
pmu__name_match force all core PMUs to match this name.
Fixes: 2e255b4f9f41f137 ("perf jevents: Group events by PMU")
Reported-by: Arnaldo Carvalho de Melo <acme@kernel.org>
Reported-by: Thomas Richter <tmricht@linux.ibm.com>
Signed-off-by: Ian Rogers <irogers@google.com>
Cc: Adrian Hunter <adrian.hunter@intel.com>
Cc: Alexander Shishkin <alexander.shishkin@linux.intel.com>
Cc: Ian Rogers <irogers@google.com>
Cc: James Clark <james.clark@arm.com>
Cc: Jing Zhang <renyu.zj@linux.alibaba.com>
Cc: Jiri Olsa <jolsa@kernel.org>
Cc: John Garry <john.g.garry@oracle.com>
Cc: Kajol Jain <kjain@linux.ibm.com>
Cc: Kan Liang <kan.liang@linux.intel.com>
Cc: Mark Rutland <mark.rutland@arm.com>
Cc: Namhyung Kim <namhyung@kernel.org>
Cc: Peter Zijlstra <peterz@infradead.org>
Link: http://lore.kernel.org/lkml/20230826062203.1058041-1-irogers@google.com
Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
|
|
Instead of accessing the attr.id directly, use the
perf_record_header_attr_id() helper to handle old versions.
Signed-off-by: Namhyung Kim <namhyung@kernel.org>
Cc: Adrian Hunter <adrian.hunter@intel.com>
Cc: Ian Rogers <irogers@google.com>
Cc: Ingo Molnar <mingo@kernel.org>
Cc: Jiri Olsa <jolsa@kernel.org>
Cc: Peter Zijlstra <peterz@infradead.org>
Link: https://lore.kernel.org/r/20230825152552.112913-3-namhyung@kernel.org
Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
|
|
The PERF_RECORD_ATTR is used for a pipe mode to describe an event with
attribute and IDs. The ID table comes after the attr and it calculate
size of the table using the total record size and the attr size.
n_ids = (total_record_size - end_of_the_attr_field) / sizeof(u64)
This is fine for most use cases, but sometimes it saves the pipe output
in a file and then process it later. And it becomes a problem if there
is a change in attr size between the record and report.
$ perf record -o- > perf-pipe.data # old version
$ perf report -i- < perf-pipe.data # new version
For example, if the attr size is 128 and it has 4 IDs, then it would
save them in 168 byte like below:
8 byte: perf event header { .type = PERF_RECORD_ATTR, .size = 168 },
128 byte: perf event attr { .size = 128, ... },
32 byte: event IDs [] = { 1234, 1235, 1236, 1237 },
But when report later, it thinks the attr size is 136 then it only read
the last 3 entries as ID.
8 byte: perf event header { .type = PERF_RECORD_ATTR, .size = 168 },
136 byte: perf event attr { .size = 136, ... },
24 byte: event IDs [] = { 1235, 1236, 1237 }, // 1234 is missing
So it should use the recorded version of the attr. The attr has the
size field already then it should honor the size when reading data.
Fixes: 2c46dbb517a10b18 ("perf: Convert perf header attrs into attr events")
Signed-off-by: Namhyung Kim <namhyung@kernel.org>
Cc: Adrian Hunter <adrian.hunter@intel.com>
Cc: Ian Rogers <irogers@google.com>
Cc: Ingo Molnar <mingo@kernel.org>
Cc: Jiri Olsa <jolsa@kernel.org>
Cc: Peter Zijlstra <peterz@infradead.org>
Cc: Tom Zanussi <zanussi@kernel.org>
Cc: stable@vger.kernel.org
Link: https://lore.kernel.org/r/20230825152552.112913-1-namhyung@kernel.org
Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
|
|
Add a PMUs scan that ignores duplicates. When there are multiple PMUs
that differ only by suffix, by default just list the first one and
skip all others. The scan routine checks that the PMU names match but
doesn't enforce that the numbers are consecutive as for some PMUs
there are gaps. If "-v" is passed to "perf list" then list all PMUs.
With the previous change duplicate PMUs are no longer printed but the
suffix of the first is printed. When duplicate PMUs are being skipped
avoid printing the suffix.
Before:
$ perf list
...
uncore_imc_free_running_0/data_read/ [Kernel PMU event]
uncore_imc_free_running_0/data_total/ [Kernel PMU event]
uncore_imc_free_running_0/data_write/ [Kernel PMU event]
uncore_imc_free_running_1/data_read/ [Kernel PMU event]
uncore_imc_free_running_1/data_total/ [Kernel PMU event]
uncore_imc_free_running_1/data_write/ [Kernel PMU event]
After:
$ perf list
...
uncore_imc_free_running/data_read/ [Kernel PMU event]
uncore_imc_free_running/data_total/ [Kernel PMU event]
uncore_imc_free_running/data_write/ [Kernel PMU event]
...
$ perf list -v
uncore_imc_free_running_0/data_read/ [Kernel PMU event]
uncore_imc_free_running_0/data_total/ [Kernel PMU event]
uncore_imc_free_running_0/data_write/ [Kernel PMU event]
uncore_imc_free_running_1/data_read/ [Kernel PMU event]
uncore_imc_free_running_1/data_total/ [Kernel PMU event]
uncore_imc_free_running_1/data_write/ [Kernel PMU event]
...
Reviewed-by: Kan Liang <kan.liang@linux.intel.com>
Signed-off-by: Ian Rogers <irogers@google.com>
Cc: Adrian Hunter <adrian.hunter@intel.com>
Cc: Alexander Shishkin <alexander.shishkin@linux.intel.com>
Cc: Ingo Molnar <mingo@redhat.com>
Cc: Jiri Olsa <jolsa@kernel.org>
Cc: John Garry <john.g.garry@oracle.com>
Cc: Kajol Jain <kjain@linux.ibm.com>
Cc: Mark Rutland <mark.rutland@arm.com>
Cc: Namhyung Kim <namhyung@kernel.org>
Cc: Peter Zijlstra <peterz@infradead.org>
Cc: Ravi Bangoria <ravi.bangoria@amd.com>
Link: https://lore.kernel.org/r/20230825135237.921058-3-irogers@google.com
Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
|
|
Sort PMUs by name. If two PMUs have the same name but differ by
suffix, sort the suffixes numerically.
For example, "breakpoint" comes before "cpu",
"uncore_imc_free_running_0" comes before "uncore_imc_free_running_1".
Suffixes need to be treated specially as otherwise they will be ordered
like 0, 1, 10, 11, .., 2, 20, 21, .., etc. Only PMUs starting 'uncore_'
are considered to have a potential suffix.
Sorting of PMUs is done so that later patches can skip duplicate uncore
PMUs that differ only by there suffix.
Committer notes:
Used the more compact, intention revealing strstarts() function we got
from the kernel sources:
- if (strncmp(str, "uncore_", 7))
+ if (!strstarts(str, "uncore_"))
Also in pmus_cmp() the lhs_num and rhs_num variables may end up not
being set for non "uncore_" prefixed PMUs in pmu_name_len_no_suffix(),
or at least gcc 7.5 in some distros (opensuse 15.5, to be EOLed in
Dec/2024) thins so, so initialize both to zero.
Reviewed-by: Kan Liang <kan.liang@linux.intel.com>
Signed-off-by: Ian Rogers <irogers@google.com>
Cc: Adrian Hunter <adrian.hunter@intel.com>
Cc: Alexander Shishkin <alexander.shishkin@linux.intel.com>
Cc: Ingo Molnar <mingo@redhat.com>
Cc: Jiri Olsa <jolsa@kernel.org>
Cc: John Garry <john.g.garry@oracle.com>
Cc: Kajol Jain <kjain@linux.ibm.com>
Cc: Mark Rutland <mark.rutland@arm.com>
Cc: Namhyung Kim <namhyung@kernel.org>
Cc: Peter Zijlstra <peterz@infradead.org>
Cc: Ravi Bangoria <ravi.bangoria@amd.com>
Link: https://lore.kernel.org/r/20230825135237.921058-2-irogers@google.com
Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
|
|
Define these macros so that the CPU name can be displayed when running
'perf report' and 'perf timechart'.
Committer notes:
No need to have:
if (strcasestr(buf, "Model Name")) {
strlcpy(cpu_m, &buf[13], 255);
break;
} else if (strcasestr(buf, "model name")) {
strlcpy(cpu_m, &buf[13], 255);
break;
}
As the point of strcasestr() is to be case insensitive to both the
haystack and the needle, so simplify the above to just:
if (strcasestr(buf, "model name")) {
strlcpy(cpu_m, &buf[13], 255);
break;
}
Signed-off-by: Yanteng Si <siyanteng@loongson.cn>
Acked-by: Huacai Chen <chenhuacai@loongson.cn>
Cc: Adrian Hunter <adrian.hunter@intel.com>
Cc: Alexander Shishkin <alexander.shishkin@linux.intel.com>
Cc: Ian Rogers <irogers@google.com>
Cc: Ingo Molnar <mingo@redhat.com>
Cc: Jiri Olsa <jolsa@kernel.org>
Cc: Mark Rutland <mark.rutland@arm.com>
Cc: Namhyung Kim <namhyung@kernel.org>
Cc: Peter Zijlstra <peterz@infradead.org>
Cc: loongarch@lists.linux.dev
Cc: loongson-kernel@lists.loongnix.cn
Link: https://lore.kernel.org/r/db968a186a10e4629fe10c26a1210f7126ad41ec.1692962043.git.siyanteng@loongson.cn
Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
|
|
Initialize realname to NULL, rather than name.
This avoids a cast and as realpath is either NULL or an allocated
string, free can be called unconditionally.
Signed-off-by: Ian Rogers <irogers@google.com>
Cc: Adrian Hunter <adrian.hunter@intel.com>
Cc: Alexander Shishkin <alexander.shishkin@linux.intel.com>
Cc: Athira Rajeev <atrajeev@linux.vnet.ibm.com>
Cc: Huacai Chen <chenhuacai@kernel.org>
Cc: Ingo Molnar <mingo@redhat.com>
Cc: James Clark <james.clark@arm.com>
Cc: Jiri Olsa <jolsa@kernel.org>
Cc: John Garry <john.g.garry@oracle.com>
Cc: K Prateek Nayak <kprateek.nayak@amd.com>
Cc: Kajol Jain <kjain@linux.ibm.com>
Cc: Kan Liang <kan.liang@linux.intel.com>
Cc: Leo Yan <leo.yan@linaro.org>
Cc: Mark Rutland <mark.rutland@arm.com>
Cc: Mike Leach <mike.leach@linaro.org>
Cc: Ming Wang <wangming01@loongson.cn>
Cc: Namhyung Kim <namhyung@kernel.org>
Cc: Peter Zijlstra <peterz@infradead.org>
Cc: Ravi Bangoria <ravi.bangoria@amd.com>
Cc: Sean Christopherson <seanjc@google.com>
Cc: Suzuki Poulouse <suzuki.poulose@arm.com>
Cc: Wei Li <liwei391@huawei.com>
Cc: Will Deacon <will@kernel.org>
Cc: linux-arm-kernel@lists.infradead.org
Link: https://lore.kernel.org/r/20230825024002.801955-6-irogers@google.com
Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
|
|
The struct pmu id is initialized from pmu_id that is read into allocated
memory from a file, as such it needs free-ing in pmu__delete().
Make the id value const so that we can remove casts in tests.
Signed-off-by: Ian Rogers <irogers@google.com>
Cc: Adrian Hunter <adrian.hunter@intel.com>
Cc: Alexander Shishkin <alexander.shishkin@linux.intel.com>
Cc: Athira Rajeev <atrajeev@linux.vnet.ibm.com>
Cc: Huacai Chen <chenhuacai@kernel.org>
Cc: Ingo Molnar <mingo@redhat.com>
Cc: James Clark <james.clark@arm.com>
Cc: Jiri Olsa <jolsa@kernel.org>
Cc: John Garry <john.g.garry@oracle.com>
Cc: K Prateek Nayak <kprateek.nayak@amd.com>
Cc: Kajol Jain <kjain@linux.ibm.com>
Cc: Kan Liang <kan.liang@linux.intel.com>
Cc: Leo Yan <leo.yan@linaro.org>
Cc: Mark Rutland <mark.rutland@arm.com>
Cc: Mike Leach <mike.leach@linaro.org>
Cc: Ming Wang <wangming01@loongson.cn>
Cc: Namhyung Kim <namhyung@kernel.org>
Cc: Peter Zijlstra <peterz@infradead.org>
Cc: Ravi Bangoria <ravi.bangoria@amd.com>
Cc: Sean Christopherson <seanjc@google.com>
Cc: Suzuki Poulouse <suzuki.poulose@arm.com>
Cc: Wei Li <liwei391@huawei.com>
Cc: Will Deacon <will@kernel.org>
Cc: linux-arm-kernel@lists.infradead.org
Link: https://lore.kernel.org/r/20230825024002.801955-5-irogers@google.com
Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
|
|
This avoids casts in tests. Use zfree in a few places to avoid
warnings about a freeing a const pointer.
Signed-off-by: Ian Rogers <irogers@google.com>
Cc: Adrian Hunter <adrian.hunter@intel.com>
Cc: Alexander Shishkin <alexander.shishkin@linux.intel.com>
Cc: Athira Rajeev <atrajeev@linux.vnet.ibm.com>
Cc: Huacai Chen <chenhuacai@kernel.org>
Cc: Ingo Molnar <mingo@redhat.com>
Cc: James Clark <james.clark@arm.com>
Cc: Jiri Olsa <jolsa@kernel.org>
Cc: John Garry <john.g.garry@oracle.com>
Cc: K Prateek Nayak <kprateek.nayak@amd.com>
Cc: Kajol Jain <kjain@linux.ibm.com>
Cc: Kan Liang <kan.liang@linux.intel.com>
Cc: Leo Yan <leo.yan@linaro.org>
Cc: Mark Rutland <mark.rutland@arm.com>
Cc: Mike Leach <mike.leach@linaro.org>
Cc: Ming Wang <wangming01@loongson.cn>
Cc: Namhyung Kim <namhyung@kernel.org>
Cc: Peter Zijlstra <peterz@infradead.org>
Cc: Ravi Bangoria <ravi.bangoria@amd.com>
Cc: Sean Christopherson <seanjc@google.com>
Cc: Suzuki Poulouse <suzuki.poulose@arm.com>
Cc: Wei Li <liwei391@huawei.com>
Cc: Will Deacon <will@kernel.org>
Cc: linux-arm-kernel@lists.infradead.org
Link: https://lore.kernel.org/r/20230825024002.801955-4-irogers@google.com
Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
|
|
The PMU name could be NULL in the case of the fake_pmu. Initialize the
name for the fake_pmu to "fake" so that all other logic can assume it
is initialized. Add a const to the type of name so that a literal can
be used to avoid additional initialization code. Propagate the cost
through related routines and remove now unnecessary "(char *)"
casts. Doing this located a bug in builtin-list for the pmu_glob that
was missing a strdup.
Signed-off-by: Ian Rogers <irogers@google.com>
Link: https://lore.kernel.org/r/20230825024002.801955-3-irogers@google.com
Cc: K Prateek Nayak <kprateek.nayak@amd.com>
Cc: Ravi Bangoria <ravi.bangoria@amd.com>
Cc: James Clark <james.clark@arm.com>
Cc: Mark Rutland <mark.rutland@arm.com>
Cc: Suzuki Poulouse <suzuki.poulose@arm.com>
Cc: Sean Christopherson <seanjc@google.com>
Cc: Wei Li <liwei391@huawei.com>
Cc: Peter Zijlstra <peterz@infradead.org>
Cc: Adrian Hunter <adrian.hunter@intel.com>
Cc: Arnaldo Carvalho de Melo <acme@kernel.org>
Cc: Huacai Chen <chenhuacai@kernel.org>
Cc: Jiri Olsa <jolsa@kernel.org>
Cc: Namhyung Kim <namhyung@kernel.org>
Cc: Will Deacon <will@kernel.org>
Cc: Leo Yan <leo.yan@linaro.org>
Cc: Mike Leach <mike.leach@linaro.org>
Cc: Kajol Jain <kjain@linux.ibm.com>
Cc: Alexander Shishkin <alexander.shishkin@linux.intel.com>
Cc: Kan Liang <kan.liang@linux.intel.com>
Cc: Athira Rajeev <atrajeev@linux.vnet.ibm.com>
Cc: linux-arm-kernel@lists.infradead.org
Cc: Ming Wang <wangming01@loongson.cn>
Cc: John Garry <john.g.garry@oracle.com>
Cc: Ingo Molnar <mingo@redhat.com>
Cc: linux-kernel@vger.kernel.org
Cc: linux-perf-users@vger.kernel.org
Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
|
|
PMU caps are written as HEADER_PMU_CAPS or for the special case of the
PMU "cpu" as HEADER_CPU_PMU_CAPS. As the PMU "cpu" is special, and not
any "core" PMU, the logic had become broken and core PMUs not called
"cpu" were not having their caps written.
This affects ARM and s390 non-hybrid PMUs.
Simplify the PMU caps writing logic to scan one fewer time and to be
more explicit in its behavior.
Fixes: 178ddf3bad981380 ("perf header: Avoid hybrid PMU list in write_pmu_caps")
Reported-by: Wei Li <liwei391@huawei.com>
Signed-off-by: Ian Rogers <irogers@google.com>
Cc: Adrian Hunter <adrian.hunter@intel.com>
Cc: Alexander Shishkin <alexander.shishkin@linux.intel.com>
Cc: Athira Rajeev <atrajeev@linux.vnet.ibm.com>
Cc: Huacai Chen <chenhuacai@kernel.org>
Cc: Ian Rogers <irogers@google.com>
Cc: Ingo Molnar <mingo@redhat.com>
Cc: James Clark <james.clark@arm.com>
Cc: Jiri Olsa <jolsa@kernel.org>
Cc: John Garry <john.g.garry@oracle.com>
Cc: K Prateek Nayak <kprateek.nayak@amd.com>
Cc: Kajol Jain <kjain@linux.ibm.com>
Cc: Kan Liang <kan.liang@linux.intel.com>
Cc: Leo Yan <leo.yan@linaro.org>
Cc: Mark Rutland <mark.rutland@arm.com>
Cc: Mike Leach <mike.leach@linaro.org>
Cc: Ming Wang <wangming01@loongson.cn>
Cc: Namhyung Kim <namhyung@kernel.org>
Cc: Peter Zijlstra <peterz@infradead.org>
Cc: Ravi Bangoria <ravi.bangoria@amd.com>
Cc: Sean Christopherson <seanjc@google.com>
Cc: Suzuki Poulouse <suzuki.poulose@arm.com>
Cc: Will Deacon <will@kernel.org>
Cc: linux-arm-kernel@lists.infradead.org
Link: https://lore.kernel.org/r/20230825024002.801955-2-irogers@google.com
Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
|
|
Don't load sysfs aliases for a PMU when the PMU is first created, defer
until an alias needs to be found. For the pmu-scan benchmark, average
core PMU scanning is reduced by 30.8%, and average PMU scanning by
12.6%.
Signed-off-by: Ian Rogers <irogers@google.com>
Cc: Adrian Hunter <adrian.hunter@intel.com>
Cc: Alexander Shishkin <alexander.shishkin@linux.intel.com>
Cc: Gaosheng Cui <cuigaosheng1@huawei.com>
Cc: Ingo Molnar <mingo@redhat.com>
Cc: James Clark <james.clark@arm.com>
Cc: Jing Zhang <renyu.zj@linux.alibaba.com>
Cc: Jiri Olsa <jolsa@kernel.org>
Cc: John Garry <john.g.garry@oracle.com>
Cc: Kajol Jain <kjain@linux.ibm.com>
Cc: Kan Liang <kan.liang@linux.intel.com>
Cc: Mark Rutland <mark.rutland@arm.com>
Cc: Namhyung Kim <namhyung@kernel.org>
Cc: Peter Zijlstra <peterz@infradead.org>
Cc: Ravi Bangoria <ravi.bangoria@amd.com>
Cc: Rob Herring <robh@kernel.org>
Link: https://lore.kernel.org/r/20230824041330.266337-17-irogers@google.com
Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
|
|
Event info is only needed when an event is parsed or when merging data
from an JSON and sysfs event. Be lazy in its loading to reduce file
accesses.
Signed-off-by: Ian Rogers <irogers@google.com>
Cc: Adrian Hunter <adrian.hunter@intel.com>
Cc: Alexander Shishkin <alexander.shishkin@linux.intel.com>
Cc: Gaosheng Cui <cuigaosheng1@huawei.com>
Cc: Ingo Molnar <mingo@redhat.com>
Cc: James Clark <james.clark@arm.com>
Cc: Jing Zhang <renyu.zj@linux.alibaba.com>
Cc: Jiri Olsa <jolsa@kernel.org>
Cc: John Garry <john.g.garry@oracle.com>
Cc: Kajol Jain <kjain@linux.ibm.com>
Cc: Kan Liang <kan.liang@linux.intel.com>
Cc: Mark Rutland <mark.rutland@arm.com>
Cc: Namhyung Kim <namhyung@kernel.org>
Cc: Peter Zijlstra <peterz@infradead.org>
Cc: Ravi Bangoria <ravi.bangoria@amd.com>
Cc: Rob Herring <robh@kernel.org>
Link: https://lore.kernel.org/r/20230824041330.266337-16-irogers@google.com
Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
|
|
Scan sysfs PMU's type early so that format and aliases aren't
attempted to be loaded if the PMU name is invalid.
This is the case for event_pmu tokens in parse-events.y where a wildcard
name is first assumed to be a PMU name.
Signed-off-by: Ian Rogers <irogers@google.com>
Cc: Adrian Hunter <adrian.hunter@intel.com>
Cc: Alexander Shishkin <alexander.shishkin@linux.intel.com>
Cc: Gaosheng Cui <cuigaosheng1@huawei.com>
Cc: Ingo Molnar <mingo@redhat.com>
Cc: James Clark <james.clark@arm.com>
Cc: Jing Zhang <renyu.zj@linux.alibaba.com>
Cc: Jiri Olsa <jolsa@kernel.org>
Cc: John Garry <john.g.garry@oracle.com>
Cc: Kajol Jain <kjain@linux.ibm.com>
Cc: Kan Liang <kan.liang@linux.intel.com>
Cc: Mark Rutland <mark.rutland@arm.com>
Cc: Namhyung Kim <namhyung@kernel.org>
Cc: Peter Zijlstra <peterz@infradead.org>
Cc: Ravi Bangoria <ravi.bangoria@amd.com>
Cc: Rob Herring <robh@kernel.org>
Link: https://lore.kernel.org/r/20230824041330.266337-15-irogers@google.com
Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
|
|
Rather than scanning all JSON events and adding them when a PMU is
created, add the alias when the JSON event is needed.
Average core PMU scanning run time reduced by 60.2%. Average PMU
scanning run time reduced by 15%. Page faults with no events reduced by
74 page faults, 4% of total.
Signed-off-by: Ian Rogers <irogers@google.com>
Cc: Adrian Hunter <adrian.hunter@intel.com>
Cc: Alexander Shishkin <alexander.shishkin@linux.intel.com>
Cc: Gaosheng Cui <cuigaosheng1@huawei.com>
Cc: Ingo Molnar <mingo@redhat.com>
Cc: James Clark <james.clark@arm.com>
Cc: Jing Zhang <renyu.zj@linux.alibaba.com>
Cc: Jiri Olsa <jolsa@kernel.org>
Cc: John Garry <john.g.garry@oracle.com>
Cc: Kajol Jain <kjain@linux.ibm.com>
Cc: Kan Liang <kan.liang@linux.intel.com>
Cc: Mark Rutland <mark.rutland@arm.com>
Cc: Namhyung Kim <namhyung@kernel.org>
Cc: Peter Zijlstra <peterz@infradead.org>
Cc: Ravi Bangoria <ravi.bangoria@amd.com>
Cc: Rob Herring <robh@kernel.org>
Link: https://lore.kernel.org/r/20230824041330.266337-14-irogers@google.com
Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
|
|
Cache the JSON events table so that finding it isn't done per
event/alias.
Change the events table find so that when the PMU is given, if the PMU
has no JSON events return null.
Update usage to always use the PMU variable.
Signed-off-by: Ian Rogers <irogers@google.com>
Cc: Adrian Hunter <adrian.hunter@intel.com>
Cc: Alexander Shishkin <alexander.shishkin@linux.intel.com>
Cc: Gaosheng Cui <cuigaosheng1@huawei.com>
Cc: Ingo Molnar <mingo@redhat.com>
Cc: James Clark <james.clark@arm.com>
Cc: Jing Zhang <renyu.zj@linux.alibaba.com>
Cc: Jiri Olsa <jolsa@kernel.org>
Cc: John Garry <john.g.garry@oracle.com>
Cc: Kajol Jain <kjain@linux.ibm.com>
Cc: Kan Liang <kan.liang@linux.intel.com>
Cc: Mark Rutland <mark.rutland@arm.com>
Cc: Namhyung Kim <namhyung@kernel.org>
Cc: Peter Zijlstra <peterz@infradead.org>
Cc: Ravi Bangoria <ravi.bangoria@amd.com>
Cc: Rob Herring <robh@kernel.org>
Link: https://lore.kernel.org/r/20230824041330.266337-13-irogers@google.com
Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
|
|
Rather than load all sysfs events then parsing all JSON events and
merging with ones that already exist. When a sysfs event is loaded, look
for a corresponding JSON event and merge immediately.
To simplify the logic, early exit the perf_pmu__new_alias function if an
alias is attempted to be added twice - as merging has already been
explicitly handled.
Fix the copying of terms to a merged alias and some ENOMEM paths.
Signed-off-by: Ian Rogers <irogers@google.com>
Cc: Adrian Hunter <adrian.hunter@intel.com>
Cc: Alexander Shishkin <alexander.shishkin@linux.intel.com>
Cc: Gaosheng Cui <cuigaosheng1@huawei.com>
Cc: Ingo Molnar <mingo@redhat.com>
Cc: James Clark <james.clark@arm.com>
Cc: Jing Zhang <renyu.zj@linux.alibaba.com>
Cc: Jiri Olsa <jolsa@kernel.org>
Cc: John Garry <john.g.garry@oracle.com>
Cc: Kajol Jain <kjain@linux.ibm.com>
Cc: Kan Liang <kan.liang@linux.intel.com>
Cc: Mark Rutland <mark.rutland@arm.com>
Cc: Namhyung Kim <namhyung@kernel.org>
Cc: Peter Zijlstra <peterz@infradead.org>
Cc: Ravi Bangoria <ravi.bangoria@amd.com>
Cc: Rob Herring <robh@kernel.org>
Link: https://lore.kernel.org/r/20230824041330.266337-12-irogers@google.com
Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
|
|
The aliases list is part of the PMU. Rather than pass the aliases
list, pass the full PMU simplifying some callbacks.
Signed-off-by: Ian Rogers <irogers@google.com>
Cc: Adrian Hunter <adrian.hunter@intel.com>
Cc: Alexander Shishkin <alexander.shishkin@linux.intel.com>
Cc: Gaosheng Cui <cuigaosheng1@huawei.com>
Cc: Ingo Molnar <mingo@redhat.com>
Cc: James Clark <james.clark@arm.com>
Cc: Jing Zhang <renyu.zj@linux.alibaba.com>
Cc: Jiri Olsa <jolsa@kernel.org>
Cc: John Garry <john.g.garry@oracle.com>
Cc: Kajol Jain <kjain@linux.ibm.com>
Cc: Kan Liang <kan.liang@linux.intel.com>
Cc: Mark Rutland <mark.rutland@arm.com>
Cc: Namhyung Kim <namhyung@kernel.org>
Cc: Peter Zijlstra <peterz@infradead.org>
Cc: Ravi Bangoria <ravi.bangoria@amd.com>
Cc: Rob Herring <robh@kernel.org>
Link: https://lore.kernel.org/r/20230824041330.266337-11-irogers@google.com
Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
|