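1. Introduction
perf is a performance analysis tool for Linux. It is built on the Linux performance counters subsystem, a kernel-based subsystem that provides a framework for all performance analysis. The subsystem covers hardware-level features (the CPU's PMU, or Performance Monitoring Unit) and software features (software counters and tracepoints).
With perf, you can use the PMU, tracepoints, and in-kernel counters to collect performance statistics. It can analyze the performance of a specified application, of the kernel, or of both at once, which gives a complete picture of where an application's bottlenecks lie. perf can analyze hardware events that occur while a program runs, such as instructions retired and processor clock cycles, as well as software events, such as page faults and context switches.
perf is a comprehensive analysis tool: its scope ranges from system-wide performance down to individual processes and threads, and further down to the function and assembly-instruction level.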
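2. Building perf
The perf source code lives in the Linux kernel source tree under tools/perf; running make in that directory builds the executable. Because perf is tightly coupled to the kernel, keep the perf version in sync with the version of the kernel you are running.
When running perf on an ARM platform, you may see an error like the following: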
The sys_perf_event_open() syscall returned with 38 (Function not implemented) for event (cycles).
/bin/dmesg may provide additional information.
No CONFIG_PERF_EVENTS=y kernel support configured?
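Fixing this error requires enabling the relevant kernel configuration option, in this case CONFIG_PERF_EVENTS=y. For other errors, follow the hints in the command output to enable the corresponding kernel options.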
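Before rebuilding a kernel, it can help to confirm whether the option is really missing. The sketch below greps a kernel config file for CONFIG_PERF_EVENTS=y. It assumes the running kernel's config is available at /proc/config.gz or /boot/config-$(uname -r), which is common but not guaranteed; the function name check_perf_config is made up for this example.

```shell
# Return 0 if the given kernel config file enables CONFIG_PERF_EVENTS.
# Handles both plain-text and gzip-compressed config files.
check_perf_config() {
    cfg="$1"
    case "$cfg" in
        *.gz) gzip -dc "$cfg" ;;
        *)    cat "$cfg" ;;
    esac | grep -q '^CONFIG_PERF_EVENTS=y$'
}

# Assumed locations of the running kernel's config; either may be absent.
for f in /proc/config.gz "/boot/config-$(uname -r)"; do
    [ -r "$f" ] || continue
    if check_perf_config "$f"; then
        echo "$f: CONFIG_PERF_EVENTS is enabled"
    else
        echo "$f: CONFIG_PERF_EVENTS is NOT enabled"
    fi
done
```

If neither file exists, the loop prints nothing, and the config has to be checked in the kernel build tree instead.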
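3. How perf works
The Linux performance counters subsystem is a kernel-based subsystem that provides a performance analysis framework covering both hardware features (CPU, PMU) and software features (software counters, tracepoints).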
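3.1 tracepoints
Tracepoints are hooks placed throughout the kernel source. They fire when specific code paths are executed, and various trace and debug tools can make use of them. perf records the events that tracepoints generate; by analyzing the resulting reports, you can learn in detail how the kernel behaved while a program ran and diagnose performance symptoms accurately. The nodes corresponding to these tracepoints are under /sys/kernel/debug/tracing/events (a debugfs path, so debugfs must be mounted).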
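The directory layout can be browsed with a few lines of shell. In the sketch below, each subdirectory of the events directory is treated as a tracepoint subsystem, and each subdirectory beneath it as one tracepoint. The helper names list_tracepoint_subsystems and list_tracepoints are hypothetical, and reading the real directory normally requires root.

```shell
# Print the name of every subsystem directory under an events directory.
list_tracepoint_subsystems() {
    dir="$1"
    for d in "$dir"/*/; do
        [ -d "$d" ] || continue      # glob did not match or not readable
        basename "$d"
    done
}

# Print "subsystem:event" for every tracepoint of one subsystem,
# which is the naming form that `perf list tracepoint` uses.
list_tracepoints() {
    dir="$1"
    subsystem="$2"
    for d in "$dir/$subsystem"/*/; do
        [ -d "$d" ] || continue
        printf '%s:%s\n' "$subsystem" "$(basename "$d")"
    done
}

TRACE_EVENTS_DIR=/sys/kernel/debug/tracing/events
list_tracepoint_subsystems "$TRACE_EVENTS_DIR" | head -n 5
list_tracepoints "$TRACE_EVENTS_DIR" sched | head -n 5
```

Plain files in these directories, such as enable, are skipped because the trailing slash in the glob matches directories only.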
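4. Main areas of focus
Performance analysis supports algorithm optimization (trading off space complexity against time complexity) and code optimization (faster execution, lower memory use).
It evaluates how a program uses hardware resources: for example, access counts and miss counts at each cache level, pipeline stall cycles, and front-side bus accesses.
It evaluates how a program uses operating system resources: the number of system calls, context switches, and task migrations.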
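Events fall into three categories:
- Hardware Events are generated by the PMU, which detects whether a performance event occurs under given conditions and how many times, for example cache hits;
- Software Events are generated by the kernel from its various functional modules and count OS-related performance events, for example context switches and timer ticks;
- Tracepoint Events are triggered by static tracepoints in the kernel and reveal details of kernel behavior while a program runs, for example the number of allocations made by the slab allocator.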
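The three categories can be counted side by side in one perf stat run. The sketch below picks one event from each category as an illustration: cache-misses (hardware), context-switches (software), and kmem:kmem_cache_alloc (tracepoint). Whether each is available depends on the machine and kernel, and join_events is a made-up helper.

```shell
# Join event names with commas, the form that `perf stat -e` expects.
join_events() {
    (IFS=,; printf '%s\n' "$*")
}

# One example event per category; availability varies by system.
events=$(join_events cache-misses context-switches kmem:kmem_cache_alloc)

if command -v perf >/dev/null 2>&1; then
    # -a counts system-wide for the duration of `sleep 1`.
    perf stat -a -e "$events" -- sleep 1 ||
        echo "perf stat failed (tracepoints usually need root)" >&2
else
    echo "perf not found; would run: perf stat -a -e $events -- sleep 1"
fi
```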
5. Using perf
Running perf --help lists perf's subcommands:
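No. | Command | Purpose |
1 | annotate | Reads the perf.data file produced by perf record and displays annotated code |
2 | archive | Packs all sampled ELF files into an archive based on the build-ids recorded in the data file; with this archive, the recorded samples can be analyzed on any machine |
3 | bench | perf's built-in benchmarks; currently includes two suites, for the scheduler and the memory management subsystem |
4 | buildid-cache | Manages perf's build-id cache; every ELF file has a unique build-id, which perf uses to associate performance data with ELF files |
5 | buildid-list | Lists all build-ids recorded in the data file |
6 | data | Data file related processing |
7 | diff | Compares two data files and reports the per-symbol (function) differences in the hotspot analysis |
8 | evlist | Lists all performance events in the perf.data data file |
9 | inject | Reads the event stream recorded by perf record and redirects it to standard output; additional events can be injected into the stream at any point in the analyzed code |
10 | kmem | Traces and measures the kernel memory (slab) subsystem |
11 | kvm | Traces and measures guest OSes running in KVM virtual machines |
12 | list | Lists all performance events supported on the current system, including hardware events, software events, and tracepoints |
13 | lock | Analyzes kernel lock information, including lock contention and wait latency |
14 | mem | Profiles memory accesses |
15 | record | Collects samples and stores them in a data file for later analysis by other tools |
16 | report | Reads the data file created by perf record and presents the hotspot analysis |
17 | sched | Analysis tool for the scheduler subsystem |
18 | script | Runs extension scripts written in Perl or Python, generates script skeletons, reads data from the data file, and so on |
19 | stat | Runs a command and collects a performance summary for that process, including CPI, cache miss rate, and so on |
20 | test | Runs sanity tests against the current hardware and software platform to check whether it supports all perf features |
21 | timechart | Visualizes system behavior during the measurement period |
22 | top | Shows the hottest functions on the running system in real time, similar to top |
23 | probe | Defines dynamic tracepoints |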
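5.1 Commands for a system-wide overview
Command | Purpose |
perf list | Lists the performance events supported on the current system; the names it prints are what you pass to perf stat -e x,y,z, where x, y, z are the events to track |
perf bench | Runs baseline benchmarks of system performance |
perf test | Runs sanity tests on the system |
perf stat | Collects overall performance statistics |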
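To make the perf stat -e x,y,z form concrete, the sketch below counts instructions and cycles and derives CPI (cycles per instruction) from them. It assumes perf stat's -x, CSV mode, in which the first field of each line is the counter value and the third is the event name, and that perf stat writes its results to stderr. The helper name compute_cpi is hypothetical.

```shell
# Read `perf stat -x,` CSV lines on stdin and print cycles / instructions.
# Exits non-zero when the instruction count is missing or zero.
compute_cpi() {
    awk -F, '
        { ev = $3; sub(/:.*/, "", ev) }     # drop modifiers such as ":u"
        ev == "instructions"                 { ins = $1 + 0 }
        ev == "cycles" || ev == "cpu-cycles" { cyc = $1 + 0 }
        END {
            if (ins > 0) { printf "%.2f\n", cyc / ins } else { exit 1 }
        }'
}

if command -v perf >/dev/null 2>&1; then
    # Send perf's stderr (the counters) into the pipe, discard sleep's stdout.
    perf stat -x, -e instructions,cycles -- sleep 1 2>&1 >/dev/null | compute_cpi ||
        echo "could not compute CPI (counters unavailable?)" >&2
fi
```

On virtual machines without PMU passthrough the hardware counters are often reported as not supported, in which case the helper exits non-zero instead of printing a misleading value.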
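5.2 Commands for system-wide detail
Command | Purpose |
perf top | Shows in real time which functions, in which processes, are consuming the most CPU |
perf probe | Defines custom dynamic events |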
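5.3 Commands for analyzing specific subsystems:
Command | Purpose |
perf kmem | Analyzes slab subsystem performance |
perf kvm | Analyzes KVM virtualization |
perf lock | Analyzes lock performance |
perf mem | Analyzes memory access performance |
perf sched | Analyzes kernel scheduler performance |
perf trace | Records system call traces |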
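5.4 Commonly used commands
perf record can capture the whole system, a single process, or even a single event within a single process, so it works at both the macro and the micro level.
Command | Purpose |
perf record | Records data to perf.data |
perf report | Generates a report |
perf diff | Diffs two recordings |
perf evlist | Lists the recorded performance events |
perf annotate | Displays annotated function code from perf.data |
perf archive | Packs the relevant symbols so the data can be analyzed on another machine |
perf script | Dumps perf.data as readable text |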
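The recording scopes described at the start of this section map onto perf record options: -a for the whole system, -p PID for one process, and -e EVENT to restrict recording to one event. The sketch below shows that mapping and a minimal record-then-report run. The helper record_scope_args and the output path are assumptions made for this example.

```shell
# Translate a scope keyword into perf record options.
#   system     -> -a         (all CPUs)
#   pid <PID>  -> -p <PID>   (one existing process)
record_scope_args() {
    case "$1" in
        system) printf '%s\n' "-a" ;;
        pid)    printf '%s\n' "-p $2" ;;
        *)      echo "unknown scope: $1" >&2; return 1 ;;
    esac
}

data="${TMPDIR:-/tmp}/perf.data"   # hypothetical output path

if command -v perf >/dev/null 2>&1; then
    # Record this shell's own process ($$), cpu-clock events only, for 1 second,
    # then print the first lines of the text report.
    # The unquoted substitution is intentional: it must split into two words.
    perf record -e cpu-clock $(record_scope_args pid $$) -o "$data" -- sleep 1 &&
        perf report -i "$data" --stdio | head -n 20 ||
        echo "perf record/report failed (permissions?)" >&2
fi
```

The same data file can then be passed with -i to perf script, perf evlist, or perf annotate.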
5.5 Visualization tools
Command | Purpose |
perf timechart record | Records events |
perf timechart | Generates the output.svg file |
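6. Overhead introduced by perf
Measuring with perf inevitably adds overhead. perf collects data in three ways:
- counting: the kernel keeps running totals and reports a summary, mostly for Hardware Events, Software Events, and PMU counters; the relevant command is perf stat;
- sampling: perf buffers event data and writes it asynchronously to the perf.data file, which tools such as perf report then analyze offline;
- bpf: available in kernel 4.4 and later, it allows more efficient filtering and summarized output;
Of these, counting adds the least overhead, sampling can add very high overhead in some situations, and bpf can reduce the overhead substantially.
For sampling, writing the data to a RAM-backed file system reduces the overhead caused by file I/O: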
mkdir /tmpfs
mount -t tmpfs tmpfs /tmpfs
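Once the mount is in place, perf record has to be told to write there with -o. The sketch below first checks that the mount point really is tmpfs by reading /proc/mounts (fields: device, mount point, file system type), so that samples are not silently written to disk. The helper is_tmpfs_mount is a made-up name, and /tmpfs is the mount point from the commands above.

```shell
# Return 0 if the given mount point is listed as tmpfs in a mounts file.
# The second argument defaults to /proc/mounts.
is_tmpfs_mount() {
    mount_point="$1"
    mounts_file="${2:-/proc/mounts}"
    [ -r "$mounts_file" ] || return 1
    awk -v mp="$mount_point" '
        $2 == mp && $3 == "tmpfs" { found = 1 }
        END { exit !found }' "$mounts_file"
}

if is_tmpfs_mount /tmpfs && command -v perf >/dev/null 2>&1; then
    # Write samples to RAM, then read them back from the same place.
    perf record -o /tmpfs/perf.data -- sleep 1 &&
        perf report -i /tmpfs/perf.data --stdio | head -n 20 ||
        echo "perf record/report failed" >&2
else
    echo "/tmpfs is not a tmpfs mount (or perf is missing); nothing recorded"
fi
```

Data on tmpfs is lost on unmount or reboot, so copy perf.data elsewhere if it needs to be kept.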
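7. Listing all events with perf list
perf list groups events by type: hw, cache, and pmu are hardware-related; tracepoint events come from the kernel's tracing infrastructure; sw events are kernel counters.
- hw/hardware lists the supported hardware events, for example: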
ccion@ccion:~$ sudo perf list hardware
List of pre-defined events (to be used in -e):
  branch-instructions OR branches [Hardware event]
  branch-misses [Hardware event]
  bus-cycles [Hardware event]
  cache-misses [Hardware event]
  cache-references [Hardware event]
  cpu-cycles OR cycles [Hardware event]
  instructions [Hardware event]
  ref-cycles [Hardware event]
- sw/software lists the supported software events:
ccion@ccion:~$ sudo perf list sw
List of pre-defined events (to be used in -e):
  alignment-faults [Software event]
  bpf-output [Software event]
  context-switches OR cs [Software event]
  cpu-clock [Software event]
  cpu-migrations OR migrations [Software event]
  dummy [Software event]
  emulation-faults [Software event]
  major-faults [Software event]
  minor-faults [Software event]
  page-faults OR faults [Software event]
  task-clock [Software event]
- cache/hwcache lists the hardware cache events:
ccion@ccion:~$ sudo perf list cache
List of pre-defined events (to be used in -e):
  L1-dcache-load-misses [Hardware cache event]
  L1-dcache-loads [Hardware cache event]
  L1-dcache-stores [Hardware cache event]
  L1-icache-load-misses [Hardware cache event]
  LLC-load-misses [Hardware cache event]
  LLC-loads [Hardware cache event]
  LLC-store-misses [Hardware cache event]
  LLC-stores [Hardware cache event]
  branch-load-misses [Hardware cache event]
  branch-loads [Hardware cache event]
  dTLB-load-misses [Hardware cache event]
  dTLB-loads [Hardware cache event]
  dTLB-store-misses [Hardware cache event]
  dTLB-stores [Hardware cache event]
  iTLB-load-misses [Hardware cache event]
  iTLB-loads [Hardware cache event]
  node-load-misses [Hardware cache event]
  node-loads [Hardware cache event]
  node-store-misses [Hardware cache event]
  node-stores [Hardware cache event]
- pmu lists the supported PMU events:
ccion@ccion:~$ sudo perf list pmu
List of pre-defined events (to be used in -e):
  branch-instructions OR cpu/branch-instructions/ [Kernel PMU event]
  branch-misses OR cpu/branch-misses/ [Kernel PMU event]
  bus-cycles OR cpu/bus-cycles/ [Kernel PMU event]
  cache-misses OR cpu/cache-misses/ [Kernel PMU event]
  cache-references OR cpu/cache-references/ [Kernel PMU event]
  cpu-cycles OR cpu/cpu-cycles/ [Kernel PMU event]
  cstate_core/c3-residency/ [Kernel PMU event]
  cstate_core/c6-residency/ [Kernel PMU event]
  cstate_core/c7-residency/ [Kernel PMU event]
  cstate_pkg/c10-residency/ [Kernel PMU event]
  cstate_pkg/c2-residency/ [Kernel PMU event]
  cstate_pkg/c3-residency/ [Kernel PMU event]
  cstate_pkg/c6-residency/ [Kernel PMU event]
  cstate_pkg/c7-residency/ [Kernel PMU event]
  cstate_pkg/c8-residency/ [Kernel PMU event]
  cstate_pkg/c9-residency/ [Kernel PMU event]
  cycles-ct OR cpu/cycles-ct/ [Kernel PMU event]
  cycles-t OR cpu/cycles-t/ [Kernel PMU event]
  el-abort OR cpu/el-abort/ [Kernel PMU event]
  el-capacity OR cpu/el-capacity/ [Kernel PMU event]
  el-commit OR cpu/el-commit/ [Kernel PMU event]
  el-conflict OR cpu/el-conflict/ [Kernel PMU event]
  el-start OR cpu/el-start/ [Kernel PMU event]
  instructions OR cpu/instructions/ [Kernel PMU event]
  intel_pt// [Kernel PMU event]
  mem-loads OR cpu/mem-loads/ [Kernel PMU event]
  mem-stores OR cpu/mem-stores/ [Kernel PMU event]
  msr/aperf/ [Kernel PMU event]
  msr/mperf/ [Kernel PMU event]
  msr/pperf/ [Kernel PMU event]
  msr/smi/ [Kernel PMU event]
  msr/tsc/ [Kernel PMU event]
  power/energy-cores/ [Kernel PMU event]
  power/energy-gpu/ [Kernel PMU event]
  power/energy-pkg/ [Kernel PMU event]
  power/energy-psys/ [Kernel PMU event]
  power/energy-ram/ [Kernel PMU event]
  ref-cycles OR cpu/ref-cycles/ [Kernel PMU event]
  topdown-fetch-bubbles OR cpu/topdown-fetch-bubbles/ [Kernel PMU event]
  topdown-recovery-bubbles OR cpu/topdown-recovery-bubbles/ [Kernel PMU event]
  topdown-slots-issued OR cpu/topdown-slots-issued/ [Kernel PMU event]
  topdown-slots-retired OR cpu/topdown-slots-retired/ [Kernel PMU event]
  topdown-total-slots OR cpu/topdown-total-slots/ [Kernel PMU event]
  tx-abort OR cpu/tx-abort/ [Kernel PMU event]
  tx-capacity OR cpu/tx-capacity/ [Kernel PMU event]
  tx-commit OR cpu/tx-commit/ [Kernel PMU event]
  tx-conflict OR cpu/tx-conflict/ [Kernel PMU event]
  tx-start OR cpu/tx-start/ [Kernel PMU event]
  uncore_cbox_0/clockticks/ [Kernel PMU event]
  uncore_cbox_1/clockticks/ [Kernel PMU event]
  uncore_cbox_2/clockticks/ [Kernel PMU event]
  uncore_cbox_3/clockticks/ [Kernel PMU event]
  uncore_cbox_4/clockticks/ [Kernel PMU event]
  uncore_imc/data_reads/ [Kernel PMU event]
  uncore_imc/data_writes/ [Kernel PMU event]
- tracepoint lists all supported tracepoints:
ccion@ccion:~$ sudo perf list tracepoint
List of pre-defined events (to be used in -e):
  alarmtimer:alarmtimer_cancel [Tracepoint event]
  alarmtimer:alarmtimer_fired [Tracepoint event]
  alarmtimer:alarmtimer_start [Tracepoint event]
  alarmtimer:alarmtimer_suspend [Tracepoint event]
  block:block_bio_backmerge [Tracepoint event]
  block:block_bio_bounce [Tracepoint event]
  block:block_bio_complete [Tracepoint event]
  block:block_bio_frontmerge [Tracepoint event]
  block:block_bio_queue [Tracepoint event]
  block:block_bio_remap [Tracepoint event]
  block:block_dirty_buffer [Tracepoint event]
  block:block_getrq [Tracepoint event]
  block:block_plug [Tracepoint event]
  block:block_rq_complete [Tracepoint event]
  block:block_rq_insert [Tracepoint event]
  block:block_rq_issue [Tracepoint event]
  block:block_rq_remap [Tracepoint event]
  block:block_rq_requeue [Tracepoint event]
  block:block_sleeprq [Tracepoint event]
  block:block_split [Tracepoint event]
  block:block_touch_buffer [Tracepoint event]
  block:block_unplug [Tracepoint event]
  bpf:bpf_map_create [Tracepoint event]
  bpf:bpf_map_delete_elem [Tracepoint event]
  bpf:bpf_map_lookup_elem [Tracepoint event]
  bpf:bpf_map_next_key [Tracepoint event]
  bpf:bpf_map_update_elem [Tracepoint event]
  bpf:bpf_obj_get_map [Tracepoint event]
  bpf:bpf_obj_get_prog [Tracepoint event]
  bpf:bpf_obj_pin_map [Tracepoint event]
  bpf:bpf_obj_pin_prog [Tracepoint event]
  bpf:bpf_prog_get_type [Tracepoint event]
  bpf:bpf_prog_load [Tracepoint event]
  bpf:bpf_prog_put_rcu [Tracepoint event]
  bridge:br_fdb_add [Tracepoint event]
  bridge:br_fdb_external_learn_add [Tracepoint event]
  bridge:br_fdb_update [Tracepoint event]
  bridge:fdb_delete [Tracepoint event]
  cgroup:cgroup_attach_task [Tracepoint event]
  cgroup:cgroup_destroy_root [Tracepoint event]
  cgroup:cgroup_mkdir [Tracepoint event]
  cgroup:cgroup_release [Tracepoint event]
  cgroup:cgroup_remount [Tracepoint event]
  cgroup:cgroup_rename [Tracepoint event]
  cgroup:cgroup_rmdir [Tracepoint event]
  cgroup:cgroup_setup_root [Tracepoint event]
  cgroup:cgroup_transfer_tasks [Tracepoint event]
  clk:clk_disable [Tracepoint event]
  clk:clk_disable_complete [Tracepoint event]
  clk:clk_enable [Tracepoint event]
  clk:clk_enable_complete [Tracepoint event]
  clk:clk_prepare [Tracepoint event]
  clk:clk_prepare_complete [Tracepoint event]
  clk:clk_set_parent [Tracepoint event]
  clk:clk_set_parent_complete [Tracepoint event]
  clk:clk_set_phase [Tracepoint event]
  clk:clk_set_phase_complete [Tracepoint event]
  clk:clk_set_rate [Tracepoint event]
  clk:clk_set_rate_complete [Tracepoint event]
  clk:clk_unprepare [Tracepoint event]
  clk:clk_unprepare_complete [Tracepoint event]
  cma:cma_alloc [Tracepoint event]
  cma:cma_release [Tracepoint event]
  compaction:mm_compaction_begin [Tracepoint event]
  compaction:mm_compaction_defer_compaction [Tracepoint event]
  compaction:mm_compaction_defer_reset [Tracepoint event]
  compaction:mm_compaction_deferred [Tracepoint event]
  compaction:mm_compaction_end [Tracepoint event]
  compaction:mm_compaction_finished [Tracepoint event]
  compaction:mm_compaction_isolate_freepages [Tracepoint event]
  compaction:mm_compaction_isolate_migratepages [Tracepoint event]
  compaction:mm_compaction_kcompactd_sleep [Tracepoint event]
  compaction:mm_compaction_kcompactd_wake [Tracepoint event]
  compaction:mm_compaction_migratepages [Tracepoint event]
  compaction:mm_compaction_suitable [Tracepoint event]
  compaction:mm_compaction_try_to_compact_pages [Tracepoint event]
  compaction:mm_compaction_wakeup_kcompactd [Tracepoint event]
  cpuhp:cpuhp_enter [Tracepoint event]
  cpuhp:cpuhp_exit [Tracepoint event]
  cpuhp:cpuhp_multi_enter [Tracepoint event]
  dma_fence:dma_fence_destroy [Tracepoint event]
  dma_fence:dma_fence_emit [Tracepoint event]
  dma_fence:dma_fence_enable_signal [Tracepoint event]
  dma_fence:dma_fence_init [Tracepoint event]
  dma_fence:dma_fence_signaled [Tracepoint event]
  dma_fence:dma_fence_wait_end [Tracepoint event]
  dma_fence:dma_fence_wait_start [Tracepoint event]
  drm:drm_vblank_event [Tracepoint event]
  drm:drm_vblank_event_delivered [Tracepoint event]
  drm:drm_vblank_event_queued [Tracepoint event]
  exceptions:page_fault_kernel [Tracepoint event]
  exceptions:page_fault_user [Tracepoint event]
  ext4:ext4_alloc_da_blocks [Tracepoint event]
  ext4:ext4_allocate_blocks [Tracepoint event]
  ext4:ext4_allocate_inode [Tracepoint event]
  ext4:ext4_begin_ordered_truncate [Tracepoint event]
  ext4:ext4_collapse_range [Tracepoint event]
  ext4:ext4_da_release_space [Tracepoint event]
  ext4:ext4_da_reserve_space [Tracepoint event]
  ext4:ext4_da_update_reserve_space [Tracepoint event]
  ext4:ext4_da_write_begin [Tracepoint event]
  ext4:ext4_da_write_end [Tracepoint event]
  ext4:ext4_da_write_pages [Tracepoint event]
  ext4:ext4_da_write_pages_extent [Tracepoint event]
  ext4:ext4_direct_IO_enter [Tracepoint event]
  ext4:ext4_direct_IO_exit [Tracepoint event]
  ext4:ext4_discard_blocks [Tracepoint event]
  ext4:ext4_discard_preallocations [Tracepoint event]
.......
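Because the full listing runs to thousands of lines on a typical system, a per-type summary is often easier to read. The sketch below counts events by the bracketed type label at the end of each perf list line, such as [Hardware event] or [Tracepoint event]. It assumes every event line ends with such a label, as in the listings above; count_event_types is a hypothetical helper name.

```shell
# Read `perf list` output on stdin and print "<type><TAB><count>" per type.
count_event_types() {
    # Keep only the text inside the final [...] on each line, then tally.
    sed -n 's/.*\[\(.*\)][[:space:]]*$/\1/p' |
        awk '{ count[$0]++ }
             END { for (t in count) printf "%s\t%d\n", t, count[t] }' |
        sort
}

if command -v perf >/dev/null 2>&1; then
    perf list 2>/dev/null | count_event_types
fi
```

Header lines such as "List of pre-defined events (to be used in -e):" contain no square brackets and are ignored.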