Initialize the HPM (Hardware Performance Monitor). Call this routine once during system initialization.
.option norvc
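.global hpm_get_icache_miss_rate
# Despite its name, this routine only resets, configures, and starts the counters.
hpm_get_icache_miss_rate:
# Inhibit all counters while they are reset and configured.
li x5, 0xffffffff
csrw mcountinhibit, x5
# Zero the cycle, instret, and HPM counters.
csrw mcycle, x0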
csrw minstret, x0
csrw mhpmcounter3, x0
csrw mhpmcounter4, x0
csrw mhpmcounter5, x0
csrw mhpmcounter6, x0
csrw mhpmcounter7, x0
csrw mhpmcounter8, x0
csrw mhpmcounter9, x0
csrw mhpmcounter10, x0
csrw mhpmcounter11, x0
csrw mhpmcounter12, x0
csrw mhpmcounter13, x0
csrw mhpmcounter14, x0
csrw mhpmcounter15, x0
csrw mhpmcounter16, x0
csrw mhpmcounter17, x0
csrw mhpmcounter18, x0
csrw mhpmcounter19, x0
csrw mhpmcounter20, x0
csrw mhpmcounter21, x0
csrw mhpmcounter22, x0
csrw mhpmcounter23, x0
csrw mhpmcounter24, x0
csrw mhpmcounter25, x0
csrw mhpmcounter26, x0
csrw mhpmcounter27, x0
csrw mhpmcounter28, x0
csrw mhpmcounter29, x0
csrw mhpmcounter30, x0
csrw mhpmcounter31, x0
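# Bind an event to each counter in use. Event IDs are implementation-defined; check the core's manual.
# Judging by the output further down, event 1 (counter3) counts ICache accesses and event 2 (counter4) counts ICache misses.
li x5, 1
csrw mhpmevent3, x5
li x5, 2
csrw mhpmevent4, x5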
li x5, 3
csrw mhpmevent5, x5
li x5, 4
csrw mhpmevent6, x5
li x5, 5
csrw mhpmevent7, x5
li x5, 6
csrw mhpmevent8, x5
li x5, 7
csrw mhpmevent9, x5
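# Counters 10-12 are left without an event.
# Counters 13-17 are the ones printed later as sto, dcr, dcrm, dcw, and dcwm.
li x5, 0xb
csrw mhpmevent13, x5
li x5, 0xc
csrw mhpmevent14, x5
li x5, 0xd
csrw mhpmevent15, x5
li x5, 0xe
csrw mhpmevent16, x5
li x5, 0xf
csrw mhpmevent17, x5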
# Allow lower privilege modes to read the counters.
li x5, 0xffffffff
csrw mcounteren, x5
csrw scounteren, x5
# Clear the inhibit bits so that counting starts.
csrw mcountinhibit, x0
ret
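The routine above writes 0xffffffff to mcountinhibit, mcounteren, and scounteren, which covers every counter at once. As a rough sketch, a mask for only the counters that actually get an event could be built as below. The bit layout follows the RISC-V privileged specification: bit 0 is CY (cycle), bit 2 is IR (instret), and bits 3 to 31 map to mhpmcounter3 to mhpmcounter31; bit 1 is TM in mcounteren and reads as zero in mcountinhibit. The names `hpm_range_mask` and `hpm_used_mask` are hypothetical helpers, not part of the original code.

```c
#include <stdint.h>

#define HPM_BIT_CY (1u << 0)  /* cycle counter */
#define HPM_BIT_TM (1u << 1)  /* time (mcounteren only; zero in mcountinhibit) */
#define HPM_BIT_IR (1u << 2)  /* instret counter */

/*
 * Hypothetical helper: returns a mask with bits first..last set, one bit per
 * mhpmcounter<first>..mhpmcounter<last>. Valid indices are 3..31; any other
 * range yields 0.
 */
static uint32_t hpm_range_mask(unsigned first, unsigned last)
{
    if (first < 3 || last > 31 || first > last)
        return 0;

    /* All bits from 0 up to and including 'last'. */
    uint32_t upto_last = (last == 31) ? 0xffffffffu
                                      : ((1u << (last + 1)) - 1u);
    /* All bits strictly below 'first'. */
    uint32_t below_first = (1u << first) - 1u;

    return upto_last & ~below_first;
}

/* Counters that the assembly routine binds to an event: 3..9 and 13..17. */
static uint32_t hpm_used_mask(void)
{
    return hpm_range_mask(3, 9) | hpm_range_mask(13, 17);
}
```

The value of `hpm_used_mask() | HPM_BIT_CY | HPM_BIT_IR` could then be written to mcounteren instead of 0xffffffff if only those counters should be visible to lower privilege modes.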
Wherever cache misses need to be measured, read the counter registers:
DBG("counter3: %lx,counter4: %lx, dcr: 0x%lx, dcrm: 0x%lx dcw: 0x%lx, dcwm: 0x%lx, sto:0x%lx\n",
    csr_read(mhpmcounter3), csr_read(mhpmcounter4),
    csr_read(mhpmcounter14), csr_read(mhpmcounter15),
    csr_read(mhpmcounter16), csr_read(mhpmcounter17),
    csr_read(mhpmcounter13));
Test: query the current counter values from the timer interrupt handler:
riscv_timer_interrupt line 43, counter3: 18c970e9d8,counter4: 28
riscv_timer_interrupt line 43, counter3: 18c9bb2df2,counter4: 28
riscv_timer_interrupt line 43, counter3: 18ca05720f,counter4: 28
riscv_timer_interrupt line 43, counter3: 18ca4fb62c,counter4: 28
riscv_timer_interrupt line 43, counter3: 18ca99fa49,counter4: 28
riscv_timer_interrupt line 43, counter3: 18cae43e67,counter4: 28
riscv_timer_interrupt line 43, counter3: 18cb2e8277,counter4: 28
riscv_timer_interrupt line 43, counter3: 18cb78c68a,counter4: 28
riscv_timer_interrupt line 43, counter3: 18cbc30a99,counter4: 28
riscv_timer_interrupt line 43, counter3: 18cc0d4eb6,counter4: 28
riscv_timer_interrupt line 43, counter3: 18cc5792cf,counter4: 28
riscv_timer_interrupt line 43, counter3: 18cca1d6ec,counter4: 28
riscv_timer_interrupt line 43, counter3: 18ccec1afc,counter4: 28
riscv_timer_interrupt line 43, counter3: 18cd365f0f,counter4: 28
riscv_timer_interrupt line 43, counter3: 18cd80a31e,counter4: 28
riscv_timer_interrupt line 43, counter3: 18cdcae73b,counter4: 28
riscv_timer_interrupt line 43, counter3: 18ce152b4a,counter4: 28
riscv_timer_interrupt line 43, counter3: 18ce5f6f5d,counter4: 28
riscv_timer_interrupt line 43, counter3: 18cea9b36d,counter4: 28
riscv_timer_interrupt line 43, counter3: 18cef3f78a,counter4: 28
riscv_timer_interrupt line 43, counter3: 18cf3e3b99,counter4: 28
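Reading counter3 and counter4 gave the ICache miss rate on the RTOS. In the output above counter3 grows continuously while counter4 stays at 0x28, so counter3 is the access count and counter4 the miss count, and the miss rate is counter4/counter3. There appears to be no way, however, to obtain the address of the access that missed; whether the address at the moment of a miss can be recovered at all remains an open question. The counters are better viewed as a statistical summary of how a stretch of code behaves. To optimize a program, measure the cache miss rate separately for each code section and analyze the sections whose miss rate is high.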
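Measuring per code section comes down to taking a snapshot of the counters before and after the section and working with the differences. The sketch below shows that arithmetic only; it assumes counter3 counts ICache accesses and counter4 counts ICache misses, as the log output suggests, and that the snapshots are filled in by the caller (for example with csr_read). The types and function names are hypothetical.

```c
#include <stdint.h>

/* One snapshot of the two ICache counters, filled in by the caller. */
struct hpm_sample {
    uint64_t accesses;  /* assumed: value read from mhpmcounter3 */
    uint64_t misses;    /* assumed: value read from mhpmcounter4 */
};

/* Events that occurred between two snapshots. */
struct hpm_delta {
    uint64_t accesses;
    uint64_t misses;
};

/*
 * Difference between two snapshots. Unsigned subtraction stays correct
 * across a single wraparound, assuming the counters are a full 64 bits wide.
 */
static struct hpm_delta hpm_section_delta(struct hpm_sample begin,
                                          struct hpm_sample end)
{
    struct hpm_delta d;
    d.accesses = end.accesses - begin.accesses;
    d.misses   = end.misses - begin.misses;
    return d;
}

/* Miss rate for a section as misses / accesses; 0.0 if nothing was accessed. */
static double hpm_miss_rate(struct hpm_delta d)
{
    if (d.accesses == 0)
        return 0.0;
    return (double)d.misses / (double)d.accesses;
}
```

Applied to the first two log lines above, the access delta is 0x4a441a and the miss delta is 0, which gives a miss rate of 0 for that timer interval.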
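To summarize the use of the HPM:
1. Different event counters use different units, which can be read from the event description. For example, the time counter holds the current value of the system timer, whereas a counter_i configured as the L1 ICache miss counter counts the number of L1 ICache access misses;
2. As noted above, once one counter is configured to count cache misses (ICache, DCache read, or DCache write), reading it returns the number of missed cache accesses. Configure a second counter counter_i to count the corresponding cache accesses, then divide the miss count by the access count to get the cache miss rate.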
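Since the debug output in this setup prints integers, the division described in item 2 may be easier to report without floating point. The following is a minimal sketch that expresses the miss rate in parts per million using only 64-bit integer arithmetic; `hpm_miss_ppm` is a hypothetical name, and the scaling loop is just one way to avoid overflow.

```c
#include <stdint.h>

/*
 * Miss rate in parts per million, truncated toward zero.
 * Returns 0 when no accesses were counted.
 */
static uint64_t hpm_miss_ppm(uint64_t misses, uint64_t accesses)
{
    if (accesses == 0)
        return 0;

    /* Halve both values together until misses * 1000000 fits in 64 bits. */
    while (misses > UINT64_MAX / 1000000u) {
        misses >>= 1;
        accesses >>= 1;
    }

    /* Only reachable if misses vastly exceeded accesses, which is not a
     * meaningful cache measurement; report 0 rather than divide by zero. */
    if (accesses == 0)
        return 0;

    return (misses * 1000000u) / accesses;
}
```

With the values from the first log line (0x28 misses against 0x18c970e9d8 accesses) the result truncates to 0 ppm, so a finer unit would be needed to see a miss rate that small.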
The end!