Consider the kernel module testhrarr.c given in the Stack Overflow question "Watch a variable (memory address) change in Linux kernel, and print stack trace when it changes?", which tracks write accesses to a memory location.
Now I'm trying to track read/write accesses, which I've done by changing this line in the testhrarr_init function:
attr.bp_type = HW_BREAKPOINT_W | HW_BREAKPOINT_R;
My first subquestion is this: on my platform (Linux kernel 2.6.38), HW_BREAKPOINT_W on its own works, and HW_BREAKPOINT_W | HW_BREAKPOINT_R for read/write works too. However, requesting only read access with HW_BREAKPOINT_R doesn't work: I get "Breakpoint registration failed" in /var/log/syslog, and insmod fails with "insmod: error inserting './testhrarr.ko': -1 Invalid parameters". Does anyone have any idea why?
Anyway, the problem here is that once I'm watching for read/write access with HW_BREAKPOINT_W | HW_BREAKPOINT_R, I cannot tell in my handler whether the stack trace I'm getting is due to a read or a write access. I have gone through:
- struct perf_event_attr in include/linux/perf_event.h
- struct hw_perf_event in include/linux/perf_event.h
- include/linux/hw_breakpoint.h for the HW_BREAKPOINT_* enum definitions
... and I couldn't find anywhere an explicit way to determine, from within the hw breakpoint handler, whether it was a read or a write access that triggered it. I only found something in kernel/trace/trace_ksym.c, but trace_ksym has been deprecated (it isn't even present in my 2.6.38), and besides, it simply reads attr.bp_type, which is what we ourselves set in the kernel module.
So, after some brute-forcing, I realized that hw_perf_event->interrupts may contain this information, so I modified the handler callback as follows:
static void sample_hbp_handler(struct perf_event *bp,
                               struct perf_sample_data *data,
                               struct pt_regs *regs)
{
    struct perf_event_attr attr = bp->attr;
    struct hw_perf_event hw = bp->hw;
    char hwirep[8];

    // it looks like printing %llu, data->type here causes segfault/oops when `cat` runs?
    // apparently, hw.interrupts changes depending on read/write access (1 or 2)
    // when only HW_BREAKPOINT_W, getting hw.interrupts == 1 always;
    // only HW_BREAKPOINT_R - fails for me
    // when both, hw.interrupts is either 1 or 2
    // defined in include/linux/hw_breakpoint.h:
    // HW_BREAKPOINT_R = 1, HW_BREAKPOINT_W = 2,
    if (hw.interrupts == HW_BREAKPOINT_R) {
        strcpy(hwirep, "_R");
    } else if (hw.interrupts == HW_BREAKPOINT_W) {
        strcpy(hwirep, "_W");
    } else {
        strcpy(hwirep, "__");
    }

    printk(KERN_INFO "+--- %s value is accessed (.bp_type %d, .type %d, state %d htype %d hwi %llu / %s ) ---+\n",
           ksym_name, attr.bp_type, attr.type, hw.state, hw.info.type, hw.interrupts, hwirep);
    dump_stack();
    printk(KERN_INFO "|___ Dump stack from sample_hbp_handler ___|\n");
}
... and this seems to work somewhat, as I get the following in syslog:
$ grep "testhrarr_arr_first value" /var/log/syslog
kernel: [ 200.887620] +--- testhrarr_arr_first value is accessed (.bp_type 3, .type 5, state 0 htype 131 hwi 1 / _R ) ---+
kernel: [ 200.892163] +--- testhrarr_arr_first value is accessed (.bp_type 3, .type 5, state 0 htype 131 hwi 1 / _R ) ---+
kernel: [ 200.892634] +--- testhrarr_arr_first value is accessed (.bp_type 3, .type 5, state 0 htype 131 hwi 2 / _W ) ---+
kernel: [ 200.912192] +--- testhrarr_arr_first value is accessed (.bp_type 3, .type 5, state 0 htype 131 hwi 1 / _R ) ---+
kernel: [ 200.912713] +--- testhrarr_arr_first value is accessed (.bp_type 3, .type 5, state 0 htype 131 hwi 2 / _W ) ---+
kernel: [ 200.932138] +--- testhrarr_arr_first value is accessed (.bp_type 3, .type 5, state 0 htype 131 hwi 1 / _R ) ---+
... however, if attr.bp_type is just HW_BREAKPOINT_W, then just three hits are reported, all with hw.interrupts == 1 (which the above code would then wrongly report as _R).
So then, if I just invert the meanings of _R and _W, I may get something that matches what I guess should occur, but this is quite obviously a shot in the dark, since I have no idea what hw_perf_event->interrupts actually stands for.
So - does anyone know the proper way of determining the "direction" (read or write) of access to a hardware-watched memory location?
Edit: the answer for my first subquestion: for my architecture, x86, there is this piece of code:
http://lxr.free-electrons.com/source/arch/x86/kernel/hw_breakpoint.c?v=2.6.38#L252
static int arch_build_bp_info(struct perf_event *bp)
{
    struct arch_hw_breakpoint *info = counter_arch_bp(bp);

    info->address = bp->attr.bp_addr;

    /* Type */
    switch (bp->attr.bp_type) {
    case HW_BREAKPOINT_W:
        info->type = X86_BREAKPOINT_WRITE;
        break;
    case HW_BREAKPOINT_W | HW_BREAKPOINT_R:
        info->type = X86_BREAKPOINT_RW;
        break;
    case HW_BREAKPOINT_X:
        info->type = X86_BREAKPOINT_EXECUTE;
        // ...
    default:
        return -EINVAL;
    }
    ...
}
... which clearly shows that, for data accesses, x86 accepts only _WRITE or _RW breakpoints (plus _EXECUTE for instructions); so if we try to set up just HW_BREAKPOINT_R, we hit the default case and registration fails with -EINVAL.
So, I guess, I'd need the answer here primarily for X86, although if there is a generic portable mechanism for determining read/write access, I'd rather know about that...