[CRIU] x86: Hardware breakpoints are not always triggered

Oleg Nesterov oleg at redhat.com
Thu Jan 28 14:01:12 PST 2016


On 01/28, Oleg Nesterov wrote:
>
> On 01/28, Andrey Wagin wrote:
> >
> > We use hardware breakpoints in CRIU and we found that sometimes we set
> > a break-point, but a process doesn't stop on it.
>
> Reproduced, and this certainly looks like a KVM bug to me.
>
> > The reproducer uses a different break-point address if it is executed
> > with arguments than when it is executed without arguments.
>
> IOW, multiple processes running in parallel use the same debug register db0
> but with different addresses. And it seems that set_debugreg(address, 0)
> sometimes doesn't work in the guest kernel.
>
> I think I verified the following:
>
> 	- debug registers always look correct as seen by the guest.
> 	  I used get_debugreg() to dump them after the task misses bp.
>
> 	- do_debug() was not called in this case.
>
> 	- finally, it seems that the host has the wrong value in db0
> 	  set by another process.
>
> 	  I modified your test-case so that child2() calls child() when
> 	  it detects the missed bp, and this does trigger do_debug/etc
> 	  while it should not.

See another test-case below.

I am running "./bp 0 1" on the host and "./bp 14 15" under QEMU; this
immediately leads to

	ERR!! hit wrong bp 0 != 14
	ERR!! hit wrong bp 0 != 14
	ERR!! hit wrong bp 0 != 14
	ERR!! hit wrong bp 1 != 14
	...

Oleg.
-------------------------------------------------------------------------------

#include <unistd.h>
#include <signal.h>
#include <stdlib.h>
#include <stdio.h>
#include <sys/wait.h>
#include <sys/ptrace.h>
#include <sys/user.h>
#include <asm/debugreg.h>
#include <assert.h>

#define offsetof(TYPE, MEMBER) ((size_t) &((TYPE *)0)->MEMBER)

unsigned long encode_dr7(int drnum, int enable, unsigned int type, unsigned int len)
{
	unsigned long dr7;

	dr7 = ((len | type) & 0xf)
		<< (DR_CONTROL_SHIFT + drnum * DR_CONTROL_SIZE);
	if (enable)
		dr7 |= (DR_GLOBAL_ENABLE << (drnum * DR_ENABLE_SIZE));

	return dr7;
}

int write_dr(int pid, int dr, unsigned long val)
{
	return ptrace(PTRACE_POKEUSER, pid,
			offsetof (struct user, u_debugreg[dr]),
			val);
}

void set_bp(pid_t pid, void *addr)
{
	unsigned long dr7;
	assert(write_dr(pid, 0, (long)addr) == 0);
	dr7 = encode_dr7(0, 1, DR_RW_EXECUTE, DR_LEN_1);
	assert(write_dr(pid, 7, dr7) == 0);
}

void *get_rip(int pid)
{
	return (void*)ptrace(PTRACE_PEEKUSER, pid,
			offsetof(struct user, regs.rip), 0);
}

void test(int nr)
{
	void *bp_addr = &&label + nr, *bp_hit;
	int pid;

	printf("test bp %d\n", nr);
	assert(nr < 16); // see 16 asm nops below

	pid = fork();
	if (!pid) {
		assert(ptrace(PTRACE_TRACEME, 0,0,0) == 0);
		kill(getpid(), SIGSTOP);
		for (;;) {
			label: asm (
				"nop; nop; nop; nop;"
				"nop; nop; nop; nop;"
				"nop; nop; nop; nop;"
				"nop; nop; nop; nop;"
			);
		}
	}

	assert(pid == wait(NULL));
	set_bp(pid, bp_addr);

	for (;;) {
		assert(ptrace(PTRACE_CONT, pid, 0, 0) == 0);
		assert(pid == wait(NULL));

		bp_hit = get_rip(pid);
		if (bp_hit != bp_addr)
			fprintf(stderr, "ERR!! hit wrong bp %ld != %d\n",
				bp_hit - &&label, nr);
	}
}

int main(int argc, const char *argv[])
{
	while (--argc) {
		int nr = atoi(*++argv);
		if (!fork())
			test(nr);
	}

	while (wait(NULL) > 0)
		;
	return 0;
}


