Understanding kcov – play with -fsanitize-coverage=trace-pc from the user space

kcov is a kernel feature used to support syzkaller[1]. To provide the code coverage information from the kernel itself, the GCC compiler was patched to instrument the kernel image[2]. The kernel itself was also patched to enable this feature where is propriate[3]. This post tries to reproduce the essense of kcov in the user space, in the hope of a better understanding of kcov in general.

1. -fsanitize-coverage=trace-pc

After (>=) 6.0, GCC supports this new feature/flag, which instruments every basic block generated by GCC with function “trace-pc”. This “trace-pc” function is provided by the user, and should have the name as “__sanitizer_cov_trace_pc“. For kcov, this function is implemented in kernel/kcov.c. By the time of writing, there is no user-space example available using this new flag, primarily because this feature is mainly designed for kernel/syzkaller, and gcov is available for user-space programs already. Nevertheless, we will show how to play this in the user space:)

2. First try

Let’s create a “non-trivial” testing file and a makefile, as shown below:

#include <stdio.h>
#include <string.h>

void __sanitizer_cov_trace_pc(void)
{
	printf("code instrumented...\n");
}

static int fun2(void)
{
	printf("fun2\n");
	return 0;
}

static void fun1(int a)
{
	fun2();
	printf("fun1: a[%d]\n", a);
}

int main(void)
{
	int a = 10;
	fun1(a);
	return 0;
}

CC=gcc
CFLAGS=-g -ggdb -fsanitize-coverage=trace-pc
LIBS=
DEPS=
OBJ=test.o

%.o: %.c $(DEPS)
	$(CC) -c -o $@ $< $(CFLAGS) $(LIBS)

test : $(OBJ)
	$(CC) -o $@ $^ $(CFLAGS) $(LIBS)

.PHONY: clean

clean:
	rm -f *.o test

Function “__sanitizer_cov_trace_pc” is defined in test.c file. The Makefile also enables the “-fsanitize-coverage” flag. We compile it, run it, and as you expected, get coredump. What the heck? Disassembling the binary gives us some hints:

0000000000400546 <__sanitizer_cov_trace_pc>::
  400546:	55                   	push   %rbp
  400547:	48 89 e5             	mov    %rsp,%rbp
  40054a:	e8 f7 ff ff ff       	callq  400546 <__sanitizer_cov_trace_pc>
  40054f:	bf 80 06 40 00       	mov    $0x400680,%edi
  400554:	e8 d7 fe ff ff       	callq  400430 <puts@plt>
  400559:	90                   	nop
  40055a:	5d                   	pop    %rbp
  40055b:	c3                   	retq

Besides main, fun1, fun2 are instrumented by __sanitizer_cov_trace_pc, __sanitizer_cov_trace_pc itself is also instrumented by the compiler, which creates a recursive bomb, and stack overflow eventually. So, we need to tell GCC not to instrument the instrumenting function itself.

3. Second try

If we look at the kernel patch[3] again, the __sanitizer_cov_trace_pc is defined with “notrace” decorator, which is essentially an attribute to let GCC skip instrumentation for the decorated function. All we need to do is to update our __sanitizer_cov_trace_pc function in test.c with this decorator:

#define notrace __attribute__((no_instrument_function))

void notrace __sanitizer_cov_trace_pc(void)
{
	printf("code instrumented...\n");
}

Compile, run, and unfortunately coredump again. Disassembly shows the recursive bomb is still there. WTF!

4. Do it again

If we look at the old makefile, we have __sanitizer_cov_trace_pc defined inside the test.c and shared the same compilation flags as other functions, including “-fsanitize-coverage”. It is time to split the function into a separated file and use a different set of compilation flags. Now we have trace.c, trace.h, test.c, and a new makefile, as shown below:

#include <stdio.h>

#define notrace __attribute__((no_instrument_function))

void notrace __sanitizer_cov_trace_pc(void)
{
	printf("code instrumented...\n");
}

#define notrace __attribute__((no_instrument_function))
void notrace __sanitizer_cov_trace_pc(void);

#include <stdio.h>
#include <string.h>
#include "trace.h"

static int fun2(void)
{
	printf("fun2\n");
	return 0;
}

static void fun1(int a)
{
	fun2();
	printf("fun1: a[%d]\n", a);
}

int main(void)
{
	int a = 10;
	fun1(a);
	return 0;
}

CC=gcc
CFLAGS=-g -ggdb -fsanitize-coverage=trace-pc
LIBS=
DEPS=trace.h
OBJ=test.o trace.o

all: test

trace.o: trace.c
	$(CC) -c -o $@ $<

test.o: test.c
	$(CC) -c -o $@ $< $(CFLAGS) $(LIBS)

test : $(OBJ)
	$(CC) -o $@ $^ $(CFLAGS) $(LIBS)

.PHONY: clean

clean:
	rm -f *.o test

Compile, run, and it works:

[daveti@daveti fuzz]$ ./test
code instrumented...
code instrumented...
code instrumented...
fun2
code instrumented...
fun1: a[10]
code instrumented...

5. Trace PC

The final missing part of our user-space implementation comparing to kcov is the PC tracing functionality. Fortunately, GCC already provides a builtin command to retrieve the return address pushed on the stack. As what linux kernel does, we update trace.c to enable PC tracing:

#include <stdio.h>

#define notrace	__attribute__((no_instrument_function))
#define _RET_IP_	(unsigned long)__builtin_return_address(0)

void notrace __sanitizer_cov_trace_pc(void)
{
	//printf("code instrumented...\n");
	printf("return pc [0x%x]\n", _RET_IP_);
}

Compile, run, and use addr2line to get debugging information from these addresses. Doesn’t it look almost the same as kcov!

[daveti@daveti fuzz2]$ ./test
return pc [0x4005ab]
return pc [0x400581]
return pc [0x400554]
fun2
return pc [0x400568]
fun1: a[10]
return pc [0x4005c6]
[daveti@daveti fuzz2]$ addr2line -e ./test 0x4005ab
/home/daveti/c/fuzz2/test.c:19

6. Notes

One observation we can easily find is the overhead of this instrumentation. Since the instrumentation happens per basic block instead of functions, it is trivial to have bigger number of instrumentations than the number of actual function calls. For example, in the our example, we have 3 functions, and 5 instrumentations. This is also the reason why not every kernel subsystem/component enables kcov (besides, this instrumentation could break kernel booting). Again, this post tries to mimic what kcov does from the user space. For real user space coverage instrumentation, there is gcov[4] already.

[1] https://github.com/google/syzkaller
[2] http://gcc.1065356.n8.nabble.com/Re-Add-fuzzing-coverage-support-td1212322.html
[3] https://github.com/torvalds/linux/commit/5c9a8750a6409c63a0f01d51a9024861022f6593
[4] https://gcc.gnu.org/onlinedocs/gcc/Gcov.html

Understanding kcov – play with -fsanitize-coverage=trace-pc from the user space

About daveti

1 Response to Understanding kcov – play with -fsanitize-coverage=trace-pc from the user space

Leave a comment Cancel reply

Categories

Resource

Tags

Blog Stats