Hacking Valgrind

This post talks about 3 commits I have recently added into my own valgrind tree [1], including the support for fsgsbase instructions, rdrand/rdseed instructions, and adding a new trapdoor (client request) to support gdb-like add-symbol-file command. Note that all these new features are not available in the mainstream valgrind by the time of writing, and I am not planning to work on mainstreaming anyway. Nevertheless, feel free the patch your own valgrind if needed. My work is supported by Fortanix [5].

1. Support for fsgsbase

fsgsbase instructions allow user space to read [6] or write [7] the FS or GS register base on x86_64 architecture, enabling indirect addressing mode using FS/GS, such as “mov %GS:0x10, %rax”. Surprisingly, the most challenging part (for me) was the decoding of amd64/x86_64 instructions. I am not interested in repeating how fucked-up this encoding mechanism is but only remind readers that opcode is USELESS on this architecture. Anyway, once we figure how to decode fsgsbase instructions in valgrind, we are able to generate the corresponding VEX IRs.

Although FS/GS base update from the user space is not supported, valgrind has FS/GS base registers built inside the guest VM state. Valgrind even hooks arch_prctl() syscall to update those guest registers. For us, we need to remove all those constrains assuming a constant FS/GS, and allow fsgsbase instructions to update FS/GS base registers in the guest. Because valgrind is emulating FS/GS in the guest, there is no need to check for the real hardware support for these instructions on the host. For details of the patch, please check [2].

2. Support for rdrand/rdseed

rdrand call the TRNG available inside the CPU to generate a random number [8]. rdseed is similar although focusing on providing random seed for PRNG [9]. The difference between them can be found at [10]. Unlike the fsgsbase instructions, valgrind needs to check whether or not the host CPU supports rdrand/rdseed when encountering these instructions in the client program, and delegate the acutal execution to the real CPU on the host. (Although we could emulate these instructions in valgrind as well, faithfully executing them is more preferred especially when the CPU supports these instructions.)

Once we have extended CPUID to detect these instructions on the host CPU, we can start to write down “dirtyhelpers” for rdrand/rdseed, which are the actual rdrand/rdseed instructions running on the real CPU. Because these instructions may fail (non-block, carry flag not set), we need to do a loop on the carry flag, making sure we return the right rand/seed to the guest. Similarly, a sane implementation of rdrand/rdseed within the client program should also do a loop on the carry flag. This means we need set the carry flag in the rflags of the guest VM state to help the client program move forward. Turns out it is not easy to do this in valgrind, because the rflags is not listed as other registers of the guest VM state explicitly. Instead, all these flags need to be computed based on the operation of the current instruction.

BTW, rdrand/rdseed is also a good example of the pathologicial design of x86_64 instruction encoding. They have the same opcode as cmpxchg8b and cmpxchg16b. For details of this patch, please check [3].

3. A new trapdoor: add-symbol-file

GDB supports loading symbols manually using add-symbol-file command. It is useful when GDB could not figure out what was loaded at certain VA range (thus ??? in the backtrace). Unfortunately, valgrind does not have such a mechanism. As a result, valgrind could not recognize any memory mapping not directly triggered by mmap() syscall, e.g., memcpy from VA1 to VA2. It also means valgrind could not recognize a binary doing a reloation by itself after the first mmap(), such as loader. Based on these considerations, we add a new valgrind trapdoor (client request) — VALGRIND_ADD_SYMBOL_FILE, allowing a client program to pass the memory mapping information to valgrind. It accepts 3 arguments, the file name of the mapping, e.g., a shared object, the starting mapping address (page aligned), and the length of the mapping. For details of this this patch, please check [4].

Reference:

[1] https://github.com/daveti/valgrind
[2] https://github.com/daveti/valgrind/commit/16ccd1974ce2ca13e10adac9906de5bc689c509d
[3] https://github.com/daveti/valgrind/commit/5986cc4a0c6bf2d41822df15e8f074437c32e391
[4] https://github.com/daveti/valgrind/commit/baa7d6b344a539b8842d7c157ab67af990213500
[5] https://fortanix.com/
[6] https://www.felixcloutier.com/x86/RDFSBASE:RDGSBASE.html
[7] https://www.felixcloutier.com/x86/WRFSBASE:WRGSBASE.html
[8] https://www.felixcloutier.com/x86/RDRAND.html
[9] https://www.felixcloutier.com/x86/RDSEED.html
[10] https://software.intel.com/en-us/blogs/2012/11/17/the-difference-between-rdrand-and-rdseed

About daveti

Interested in kernel hacking, compilers, machine learning and guitars.
This entry was posted in Dave's Tools, Programming and tagged , , , , , , , , , . Bookmark the permalink.

Leave a Reply

Fill in your details below or click an icon to log in:

WordPress.com Logo

You are commenting using your WordPress.com account. Log Out /  Change )

Twitter picture

You are commenting using your Twitter account. Log Out /  Change )

Facebook photo

You are commenting using your Facebook account. Log Out /  Change )

Connecting to %s

This site uses Akismet to reduce spam. Learn how your comment data is processed.