gcc, llvm, and Linux kernel

This post looks at a recent discussion on the Linux kernel mailing list. It does not dig into compiler internals or the whole picture of how the Linux kernel and compilers interact; instead, we discuss two specific issues, one from gcc and one from llvm. The gcc issue may be a quirk, but the llvm issue is definitely a bug. Keep reading…

1. leal %P1(%%esp),%0

The title is the inline assembly used at arch/x86/boot/main.c line 121. What looks odd is the ‘P’ in ‘%P1’, which is far less common than the plain ‘%1’ we usually see in gcc inline assembly. So what the heck is it[1]? Let us put this kernel inline assembly into a main function where we can play with gcc easily:

#include <stdio.h>
#define STACK_SIZE	512
static int stack_end;

int main()
{
	/* %P1 prints the constant operand bare, without the AT&T '$' prefix */
	asm("leal %P1(%%esp),%0"
		: "=r" (stack_end)
		: "i" (-STACK_SIZE));

	return 0;
}
Then we compile the code to assembly (gcc -S) and look at the output, where we can see the inline assembly is expanded as follows:

leal -512(%esp),%eax

This is exactly what we want for ‘leal’. In other words, gcc does not complain about the ‘P’ at all. What if we remove the ‘P’ and look at the assembly again? After a quick trial, here is what gcc generates:

leal $-512(%esp),%eax

Oops, gcc recognizes that ‘%1’ is an immediate value and prepends ‘$’ (AT&T style) automatically. This is right in most cases but definitely wrong for the displacement of a ‘lea’. As a matter of fact, if I try to compile the code all the way through, the build fails. Now it is clear that the tricky ‘P’ in ‘%P1’ is there to make gcc print the constant bare, without the ‘$’ prefix. Note that I am using gcc 4.9.2. The latest gcc (5/6?) seems to have fixed this quirk already – generating the same, correct assembly with or without the mysterious ‘P’. Go try it yourself.
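For comparison, gcc also documents a generic ‘c’ operand modifier that requires a constant operand and prints it with no punctuation, which gives the same bare displacement. The snippet below is only a minimal sketch under a few assumptions: the function name offset_below() is made up for illustration, it takes the base address as a plain register operand instead of reading %esp, and it falls back to ordinary C arithmetic on non-x86 targets. It is not what the kernel does at arch/x86/boot/main.c.

```c
#define STACK_SIZE	512

/*
 * Hypothetical helper: returns base - STACK_SIZE, computed with a single
 * lea on x86. "%c2" prints the constant operand bare (-512), so the
 * assembler sees a displacement like -512(%reg) rather than $-512(%reg).
 */
long offset_below(long base)
{
	long out;

#if defined(__x86_64__)
	__asm__("leaq %c2(%1),%0"
		: "=r" (out)
		: "r" (base), "i" (-STACK_SIZE));
#elif defined(__i386__)
	__asm__("leal %c2(%1),%0"
		: "=r" (out)
		: "r" (base), "i" (-STACK_SIZE));
#else
	/* Assumption: on other architectures just do the arithmetic in C */
	out = base - STACK_SIZE;
#endif
	return out;
}
```

Calling offset_below(1000) yields 488. Running gcc -S on this file is an easy way to check that the displacement is printed without the ‘$’.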

2. pushf/popf

The original issue was reported during usbhid testing with an llvm-compiled kernel[2]. After further debugging by kernel developers, the root cause became clear, and it points to llvm rather than the kernel code itself[3]. Let us go through the example described on the llvm mailing list. Here is the source file:

#include <stdlib.h>
#include <stdbool.h>

/* Assume foo changes the IF in EFLAGS */
void foo(void);
int a;

int bar(void)
{
	foo();
	bool const zero = a -= 1;
	asm volatile ("" : : : "cc");
	foo();
	if (zero) {
		return EXIT_FAILURE;
	}
	foo();
	return EXIT_SUCCESS;
}
The point is that foo() may (or may not) change the IF in EFLAGS. Compile it to generate the object file (clang -O2  -c -o ) and disassemble it (objdump -S), as shown below:

[daveti@daveti c]$ objdump -S llvm_if_issue.o

llvm_if_issue.o:     file format elf64-x86-64

Disassembly of section .text:

0000000000000000 <bar>:
   0:	53                   	push   %rbx
   1:	e8 00 00 00 00       	callq  6 <bar+0x6>
   6:	ff 0d 00 00 00 00    	decl   0x0(%rip)        # c <bar+0xc>
   c:	9c                   	pushfq
   d:	5b                   	pop    %rbx
   e:	e8 00 00 00 00       	callq  13 <bar+0x13>
  13:	b8 01 00 00 00       	mov    $0x1,%eax
  18:	53                   	push   %rbx
  19:	9d                   	popfq
  1a:	75 07                	jne    23 <bar+0x23>
  1c:	e8 00 00 00 00       	callq  21 <bar+0x21>
  21:	31 c0                	xor    %eax,%eax
  23:	5b                   	pop    %rbx
  24:	c3                   	retq

Let us focus on the interesting part:

   c:	9c                   	pushfq
   d:	5b                   	pop    %rbx
   e:	e8 00 00 00 00       	callq  13 <bar+0x13>
  13:	b8 01 00 00 00       	mov    $0x1,%eax
  18:	53                   	push   %rbx
  19:	9d                   	popfq

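As you can see here, before bar() makes its second call to foo(), it saves EFLAGS with ‘pushfq’ and pops the value into %rbx. After foo() returns, it pushes %rbx back and restores EFLAGS with ‘popfq’. llvm does this to keep the condition flags set by ‘decl’ alive across the call, so that the later ‘jne’ can still use them. Remember our assumption – foo() may change the IF in EFLAGS! In kernel mode ‘popfq’ restores the whole register, IF included, so whatever foo() did to IF is silently undone. Now we can explain the bug found in usbhid. The foo() is spin_lock_irq(), and the bar() is usbhid_close(). While spin_lock_irq() makes sure interrupts are disabled, usbhid_close() restores the old value of EFLAGS, ignoring what happened in spin_lock_irq().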
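To make the effect concrete, here is a toy model in plain C. It does not touch the real flags register: EFLAGS is modeled as an ordinary integer, and the function names (model_foo, call_plain, call_with_pushf_popf) are invented for this sketch. Only the bit positions of ZF (bit 6) and IF (bit 9) are taken from the x86 architecture.

```c
#include <stdint.h>

#define FLAG_ZF	(1u << 6)	/* zero flag */
#define FLAG_IF	(1u << 9)	/* interrupt enable flag */

/* Stand-in for spin_lock_irq(): disables interrupts by clearing IF */
uint32_t model_foo(uint32_t flags)
{
	return flags & ~FLAG_IF;
}

/* What the kernel expects: the callee's change to IF survives the call */
uint32_t call_plain(uint32_t flags)
{
	return model_foo(flags);
}

/* Models the sequence: pushfq; pop %rbx; callq foo; push %rbx; popfq */
uint32_t call_with_pushf_popf(uint32_t flags)
{
	uint32_t saved = flags;		/* pushfq; pop %rbx */

	flags = model_foo(flags);	/* callq foo: IF is cleared here */
	flags = saved;			/* push %rbx; popfq: IF is set again */
	return flags;
}
```

Starting from FLAG_IF | FLAG_ZF, call_plain() returns a value with IF cleared, while call_with_pushf_popf() returns one with both ZF and IF set. The compiler gets the condition flag it wanted to preserve, but interrupts are enabled again behind the kernel's back.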

3. Summary

The gcc quirk may reflect a hackish fix made to gcc in the early days to satisfy the requirements of kernel compilation. After all, gcc is the only compiler that can build the Linux kernel without any extra kernel patches. As such, the Linux kernel leans on gcc features that most other projects would never bother with. On the other hand, llvm is catching up. There are already kernel patches that let llvm compile the kernel, and people are testing llvm-built kernel images. Nevertheless, the EFLAGS clobbering issue in llvm optimization may be a showstopper. Most user-space applications do not care about interrupts; for the kernel, however, correct interrupt state is a core requirement. As Linus pointed out – “Using pushf/popf in generated code is completely insane (unless done very localized in a controlled area).”

4. Reference


About daveti

Interested in kernel hacking, compilers, machine learning and guitars.