Pitfalls in negative indexing in C

Negative indexing in C, such as a[-1], is legit, although rarely used. There are reasons (pitfalls) why negative indexing is not recommended. This post discusses these pitfalls when using negative indexing (for fun).

1. Negative indexing within the bound

#include <stdio.h>
#include <string.h>

int main()
{
	int a[10];
	int *p;
	int i;

	for (i = 0; i < 10; i++)
		a[i] = 0x12345678;
	p = &a[9];

	for (i = 0; i < 10; i++) {
		printf("a[%d]=0x%x, %p\n", i, a[i], &a[i]);
		printf("p[-%d]=0x%x, %p\n", i, p[-i], &p[-i]);
	}

	return 0;
}

In the code above, we let pointer p pointing to the end of array a, and print out each element in a using negative indexing with the help of p.

[daveti@daveti c]$ ./negidx0
a[0]=0x12345678, 0x7ffd50edad40
p[-0]=0x12345678, 0x7ffd50edad64
a[1]=0x12345678, 0x7ffd50edad44
p[-1]=0x12345678, 0x7ffd50edad60
a[2]=0x12345678, 0x7ffd50edad48
p[-2]=0x12345678, 0x7ffd50edad5c
a[3]=0x12345678, 0x7ffd50edad4c
p[-3]=0x12345678, 0x7ffd50edad58
a[4]=0x12345678, 0x7ffd50edad50
p[-4]=0x12345678, 0x7ffd50edad54
a[5]=0x12345678, 0x7ffd50edad54
p[-5]=0x12345678, 0x7ffd50edad50
a[6]=0x12345678, 0x7ffd50edad58
p[-6]=0x12345678, 0x7ffd50edad4c
a[7]=0x12345678, 0x7ffd50edad5c
p[-7]=0x12345678, 0x7ffd50edad48
a[8]=0x12345678, 0x7ffd50edad60
p[-8]=0x12345678, 0x7ffd50edad44
a[9]=0x12345678, 0x7ffd50edad64
p[-9]=0x12345678, 0x7ffd50edad40

The output proves what we would expect. Subcriptor [] simply means that E1[E2]=*((E1)+(E2)). So it does not matter whether E2 is positive or negative, and the compiler will take care of pointer type to generate the correct the result pointer (say, whether it is int or char). [1]

2. Negative indexing out of the bound

Things get complicated when we try to reference var a from var b, assuming that both a and b are next to each other…

#include <stdio.h>
#include <string.h>

int main()
{
	char a[10];
	char b[10];

	memset(a, 0x0, 10);
	memset(b, 0x1, 10);

	int i;
	printf("a=%p, b=%p\n", a, b);
	for (i = 0; i < 10; i++) {
		printf("b[%d]=%d, %p\n", i, b[i], &b[i]);
		printf("b[-%d]=%d, %p\n", i, b[-i], &b[-i]);
	}

	return 0;
}

If you, like myself, expect to see some 0/1 interleaved, then the output would let you down:

[daveti@daveti c]$ ./negidx1
a=0x7ffc4c88ecd2, b=0x7ffc4c88ecc8
b[0]=1, 0x7ffc4c88ecc8
b[-0]=1, 0x7ffc4c88ecc8
b[1]=1, 0x7ffc4c88ecc9
b[-1]=0, 0x7ffc4c88ecc7
b[2]=1, 0x7ffc4c88ecca
b[-2]=0, 0x7ffc4c88ecc6
b[3]=1, 0x7ffc4c88eccb
b[-3]=0, 0x7ffc4c88ecc5
b[4]=1, 0x7ffc4c88eccc
b[-4]=0, 0x7ffc4c88ecc4
b[5]=1, 0x7ffc4c88eccd
b[-5]=0, 0x7ffc4c88ecc3
b[6]=1, 0x7ffc4c88ecce
b[-6]=64, 0x7ffc4c88ecc2
b[7]=1, 0x7ffc4c88eccf
b[-7]=6, 0x7ffc4c88ecc1
b[8]=1, 0x7ffc4c88ecd0
b[-8]=16, 0x7ffc4c88ecc0
b[9]=1, 0x7ffc4c88ecd1
b[-9]=0, 0x7ffc4c88ecbf

With all the addresses printed out, I guess you have figured sth out. The pitfall here is the actual memory layout of a and b – since stack grows downwards (from high to low), a has higher starting address than b. That is being said, b[-x] would never reach a, which can be seen from the addresses printed out. Therefore, if we wanna reference a from b, we need positive indexing out of bound, e.g.,

#include <stdio.h>
#include <string.h>

int main()
{
	char a[10];
	char b[10];

	memset(a, 0x0, 10);
	memset(b, 0x1, 10);

	int i;
	printf("a=%p, b=%p\n", a, b);
	for (i = 0; i < 20; i++) {
		printf("b[%d]=%d, %p\n", i, b[i], &b[i]);
	}

	return 0;
}

And here is the output:

[daveti@daveti c]$ ./negidx2
a=0x7ffdbf9bc682, b=0x7ffdbf9bc678
b[0]=1, 0x7ffdbf9bc678
b[1]=1, 0x7ffdbf9bc679
b[2]=1, 0x7ffdbf9bc67a
b[3]=1, 0x7ffdbf9bc67b
b[4]=1, 0x7ffdbf9bc67c
b[5]=1, 0x7ffdbf9bc67d
b[6]=1, 0x7ffdbf9bc67e
b[7]=1, 0x7ffdbf9bc67f
b[8]=1, 0x7ffdbf9bc680
b[9]=1, 0x7ffdbf9bc681
b[10]=0, 0x7ffdbf9bc682
b[11]=0, 0x7ffdbf9bc683
b[12]=0, 0x7ffdbf9bc684
b[13]=0, 0x7ffdbf9bc685
b[14]=0, 0x7ffdbf9bc686
b[15]=0, 0x7ffdbf9bc687
b[16]=0, 0x7ffdbf9bc688
b[17]=0, 0x7ffdbf9bc689
b[18]=0, 0x7ffdbf9bc68a
b[19]=0, 0x7ffdbf9bc68b

Or, if we wanna reference b from a, we could use negative indexing out of bound, e.g.,

#include <stdio.h>
#include <string.h>

int main()
{
	char a[10];
	char b[10];

	memset(a, 0x0, 10);
	memset(b, 0x1, 10);

	int i;
	printf("a=%p, b=%p\n", a, b);
	for (i = 0; i < 10; i++) {
		printf("a[%d]=%d, %p\n", i, a[i], &a[i]);
		printf("a[-%d]=%d, %p\n", i, a[-i], &a[-i]);
	}

	return 0;
}

And here is the output:

[daveti@daveti c]$ ./negidx3
a=0x7ffece838602, b=0x7ffece8385f8
a[0]=0, 0x7ffece838602
a[-0]=0, 0x7ffece838602
a[1]=0, 0x7ffece838603
a[-1]=1, 0x7ffece838601
a[2]=0, 0x7ffece838604
a[-2]=1, 0x7ffece838600
a[3]=0, 0x7ffece838605
a[-3]=1, 0x7ffece8385ff
a[4]=0, 0x7ffece838606
a[-4]=1, 0x7ffece8385fe
a[5]=0, 0x7ffece838607
a[-5]=1, 0x7ffece8385fd
a[6]=0, 0x7ffece838608
a[-6]=1, 0x7ffece8385fc
a[7]=0, 0x7ffece838609
a[-7]=1, 0x7ffece8385fb
a[8]=0, 0x7ffece83860a
a[-8]=1, 0x7ffece8385fa
a[9]=0, 0x7ffece83860b
a[-9]=1, 0x7ffece8385f9

3. Alignment

Things get more complicated when there is memory alignment in the stack, which is pretty common.

#include <stdio.h>
#include <string.h>

typedef struct _total_made_up {
	char c0;
	int i;
	char c1[10];
} tmu;

int main()
{
	tmu a;
	tmu b;

	memset(&a, 0x0, sizeof(tmu));
	memset(&b, 0x1, sizeof(tmu));

	int i;
	printf("a=%p, a.c0=%p, a.i=%p, a.c1=%p, sizeof(a)=%d\n",
		&a, &(a.c0), &(a.i), &(a.c1), sizeof(a));
	printf("b=%p, b.c0=%p, b.i=%p, b.c1=%p, sizeof(b)=%d\n",
		&b, &(b.c0), &(b.i), &(b.c1), sizeof(b));
	for (i = 0; i < 10; i++) {
		printf("a.c1[%d]=%d, %p\n", i, (a.c1)[i], &(a.c1)[i]);
		printf("a.c1[-%d]=%d, %p\n", i, (a.c1)[-i], &(a.c1)[-i]);
	}

	return 0;
}

Here is the output:

[daveti@daveti c]$ ./negidx4
a=0x7ffc4bf3c8e0, a.c0=0x7ffc4bf3c8e0, a.i=0x7ffc4bf3c8e4, a.c1=0x7ffc4bf3c8e8, sizeof(a)=20
b=0x7ffc4bf3c8c0, b.c0=0x7ffc4bf3c8c0, b.i=0x7ffc4bf3c8c4, b.c1=0x7ffc4bf3c8c8, sizeof(b)=20
a.c1[0]=0, 0x7ffc4bf3c8e8
a.c1[-0]=0, 0x7ffc4bf3c8e8
a.c1[1]=0, 0x7ffc4bf3c8e9
a.c1[-1]=0, 0x7ffc4bf3c8e7
a.c1[2]=0, 0x7ffc4bf3c8ea
a.c1[-2]=0, 0x7ffc4bf3c8e6
a.c1[3]=0, 0x7ffc4bf3c8eb
a.c1[-3]=0, 0x7ffc4bf3c8e5
a.c1[4]=0, 0x7ffc4bf3c8ec
a.c1[-4]=0, 0x7ffc4bf3c8e4
a.c1[5]=0, 0x7ffc4bf3c8ed
a.c1[-5]=0, 0x7ffc4bf3c8e3
a.c1[6]=0, 0x7ffc4bf3c8ee
a.c1[-6]=0, 0x7ffc4bf3c8e2
a.c1[7]=0, 0x7ffc4bf3c8ef
a.c1[-7]=0, 0x7ffc4bf3c8e1
a.c1[8]=0, 0x7ffc4bf3c8f0
a.c1[-8]=0, 0x7ffc4bf3c8e0
a.c1[9]=0, 0x7ffc4bf3c8f1
a.c1[-9]=0, 0x7ffc4bf3c8df

If we compare the starting address of a and b, we can find a 32-byte difference. The size of the struct is 20 bytes, which means there is a 12-byte gap between the end of b and the start of a due to padding (alignment). Even worse, there is also padding happened within the structure, where the c0 is padded with 3 bytes to alignment with the 4-byte int (i), and c1 is padded with another 2 bytes in the end. Therefore, if we want to use a.c1 to reference b.c1 with negative indexing, the index should start from -22 (8 bytes in a, and 12 bytes in the gap, and 2 bytes padding in b.c1). You don’t see this number coming, do you:)

#include <stdio.h>
#include <string.h>

typedef struct _total_made_up {
	char c0;
	int i;
	char c1[10];
} tmu;

int main()
{
	tmu a;
	tmu b;

	memset(&a, 0x0, sizeof(tmu));
	memset(&b, 0x0, sizeof(tmu));
	b.c0 = 1;
	b.i = 1;
	memset(&b.c1, 0x1, 10);

	int i;
	printf("a=%p, a.c0=%p, a.i=%p, a.c1=%p, sizeof(a)=%d\n",
		&a, &(a.c0), &(a.i), &(a.c1), sizeof(a));
	printf("b=%p, b.c0=%p, b.i=%p, b.c1=%p, sizeof(b)=%d\n",
		&b, &(b.c0), &(b.i), &(b.c1), sizeof(b));
	for (i = 0; i < 10; i++) {
		printf("a.c1[%d]=%d, %p\n", i, (a.c1)[i], &(a.c1)[i]);
		printf("a.c1[-%d-22]=%d, %p\n", i, (a.c1)[-i-22], &(a.c1)[-i-22]);
	}

	return 0;
}

[daveti@daveti c]$ ./negidx5
a=0x7ffce988ccf0, a.c0=0x7ffce988ccf0, a.i=0x7ffce988ccf4, a.c1=0x7ffce988ccf8, sizeof(a)=20
b=0x7ffce988ccd0, b.c0=0x7ffce988ccd0, b.i=0x7ffce988ccd4, b.c1=0x7ffce988ccd8, sizeof(b)=20
a.c1[0]=0, 0x7ffce988ccf8
a.c1[-0-22]=0, 0x7ffce988cce2
a.c1[1]=0, 0x7ffce988ccf9
a.c1[-1-22]=1, 0x7ffce988cce1
a.c1[2]=0, 0x7ffce988ccfa
a.c1[-2-22]=1, 0x7ffce988cce0
a.c1[3]=0, 0x7ffce988ccfb
a.c1[-3-22]=1, 0x7ffce988ccdf
a.c1[4]=0, 0x7ffce988ccfc
a.c1[-4-22]=1, 0x7ffce988ccde
a.c1[5]=0, 0x7ffce988ccfd
a.c1[-5-22]=1, 0x7ffce988ccdd
a.c1[6]=0, 0x7ffce988ccfe
a.c1[-6-22]=1, 0x7ffce988ccdc
a.c1[7]=0, 0x7ffce988ccff
a.c1[-7-22]=1, 0x7ffce988ccdb
a.c1[8]=0, 0x7ffce988cd00
a.c1[-8-22]=1, 0x7ffce988ccda
a.c1[9]=0, 0x7ffce988cd01
a.c1[-9-22]=1, 0x7ffce988ccd9

3. Summary

We have just talked about how messed up negative indexing could be on the stack. It is also possible to do this on the heap, which is more error-prone given the memory management under the hood (malloc/kmalloc). To correctly use negative (or out of bound) indexing, one needs to know exactly what the memory layout looks like, this could be hard for stack variables, even harder for heap ones. For normal programming, in most cases, you do not need negative indexing. As a matter of fact, the only normal usage I have seen so far is from ARMv3-M software development manual, where negative indexing is used to retrieve the SVC instruction number, which is saved on the stack by the CPU core during the supervisor call.

Reference:

[1] https://stackoverflow.com/questions/3473675/are-negative-array-indexes-allowed-in-c

Posted in Programming | Tagged , | Leave a comment

USB gadget functionalities in Android

I started working on Android stuffs this summer. While I mainly work on the USB layer within the Linux kernel, I do sometimes need to look into the Android framework, to see if I could achieve my goal from the Android user space. One of the questions I have got is to find the USB configuration of the phone (e.g., MTP, adb, and etc) when connected with the host machine. And here comes this post.

1. lsusb

The most straightforward way to have the USB information of the phone is to plug it into a Linux box, and run ‘lsusb -t‘ or ‘lsusb -v‘. Things get annoying when I tried to pull similar information from the phone directly.

2. adb

If you have adb with root permission, go to ‘/sys/class/android_usb/android0‘. You will find all the kernel USB gadget functionalities builtin, and the configurations, e.g., which functionalities are enabled in the phone right now. This won’t work if adb runs as ‘shell’ user.

3. USB Options

When the phone is connected with the host machine, the USB option menu would pop up providing more options: charging, MTP, PTP, and etc. This should be the easiest way in most cases, however, would fail if you are not playing with your own phone (e.g., pin protected, and yes, I am talking about “remote” phones in the device farm). It is also possible that some USB gadget functionalities are NOT exposed to the menu at all, e.g., a hiddent USB CDC/ACM functionality.

4. USB Settings

Some vendors provide secret dial codes to trigger a USB Setting menu with some combinations of different gadget functionalities for testing/debugging purpose. E.g., for Samsung Galaxy phones, use ‘*#0808#‘. Let me know if you have similar dial codes from other vendors.

5. UsbManager

Now it is time to look into the Android framework. Can we write an app to show the USB configuration/information? UsbManager.java[1] is the implementation in Android to control the USB hardware. To differentiate the USB host mode and the USB gadget mode, the Android APIs have USB Host APIs and USB Accessory API[2], both of which are managed by the UsbManager. For instance, getDeviceList() in the UsbManager returns a list of UsbDevice connected with the phone. As expected, UsbDevice class is the instantiation of a USB device defined by the USB spec. We could go and find the configuration, interfaces, and endpoints, just like what we could do with libusb. Accordingly, getAccessoryList() returns a list of UsbAccessory the phone behaves. Here is the problem – UsbAccessory is NOT abstracted as UsbDevice! The only information from UsbAccessory you could get is device information, such as manufacturer, product, serial number, and etc. In short, no interface-level information in the UsbAccessory!

Nevertheless, UsbManager may still be the answer if we look the latest implementation[3]. ‘isFunctionEnabled()’ and ‘containsFunction()’ seem to be promising to detect hiddent functionalities, which may not be shown in the USB option menu. Unfortunately, this still depends on the vendor-specific customization – whether or not the USB function is exposed to the Android framework (platform), and managed by the UsbManager.

Refs:
[1] https://developer.android.com/reference/android/hardware/usb/UsbManager.html
[2] https://developer.android.com/guide/topics/connectivity/usb/index.html
[3] https://github.com/android/platform_frameworks_base/blob/master/core/java/android/hardware/usb/UsbManager.java

Posted in OS | Tagged , , | Leave a comment

Understanding kcov – play with -fsanitize-coverage=trace-pc from the user space

kcov is a kernel feature used to support syzkaller[1]. To provide the code coverage information from the kernel itself, the GCC compiler was patched to instrument the kernel image[2]. The kernel itself was also patched to enable this feature where is propriate[3]. This post tries to reproduce the essense of kcov in the user space, in the hope of a better understanding of kcov in general.

1. -fsanitize-coverage=trace-pc

After (>=) 6.0, GCC supports this new feature/flag, which instruments every basic block generated by GCC with function “trace-pc”. This “trace-pc” function is provided by the user, and should have the name as “__sanitizer_cov_trace_pc“. For kcov, this function is implemented in kernel/kcov.c. By the time of writing, there is no user-space example available using this new flag, primarily because this feature is mainly designed for kernel/syzkaller, and gcov is available for user-space programs already. Nevertheless, we will show how to play this in the user space:)

2. First try

Let’s create a “non-trivial” testing file and a makefile, as shown below:

#include <stdio.h>
#include <string.h>

void __sanitizer_cov_trace_pc(void)
{
	printf("code instrumented...\n");
}

static int fun2(void)
{
	printf("fun2\n");
	return 0;
}

static void fun1(int a)
{
	fun2();
	printf("fun1: a[%d]\n", a);
}

int main(void)
{
	int a = 10;
	fun1(a);
	return 0;
}
CC=gcc
CFLAGS=-g -ggdb -fsanitize-coverage=trace-pc
LIBS=
DEPS=
OBJ=test.o

%.o: %.c $(DEPS)
	$(CC) -c -o $@ $< $(CFLAGS) $(LIBS)

test : $(OBJ)
	$(CC) -o $@ $^ $(CFLAGS) $(LIBS)

.PHONY: clean

clean:
	rm -f *.o test

Function “__sanitizer_cov_trace_pc” is defined in test.c file. The Makefile also enables the “-fsanitize-coverage” flag. We compile it, run it, and as you expected, get coredump. What the heck? Disassembling the binary gives us some hints:

0000000000400546 <__sanitizer_cov_trace_pc>::
  400546:	55                   	push   %rbp
  400547:	48 89 e5             	mov    %rsp,%rbp
  40054a:	e8 f7 ff ff ff       	callq  400546 <__sanitizer_cov_trace_pc>
  40054f:	bf 80 06 40 00       	mov    $0x400680,%edi
  400554:	e8 d7 fe ff ff       	callq  400430 <puts@plt>
  400559:	90                   	nop
  40055a:	5d                   	pop    %rbp
  40055b:	c3                   	retq

Besides main, fun1, fun2 are instrumented by __sanitizer_cov_trace_pc, __sanitizer_cov_trace_pc itself is also instrumented by the compiler, which creates a recursive bomb, and stack overflow eventually. So, we need to tell GCC not to instrument the instrumenting function itself.

3. Second try

If we look at the kernel patch[3] again, the __sanitizer_cov_trace_pc is defined with “notrace” decorator, which is essentially an attribute to let GCC skip instrumentation for the decorated function. All we need to do is to update our __sanitizer_cov_trace_pc function in test.c with this decorator:

#define notrace __attribute__((no_instrument_function))

void notrace __sanitizer_cov_trace_pc(void)
{
	printf("code instrumented...\n");
}

Compile, run, and unfortunately coredump again. Disassembly shows the recursive bomb is still there. WTF!

4. Do it again

If we look at the old makefile, we have __sanitizer_cov_trace_pc defined inside the test.c and shared the same compilation flags as other functions, including “-fsanitize-coverage”. It is time to split the function into a separated file and use a different set of compilation flags. Now we have trace.c, trace.h, test.c, and a new makefile, as shown below:

#include <stdio.h>

#define notrace __attribute__((no_instrument_function))

void notrace __sanitizer_cov_trace_pc(void)
{
	printf("code instrumented...\n");
}
#define notrace __attribute__((no_instrument_function))
void notrace __sanitizer_cov_trace_pc(void);
#include <stdio.h>
#include <string.h>
#include "trace.h"

static int fun2(void)
{
	printf("fun2\n");
	return 0;
}

static void fun1(int a)
{
	fun2();
	printf("fun1: a[%d]\n", a);
}

int main(void)
{
	int a = 10;
	fun1(a);
	return 0;
}
CC=gcc
CFLAGS=-g -ggdb -fsanitize-coverage=trace-pc
LIBS=
DEPS=trace.h
OBJ=test.o trace.o

all: test

trace.o: trace.c
	$(CC) -c -o $@ $<

test.o: test.c
	$(CC) -c -o $@ $< $(CFLAGS) $(LIBS)

test : $(OBJ)
	$(CC) -o $@ $^ $(CFLAGS) $(LIBS)

.PHONY: clean

clean:
	rm -f *.o test

Compile, run, and it works:

[daveti@daveti fuzz]$ ./test
code instrumented...
code instrumented...
code instrumented...
fun2
code instrumented...
fun1: a[10]
code instrumented...

5. Trace PC

The final missing part of our user-space implementation comparing to kcov is the PC tracing functionality. Fortunately, GCC already provides a builtin command to retrieve the return address pushed on the stack. As what linux kernel does, we update trace.c to enable PC tracing:

#include <stdio.h>

#define notrace	__attribute__((no_instrument_function))
#define _RET_IP_	(unsigned long)__builtin_return_address(0)

void notrace __sanitizer_cov_trace_pc(void)
{
	//printf("code instrumented...\n");
	printf("return pc [0x%x]\n", _RET_IP_);
}

Compile, run, and use addr2line to get debugging information from these addresses. Doesn’t it look almost the same as kcov!

[daveti@daveti fuzz2]$ ./test
return pc [0x4005ab]
return pc [0x400581]
return pc [0x400554]
fun2
return pc [0x400568]
fun1: a[10]
return pc [0x4005c6]
[daveti@daveti fuzz2]$ addr2line -e ./test 0x4005ab
/home/daveti/c/fuzz2/test.c:19

6. Notes

One observation we can easily find is the overhead of this instrumentation. Since the instrumentation happens per basic block instead of functions, it is trivial to have bigger number of instrumentations than the number of actual function calls. For example, in the our example, we have 3 functions, and 5 instrumentations. This is also the reason why not every kernel subsystem/component enables kcov (besides, this instrumentation could break kernel booting). Again, this post tries to mimic what kcov does from the user space. For real user space coverage instrumentation, there is gcov[4] already.

[1] https://github.com/google/syzkaller
[2] http://gcc.1065356.n8.nabble.com/Re-Add-fuzzing-coverage-support-td1212322.html
[3] https://github.com/torvalds/linux/commit/5c9a8750a6409c63a0f01d51a9024861022f6593
[4] https://gcc.gnu.org/onlinedocs/gcc/Gcov.html

Posted in OS, Security, Stuff about Compiler | Tagged , , | Leave a comment

SGX Bug SKL012 and CHIPSEC

Intel SGX CPU (staring from Skylake) has been there for while. The good news is that there is still no known exploitation against SGX self yet, though there are some exploitations in the enclave code and Intel SGX SDK. In general, SGX is still believed to provide strong security guarantee for the data/code in the enclave. If there is really something messed up in SGX, it has to be the CPU logic/mirocode. This post tries to peek into a specific bug reported by Intel for its SGX CPU implementation. Moving forward, we invesitgate the possible mitigations and add new features into the well-known Intel platform security tool chipsec. Cheers.

1. SKL012

In the spec update [1] released by Intel on Sep 2016, there are 6 CPU bugs related with SGX in general [2]. None of them is treated seriously by Intel thus no fix is planed for any of it. Among them, we especially look at SKL012:

SKL012

The SMSW Instruction May Execute Within an Enclave

Problem

The SMSW instruction is illegal within an SGX (Software Guard Extensions) enclave, and an attempt to execute it within an enclave should result in a #UD (invalid-opcode exception). Due to this erratum, the instruction executes normally within an enclave and does not cause a #UD.

Implication

The SMSW instruction provides access to CR0 bits 15:0 and will provide that information inside an enclave. These bits include NE, ET, TS, EM, MP and PE.

Workaround

None identified. If SMSW execution inside an enclave is unacceptable, system software should not enable SGX.

Status

For the steppings affected, see the Summary Table of Changes.

My interpretation for this SKL012 bug is that SGX was designed to not allow SMSW instruction in an enclave, though the instuction could be executed by ring-0 and ring-3. Unfortunately, due to this bug, SMSW “may” be executed in the enclave.

2. So what the fuss?

SMWS is one of the secuirty sensitive instructions, which can be run by ring-3 and reveal some security sensitive information of the platform and the kernel to the user space. These instructions are usually leveraged by malware to detect the VM enviornment or exploit the system/kernel configuation [3][4]. Similar with SMWS, other security sensitive instructions include SGDT, SLDT, SIDT, and STR. All these instructions could be run by ring-3 code without any problem. And what about these instructions for SGX besides the SMWS mentioned in SKL012.

3. Verify the bug(s)

Here we implement a tool called sgxbug [7] based on the enclave creation sample code provided by Intel SGX SDK for Linux [8] by adding the SMWS instruction into the enclave and retrieving the results from the application. We also covered other 4 security sensitive instructions mentioned above. For implementation detailes, please check out the commit (https://github.com/daveti/sgxbug/commit/373c9e6f3d0d70a172cf5315dce15151fa4271e1). For normal ring-3 testing code without SGX, one could refer to [6] for details.

Here is the result of running sgxbug app:

root@sgx2-HP-ENVY-x360-m6-Convertible:~/git/sgxbug# ./app 0
Got senstive instruction idx: 0
Checksum(0x0x7ffe4ba35c30, 100) = 0xfffd4143
Info: executing thread synchronization, please wait…
Start sensitive instruction testing…
GDT: limit=0127, base=ffff880273c49000
Sensitive instruction testing done…
Info: SampleEnclave successfully returned.
Enter a character before exit …

root@sgx2-HP-ENVY-x360-m6-Convertible:~/git/sgxbug# ./app 1
Got senstive instruction idx: 1
Checksum(0x0x7ffd677a11f0, 100) = 0xfffd4143
Info: executing thread synchronization, please wait…
Start sensitive instruction testing…
IDT: limit=4095, base=ffffffffff578000
Sensitive instruction testing done…
Info: SampleEnclave successfully returned.
Enter a character before exit …

root@sgx2-HP-ENVY-x360-m6-Convertible:~/git/sgxbug# ./app 2
Got senstive instruction idx: 2
Checksum(0x0x7fff399040a0, 100) = 0xfffd4143
Info: executing thread synchronization, please wait…
Start sensitive instruction testing…
LDT: ffffffffffff0000
Sensitive instruction testing done…
Info: SampleEnclave successfully returned.
Enter a character before exit …

root@sgx2-HP-ENVY-x360-m6-Convertible:~/git/sgxbug# ./app 3
Got senstive instruction idx: 3
Checksum(0x0x7fffb1b31300, 100) = 0xfffd4143
Info: executing thread synchronization, please wait…
Start sensitive instruction testing…
MSW: ffffffffffff0033
Sensitive instruction testing done…
Info: SampleEnclave successfully returned.
Enter a character before exit …

root@sgx2-HP-ENVY-x360-m6-Convertible:~/git/sgxbug# ./app 4
Got senstive instruction idx: 4
Checksum(0x0x7ffcfe43bee0, 100) = 0xfffd4143
Info: executing thread synchronization, please wait…
Start sensitive instruction testing…
TR: ffffffffffff0040
Sensitive instruction testing done…
Info: SampleEnclave successfully returned.
Enter a character before exit …

root@sgx2-HP-ENVY-x360-m6-Convertible:~/git/sgxbug#

(Maybe not) Surprisingly, only SMWS is working smoothly but also SGDT, SLDT, SIDT, and STR. In summary, there is no any limitation for code in the enclave to execute these security sensitive instrucitons.

4. UMIP

Now what? It turns out the latest Intel CPU has a thing called UMIP – User Mode Instruction Prevention [5], which as its name implied can block some security sensitive instructions running at ring-3. The blocked instructions currently include all the 5 instructions mentioned above. That means, once UMIP is enabled, the user-space application (or malware) is not able to run these instructions anymore. This is really good in my opinion and I would recommend enabling this if possible to reduce the attack surface from the user space. Due to this reason, we add some new features to the CHIPSEC tool [9], detecting the UMIP feature, checking if the feature enabled, and enabling the UMIP for all cores if possible. For implementation details, please refer to the commit (https://github.com/daveti/chipsec/commit/0da5abf7e41b7fd9750a1a559db8b2f427745f88) and the commit (https://github.com/daveti/chipsec/commit/3c9b04903a616040e93e59260f313e1ff018eed6).

5. What if…

With the knowledge of UMIP, we can start thinking what would happen for SKL012 when UMIP is enabled. We have verified that without UMIP, all these instructions work in the enclave. Will they work again when UMIP is enabled? Unfortunately, my SGX CPU is apparently not “latest” enough to have such a feature:

root@sgx2-HP-ENVY-x360-m6-Convertible:~/git/chipsec# chipsec_util cpu umip detect

#################################################################
### ##
### CHIPSEC: Platform Hardware Security Assessment Framework ##
### ##
#################################################################
[CHIPSEC] Version 1.2.5
****** Chipsec Linux Kernel module is licensed under GPL 2.0
[CHIPSEC] API mode: using CHIPSEC kernel module API
[CHIPSEC] Executing command ‘cpu’ with args [‘umip’, ‘detect’]

[cpu] CPUID out: EAX=0x00000000, EBX=0x029C67AF, ECX=0x00000000, EDX=0x00000000
[CHIPSEC] UMIP available for CPU: False
[CHIPSEC] (cpu) time elapsed 0.000

Here is what I imagined: Because of SKL012 bug in SGX, it is possible that UMIP could not prevent the execution of these security sensitive instructions within the enclave. If this is true, with UMIP enabled, malware would need to use SGX to guarantee the exeution of these (e.g., VM detection).

Reference:

[1]http://www.intel.com/content/dam/www/public/us/en/documents/specification-updates/desktop-6th-gen-core-family-spec-update.pdf
[2]https://www.blackhat.com/docs/us-16/materials/us-16-Aumasson-SGX-Secure-Enclaves-In-Practice-Security-And-Crypto-Review.pdf
[3]https://www.usenix.org/legacy/publications/library/proceedings/sec2000/full_papers/robin/robin_html/index.html
[4]http://hypervsir.blogspot.com/2014/10/kernel-information-leak-with.html
[5]https://lkml.org/lkml/2016/11/8/68
[6]http://charette.no-ip.com:81/programming/2009-12-30_Virtualization/2009-12-30_Virtualization.c
[7]https://github.com/daveti/sgxbug
[8]https://github.com/01org/linux-sgx
[9]https://github.com/daveti/chipsec

Posted in Security | Tagged , , , , , , | Leave a comment

getdelays – get delay accounting information from the kernel

Top may be the most common tool in use whenever a preformance issue is hit. It is simple, quick and dumb. Besides the heavy metal stuffs like perf and gprof, another really useful and simple tool is getdelays, which provides the latency statistics per process/task for CPU, memory, and I/O.

1. Where to get it

https://www.kernel.org/doc/Documentation/accounting/getdelays.c
As mentioned in the comment, need to compile it with:

gcc -I/usr/src/linux/include getdelays.c -o getdelays

Since it uses the netlink socket, it requires root permission to run as well.

2. What it does

Getdelays does a simple job – creating a netlink socket, sending a request to the kernel for reading the task statistics, and printing out the reply. Essentially, this netlink socket exposes the kernel taskstats structure to the user space. For more information about taskstats struct, please refer to
https://lxr.missinglinkelectronics.com/linux/include/uapi/linux/taskstats.h#L177

3. How it looks like

image00

In the example above, it shows the delay information for httpd, which seems working fine without any memory or I/O issues, except some minor delays from CPU since it is a background process rather than an interactive shell. If the application has shown some latency issues, getdelays should be able to show some numbers in “delay total” and “delay average”, which should be helpful to limit the scope of the performance issue to CPU, memory or I/O.

4. Note

Cannot expect too much from getdelays, which simply prints some counts in the kernel, and should be enough to know where the problem would be. To find the performance bottleneck, strace/ltrace/dtrace/lttng/perf/gprof should be considered as the next step.

Posted in OS, Programming | Tagged , , , , , | Leave a comment

Making USB Great Again with USBFILTER – a USB layer firewall in the Linux kernel

USENIX Security '16

Our paper “Making USB Great Again with USBFILTER” has been accepted by USENIX Security’16. This post provides a summary of usbfilter. For details, please read the damn paper or download the presentation video/slides from USENIX website. I will head to TX next week, and see you there~

0. Why USB is not great anymore?

We CANNOT trust a USB device from its appearance anymore. One of the typical BadUSB attacks is a USB drive with a keyboard functionality to inject malicious script into the host machine once plugged into. The root cause of the problem is that (almost) everyone can change the USB device firmware to add new functionalities as desired. And people would just plug in the USB flash drives found somewhere for curiorsities (“Users Really Do Plug in USB Drives They Find” Oakland’16). Even worse, this also puts enterprise infrastructure in danger – however powerful the networking firewall would be, a suspecious USB device used by an employee can turn everything into vain. As a result, enterprise settings usually forbid the usage the external USB devices except the original keyboards/mouses. For most normal users, we just ignore these threats or try to plug in unknown USB devices into someone else’s machines…(this is how friendship breaks). Note that cellphones are also USB devices, and what would you do when someone needs to charge his/her phone using your machine?

1. Our solution – usbfilter

The more we play with USB, the more we realize that it is just another transport protocol for USB devices, like TCP/IP for networking devices. Moreover, it is USB packets trasmitted between the USB host controller and devices. Inspired by the netfilter in the Linux kernel, we then made up something like below, and all we need is to make it work.

image00

2. The design and implementation of usbfilter

One of the key features of usbfilter is its ability to trace the USB packet back to its originating process. This is non trivial. For instance, because of the generic block layer and the I/O scheduler within the kernel, all USB packets operating (read/write) the USB storage devices are handled by the usb-storage kernel thread for performance considerations. Similarly, USB networking devices usually have their own Rx/Tx queue to buffer skb (IP packet) before it is encapulated by the USB stack in their drivers. Because usbfilter works at the lowest level of the USB abstraction in the kernel, the pid it can sees usually is either from a kernel thread (device drivers) or an IRQ context (null). As one can imagine, we hacked into different subsystems of the kernel, and saved the originating pid into the urb (USB packet) before it was lost due to asynchronized I/O. Once we fix that, we have a more concrete picture of usbfilter:

image00

Now all we need to do is to implement a user-space tool, which is called usbtables, to communicate with the usbfilter component in the kernel, and enforce rules/policies pushed from the user-space. To make sure no conflictness/contradictiveness within rules, usbtables also has an internal Prolog engine to reason about each new rule before it is pushed into the kernel.

image00

3. So what can usbfilter do?

Here is the fun part. We list a bunch of cool use cases here. For a complete list of case studies, please refer to our paper. In general, just like iptables, with the help of usbtables, users can write rules to regulate the functionalities of USB devices.

A Listen-only USB headset

usbtables -a logitech-headset -v ifnum=2,product=
      "Logitech USB Headset",manufacturer=Logitech -k
      direction=1 -t drop

A Logitech webcam C310, which can only by used by Skype

usbtables -a skype -o uid=1001,comm=skype -v
      serial=B4482A20 -t allow
usbtables -a nowebcam -v serial=B4482A20 -t drop

A USB port dedicated for charging

usbtables -a charger -v busnum=1,portnum=4 -t drop

There are 2 possible settings for these rules, since users can use usbtables to set the default action when no rule matched. If the default action is DROP, users can use usbtables to add a whitelist, permitting certain devices with certain functionalities. This provides the strongest security guarantee since each USB device needs at least one rule to work. If the default action is ALLOW, users have to use usbtables to add a blacklist, blocking undesired functionalities from certain devices. This is less secure but provides the best usability.

4. What is LUM?

If you look at the usbfilter architecture figure again, you will notice a thing called usbfilter modules or Linux usbfilter modules (LUM). This is another powerful feature of usbfilter. Just like netfilter, usbfilter enables kernel developers to write kernel modules to look into and play with the USB packet as wished, plug in them into the usbfilter, and enable new rules using these kernel modules. Check out the example LUM in the code repo to detect the SCSI write command within the USB packet (https://github.com/daveti/usbfilter/blob/master/lum/lum_block_scsi_write.c). With the help of this LUM, one can write rules to stop data exfiltration from the host machine to a Kingston USB flash drive for user 1001:

usbtables -a nodataexfil2 -o uid=1001
      -v manufacturer=Kingston
      -l name=block_scsi_write -t drop

With default to block any SCSI write into any USB storage devices, a whitelist can help permit a limited number of trusted devices in use while preventing data exfiltration when an unknown USB storage device is plugged into.

5. Todo…

There is still a long way before the usbfilter can be officially accepted by the mainline. Applications may hang forever waiting for the response USB packet, whose request USB packet has been filtered by usbfilter, though this could be an implementation issue of applications. Some USB devices can also be stale in the kernel even if they have been unplugged already, if the USB packet used to release the resource is also filtered. Even though usbfilter has introduced a minimum overhead, using BPF may be mandatory for it to be accepted by the upstream.

6. Like it?

To download the full paper, please go to my publication page. The complete usbfilter implementation, including the usbfilter kernel for Ubuntu 14.04 LTS, the user-space tool usbtables, and the example LUM to block writings into USB storage devices are available on my github: https://github.com/daveti/usbfilter. Any questions, please go ahead to open an issue in the code repo, and I will try my best to answer it in time.

Posted in Dave's Tools, OS, Security | Tagged , , , , , , , , , , , | Leave a comment

Fedora Upgrade from 21 to 24

After almost 5 hours of upgrading, my server has been successfully upgraded from Fedora 21 to Fedora 24, which uses the latest stable kernel 4.6. There is a online post demonstrating how to upgrade from Fedora 21 to 23 using fedup. This post talks about Fedora upgrading from 21 to 24 using dnf. NOTE: please do backup your data before action!

0. yum update

This is usually not a problem for Fedora 21, whose support has expired for a long time. Anyway, run it just in case.

1. dnf

According to the Fedora official wiki (https://fedoraproject.org/wiki/DNF_system_upgrade), dnf is recommed for system upgrade. Apparently, fedup has been ditched. Here what we need are 3 dnf commands:

sudo dnf upgrade --refresh
sudo dnf install dnf-plugin-system-upgrade
sudo dnf system-upgrade download --refresh --releasever=24

The last dnf command should list any error, which blocks the upgrade. The errors I have encountered were obsolete packages which are not supported in Fedora 24 repo. As you can tell, the only way to move the upgrade is to remove all these obsolete packages, using “yum remove” + unsupported package name reported by dnf.

Once all the errors are cleaned, dnf is able to download all the required packages for Fedora 24. On my server, it was about 4GB. So, you need at least some GB left to hold all these new packages. More important, dnf requires another 5GB under root during the package installation. Make sure you make dnf happy.

2. Keys

Before dnf was able to install all new downloaded packages, I got such an error:

Couldn’t open file /etc/pki/rpm-gpg/RPM-GPG-KEY-fedora-24-x86_64

There is a bug report talking about the possibilities of this issue and corresponding fixes (https://bugzilla.redhat.com/show_bug.cgi?id=1044086#c11). However, if you find manual key importing does not work, go and take a look at /etc/pki/rpm-gpg directory. What happened to my server was simply no any key file for Fedora 24. Oops. The fix is also easy – creating the key files by ourselves. Go to https://getfedora.org/keys/ and find the key files (primary/secondary). Create these key files and symlink the x86_64 (arch of my server) with the primary. That’s it.

3. dnf again

Reboot the machine to start the upgrade:

sudo dnf system-upgrade reboot

Hint: yum is now deprecated. Run “dnf update” once you are into the new system.

Posted in Linux Distro | Tagged , , , | 4 Comments