Rowhammer Pine64

Rowhammer attacks have been well known, and gotten a lot of publications already. However, we notice that most rowhammers happened on x86 architecture due to the easy access to clflush from the user space. ARM architecture (both ARMv7 and ARMv8) is also explored by DRAMA [1], where Android Ion memory allocator is leveraged to map un-cached pages into the user space. The reason is the cache line flush instruction in ARMv7 is ring-0 instruction. However, ARMv8 exposes cache line flush instruction to the user space again, providing an opportunity to rowhammer from the user space. This post explores rowhammer attacks using ARMv8 cache line flush instructions from the user space using Pine64 board. Let’s start~

1. Setup Pine64

Pine64 [2] is a cheap ARM 64-bit SoC. The one we are using is Pine A64+, with a quad-core ARM Cortex A53 64-Bit processor, and 2G LPDDR3. Both the support for ARMv8 ISA and equipment with DDR3 memory make this board a desired target. However, setting up this board really took us a while. We have encoutered both hardware and software issues on this board – I guess this is you get what you paid:(

Issue-I: Serial console on the Ext connector does NOT work!

As a result, none of the Linux images we tried (Ubuntu, Arch, OpenSUSE, Fedora) worked. In most cases, the booting process stuck at the uboot shell, and the terminal (screen/minicom) got scrambled characters.

Solution: Use the serial console on the Euler bus [3].

serial_euler

Issue-II: Ubuntu/Arch/OpenSUSE images do NOT work!

It is really frustrating for us that major images from Pine A64 software release [4] just do not work. In the best cases, we were able to boot into the kernel but stuck at login. Or the board kept rebooting itself. It is still not clear for us why these mainstream images do not work on our board.

Solution: Use Fedora ARM aarch64 image [5]!

Although Fedora support is never official for Pine64, as one post [6] pointed out that Fedora 27 Minimum image works with some tweaks. We verified that it indeed works on our board. Further, we use Fedora 27 Workstation on our board, and it works like a charm. To easily burn a Fedora image into SD card, we recommend a Fedora host machine with the following procedures:

sudo dnf install uboot-images-armv8 fedora-arm-installer
sudo fedora-arm-image-installer --image=Fedora-Workstation-27-1.6.aarch64.raw.xz --target=pine64_plus --media=/dev/sdb --norootpass

Note that “–norootpass” is needed for Workstation since no user will be created during the kernel booting. “–resizefs” should also be useful to make full usage of the SD card. For more information about arm-image-installer, please refer to [7].

Once we boot into the Linux kernel, we should be able to see things below:

[daveti@localhost ~]$ uname -a
Linux localhost.localdomain 4.13.9-300.fc27.aarch64 #1 SMP Mon Oct 23 13:33:18 UTC 2017 aarch64 aarch64 aarch64 GNU/Linux
[daveti@localhost ~]$ cat /proc/cpuinfo
processor : 0
BogoMIPS : 48.00
Features : fp asimd evtstrm aes pmull sha1 sha2 crc32 cpuid
CPU implementer : 0x41
CPU architecture: 8
CPU variant : 0x0
CPU part : 0xd03
CPU revision : 4

processor : 1
BogoMIPS : 48.00
Features : fp asimd evtstrm aes pmull sha1 sha2 crc32 cpuid
CPU implementer : 0x41
CPU architecture: 8
CPU variant : 0x0
CPU part : 0xd03
CPU revision : 4

processor : 2
BogoMIPS : 48.00
Features : fp asimd evtstrm aes pmull sha1 sha2 crc32 cpuid
CPU implementer : 0x41
CPU architecture: 8
CPU variant : 0x0
CPU part : 0xd03
CPU revision : 4

processor : 3
BogoMIPS : 48.00
Features : fp asimd evtstrm aes pmull sha1 sha2 crc32 cpuid
CPU implementer : 0x41
CPU architecture: 8
CPU variant : 0x0
CPU part : 0xd03
CPU revision : 4

[daveti@localhost ~]$

2. Rowhammer from the user space

One way we decided to do rowhammer is from the user space, without the knowledge of how virtual address, physical address, bank, and row are mapped inside the DRAM. Our goal is to launch this attack as a normal user without any privillge. We also limit our goal to find bit flips within DRAM, rather than developing ful exploitations. We built our tool based on Google’s rowhammer-test, extending the support for ARMv8 64-bit (aarch64) architecture.

Unlike ARMv7, ARMv8 exposes cache line flush instructions to the user space. This provides us an opportunity to find the replacement for “clflush” for ARMv8. Unfortunately, cache line flushing in ARMv8 is not that straightforward as x86. Instead of just one instruction, more than 10 instructions are needed to flush data cache. One can refer to uboot “__asm_flush_dcache_range” [8], and ARMv8-A Programmers Guide [9]. To accelerate the cache line flush, we move the computation for cache line mask out of loops, leaving 4 instructions in total (comparing to 1 clflush). The detailed code change can be found at [10]. Note that cache line mask needs to be applied to the virtual address before flushing. We tried direct “dc civac %[addr]”, and got kernel Oops:

[ 3484.162251] rowhammer_test[9281]: unhandled level 0 translation fault (11) at 0xffffffffffffffff, esr 0x92000144, in rowhammer_test[400000+2000]
[ 3484.175317] CPU: 1 PID: 9281 Comm: rowhammer_test Not tainted 4.13.9-300.fc27.aarch64 #1
[ 3484.183453] Hardware name: sunxi sunxi/sunxi, BIOS 2017.09 10/10/2017
[ 3484.189901] task: ffff800065d90000 task.stack: ffff800063d1c000
[ 3484.195853] PC is at 0x400ad8
[ 3484.198830] LR is at 0x400a8c
[ 3484.201822] pc : [<0000000000400ad8>] lr : [<0000000000400a8c>] pstate: 60000000
[ 3484.209228] sp : 0000ffffee7a9010
[ 3484.212573] x29: 0000ffffee7a9060 x28: 0000ffffee7a9050
[ 3484.217894] x27: 0000ffffee7a9010 x26: 0000ffffee7a9050
[ 3484.223232] x25: 0000ffffee7a9060 x24: 0000000000000001
[ 3484.228553] x23: 000000000000000a x22: 0000000000400000
[ 3484.233890] x21: 0000000000400e40 x20: 0000000000400f40
[ 3484.239211] x19: 0000000000420000 x18: 0000000000000020
[ 3484.244548] x17: 0000ffffa3f0d3d0 x16: 0000000000420040
[ 3484.249869] x15: 0000ffffa3ed4a60 x14: 0003c5c40caf2f6b
[ 3484.255206] x13: 00000003e8000000 x12: 0000ffffa3ed5c34
[ 3484.260543] x11: 0000ffffa3ee3dd8 x10: 0000ffffa3ed6d28
[ 3484.265862] x9 : 003b9aca00000000 x8 : 0000000000000000
[ 3484.271199] x7 : 00000000e961b8a2 x6 : 0000ffffa40730d0
[ 3484.276519] x5 : 0000ffffa4073080 x4 : 0000000000083d60
[ 3484.281855] x3 : 0000ffff9843a000 x2 : 0000ffffee7a9018
[ 3484.287176] x1 : 0000000000000000 x0 : ffffffffffffffff
[ 5689.587275] rowhammer_test[9536]: unhandled level 2 translation fault (11) at 0x00000000, esr 0x92000146, in rowhammer_test[400000+2000]
[ 5689.599618] CPU: 1 PID: 9536 Comm: rowhammer_test Not tainted 4.13.9-300.fc27.aarch64 #1
[ 5689.607753] Hardware name: sunxi sunxi/sunxi, BIOS 2017.09 10/10/2017
[ 5689.614201] task: ffff800065d95a00 task.stack: ffff800079290000
[ 5689.620154] PC is at 0x400ad8
[ 5689.623131] LR is at 0x400a8c
[ 5689.626106] pc : [<0000000000400ad8>] lr : [<0000000000400a8c>] pstate: 20000000
[ 5689.633531] sp : 0000ffffe3ae6600
[ 5689.636853] x29: 0000ffffe3ae6650 x28: 0000ffffe3ae6640
[ 5689.642190] x27: 0000ffffe3ae6600 x26: 0000ffffe3ae6640
[ 5689.647531] x25: 0000ffffe3ae6650 x24: 0000000000000001
[ 5689.652852] x23: 000000000000000a x22: 0000000000400000
[ 5689.658190] x21: 0000000000400e40 x20: 0000000000400f40
[ 5689.663509] x19: 0000000000420000 x18: 0000000000000020
[ 5689.668846] x17: 0000ffff86a133d0 x16: 0000000000420040
[ 5689.674167] x15: 0000ffff869daa60 x14: 002256158d7d8201
[ 5689.679506] x13: 00000003e8000000 x12: 0000ffff869dbc34
[ 5689.684827] x11: 0000ffff869e9dd8 x10: 0000ffff869dcd28
[ 5689.690164] x9 : 003b9aca00000000 x8 : 0000000000000000
[ 5689.695483] x7 : 00000000e961b8a2 x6 : 0000ffff86b790d0
[ 5689.700822] x5 : 0000ffff86b79080 x4 : 0000000000083d60
[ 5689.706143] x3 : 0000000000000000 x2 : 0000ffffe3ae6670
[ 5689.711481] x1 : 0000000000000000 x0 : 0000000000000000
[daveti@localhost ~]$

3. Cache line flushing is expensive!

We ran our code with cache line flushing enabled and disabled (Note that bit flips do NOT depend on cache line flushing; however, the detection of rowhammer needs it). We show some samples runs of rowhammer attacks here:

[daveti@localhost rowhammer-test]$ ./rowhammer_test
clear
cache line mask [0x3f]
Iteration 0 (after 0.00s)
Took 1007.3 ms per address set
Took 10.0728 sec in total for 10 address sets
Took 233.167 nanosec per memory access (for 43200000 memory accesses)
This gives 34310 accesses per address per 64 ms refresh period
Checking for bit flips took 1.095023 sec
Iteration 1 (after 11.17s)
Took 1006.9 ms per address set
Took 10.069 sec in total for 10 address sets
Took 233.080 nanosec per memory access (for 43200000 memory accesses)
This gives 34323 accesses per address per 64 ms refresh period
Checking for bit flips took 1.092644 sec
Iteration 2 (after 22.33s)
Took 1009.1 ms per address set
Took 10.0913 sec in total for 10 address sets
Took 233.595 nanosec per memory access (for 43200000 memory accesses)
This gives 34247 accesses per address per 64 ms refresh period
Checking for bit flips took 1.092391 sec
Iteration 3 (after 33.51s)
Took 1005.2 ms per address set
Took 10.052 sec in total for 10 address sets
Took 232.685 nanosec per memory access (for 43200000 memory accesses)
This gives 34381 accesses per address per 64 ms refresh period
Checking for bit flips took 1.081366 sec 

[daveti@localhost rowhammer-test]$ ./rowhammer_test
clear
cache line mask [0x3f]
Iteration 0 (after 0.00s)
Took 90.4 ms per address set
Took 0.903528 sec in total for 10 address sets
Took 20.915 nanosec per memory access (for 43200000 memory accesses)
This gives 382500 accesses per address per 64 ms refresh period
Checking for bit flips took 1.093423 sec
Iteration 1 (after 2.00s)
Took 121.6 ms per address set
Took 1.21641 sec in total for 10 address sets
Took 28.158 nanosec per memory access (for 43200000 memory accesses)
This gives 284113 accesses per address per 64 ms refresh period
Checking for bit flips took 1.096945 sec
Iteration 2 (after 4.31s)
Took 104.5 ms per address set
Took 1.04498 sec in total for 10 address sets
Took 24.189 nanosec per memory access (for 43200000 memory accesses)
This gives 330723 accesses per address per 64 ms refresh period
Checking for bit flips took 1.089356 sec
Iteration 3 (after 6.45s)
Took 133.6 ms per address set
Took 1.33556 sec in total for 10 address sets
Took 30.916 nanosec per memory access (for 43200000 memory accesses)
This gives 258768 accesses per address per 64 ms refresh period
Checking for bit flips took 1.089574 sec
Iteration 4 (after 8.87s)
Took 122.0 ms per address set
Took 1.22025 sec in total for 10 address sets
Took 28.247 nanosec per memory access (for 43200000 memory accesses)
This gives 283220 accesses per address per 64 ms refresh period
Checking for bit flips took 1.077736 sec 

The first one enabled cache flusing; the second disabled it. Comparing the results, the cache flushing in ARMv8 took around 200 nanosec per memory access. That means 200 CPU cycles for a 1 GHz CPU.

References:

[1] https://www.usenix.org/system/files/conference/usenixsecurity16/sec16_paper_pessl.pdf
[2] https://www.pine64.org/
[3] https://blog.hypriot.com/post/the-pine-a64-is-about-to-become=the-cheapest-ARM-64-bit-platform-to-run-Docker/
[4] http://wiki.pine64.org/index.php/Pine_A64_Software_Release
[5] https://alt.fedoraproject.org/alt/
[6] https://forum.pine64.org/showthread.php?tid=5314
[7] https://pagure.io/arm-image-installer
[8] https://lxr.missinglinkelectronics.com/uboot/arch/arm/cpu/armv8/cache.S#L125
[9] http://infocenter.arm.com/help/index.jsp?topic=/com.arm.doc.den0024a/BABJDBHI.html
[10] https://github.com/daveti/rowhammer-test/commit/fd50d99c6a8a4f97044b7c99b234b9189d421940

Posted in Linux Distro, OS, Security | Tagged , , , , , | Leave a comment

“make deb-pkg” broken

Last time when I hacked the Linux kernel on Ubuntu, it was 4.4 on LTS 14.04. Time flies. Now I need to hack the kernel 4.13 again on LTS 16.04, and find the kernel build broken. It is fine that we could not have a stable kernel API since at least the ABI is stable. But what about kernel build?

1. “make deb-pkg” broken

This used to be my way to build a kernel on Ubuntu. If something wrong happens and stops the build, just fix the error and repeat the command again — it was incremental build. But now it is broken, as it launches “make clean” by default. Why? Because it needs to generate a source tar at first, which requires a “clean” to exclude all build artifacts[1]. Instead, use “make bindeb-pkg“.

2. VirtualBox build broken

I have no idea when Ubuntu decided to include VirtualBox guest/drivers into the kernel tree, but it is broken, because the source directory structure is different from the ones used by the makefiles. An “ln” based workaround is available[2], although I just hacked those makefiles to fix errors.

References:
[1] https://unix.stackexchange.com/questions/368864/re-building-linux-kernel-without-clean
[2] http://technote.thispage.me/index.php/2016/12/20/ubuntu-16-04-kernel-compile-on-default-setting/

Posted in Linux Distro | Tagged , , | 1 Comment

Running Multics on Linux (Fedora 27)

This post follows the “Multics Simulator Instructions”[1] (with some tweaks) to setup Multics simulator dps8m and run Multics on my Fedora 27. Other Linux distro (Ubuntu/Debian/Raspbian) may need some changes but basically work the same way. Experience the cutting-edge secure operating system made on 1960s by yourself!

Multics_logo

1. Install deps – libuv

While libuv can be built from source[2], it would be more easier to install the library using your package manager. For Fedora 27:

dnf install libuv libuv-devel

2. Build the simulator – dps8

Once libuv and its header files are installed, go ahead to build the simulator following the instructions[2]:

git clone git://git.code.sf.net/p/dps8m/code dps8m-code
cd dps8m-code
git checkout R1.0
make

Once the build is done, the dps8 binary can be found under dps8m-code/src/dps8.

3. Run dps8

mkdir -p $HOME/multics/linux
cp $HOME/dps8m-code/src/dps8/dps8 $HOME/multics/linux
cp $HOME/dps8m-code/src/dps8/Devices.txt $HOME/multics/linux
$HOME/multics/linux/dps8

This should start the simulator. Type “quit” to exit the simulator:

[daveti@daveti linux]$ ./dps8
DPS8/M emulator (git b7a50ffc)
Production build
DPS8M system session id is 18555
Please register your system at https://ringzero.wikidot.com/wiki:register
or create the file ‘serial.txt’ containing the line ‘sn: 0’.
FNP telnet server port set to 6180

DPS8M simulator V4.0-0 Beta git commit id: c420925a
sim> quit
Goodbye
[daveti@daveti linux]$

4. Run Multics

Download the QuickStart disk image, unzip it, and mv/rename disk image directory as multics/qs:

wget https://s3.amazonaws.com/eswenson-multics/public/releases/MR12.6f/QuickStart_MR12.6f.zip
unzip $HOME/QuickStart*.zip
mv $HOME/QuickStart_MR12.6f $HOME/multics/qs
cd $HOME/multics/qs
../linux/dps8 MR12.6f_boot.ini

The whole booting process takes a while, but you should be able to see something below if Multics starts correctly:

[daveti@daveti qs]$ ../linux/dps8 MR12.6f_boot.ini
DPS8/M emulator (git b7a50ffc)
Production build
DPS8M system session id is 18555
Please register your system at https://ringzero.wikidot.com/wiki:register
or create the file ‘serial.txt’ containing the line ‘sn: 0’.
FNP telnet server port set to 6180

DPS8M simulator V4.0-0 Beta git commit id: c420925a
TAPE: unit is read only
CONSOLE: ALERT
bootload_0: Booting system MR12.6f generated 01/09/17 1119.1 pst Mon.
0426.6 announce_chwm: 428. pages used of 512. in wired environment.
0426.6 announce_chwm: 706. words used of 1024. in int_unpaged_page_tables.
find_rpv_subsystem: Enter RPV data: M-> [auto-input] rpv a11 ipc 3381 0a

0426.7 load_mst: 946. out of 1048. pages used in disk mst area.
bce (early) 0426.7: M-> [auto-input] bce

Multics Y2K. System was last shudown/ESD at:
Wednesday, January 17, 2018 17:07:31 pst
Current system time is: Wednesday, January 17, 2018 20:26:43 pst.
Is this correct? M-> [auto-input] yes

The current time is more than the supplied boot_delta hours beyond the
unmounted time recorded in the RPV label. Is this correct? M-> [auto-input] yes

bce (boot) 2026.7: M-> [auto-input] yes

bce: Unrecognizable request. Type lr for a list of requests.
bce (boot) 2026.7: M-> [auto-input] boot star

Multics MR12.6f – 01/17/18 2027.0 pst Wed
2027.0 Loading FNP d, >user_dir_dir>SysAdmin>a>mcs.7.6c>site_mcs 7.6c
Received BOOTLOAD command…
listening to 6180
2027.0 FNP d loaded successfully

scavenge_vol: No volumes found
Ready
2027 as as_init_: Multics MR12.6f; Answering Service 17.0
2027 as LOGIN IO.SysDaemon dmn cord (create)
2027 as LOGIN Backup.SysDaemon dmn bk (create)
2027 as LOGIN IO.SysDaemon dmn prta (create)
2027 as LOGIN Utility.SysDaemon dmn ut (create)
2027 as LOGIN Volume_Dumper.Daemon dmn vinc (create)
fnpuv_open_slave 3.31
listening on port 6131
2027 as as_mcs_mpx_: Load signalled for FNP d.
M-> CONSOLE: RELEASED

2027 cord Enter command: coordinator, driver, or logout:
–> cord
2027 bk
2027 prta Enter command: coordinator, driver, or logout:
–> prta
2027 ut copy_dump: Attempt to re-copy an invalid dump.
2027 bk r 20:27 0.446 37
2027 bk
–> bk
2027 vinc
2027 vinc r 20:27 0.450 25
2027 vinc
–> vinc
2027 as sc_admin_command_: Utility.SysDaemon.z: delete_old_pdds
2027 ut send_admin_command: Execution started …
2027 ut completed.
2027 ut monitor_quota: The requested action was not performed.
2027 ut The quota of >dumps is 0, a record limit needs to be specified.
2027 ut
2027 ut Records Left % VTOCEs Left % PB/PD LV Name
2027 ut
2027 ut 166172 100086 60 42218 34130 81 pb root
2027 ut
2027 ut r 20:27 2.481 498
2027 ut
–> ut
2027.2 RCP: Attached tapa_00 for Utility.SysDaemon.z
2027.2 RCP: Detached tapa_00 from Utility.SysDaemon.z
2027.2 RCP: Attached rdra for Utility.SysDaemon.z
2027.2 RCP: Detached rdra from Utility.SysDaemon.z
2027.2 RCP: Attached puna for Utility.SysDaemon.z
2027.2 RCP: Detached puna from Utility.SysDaemon.z
2027.2 RCP: Attached prta for Utility.SysDaemon.z
2027.2 RCP: Detached prta from Utility.SysDaemon.z

Once we have Multics running, we can try to login the system with the help of telnet and the login command in a new terminal. For the first-time login as “Repair” user, the default passwd is “repair”. A new passwd is asked immediately by the system used to replace the old passwd. NOTE: you need to create a new passwd everytime logging into the system! Once logged in, type simple “who” command, and then “logout”:

[daveti@daveti ~]$ telnet localhost 6180
Trying ::1…
telnet: connect to address ::1: Connection refused
Trying 127.0.0.1…
Connected to localhost.
Escape character is ‘^]’.
HSLA Port (d.h000,d.h001,d.h002,d.h003,d.h004,d.h005,d.h006,d.h007,d.h008,d.h009,d.h010,d.h011,d.h012,d.h013,d.h014,d.h015,d.h016,d.h017,d.h018,d.h019,d.h020,d.h021,d.h022,d.h023,d.h024,d.h025,d.h026,d.h027,d.h028,d.h029)?
Attached to line d.h000

Multics MR12.6f: Installation and location (Channel d.h000)
Load = 5.0 out of 90.0 units: users = 5, 01/17/18 2035.3 pst Wed

login Repair -cpw
Password:
New Password:
New Password Again:
Password changed.
You are protected from preemption.
Repair.SysAdmin logged in 01/17/18 2035.6 pst Wed from ASCII terminal “none”.

New messages in message_of_the_day:

Welcome to the Multics System.

print_motd: Created >user_dir_dir>SysAdmin>Repair>Repair.value.
r 20:35 0.469 33

who -a -lg

Multics MR12.6f; Installation and location
Load = 6.0 out of 90.0 units; users = 6, 1 interactive, 5 daemons.
Absentee users = 0 background; Max background absentee users = 3
System up since 01/17/18 2027.0 pst Wed
Last shutdown was at 01/17/18 1707.4 pst Wed

Login at TTY Load User ID

01/17/18 20:27 cord 1.0 IO.SysDaemon
20:27 bk 1.0 Backup.SysDaemon
20:27 prta 1.0 IO.SysDaemon
20:27 ut 1.0 Utility.SysDaemon
20:27 vinc 1.0 Volume_Dumper.Daemon
20:35 none 1.0 Repair.SysAdmin

r 20:36 0.102 3

logout
Repair.SysAdmin logged out 01/17/18 2036.8 pst Wed
CPU usage 1 sec, memory usage 0.1 units, cost $0.07.
hangup
Multics has disconnected you
Connection closed by foreign host.

Meanwhile, you should be able to see the logging information on the Multics ternimal:

CONNECT 127.0.0.1
<>CONNECT 127.0.0.1 to d.h000
2035 as LOGIN Repair.SysAdmin int d.h000 (create)
2036 as LOGOUT Repair.SysAdmin.a int d.h000 0: 0 $0.07 (logo)
DISCONNECT d.d000
CONNECT 127.0.0.1

5. Multics 101

All commands can be found at [3]. Go ahead to try “ls”, “who”, “date”, “time”, and etc. There is also an Emacs available (no Vi apparently). The command mapping between Multics and Linux may also be useful to have a quick understanding of Multics commands[4]. For instance, if you want to take a look at the source file of Multics (HINT: neither B nor C):

cwd >ldd>mcs>source
pr ic_sampler.map355

(Another Hint: Ctrl-C can be your friend~)

6. Now what?

Once we “logout”, it is time to learn how to quit Multics (gracefully). Yes, you need to learn how to quit – take a look at [5] and use 5 ESCs and 5 commands to shutdown Multics:

M-> logout * * *

Ready
M->
CONSOLE: RELEASED

2110 as LOGOUT IO.SysDaemon.z dmn cord 0: 0 $0.03 (looc)
2110 as LOGOUT Backup.SysDaemon.z dmn bk 0: 0 $0.04 (looc)
2110 as LOGOUT IO.SysDaemon.z dmn prta 0: 0 $0.03 (looc)
2110 as LOGOUT Utility.SysDaemon.z dmn ut 0: 3 $1.90 (looc)
2110 as LOGOUT Volume_Dumper.Daemon.z dmn vinc 0: 0 $0.04 (looc)
M-> shut

2110 as act_ctl_: shutdown, 8 0.05 0.05 0.00 0.02 0:3:21 $16.99
2110.9 shutdown complete
DBG(16336227444863)> ERR ERR: Need status register data format
DBG(16336227444911)> ERR ERR: doPayloadChan expected IDCW 10 (12)
bce (boot) 2110.9: M-> die

Do you really wish bce to die? M-> yes

BCE DIS causes CPU halt

simCycles = 16336308761570

cpuCycles = 698896257
Timer runout faults = 21587
Derail faults = 1
Lockup faults = 65
Connect faults = 7242
Illegal procedure faults = 5
Directed fault 0 faults = 754
Directed fault 1 faults = 6926
Access violation faults = 370
Fault tag 2 faults = 3052

Halt, IC: 000012 (000000000000)
sim> quit
Goodbye
[daveti@daveti qs]$

References:

[1] http://multicians.org/sim-inst.html
[2] http://ringzero.wikidot.com/r1-build-from-git
[3] http://multicians.org/multics-commands.html
[4] http://swenson.org/multics_wiki/index.php?title=Linux-to-Multics_Command_Mapping
[5] http://multicians.org/sim-inst-linux.html

Posted in OS, Security | Tagged , , , | 1 Comment

Some notes on the Monotonic Counter in Intel SGX and ME

SGX sealing is vulnerable to rollback attacks as the enclave is not able to tell if the sealed data is the latest or a old copy. To mitigate this attack, monotonic counter (MC) has been introduced in Intel SGX SDK 1.8. This post looks into some implementation details inside Intel SGX SDK.

1. sgx_create_monotonic_counter[1]

Look at the description of this API, two things show up – first, the counter creation involves non-volatile memory (NVM) writing; second, limited number of MCs can be created per enclave due to the quota. Apparently, unlike the older version of Intel SGX SDK, where MC was only available in the simulation mode, the MC here is based on real NVM, which also seems to have limited capacity. Let’s go deeper.

2. PSE (Platform Service/Security Enclave?)

The MC creation essentially calls sgx_create_monotonic_counter_ex[2] inside SDK. This function builds two pse_message_t messages for the request and reply, which are both passed to crypt_invoke. The crypt_invoke function establishes a PSE session (secure and authenticated channel) with aesmd, encrypts the request using the shared secret, and calls uae_invoke_service with both the request and reply passed, which eventually calls invoke_service_ocall[3] in the PSE provided by PSW other than the SDK.

3. OAL (OS Abstraction Layer?)

The PSE request and reply are then delivered to oal_invoke_service[4], which essentially transforms the PSE messages into AEInvokeServiceRequest and AEInvokeServiceResponse, and sends the AE messages to aesmd via UNIX socket ‘/var/run/aesmd/aesm.socket’. This MC creation request will then be handled by PSE_MC_SERVICE as a PSE_MC_CREATE command[1]

4. PSE_OP (PSE OPeration?)

The service handler table defined in pse_op.cpp[5] lists all the MC related command PSE_MC_SERVICE supports:

static const struct service_handler_t
{
    uint16_t  service_id;
    uint16_t  service_cmd;
    uint16_t  req_size;
    uint16_t  resp_size;
    srv_pfn_t srv_pfn;
} service_handler[] = {
    {PSE_MC_SERVICE, PSE_MC_CREATE, sizeof(pse_mc_create_req_t), sizeof(pse_mc_create_resp_t), pse_mc_create},
    {PSE_MC_SERVICE, PSE_MC_READ,   sizeof(pse_mc_read_req_t),   sizeof(pse_mc_read_resp_t),   pse_mc_read  },
    {PSE_MC_SERVICE, PSE_MC_INC,    sizeof(pse_mc_inc_req_t),    sizeof(pse_mc_inc_resp_t),    pse_mc_inc   },
    {PSE_MC_SERVICE, PSE_MC_DEL,    sizeof(pse_mc_del_req_t),    sizeof(pse_mc_del_resp_t),    pse_mc_del   },
    {PSE_TRUSTED_TIME_SERVICE, PSE_TIMER_READ,   sizeof(pse_timer_read_req_t), sizeof(pse_timer_read_resp_t), pse_read_timer},
};

And pse_mc_create is our target handler.

5. VMC (Virtual Monotonic Counter?)

Now we are into create_vmc[6], described as “create a VMC in SQLite Database and return UUID of the VMC to the caller”. Huh, so the MC is essentially an entry in a SQLite DB. So how could a SQLite DB defend against rollback attacks given that it is just a normal file on the disk? Hang on. Let’s continue. When this is the first MC creation request ever happend, we need to create the VMC DB first, via initialize_sqlite_database_file[7]. This very first thing it does is to call read_rpdata[8]. As its file name implies, the SQLite DB accesses hardware (NV) to get the hash of the DB at certain epoch, which is essentailly a HW MC. We are getting close to the real HW…

6. PSDA_Service[9]

This file contains all functions used by the VMC SQLite DB to create/delete/update HW MC in the so-far-unknown HW via invoke_psda_service. An ephemeral session is established in the function to provide secure and authenticated communication with the mystery HW, through psda_invoke_service_ocall[10], and eventually psda_invoke_service[11].

7. Intel ME

Just looking at the psda_invoke_service is really insightful:

    JVM_COMM_BUFFER commBuf;
    commBuf.TxBuf->buffer = psda_req_msg;
    commBuf.TxBuf->length = psda_req_msg_size;
    commBuf.RxBuf->buffer = psda_resp_msg;
    commBuf.RxBuf->length = psda_resp_msg_size;
    int response_code;
    ae_error_t ret;

    PROFILE_START("JHI_SendAndRecv2");
    ret = PSDAService::instance().send_and_recv(
                        PSDA_COMMAND_SERVICE,
                        &commBuf,
                        &response_code,
                        NO_RETRY_ON_SESSION_LOSS);
    PROFILE_END("JHI_SendAndRecv2");

JVM? JHI? Finally, we have reached the dark side of the MC. Somehow some Java shit involves in communicating with the secret HW[13] — Yes, it is Intel ME! JHI stands for Java Host Interface, which is used to install Java applets into Dynamic Application Loader (DAL) firmware (inside ME) and communicate with them[12]. Yes, the SGX MC is essentially implemented in ME as a Java applet called “PSDA”!

8. PSDA.dalp

PSDAService::install_psda[14] installs the psda applet into ME using JHI. Not surprisingly, the source code for psda is not available. Instead, the download_prebuilt.sh script is used to download the applet binary directly, since it needs to signed by Intel — Right! Intel does not want anyone to install any applet into ME, for security reasons!!! After running the download script, PSDA.dalp will appear under psw/ae/data/prebuilt. It is encoded as Base64:

image1

Copying out the Base64 encoded applet and running decoding, we have a chance to peek into the implementation of the applet:

com.intel.cryptocom.intel.langutilcom.intel.sgx.psdacom.intel.sgx.psda.commoncom.intel.sgx.psda.servicecom.intel.sgx.psda.service.base&com.intel.sgx.psda.service.entity.base)com.intel.sgx.psda.service.entity.command)com.intel.sgx.psda.service.entity.message)com.intel.sgx.psda.service.entity.storagecom.intel.sgx.psda.sessioncom.intel.sgx.psda.util
                                             com.intel.uicom.intel.util	java.langPlatformServiceDalAppletBytesArrayInputStreamBytesArrayOutputStream
                                  CodeWrapper	Constants#Constants$CSE_PLATFORM_CAPABILITIES!Constants$CSE_SERVICE_RESP_STATUS!Constants$EPHEMERAL_SESSION_STATEConstants$LT_SESSION_STATEConstants$MEDIAPATH_IDonstants$PSDA_COMMARPDataServiceServiceFactoryServiceMessageFactoryTrustedTimeServiceServiceHandlerPSDACommonMessage$PSDA_SERVICE_ID
                                                                                                 PSDAMessagePSDASerializableServiceCommandCSEUnknownReqResponsePRTCReadRequestPRTCReadResponse"ProtectedOutputSetStreamKeyRequest#ProtectedOutputSetStreamKeyResponseRPDataReadRequestRPDataResetRequestRPDataResponseRPDataUpdateRequestCapabilityRespMessageCertificateInfoRespMessageEphSessionMessageLTSessionMessagePlatformInfoReqMessageServiceMessagePSEPairingInfo
   SEFlashData
EphSessionEphSessionManager	LTSessionLTSessionManager
                                                         CalendarUtil
                                                                     DALApiProxy	DebugUtil	FlashUtilSGXMonotonicCounterProxyCryptoException	EpidAlgExHashAlgIllegalParameterExceptionNotSupportedExceptionOperationFailedExceptionRandom
SigmaAlgExSymmetricBlockCipherAlgSymmetricSignatureAlg
TypeConverterProtectedOutput
                            UiExceptioniIllegalParameterExceptionUiIllegalUseExceptioCalendar
DebugPrint
          FlashStorageFlashWearoutException
                                           IOException
UtilExceptionimeExceptionObjectString                 IntelAppletSEMonotonicCounter
StringBuilder   Throwable            StringBuffer
                         API_VERSION
                                    AUDIO_OUTPUT!CAP_PROTECTED_OUTPUT_KEY_EXCHANGE
CAP_RPDATACAP_TRUSTED_TIME
EPID_GID_SIZEERROR_CAP_NOT_AVAILABLEERROR_INTERNALERROR_INVALID_PARAM%ERROR_PERSISTENT_DATA_WRITE_THROTTLEDERROR_UNKNOWN_REQFLASH_STRUCT_VERSIONHMAC_SHA256_SIZEINSTANCE_ID_LEINT_SIZELT_SESSION_STATE_ESTABLISHEDLT_SESSION_STATE_IN_PROGRESSLT_SESSION_STATE_NOT_EXIST
                                    MAC_KEY_SIZEMAX_SESSION_COUNTMC_EPOCH_LENGTH
NONCE_SIZEPAYLOAD_IV_SIZEPAYLOAD_MAC_SIZEPRODUCT_TYPE_PSEPROTECTED_OUTPUT_STREAMKEY_SIZEPR_SIZEPSDA_BAD_PARAMETERPSDA_COMMAND_EPPSDA_COMMAND_LTPSDA_COMMAND_SERVICE PSDA_ILLEGAL_PARAMETER_EXCEPTIONPSDA_INTEGRITY_ERRORPSDA_INTERNAL_ERRORPSDA_INVALID_COMMANDPSDA_INVALID_SESSION_STATEPSDA_LT_PAIRING_NOT_EXISTPSDA_MSG_TYPE_CAP_QUERYPSDA_MSG_TYPE_CAP_RESULTPSDA_MSG_TYPE_CERT_INFO_QUERYPSDA_MSG_TYPE_CERT_INFO_RESULTPSDA_MSG_TYPE_EP_M1PSDA_MSG_TYPE_EP_M2PSDA_MSG_TYPE_EP_M3PSDA_MSG_TYPE_EP_M4PSDA_MSG_TYPE_LT_M1PSDA_MSG_TYPE_LT_M2PSDA_MSG_TYPE_LT_M3PSDA_MSG_TYPE_LT_M4PSDA_MSG_TYPE_SERV_REQPSDA_MSG_TYPE_SERV_RESPPSDA_NOT_PROVISIONEDPSDA_NOT_SUPPORTED_EXCEPTION$PSDA_PERSISTENT_DATA_WRITE_THROTTLEDPSDA_PLATFORM_KEYS_REVOKEDSDA_PROTOCOL_NOT_SUPPORTEDPSDA_SEQNO_CHECK_FAIL
                                                                                   PSDA_SUCCESSPSDA_UNKNOWN_REQEUSTPSDA_UTIL_EXCEPTIONPSE_PROTECTED_OUTPUT_SERVICE PSE_PROTECTED_OUTPUT_SETSTEAMKEYPSE_RPDATA_READPSE_RPDATA_RESETPSE_RPDATA_SERVICEPSE_RPDATA_UPDATEPSE_TIMER_READPSE_TRUSED_TIME_SERVICEPS_COMMAND_INFO
                                                                                     RPDATA_SIZESESSION_ACTIVESESSION_CLOSEDSESSION_IN_PROGRESSSESSION_KEY_SIZESGX_MONOTONIC_COUNTER_TYPE_RPMBSGX_MONOTONIC_COUNTER_TYPE_RPMCSUCCESSTIMER_EPOCH_LENGTHUNIX_TIMESTAMP_2015_01_01
                                           VIDEO_OUTPUTbuffercapDescriptorVersiocertInfocodecount
                                                                                                 currentTime
                                                                                                            enableDebug
encryptKeyencryptedPayload
                          eph_sessionindeinstance
lt_sessionlt_session_state_maskm_bIsFreshPairingm_bWritePending
m_calendarm_idCsem_idPsm_macKeym_mk
m_nonceCsem_pairing_infm_seqNum
                               m_sessionKeym_sessionStatem_sigmaInstancem_sk
                                                                            mediaPathIdmkmsgBodymsgLenmsgTypemsgTypeExpRespSize
             pairing_info
pairing_sk	payloadIv
payloadMac
          payloadSize
prtc_epochpse_instance_ireservedrpEpochrp_datrp_epochrpdata
rpdata_cur
rpdata_newseqNumserialVersionUIDserviceCapDescriptor0serviceCapDescriptor1
set_time_infosgxMonotonicCounterskstatussw_instance_id
timerEpochversionwrappedEncryptKeabandonSessionclearcreateServiceCommandcreateServiceMessage	deriveIDseraseAllFlashData
                        expandBufferfinalizeEphemeralSessionfinalizeLTSessionfindPairingInfogetCurrentEpidGidgetCurrentTimegetFlashDataSize
invokeServiceisFreshPairingisSessionEstablishedonCloseonInitonInternalCommandonInternalOpenHandler
                                                                                           printBuffeprintInt
                                                                                                             printreadFlashDatareadInreadLongonprocessLTSessioreadByte	readBytes
readObjectreadPairingCounter	readShortresetresponseWithMessageseekset!setFlashStoragePropertyAntiReplay
                                                                                                          setResponse	setSeqNumsetTimesigmaSetPropertyAppendSVNsizestartEphemeralSessionstartLongTermSession
                                                                                              toByteArratoStringupdatePairingInfo
               updateRPData	writeByte
writeByteswriteFlashData_0writeFlashData_writeInt	writeLong
                                                                 writeObject
writeShortappend
bytesToInt
copyByteArraycreateortcocreateAlgcreateInstancedecryptCompletedeletedisposeenableHighUpdateFrequencyFlashencryptCofillByteArraygetEncryptedKeyRecord
getMacKeySize
getMessagegetRandomBytes
                        getS1MessagegetS1MessageLengthgetS3MessageLength
                                                                        getSecretKeygetSecretKeySizegetSessionParametergetTime
initialize
isProvisionedlengthprocessCompleteprocessS2MessagesetIVsetKey
                                                             setPropertysetResponseCode
                                                                                       shortToBytes
                                                                                                   signCompleteverifyCompletewriteFlashData*+***-*D*)*;j)*L**)**Sd*+*(***+***%*%*&*&d*)*d**d**ddd*N*Sd*+**d*****dd**S*)dd*!**U*U*R*U*S*T*S*Uddddd*6*<*=*5*;dd*Tdd*L*O*Dd*@ddddddd

9. Intel ME Again

Although we only looked at the MC creation in SGX, this post has been long enough. Last thing about SGX MC is still about Intel ME. If you have seen dmesg like “Device doesn’t have valid ME Interface”, and neither /dev/mei nor /dev/mei0 is created, it is very likely that you are running a server platform with a different ME firmware (SPS) comparing to desktops/laptops[15], and the corresponding DAL firmware is not available in SPS. This means SGX MC is NOT available for servers!!! Well, I guess this does not really matter as people are in a rush to disable ME anyway… On the other hand, implementing SGX MC using ME seems to be the most reasonable choice (at least, we are still inside the CPU package), although this makes “SGX defends against ME attacks” statement in vain.

References:

[1] https://software.intel.com/en-us/node/709160
[2] https://github.com/01org/linux-sgx/blob/master/sdk/tae_service/tae_service.cpp
[3] https://github.com/01org/linux-sgx/blob/master/psw/uae_service/sgx_uae_service.cpp
[4] https://github.com/01org/linux-sgx/blob/master/psw/uae_service/uae_wrapper/src/tae_ocall_api.cpp
[5] https://github.com/01org/linux-sgx/blob/master/psw/ae/pse/pse_op/pse_op.cpp
[6] https://github.com/01org/linux-sgx/blob/master/psw/ae/pse/pse_op/monotonic_counter_database_sqlite_rpdb.cpp
[7] https://github.com/01org/linux-sgx/blob/master/psw/ae/pse/pse_op/monotonic_counter_database_sqlite_bin_hash_tree_utility.cpp
[8] https://github.com/01org/linux-sgx/blob/master/psw/ae/pse/pse_op/monotonic_counter_database_sqlite_access_hw_mc.cpp
[9] https://github.com/01org/linux-sgx/blob/master/psw/ae/pse/pse_op/psda_service.cpp
[10] https://github.com/01org/linux-sgx/blob/master/psw/ae/aesm_service/source/pse_op/pse_op_psda_ocall.cpp
[11] https://github.com/01org/linux-sgx/blob/master/psw/ae/aesm_service/source/pse_op/PSEClass.cpp
[12] https://github.com/intel/dynamic-application-loader-host-interface
[13] https://www.slideshare.net/codeblue_jp/igor-skochinsky-enpub
[14] https://github.com/01org/linux-sgx/blob/master/psw/ae/aesm_service/source/pse_op/PSDAService.cpp
[15] https://software.intel.com/en-us/forums/intel-business-client-software-development/topic/696357

Posted in Security | Tagged , , , | 12 Comments

Why list and charlist are confusing in Elixir

This post talks about list and charlist in Elixir, and discusses one specific issue in Elixir.

Let’s look at the confusing thing in Elixir:

iex(2)> a = [7]
‘\a’
iex(3)> b = ‘\a’
‘\a’
iex(4)> a == b
true
iex(5)> is_list(a)
true
iex(6)> is_list(b)
true

First, a list is shown as a charlist in iex; second, (without surprise if we believe in iex), the charlist equals the list; last, both the list and the charlist are lists.

Why? Because charlist in represented as a list with code point(s) inside, and those encoding scheme is UTF-8 (rather than UTF-16 in Erlang). To convince yourself:

iex(7)> Enum.map(0..255, fn x -> [x] end)
[[0], [1], [2], [3], [4], [5], [6], ‘\a’, ‘\b’, ‘\t’, ‘\n’, ‘\v’, ‘\f’, ‘\r’,
[14], [15], [16], [17], [18], [19], [20], [21], [22], [23], [24], [25], [26],
‘\e’, [28], [29], [30], [31], ‘ ‘, ‘!’, ‘”‘, ‘#’, ‘$’, ‘%’, ‘&’, ‘\”, ‘(‘,
‘)’, ‘*’, ‘+’, ‘,’, ‘-‘, ‘.’, ‘/’, ‘0’, ‘1’, …]

Looks familiar? Yes, those are ASCII codes. So the conclusion seems to be that list and charlist are the same in Elixir. Well, this is mostly right, as we have seen, but not exactly right. Run the file below to inspect into list and charlist internal using pry:

require IEx

value_list = [7]
value_list2 = [1]
value_char = '\a'

IEx.pry

daveti@Daves-MacBook-Pro:~/elixir/list_issue$ iex list_issue.exs
Erlang/OTP 20 [erts-9.0] [64-bit] [smp:4:4] [ds:4:4:10] [async-threads:10] [hipe] [kernel-poll:false] [dtrace]

Request to pry #PID<0.83.0> at list_issue.exs:7

5: value_char = ‘\a’
6:
7: IEx.pry

Allow? [Yn]
Interactive Elixir (1.5.1) – press Ctrl+C to exit (type h() ENTER for help)
pry(1)> i value_list
Term
‘\a’
Data type
List
Description
This is a list of integers that is printed as a sequence of characters
delimited by single quotes because all the integers in it represent valid
ASCII characters. Conventionally, such lists of integers are referred to as
“charlists” (more precisely, a charlist is a list of Unicode codepoints,
and ASCII is a subset of Unicode).
Raw representation
[7]
Reference modules
List
Implemented protocols
IEx.Info, Collectable, Enumerable, Inspect, List.Chars, String.Chars
pry(2)>
nil
pry(3)> i value_list2
Term
[1]
Data type
List
Reference modules
List
Implemented protocols
IEx.Info, Collectable, Enumerable, Inspect, List.Chars, String.Chars
pry(4)>
nil
pry(5)> i value_char
Term
‘\a’
Data type
List
Description
This is a list of integers that is printed as a sequence of characters
delimited by single quotes because all the integers in it represent valid
ASCII characters. Conventionally, such lists of integers are referred to as
“charlists” (more precisely, a charlist is a list of Unicode codepoints,
and ASCII is a subset of Unicode).
Raw representation
[7]
Reference modules
List
Implemented protocols
IEx.Info, Collectable, Enumerable, Inspect, List.Chars, String.Chars
pry(6)>

Now you see the difference between list and charlist. Both have data type list, which explains why they are equivalent. Moreover, when the numbers inside the list could be interpreted as valid ASCII code points, Elixir shows it as a charlist rather than a list, although the raw presentation is always a list.

Why this is confusing? Let’s say you wanna print out a list, e.g., [7], but what Elixir displays is ‘\a’. Although it does not affect the internal computation, e.g., map, this display issue is really annoying and confusing. In this case, Erlang seems doing sane:

Eshell V9.0 (abort with ^G)
1> [7].
[7]
2> ‘\a’.
a
3> [7] == ‘\a’.
false

Posted in Programming | Tagged , , , , | Leave a comment

A PoC of DoS attack in Elixir Actor Model

The naive way of using the Actor model in Elixir is using “receive” in a loop, which is then “spawn”d as a Erlang process. Unfortunately, a potential DoS attack could happen if the pattern matching is not coded carefully with security in mind. This post looks into this potential vulnerability, and provides a PoC implementation.

A typical “receive” block in Elixir provides different pattern matchings, like below:

receive do
      {sender_pid, {:msg, _msg}} ->
        send(sender_pid, {:ok, "gotcha"})
      _ ->
        IO.puts "WTF"
    end

Following the same order of matchings, the message goes through each matching one by one till finding a match. Otherwise, “_” (matching all) catchs whatever received. While this defensive matching all seems optional, it is vital to the security of the Erlang VM. As turns out, the message would stay in the mailbox forever, if no matching is able to “catch” the message. The potential result is DoS attack, caused by OOM (out of memory) of the mailbox.

To observe this potential DoS attack, please download the PoC (https://github.com/daveti/edos). As shown below, unmatched messages keep accumulating in the mailbox (message queue) when running edos:

msg 24645 sent
worker msg queue len {:message_queue_len, 24645}
msg 24646 sent
worker msg queue len {:message_queue_len, 24646}
msg 24647 sent
worker msg queue len {:message_queue_len, 24647}
msg 24648 sent
^C

“top” also shows the memory usage (of beam.smp) keep increasing as more messages are sent to the mailbox. This DoS could turn into DDoS easily in a distributed environment.

As usual, Elixir compiler does not complain about this during the compilation time, although it would be really useful for the compiler to give some warnings. Nothing is gonna happen in runtime as well, except the memory consumption of the mailbox, which is “invisible” to the user. Even worse, dialyxir also reports nothing regarding this potential issue.

One more thing needs to verify is whether a GenServer implementation without handle_info/2 would also cause DoS in the mailbox. I suspect it the same as above.

Posted in Programming, Security, Static Code Analysis | Tagged , , , , | Leave a comment

A pitfall of GenServer programming in Elixir

OTP as a programming platform/framework, empowers not only Erlang, but also Elixir. This post looks into a pitfall when programming GenServer in Elixir, and provides a potential solution.

All callbacks in the GenServer behavior have a limited and pre-defined possible return values. Presumbly, one should pick up the desired return value from valid ones. One example for handle_call/3 looks like below:

{:reply, reply, new_state}

But what if we screw the return value, e.g., by mistakes like below,

{:reply, reply, :ok, new_state}

where atom :ok is added as an extra parameter of the return value. One would hope the Elixir compiler would catch this simple mistake. Nevertheless, neither mix compile or mix escript.build would report this bug. And yes, this bug will be triggered when handle_call/3 is called with a “Bad Argument” exception thrown out.

While “let it crash” and “unit testing” may be good reasons for Elixir compiler to treat this bug as a runtime one rather than a compiling one, this also means this simple bug won’t be discovered util deployment if unit testing is not covered properly. And once it happens, it keeps crashing “thanks” to the fault-tolerant feature using Supervisor…

Without changes in the Elixir compiler, it is still possible to detect this bug, using dialyxir (yes, Elixir version of dialyzer for sure). “mix dialyxir” shows the detection of this bug:

done in 0m1.52s
lib/dig.ex:9: The inferred return type of handle_call/3 ({‘reply’,{‘ok’,{_,_,_,_,_}},’ok’,#{‘__struct__’:=’Elixir.Digdb’, ‘node’:=_, ‘prefix’:=_, ‘remote’:=_, ‘unit_base’:=_, ‘unit_range’:=_, ‘worker_idx’:=_, ‘zero_num’:=_}}) has nothing in common with {‘noreply’,_} | {‘noreply’,_,’hibernate’ | ‘infinity’ | non_neg_integer()} | {‘reply’,_,_} | {‘stop’,_,_} | {‘reply’,_,_,’hibernate’ | ‘infinity’ | non_neg_integer()} | {‘stop’,_,_,_}, which is the expected return type for the callback of the ‘Elixir.GenServer’ behaviour
done (warnings were emitted)

Posted in Programming, Static Code Analysis | Tagged , , , , | Leave a comment

Pitfalls in negative indexing in C

Negative indexing in C, such as a[-1], is legit, although rarely used. There are reasons (pitfalls) why negative indexing is not recommended. This post discusses these pitfalls when using negative indexing (for fun).

1. Negative indexing within the bound

#include <stdio.h>
#include <string.h>

int main()
{
	int a[10];
	int *p;
	int i;

	for (i = 0; i < 10; i++)
		a[i] = 0x12345678;
	p = &a[9];

	for (i = 0; i < 10; i++) {
		printf("a[%d]=0x%x, %p\n", i, a[i], &a[i]);
		printf("p[-%d]=0x%x, %p\n", i, p[-i], &p[-i]);
	}

	return 0;
}

In the code above, we let pointer p pointing to the end of array a, and print out each element in a using negative indexing with the help of p.

[daveti@daveti c]$ ./negidx0
a[0]=0x12345678, 0x7ffd50edad40
p[-0]=0x12345678, 0x7ffd50edad64
a[1]=0x12345678, 0x7ffd50edad44
p[-1]=0x12345678, 0x7ffd50edad60
a[2]=0x12345678, 0x7ffd50edad48
p[-2]=0x12345678, 0x7ffd50edad5c
a[3]=0x12345678, 0x7ffd50edad4c
p[-3]=0x12345678, 0x7ffd50edad58
a[4]=0x12345678, 0x7ffd50edad50
p[-4]=0x12345678, 0x7ffd50edad54
a[5]=0x12345678, 0x7ffd50edad54
p[-5]=0x12345678, 0x7ffd50edad50
a[6]=0x12345678, 0x7ffd50edad58
p[-6]=0x12345678, 0x7ffd50edad4c
a[7]=0x12345678, 0x7ffd50edad5c
p[-7]=0x12345678, 0x7ffd50edad48
a[8]=0x12345678, 0x7ffd50edad60
p[-8]=0x12345678, 0x7ffd50edad44
a[9]=0x12345678, 0x7ffd50edad64
p[-9]=0x12345678, 0x7ffd50edad40

The output proves what we would expect. Subcriptor [] simply means that E1[E2]=*((E1)+(E2)). So it does not matter whether E2 is positive or negative, and the compiler will take care of pointer type to generate the correct the result pointer (say, whether it is int or char). [1]

2. Negative indexing out of the bound

Things get complicated when we try to reference var a from var b, assuming that both a and b are next to each other…

#include <stdio.h>
#include <string.h>

int main()
{
	char a[10];
	char b[10];

	memset(a, 0x0, 10);
	memset(b, 0x1, 10);

	int i;
	printf("a=%p, b=%p\n", a, b);
	for (i = 0; i < 10; i++) {
		printf("b[%d]=%d, %p\n", i, b[i], &b[i]);
		printf("b[-%d]=%d, %p\n", i, b[-i], &b[-i]);
	}

	return 0;
}

If you, like myself, expect to see some 0/1 interleaved, then the output would let you down:

[daveti@daveti c]$ ./negidx1
a=0x7ffc4c88ecd2, b=0x7ffc4c88ecc8
b[0]=1, 0x7ffc4c88ecc8
b[-0]=1, 0x7ffc4c88ecc8
b[1]=1, 0x7ffc4c88ecc9
b[-1]=0, 0x7ffc4c88ecc7
b[2]=1, 0x7ffc4c88ecca
b[-2]=0, 0x7ffc4c88ecc6
b[3]=1, 0x7ffc4c88eccb
b[-3]=0, 0x7ffc4c88ecc5
b[4]=1, 0x7ffc4c88eccc
b[-4]=0, 0x7ffc4c88ecc4
b[5]=1, 0x7ffc4c88eccd
b[-5]=0, 0x7ffc4c88ecc3
b[6]=1, 0x7ffc4c88ecce
b[-6]=64, 0x7ffc4c88ecc2
b[7]=1, 0x7ffc4c88eccf
b[-7]=6, 0x7ffc4c88ecc1
b[8]=1, 0x7ffc4c88ecd0
b[-8]=16, 0x7ffc4c88ecc0
b[9]=1, 0x7ffc4c88ecd1
b[-9]=0, 0x7ffc4c88ecbf

With all the addresses printed out, I guess you have figured sth out. The pitfall here is the actual memory layout of a and b – since stack grows downwards (from high to low), a has higher starting address than b. That is being said, b[-x] would never reach a, which can be seen from the addresses printed out. Therefore, if we wanna reference a from b, we need positive indexing out of bound, e.g.,

#include <stdio.h>
#include <string.h>

int main()
{
	char a[10];
	char b[10];

	memset(a, 0x0, 10);
	memset(b, 0x1, 10);

	int i;
	printf("a=%p, b=%p\n", a, b);
	for (i = 0; i < 20; i++) {
		printf("b[%d]=%d, %p\n", i, b[i], &b[i]);
	}

	return 0;
}

And here is the output:

[daveti@daveti c]$ ./negidx2
a=0x7ffdbf9bc682, b=0x7ffdbf9bc678
b[0]=1, 0x7ffdbf9bc678
b[1]=1, 0x7ffdbf9bc679
b[2]=1, 0x7ffdbf9bc67a
b[3]=1, 0x7ffdbf9bc67b
b[4]=1, 0x7ffdbf9bc67c
b[5]=1, 0x7ffdbf9bc67d
b[6]=1, 0x7ffdbf9bc67e
b[7]=1, 0x7ffdbf9bc67f
b[8]=1, 0x7ffdbf9bc680
b[9]=1, 0x7ffdbf9bc681
b[10]=0, 0x7ffdbf9bc682
b[11]=0, 0x7ffdbf9bc683
b[12]=0, 0x7ffdbf9bc684
b[13]=0, 0x7ffdbf9bc685
b[14]=0, 0x7ffdbf9bc686
b[15]=0, 0x7ffdbf9bc687
b[16]=0, 0x7ffdbf9bc688
b[17]=0, 0x7ffdbf9bc689
b[18]=0, 0x7ffdbf9bc68a
b[19]=0, 0x7ffdbf9bc68b

Or, if we wanna reference b from a, we could use negative indexing out of bound, e.g.,

#include <stdio.h>
#include <string.h>

int main()
{
	char a[10];
	char b[10];

	memset(a, 0x0, 10);
	memset(b, 0x1, 10);

	int i;
	printf("a=%p, b=%p\n", a, b);
	for (i = 0; i < 10; i++) {
		printf("a[%d]=%d, %p\n", i, a[i], &a[i]);
		printf("a[-%d]=%d, %p\n", i, a[-i], &a[-i]);
	}

	return 0;
}

And here is the output:

[daveti@daveti c]$ ./negidx3
a=0x7ffece838602, b=0x7ffece8385f8
a[0]=0, 0x7ffece838602
a[-0]=0, 0x7ffece838602
a[1]=0, 0x7ffece838603
a[-1]=1, 0x7ffece838601
a[2]=0, 0x7ffece838604
a[-2]=1, 0x7ffece838600
a[3]=0, 0x7ffece838605
a[-3]=1, 0x7ffece8385ff
a[4]=0, 0x7ffece838606
a[-4]=1, 0x7ffece8385fe
a[5]=0, 0x7ffece838607
a[-5]=1, 0x7ffece8385fd
a[6]=0, 0x7ffece838608
a[-6]=1, 0x7ffece8385fc
a[7]=0, 0x7ffece838609
a[-7]=1, 0x7ffece8385fb
a[8]=0, 0x7ffece83860a
a[-8]=1, 0x7ffece8385fa
a[9]=0, 0x7ffece83860b
a[-9]=1, 0x7ffece8385f9

3. Alignment

Things get more complicated when there is memory alignment in the stack, which is pretty common.

#include <stdio.h>
#include <string.h>

typedef struct _total_made_up {
	char c0;
	int i;
	char c1[10];
} tmu;

int main()
{
	tmu a;
	tmu b;

	memset(&a, 0x0, sizeof(tmu));
	memset(&b, 0x1, sizeof(tmu));

	int i;
	printf("a=%p, a.c0=%p, a.i=%p, a.c1=%p, sizeof(a)=%d\n",
		&a, &(a.c0), &(a.i), &(a.c1), sizeof(a));
	printf("b=%p, b.c0=%p, b.i=%p, b.c1=%p, sizeof(b)=%d\n",
		&b, &(b.c0), &(b.i), &(b.c1), sizeof(b));
	for (i = 0; i < 10; i++) {
		printf("a.c1[%d]=%d, %p\n", i, (a.c1)[i], &(a.c1)[i]);
		printf("a.c1[-%d]=%d, %p\n", i, (a.c1)[-i], &(a.c1)[-i]);
	}

	return 0;
}

Here is the output:

[daveti@daveti c]$ ./negidx4
a=0x7ffc4bf3c8e0, a.c0=0x7ffc4bf3c8e0, a.i=0x7ffc4bf3c8e4, a.c1=0x7ffc4bf3c8e8, sizeof(a)=20
b=0x7ffc4bf3c8c0, b.c0=0x7ffc4bf3c8c0, b.i=0x7ffc4bf3c8c4, b.c1=0x7ffc4bf3c8c8, sizeof(b)=20
a.c1[0]=0, 0x7ffc4bf3c8e8
a.c1[-0]=0, 0x7ffc4bf3c8e8
a.c1[1]=0, 0x7ffc4bf3c8e9
a.c1[-1]=0, 0x7ffc4bf3c8e7
a.c1[2]=0, 0x7ffc4bf3c8ea
a.c1[-2]=0, 0x7ffc4bf3c8e6
a.c1[3]=0, 0x7ffc4bf3c8eb
a.c1[-3]=0, 0x7ffc4bf3c8e5
a.c1[4]=0, 0x7ffc4bf3c8ec
a.c1[-4]=0, 0x7ffc4bf3c8e4
a.c1[5]=0, 0x7ffc4bf3c8ed
a.c1[-5]=0, 0x7ffc4bf3c8e3
a.c1[6]=0, 0x7ffc4bf3c8ee
a.c1[-6]=0, 0x7ffc4bf3c8e2
a.c1[7]=0, 0x7ffc4bf3c8ef
a.c1[-7]=0, 0x7ffc4bf3c8e1
a.c1[8]=0, 0x7ffc4bf3c8f0
a.c1[-8]=0, 0x7ffc4bf3c8e0
a.c1[9]=0, 0x7ffc4bf3c8f1
a.c1[-9]=0, 0x7ffc4bf3c8df

If we compare the starting address of a and b, we can find a 32-byte difference. The size of the struct is 20 bytes, which means there is a 12-byte gap between the end of b and the start of a due to padding (alignment). Even worse, there is also padding happened within the structure, where the c0 is padded with 3 bytes to alignment with the 4-byte int (i), and c1 is padded with another 2 bytes in the end. Therefore, if we want to use a.c1 to reference b.c1 with negative indexing, the index should start from -22 (8 bytes in a, and 12 bytes in the gap, and 2 bytes padding in b.c1). You don’t see this number coming, do you:)

#include <stdio.h>
#include <string.h>

typedef struct _total_made_up {
	char c0;
	int i;
	char c1[10];
} tmu;

int main()
{
	tmu a;
	tmu b;

	memset(&a, 0x0, sizeof(tmu));
	memset(&b, 0x0, sizeof(tmu));
	b.c0 = 1;
	b.i = 1;
	memset(&b.c1, 0x1, 10);

	int i;
	printf("a=%p, a.c0=%p, a.i=%p, a.c1=%p, sizeof(a)=%d\n",
		&a, &(a.c0), &(a.i), &(a.c1), sizeof(a));
	printf("b=%p, b.c0=%p, b.i=%p, b.c1=%p, sizeof(b)=%d\n",
		&b, &(b.c0), &(b.i), &(b.c1), sizeof(b));
	for (i = 0; i < 10; i++) {
		printf("a.c1[%d]=%d, %p\n", i, (a.c1)[i], &(a.c1)[i]);
		printf("a.c1[-%d-22]=%d, %p\n", i, (a.c1)[-i-22], &(a.c1)[-i-22]);
	}

	return 0;
}

[daveti@daveti c]$ ./negidx5
a=0x7ffce988ccf0, a.c0=0x7ffce988ccf0, a.i=0x7ffce988ccf4, a.c1=0x7ffce988ccf8, sizeof(a)=20
b=0x7ffce988ccd0, b.c0=0x7ffce988ccd0, b.i=0x7ffce988ccd4, b.c1=0x7ffce988ccd8, sizeof(b)=20
a.c1[0]=0, 0x7ffce988ccf8
a.c1[-0-22]=0, 0x7ffce988cce2
a.c1[1]=0, 0x7ffce988ccf9
a.c1[-1-22]=1, 0x7ffce988cce1
a.c1[2]=0, 0x7ffce988ccfa
a.c1[-2-22]=1, 0x7ffce988cce0
a.c1[3]=0, 0x7ffce988ccfb
a.c1[-3-22]=1, 0x7ffce988ccdf
a.c1[4]=0, 0x7ffce988ccfc
a.c1[-4-22]=1, 0x7ffce988ccde
a.c1[5]=0, 0x7ffce988ccfd
a.c1[-5-22]=1, 0x7ffce988ccdd
a.c1[6]=0, 0x7ffce988ccfe
a.c1[-6-22]=1, 0x7ffce988ccdc
a.c1[7]=0, 0x7ffce988ccff
a.c1[-7-22]=1, 0x7ffce988ccdb
a.c1[8]=0, 0x7ffce988cd00
a.c1[-8-22]=1, 0x7ffce988ccda
a.c1[9]=0, 0x7ffce988cd01
a.c1[-9-22]=1, 0x7ffce988ccd9

3. Summary

We have just talked about how messed up negative indexing could be on the stack. It is also possible to do this on the heap, which is more error-prone given the memory management under the hood (malloc/kmalloc). To correctly use negative (or out of bound) indexing, one needs to know exactly what the memory layout looks like, this could be hard for stack variables, even harder for heap ones. For normal programming, in most cases, you do not need negative indexing. As a matter of fact, the only normal usage I have seen so far is from ARMv3-M software development manual, where negative indexing is used to retrieve the SVC instruction number, which is saved on the stack by the CPU core during the supervisor call.

Reference:

[1] https://stackoverflow.com/questions/3473675/are-negative-array-indexes-allowed-in-c

Posted in Programming | Tagged , | Leave a comment

USB gadget functionalities in Android

I started working on Android stuffs this summer. While I mainly work on the USB layer within the Linux kernel, I do sometimes need to look into the Android framework, to see if I could achieve my goal from the Android user space. One of the questions I have got is to find the USB configuration of the phone (e.g., MTP, adb, and etc) when connected with the host machine. And here comes this post.

1. lsusb

The most straightforward way to have the USB information of the phone is to plug it into a Linux box, and run ‘lsusb -t‘ or ‘lsusb -v‘. Things get annoying when I tried to pull similar information from the phone directly.

2. adb

If you have adb with root permission, go to ‘/sys/class/android_usb/android0‘. You will find all the kernel USB gadget functionalities builtin, and the configurations, e.g., which functionalities are enabled in the phone right now. This won’t work if adb runs as ‘shell’ user.

3. USB Options

When the phone is connected with the host machine, the USB option menu would pop up providing more options: charging, MTP, PTP, and etc. This should be the easiest way in most cases, however, would fail if you are not playing with your own phone (e.g., pin protected, and yes, I am talking about “remote” phones in the device farm). It is also possible that some USB gadget functionalities are NOT exposed to the menu at all, e.g., a hiddent USB CDC/ACM functionality.

4. USB Settings

Some vendors provide secret dial codes to trigger a USB Setting menu with some combinations of different gadget functionalities for testing/debugging purpose. E.g., for Samsung Galaxy phones, use ‘*#0808#‘. Let me know if you have similar dial codes from other vendors.

5. UsbManager

Now it is time to look into the Android framework. Can we write an app to show the USB configuration/information? UsbManager.java[1] is the implementation in Android to control the USB hardware. To differentiate the USB host mode and the USB gadget mode, the Android APIs have USB Host APIs and USB Accessory API[2], both of which are managed by the UsbManager. For instance, getDeviceList() in the UsbManager returns a list of UsbDevice connected with the phone. As expected, UsbDevice class is the instantiation of a USB device defined by the USB spec. We could go and find the configuration, interfaces, and endpoints, just like what we could do with libusb. Accordingly, getAccessoryList() returns a list of UsbAccessory the phone behaves. Here is the problem – UsbAccessory is NOT abstracted as UsbDevice! The only information from UsbAccessory you could get is device information, such as manufacturer, product, serial number, and etc. In short, no interface-level information in the UsbAccessory!

Nevertheless, UsbManager may still be the answer if we look the latest implementation[3]. ‘isFunctionEnabled()’ and ‘containsFunction()’ seem to be promising to detect hiddent functionalities, which may not be shown in the USB option menu. Unfortunately, this still depends on the vendor-specific customization – whether or not the USB function is exposed to the Android framework (platform), and managed by the UsbManager.

Refs:
[1] https://developer.android.com/reference/android/hardware/usb/UsbManager.html
[2] https://developer.android.com/guide/topics/connectivity/usb/index.html
[3] https://github.com/android/platform_frameworks_base/blob/master/core/java/android/hardware/usb/UsbManager.java

Posted in OS | Tagged , , | Leave a comment

Understanding kcov – play with -fsanitize-coverage=trace-pc from the user space

kcov is a kernel feature used to support syzkaller[1]. To provide the code coverage information from the kernel itself, the GCC compiler was patched to instrument the kernel image[2]. The kernel itself was also patched to enable this feature where is propriate[3]. This post tries to reproduce the essense of kcov in the user space, in the hope of a better understanding of kcov in general.

1. -fsanitize-coverage=trace-pc

After (>=) 6.0, GCC supports this new feature/flag, which instruments every basic block generated by GCC with function “trace-pc”. This “trace-pc” function is provided by the user, and should have the name as “__sanitizer_cov_trace_pc“. For kcov, this function is implemented in kernel/kcov.c. By the time of writing, there is no user-space example available using this new flag, primarily because this feature is mainly designed for kernel/syzkaller, and gcov is available for user-space programs already. Nevertheless, we will show how to play this in the user space:)

2. First try

Let’s create a “non-trivial” testing file and a makefile, as shown below:

#include <stdio.h>
#include <string.h>

void __sanitizer_cov_trace_pc(void)
{
	printf("code instrumented...\n");
}

static int fun2(void)
{
	printf("fun2\n");
	return 0;
}

static void fun1(int a)
{
	fun2();
	printf("fun1: a[%d]\n", a);
}

int main(void)
{
	int a = 10;
	fun1(a);
	return 0;
}
CC=gcc
CFLAGS=-g -ggdb -fsanitize-coverage=trace-pc
LIBS=
DEPS=
OBJ=test.o

%.o: %.c $(DEPS)
	$(CC) -c -o $@ $< $(CFLAGS) $(LIBS)

test : $(OBJ)
	$(CC) -o $@ $^ $(CFLAGS) $(LIBS)

.PHONY: clean

clean:
	rm -f *.o test

Function “__sanitizer_cov_trace_pc” is defined in test.c file. The Makefile also enables the “-fsanitize-coverage” flag. We compile it, run it, and as you expected, get coredump. What the heck? Disassembling the binary gives us some hints:

0000000000400546 <__sanitizer_cov_trace_pc>::
  400546:	55                   	push   %rbp
  400547:	48 89 e5             	mov    %rsp,%rbp
  40054a:	e8 f7 ff ff ff       	callq  400546 <__sanitizer_cov_trace_pc>
  40054f:	bf 80 06 40 00       	mov    $0x400680,%edi
  400554:	e8 d7 fe ff ff       	callq  400430 <puts@plt>
  400559:	90                   	nop
  40055a:	5d                   	pop    %rbp
  40055b:	c3                   	retq

Besides main, fun1, fun2 are instrumented by __sanitizer_cov_trace_pc, __sanitizer_cov_trace_pc itself is also instrumented by the compiler, which creates a recursive bomb, and stack overflow eventually. So, we need to tell GCC not to instrument the instrumenting function itself.

3. Second try

If we look at the kernel patch[3] again, the __sanitizer_cov_trace_pc is defined with “notrace” decorator, which is essentially an attribute to let GCC skip instrumentation for the decorated function. All we need to do is to update our __sanitizer_cov_trace_pc function in test.c with this decorator:

#define notrace __attribute__((no_instrument_function))

void notrace __sanitizer_cov_trace_pc(void)
{
	printf("code instrumented...\n");
}

Compile, run, and unfortunately coredump again. Disassembly shows the recursive bomb is still there. WTF!

4. Do it again

If we look at the old makefile, we have __sanitizer_cov_trace_pc defined inside the test.c and shared the same compilation flags as other functions, including “-fsanitize-coverage”. It is time to split the function into a separated file and use a different set of compilation flags. Now we have trace.c, trace.h, test.c, and a new makefile, as shown below:

#include <stdio.h>

#define notrace __attribute__((no_instrument_function))

void notrace __sanitizer_cov_trace_pc(void)
{
	printf("code instrumented...\n");
}
#define notrace __attribute__((no_instrument_function))
void notrace __sanitizer_cov_trace_pc(void);
#include <stdio.h>
#include <string.h>
#include "trace.h"

static int fun2(void)
{
	printf("fun2\n");
	return 0;
}

static void fun1(int a)
{
	fun2();
	printf("fun1: a[%d]\n", a);
}

int main(void)
{
	int a = 10;
	fun1(a);
	return 0;
}
CC=gcc
CFLAGS=-g -ggdb -fsanitize-coverage=trace-pc
LIBS=
DEPS=trace.h
OBJ=test.o trace.o

all: test

trace.o: trace.c
	$(CC) -c -o $@ $<

test.o: test.c
	$(CC) -c -o $@ $< $(CFLAGS) $(LIBS)

test : $(OBJ)
	$(CC) -o $@ $^ $(CFLAGS) $(LIBS)

.PHONY: clean

clean:
	rm -f *.o test

Compile, run, and it works:

[daveti@daveti fuzz]$ ./test
code instrumented...
code instrumented...
code instrumented...
fun2
code instrumented...
fun1: a[10]
code instrumented...

5. Trace PC

The final missing part of our user-space implementation comparing to kcov is the PC tracing functionality. Fortunately, GCC already provides a builtin command to retrieve the return address pushed on the stack. As what linux kernel does, we update trace.c to enable PC tracing:

#include <stdio.h>

#define notrace	__attribute__((no_instrument_function))
#define _RET_IP_	(unsigned long)__builtin_return_address(0)

void notrace __sanitizer_cov_trace_pc(void)
{
	//printf("code instrumented...\n");
	printf("return pc [0x%x]\n", _RET_IP_);
}

Compile, run, and use addr2line to get debugging information from these addresses. Doesn’t it look almost the same as kcov!

[daveti@daveti fuzz2]$ ./test
return pc [0x4005ab]
return pc [0x400581]
return pc [0x400554]
fun2
return pc [0x400568]
fun1: a[10]
return pc [0x4005c6]
[daveti@daveti fuzz2]$ addr2line -e ./test 0x4005ab
/home/daveti/c/fuzz2/test.c:19

6. Notes

One observation we can easily find is the overhead of this instrumentation. Since the instrumentation happens per basic block instead of functions, it is trivial to have bigger number of instrumentations than the number of actual function calls. For example, in the our example, we have 3 functions, and 5 instrumentations. This is also the reason why not every kernel subsystem/component enables kcov (besides, this instrumentation could break kernel booting). Again, this post tries to mimic what kcov does from the user space. For real user space coverage instrumentation, there is gcov[4] already.

[1] https://github.com/google/syzkaller
[2] http://gcc.1065356.n8.nabble.com/Re-Add-fuzzing-coverage-support-td1212322.html
[3] https://github.com/torvalds/linux/commit/5c9a8750a6409c63a0f01d51a9024861022f6593
[4] https://gcc.gnu.org/onlinedocs/gcc/Gcov.html

Posted in OS, Security, Stuff about Compiler | Tagged , , | Leave a comment