getdelays – get delay accounting information from the kernel

Top may be the most commonly used tool whenever a performance issue hits. It is simple, quick, and dumb. Besides heavy-metal tools like perf and gprof, another really useful and simple tool is getdelays, which provides per-process/task latency statistics for CPU, memory, and I/O.

1. Where to get it

https://www.kernel.org/doc/Documentation/accounting/getdelays.c
As mentioned in the comment at the top of the file, it needs to be compiled with:

gcc -I/usr/src/linux/include getdelays.c -o getdelays

Since it uses a netlink socket, it also requires root permission to run.

2. What it does

Getdelays does a simple job – it creates a netlink socket, sends a request to the kernel to read the task statistics, and prints out the reply. Essentially, this netlink socket exposes the kernel taskstats structure to user space. For more information about the taskstats struct, please refer to
https://lxr.missinglinkelectronics.com/linux/include/uapi/linux/taskstats.h#L177
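The “delay total” and “delay average” numbers getdelays prints come straight from counters in struct taskstats. A minimal sketch (the struct here is abridged to a few fields with simplified names; the real layout is in taskstats.h): the average is just the total delay in nanoseconds divided by the number of recorded delay samples, reported in milliseconds.

```c
#include <stdint.h>

/* Simplified mirror of a few fields from struct taskstats
 * (include/uapi/linux/taskstats.h); the real struct has many more. */
struct taskstats_lite {
    uint64_t cpu_delay_total;    /* ns spent waiting on a runqueue */
    uint64_t cpu_count;          /* number of delay samples recorded */
    uint64_t blkio_delay_total;  /* ns spent waiting for sync block I/O */
    uint64_t blkio_count;
    uint64_t swapin_delay_total; /* ns spent waiting for swapin page faults */
    uint64_t swapin_count;
};

/* Average delay in milliseconds, the way getdelays reports it. */
static double avg_delay_ms(uint64_t total_ns, uint64_t count)
{
    return count ? (double)total_ns / count / 1e6 : 0.0;
}

static double cpu_avg_ms(const struct taskstats_lite *t)
{
    return avg_delay_ms(t->cpu_delay_total, t->cpu_count);
}
```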

3. What it looks like

image00

In the example above, it shows the delay information for httpd, which seems to be working fine without any memory or I/O issues, except some minor CPU delays since it is a background process rather than an interactive shell. If the application has shown latency issues, getdelays should show nonzero numbers in “delay total” and “delay average”, which helps narrow the scope of the performance issue to CPU, memory, or I/O.

4. Note

Do not expect too much from getdelays – it simply prints some counters kept by the kernel, but that should be enough to tell where the problem might be. To find the actual performance bottleneck, strace/ltrace/dtrace/lttng/perf/gprof should be considered as the next step.

Posted in OS, Programming

Making USB Great Again with USBFILTER – a USB layer firewall in the Linux kernel

USENIX Security '16

Our paper “Making USB Great Again with USBFILTER” has been accepted by USENIX Security ’16. This post provides a summary of usbfilter. For details, please read the damn paper or download the presentation video/slides from the USENIX website. I will head to TX next week, and see you there~

0. Why is USB not great anymore?

We CANNOT trust a USB device by its appearance anymore. One of the typical BadUSB attacks is a USB drive with a keyboard functionality that injects a malicious script into the host machine once plugged in. The root cause of the problem is that (almost) anyone can change a USB device’s firmware to add new functionalities as desired. And people will just plug in USB flash drives found somewhere out of curiosity (“Users Really Do Plug in USB Drives They Find”, Oakland ’16). Even worse, this also puts enterprise infrastructure in danger – however powerful the network firewall may be, a suspicious USB device used by an employee can render it all in vain. As a result, enterprise settings usually forbid the use of external USB devices except the original keyboards/mice. Most normal users just ignore these threats, or try to plug unknown USB devices into someone else’s machine…(this is how friendship breaks). Note that cellphones are also USB devices – what would you do when someone needs to charge his/her phone using your machine?

1. Our solution – usbfilter

The more we play with USB, the more we realize that USB is just another transport protocol, like TCP/IP for networking devices. What travels between the USB host controller and the devices is USB packets. Inspired by netfilter in the Linux kernel, we came up with something like the figure below, and all we needed was to make it work.

image00

2. The design and implementation of usbfilter

One of the key features of usbfilter is its ability to trace a USB packet back to its originating process. This is non-trivial. For instance, because of the generic block layer and the I/O scheduler in the kernel, all USB packets operating on (reading/writing) USB storage devices are handled by the usb-storage kernel thread for performance reasons. Similarly, USB networking devices usually have their own Rx/Tx queues to buffer skbs (IP packets) before they are encapsulated by the USB stack in their drivers. Because usbfilter works at the lowest level of the USB abstraction in the kernel, the pid it sees usually belongs either to a kernel thread (device drivers) or to an IRQ context (null). As one can imagine, we hacked into different subsystems of the kernel and saved the originating pid into the urb (USB packet) before it was lost to asynchronous I/O. Once we fixed that, we had a more concrete picture of usbfilter:

image00

Now all we need is a user-space tool, which we call usbtables, to communicate with the usbfilter component in the kernel and enforce rules/policies pushed from user space. To make sure there are no conflicts or contradictions among rules, usbtables also has an internal Prolog engine to reason about each new rule before it is pushed into the kernel.

image00

3. So what can usbfilter do?

Here is the fun part. We list a bunch of cool use cases here. For a complete list of case studies, please refer to our paper. In general, just like iptables, with the help of usbtables, users can write rules to regulate the functionalities of USB devices.

A Listen-only USB headset

usbtables -a logitech-headset -v ifnum=2,product=
      "Logitech USB Headset",manufacturer=Logitech -k
      direction=1 -t drop

A Logitech webcam C310, which can only be used by Skype

usbtables -a skype -o uid=1001,comm=skype -v
      serial=B4482A20 -t allow
usbtables -a nowebcam -v serial=B4482A20 -t drop

A USB port dedicated for charging

usbtables -a charger -v busnum=1,portnum=4 -t drop

There are 2 possible settings for these rules, since users can use usbtables to set the default action when no rule matches. If the default action is DROP, users use usbtables to build a whitelist, permitting certain devices with certain functionalities. This provides the strongest security guarantee, since each USB device needs at least one rule to work. If the default action is ALLOW, users instead build a blacklist, blocking undesired functionalities of certain devices. This is less secure but provides the best usability.

4. What is LUM?

If you look at the usbfilter architecture figure again, you will notice a thing called usbfilter modules or Linux usbfilter modules (LUM). This is another powerful feature of usbfilter. Just like netfilter, usbfilter enables kernel developers to write kernel modules that look into and play with USB packets as they wish, plug them into usbfilter, and enable new rules using these kernel modules. Check out the example LUM in the code repo, which detects the SCSI write command within a USB packet (https://github.com/daveti/usbfilter/blob/master/lum/lum_block_scsi_write.c). With the help of this LUM, one can write a rule to stop data exfiltration from the host machine to a Kingston USB flash drive for user 1001:

usbtables -a nodataexfil2 -o uid=1001
      -v manufacturer=Kingston
      -l name=block_scsi_write -t drop

With the default set to block any SCSI write to any USB storage device, a whitelist can permit a limited number of trusted devices while still preventing data exfiltration when an unknown USB storage device is plugged in.
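The kind of check such a LUM performs can be sketched in userspace (this is not the module’s actual code): in the USB mass-storage Bulk-Only Transport, each command travels in a 31-byte Command Block Wrapper (CBW) that starts with the signature “USBC” and carries the SCSI CDB at offset 15, so a WRITE(10) command can be spotted with a one-byte compare.

```c
#include <stddef.h>
#include <stdint.h>

/* Userspace sketch of a SCSI-write check, per the USB mass-storage
 * Bulk-Only Transport spec: a CBW is 31 bytes, begins with the
 * signature "USBC", and holds the SCSI CDB at offset 15.
 * WRITE(10) is SCSI opcode 0x2A. */
static int is_scsi_write10(const uint8_t *pkt, size_t len)
{
    if (len < 31)
        return 0; /* too short to be a CBW */
    if (pkt[0] != 'U' || pkt[1] != 'S' || pkt[2] != 'B' || pkt[3] != 'C')
        return 0; /* not a CBW */
    return pkt[15] == 0x2A; /* first CDB byte is the opcode */
}
```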

5. Todo…

There is still a long way to go before usbfilter can be officially accepted into the mainline. Applications may hang forever waiting for a response USB packet whose request packet has been filtered by usbfilter, though this could be considered an implementation issue of the applications. Some USB devices can also linger in the kernel even after they have been unplugged, if the USB packet used to release the resources is filtered as well. Even though usbfilter introduces minimal overhead, using BPF may be mandatory for it to be accepted upstream.

6. Like it?

To download the full paper, please go to my publication page. The complete usbfilter implementation, including the usbfilter kernel for Ubuntu 14.04 LTS, the user-space tool usbtables, and the example LUM to block writes to USB storage devices, is available on my github: https://github.com/daveti/usbfilter. For any questions, please go ahead and open an issue in the code repo, and I will try my best to answer in time.

Posted in Dave's Tools, OS, Security

Fedora Upgrade from 21 to 24

After almost 5 hours of upgrading, my server has been successfully upgraded from Fedora 21 to Fedora 24, which uses the latest stable kernel, 4.6. There is an online post demonstrating how to upgrade from Fedora 21 to 23 using fedup. This post talks about upgrading Fedora from 21 to 24 using dnf. NOTE: please do back up your data before you act!

0. yum update

This is usually not a problem for Fedora 21, whose support expired a long time ago. Anyway, run it just in case.

1. dnf

According to the official Fedora wiki (https://fedoraproject.org/wiki/DNF_system_upgrade), dnf is recommended for system upgrades. Apparently, fedup has been ditched. All we need here are 3 dnf commands:

sudo dnf upgrade --refresh
sudo dnf install dnf-plugin-system-upgrade
sudo dnf system-upgrade download --refresh --releasever=24

The last dnf command will list any errors that block the upgrade. The errors I encountered were obsolete packages no longer supported in the Fedora 24 repo. As you can tell, the only way to move the upgrade forward is to remove all these obsolete packages, using “yum remove” plus the unsupported package names reported by dnf.

Once all the errors are cleared, dnf is able to download all the required packages for Fedora 24. On my server, that was about 4GB, so you need at least a few GB free to hold all these new packages. More importantly, dnf requires another 5GB under root during package installation. Make sure you keep dnf happy.

2. Keys

Before dnf was able to install all new downloaded packages, I got such an error:

Couldn’t open file /etc/pki/rpm-gpg/RPM-GPG-KEY-fedora-24-x86_64

There is a bug report discussing the possible causes of this issue and the corresponding fixes (https://bugzilla.redhat.com/show_bug.cgi?id=1044086#c11). However, if you find that manually importing the key does not work, go and take a look at the /etc/pki/rpm-gpg directory. What happened on my server was that there were simply no key files for Fedora 24 at all. Oops. The fix is also easy – create the key files ourselves. Go to https://getfedora.org/keys/ and find the key files (primary/secondary). Create these key files and symlink the x86_64 one (the arch of my server) to the primary. That’s it.

3. dnf again

Reboot the machine to start the upgrade:

sudo dnf system-upgrade reboot

Hint: yum is now deprecated. Run “dnf update” once you are into the new system.

Posted in Linux Dist

Malware Reverse Engineering – Part II

While most tools for MRE are straightforward, some of them require time, patience, and skill to show their full power. For static analysis, this means IDA; for dynamic analysis, it is OllyDbg (and WinDbg for Windows kernel debugging). In this post, we will work heavily with disassembly in both tools. Remember – the key point of MRE is not to fully understand every line of disassembly, but rather to reconstruct the big picture of the malware in a high-level programming language, e.g., C/C++. If you have a Hex-Rays decompiler already, use it to make your life easier. Otherwise, read this post.

0. Report header

Apr 11, 2016. GNV, FL.

1. Download the malware – play at your own risk!

Git clone my repo (https://github.com/daveti/mre) and copy malware_g.7z into the Windows VM. NOTE: there is no password protection for this malware.

2. Summary

This malware G and the accompanying jellydll.dll are a proof-of-concept GPU-based rootkit called JellyCuda (https://github.com/x0r1/WIN_JELLY). It leverages the Nvidia GPU non-volatile memory to hide the malicious jellydll.dll and make it persistent without being detected by scanning the hard disk of the host machine. When the host is infected by JellyCuda for the first time, it loads jellydll.dll into GPU memory, creates a file called jellyboot.vbs in the startup folder, writes itself into the pre-formatted VBScript to make sure the malware runs every time the machine boots, and finally removes jellydll.dll. After the machine reboots, the malware looks for jellydll.dll. If the DLL file is still available, the malware repeats the previous procedure to hide the malicious DLL in GPU memory. Otherwise, the malware reads GPU memory, finds the memory block containing the jellydll.dll contents, reconstructs the DLL in memory, replaces the current process memory with the contents of the DLL, and finally calls the DllMain() entry function of jellydll.dll, which simply prints a warning about the existence of the GPU RAT.

Since this is a proof-of-concept malware, specific signatures or remediations for this malware may not be interesting or useful. However, JellyCuda does give us some hints to think about GPU-based rootkit in general:

  1. Calls to CUDA/OpenCL – normal applications usually do not deal with GPU directly.
  2. cuMemAlloc, cuMemcpyHtoD, cuMemcpyDtoH (or the OpenCL equivalents) – this means there is memory block transmission between the main RAM and the GPU memory.
  3. New file created – the registry and/or the startup folder or the prefetch folder may be changed to include the malware itself, making sure it persists across reboots.

To remove JellyCuda from the system, one needs to first clean the residue in GPU memory, then locate the malware itself via the modified registry/startup/prefetch, and remove it. The good news is that my Avast was able to recognize JellyCuda as malware when I tried to copy it into the VM for analysis on my Mac.

NOTE: this report focuses on the IDA and OllyDbg analysis rather than the other, straightforward tools. The IDA analysis shows the complete picture of the malware, and OllyDbg digs into the malicious payload (jellydll.dll), which could not be analyzed by IDA.

3. Static Analysis

  • Is it packed?

No. PEiD shows a packer named “Pelles C” for this malware, but that is the compiler that built the binary, not a packer.

image46

And nothing was found for the accompanying DLL:

image37

  • Compilation date?

Malware_g.exe: 2015/05/09.

image14

Jellydll.dll: 2015/05/09

image17

  • GUI or CLI?

Malware_g.exe: PEiD thinks it is a Win32 GUI and PEview thinks the same way.

image26

jellydll.dll: PEiD reports it as Win32 GUI and PEview agrees.

image63

  • Imports?

malware_g.exe:

image18

Kernel32.dll:

File manipulation:

CreateFile, WriteFile, CloseHandle, GetFileSize, ReadFile, DeleteFile, GetFileAttributes, GetFileType, GetStdHandle, DuplicateHandle, SetHandleCount,

Memory manipulation:

VirtualAlloc, GlobalAlloc, HeapAlloc, GlobalFree, HeapCreate, HeapDestroy, HeapReAlloc, HeapFree, HeapSize, HeapValidate, VirtualQuery

Process manipulation:

GetProcAddress, GetModuleHandle, GetProcessHeap, GetModuleFileName, GetCurrentProcess, ExitProcess,

Library manipulation:

LoadLibrary, FreeLibrary,

Others:

strlen, strcat, GetLastError, GetStartupInfo, RtlUnwind, GetSystemTimeAsFileTime, GetCommandLine, GetEnvironmentStrings, FreeEnvironmentStrings, UnhandledExceptionFilter, WideCharToMultiByte, SetConsoleCtrlHandler

User32.dll:

MessageBox, wsprintf, ExitWindowsEx

Advapi32.dll:

OpenProcessToken, LookupPrivilegeValue, AdjustTokenPrivileges

Shell32.dll:

SHGetKnownFolderPath

 

jellydll.dll:

image61

User32.dll:

MessageBox

Kernel32.dll:

File manipulation:

GetFileType, GetStdHandle, DuplicateHandle, SetHandleCount,

Memory manipulation:

VirtualAlloc, VirtualFree, HeapCreate, HeapDestroy, HeapReAlloc, HeapFree, HeapSize, HeapValidate, VirtualQuery

Process manipulation:

GetCurrentProcess, ExitProcess,

Others:

GetStartupInfo, GetSystemTimeAsFileTime, GetCommandLine, GetModuleFileName, GetEnvironmentStrings, FreeEnvironmentStrings

  • Strings?

malware_g.exe:

IP: N/A
URL: N/A
Process: svchost
File:
Jellyboot.vbs, malware_g.exe

Files generated by the compiler:

image35

image29

Commands/Scripts:

image62

Error handling:

image30

CUDA:

image33

image04

Interesting:

NtFlushInstructionCache,

image16

Functions:

image32

Jellydll.dll:

Interesting:

image65

image34

  • Sections and contents?

malware_g.exe: there are 3 sections in total

.text: it looks like there is code in it.

.rdata: Warning strings, windows commands, CUDA functions, and interesting stuffs

image50

.data: IAT, and a bunch of debug sections, including COFF

image43

image10

Jellydll.dll: there are 4 sections.

.text: normal code

.rdata: malware writer’s kind reminder

image08

.data: IAT

image11

.reloc: relocation table

  • Resources?

ResourceHacker found nothing for either the malware_g.exe or jellydll.dll.

  • IDA Pro

Malware_g.exe:

The first entry function of malware_g.exe is WinMainCRTStartup(), which is generated by the Pelles C compiler for Windows.

image52

It sets up an exception handler, which calls RtlUnwind(), usually generated by the compiler for try/except. It then allocates space on the heap using HeapCreate(), called by __bheapinit(). If that fails, the program exits; otherwise, system setup continues.

image72

image66

If everything is still good, we reach the second entry function WinMain(), which is the real function implemented by the malware.

image31

The first thing WinMain() tries to do is to call LoadCuda().

image36

If the loading fails, the malware exits. Otherwise, it continues with calls to dword_40595C, dword_405958, dword_405954, and jc. Since all these are indirect calls, we need to figure out what these memory addresses are by looking into LoadCuda().

image09

As its name implies, LoadCuda() starts with loading nvcuda.dll using LoadLibrary(), and exits if the loading fails.

image60

When nvcuda.dll is successfully loaded, the memory address of jc is loaded into %eax and then into the local variable lpAddress. Looking at that memory address, we realize the connection among all those memory addresses: jc is the start of a struct at address 0x405950, and dword_405954, dword_405958, dword_40595C, …, dword_40596C are the following members of the struct. Since all members are dwords (4 bytes) invoked by the call instruction, this jc struct contains a bunch of function pointers.

image41

image07

Once jc is loaded into lpAddress, a loop starts on szFuncNames array. For each name in szFuncNames, GetProcAddress() is called with the library handle returned by LoadLibrary() and the name. The return value is assigned to the current value of lpAddress.

image59

Looking into the szFuncNames, we see the CUDA functions we have seen in the strings.

image24

Once LoadCuda() is done, struct jc is initialized with all these CUDA functions in order. So back in WinMain(): after LoadCuda() successfully returns, cuInit(), cuDeviceGetCount(), cuDeviceGet(), and cuCtxCreate_v2() are called one by one. Any call failure frees the loaded CUDA library and exits the malware. When CUDA is successfully initialized, GetFileAttributes() is called with jellydll.dll and the return value is checked against 0xffffffff (-1), which is INVALID_FILE_ATTRIBUTES. GetLastError() is then called and its return value is checked against 2, which is ERROR_FILE_NOT_FOUND. When both errors happen, SearchJellyDustOnGPU() is called; otherwise, SprayJellyDustToGPU() is called. Then FreeLibrary() is called and WinMain() returns.

image47
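The LoadCuda() pattern is easy to reproduce in miniature. Below is a sketch – with a toy resolver standing in for LoadLibrary()/GetProcAddress(), and made-up function bodies – of how one loop can fill a struct of function pointers from a name array, which is exactly the jc/szFuncNames trick:

```c
#include <stddef.h>
#include <string.h>

typedef int (*fnptr)(void);

/* Toy stand-ins for the resolved CUDA entry points. */
static int fake_cuInit(void)      { return 1; }
static int fake_cuDeviceGet(void) { return 2; }

/* Toy resolver standing in for GetProcAddress(). */
static fnptr resolve(const char *name)
{
    if (strcmp(name, "cuInit") == 0)      return fake_cuInit;
    if (strcmp(name, "cuDeviceGet") == 0) return fake_cuDeviceGet;
    return NULL;
}

/* Analogue of the jc struct: a block of function-pointer slots. */
struct jc_like {
    fnptr init;        /* filled from "cuInit" */
    fnptr device_get;  /* filled from "cuDeviceGet" */
};

static int load_table(struct jc_like *jc)
{
    static const char *names[] = { "cuInit", "cuDeviceGet" };
    fnptr *slot = (fnptr *)jc; /* walk the struct as an array of slots */
    for (size_t i = 0; i < sizeof(names) / sizeof(names[0]); i++) {
        slot[i] = resolve(names[i]);
        if (!slot[i])
            return 0; /* a missing symbol aborts the load */
    }
    return 1;
}
```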

SearchJellyDustOnGPU() calls AllocateGPUMemory() at first, which calls dword_405960, which is essentially the 5th member of struct jc – cuMemAlloc_v2().

image67

If AllocateGPUMemory() fails, SearchJellyDustOnGPU() exits. Otherwise, it continues by calling GlobalAlloc(), dword_405964 (cuMemcpyDtoH_v2()), and dword_40596C (cuMemFree_v2()), which copy the GPU memory into host memory. Note that the copied memory size is expected to be >= 0x1000C (65548) bytes.

image48

The copied memory is then examined against a number 0x5DAB355 in a loop.

image58

image51

If the memory block starts with the magic number, some checks pass, and GetDustCheckSum() passes as well, we hit the core of SearchJellyDustOnGPU() – GetProcessHeap(), HeapAlloc(), and ExecuteJellyDust(). Note that the ‘rep movsb’ copies the memory block we found, at offset 0xC, into a local variable lpvDust, which is then passed into ExecuteJellyDust().

image13
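The scan itself amounts to walking the dumped GPU memory looking for the magic dword. A simplified userspace sketch (the 0xC payload offset is from the disassembly; what exactly sits between the magic and the payload – presumably checksum and size – is an assumption):

```c
#include <stddef.h>
#include <stdint.h>
#include <string.h>

#define JELLY_MAGIC 0x5DAB355u /* the marker seen in the disassembly */

/* Walk the dumped memory in dword steps looking for the magic value;
 * the hidden DLL image would begin 0xC bytes past it. */
static const uint8_t *find_jelly_dust(const uint8_t *mem, size_t len)
{
    for (size_t off = 0; off + 0xC <= len; off += 4) {
        uint32_t v;
        memcpy(&v, mem + off, sizeof(v)); /* avoid unaligned reads */
        if (v == JELLY_MAGIC)
            return mem + off + 0xC; /* start of the payload */
    }
    return NULL;
}
```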

The ExecuteJellyDust() function calls VirtualAlloc(), LoadLibrary(), and GetProcAddress() in a big loop. Based on the naming of the local variables involved – pImport and pRelocBase – one can guess that this loop is used to reconstruct a library from the memory block. ExecuteJellyDust() then loads ntdll.dll and calls NtFlushInstructionCache() with parameters (-1, 0, 0), which is undocumented and clears the old code from the instruction cache. Finally, an indirect call to %eax is made with parameters (lpvTarget, 1, 0). Note that %eax is derived from pNt with offset 0x28, which is the offset of the entry point (AddressOfEntryPoint) relative to the PE signature. So we know that this final call invokes the entry function of the library reconstructed on the fly. Now the question is: what is that library?

image71
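The 0x28 offset can be double-checked against the PE layout: 4 bytes of “PE\0\0” signature plus the 20-byte file header put AddressOfEntryPoint 16 bytes into the 32-bit optional header, i.e., 0x28 past the signature. A sketch of the lookup, assuming a well-formed 32-bit image on a little-endian host:

```c
#include <stdint.h>
#include <string.h>

/* e_lfanew at offset 0x3C of the DOS header points at the "PE\0\0"
 * signature; AddressOfEntryPoint sits 0x28 bytes past that signature
 * (4 signature + 20 file header + 16 into the optional header). */
static uint32_t pe32_entry_rva(const uint8_t *image)
{
    uint32_t e_lfanew, entry;
    memcpy(&e_lfanew, image + 0x3C, sizeof(e_lfanew));
    memcpy(&entry, image + e_lfanew + 0x28, sizeof(entry));
    return entry;
}
```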

The last function we haven’t looked at is SprayJellyDustToGPU(), which is called when the malware is able to find jellydll.dll. The only parameter of this function is “jellydll.dll”. First, it calls CreateFile() to open jellydll.dll, then GetFileSize(). GetProcessHeap() and HeapAlloc() are called to allocate enough memory for jellydll.dll, which is then read into memory via ReadFile(). AllocateGPUMemory() is called next, followed by GetDustCheckSum() and GlobalAlloc(). Note that the magic number 0x5DAB355 is prepended to the memory block of jellydll.dll.

image06

The JellyDust (magic number + tweak(jellydll.dll)) is then copied into the memory allocated by GlobalAlloc(), and later copied into GPU memory via dword_405968 (cuMemcpyHtoD_v2()).

image03

At last, the file jellydll.dll is closed and deleted via CloseHandle() and DeleteFile(), before Reboot() is called – the last piece of the malware_g.exe puzzle. This function calls SHGetKnownFolderPath() to open _FOLDERDIR_Startup, which is %APPDATA%\Microsoft\Windows\Start Menu\Programs\StartUp.

image01

The startup path is then converted from wide chars into multibyte using wcstombs(), appended with byte 0x5C (‘\’), and null terminated.

image44

Then the file jellyboot.vbs is created under that startup directory.

image15

After the new file is created, GetModuleFileName() is called to get the file path of malware_g.exe itself. The jellyboot.vbs file is then written via WriteFile() with command lines formatted by wsprintf() using the file path of the malware itself, and finally closed via CloseHandle(). The command lines use VBScript to create a COM object that runs the malware itself and then removes itself.

image38

The last thing Reboot() does is call GetCurrentProcess(), OpenProcessToken(), LookupPrivilegeValue(), and AdjustTokenPrivileges() to gain the privilege to reboot the machine using ExitWindowsEx().

image23

Jellydll.dll:

Now we know that jellydll.dll is the RAT, and that its DllMain() entry function is executed by malware_g.exe. However, IDA screws up the analysis of this library. The DLL entry function tries to call sub_10001030, which is an address in the .rdata section.

image49

image28

4. Dynamic Analysis

Malware_g.exe

We are not able to run malware_g.exe, not only because of the CUDA requirement, but also because the procedure below could not be located. Why? This function is only available on Windows Vista and above.

image40

Jellydll.dll:

To see what the heck jellydll.dll is doing in its DllMain() entry function, we load jellydll.dll into OllyDbg, which asks if we want to load LoadDLL.exe to run the library. After clicking yes, we finally see the RAT.

image02

Then we break at the new module loading time and find the exact DllMain entry function, which is at 0x7C901187.

image68

Then we break at the DllMain() function to examine the stack. %esp is 0x0006F8AC, and %ebp is 0x0006F8C4. The first parameter of the function is at the top of the stack, which is address 0x0006F8AC. The second parameter is address 0x0006F8B0. The third parameter is address 0x0006F8B4. The function call is ss:[ebp+8], which is address 0x0006F8CC.

image20

Moving on to look back at the stack, we have:

First parameter (hinstDLL) – 0x0006F8AC: 0x10000000 – this should be the base address (module handle) of the loaded DLL itself.

Second parameter (fdwReason) – 0x0006F8B0: 0x00000001 – that is the REASON code DLL_PROCESS_ATTACH.

Third parameter (lpvReserved) – 0x0006F8B4: 0x00000000 – NULL for dynamic loads.

Function call – 0x0006F8CC: 0x10001140 – that is the correct address of DllEntryPoint() shown in IDA.

There we go, let us step into the DllMain(). The real function call in the DLL entry is at address 0x1000117E, with an instruction “call 10001000”. So break at this line again and examine the stack.

image19

Now an interesting thing happens. When we try to set a breakpoint at the address, OllyDbg tells us that we are looking at code in the data section rather than the code section, which may explain why IDA got confused. Anyway, set the breakpoint and step in.

image42

We finally see the final function called in the DllMain() of the jellydll.dll. It is a call to MessageBox with the capital string and the RAT string.

image12

5. Indicators of compromise

Since this is a proof-of-concept of GPU-based malware, it is easy to tell the machine is compromised when the warning window shows up. In reality, the indicator could be non-trivial to find, depending on the implementation of the GPU payload (jellydll.dll). If it is a rootkit, it may stay in the machine for a long time without detection, and even AV may not help. If it is a RAT, we may be able to find unfamiliar socket connections to the outside. If it is ransomware, we know when we know.

6. Disinfection and remedies

It is not clear so far what the best solution would be for GPU-based malware (and I am going to dig deeper to see if there is a potential paper here). Since current prototypes of GPU-based malware require a ‘helper’ in the host system to work, Intel does not think it is a threat (http://www.securityweek.com/gpu-malware-not-difficult-detect-intel-security). On the other hand, my Avast on Mac was able to detect JellyCuda when I tried to move it into the VM for analysis. The best I can think of for now is a system tool/mechanism that looks into GPU memory for malware detection, just like AV does on the host machine. We may also reconsider access control for the GPU from a security point of view. Yeah, I am talking about the pitch of a potential paper on defending against GPU malware. Will see how it goes:)

Posted in Security, Static Code Analysis, Uncategorized

Malware Reverse Engineering – Part I

I took a “Malware Reverse Engineering (MRE)” class last semester and it was fun to me, partially because I was not a Windows person, though I am still not. What seems ridiculous to me is how trivially one can write into any process on Windows XP, which was apparently designed for malware! Regardless of all those Windows craps, this post shares a general workflow of malware reverse engineering on the Windows (XP) platform, and the corresponding tools. Note that this report is in no way a good one. Instead, this was my first trial and I intended to put in as much information as possible. If you are interested in MRE or want a job in that area, do buy this book (https://www.nostarch.com/malware) and give it a complete read. Have fun and stick with Linux~

0. Report header

Feb 12, 2016, GNV FL.

1. Download the malware – play at your own risk!

All the malware samples can be found on my github (https://github.com/daveti/mre). All malware binaries are compressed by 7Zip with password “malware” protection. This post is about the first malware/ransomware uploaded. Before you start, make sure you have a Windows VM (KVM/VirtualBox/VMware) ready with networking disconnected from the host machine.

2. Summary

This malware is a kind of ransomware. The IP/domain used for networking is encoded instead of plain text. The encryption routine is also a DIY method that does not call existing crypto libraries. Most imports are ReadFile/CreateFile/DeleteFile. 2 new entries are added into the registry, and one of them is the malware itself. All *.doc, *.txt, *.jpg, etc. under C: are encrypted. A DNS query is also triggered for the domain “time.windows.com”. The file “CryptoLogFile.txt” may be used to detect this malware, since it is created first to log all the files encrypted. “time.windows.com” does not seem a helpful signature, since it is a valid domain.

3. Static analysis

  • Is it packed?

Seems not.

image34

PEiD: Nothing Found (but shows Win32 GUI as its subsystem).

image01

PEview: A lot of imports can be found.

image11

pestudio: same as PEview

If PEiD is able to detect a packer, it provides information about the packer, which can be used to find the unpacker; if PEiD fails, we have to fall back on PEview/pestudio and investigate the imports and section contents manually.

  • Compilation date?

image13

PEview: 2009/10/09.

  • GUI or CLI?

Seems CLI.

PEview: There is no GUI related functions or DLL found in the text section.

image30

Depends: It only depends on user32.dll, kernel32.dll, and shell32.dll.

  • Imports?

PEview: file-related operations (CreateFile, DeleteFile, FindFile, ReadFile, WriteFile), a bunch of ‘get’ functions (GetCommandLine, GetEnvironmentVariable, GetFileSize, GetLogicalDrives, GetWindowsDirectory), and some string operations (lstrcat, lstrcmp, lstrcpy). A wild guess would be that this malware goes into the Windows directory, removes the target files, and also creates some new files.

image05

  • Strings?

strings2:
IP: NA
URL: NA
Process: NA
File: user32.dll, kernel32.dll, shell32.dll, CryptLogFile.txt, wallpaper.bmp, .txt, .doc, .xls, .db, .mp3, .waw, .jpg, .rtf, .pdf, .zip,

image17

pestudio:

image29

  • Sections and contents?

PEview: text seems OK; rdata contains imports address table, directory table, name table, as shown in (d); data contains 2 interesting file names (CryptLogFile.txt, wallpaper.bmp); rsrc contains some icons, which seem fine.

image20

ResourceHacker: rsrc section looks no code embedded.

image28

IDA Pro

  • The first file created

There are 3 subroutines and main calling CreateFile:

image37

image14

The main function prepares the file name to be “c:\windows\CryptoLogFile.txt”,

image33

and then saves it at byte_403F28 after preprocessing – removing the char 22h,

image25

and then calls sub_4015B5, which calls CreateFile for the first time, creating the file using the filename at byte_403F28 – the CryptoLogFile.txt.

image00

  • dword 0xCA6B93C9

The DIY encryption routine uses this table to look up a value, which is then XORed with the original value to achieve the encryption. If the encryption were public-key based, this secret data buffer could hold key material, e.g., a public key embedded to encrypt, while the private key stays with the attacker asking for ransom.
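In C, the lodsb/xor/stosb loop seen later in the disassembly boils down to a table-driven XOR; a sketch (the table contents and the exact indexing are made up for illustration):

```c
#include <stddef.h>
#include <stdint.h>
#include <string.h>

/* Table-driven XOR "encryption": each plaintext byte is XORed with a
 * byte from a secret lookup table, cycling through the table. */
static void xor_with_table(uint8_t *buf, size_t len,
                           const uint8_t *table, size_t table_len)
{
    for (size_t i = 0; i < len; i++)
        buf[i] ^= table[i % table_len];
}
```

Since XOR is its own inverse, running the same routine over the ciphertext with the recovered table yields the plaintext back – one reason DIY schemes like this are often breakable.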

  • sub_401000

This routine starts with FindFirstFileA to find a specific file, returns if the search fails (locret_401261), and keeps looping till all the target files have been gone through.

image27
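Given the extension strings found earlier, the file targeting inside this loop is presumably a suffix match; a sketch (extension list abridged from the strings output, and the exact matching rules are a guess):

```c
#include <stddef.h>
#include <string.h>

/* Does this filename end with one of the targeted extensions?
 * Case-sensitive suffix match, for illustration only. */
static int has_target_ext(const char *name)
{
    static const char *exts[] = { ".txt", ".doc", ".xls", ".db",
                                  ".mp3", ".jpg", ".rtf", ".pdf", ".zip" };
    size_t n = strlen(name);
    for (size_t i = 0; i < sizeof(exts) / sizeof(exts[0]); i++) {
        size_t e = strlen(exts[i]);
        if (n >= e && strcmp(name + n - e, exts[i]) == 0)
            return 1;
    }
    return 0;
}
```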

  • sub_40140D

I would try to name it as – read the file, encrypt it into a new file, and remove the original file.

image02

It also calls the shell del command to remove the file:

image24

  • sub_401263

I would rename it as – DIY encryption routine, especially after I saw the operations like:

lodsb					; load the next plaintext byte into al
xor eax, the_secret_look_up_table[edx]	; XOR with a secret table entry
stosb					; store the encrypted byte

image19

4. Dynamic analysis

Preparation:

REMnux: start inetsim

image26

Windows:
start apateDNS

image03

start Process Explorer

image38

start Procmon (then pause and clear)

image22

start RegShot (the 1st shot)

image32

Unpause Procmon; execute the malware; pause Procmon (it seems to hang every time…)

image06

Take the 2nd RegShot

  • Interesting behaviors that occur after the malware has executed.

image31

  • Machines and services the malware attempts to contact by IP or domain or host name.

image08

  • Registry keys created/modified by the malware

image04

  • Files created/modified by the malware

image07

There are also files encrypted outside the Windows directory, e.g., the Dynamic Analysis directory on the desktop. Since I was scanning only the dir under c:\windows, these files are not shown in RegShot. However, CryptoLogFile lists all the encrypted files (how nice is that).

  • Processes started by the malware

Notepad, and maybe others (Procmon got stuck when pausing…)

image38

5. Indicators of compromise

A lot of files have been encrypted, as listed in CryptoLogFile.txt. For example, one of the README.txt files looks like the one below. And, for sure, there comes the “new” wallpaper with an introduction to the ransomware and ways to pay the ransom.

image35

image21

6. Disinfection and remedies

To make sure this ransomware will not start again, we need to clean up the registry. If there is a data backup (there should be), or a system snapshot, do a recovery – yeah, problem solved. If there is no data backup, and I am able to reverse the encryption routine (DIY crypto could be vulnerable compared to other common crypto methods and implementations), then it is time to learn math and assembly. Otherwise, which may be the most common way, pay the ransom.

Posted in Security, Static Code Analysis, Uncategorized | Tagged , , , , , , , , , | Leave a comment

gcc, llvm, and Linux kernel

This post talks about what happened recently in the Linux kernel mailing list discussions. While this post does not dig into compiler internals or the whole picture between the Linux kernel and compilers, we discuss 2 specific issues, one from gcc and one from llvm. The gcc issue may be a quirk, but the llvm issue is definitely a bug. Keep reading…

1. leal %P1(%%esp),%0

The title is the inline assembly used at arch/x86/boot/main.c line 121. The weird thing is the ‘P’ in ‘%P1’, which is uncommon compared to the ‘%1’ we are used to seeing in gcc inline assembly. So what the heck is it[1]? Let us put this kernel inline into a main function where we can play with gcc easily:

#include <stdio.h>
#define STACK_SIZE	512
static int stack_end;

int main()
{
	asm("leal %P1(%%esp),%0"
		: "=r" (stack_end)
		: "i" (-STACK_SIZE));

	return 0;
}

Then we assemble the code (gcc -S) and look at the assembly, where we can see the inline is interpreted as follows:

leal -512(%esp),%eax

This is exactly the thing we want for ‘leal’. In a word, gcc does not complain anything about this ‘P’. What if we remove the ‘P’ and look at the assembly again? After a quick trial, here is the inline generated by gcc:

leal $-512(%esp),%eax

Oops, gcc recognizes that ‘%1’ is an immediate value and appends ‘$’ (AT&T style) automatically. This may be right in most cases but is definitely wrong for ‘lea’. As a matter of fact, if I try to compile the code directly, gcc will not let me. Now it is clear that the tricky ‘P’ in ‘%P1’ is used to make gcc happy. Note that I am using gcc 4.9.2. The latest gcc (5/6?) seems to have fixed this quirk already – generating the same, correct assembly with or without the mysterious ‘P’. Go try it yourself.

2. pushf/popf

The original issue was reported from usbhid testing using an llvm-compiled kernel[2]. With kernel developers’ further debugging, the root cause of the bug became clear, pointing to llvm rather than the kernel code itself[3]. Let us go through the example described in the llvm mailing list. Here is the source file:

#include <stdlib.h>
#include <stdbool.h>

/* Assume foo changes the IF in EFLAGS */
void foo(void);
int a;

int bar(void)
{
	foo();
	bool const zero = a -= 1;
	asm volatile ("" : : : "cc");
	foo();
	if (zero) {
		return EXIT_FAILURE;
	}
	foo();
	return EXIT_SUCCESS;
}

The point is that foo() may (or may not) change the IF in EFLAGS. Compile it to generate the object file (clang -O2 -c -o ) and disassemble it as shown below (objdump -S):

[daveti@daveti c]$ objdump -S llvm_if_issue.o

llvm_if_issue.o:     file format elf64-x86-64

Disassembly of section .text:

0000000000000000 <bar>:
   0:	53                   	push   %rbx
   1:	e8 00 00 00 00       	callq  6 <bar+0x6>
   6:	ff 0d 00 00 00 00    	decl   0x0(%rip)        # c <bar+0xc>
   c:	9c                   	pushfq
   d:	5b                   	pop    %rbx
   e:	e8 00 00 00 00       	callq  13 <bar+0x13>
  13:	b8 01 00 00 00       	mov    $0x1,%eax
  18:	53                   	push   %rbx
  19:	9d                   	popfq
  1a:	75 07                	jne    23 <bar+0x23>
  1c:	e8 00 00 00 00       	callq  21 <bar+0x21>
  21:	31 c0                	xor    %eax,%eax
  23:	5b                   	pop    %rbx
  24:	c3                   	retq

Let us focus on the interesting part:

   c:	9c                   	pushfq
   d:	5b                   	pop    %rbx
   e:	e8 00 00 00 00       	callq  13 <bar+0x13>
  13:	b8 01 00 00 00       	mov    $0x1,%eax
  18:	53                   	push   %rbx
  19:	9d                   	popfq

As you can see, before bar() calls foo(), it saves EFLAGS on the stack using ‘pushf’. After foo() is done, it restores EFLAGS from the stack using ‘popf’. Remember our assumption – foo() may change the IF in EFLAGS! Now we can explain the bug found in usbhid. The foo() is spin_lock_irq(), and the bar() is usbhid_close(). While spin_lock_irq() makes sure interrupts are disabled, usbhid_close() restores the old value of EFLAGS, ignoring what happened in spin_lock_irq().
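
To see why this miscompilation matters, here is a toy C model of the problem. This is illustrative only – a global bool stands in for the IF bit of EFLAGS, and all names (spin_lock_irq_model, buggy_caller) are made up. The save/restore around the call silently undoes the interrupt-disable performed inside it:

```c
#include <stdbool.h>

/* The IF bit of EFLAGS, modeled as a global flag */
static bool irq_enabled = true;

/* Stand-in for spin_lock_irq(): disables interrupts ("cli") */
static void spin_lock_irq_model(void)
{
    irq_enabled = false;
}

/* What the llvm-generated code effectively does in bar():
 * save the flags, call foo(), then restore the saved flags. */
bool buggy_caller(void)
{
    bool saved_flags = irq_enabled;   /* pushfq; pop %rbx          */
    spin_lock_irq_model();            /* foo() disables interrupts */
    irq_enabled = saved_flags;        /* push %rbx; popfq -- stale */
    return irq_enabled;               /* interrupts back on: bug!  */
}
```

After buggy_caller() returns, interrupts are enabled again even though the lock routine disabled them – exactly the usbhid_close() situation.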

3. Summary

The gcc quirk may reflect a hackish fix of gcc in the early days to satisfy the kernel compilation requirement. After all, gcc is the only compiler that can build the Linux kernel without any kernel patches. As such, the Linux kernel is probably the only project leveraging gcc features that other projects would never bother with. On the other hand, llvm is catching up. There are kernel patches already to make llvm compile the kernel, and people are testing llvm kernel images. Nevertheless, the EFLAGS clobbering issue in llvm optimization may be a showstopper. Most user-space applications do not care about interrupts; however, they are a core requirement for the kernel to work as expected. As Linus pointed out – “Using pushf/popf in generated code is completely insane (unless done very localized in a controlled area).”

4. Reference

[1]http://comments.gmane.org/gmane.linux.kernel.kernelnewbies/52259
[2]https://lkml.org/lkml/2016/3/1/160
[3]http://lists.llvm.org/pipermail/llvm-dev/2015-July/088780.html

Posted in OS, Stuff about Compiler | Tagged , , , , | Leave a comment

Defending Against Malicious USB Firmware with GoodUSB

Finally, 4 months after our paper was accepted by ACSAC’15, I can now write a blog post talking about our work – GoodUSB – and release the code, due to some software patent bul*sh*t. (I sincerely think software patents should be abolished!) Anyway, this post is all about malicious USB firmware, BadUSB attacks, and our defense solution in the Linux kernel – GoodUSB. Go ahead, download GoodUSB and play with it. Any questions, shoot me an email.

0. To memorialize the old paper title given by Dr. Bates

GoodUSB: How I Learned to Stop Worrying and Love the Rubber Ducky

1. A quote from a chat in Skype[1]

“I read an article about how a dude in the subway fished out a USB flash drive from the outer pocket of some guy’s bag. The USB drive had “128” written on it. He came home, inserted it into his laptop and burnt half of it down. He wrote “129” on the USB drive and now has it in the outer pocket of his bag…”

2. BadUSB attacks

If you read the reference link, you will find that the “USB flash drive” (or USB Killer) has some embedded capacitors supporting a high negative voltage (-110V). Once charged, these capacitors are able to cause overcurrent in the USB signal lines. While we are not going to talk about this in detail (probably in another post), it does leave us a question: “Arriving at work, you find a USB drive on your table. What would you do?”[1]

In reality, data exfiltration or backdoor injection is preferred to burning down the machine. This is what the USB Rubber Ducky[2] was designed for. As a penetration testing tool, the USB Rubber Ducky looks like a USB thumb drive, but quacks like a keyboard, and types like a keyboard. Therefore, it is a keyboard. The only difference between a normal keyboard and a USB Rubber Ducky, besides the appearance, is that the Rubber Ducky does not need a human being to type the keystrokes – an adversary can write a malicious script, compile and load it into the ducky, and the ducky will execute it once plugged in. How cool is that! A more powerful programmable USB device is Teensy[3]. Teensy 3.1 development has been integrated into the Arduino IDE. As with the USB Killer, the best possible defense would be to open the case and look at the PCB carefully (as long as you know what you are looking at…).

Unfortunately, the BadUSB[4] attacks at BlackHat 2014 rendered our best effort so far in vain. Rather than requiring a specific USB microcontroller, people can write malicious firmware by themselves for common USB microcontrollers, thanks to the existing firmware building tools[5]. This means that a USB flash drive can behave like a storage device and a keyboard at the same time. While the storage part provides the normal usage, the keyboard part is essentially a USB Rubber Ducky. The problem now is that we will not know whether the firmware is malicious until the device is plugged in. And most of the time, we do not even know that a keyboard has been enabled, since it happens within the OS. Now let us repeat the question again: “Arriving at work, you find a USB drive on your table. What would you do?”

3. Root Cause Analysis

The root of BadUSB attacks originates from the USB spec. A USB device is allowed to have multiple functionalities (interfaces). Think about a USB headset, which contains audio functionalities (speaker + microphone) and an input/keyboard functionality (volume control). Therefore, from the spec’s point of view, there is no violation for a USB storage device to have a keyboard functionality (and for some storage devices, this extra keyboard functionality may even be needed, as we will see later). In reality, when a USB device is plugged into the host machine, it can report any functionalities (interfaces) that need the OS’s support. The OS then tries its best to find the corresponding driver to serve each functionality. Think about a BadUSB thumb drive. When it is plugged into the host machine, it reports both a storage and an input (keyboard) interface during the USB enumeration (the procedure for the host machine to recognize the device). The OS then loads a storage driver and an input driver to make the device function. Once the input driver is loaded, the BadUSB device types a malicious script (like a human being), which is executed by the OS automatically. All of this happens in the OS within a second, while the user is going through the files saved in the storage.
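
Concretely, each functionality is advertised during enumeration through a standard 9-byte interface descriptor (USB 2.0 spec, Section 9.6.5). The struct below sketches its layout; a BadUSB thumb drive simply reports two of these – one with class 0x08 (mass storage) and one with class 0x03 (HID, i.e., keyboard) – and the host loads a driver for each.

```c
#include <stdint.h>

/* Standard USB interface descriptor layout (USB 2.0, Section 9.6.5).
 * A device reports one of these per functionality during enumeration;
 * the host picks a driver based on bInterfaceClass. */
struct usb_interface_descriptor {
    uint8_t bLength;            /* descriptor size: 9 bytes          */
    uint8_t bDescriptorType;    /* 0x04 = INTERFACE                  */
    uint8_t bInterfaceNumber;
    uint8_t bAlternateSetting;
    uint8_t bNumEndpoints;
    uint8_t bInterfaceClass;    /* 0x08 = mass storage, 0x03 = HID   */
    uint8_t bInterfaceSubClass;
    uint8_t bInterfaceProtocol;
    uint8_t iInterface;         /* index of the string descriptor    */
} __attribute__((packed));
```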

4. GoodUSB

The OS knows nothing about the USB device but is able to load different drivers to make the device happy (work); the user knows something about the device, e.g., from its appearance, but is not able to interpose between the OS and the device. To bridge this semantic gap, ideally, we need a way to let the user and the OS talk:

User: I have just plugged in a USB flash drive.
OS: OK. I will not allow it to have a keyboard functionality then.

Essentially, this is GoodUSB.

As an end-to-end, systematic solution defending against malicious USB firmware, GoodUSB includes not only a customized Linux kernel but also a user-space daemon supporting a GUI, and a honeypot KVM (HoneyUSB) for redirecting suspicious devices at run time or start time. While I am not going to list technical details here, I put the GoodUSB architecture figure here for a flavor and redirect further interest to our paper.

goodusb_arch

When the USB device is plugged into the host machine for the first time, the device class identifier in the kernel tries to fingerprint the firmware to get a signature (SHA1). The kernel then suspends further actions, sending the information about the device to the user-space daemon before enabling the device. The GoodUSB user-space daemon (gud) pops up a GUI asking for the user’s expectation of this device, as shown in the figure below:

goodusb_stupid_user_mode

Note that the choices in the GUI are high-level descriptions of the device without any low-level USB spec terms. One beauty of GoodUSB is that once the user gives a general description of the device, the policy engine within gud is able to find the right functionalities (interfaces) required to enable the device with the least “permission” (if we treat drivers as permissions). For instance, if the user chooses “USB Storage”, no keyboard (input) functionality will be enabled, for sure. After this, another GUI pops up letting the user bind this device to a security picture (just like the security picture used when logging into an online banking system). Now gud has all the information it needs. Besides updating the local device database, it relays all the information to the kernel, which can further configure the device as needed and as expected by the user. When the device is plugged in for the 2nd time, the kernel is able to recognize it and asks for confirmation from the user via gud:

sec_pic_user_mode

However, if the device is shown with a green dinosaur but the user knows it should be a red one, then the user is aware that the firmware of the device has been changed (to mimic the device bound to the green dinosaur). In that case, after “This is NOT my device!” is clicked, the device will be redirected into HoneyUSB, where we have implemented a USB profiler (usbpro) to inspect the behaviors of the device. Even though GoodUSB was designed against BadUSB attacks, its ability to customize the functionalities enabled for a USB device is also invaluable in daily use. E.g., GoodUSB is able to shut down the microphone in a USB headset while leaving the speaker working as usual.

5. Limitations

As with other 0-day stuff, GoodUSB is not able to defend against 0-day malicious firmware. If a device is expected to be a keyboard and its firmware types scripts automatically, there is nothing GoodUSB can do. As readers may have realized, GoodUSB relies on trusting the drivers. If the driver is malicious, GoodUSB does not work. Another thing I have to mention here is USB quirks. Although we have tried to cover as many devices as possible, there are always USB quirks which will not function properly with GoodUSB. One example is Yubikey, which looks like a thumb drive, has a USB hub functionality, and behaves like a keyboard. The last limitation comes from us – human beings. GoodUSB uses a GUI and security pictures with the hope of helping users make a better judgement. Again, this is our hope. We have not done and will not do any user study to show the validity of using a GUI and security pictures. Usability is beyond the scope of the paper.

6. RtDC
paper: https://github.com/daveti/daveti/raw/master/paper/acsac15/acsac2015djt.pdf
code: https://github.com/daveti/GoodUSB

References:
[1]http://kukuruku.co/hub/diy/usb-killer
[2]http://hakshop.myshopify.com/products/usb-rubber-ducky-deluxe?variant=353378649
[3]https://www.pjrc.com/teensy/teensy31.html
[4]https://srlabs.de/badusb/
[5]https://github.com/daveti/badusb

Posted in OS, Security | Tagged , , , , , , , | Leave a comment