Monday, September 22, 2014

CSAW CTF 2014 Quals: Crypto 300 Feal Writeup

This challenge was completed by {Caesar, Lynxjerm, Sliden, Unix-dude}@RPISEC.

Our approach was the following:
  1. Wrote a python script to connect to the CSAW server and complete a SHA-1 proof-of-work
  2. Determined the follow-on cipher to be FEAL-4
  3. Researched attacks on FEAL-4 and found it vulnerable to differential cryptanalysis
  4. Found slight differences between CSAW's FEAL-4 implementation and the standard version
  5. Adjusted differential cryptanalysis attack to fit CSAW's modified implementation of FEAL-4

SHA-1 Proof-of-Work

Upon connecting to the server, we see the following:

The server sends 12 random bytes encoded into base64. The goal is to send back a SHA-1 hash which begins with those 12 bytes and ends with all 1 bits.

Our client code receives those 12 prefix bytes and generates a SHA-1 hash till it matches:

FEAL Cipher

After completing the proof-of-work, the server greets you with the message:

The number after "Please decrypt: " is the flag encrypted with a random key which changes on each connection.

As far as we could tell, there is no feal 4.3 cipher, there is only FEAL-4, FEAL-8, and FEAL-N, where the numbers 4, 8, and N indicate the number of rounds of the cipher. Here is an excerpt from the wikipedia page on FEAL:

In cryptography, FEAL (the Fast data Encipherment ALgorithm) is a block cipher proposed as an alternative to the Data Encryption Standard (DES), and designed to be much faster in software. The Feistel based algorithm was first published in 1987 by Akihiro Shimizu and Shoji Miyaguchi from NTT. The cipher is susceptible to various forms of cryptanalysis, and has acted as a catalyst in the discovery of differential and linear cryptanalysis.
An impressive page on cryptography and cryptanalysis is by Jon King. He even has a page on differential cryptanalysis of the FEAL-4 cipher. It includes a video, a tutorial with diagrams, and source code. Below are diagrams of the FEAL-4 cipher with permission from King. Please visit his page if you'd like a more in-depth explanation the cipher and its cryptanalysis.

Here we see the cipher has four rounds and six subkeys.
Each round has a round function, also known as an F-Box, and each F-box contains four G-Boxes.

Differential Cryptanalysis

At a high level, differential cryptanalysis is a chosen plaintext attack that works by generating plaintext messages with a known relationship and tracing how that relationship changes at each round of the encryption. Determining how the relationship changes leaks information about the subkeys.

The first step is generating a random plaintext. The next is to derive a second plaintext by XORing the first with a specific number called the differential. The differential is chosen based on how the encryption rounds of a particular cipher affect it. In the case of FEAL-4, we generate three sets of plaintext pairs, each set of pairs has a differential that is specific to a round. There are three sets of plaintext pairs and differentials instead of four because the first round requires an approach closer to brute-force. In addition to the three input differentials that correspond to three rounds, there is an expected output differential. Combining these known input and output relationships allows us to dramatically shrink the space of the subkeys and brute-force them.

Running this differential cryptanalysis attack against CSAW's version of FEAL-4 did not work though. In the following section, I highlight why and what we did to fix it.

CSAW's Version of FEAL: feal 4.3

There was one slight difference in feal 4.3 versus the standard FEAL-4 cipher. That difference was the cyclical bit shift inside the G-Boxes. See the diagrams from the FEAL-4 section above. feal 4.3 had a cyclic bit shift of 3 and the standard FEAL-4 has only a cyclic bit shift of 2.

feal 4.3 python code:

Standard FEAL-4 code from King:

The greeting "Welcome to feal 4.3" given after the SHA-1 proof-of-work seems to hint towards this change. This modified bit shift in the encryption algorithm required us to change our cryptanalysis to account for it. We changed the third input differential from 0x0000000002000000 to 0x0000000004000000, just an extra bit shift. We verified this adaptation by modifying King's implementation to do a bit shift of 3  in the encryption and confirmed King's cracker could only extract subkeys with our new differential.


There was possibly a second difference in feal 4.3. It may have been due to something we did or maybe it was the way CSAW implemented its algorithm, or both.

We think the way the F-boxes were implemented in CSAW's feal 4.3 differed from that provided by King. We could not get ciphertexts from CSAW's version to match King's. This was until we swapped the bytes in the F-box in King's C code, as in swap(x0, x3) and swap( x1, x2). This led us to matching ciphertexts between the CSAW's version and King's version. To isolate this detail, we stepped through and verified every bit shift between the two versions.

Assuming King's implementation is correct, which we believe it is, there may have been some endianness issues in the python code. Another possibility is that issues could have come from the way we were sending the plaintext to the server. We experimented with sending little endian and big endian plaintexts.

After verifying that King's code and CSAW's were encrypting identically, we noticed the differential cryptanalysis attack stopped working. This is because swapping the bytes in our F-box required us to again change the differentials. We changed the third input differential again and now also the output differential from 0x0000000004000000 to 0x0000000000000004. With these changes we were able to successfully mount a differential cryptanalysis attack to extract all six subkeys from the ciphertext generated by CSAW's server.


With only 5 minutes left in the CTF, we got the 6 subkeys. Our screenshot says Bad subkeys because of some debug code we left in it. Seeing the screen below was a powerful moment after 24 hours of work.

Then it got a bit chaotic. I read the hex bytes out loud to Unix-dude, who also sent them to Sliden. They both tried to decrypt the flag and submit it to the server.

With less than 3 minutes left, we scored.

It was a dramatic finish and the most fun I've ever had in a CTF. Thank you CSAW organizers and thank you Jon King. I look forward to learning more about cryptography.

Saturday, September 6, 2014

Reversing a 16-bit NE File Part 1: Clumsy and Unprepared

A friend and I were reminiscing about the hacking we were doing around 15 years ago. It got me thinking about an old AOL cracking program called Sabotage. So I found a copy on an old AOL hacking website. It turns out it had a password so I decided to reverse it over the weekend. The password was available online so reversing it was just for kicks. The following is the tale of that diversion. Names have been changed to protect the innocent.

The program in question is Sabotage Cracker 3.3, available here:

The zip file contains an installer which I unpacked using Universal Extractor. Below is the screenshot showing the 3 folders contained in the installer with the files of each folder.

When we run Cracker.exe, we see the following:

Click OK with no password:

Try a password and fail, we see the following typed one letter at a time in the box, then the program quits.

OK, so far pretty unexciting. We should be able to get the password in no time. It was written in the mid 90's, how protected could it be?

16-bit, Your Tools Have No Power Here

Let's start with identifying what type of file it is with the file program in cygwin.

This is an New Executable (NE), which I have never heard of before this. The NE format was after the old DOS flat assembly format and before the 32-bit PE format.

Let's see what IDA Pro 6.5 can do with this. We should be able to get this done with a few jumps around the code, bada bing, we're done. Here's what we get:

The navigation bar which has been boxed in red shows that the almost the entire program is unexplored by IDA.

IDA correctly identifies it as a 16-bit NE file but only parses a total of 3 instructions correctly.

It's possible to go through the listing and convert the bytes to code or data, but I thought that there's an easier way.

I tried to get some other information statically but the following tools cannot process 16-bit executables: 
  • PEView
  • PE-Explorer
  • CFF Explorer
  • Dependency Walker
  • PEiD
  • Resource Hacker

IDA et al. you had your chance, c'mon OllyDbg, let's get this over with, I want to grab lunch.


Well right, Olly only handles 32-bit too, but let's see what it does anyways.

So it runs in Olly, but there's no code or memory being shown.

Task manager shows it listed in the applications when run:

Task Manager shows it listed in under Processes but it's indented with wowexec and has no Process ID (PID).

Process Explorer does not even list wowexec or cracker.exe.

However, it does list "ntvdm.exe", which I don't recognize. It turns out NTVDM is the compatibility layer that allows 16-bit executables to run on 32-bit Windows XP. It's a built-in virtual machine, which allows seamless execution of 16-bit executables.

A 16-bit exe run under Windows XP runs under the ntvdm.exe process. To debug a 16-bit process on XP, one must debug it through the ntvdm virtual machine. There is a debugging library built into Windows called VDMDBG.dll. There is an article on that here.  I could not find any debugger to debug the process using the VDMDBG functionality. I'm not sure if VDMBDG.dll implements all traditional debugger functionality like single-stepping.

I did find an article and some code in an article written in 1998. The code is for VDMDBGDemo.exe which demonstrates all the functionality of VDMBDG.dll. This functionality only includes enumerated 16-bit processes and listing modules they have loaded.

This is consistent with the functions exported by VDMBDG.dll.

There may be some undocumented functionality but I'm trying to do a breadth-first search for the easiest way to get a stupid password out of this program. 

Visual Basic 3.0 and Decompiling (from 1993)

A run of strings on the program gives us 15k+ strings, some are relevant, most not. But one of the first is VBRUN300, which is the name of the Visual Basic (VB) 3.0 runtime DLL.

VB 3.0 came out in 1993. There's a copy of a manual scanned and upload here. So this program may be about 20 years old. So this is half a reversing challenge, half an archeological expedition.

Some googling brings me to Dodi's Visual Basic 4 Decompiler, on an Angelfire website, which I didn't know was still around.

Here are the files the decompiler spits out:

Grep those files for the reject string "See Ya!":

Looking at frm1.bas:

We found the good boy string "Access Granted". I don't know Visual Basic, but I looked around a little and couldn't trace back the good boy message to any input string. Now I could learn VB fairly quickly to assist my hunt, but I decided to pause on this approach and return to dynamic analysis.

OllyDbg Round 2

So IDA and OllyDBG were kind of a bust the first time around. I looked into using Bochs emulator with IDA but it doesn't support 16-bit executables. This time I was going use OllyDbg and attach to the ntvdm.exe emulator and try to find my program.

Started Cracker.exe, then opened OllyDbg and attached to ntvdm.exe. After passing back a few exceptions the program is now running and waiting for input. OllyDbg says there is 3 threads. To figure out which one of the threads in ntvdm.exe is the one interacting with Cracker.exe, I click on the cracker window a few times to increase the User Time in OllyDbg. You can see it is the one that is non-zero with a value of 0.6309s. Now I switch to that thread context.

I open the memory map window, there is no sign of Cracker.exe. There is ntvdm and wow32. I know wow is Windows on Windows, another DLL having to do with emulation. I also remember seeing wowexec in the VDMDBGDemo program I showed earlier in the post. 

Next I go to WOW32 in the CPU window in OllyDbg and right-click->Search for Intermodular Calls. Within the listing of calls I look for anything that might interface with the text dialog box or the GUI. After fumbling around with placing breakpoints on a few function calls, I finally get a response on User32.SendMessageA.

I set a breakpoint on all calls to that. Now every time I enter text or click on a button, I hit a breakpoint.

Next I enter a password of "AAAA" and get the message "See Ya!", but one character at a time.

Looking at the stack I see a memory address of 0x0003852E. I know this isn't any known DLL by cross-referencing the memory window. This falls under the segment of 0x10000 with size 0x90000.

It's possible NTVDM always stores the 16-bit file in the memory range 0x10000 - 0xA0000. Running spawns ntvdm.exe too. Checking the memory range 0x10000-0xA0000 returns ASCII that would suggest that is also stored in the memory range. This is useful to know.

I follow that memory segment in the dump window and see some ASCII. I saved the dump to disk through right-click->Backup->Save data to file.

Then run that file through strings and search for "Pass".

We see the string "xSAAAr PassWord and Retry". This looks like my password of "AAAA" overwrote the string "Enter the PassWord and Retry". It also looks as if the string "See Ya!" is about to overwrite the same string. And hitting continue a few more times through the breakpoints does demonstrate that "See Ya!" ends up overwriting the original string. Just below we see the string of "leetness" which is the password.

I knew the password was "leetness" from the beginning so I did everything without using that knowledge, although at the end I wonder if I would have made the final connection seeing "leetness" in memory.

Here's your prize! You get to crack AOL accounts for AOL 2.5, 3.0, and maybe 4.0. I don't remember.

Other Approaches

 I tried Turbo Debugger for Windows 3.1 on a Windows 98 VM but I think there were some anti-debug mechanisms that Turbo Debugger was not equipped to handle or I wasn't trained enough with Turbo.

Conclusion and Future Work

So most 32-bit tools don't work on 16-bit executables, and 32-bit tools are mostly what I have experience with. On a positive note, 16-bit executables can be debugged through OllyDbg by attaching to ntvdm.exe. I hope that helps somebody.

I fumbled through this process but had fun taking a trip down memory lane. I'd like to gather a 16-bit toolset to better tackle this challenge and try again. I just read some good things about DOSBox and it's built-in debugger. Also, Open Watcom has 16-bit tools like a compiler and debugger. I tried them on WinXP but ran into the ntvdm emulation layer problem. So I'll be trying those on Windows 98.

Edit: another picture

Monday, August 26, 2013

PE Runtime Data Structures v1

This is a diagram of PE runtime data structures. I used OmniGraffle. I will post the OmniGraffle files when I get ahold of them again. They are on another computer. 

I was inspired by Ero Carrera's [1] diagrams and Corkami [2]. I made this diagram because I was teaching myself Windows data structures and was unsatisfied with what was out there. The information for these structures was obtained from "Windows Internals 6" by Russinovich, Solomon, and Ionescu [3].

It is not finished, but I figured I should just upload it now instead of whenever I get around to finishing it. Hopefully I haven't made any mistakes. It will probably go through many iterations, maybe end up being interactive.

Edit: Nov. 10, 2013 
OmniGraffle file: