A quick look into the DOS exe header and relocation table

Continuing from where we left off, let's examine USNF.EXE in a hex editor to get a better understanding of the file structure. The file command identifies it as an MS-DOS executable.

DOS exe header

A good description of the DOS exe header can be found at http://www.delorie.com/djgpp/doc/exe/

Offset  Size  Value          Description
0x00    word  0x5a4d         ASCII characters 'MZ'.
0x02    word  0x0150 (336)   Number of bytes in the last block of the program that are actually used. Zero means the entire last block is used (effectively 512).
0x04    word  0x000d (13)    Number of 512 byte blocks in the file that are part of the exe.
0x06    word  0x005d (93)    Number of relocation entries.
0x08    word  0x001c (28)    Number of 16 byte paragraphs in the header. This is the start of the program data.
0x0a    word  0x0007 (7)     Minimum number of 16 byte paragraphs required by the program.
0x0c    word  0xffff         Maximum number of 16 byte paragraphs requested by the program.
0x0e    word  0x0176         Relative value of the stack segment. Added to the segment the program was loaded at.
0x10    word  0x0064         Initial value of the SP register.
0x12    word  0x0000         If set, the 16 bit sum of all words in the file should be zero. Normally not filled in.
0x14    word  0x0000         Initial value of the IP register.
0x16    word  0x0020         Relative value of the code segment. Added to the segment the program was loaded at.
0x18    word  0x0040         Offset of the relocation table (relative to the start of the file).
0x1a    word  0x0000         Overlay number. Zero indicates it's the main program.
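
The table can also be read programmatically. The following is a minimal Python sketch, assuming the header is fourteen little-endian 16 bit words as listed above. The field names and the parse_mz_header helper are made up for illustration, and the sample bytes are rebuilt from the table values rather than read from USNF.EXE.

```python
import struct

# Field names are illustrative; the order follows the table above.
MZ_FIELDS = (
    "signature", "bytes_in_last_block", "blocks_in_file", "num_relocs",
    "header_paragraphs", "min_extra_paragraphs", "max_extra_paragraphs",
    "ss", "sp", "checksum", "ip", "cs", "reloc_table_offset", "overlay_number",
)

def parse_mz_header(data: bytes) -> dict:
    """Decode the first 28 bytes of an MZ executable into a dict."""
    values = struct.unpack_from("<14H", data, 0)
    header = dict(zip(MZ_FIELDS, values))
    if header["signature"] != 0x5A4D:  # 'MZ' read as a little-endian word
        raise ValueError("not an MZ executable")
    return header

# Header bytes rebuilt from the table values (not read from the real file).
sample = struct.pack(
    "<14H",
    0x5A4D, 0x0150, 0x000D, 0x005D, 0x001C, 0x0007, 0xFFFF,
    0x0176, 0x0064, 0x0000, 0x0000, 0x0020, 0x0040, 0x0000,
)
header = parse_mz_header(sample)
print(hex(header["cs"]), header["num_relocs"])
```

With a real file, the same helper could be fed the first 28 bytes read from disk.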

Finding the start of the program data:

The word at offset 0x08 tells us the size of the MZ header (0x1c or 28 paragraphs). Thus the start of program data will be offset 0x1c0 or 448 bytes (28 x 16) from the start of the file. Indeed we can find a copyright message at this location.

Finding the program entry point:

The words at offsets 0x14 and 0x16 give us the offset and (relative) segment of the program entry point respectively. To find the entry point in the file, convert the relative segment (0x20 paragraphs) to bytes (0x200), add it to the start of program data (0x1c0), then add the initial value of the IP register (zero in our case). We should find the program entry point at file offset 0x3c0.
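
The same arithmetic as a small Python sketch. The function name is hypothetical, and it assumes the entry point lies within the program data stored in the file.

```python
PARAGRAPH = 16  # bytes per paragraph

def entry_point_file_offset(header_paragraphs: int, cs: int, ip: int) -> int:
    """File offset of the first instruction, from header words 0x08, 0x16 and 0x14."""
    program_data_start = header_paragraphs * PARAGRAPH
    return program_data_start + cs * PARAGRAPH + ip

offset = entry_point_file_offset(0x001C, 0x0020, 0x0000)
print(hex(offset))  # 0x3c0
```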

Calculating the program size:

The words at offsets 0x02 and 0x04 tell us the executable occupies 12 full 512 byte blocks, plus 336 bytes in the final block: (12 * 512) + 336 = 6,480 bytes (0x1950). Going by the description linked above, the block count covers the file from offset zero, header included, so the MZ image ends at file offset 0x1950 and the program data following the 0x1c0 byte header is 6,032 bytes long. Further on, around 0x1b10, the file contains a bunch of zeroes, but there is something interesting at 0x1b50 that could be a Windows PE header. We'll look at it in more detail later.
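
A short Python sketch of the size calculation, following the formula given in the delorie description, where the result is an offset from the start of the file and so already includes the header. The function name is made up for illustration.

```python
BLOCK = 512  # bytes per block

def mz_image_size(blocks_in_file: int, bytes_in_last_block: int) -> int:
    """Bytes of the file that belong to the MZ image, counted from offset 0."""
    size = blocks_in_file * BLOCK
    if bytes_in_last_block:
        # Only part of the final block is used.
        size -= BLOCK - bytes_in_last_block
    return size

size = mz_image_size(0x000D, 0x0150)
print(size, hex(size))  # 6480 0x1950
```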

Relocation table

The word at offset 0x18 tells us the start of the relocation table, in our case 0x40. The word at offset 0x06 tells us the number of relocation entries (93). Each relocation entry is four bytes and consists of a 16 bit offset followed by a 16 bit segment. For each entry, the loader adds the start segment address to the word value pointed to by the segment:offset pair.
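
Reading the table can be sketched in Python as below. The read_relocations helper is hypothetical, and the two sample entries are the first two from the file, laid out in on-disk order (offset word first, then segment word).

```python
import struct

def read_relocations(data: bytes, table_offset: int, count: int):
    """Return (segment, offset) tuples from an MZ relocation table."""
    entries = []
    for i in range(count):
        # Each entry is stored as offset first, then segment, both little-endian.
        off, seg = struct.unpack_from("<HH", data, table_offset + 4 * i)
        entries.append((seg, off))
    return entries

# Stand-in file: 0x40 bytes of padding followed by two relocation entries.
table = bytes.fromhex("3c000000" "40000000")
relocs = read_relocations(b"\x00" * 0x40 + table, 0x40, 2)
print([f"{seg:04X}:{off:04X}" for seg, off in relocs])
```

For the real file the call would use the table offset 0x40 and the count 93 taken from the header.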

Calculate the start segment address:

We know the start of the program data (based on the word at offset 0x08) is at offset 0x1c0 from the start of the file. To find out where it was loaded into memory we can subtract the word at offset 0x16 (0x20) from the initial value in the CS register (0x020D). This gives a start segment address value of 0x01ED, representing the location the program was loaded into memory.
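
As a quick check of the subtraction, here is a Python sketch. The initial CS value 0x020D is the one observed in this particular session and will differ from run to run; the function name is illustrative.

```python
def start_segment(initial_cs: int, header_cs: int) -> int:
    """Segment the program data was loaded at, given the live CS and header word 0x16."""
    return (initial_cs - header_cs) & 0xFFFF

seg = start_segment(0x020D, 0x0020)
print(f"{seg:04X}")   # 01ED
print(hex(seg << 4))  # 0x1ed0, the linear address of that segment in real mode
```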

Let's examine the first two relocation entries:

0000:003C

0000:0040

Adding the offsets to the start of the program data, we find the file contains zeroes at offsets 0x1fc and 0x200. Comparing the memory view of the corresponding location in the loaded file (01ED:003C and 01ED:0040), we can confirm the zeroes have been altered to 0x01ED.
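
What the loader does for these two entries can be sketched in Python as follows. The zero-filled bytearray is a stand-in for the program data, not the real file contents, and apply_relocations is a hypothetical helper.

```python
import struct

def apply_relocations(image: bytearray, relocs, start_seg: int) -> None:
    """Add start_seg to each 16 bit word addressed by a (segment, offset) entry."""
    for seg, off in relocs:
        pos = seg * 16 + off  # position relative to the start of program data
        (value,) = struct.unpack_from("<H", image, pos)
        struct.pack_into("<H", image, pos, (value + start_seg) & 0xFFFF)

image = bytearray(0x50)  # zero-filled stand-in for the start of program data
apply_relocations(image, [(0x0000, 0x003C), (0x0000, 0x0040)], 0x01ED)
print(image[0x3C:0x3E].hex(), image[0x40:0x42].hex())  # ed01 ed01
```

The bytes print as ed01 because the word 0x01ED is stored little-endian.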

In fact, it appears that the bytes starting at 0x003C to 0x004F consist of five segment:offset pairs, all pointing to locations within the starting segment.

Let's look at another, non-trivial example, near the program entry point:

Look closely at the instruction at 020D:0002:

9A 02 00 7C 02    call 027C:0002

Compared with the original bytes in the source file:

9A 02 00 8F 00

Note that the value 0x008F at address 020D:0005 has had 0x01ED added to it, becoming 0x027C, the segment operand of the call instruction. Let's find the relocation entry corresponding to this address.

Look at offset 0x0054 from the beginning of the file:

0020:0005

Remember, our file was loaded into segment 0x01ED. Adding 0x0020 to that segment gives us 0x020D (the initial value of the CS register), and the offset 0x0005 points at the segment word of our call instruction, which is exactly the value that was adjusted!
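
The call instruction example can be reproduced with a few lines of Python. This is only a sketch: the buffer is a stand-in holding just the five instruction bytes quoted above, and patch_word is a hypothetical helper.

```python
import struct

START_SEG = 0x01ED  # load segment worked out above

def patch_word(image: bytearray, seg: int, off: int, delta: int) -> int:
    """Add delta to the word at seg:off (relative to program data) and return it."""
    pos = seg * 16 + off
    (value,) = struct.unpack_from("<H", image, pos)
    value = (value + delta) & 0xFFFF
    struct.pack_into("<H", image, pos, value)
    return value

# The entry at file offset 0x54 is entry number 5 in a table starting at 0x40.
entry_index = (0x54 - 0x40) // 4

image = bytearray(0x210)                          # stand-in program data
image[0x202:0x207] = bytes.fromhex("9a02008f00")  # unrelocated far call at 0020:0002
patched = patch_word(image, 0x0020, 0x0005, START_SEG)
print(entry_index, f"{patched:04X}", image[0x202:0x207].hex(" "))
# 5 027C 9a 02 00 7c 02
```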

Now that we understand how the relocation table works, take a close look at the segment values of the relocation entries. We've already seen the five entries in segment 0x0000, and there are a few entries in segments 0x0020 and 0x0027, but the majority are in segment 0x008F. This gives us an idea about the structure of the executable code in the file.
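
Tallying the entries by segment is easy to automate. In the Python sketch below only the first three entries come from the file; the two 0x008F entries are made-up placeholders so that the example runs, and segment_histogram is a hypothetical helper.

```python
from collections import Counter

def segment_histogram(relocs):
    """Count relocation entries per segment."""
    return Counter(seg for seg, _off in relocs)

relocs = [
    (0x0000, 0x003C), (0x0000, 0x0040), (0x0020, 0x0005),  # seen in the file
    (0x008F, 0x0010), (0x008F, 0x0024),                    # placeholders only
]
histogram = segment_histogram(relocs)
for seg, count in sorted(histogram.items()):
    print(f"{seg:04X}: {count}")
```

Run against the full list of 93 entries, the same tally would show how the entries are spread across the segments.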

Using the information we've discovered, we can map out the file structure.
