Reverse Engineering Patterns

Patterns for analyzing compiled code and binaries.

Why Reverse Engineering Matters

Even if you never write assembly, you need to READ it:

  1. Debugging - When the debugger drops you into disassembly

  2. Security - Understanding exploits, malware analysis

  3. Performance - Seeing what the compiler actually generated

  4. CTFs - Capture the flag competitions, crackmes

  5. Understanding - What does memcpy really do?

Tools covered: objdump, GDB, strace, ltrace, radare2

objdump - Quick Disassembly

# Basic disassembly (default: AT&T syntax)
objdump -d /bin/ls

# Intel syntax (MUCH more readable)
objdump -d -M intel /bin/ls

# With source interleaved (if compiled with -g)
objdump -d -S -M intel ./my_program

# Only specific section
objdump -d -M intel --section=.text /bin/ls

# Show relocations
objdump -d -r -M intel ./my_program.o

# Show all headers
objdump -x /bin/ls

# Show symbols
objdump -t /bin/ls    # All symbols
objdump -T /bin/ls    # Dynamic symbols only

# Disassemble specific function (grep + context)
objdump -d -M intel /bin/ls | grep -A 30 '<main>:'

# Shift all displayed addresses by a fixed offset (here 0x400000)
objdump -d -M intel --adjust-vma=0x400000 /bin/ls

objdump output explained:

0000000000401126 <main>:
  401126:   55                      push   rbp
  401127:   48 89 e5                mov    rbp,rsp
  ^^^^^^^^  ^^^^^^^^                ^^^^^^^^^^^
  Address   Machine code (hex)      Assembly instruction
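
The three columns can also be split programmatically. The following is a minimal sketch, not part of objdump itself: parse_objdump_line is a hypothetical helper, and it assumes the tab-separated layout that objdump actually emits ("  ADDR:<TAB>BYTES<TAB>INSTRUCTION"), whereas the listing above shows the columns padded with spaces.

```c
#include <ctype.h>
#include <stddef.h>
#include <stdlib.h>

/* Hypothetical helper: split one `objdump -d` line into address,
 * machine-code bytes and instruction text.
 * Assumes the layout "  ADDR:\tBYTES\tINSTRUCTION".
 * Returns the number of bytes parsed, or -1 on a malformed line. */
int parse_objdump_line(const char *line, unsigned long *addr,
                       unsigned char *bytes, size_t max_bytes,
                       const char **insn)
{
    char *end;
    *addr = strtoul(line, &end, 16);   /* strtoul skips leading spaces */
    if (end == line || end[0] != ':' || end[1] != '\t')
        return -1;

    const char *p = end + 2;
    size_t n = 0;
    while (*p != '\0' && *p != '\t') {
        if (*p == ' ') {               /* padding between byte pairs */
            p++;
            continue;
        }
        if (!isxdigit((unsigned char)p[0]) ||
            !isxdigit((unsigned char)p[1]) || n == max_bytes)
            return -1;
        char pair[3] = { p[0], p[1], '\0' };
        bytes[n++] = (unsigned char)strtoul(pair, NULL, 16);
        p += 2;
    }
    /* Continuation lines of long instructions carry bytes only. */
    *insn = (*p == '\t') ? p + 1 : p;
    return (int)n;
}
```

For the second line of the listing above, this would yield the address 0x401127, the three bytes 48 89 e5, and the text "mov    rbp,rsp".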

GDB - Interactive Analysis

# Start GDB
gdb ./program
gdb -q ./program              # Quiet mode (no banner)
gdb --args ./program arg1 arg2 # With arguments

# Basic commands
(gdb) run                     # Run program
(gdb) run arg1 arg2           # Run with arguments
(gdb) quit                    # Exit

# Breakpoints
(gdb) break main              # Break at function
(gdb) break *0x401126         # Break at address
(gdb) break file.c:25         # Break at line (with debug info)
(gdb) info breakpoints        # List breakpoints
(gdb) delete 1                # Delete breakpoint 1
(gdb) disable 1               # Disable breakpoint 1

# Execution control
(gdb) continue                # Continue execution
(gdb) stepi                   # Step one instruction
(gdb) nexti                   # Next instruction (skip calls)
(gdb) finish                  # Run until current function returns
(gdb) until *0x401150         # Run until address

# Examine registers
(gdb) info registers          # All registers
(gdb) info registers rax rbx  # Specific registers
(gdb) print $rax              # Print RAX value
(gdb) print/x $rax            # Print in hex
(gdb) print/t $rax            # Print in binary

GDB - Examining Memory

# x (examine) command format: x/NFU address
# N = count, F = format, U = unit size

# Format specifiers
# x = hex, d = decimal, u = unsigned, o = octal
# t = binary, a = address, c = char, s = string
# i = instruction

# Unit sizes
# b = byte, h = halfword (2), w = word (4), g = giant (8)

# Examples
(gdb) x/10x $rsp              # 10 hex words at stack pointer
(gdb) x/20i $rip              # 20 instructions at current position
(gdb) x/s 0x402000            # String at address
(gdb) x/10gx $rsp             # 10 quad words (64-bit) in hex
(gdb) x/4i $rip               # Next 4 instructions
(gdb) x/10bx $rax             # 10 bytes starting at address in RAX

# Examine memory at expression
(gdb) x/s *((char**)$rdi)     # String pointed to by pointer in RDI

# Show memory around address
(gdb) x/32xb 0x401000         # 32 bytes in hex

# Disassemble function
(gdb) disassemble main
(gdb) disassemble /r main     # With machine code bytes
(gdb) disassemble /m main     # With source (if available)
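
The unit size matters because x86-64 is little-endian: the same bytes print differently as b, h, w or g units. A minimal sketch of the grouping, where read_unit_le is a hypothetical name and not a GDB function:

```c
#include <stddef.h>
#include <stdint.h>

/* Combine `unit` raw bytes (1 = b, 2 = h, 4 = w, 8 = g) into one value
 * the way a little-endian machine reads them: the byte at the lowest
 * address is the least significant. */
uint64_t read_unit_le(const unsigned char *mem, size_t unit)
{
    uint64_t value = 0;
    for (size_t i = 0; i < unit && i < 8; i++)
        value |= (uint64_t)mem[i] << (8 * i);
    return value;
}
```

With the bytes 55 48 89 e5 in memory, a byte-sized read gives 0x55, a halfword gives 0x4855, and a word gives 0xe5894855, which is why x/4bx and x/1wx show the same memory in a different order.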

GDB - Advanced Usage

# Stack analysis
(gdb) backtrace               # Call stack
(gdb) bt full                 # With local variables
(gdb) info frame              # Current frame details
(gdb) info locals             # Local variables
(gdb) info args               # Function arguments

# Watch for memory changes (cast the address so GDB knows the size)
(gdb) watch *(int*)0x601050   # Break when memory changes
(gdb) rwatch *(int*)0x601050  # Break when read
(gdb) awatch *(int*)0x601050  # Break on read or write

# Conditional breakpoints
(gdb) break main if argc > 2
(gdb) break *0x401130 if $rax == 0

# Commands on breakpoint
(gdb) break *0x401130
(gdb) commands                # No number: applies to the breakpoint just set
> print/x $rax
> continue
> end

# Set register/memory values
(gdb) set $rax = 0x42
(gdb) set *(int*)0x601050 = 100
(gdb) set {char[6]}0x7fff1234 = "hello"   # 5 characters + NUL

# Process memory map
(gdb) info proc mappings

# Follow child processes
(gdb) set follow-fork-mode child

# Pwndbg/GEF enhancement commands (if installed)
(gdb) context                 # Show registers, stack, code
(gdb) vmmap                   # Memory mappings
(gdb) checksec                # Security features
(gdb) telescope $rsp 20       # Smart stack display

Recognizing Common Patterns

; ═══════════════════════════════════════════════════════════════════
; FUNCTION PROLOGUE / EPILOGUE
; ═══════════════════════════════════════════════════════════════════

; Standard prologue
push   rbp
mov    rbp, rsp
sub    rsp, 0x20        ; 32 bytes for locals

; Standard epilogue
leave                   ; mov rsp, rbp; pop rbp
ret
; OR
add    rsp, 0x20
pop    rbp
ret

; Leaf function (no frame setup)
; Just starts with the function body, ends with ret

; ═══════════════════════════════════════════════════════════════════
; IF-ELSE
; ═══════════════════════════════════════════════════════════════════

; if (x > 10) { a(); } else { b(); }
cmp    edi, 0xa         ; Compare x to 10
jle    .else_branch     ; If x <= 10, goto else
call   a                ; Then branch
jmp    .endif
.else_branch:
call   b                ; Else branch
.endif:

; ═══════════════════════════════════════════════════════════════════
; LOOPS
; ═══════════════════════════════════════════════════════════════════

; for (int i = 0; i < n; i++) { body(); }
xor    ecx, ecx         ; i = 0
jmp    .check
.loop:
; ... loop body ...
inc    ecx              ; i++
.check:
cmp    ecx, edi         ; i < n?
jl     .loop            ; If so, continue

; while (condition) { body(); }
jmp    .check
.loop:
; ... loop body ...
.check:
; ... compute condition ...
test   eax, eax
jnz    .loop

; ═══════════════════════════════════════════════════════════════════
; SWITCH/JUMP TABLE
; ═══════════════════════════════════════════════════════════════════

; switch (x) { case 0: ...; case 1: ...; case 2: ...; }
cmp    edi, 2           ; Check if x > max case
ja     .default
lea    rax, [rel .jump_table]
movsxd rdi, edi
jmp    [rax + rdi*8]    ; Jump to table[x]

.jump_table:
    dq .case_0
    dq .case_1
    dq .case_2
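
One way to see these shapes in real output is to compile a small file and disassemble it. The functions below are a hypothetical test file (call it patterns.c; all names are illustrative), intended for something like gcc -O1 -c patterns.c && objdump -d -M intel patterns.o. The compiler is free to pick other encodings: a switch this small, with constant results, is often lowered to a compare chain or a lookup table rather than a jump table.

```c
/* if-else: expect a cmp against 10 followed by a conditional jump
 * (or a branch-free sequence at higher optimization levels). */
int pick_branch(int x)
{
    if (x > 10)
        return 1;
    else
        return 2;
}

/* for loop: at -O0 expect the "jmp to check, body, compare, jl" layout. */
int count_up(int n)
{
    int iterations = 0;
    for (int i = 0; i < n; i++)
        iterations++;
    return iterations;
}

/* Dense switch: a candidate for a jump table when there are more cases
 * and the case bodies are more than constants. */
int dispatch(int x)
{
    switch (x) {
    case 0:  return 10;
    case 1:  return 20;
    case 2:  return 30;
    default: return -1;
    }
}
```

Comparing the -O0 and -O2 output of the same file shows how far the optimized code can drift from the textbook patterns.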

Recognizing Function Calls

; ═══════════════════════════════════════════════════════════════════
; PRINTF CALL
; ═══════════════════════════════════════════════════════════════════

; printf("x = %d, y = %d\n", x, y);
lea    rdi, [rel .LC0]  ; Format string address
mov    esi, [rbp-4]     ; x (second arg)
mov    edx, [rbp-8]     ; y (third arg)
xor    eax, eax         ; AL = 0: no vector (floating-point) register args (variadic call)
call   printf

; ═══════════════════════════════════════════════════════════════════
; MALLOC / FREE
; ═══════════════════════════════════════════════════════════════════

; void *p = malloc(100);
mov    edi, 100
call   malloc
mov    [rbp-8], rax     ; Store pointer

; free(p);
mov    rdi, [rbp-8]
call   free

; ═══════════════════════════════════════════════════════════════════
; STRCMP
; ═══════════════════════════════════════════════════════════════════

; if (strcmp(s1, s2) == 0)
mov    rdi, [rbp-8]     ; s1
mov    rsi, [rbp-16]    ; s2
call   strcmp
test   eax, eax         ; strcmp returns 0 if equal
jne    .not_equal

; ═══════════════════════════════════════════════════════════════════
; MEMCPY
; ═══════════════════════════════════════════════════════════════════

; memcpy(dest, src, n);
mov    rdi, [rbp-8]     ; dest
mov    rsi, [rbp-16]    ; src
mov    edx, 100         ; n
call   memcpy

; REP MOVSB (inline memcpy for small sizes)
lea    rdi, [rbp-40]    ; dest
lea    rsi, [rbp-80]    ; src
mov    ecx, 32          ; count (writing ECX zero-extends into RCX)
rep    movsb            ; Copy RCX bytes from [RSI] to [RDI]
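
All of the calls above follow the same register order for integer and pointer arguments. As a quick reference, here is a minimal lookup sketch for the System V AMD64 convention used on Linux (int_arg_register is a hypothetical helper name):

```c
#include <stddef.h>

/* Register that carries the index-th (0-based) integer or pointer
 * argument under the System V AMD64 calling convention. Arguments
 * beyond the sixth are passed on the stack. */
const char *int_arg_register(size_t index)
{
    static const char *const regs[] = {
        "rdi", "rsi", "rdx", "rcx", "r8", "r9"
    };
    return index < sizeof regs / sizeof regs[0] ? regs[index] : "stack";
}
```

So for memcpy(dest, src, n), index 0 is dest in rdi, index 1 is src in rsi, and index 2 is n in rdx, matching the listing above. Floating-point arguments use xmm0 to xmm7 and are counted separately.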

Analyzing Stripped Binaries

# Check if binary is stripped
file ./program
# output: "stripped" or "not stripped"

# Find main() in stripped PIE binary
# 1. Find __libc_start_main call
# 2. First argument (RDI) is main address

objdump -d -M intel ./program | grep -B5 '__libc_start_main'
# Look for: lea rdi, [rip+0x123] or similar before the call
# That displacement + address of the NEXT instruction = address of main

# Use radare2 for automated analysis
r2 ./program
[0x00001060]> aaa        # Analyze all
[0x00001060]> afl        # List functions
[0x00001060]> s main     # Seek to main
[0x00001060]> pdf        # Print disassembly of function

# Identify library functions by their patterns
# strlen: looks for null byte in loop
# memcpy: rep movsb or SIMD instructions
# strcmp: byte-by-byte comparison loop

Pattern: Finding main in stripped binary

; Typical _start (entry point)
_start:
    xor    ebp, ebp              ; Clear frame pointer
    mov    r9, rdx               ; rtld_fini
    pop    rsi                   ; argc
    mov    rdx, rsp              ; argv
    and    rsp, 0xfffffffffffffff0 ; Align stack to 16 bytes
    push   rax                   ; padding
    push   rsp                   ; stack_end
    lea    r8, [rip+0x...]       ; fini
    lea    rcx, [rip+0x...]      ; init
    lea    rdi, [rip+0x123]      ; ← THIS IS MAIN!
    call   __libc_start_main
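
Resolving the lea by hand is a single addition, but it is easy to forget that RIP already points at the next instruction when the displacement is applied. A minimal sketch (rip_relative_target is a hypothetical helper; the addresses in the note below are made up for illustration):

```c
#include <stddef.h>
#include <stdint.h>

/* Target of a RIP-relative operand: the displacement is added to the
 * address of the instruction FOLLOWING the one being decoded. */
uint64_t rip_relative_target(uint64_t insn_addr, size_t insn_len,
                             int32_t disp)
{
    return insn_addr + insn_len + (uint64_t)(int64_t)disp;
}
```

A lea rdi, [rip+disp32] is 7 bytes long (48 8d 3d plus a 4-byte displacement). If such an instruction sat at 0x1071 with displacement 0x123, main would be at 0x1071 + 7 + 0x123 = 0x119b.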

strace and ltrace

# STRACE - System call trace
strace ./program                    # Basic trace
strace -f ./program                 # Follow forks
strace -e trace=open,openat ./program  # Only open()/openat() calls
strace -e trace=file ./program      # File-related calls
strace -e trace=network ./program   # Network calls
strace -e trace=memory ./program    # Memory calls
strace -c ./program                 # Summary statistics
strace -tt ./program                # With timestamps
strace -o trace.log ./program       # Output to file

# Filter by syscall
strace -e openat,read,write ./program

# Attach to running process
strace -p $(pgrep program_name)

# LTRACE - Library call trace
ltrace ./program                    # Basic trace
ltrace -e malloc+free ./program     # Only malloc/free
ltrace -e '*' ./program             # All library calls
ltrace -c ./program                 # Summary

# Combine for full picture (each tracer starts its own run of the program)
strace -f -o syscalls.log ./program
ltrace -f -o libcalls.log ./program

# Example strace output
# openat(AT_FDCWD, "/etc/passwd", O_RDONLY) = 3
# read(3, "root:x:0:0:root:/root:/bin/bash\n"..., 4096) = 2773
# close(3)                                = 0
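
A small target makes the trace easier to read than a full application. The function below is a hypothetical helper (read_head is not from any library) that performs one open, one read and one close, which is the sequence in the example output above.

```c
#include <fcntl.h>
#include <sys/types.h>
#include <unistd.h>

/* Read up to len bytes from the start of path.
 * Returns the byte count, or -1 if the file cannot be opened or read. */
ssize_t read_head(const char *path, char *buf, size_t len)
{
    int fd = open(path, O_RDONLY);  /* modern glibc issues openat() here */
    if (fd < 0)
        return -1;
    ssize_t n = read(fd, buf, len);
    close(fd);
    return n;
}
```

Calling it from a small main and running that under strace -e trace=openat,read,close should show the three calls in order, with the return value of openat reused as the first argument of read and close.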

Security-Relevant Patterns

; ═══════════════════════════════════════════════════════════════════
; BUFFER OVERFLOW SETUP (vulnerable)
; ═══════════════════════════════════════════════════════════════════

vulnerable_function:
    push   rbp
    mov    rbp, rsp
    sub    rsp, 0x40        ; 64 bytes buffer

    lea    rdi, [rbp-0x40]  ; Buffer on stack
    call   gets             ; ← DANGEROUS! No bounds check
                            ; Can overwrite return address at [rbp+8]
    leave
    ret                     ; Returns to attacker-controlled address!

; ═══════════════════════════════════════════════════════════════════
; STACK CANARY CHECK
; ═══════════════════════════════════════════════════════════════════

protected_function:
    push   rbp
    mov    rbp, rsp
    sub    rsp, 0x50
    mov    rax, QWORD PTR fs:[0x28]  ; Load canary from TLS
    mov    QWORD PTR [rbp-0x8], rax  ; Store on stack

    ; ... function body ...

    mov    rax, QWORD PTR [rbp-0x8]  ; Load canary from stack
    xor    rax, QWORD PTR fs:[0x28]  ; Compare with original
    jne    .stack_smash_detected     ; If different, abort
    leave
    ret

.stack_smash_detected:
    call   __stack_chk_fail          ; Terminates program

; ═══════════════════════════════════════════════════════════════════
; FORMAT STRING VULNERABILITY
; ═══════════════════════════════════════════════════════════════════

; Vulnerable:
lea    rdi, [rbp-0x100]     ; User-controlled buffer
xor    eax, eax
call   printf               ; printf(user_input)
                            ; User can use %x, %n to read/write memory

; Safe:
lea    rdi, [rel .format]   ; "%s"
lea    rsi, [rbp-0x100]     ; User input
xor    eax, eax
call   printf               ; printf("%s", user_input)

; ═══════════════════════════════════════════════════════════════════
; ROP GADGETS (Return-Oriented Programming)
; ═══════════════════════════════════════════════════════════════════

; Useful gadgets to look for:
pop rdi; ret            ; Set first argument
pop rsi; ret            ; Set second argument
pop rdx; ret            ; Set third argument
pop rax; ret            ; Set syscall number
syscall; ret            ; Execute syscall
mov rdi, rax; ret       ; Move return value to argument
xchg rax, rdi; ret      ; Swap registers

# Find gadgets with ROPgadget
ROPgadget --binary ./program | grep "pop rdi"

# Check security features
checksec --file=./program
# Output shows: RELRO, Stack Canary, NX, PIE, etc.

# Check with readelf
readelf -l ./program | grep GNU_STACK
# If RWE (read-write-execute), stack is executable (bad!)
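
On the source side, the two vulnerable patterns above have direct fixes. This is a minimal sketch with hypothetical helper names, not a complete hardening guide: a bounded read in place of gets, and user text passed as data rather than as the format string.

```c
#include <stdio.h>
#include <string.h>

/* Bounded replacement for gets(): fgets() writes at most size bytes,
 * including the terminating NUL. Returns buf with the trailing newline
 * removed, or NULL on EOF or error. */
char *read_line_bounded(FILE *in, char *buf, size_t size)
{
    if (size == 0 || fgets(buf, (int)size, in) == NULL)
        return NULL;
    buf[strcspn(buf, "\n")] = '\0';
    return buf;
}

/* User text is an argument to "%s", never the format itself, so
 * sequences such as %x or %n are copied literally. */
int format_user_text(char *out, size_t out_size, const char *user_text)
{
    return snprintf(out, out_size, "%s", user_text);
}
```

In the disassembly, the first shows up as a call to fgets with the buffer size in esi, and the second as a fixed format string loaded into rdi with the user buffer in rsi.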

Practical Exercise: Crack a Password Check

// password_check.c - Compile and analyze
#include <string.h>
#include <stdio.h>

int check_password(const char *input) {
    const char *secret = "hunter2";
    return strcmp(input, secret) == 0;
}

int main(int argc, char **argv) {
    if (argc != 2) {
        printf("Usage: %s <password>\n", argv[0]);
        return 1;
    }
    if (check_password(argv[1])) {
        printf("Access granted!\n");
        return 0;
    } else {
        printf("Access denied!\n");
        return 1;
    }
}

# Compile
gcc -O0 -o password_check password_check.c

# Step 1: Find the check_password function
objdump -d -M intel password_check | grep -A 30 '<check_password>'

# Step 2: Find where secret string is loaded
# Look for: lea rax, [rip+0x...] (the string address). At -O0 it is typically
# stored to a local first and moved into rsi just before the strcmp call

# Step 3: Find the string
strings password_check | grep -E '^[a-z0-9]{5,10}$'
# Or: objdump -s -j .rodata password_check

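# Step 4: Using GDB
gdb ./password_check
(gdb) break check_password
(gdb) run test_password
(gdb) disassemble             # Note the address of the "call ... <strcmp@plt>" line
(gdb) until *ADDRESS_OF_CALL  # Substitute that address; stops just before the call
(gdb) x/s $rsi                # Arguments are now set up: RSI points at the secret
# Output (address will vary): 0x402004: "hunter2"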

# Step 5: Using ltrace
ltrace ./password_check test_password
# Shows: strcmp("test_password", "hunter2") = ...

# Verify
./password_check hunter2
# Output: Access granted!
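
Step 3 works because the secret sits in the binary as plain bytes. A rough sketch of the scan that strings performs follows; first_printable_run is a hypothetical name, and the real tool reports every run rather than only the first and has more options.

```c
#include <stddef.h>

/* Find the first run of at least min_len printable ASCII bytes in data
 * and copy it to out (NUL-terminated). Runs that do not fit in out are
 * skipped. Returns the run length, or 0 if nothing qualifies. */
size_t first_printable_run(const unsigned char *data, size_t len,
                           size_t min_len, char *out, size_t out_size)
{
    size_t start = 0;
    for (size_t i = 0; i <= len; i++) {
        int printable = i < len && data[i] >= 0x20 && data[i] <= 0x7e;
        if (printable)
            continue;
        /* A non-printable byte (or the end of data) closes the run. */
        size_t run = i - start;
        if (run >= min_len && run < out_size) {
            for (size_t j = 0; j < run; j++)
                out[j] = (char)data[start + j];
            out[run] = '\0';
            return run;
        }
        start = i + 1;
    }
    return 0;
}
```

With a minimum length of 4 (the default used by GNU strings), a buffer containing 01 02 "hunter2" 00 "ab" 00 yields "hunter2" and ignores "ab".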

Understanding Compiler Optimization

# Same code at different optimization levels
cat << 'EOF' > test.c
int sum(int *arr, int n) {
    int total = 0;
    for (int i = 0; i < n; i++) {
        total += arr[i];
    }
    return total;
}
EOF

# No optimization (-O0)
gcc -O0 -c test.c && objdump -d -M intel test.o
# Very verbose, matches C code closely
# Lots of stack access, no register reuse

# Optimization (-O2)
gcc -O2 -c test.c && objdump -d -M intel test.o
# Registers instead of stack, tighter loop
# Vectorization at -O2 depends on the GCC version (limited, from GCC 12 on)
# May be hard to follow

# Aggressive optimization (-O3)
gcc -O3 -c test.c && objdump -d -M intel test.o
# Vectorized loop (SSE2 by default on x86-64), possibly unrolled
# AVX only appears with flags such as -mavx2 or -march=native

Common optimizations to recognize:

; Strength reduction: multiply → shift
imul eax, 8        ; Original
shl  eax, 3        ; Optimized (same result, faster)

; Loop invariant code motion
.loop:
    mov  eax, [rbx]    ; Same load repeated every iteration (value never changes)
    add  ecx, eax
    ...
; Becomes:
mov  eax, [rbx]        ; Load once before loop
.loop:
    add  ecx, eax      ; Use cached value
    ...

; Dead code elimination (dead store)
mov  eax, 5            ; Overwritten before use, so the compiler removes it
mov  eax, 10
ret

; Inline expansion
call small_function    ; Before
; Becomes the function body inlined ; After

; Tail call optimization
func:
    ...
    call other_func    ; Normal
    ret
; Becomes:
func:
    ...
    jmp other_func     ; Tail call (saves stack frame)
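
To reproduce two of these at home, the functions below can be compiled at -O0 and -O2 and compared. This is a sketch with hypothetical names; __attribute__((noinline)) is a GCC/Clang extension, used here only so the helper is not inlined and the tail call stays visible.

```c
/* Strength reduction: both functions compute the same value, and an
 * optimizing compiler typically emits a shift or lea for either one. */
unsigned times_eight(unsigned x) { return x * 8u; }
unsigned shift_three(unsigned x) { return x << 3; }

/* Kept out of line so the caller below has something to jump to. */
__attribute__((noinline))
unsigned add_one(unsigned x) { return x + 1u; }

/* Tail call: the call is the last action, so at -O2 the compiler may
 * replace "call add_one; ret" with "jmp add_one". */
unsigned double_then_add_one(unsigned x)
{
    return add_one(x * 2u);
}
```

Unsigned types are used so that the multiply and the shift are defined for every input, including values that wrap around.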

Tools Quick Reference

Tool         Purpose
objdump      Quick disassembly, symbols, sections
readelf      ELF header analysis, segments, sections
nm           Symbol table listing
strings      Find printable strings in binary
file         Identify file type
ldd          List shared library dependencies
strace       System call tracing
ltrace       Library call tracing
gdb          Interactive debugging
r2/radare2   Advanced reversing framework
ghidra       Decompilation (free, NSA)
IDA          Decompilation (commercial)
pwntools     CTF exploitation framework
ROPgadget    Find ROP gadgets
checksec     Check security features

# Quick binary analysis workflow
file ./binary
checksec --file=./binary
strings ./binary | head -50
nm ./binary 2>/dev/null || echo "stripped"
objdump -d -M intel ./binary | head -100
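
The first step of the workflow, file, starts from the same check any ELF tool makes: the four magic bytes and the class byte at the start of the file. A minimal sketch (elf_class_bits is a hypothetical helper; the real file command inspects much more than this):

```c
#include <stddef.h>

/* Inspect the first bytes of a file image.
 * Returns 32 or 64 for an ELF header of that class, 0 otherwise. */
int elf_class_bits(const unsigned char *hdr, size_t len)
{
    if (len < 5 || hdr[0] != 0x7f || hdr[1] != 'E' ||
        hdr[2] != 'L' || hdr[3] != 'F')
        return 0;                    /* not an ELF file */
    if (hdr[4] == 1)
        return 32;                   /* EI_CLASS == ELFCLASS32 */
    if (hdr[4] == 2)
        return 64;                   /* EI_CLASS == ELFCLASS64 */
    return 0;
}
```

The same bytes are visible with objdump -s or in the first line of a hex dump: 7f 45 4c 46 followed by 01 or 02.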