Reverse Engineering Patterns
Patterns for analyzing compiled code and binaries.
Why Reverse Engineering Matters
Even if you never write assembly, you need to READ it:
- Debugging - When the debugger drops you into disassembly
- Security - Understanding exploits, malware analysis
- Performance - Seeing what the compiler actually generated
- CTFs - Capture the flag competitions, crackmes
- Understanding - What does `memcpy` really do?
Tools covered: objdump, GDB, strace, ltrace, radare2
objdump - Quick Disassembly
# Basic disassembly (default: AT&T syntax)
objdump -d /bin/ls
# Intel syntax (MUCH more readable)
objdump -d -M intel /bin/ls
# With source interleaved (if compiled with -g)
objdump -d -S -M intel ./my_program
# Only specific section
objdump -d -M intel --section=.text /bin/ls
# Show relocations
objdump -d -r -M intel ./my_program.o
# Show all headers
objdump -x /bin/ls
# Show symbols
objdump -t /bin/ls # All symbols
objdump -T /bin/ls # Dynamic symbols only
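# Disassemble specific function (grep + context; needs a symbol table, so use an unstripped binary)
objdump -d -M intel ./my_program | grep -A 30 '<main>:'
# Shift every displayed address by a fixed offset (example value: a typical PIE load base seen in GDB)
objdump -d -M intel --adjust-vma=0x555555554000 ./my_program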
objdump output explained:
0000000000401126 <main>:
401126: 55 push rbp
401127: 48 89 e5 mov rbp,rsp
^^^^^^^^ ^^^^^^^^ ^^^^^^^^^^^
Address Machine code (hex) Assembly instruction
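The three columns can also be pulled apart programmatically. The sketch below is a hypothetical helper (`parse_objdump_line` is not part of any tool) and assumes the usual GNU objdump layout, where the address, the byte column and the instruction are separated by tab characters. Treat it as an illustration of the line format, not a robust parser.

```c
#include <stdio.h>
#include <stdlib.h>
#include <string.h>

/* Hypothetical helper: split one `objdump -d` instruction line such as
 *   "  401127:\t48 89 e5             \tmov    rbp,rsp"
 * into address, machine-code bytes and instruction text.
 * Returns 0 on success, -1 if the line is not an instruction line. */
int parse_objdump_line(const char *line, unsigned long *addr,
                       char *bytes, size_t bytes_sz,
                       char *insn, size_t insn_sz)
{
    char *end;

    /* Column 1: hex address terminated by ':' (leading spaces are skipped). */
    *addr = strtoul(line, &end, 16);
    if (end == line || *end != ':' || end[1] != '\t')
        return -1;

    /* Column 2: hex bytes, space padded, up to the next tab. */
    const char *p = end + 2;
    const char *tab = strchr(p, '\t');
    if (tab == NULL)
        return -1;
    size_t n = (size_t)(tab - p);
    while (n > 0 && p[n - 1] == ' ')
        n--;                              /* trim the padding */
    if (n >= bytes_sz || insn_sz == 0)
        return -1;
    memcpy(bytes, p, n);
    bytes[n] = '\0';

    /* Column 3: the instruction text, without a trailing newline. */
    snprintf(insn, insn_sz, "%s", tab + 1);
    insn[strcspn(insn, "\n")] = '\0';
    return 0;
}
```

Symbol header lines such as `0000000000401126 <main>:` are rejected because the address is not immediately followed by a colon.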
GDB - Interactive Analysis
# Start GDB
gdb ./program
gdb -q ./program # Quiet mode (no banner)
gdb --args ./program arg1 arg2 # With arguments
# Basic commands
(gdb) run # Run program
(gdb) run arg1 arg2 # Run with arguments
(gdb) quit # Exit
# Breakpoints
(gdb) break main # Break at function
(gdb) break *0x401126 # Break at address
(gdb) break file.c:25 # Break at line (with debug info)
(gdb) info breakpoints # List breakpoints
(gdb) delete 1 # Delete breakpoint 1
(gdb) disable 1 # Disable breakpoint 1
# Execution control
(gdb) continue # Continue execution
(gdb) stepi # Step one instruction
(gdb) nexti # Next instruction (skip calls)
(gdb) finish # Run until current function returns
(gdb) until *0x401150 # Run until address
# Examine registers
(gdb) info registers # All registers
(gdb) info registers rax rbx # Specific registers
(gdb) print $rax # Print RAX value
(gdb) print/x $rax # Print in hex
(gdb) print/t $rax # Print in binary
GDB - Examining Memory
# x (examine) command format: x/NFU address
# N = count, F = format, U = unit size
# Format specifiers
# x = hex, d = decimal, u = unsigned, o = octal
# t = binary, a = address, c = char, s = string
# i = instruction
# Unit sizes
# b = byte, h = halfword (2), w = word (4), g = giant (8)
# Examples
(gdb) x/10x $rsp # 10 hex words at stack pointer
(gdb) x/20i $rip # 20 instructions at current position
(gdb) x/s 0x402000 # String at address
(gdb) x/10gx $rsp # 10 quad words (64-bit) in hex
(gdb) x/4i $rip # Next 4 instructions
(gdb) x/10bx $rax # 10 bytes starting at address in RAX
# Examine memory at expression
(gdb) x/s *((char**)$rdi) # String pointed to by pointer in RDI
# Show memory around address
(gdb) x/32xb 0x401000 # 32 bytes in hex
# Disassemble function
(gdb) disassemble main
(gdb) disassemble /r main # With machine code bytes
(gdb) disassemble /m main # With source (if available)
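The unit size in `x/NFU` changes how the same bytes are grouped, and on x86-64 each group is read as a little-endian integer. A minimal sketch of that grouping follows; `read_unit_le` is a hypothetical name used only for illustration.

```c
#include <stddef.h>
#include <stdint.h>

/* Read `size` bytes (1, 2, 4 or 8) starting at p as one little-endian
 * integer, mirroring how x/b, x/h, x/w and x/g group memory on x86-64. */
uint64_t read_unit_le(const unsigned char *p, size_t size)
{
    uint64_t value = 0;
    for (size_t i = 0; i < size; i++)
        value |= (uint64_t)p[i] << (8 * i);   /* lowest address = lowest byte */
    return value;
}
```

For memory holding the bytes `78 56 34 12 ef be ad de`, a byte view starts with `0x78`, a word view shows `0x12345678`, and a giant view shows `0xdeadbeef12345678`.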
GDB - Advanced Usage
# Stack analysis
(gdb) backtrace # Call stack
(gdb) bt full # With local variables
(gdb) info frame # Current frame details
(gdb) info locals # Local variables
(gdb) info args # Function arguments
# Watch for memory changes
(gdb) watch *0x601050 # Break when memory changes
(gdb) rwatch *0x601050 # Break when read
(gdb) awatch *0x601050 # Break on read or write
# Conditional breakpoints
(gdb) break main if argc > 2
(gdb) break *0x401130 if $rax == 0
# Commands on breakpoint
(gdb) break *0x401130
(gdb) commands 1
> print/x $rax
> continue
> end
# Set register/memory values
(gdb) set $rax = 0x42
(gdb) set *(int*)0x601050 = 100
(gdb) set {char[6]}0x7fff1234 = "hello" # Array size matches the string: 5 chars + NUL
# Process memory map
(gdb) info proc mappings
# Follow child processes
(gdb) set follow-fork-mode child
# Pwndbg/GEF enhancement commands (if installed)
(gdb) context # Show registers, stack, code
(gdb) vmmap # Memory mappings
(gdb) checksec # Security features
(gdb) telescope $rsp 20 # Smart stack display
Recognizing Common Patterns
; ═══════════════════════════════════════════════════════════════════
; FUNCTION PROLOGUE / EPILOGUE
; ═══════════════════════════════════════════════════════════════════
; Standard prologue
push rbp
mov rbp, rsp
sub rsp, 0x20 ; 32 bytes for locals
; Standard epilogue
leave ; mov rsp, rbp; pop rbp
ret
; OR
add rsp, 0x20
pop rbp
ret
; Leaf function (no frame setup)
; Just starts with the function body, ends with ret
; ═══════════════════════════════════════════════════════════════════
; IF-ELSE
; ═══════════════════════════════════════════════════════════════════
; if (x > 10) { a(); } else { b(); }
cmp edi, 0xa ; Compare x to 10
jle .else_branch ; If x <= 10, goto else
call a ; Then branch
jmp .endif
.else_branch:
call b ; Else branch
.endif:
; ═══════════════════════════════════════════════════════════════════
; LOOPS
; ═══════════════════════════════════════════════════════════════════
; for (int i = 0; i < n; i++) { body(); }
xor ecx, ecx ; i = 0
jmp .check
.loop:
; ... loop body ...
inc ecx ; i++
.check:
cmp ecx, edi ; i < n?
jl .loop ; If so, continue
; while (condition) { body(); }
jmp .check
.loop:
; ... loop body ...
.check:
; ... compute condition ...
test eax, eax
jnz .loop
; ═══════════════════════════════════════════════════════════════════
; SWITCH/JUMP TABLE
; ═══════════════════════════════════════════════════════════════════
; switch (x) { case 0: ...; case 1: ...; case 2: ...; }
cmp edi, 2 ; Check if x > max case
ja .default
lea rax, [rel .jump_table]
movsxd rdi, edi
jmp [rax + rdi*8] ; Jump to table[x]
.jump_table:
dq .case_0
dq .case_1
dq .case_2
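To see these shapes in real compiler output, it helps to compile a few small functions and disassemble the object file. The functions below are illustrative (all names are made up for this sketch). The exact instructions depend on compiler, version and flags: an if-else may become a `cmov`, and a switch may become a jump table, a lookup array or a chain of compares.

```c
/* if-else: typically a cmp followed by a conditional jump (or cmov at -O2) */
int clamp_to_ten(int x)
{
    if (x > 10)
        return 10;
    else
        return x;
}

/* counted loop: look for the init, the compare and the backward jump */
int sum_below(int n)
{
    int total = 0;
    for (int i = 0; i < n; i++)
        total += i;
    return total;
}

/* dense switch with distinct bodies: a candidate for a jump table */
int apply_op(int op, int a, int b)
{
    switch (op) {
    case 0: return a + b;
    case 1: return a - b;
    case 2: return a * b;
    case 3: return a & b;
    case 4: return a | b;
    case 5: return a ^ b;
    default: return 0;
    }
}
```

Saved as `patterns.c` (a hypothetical file name), this can be inspected with `gcc -O0 -c patterns.c && objdump -d -M intel patterns.o`, then again with `-O2` to compare.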
Recognizing Function Calls
; ═══════════════════════════════════════════════════════════════════
; PRINTF CALL
; ═══════════════════════════════════════════════════════════════════
; printf("x = %d, y = %d\n", x, y);
lea rdi, [rel .LC0] ; Format string address
mov esi, [rbp-4] ; x (second arg)
mov edx, [rbp-8] ; y (third arg)
xor eax, eax ; No floating point args (variadic)
call printf
; ═══════════════════════════════════════════════════════════════════
; MALLOC / FREE
; ═══════════════════════════════════════════════════════════════════
; void *p = malloc(100);
mov edi, 100
call malloc
mov [rbp-8], rax ; Store pointer
; free(p);
mov rdi, [rbp-8]
call free
; ═══════════════════════════════════════════════════════════════════
; STRCMP
; ═══════════════════════════════════════════════════════════════════
; if (strcmp(s1, s2) == 0)
mov rdi, [rbp-8] ; s1
mov rsi, [rbp-16] ; s2
call strcmp
test eax, eax ; strcmp returns 0 if equal
jne .not_equal
; ═══════════════════════════════════════════════════════════════════
; MEMCPY
; ═══════════════════════════════════════════════════════════════════
; memcpy(dest, src, n);
mov rdi, [rbp-8] ; dest
mov rsi, [rbp-16] ; src
mov edx, 100 ; n
call memcpy
; REP MOVSB (inline memcpy for small sizes)
lea rdi, [rbp-40] ; dest
lea rsi, [rbp-80] ; src
mov ecx, 32 ; count
rep movsb ; Copy RCX bytes from [RSI] to [RDI] (mov ecx zero-extends into RCX)
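All of the call sequences above follow the System V AMD64 calling convention used on Linux: the first six integer or pointer arguments go in RDI, RSI, RDX, RCX, R8 and R9, and the return value comes back in RAX. As a small lookup sketch (`sysv_int_arg_reg` is a hypothetical helper name):

```c
#include <stddef.h>

/* Map a zero-based integer/pointer argument index to its register under
 * the System V AMD64 ABI. Arguments beyond the sixth go on the stack,
 * which this sketch reports as NULL. */
const char *sysv_int_arg_reg(unsigned index)
{
    static const char *const regs[] = {
        "rdi", "rsi", "rdx", "rcx", "r8", "r9"
    };
    return index < sizeof regs / sizeof regs[0] ? regs[index] : NULL;
}
```

So in `memcpy(dest, src, n)`, index 0 is `dest` in RDI, index 1 is `src` in RSI, and index 2 is `n` in RDX, matching the listing above.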
Analyzing Stripped Binaries
# Check if binary is stripped
file ./program
# output: "stripped" or "not stripped"
# Find main() in stripped PIE binary
# 1. Find __libc_start_main call
# 2. First argument (RDI) is main address
objdump -d -M intel ./program | grep -B5 '__libc_start_main'
# Look for: lea rdi, [rip+0x123] or similar before the call
# main = address of the NEXT instruction + that displacement
# (objdump prints the resolved target as a trailing "# <address>" comment)
# Use radare2 for automated analysis
r2 ./program
[0x00001060]> aaa # Analyze all
[0x00001060]> afl # List functions
[0x00001060]> s main # Seek to main
[0x00001060]> pdf # Print disassembly of function
# Identify library functions by their patterns
# strlen: looks for null byte in loop
# memcpy: rep movsb or SIMD instructions
# strcmp: byte-by-byte comparison loop
Pattern: Finding main in stripped binary
; Typical _start (entry point)
_start:
xor ebp, ebp ; Clear frame pointer
mov r9, rdx ; rtld_fini
pop rsi ; argc
mov rdx, rsp ; argv
and rsp, 0xfffffffffffffff0 ; Align stack to 16 bytes
push rax ; padding
push rsp ; stack_end
lea r8, [rip+0x...] ; fini (glibc 2.34+ passes zero: xor r8d, r8d)
lea rcx, [rip+0x...] ; init (glibc 2.34+ passes zero: xor ecx, ecx)
lea rdi, [rip+0x123] ; ← THIS IS MAIN!
call __libc_start_main
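The address arithmetic behind `lea rdi, [rip+0x123]` is worth spelling out: RIP-relative displacements are measured from the end of the instruction, not its start. A minimal sketch, with `rip_relative_target` as a hypothetical helper name:

```c
#include <stdint.h>

/* Resolve a RIP-relative operand: the displacement is added to the
 * address of the NEXT instruction, i.e. instruction address + length. */
uint64_t rip_relative_target(uint64_t insn_addr, unsigned insn_len,
                             int32_t disp)
{
    return insn_addr + insn_len + (uint64_t)(int64_t)disp;
}
```

A `lea rdi, [rip+disp32]` is 7 bytes long (`48 8d 3d` plus a 4-byte displacement). If it sat at the illustrative address 0x1078 with displacement 0x123, main would be at 0x1078 + 7 + 0x123 = 0x11a2.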
strace and ltrace
# STRACE - System call trace
strace ./program # Basic trace
strace -f ./program # Follow forks
strace -e trace=open,openat ./program # Only open()/openat() calls (modern glibc uses openat)
strace -e trace=file ./program # File-related calls
strace -e trace=network ./program # Network calls
strace -e trace=memory ./program # Memory calls
strace -c ./program # Summary statistics
strace -tt ./program # With timestamps
strace -o trace.log ./program # Output to file
# Filter by syscall
strace -e openat,read,write ./program
# Attach to running process
strace -p "$(pgrep -o program_name)" # -o picks the oldest matching PID
# LTRACE - Library call trace
ltrace ./program # Basic trace
ltrace -e malloc+free ./program # Only malloc/free
ltrace -e '*' ./program # All library calls
ltrace -c ./program # Summary
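# Combine for full picture (single run; -S adds system calls to the ltrace output)
ltrace -S -f -o combined.log ./program
# Or trace two separate runs, one log each
strace -f -o syscalls.log ./program
ltrace -f -o libcalls.log ./program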
# Example strace output
# openat(AT_FDCWD, "/etc/passwd", O_RDONLY) = 3
# read(3, "root:x:0:0:root:/root:/bin/bash\n"..., 4096) = 2773
# close(3) = 0
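A tiny target makes the mapping from C calls to traced system calls easier to see. The sketch below (the function name `count_bytes` is made up for this example) opens a file, reads it in 4096-byte chunks and closes it, which should show up under strace as an `openat`, one or more `read` calls and a `close`, much like the sample output above.

```c
#include <fcntl.h>
#include <unistd.h>

/* Count the bytes in a file using raw system call wrappers.
 * Returns the byte count, or -1 if the file cannot be opened or read. */
long count_bytes(const char *path)
{
    char buf[4096];
    long total = 0;
    ssize_t n;

    int fd = open(path, O_RDONLY);      /* traced as openat(AT_FDCWD, ...) */
    if (fd < 0)
        return -1;
    while ((n = read(fd, buf, sizeof buf)) > 0)   /* one read() per chunk */
        total += n;
    close(fd);
    return n < 0 ? -1 : total;
}
```

Called from a small `main` of your own, running it under `strace -e trace=openat,read,close` filters the trace down to these three calls.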
Security-Relevant Patterns
; ═══════════════════════════════════════════════════════════════════
; BUFFER OVERFLOW SETUP (vulnerable)
; ═══════════════════════════════════════════════════════════════════
vulnerable_function:
push rbp
mov rbp, rsp
sub rsp, 0x40 ; 64 bytes buffer
lea rdi, [rbp-0x40] ; Buffer on stack
call gets ; ← DANGEROUS! No bounds check
; Can overwrite return address at [rbp+8]
leave
ret ; Returns to attacker-controlled address!
; ═══════════════════════════════════════════════════════════════════
; STACK CANARY CHECK
; ═══════════════════════════════════════════════════════════════════
protected_function:
push rbp
mov rbp, rsp
sub rsp, 0x50
mov rax, QWORD PTR fs:[0x28] ; Load canary from TLS
mov QWORD PTR [rbp-0x8], rax ; Store on stack
; ... function body ...
mov rax, QWORD PTR [rbp-0x8] ; Load canary from stack
xor rax, QWORD PTR fs:[0x28] ; Zero if unchanged (newer GCC emits sub instead of xor)
jne .stack_smash_detected ; If different, abort
leave
ret
.stack_smash_detected:
call __stack_chk_fail ; Terminates program
; ═══════════════════════════════════════════════════════════════════
; FORMAT STRING VULNERABILITY
; ═══════════════════════════════════════════════════════════════════
; Vulnerable:
lea rdi, [rbp-0x100] ; User-controlled buffer
xor eax, eax
call printf ; printf(user_input)
; User can use %x, %n to read/write memory
; Safe:
lea rdi, [rel .format] ; "%s"
lea rsi, [rbp-0x100] ; User input
xor eax, eax
call printf ; printf("%s", user_input)
; ═══════════════════════════════════════════════════════════════════
; ROP GADGETS (Return-Oriented Programming)
; ═══════════════════════════════════════════════════════════════════
; Useful gadgets to look for:
pop rdi; ret ; Set first argument
pop rsi; ret ; Set second argument
pop rdx; ret ; Set third argument
pop rax; ret ; Set syscall number
syscall; ret ; Execute syscall
mov rdi, rax; ret ; Move return value to argument
xchg rax, rdi; ret ; Swap registers
# Find gadgets with ROPgadget
ROPgadget --binary ./program | grep "pop rdi"
# Check security features
checksec --file=./program
# Output shows: RELRO, Stack Canary, NX, PIE, etc.
# Check with readelf
readelf -lW ./program | grep GNU_STACK
# -W keeps each header on one line so the flags are visible
# RW = non-executable stack (good); RWE = executable stack (bad!)
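For contrast with the vulnerable `gets` and `printf(user_input)` patterns above, here is a sketch of the safe counterparts combined in one function: an explicit destination size and a constant format string. `copy_bounded` is a hypothetical helper. In real input-handling code, `fgets` with the buffer size is the usual replacement for `gets`.

```c
#include <stdio.h>
#include <string.h>

/* Copy src into dst without ever writing past dst_size bytes.
 * The format string is a constant "%s", so src is treated as data and
 * any % sequences inside it are never interpreted.
 * Returns 1 if src fit completely, 0 if it was truncated. */
int copy_bounded(char *dst, size_t dst_size, const char *src)
{
    int needed = snprintf(dst, dst_size, "%s", src);
    return needed >= 0 && (size_t)needed < dst_size;
}
```

In disassembly, the safe version is recognizable by the extra size argument in RSI and the format string address loaded from read-only data into RDX.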
Practical Exercise: Crack a Password Check
// password_check.c - Compile and analyze
#include <string.h>
#include <stdio.h>
int check_password(const char *input) {
const char *secret = "hunter2";
return strcmp(input, secret) == 0;
}
int main(int argc, char **argv) {
if (argc != 2) {
printf("Usage: %s <password>\n", argv[0]);
return 1;
}
if (check_password(argv[1])) {
printf("Access granted!\n");
return 0;
} else {
printf("Access denied!\n");
return 1;
}
}
# Compile
gcc -O0 -o password_check password_check.c
# Step 1: Find the check_password function
objdump -d -M intel password_check | grep -A 30 '<check_password>'
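# Step 2: Find where secret string is loaded
# Look for: lea rax, [rip+0x...] in check_password
# (at -O0 the pointer is stored in a stack slot first, then moved into RSI before the strcmp call)
# Step 3: Find the string (the character class must include digits to match "hunter2")
strings password_check | grep -E '^[a-z0-9]{5,10}$'
# Or: objdump -s -j .rodata password_check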
# Step 4: Using GDB
gdb ./password_check
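(gdb) break strcmp # Stop where both arguments are already in registers
(gdb) run test_password
(gdb) x/s $rdi # First argument: our input
(gdb) x/s $rsi # Second argument: the secret
# Output: 0x402004: "hunter2" (the address varies between builds)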
# Step 5: Using ltrace
ltrace ./password_check test_password
# Shows: strcmp("test_password", "hunter2") = ...
# Verify
./password_check hunter2
# Output: Access granted!
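Step 3 works because `strings` simply scans the file for runs of printable characters (at least 4 by default). A simplified sketch of that scan, assuming plain ASCII and using the hypothetical name `find_printable_run`:

```c
#include <ctype.h>
#include <stddef.h>

/* Find the first run of at least min_len (>= 1) printable ASCII bytes.
 * Returns the offset of the run and stores its length in *run_len,
 * or returns -1 if no such run exists. */
long find_printable_run(const unsigned char *buf, size_t len,
                        size_t min_len, size_t *run_len)
{
    size_t start = 0, n = 0;

    for (size_t i = 0; i <= len; i++) {
        if (i < len && isprint(buf[i])) {
            if (n == 0)
                start = i;              /* a new run begins here */
            n++;
        } else {
            if (n >= min_len) {         /* run ended and is long enough */
                *run_len = n;
                return (long)start;
            }
            n = 0;
        }
    }
    return -1;
}
```

This is also why hard-coded secrets are so easy to recover: the literal sits in `.rodata` as plain bytes, surrounded by non-printable data.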
Understanding Compiler Optimization
# Same code at different optimization levels
cat << 'EOF' > test.c
int sum(int *arr, int n) {
int total = 0;
for (int i = 0; i < n; i++) {
total += arr[i];
}
return total;
}
EOF
# No optimization (-O0)
gcc -O0 -c test.c && objdump -d -M intel test.o
# Very verbose, matches C code closely
# Lots of stack access, no register reuse
# Optimization (-O2)
gcc -O2 -c test.c && objdump -d -M intel test.o
# Registers instead of stack, tighter loop
# GCC 12 and later may also vectorize (SIMD) at this level
# May be hard to follow
# Aggressive optimization (-O3)
gcc -O3 -c test.c && objdump -d -M intel test.o
# Vectorization plus more aggressive unrolling and peeling
# Uses SSE2 for parallel addition by default; AVX needs -mavx2 or -march=native
Common optimizations to recognize:
; Strength reduction: multiply → shift
imul eax, 8 ; Original
shl eax, 3 ; Optimized (same result, faster)
; Loop invariant code motion
.loop:
mov eax, [rbx] ; Load constant (moved outside)
add ecx, eax
...
; Becomes:
mov eax, [rbx] ; Load once before loop
.loop:
add ecx, eax ; Use cached value
...
; Dead store elimination (a form of dead code elimination)
mov eax, 5 ; Value is never read, so the compiler removes this
mov eax, 10
ret
; Inline expansion
call small_function ; Before
; Becomes the function body inlined ; After
; Tail call optimization
func:
...
call other_func ; Normal
ret
; Becomes:
func:
...
jmp other_func ; Tail call (saves stack frame)
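Two of these are easy to reproduce. The sketch below uses made-up function names and a GCC/Clang-specific `noinline` attribute to keep the callee from being inlined. Whether you actually see `shl` (or an equivalent `lea`) and a `jmp` depends on the compiler and optimization level.

```c
/* Strength reduction: at -O1 and above both functions usually compile
 * to the same shift (or lea), because x * 8 == x << 3 for unsigned x. */
unsigned mul8(unsigned x) { return x * 8; }
unsigned shl3(unsigned x) { return x << 3; }

/* Kept out of line so the call below stays visible in the disassembly. */
__attribute__((noinline)) unsigned twice(unsigned x) { return x * 2; }

/* Tail call: the call is the last thing this function does, so at -O2
 * it is typically emitted as `jmp twice` instead of `call` + `ret`. */
unsigned twice_next(unsigned x)
{
    return twice(x + 1);
}
```

Compiling with `gcc -O2 -c` and disassembling the object file, as shown earlier for `test.c`, lets you compare against the `-O0` output.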
Tools Quick Reference
| Tool | Purpose |
|---|---|
| objdump | Quick disassembly, symbols, sections |
| readelf | ELF header analysis, segments, sections |
| nm | Symbol table listing |
| strings | Find printable strings in binary |
| file | Identify file type |
| ldd | List shared library dependencies |
| strace | System call tracing |
| ltrace | Library call tracing |
| gdb | Interactive debugging |
| r2/radare2 | Advanced reversing framework |
| ghidra | Decompilation (free, NSA) |
| IDA | Decompilation (commercial) |
| pwntools | CTF exploitation framework |
| ROPgadget | Find ROP gadgets |
| checksec | Check security features |
# Quick binary analysis workflow
file ./binary
checksec --file=./binary
strings ./binary | head -50
nm ./binary 2>&1 | head -20 # Prints "no symbols" if the binary is stripped
objdump -d -M intel ./binary | head -100
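The first step of the workflow, identifying the file, comes down to reading a few magic bytes. For ELF, the file starts with `0x7f 'E' 'L' 'F'`, and the next byte (`EI_CLASS`) is 1 for 32-bit and 2 for 64-bit. A minimal sketch of that check follows; `elf_class_bits` is a hypothetical name, and the real `file` tool knows far more formats.

```c
#include <stddef.h>

/* Inspect the start of a file image.
 * Returns 32 or 64 for a valid ELF class, 0 for an ELF header with an
 * unknown class byte, and -1 if the buffer is not ELF at all. */
int elf_class_bits(const unsigned char *buf, size_t len)
{
    if (len < 5 || buf[0] != 0x7f || buf[1] != 'E' ||
        buf[2] != 'L' || buf[3] != 'F')
        return -1;                      /* missing ELF magic */

    switch (buf[4]) {                   /* e_ident[EI_CLASS] */
    case 1:  return 32;                 /* ELFCLASS32 */
    case 2:  return 64;                 /* ELFCLASS64 */
    default: return 0;
    }
}
```

Knowing the class up front tells you which register names and calling convention to expect before opening the disassembly.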