Process Management
Complete guide to process monitoring, control, and debugging.
1. Quick Reference
LINUX PROCESS MANAGEMENT

Viewing Processes:
  ├─ ps        - Snapshot of processes
  ├─ top/htop  - Real-time monitoring
  ├─ pgrep     - Find process by name
  └─ pstree    - Process tree

Controlling Processes:
  ├─ kill      - Send signals
  ├─ pkill     - Kill by name
  ├─ nice      - Set priority
  └─ taskset   - Set CPU affinity

Background Jobs:
  ├─ &         - Run in background
  ├─ jobs      - List jobs
  ├─ fg/bg     - Foreground/background
  ├─ disown    - Detach from shell
  └─ nohup     - Ignore hangup
2. Process Fundamentals
2.1. What is a Process?
A process is a running instance of a program with:
- PID: Unique process ID
- PPID: Parent process ID
- UID: User ID running the process
- Memory space: Virtual memory mapping
- File descriptors: Open files/sockets
- State: Running, sleeping, stopped, zombie
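Most of these attributes can be read directly from `/proc/<pid>/status`. The sketch below assumes a Linux system with `/proc` mounted; `proc_field` is a helper name made up for this illustration, not a standard tool.

```shell
# proc_field PID FIELD: print one field from /proc/PID/status.
# proc_field is a hypothetical helper, not a standard command.
proc_field() {
    local pid=$1 field=$2
    # Lines look like "PPid:<TAB>1234"; print everything after the field name.
    awk -v f="$field:" '$1 == f { $1 = ""; sub(/^[ \t]+/, ""); print }' "/proc/$pid/status"
}

# Inspect the current shell ($$ expands to its PID).
echo "PID:   $(proc_field $$ Pid)"
echo "PPID:  $(proc_field $$ PPid)"
echo "UID:   $(proc_field $$ Uid)"     # real, effective, saved, filesystem UIDs
echo "State: $(proc_field $$ State)"
```

The same file also carries the memory figures (`VmRSS`, `VmSize`) and the thread count, so it is a convenient single place to look before reaching for `ps`.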
2.2. Process States
PROCESS STATES

  ┌─────────┐
  │ Created │──────────────────────┐
  └────┬────┘                      │
       │                           ▼
  ┌────▼────┐    schedule     ┌──────────┐
  │  Ready  │────────────────▶│ Running  │
  │   (R)   │◀────────────────│   (R)    │
  └────┬────┘     preempt     └────┬─────┘
       │                           │ wait for I/O
       │                           ▼
       │                      ┌──────────┐
       │                      │ Sleeping │
       │                      │  (S/D)   │
       │                      └────┬─────┘
       │                           │ I/O complete
       │◀──────────────────────────┘
       │
       │       SIGSTOP        ┌──────────┐
       └─────────────────────▶│ Stopped  │
                              │   (T)    │
                              └────┬─────┘
                                   │ SIGCONT
                                   ▼
                             Back to Ready

State Codes:
  R = Running/Runnable   S = Interruptible Sleep   D = Uninterruptible Sleep
  T = Stopped            Z = Zombie                X = Dead
2.3. Process State Codes
| Code | State | Description |
|---|---|---|
| R | Running | Currently executing or ready to run |
| S | Sleeping | Interruptible sleep (waiting for event) |
| D | Disk Sleep | Uninterruptible sleep (waiting for I/O) |
| T | Stopped | Stopped by signal (SIGSTOP, SIGTSTP) |
| t | Traced | Stopped by debugger |
| Z | Zombie | Terminated but not reaped by parent |
| X | Dead | Should never be seen |
| I | Idle | Kernel thread (idle) |
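In `ps` output the STAT column usually shows the state letter followed by modifier characters (for example `s` for a session leader or `+` for the foreground process group), so only the first character maps onto the table above. A minimal sketch of that decoding, using a hypothetical helper named `describe_state`:

```shell
# describe_state STAT: explain the first letter of a ps STAT value.
# describe_state is a hypothetical helper name used for illustration.
describe_state() {
    # ps appends modifiers (s, +, l, <, N ...) after the state letter,
    # so only the first character identifies the state.
    case "$1" in
        R*) echo "Running or runnable" ;;
        S*) echo "Interruptible sleep" ;;
        D*) echo "Uninterruptible sleep" ;;
        T*) echo "Stopped by signal" ;;
        t*) echo "Stopped by debugger" ;;
        Z*) echo "Zombie" ;;
        X*) echo "Dead" ;;
        I*) echo "Idle kernel thread" ;;
        *)  echo "Unknown state: $1" ;;
    esac
}

describe_state "Ss+"   # session leader, sleeping, in the foreground group
describe_state "Z"
```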
2.4. Zombie and Orphan Processes
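# Zombie: Child has exited but its parent hasn't read the exit status
# - Shows as 'Z' in the STAT column of ps
# - Can't be killed (already dead)
# - Goes away once the parent calls wait() or the parent itself exits
# Find zombies (STAT may carry modifiers, e.g. "Z+", so match the first letter)
ps aux | awk '$8 ~ /^Z/'
ps -eo pid,ppid,stat,comm | awk '$3 ~ /^Z/'
# Orphan: Parent died, child re-parented to init (PID 1) or a subreaper
# - Not a problem, the new parent will reap it
# - PPID usually becomes 1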
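Since a zombie can only be cleared through its parent, it helps to list each zombie together with its PPID. The sketch below filters `ps`-style rows on stdin; the function name `zombie_report` and the sample rows are invented for the example.

```shell
# zombie_report: read "PID PPID STAT COMMAND" rows on stdin and print
# "<pid> <ppid> <command>" for every zombie. Hypothetical helper name.
zombie_report() {
    # NR > 1 skips the header row; STAT may carry modifiers such as "Z+".
    awk 'NR > 1 && $3 ~ /^Z/ { print $1, $2, $4 }'
}

# Demonstration with made-up sample rows:
printf '%s\n' \
    'PID PPID STAT COMMAND' \
    '1 0 Ss systemd' \
    '4242 4100 Z+ worker' |
    zombie_report
```

Against the live process table this would be `ps -eo pid,ppid,stat,comm | zombie_report`; the second column then names the parent that needs to call wait() or be restarted.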
3. Viewing Processes
3.1. ps - Process Snapshot
# All processes, full format
ps aux
# Column meanings:
# USER PID %CPU %MEM VSZ RSS TTY STAT START TIME COMMAND
# │ │ │ │ │ │ │ │ │ │ │
# │ │ │ │ │ │ │ │ │ │ └── Command
# │ │ │ │ │ │ │ │ │ └── CPU time used
# │ │ │ │ │ │ │ │ └── Start time
# │ │ │ │ │ │ │ └── Process state
# │ │ │ │ │ │ └── Terminal
# │ │ │ │ │ └── Resident memory (KB)
# │ │ │ │ └── Virtual memory (KB)
# │ │ │ └── Memory percentage
# │ │ └── CPU percentage
# │ └── Process ID
# └── User
# Tree view
ps auxf
# Custom columns
ps -eo pid,ppid,user,%cpu,%mem,stat,start,time,cmd
# Sort by memory
ps aux --sort=-%mem | head
# Sort by CPU
ps aux --sort=-%cpu | head
# Specific user
ps -u username
# Specific process
ps -p 1234
# Show threads
ps -eLf
# Long format with PPID
ps -efl
3.2. Available ps Columns
| Column | Description |
|---|---|
| pid | Process ID |
| ppid | Parent process ID |
| pgid | Process group ID |
| sid | Session ID |
| user | Username |
| uid | User ID |
| %cpu | CPU percentage |
| %mem | Memory percentage |
| vsz | Virtual memory size (KB) |
| rss | Resident set size (KB) |
| stat | Process state |
| start | Start time |
| time | CPU time |
| comm | Command |
| cmd | Full command with args |
| ni | Nice value |
| pri | Priority |
| tty | Terminal |
| wchan | Kernel function (sleeping) |
3.3. top - Real-time Monitoring
# Basic
top
# Update every 1 second
top -d 1
# Show specific user
top -u username
# Show specific PID
top -p 1234
# Batch mode (for scripts)
top -b -n 1
# Interactive commands:
# h - Help
# k - Kill process
# r - Renice process
# u - Filter by user
# M - Sort by memory
# P - Sort by CPU
# T - Sort by time
# c - Toggle full command
# 1 - Show per-CPU stats
# q - Quit
3.4. htop - Better Interactive View
# Install
sudo pacman -S htop # Arch
sudo apt install htop # Debian
# Run
htop
# Interactive commands:
# F1 - Help
# F2 - Setup
# F3 - Search
# F4 - Filter
# F5 - Tree view
# F6 - Sort by column
# F7 - Decrease nice
# F8 - Increase nice
# F9 - Kill
# F10 - Quit
# Space - Tag process
# U - Untag all
# k - Kill tagged
4. Finding Processes
4.1. pgrep - Find by Name
# Find PID by name
pgrep nginx
# Full command line match
pgrep -f "python script.py"
# List with names
pgrep -l nginx
# With full command
pgrep -a nginx
# Newest match only
pgrep -n nginx
# Oldest match only
pgrep -o nginx
# Specific user
pgrep -u root nginx
# Count matches
pgrep -c nginx
# Inverse match (not matching)
pgrep -v nginx
4.2. pidof - Get PID
# Get PID of program
pidof firefox
# Get PID of shell script
pidof -x script.sh
4.3. lsof - Find by Port/File
# Process using port
lsof -i :8080
# Process using specific port/protocol
lsof -i tcp:80
lsof -i udp:53
# All network connections for process (-a ANDs the -i and -p filters; without it lsof ORs them)
lsof -a -i -p 1234
# Process using file
lsof /path/to/file
# All files opened by process
lsof -p 1234
# All files opened by user
lsof -u username
# Network connections
lsof -i
# Listening ports
lsof -i -P -n | grep LISTEN
5. Signals and Killing
5.1. Signal Reference
| Signal | Number | Default | Description |
|---|---|---|---|
| SIGHUP | 1 | Terminate | Hangup (terminal closed) |
| SIGINT | 2 | Terminate | Interrupt (Ctrl+C) |
| SIGQUIT | 3 | Core dump | Quit (Ctrl+\) |
| SIGKILL | 9 | Terminate | Force kill (can't be caught or ignored) |
| SIGTERM | 15 | Terminate | Graceful termination |
| SIGSTOP | 19 | Stop | Pause (can't be caught or ignored) |
| SIGCONT | 18 | Continue | Resume stopped process |
| SIGTSTP | 20 | Stop | Terminal stop (Ctrl+Z) |
| SIGUSR1 | 10 | Terminate | User-defined |
| SIGUSR2 | 12 | Terminate | User-defined |

The numbers shown are those used on x86 and ARM Linux. Some architectures number SIGUSR1, SIGUSR2, SIGSTOP, SIGCONT, and SIGTSTP differently, so prefer signal names in scripts.
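The difference between a catchable signal and SIGKILL is easiest to see with a process that installs a handler. The following is a minimal sketch, assuming `sh`, `mktemp`, and a `sleep` that accepts fractional seconds are available; `trap_demo` is a made-up name.

```shell
# trap_demo: start a child that handles SIGTERM, signal it, show its output.
# trap_demo is a hypothetical name; the child is a throwaway sh process.
trap_demo() {
    local out pid tries=0
    out=$(mktemp)
    # The child installs a SIGTERM handler, announces readiness, then idles.
    sh -c 'trap "echo caught-TERM; exit 0" TERM
           echo ready
           while :; do sleep 0.1; done' >"$out" &
    pid=$!
    # Wait (at most about 5 s) until the handler is installed.
    while ! grep -q ready "$out" && [ "$tries" -lt 50 ]; do
        sleep 0.1
        tries=$((tries + 1))
    done
    kill -TERM "$pid"      # catchable: the handler runs before the child exits
    wait "$pid"
    cat "$out"
    rm -f "$out"
}

trap_demo
```

Replacing `kill -TERM` with `kill -KILL` in this sketch should leave only the `ready` line, because the handler never gets a chance to run.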
5.2. kill - Send Signal
# Graceful shutdown (SIGTERM, default)
kill 1234
kill -15 1234
kill -TERM 1234
kill -SIGTERM 1234
# Force kill (SIGKILL)
kill -9 1234
kill -KILL 1234
# Pause process (SIGSTOP)
kill -STOP 1234
kill -19 1234
# Resume process (SIGCONT)
kill -CONT 1234
kill -18 1234
# Reload config (SIGHUP) - many daemons support this
kill -HUP 1234
# List all signals
kill -l
5.3. pkill - Kill by Name
# Kill by name
pkill nginx
# Full command match
pkill -f "python script.py"
# Kill specific user's processes
pkill -u username
# Send specific signal
pkill -9 nginx
pkill -KILL nginx
# Kill oldest match
pkill -o nginx
# Kill newest match
pkill -n nginx
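# Kill processes older than 1 hour (needs a procps-ng release that supports --older;
# do not combine with -o, which would restrict the match to the single oldest process)
pkill --older 3600 process_name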
5.4. killall - Kill All by Name
# Kill all by exact name
killall nginx
# Force kill all
killall -9 nginx
# Interactive (confirm each)
killall -i nginx
# Wait for processes to die
killall -w nginx
# Kill by user
killall -u username
5.5. Safe Kill Pattern
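# Try graceful first, then force
kill "$PID" && sleep 5 && kill -0 "$PID" 2>/dev/null && kill -9 "$PID"
# Function version
safe_kill() {
    local pid=$1
    local timeout=${2:-5}
    local i
    # Send SIGTERM; nothing more to do if the signal cannot be delivered
    kill "$pid" 2>/dev/null || return 0
    # Poll once per second until the process exits or the timeout expires
    for ((i = 0; i < timeout; i++)); do
        kill -0 "$pid" 2>/dev/null || return 0
        sleep 1
    done
    # Still alive after the timeout: force kill
    kill -9 "$pid" 2>/dev/null
}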
6. Background Jobs
6.1. Running in Background
# Start in background
command &
# Start and disown immediately
command & disown
# With nohup (survives logout)
nohup command &
# Redirect output
nohup command > output.log 2>&1 &
# With setsid (new session)
setsid command
6.2. jobs - List Jobs
# List background jobs
jobs
# With PIDs
jobs -l
# Only running
jobs -r
# Only stopped
jobs -s
# Job states:
# Running - Currently running
# Stopped - Paused (Ctrl+Z or SIGSTOP)
# Done - Completed
6.3. fg/bg - Foreground/Background
# Bring job to foreground
fg # Most recent
fg %1 # Job 1
fg %command # By command name
# Send to background
bg # Most recent
bg %1 # Job 1
# Ctrl+Z - Suspend current foreground job
6.4. disown - Detach from Shell
# Disown most recent job
disown
# Disown specific job
disown %1
# Disown all jobs
disown -a
# Keep the job in the jobs table, but don't send it SIGHUP when the shell exits
disown -h %1
6.5. nohup vs disown vs setsid
| Method | Survives Logout | New Session | Ignores HUP |
|---|---|---|---|
| command & | No | No | No |
| nohup | Yes | No | Yes |
| disown | Yes | No | Yes |
| setsid | Yes | Yes | Yes |
# nohup - Ignore hangup, redirect output
nohup command > output.log 2>&1 &
# disown - Detach running job from shell
command &
disown %1
# setsid - Create new session
setsid command
# Most robust
nohup setsid command > output.log 2>&1 &
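To see the "New Session" column in practice, the session ID of a child started with and without `setsid` can be compared. This is a sketch under a few assumptions: Linux `/proc` is mounted, `setsid` from util-linux (or a compatible one) is installed, and the process name contains no spaces so that field 6 of `/proc/<pid>/stat` is the session ID. The variable names are illustrative.

```shell
# Field 6 of /proc/<pid>/stat is the session ID (assumes the process name
# has no spaces, which holds for sh). The snippet is run by a child sh,
# so $$ below refers to that child, not to the calling shell.
print_sid='read -r pid comm state ppid pgrp sid rest < /proc/$$/stat; echo "$sid"'

plain_sid=$(sh -c "$print_sid")          # child stays in the caller's session
new_sid=$(setsid sh -c "$print_sid")     # child becomes leader of a new session

echo "inherited session: $plain_sid"
echo "setsid session:    $new_sid"
```

The two values should differ: under `setsid` the child leads its own session, which has no controlling terminal.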
7. Process Priority
7.1. Nice Values
Nice values range from -20 (highest priority) to 19 (lowest):
Priority Scale:
  -20 (high priority, system/root only) ─────────── 19 (low priority, anyone)
  Default nice value: 0
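Niceness is inherited by child processes, and an adjustment given to `nice -n` is added to the inherited value rather than replacing it. A quick way to observe this, relying only on the coreutils `nice` command (the variable names are illustrative):

```shell
base=$(nice)              # with no arguments, nice prints the current niceness
child=$(nice -n 5 nice)   # the adjustment is added to the inherited value

echo "shell niceness: $base"
echo "child niceness: $child"
```

In a shell running at the default niceness of 0 the child reports 5; the result is capped at 19.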
7.2. nice - Start with Priority
# Start with lower priority (nice)
nice command
nice -n 10 command
# Start with higher priority (requires root)
sudo nice -n -10 command
# Very low priority
nice -n 19 command
7.3. renice - Change Running Process
# Lower priority of running process
renice -n 10 -p 1234
# Higher priority (requires root)
sudo renice -n -5 -p 1234
# Change all processes of user
renice -n 10 -u username
# Change process group
renice -n 10 -g 1234
7.4. ionice - I/O Priority
# I/O scheduling classes:
# 0 - None (no class set; the kernel treats the process as best-effort)
# 1 - Real-time (root only)
# 2 - Best-effort (the effective default)
# 3 - Idle (only when no other I/O)
# Start with idle I/O priority
ionice -c 3 command
# Start with real-time I/O (root)
sudo ionice -c 1 command
# Change running process
ionice -c 3 -p 1234
# Check current
ionice -p 1234
8. CPU Affinity
8.1. taskset - Set CPU Affinity
# Run on specific CPU
taskset -c 0 command # CPU 0 only
taskset -c 0,2 command # CPU 0 and 2
taskset -c 0-3 command # CPUs 0-3
# Using a hexadecimal bitmask (bit N set = CPU N allowed)
taskset 0x1 command # CPU 0 (0001)
taskset 0x3 command # CPUs 0,1 (0011)
taskset 0xf command # CPUs 0-3 (1111)
# Get affinity of running process
taskset -p 1234
# Set affinity of running process
taskset -p -c 0,1 1234
taskset -p 0x3 1234
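The CPU-list form (`-c 0,2`) and the mask form (`0x5`) describe the same thing: bit N of the mask is set when CPU N is allowed. The conversion can be sketched as follows; `cpus_to_mask` is a hypothetical helper, and shell arithmetic limits it to CPU numbers below 63.

```shell
# cpus_to_mask LIST: convert a taskset-style CPU list ("0,2" or "0-3,8")
# into the equivalent hexadecimal affinity mask. Hypothetical helper.
cpus_to_mask() {
    local mask=0 part lo hi cpu
    local IFS=,
    for part in $1; do           # split the list on commas
        lo=${part%-*}            # "0-3" -> 0, "2" -> 2
        hi=${part#*-}            # "0-3" -> 3, "2" -> 2
        cpu=$lo
        while [ "$cpu" -le "$hi" ]; do
            mask=$((mask | (1 << cpu)))   # set the bit for this CPU
            cpu=$((cpu + 1))
        done
    done
    printf '0x%x\n' "$mask"
}

cpus_to_mask 0      # 0x1
cpus_to_mask 0,2    # 0x5
cpus_to_mask 0-3    # 0xf
```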
9. Process Limits
9.1. ulimit - Shell Limits
# Show all limits
ulimit -a
# Show hard limits
ulimit -aH
# Show soft limits
ulimit -aS
# Common limits:
ulimit -n # Max open files
ulimit -u # Max user processes
ulimit -v # Max virtual memory
ulimit -m # Max resident memory
ulimit -s # Max stack size
ulimit -c # Max core file size
# Set limits (session only)
ulimit -n 65535 # Open files
ulimit -u 4096 # Processes
# Unlimited
ulimit -c unlimited # Core dumps
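Limits set with `ulimit` apply to the current shell and are inherited by whatever it starts afterwards; they never propagate back to the parent. Running the change in a subshell makes that scoping visible. This sketch assumes the current soft limit for open files is at least 64; the variable names are illustrative.

```shell
before=$(ulimit -S -n)                          # soft open-file limit of this shell
inside=$( (ulimit -S -n 64; ulimit -S -n) )     # lowered only inside the subshell
after=$(ulimit -S -n)                           # unchanged in the parent

echo "before=$before inside=$inside after=$after"
```

The same pattern, `(ulimit -n N; exec command)`, is a simple way to start a single command with a tighter limit without touching the rest of the session.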
9.2. /etc/security/limits.conf
# Permanent limits
# /etc/security/limits.conf
# Format: <domain> <type> <item> <value>
# Examples:
* soft nofile 65535
* hard nofile 65535
username soft nproc 4096
@groupname hard memlock unlimited
9.3. Per-Process Limits (/proc/PID/limits)
# View process limits
cat /proc/1234/limits
# Example output:
# Limit Soft Limit Hard Limit Units
# Max cpu time unlimited unlimited seconds
# Max file size unlimited unlimited bytes
# Max data size unlimited unlimited bytes
# Max stack size 8388608 unlimited bytes
# Max core file size 0 unlimited bytes
# Max resident set unlimited unlimited bytes
# Max processes 30785 30785 processes
# Max open files 1024 1048576 files
# Max locked memory 8388608 8388608 bytes
10. /proc/PID Deep Dive
10.1. Key Files
# Command line
cat /proc/1234/cmdline | tr '\0' ' '
# Environment
cat /proc/1234/environ | tr '\0' '\n'
# Current working directory
readlink /proc/1234/cwd
# Executable path
readlink /proc/1234/exe
# File descriptors
ls -la /proc/1234/fd/
# Memory maps
cat /proc/1234/maps
# Status (readable format)
cat /proc/1234/status
# Statistics (one line)
cat /proc/1234/stat
# I/O statistics
cat /proc/1234/io
# OOM score
cat /proc/1234/oom_score
# OOM adjustment (-1000 to 1000)
cat /proc/1234/oom_score_adj
# Limits
cat /proc/1234/limits
# Network device stats (for the process's network namespace, not per-process traffic)
cat /proc/1234/net/dev
# Threads
ls /proc/1234/task/
10.2. Memory Information
# Memory map
cat /proc/1234/maps
# Summarized memory
cat /proc/1234/smaps_rollup
# Detailed per-mapping
cat /proc/1234/smaps
# Status fields
cat /proc/1234/status | grep -E "VmPeak|VmSize|VmRSS|VmSwap"
# VmPeak - Peak virtual memory
# VmSize - Current virtual memory
# VmRSS - Resident set size (physical)
# VmSwap - Swapped out memory
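For scripting, the numeric value of a single field is usually more useful than the grep output. A small sketch, assuming `/proc` is mounted; `vm_kb` is a helper name invented for this example.

```shell
# vm_kb PID FIELD: print the numeric kB value of a Vm* field from
# /proc/PID/status. Hypothetical helper name.
vm_kb() {
    # Matching lines look like "VmRSS:    1234 kB"; $2 is the number.
    awk -v f="$2:" '$1 == f { print $2 }' "/proc/$1/status"
}

rss=$(vm_kb $$ VmRSS)
size=$(vm_kb $$ VmSize)
echo "shell RSS: ${rss} kB of ${size} kB virtual"
```

Kernel threads have no user address space and omit the Vm* lines, in which case the helper prints nothing.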
10.3. File Descriptors
# List open files
ls -la /proc/1234/fd/
# Types:
# 0 -> /dev/pts/0 (stdin)
# 1 -> /dev/pts/0 (stdout)
# 2 -> /dev/pts/0 (stderr)
# 3 -> /path/to/file (file)
# 4 -> socket:[12345] (socket)
# 5 -> pipe:[12346] (pipe)
# 6 -> anon_inode:[...] (anonymous)
# Detailed fd info
cat /proc/1234/fdinfo/3
11. Systemd Process Control
11.1. Managing Services
# Start/stop/restart
systemctl start nginx
systemctl stop nginx
systemctl restart nginx
systemctl reload nginx # Reload config only
# Enable/disable at boot
systemctl enable nginx
systemctl disable nginx
# Status
systemctl status nginx
systemctl is-active nginx
systemctl is-enabled nginx
# List all services
systemctl list-units --type=service
systemctl list-units --type=service --state=running
# Show process tree for service
systemctl status nginx --no-pager -l
11.2. Viewing Service Processes
# Show main PID
systemctl show nginx --property=MainPID
# Show all PIDs in service cgroup
systemctl status nginx | grep -E "├|└|CGroup"
# Using cgroupsv2
cat /sys/fs/cgroup/system.slice/nginx.service/cgroup.procs
11.3. Controlling Service Resources
# Limit CPU
systemctl set-property nginx.service CPUQuota=50%
# Limit memory
systemctl set-property nginx.service MemoryMax=500M
# Adjust relative I/O weight (1-10000, default 100); this is a share under contention, not a hard cap
systemctl set-property nginx.service IOWeight=50
# View current limits
systemctl show nginx.service | grep -E "CPU|Memory|IO"
12. Cgroups
12.1. What are Cgroups?
Control groups (cgroups) limit, account for, and isolate the resource usage of groups of processes:
- CPU time
- Memory
- I/O bandwidth
- Network
- Device access
12.2. Cgroups v2 Basics
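# Check which cgroup filesystems the kernel supports
grep cgroup /proc/filesystems
# Check which version is mounted (cgroup2fs = v2, tmpfs = v1)
stat -fc %T /sys/fs/cgroup/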
# Mount point
mount | grep cgroup
# List controllers
cat /sys/fs/cgroup/cgroup.controllers
# List enabled controllers
cat /sys/fs/cgroup/cgroup.subtree_control
12.3. Viewing Process Cgroups
# Process cgroup
cat /proc/1234/cgroup
# List processes in cgroup
cat /sys/fs/cgroup/user.slice/user-1000.slice/cgroup.procs
# Cgroup resource usage
cat /sys/fs/cgroup/user.slice/user-1000.slice/cpu.stat
cat /sys/fs/cgroup/user.slice/user-1000.slice/memory.current
12.4. Limiting Resources with Cgroups
# Create cgroup
sudo mkdir /sys/fs/cgroup/my_group
# Enable controllers
echo "+cpu +memory" | sudo tee /sys/fs/cgroup/cgroup.subtree_control
# Set CPU limit: "<quota> <period>" in microseconds (50000 of every 100000 = 50% of one CPU)
echo "50000 100000" | sudo tee /sys/fs/cgroup/my_group/cpu.max
# Set memory limit (500MB)
echo "500M" | sudo tee /sys/fs/cgroup/my_group/memory.max
# Add process to cgroup
echo 1234 | sudo tee /sys/fs/cgroup/my_group/cgroup.procs
# Run command in cgroup (cgexec comes from the libcgroup tools; cgroup v2 support depends on the installed version)
sudo cgexec -g cpu,memory:my_group command
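The value written to `cpu.max` is a quota and a period, both in microseconds, so a percentage has to be converted first. A minimal sketch of that arithmetic, with a hypothetical helper `cpu_max_for` and the default period of 100000 µs assumed:

```shell
# cpu_max_for PERCENT [PERIOD_US]: build a cpu.max value "<quota> <period>".
# PERCENT is relative to one CPU, so 200 means two full CPUs.
# cpu_max_for is a hypothetical helper name.
cpu_max_for() {
    local percent=$1 period=${2:-100000}
    echo "$((period * percent / 100)) $period"
}

cpu_max_for 50      # 50000 100000  (half of one CPU)
cpu_max_for 200     # 200000 100000 (two full CPUs)
```

The output can be piped straight into the cgroup file, for example `cpu_max_for 50 | sudo tee /sys/fs/cgroup/my_group/cpu.max`.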
13. Process Debugging
13.1. strace - System Call Trace
# Trace running process
strace -p 1234
# Trace command
strace command
# Follow forks
strace -f command
# Trace specific syscalls (modern libc opens files via openat, so include it)
strace -e trace=open,openat,read,write command
strace -e file command # File operations
strace -e network command # Network operations
strace -e process command # Process operations
# With timing
strace -t command # Time of day
strace -tt command # Microseconds
strace -T command # Time spent in syscall
# Count syscalls
strace -c command
# Output to file
strace -o trace.log command
13.2. ltrace - Library Call Trace
# Trace library calls
ltrace command
# Trace running process
ltrace -p 1234
# Count calls
ltrace -c command
13.3. gdb - Debug Processes
# Attach to running process
gdb -p 1234
# Commands:
# bt - Backtrace
# info threads - List threads
# thread N - Switch to thread
# continue - Resume
# detach - Detach
# quit - Exit
13.4. /proc Debug Info
# Kernel stack trace (requires root)
sudo cat /proc/1234/stack
# Wait channel (kernel function)
cat /proc/1234/wchan
# System call being executed
cat /proc/1234/syscall
13.5. Common Debugging Scenarios
# Why is process using 100% CPU?
strace -p PID -c # Count syscalls
top -H -p PID # Show threads
# Why is process stuck?
sudo cat /proc/PID/stack # Kernel stack (root only)
cat /proc/PID/wchan
strace -p PID # See what it's waiting for
# What files is process accessing?
lsof -p PID
strace -e file -p PID
# What network connections?
ss -tnp | grep "pid=PID,"
lsof -a -i -p PID # -a ANDs the filters; without it lsof ORs them
14. Common Patterns
14.1. Kill Process on Port
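# Find and kill the listener (without -sTCP:LISTEN, clients connected to the port match too)
kill $(lsof -t -iTCP:8080 -sTCP:LISTEN)
# Force kill
kill -9 $(lsof -t -iTCP:8080 -sTCP:LISTEN)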
# Using fuser
fuser -k 8080/tcp
14.2. Wait for Process to Exit
# Wait for specific PID (kill -0 sends no signal, it only checks that the PID exists)
while kill -0 "$PID" 2>/dev/null; do
    sleep 1
done
echo "Process $PID has exited"
# Or use wait (for child processes)
command &
wait $!
echo "Done"
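Beyond blocking until the child is done, `wait` also returns the child's exit status, which the `kill -0` polling loop cannot provide. A short sketch (the subshell and its exit code 3 are made up for the demonstration):

```shell
# Start a child that runs briefly and exits with status 3.
(sleep 0.2; exit 3) &
pid=$!

# wait returns the child's exit status; capture it without tripping `set -e`.
status=0
wait "$pid" || status=$?

echo "child $pid exited with status $status"
```

This only works for children of the current shell; for unrelated processes the polling loop remains the fallback.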
14.3. Run with Timeout
# Using timeout command
timeout 30 command
# Kill after timeout
timeout --signal=KILL 30 command
# With warning before kill
timeout --signal=TERM --kill-after=10 30 command
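Scripts often need to tell a timeout apart from an ordinary failure. GNU `timeout` exits with status 124 when the time limit was hit (unless `--preserve-status` is given), which the wrapper below relies on. The function name `run_with_deadline` is hypothetical.

```shell
# run_with_deadline SECONDS COMMAND [ARGS...]: run COMMAND under timeout and
# report when the limit was hit. Hypothetical wrapper around GNU timeout.
run_with_deadline() {
    local secs=$1 status=0
    shift
    timeout "$secs" "$@" || status=$?
    if [ "$status" -eq 124 ]; then
        echo "timed out after ${secs}s: $*" >&2
    fi
    return "$status"
}

run_with_deadline 1 sleep 5 || echo "exit status: $?"
run_with_deadline 5 true && echo "finished in time"
```

With `--signal=KILL` the status would be 137 (128 + 9) instead of 124, so the check would need adjusting for that variant.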
14.4. Process Watchdog
#!/bin/bash
# restart-if-dead.sh
COMMAND="my-daemon"
while true; do
    # pgrep -x matches the exact process name
    if ! pgrep -x "$COMMAND" > /dev/null; then
        echo "Process died, restarting..."
        "$COMMAND" &
    fi
    sleep 60
done