Process Management

Complete guide to process monitoring, control, and debugging.

╔══════════════════════════════════════════════════════════════╗
║              LINUX PROCESS MANAGEMENT                        ║
╠══════════════════════════════════════════════════════════════╣
║                                                              ║
║  Viewing Processes:                                          ║
║  ├─ ps        - Snapshot of processes                        ║
║  ├─ top/htop  - Real-time monitoring                         ║
║  ├─ pgrep     - Find process by name                         ║
║  └─ pstree    - Process tree                                 ║
║                                                              ║
║  Controlling Processes:                                      ║
║  ├─ kill      - Send signals                                 ║
║  ├─ pkill     - Kill by name                                 ║
║  ├─ nice      - Set priority                                 ║
║  └─ taskset   - Set CPU affinity                             ║
║                                                              ║
║  Background Jobs:                                            ║
║  ├─ &         - Run in background                            ║
║  ├─ jobs      - List jobs                                    ║
║  ├─ fg/bg     - Foreground/background                        ║
║  ├─ disown    - Detach from shell                            ║
║  └─ nohup     - Ignore hangup                                ║
║                                                              ║
╚══════════════════════════════════════════════════════════════╝

1. Quick Reference

1.1. Viewing

ps aux
ps auxf
ps -eo pid,ppid,cmd,%mem,%cpu
top -d 1
htop
pstree -p

1.2. Finding

pgrep nginx
pgrep -f "python script.py"
pidof firefox
lsof -i :8080
fuser -v /path/to/file

1.3. Killing

kill PID
kill -9 PID
kill -STOP PID
kill -CONT PID
pkill nginx
killall node

1.4. Background Jobs

command &
jobs
fg %1
bg %1
disown %1
nohup command &

1.5. Priority

nice -n 10 command
renice -n 5 -p PID
ionice -c 3 command

2. Process Fundamentals

2.1. What is a Process?

A process is a running instance of a program with:

  • PID: Unique process ID

  • PPID: Parent process ID

  • UID: User ID running the process

  • Memory space: Virtual memory mapping

  • File descriptors: Open files/sockets

  • State: Running, sleeping, stopped, zombie
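
On Linux, most of these attributes can be read from /proc/PID/status. The following is a minimal sketch; proc_summary is a hypothetical helper name chosen for illustration, not a standard command.

```shell
# proc_summary (hypothetical helper): print selected attributes of a process.
# With no argument it inspects the current shell ($$).
proc_summary() {
  pid=${1:-$$}
  # /proc/PID/status holds one "Key:<TAB>value" pair per line
  grep -E '^(Name|State|Pid|PPid|Uid|VmSize|FDSize):' "/proc/$pid/status"
}

proc_summary
```

The Uid line lists four values: real, effective, saved, and filesystem UID.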

2.2. Process States

┌─────────────────────────────────────────────────────────────────────────────┐
│                         PROCESS STATES                                       │
├─────────────────────────────────────────────────────────────────────────────┤
│                                                                              │
│   ┌─────────┐                                                               │
│   │ Created │───────────────────────────────┐                               │
│   └────┬────┘                               │                               │
│        │                                    ▼                               │
│   ┌────▼────┐    schedule    ┌──────────┐                                   │
│   │  Ready  │───────────────▶│ Running  │                                   │
│   │   (R)   │◀───────────────│   (R)    │                                   │
│   └────┬────┘   preempt      └────┬─────┘                                   │
│        │                          │                                          │
│        │                          │ wait for I/O                            │
│        │                          ▼                                          │
│        │                    ┌──────────┐                                    │
│        │                    │ Sleeping │                                    │
│        │                    │  (S/D)   │                                    │
│        │                    └────┬─────┘                                    │
│        │                         │ I/O complete                             │
│        │◀────────────────────────┘                                          │
│        │                                                                     │
│        │  SIGSTOP            ┌──────────┐                                   │
│        └────────────────────▶│ Stopped  │                                   │
│                              │   (T)    │                                   │
│                              └────┬─────┘                                   │
│                                   │ SIGCONT                                 │
│                                   ▼                                         │
│                              Back to Ready                                  │
│                                                                              │
│   State Codes:                                                              │
│   R = Running/Runnable    S = Interruptible Sleep   D = Uninterruptible     │
│   T = Stopped             Z = Zombie                X = Dead                │
│                                                                              │
└─────────────────────────────────────────────────────────────────────────────┘

2.3. Process State Codes

Code  State       Description
R     Running     Currently executing or ready to run
S     Sleeping    Interruptible sleep (waiting for event)
D     Disk Sleep  Uninterruptible sleep (waiting for I/O)
T     Stopped     Stopped by signal (SIGSTOP, SIGTSTP)
t     Traced      Stopped by debugger
Z     Zombie      Terminated but not reaped by parent
X     Dead        Should never be seen
I     Idle        Kernel thread (idle)

2.4. Zombie and Orphan Processes

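# Zombie: Child has exited but the parent hasn't read its exit status
# - Shows as 'Z' in ps (the STAT column may carry suffixes, e.g. 'Z+' or 'Zs')
# - Can't be killed (already dead)
# - Goes away once the parent calls wait() or the parent itself exits
# Find zombies (match the STAT column; a plain grep for 'Z' also hits other fields)
ps aux | awk '$8 ~ /^Z/'
ps -eo pid,ppid,stat,cmd | awk '$3 ~ /^Z/'
# Orphan: Parent died, child is adopted by init (PID 1) or a subreaper
# - Not a problem, the new parent will reap it
# - PPID usually becomes 1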
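
A zombie can be produced on purpose to see how it looks in ps. This is a sketch under assumptions: it needs procps ps (for --ppid) and relies on the parent replacing itself with sleep, which never calls wait().

```shell
# The inner shell starts a short-lived child, then execs into "sleep 2".
# sleep never reaps children, so the finished child stays a zombie
# until its parent exits.
sh -c 'sleep 0.2 & exec sleep 2' &
parent=$!
sleep 1   # by now the child has exited, the parent has not

zombie_stat=$(ps -o stat= --ppid "$parent" | tr -d ' ')
echo "child state while parent is alive: $zombie_stat"

wait "$parent"   # parent exits; the zombie is re-parented and reaped
```

The reported state should start with Z, possibly followed by a suffix such as +.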

3. Viewing Processes

3.1. ps - Process Snapshot

# All processes, full format
ps aux
# Column meanings:
# USER PID %CPU %MEM VSZ RSS TTY STAT START TIME COMMAND
# │    │   │    │    │   │   │   │    │     │    │
# │    │   │    │    │   │   │   │    │     │    └── Command
# │    │   │    │    │   │   │   │    │     └── CPU time used
# │    │   │    │    │   │   │   │    └── Start time
# │    │   │    │    │   │   │   └── Process state
# │    │   │    │    │   │   └── Terminal
# │    │   │    │    │   └── Resident memory (KB)
# │    │   │    │    └── Virtual memory (KB)
# │    │   │    └── Memory percentage
# │    │   └── CPU percentage
# │    └── Process ID
# └── User
# Tree view
ps auxf
# Custom columns
ps -eo pid,ppid,user,%cpu,%mem,stat,start,time,cmd
# Sort by memory
ps aux --sort=-%mem | head
# Sort by CPU
ps aux --sort=-%cpu | head
# Specific user
ps -u username
# Specific process
ps -p 1234
# Show threads
ps -eLf
# Long format with PPID
ps -efl

3.2. Available ps Columns

Column  Description
pid     Process ID
ppid    Parent process ID
pgid    Process group ID
sid     Session ID
user    Username
uid     User ID
%cpu    CPU percentage
%mem    Memory percentage
vsz     Virtual memory size (KB)
rss     Resident set size (KB)
stat    Process state
start   Start time
time    CPU time
cmd     Command with arguments (alias for args)
args    Full command with args
ni      Nice value
pri     Priority
tty     Terminal
wchan   Kernel function in which the process is sleeping
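
Column names can be combined with --sort and post-processed in a pipeline. The sketch below assumes procps ps; it lists the five largest processes by resident memory and converts RSS from KB to MB.

```shell
# A trailing "=" after a column name suppresses that column's header,
# which keeps the output easy to parse.
top_rss=$(ps -eo pid=,rss=,comm= --sort=-rss | head -n 5 |
  awk '{printf "%-8s %8.1f MB  %s\n", $1, $2/1024, $3}')
echo "$top_rss"
```

Only the first word of the command name is printed, so names containing spaces appear truncated.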

3.3. top - Real-time Monitoring

# Basic
top
# Update every 1 second
top -d 1
# Show specific user
top -u username
# Show specific PID
top -p 1234
# Batch mode (for scripts)
top -b -n 1
# Interactive commands:
# h     - Help
# k     - Kill process
# r     - Renice process
# u     - Filter by user
# M     - Sort by memory
# P     - Sort by CPU
# T     - Sort by time
# c     - Toggle full command
# 1     - Show per-CPU stats
# q     - Quit

3.4. htop - Better Interactive View

# Install
sudo pacman -S htop   # Arch
sudo apt install htop # Debian
# Run
htop
# Interactive commands:
# F1    - Help
# F2    - Setup
# F3    - Search
# F4    - Filter
# F5    - Tree view
# F6    - Sort by column
# F7    - Decrease nice
# F8    - Increase nice
# F9    - Kill
# F10   - Quit
# Space - Tag process
# U     - Untag all
# k     - Kill tagged

3.5. pstree - Process Tree

# Basic tree
pstree
# Show PIDs
pstree -p
# Show for specific user
pstree username
# Show for specific PID
pstree -p 1234
# Highlight process and ancestors
pstree -H 1234
# Show command line arguments
pstree -a
# Disable compaction of identical subtrees (show every process)
pstree -c

4. Finding Processes

4.1. pgrep - Find by Name

# Find PID by name
pgrep nginx
# Full command line match
pgrep -f "python script.py"
# List with names
pgrep -l nginx
# With full command
pgrep -a nginx
# Newest match only
pgrep -n nginx
# Oldest match only
pgrep -o nginx
# Specific user
pgrep -u root nginx
# Count matches
pgrep -c nginx
# Inverse match (not matching)
pgrep -v nginx

4.2. pidof - Get PID

# Get PID of program
pidof firefox
# Get PID of shell script
pidof -x script.sh

4.3. lsof - Find by Port/File

# Process using port
lsof -i :8080
# Process using specific port/protocol
lsof -i tcp:80
lsof -i udp:53
# All network connections for process
lsof -i -p 1234
# Process using file
lsof /path/to/file
# All files opened by process
lsof -p 1234
# All files opened by user
lsof -u username
# Network connections
lsof -i
# Listening ports
lsof -i -P -n | grep LISTEN

4.4. fuser - Find Process Using File

# Who's using a file?
fuser -v /path/to/file
# Who's using a mount point?
fuser -mv /mnt/usb
# Who's using a port?
fuser -v 8080/tcp
# Kill process using file
fuser -k /path/to/file
# Kill with a specific signal (fuser -k sends SIGKILL by default)
fuser -k -TERM /path/to/file

4.5. ss/netstat - Network Connections

# All listening ports with process
ss -tlnp
# All connections with process
ss -tnp
# Find process on port
ss -tlnp | grep :8080
# netstat equivalent (older)
netstat -tlnp

5. Signals and Killing

5.1. Signal Reference

Signal   Number  Default    Description
SIGHUP   1       Terminate  Hangup (terminal closed)
SIGINT   2       Terminate  Interrupt (Ctrl+C)
SIGQUIT  3       Core dump  Quit (Ctrl+\)
SIGKILL  9       Terminate  Force kill (can’t catch)
SIGTERM  15      Terminate  Graceful termination
SIGSTOP  19      Stop       Pause (can’t catch)
SIGCONT  18      Continue   Resume stopped process
SIGTSTP  20      Stop       Terminal stop (Ctrl+Z)
SIGUSR1  10      Terminate  User-defined
SIGUSR2  12      Terminate  User-defined

Numbers shown are for x86 and ARM. Some architectures number SIGUSR1, SIGUSR2, SIGSTOP, SIGCONT, and SIGTSTP differently, so prefer signal names in scripts.
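
The difference between a catchable signal and SIGKILL can be observed with a shell trap. This is a minimal sketch; the handler message and the helper name start_child are arbitrary choices for the demo.

```shell
# start_child (hypothetical helper): launch a child that installs a
# SIGTERM handler and then idles.
start_child() {
  sh -c 'trap "echo child: caught SIGTERM, cleaning up; exit 0" TERM
         while :; do sleep 0.1; done' &
  child=$!
  sleep 0.5   # give the child time to install its trap
}

# SIGTERM: the handler runs and the child exits on its own terms.
start_child
kill -TERM "$child"
term_status=0
wait "$child" || term_status=$?
echo "after SIGTERM: exit status $term_status"

# SIGKILL: the handler never runs; the kernel ends the process.
start_child
kill -KILL "$child"
kill_status=0
wait "$child" || kill_status=$?
echo "after SIGKILL: exit status $kill_status"
```

The first status is 0 because the handler calls exit 0. The second is 137, which the shell reports as 128 plus the signal number 9.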

5.2. kill - Send Signal

# Graceful shutdown (SIGTERM, default)
kill 1234
kill -15 1234
kill -TERM 1234
kill -SIGTERM 1234
# Force kill (SIGKILL)
kill -9 1234
kill -KILL 1234
# Pause process (SIGSTOP)
kill -STOP 1234
kill -19 1234
# Resume process (SIGCONT)
kill -CONT 1234
kill -18 1234
# Reload config (SIGHUP) - many daemons support this
kill -HUP 1234
# List all signals
kill -l

5.3. pkill - Kill by Name

# Kill by name
pkill nginx
# Full command match
pkill -f "python script.py"
# Kill specific user's processes
pkill -u username
# Send specific signal
pkill -9 nginx
pkill -KILL nginx
# Kill oldest match
pkill -o nginx
# Kill newest match
pkill -n nginx
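# Kill processes older than 1 hour (needs a procps-ng version that supports --older;
# do not combine with -o, which would select only the single oldest match)
pkill --older 3600 process_name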

5.4. killall - Kill All by Name

# Kill all by exact name
killall nginx
# Force kill all
killall -9 nginx
# Interactive (confirm each)
killall -i nginx
# Wait for processes to die
killall -w nginx
# Kill by user
killall -u username

5.5. Safe Kill Pattern

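# Try graceful first, then force if the process is still alive after 5 seconds
kill "$PID" && sleep 5 && kill -0 "$PID" 2>/dev/null && kill -9 "$PID"
# Function version: safe_kill PID [TIMEOUT_SECONDS]
safe_kill() {
  local pid=$1
  local timeout=${2:-5}
  local i

  # Send SIGTERM; if that fails the process is already gone (or not ours)
  kill "$pid" 2>/dev/null || return 0

  # Poll once per second; kill -0 only checks that the PID still exists
  for ((i=0; i<timeout; i++)); do
    kill -0 "$pid" 2>/dev/null || return 0
    sleep 1
  done

  # Still alive after the timeout: force kill
  kill -9 "$pid" 2>/dev/null
}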

6. Background Jobs

6.1. Running in Background

# Start in background
command &
# Start and disown immediately
command & disown
# With nohup (survives logout)
nohup command &
# Redirect output
nohup command > output.log 2>&1 &
# With setsid (new session)
setsid command

6.2. jobs - List Jobs

# List background jobs
jobs
# With PIDs
jobs -l
# Only running
jobs -r
# Only stopped
jobs -s
# Job states:
# Running   - Currently running
# Stopped   - Paused (Ctrl+Z or SIGSTOP)
# Done      - Completed

6.3. fg/bg - Foreground/Background

# Bring job to foreground
fg          # Most recent
fg %1       # Job 1
fg %command # By command name
# Send to background
bg          # Most recent
bg %1       # Job 1
# Ctrl+Z - Suspend current foreground job

6.4. disown - Detach from Shell

# Disown most recent job
disown
# Disown specific job
disown %1
# Disown all jobs
disown -a
# Keep the job in the jobs table, but don't send it SIGHUP when the shell exits
disown -h %1

6.5. nohup vs disown vs setsid

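Method  Survives Logout  New Session  Ignores HUP
&       No               No           No
disown  Yes              No           Yes
nohup   Yes              No           Yes
setsid  Yes              Yes          Yes

Only nohup makes the process itself ignore SIGHUP. With disown the shell stops forwarding the signal, and with setsid the process has no controlling terminal that could hang up.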

# nohup - Ignore hangup, redirect output
nohup command > output.log 2>&1 &
# disown - Detach running job from shell
command &
disown %1
# setsid - Create new session
setsid command
# Most robust
nohup setsid command > output.log 2>&1 &
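
The "New Session" column can be checked by comparing session IDs. The sketch below assumes procps ps and util-linux setsid; the inner shell reports its own PID and session ID.

```shell
shell_sid=$(ps -o sid= -p $$ | tr -d ' ')

# Inside the single quotes, $$ expands in the inner shell started by setsid.
child_info=$(setsid sh -c 'ps -o pid=,sid= -p $$')
child_pid=$(echo $child_info | cut -d' ' -f1)
child_sid=$(echo $child_info | cut -d' ' -f2)

echo "shell session ID: $shell_sid"
echo "setsid child:     pid=$child_pid sid=$child_sid"
```

The child is a session leader, so its session ID equals its own PID and differs from the session ID of the calling shell.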

7. Process Priority

7.1. Nice Values

Nice values range from -20 (highest priority) to 19 (lowest):

Priority Scale:
-20 ─────────────────────────────────────────── 19
High Priority                            Low Priority
(system/root only)                       (anyone)

Default nice value: 0
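
The value passed to nice -n is an adjustment relative to the current niceness, not an absolute value. A short sketch, assuming coreutils nice (which prints the current niceness when run without a command):

```shell
base=$(nice)                          # niceness of the current shell
plus5=$(nice -n 5 nice)               # child started 5 steps "nicer"
stacked=$(nice -n 5 nice -n 5 nice)   # adjustments accumulate
echo "shell: $base, +5: $plus5, +5 +5: $stacked"
```

Results are capped at 19, so the increments only show in full when the shell starts well below that limit.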

7.2. nice - Start with Priority

# Start with lower priority (nice)
nice command
nice -n 10 command
# Start with higher priority (requires root)
sudo nice -n -10 command
# Very low priority
nice -n 19 command

7.3. renice - Change Running Process

# Lower priority of running process
renice -n 10 -p 1234
# Higher priority (requires root)
sudo renice -n -5 -p 1234
# Change all processes of user
renice -n 10 -u username
# Change process group
renice -n 10 -g 1234

7.4. ionice - I/O Priority

# I/O scheduling classes:
# 0 - None (no class set; the kernel treats it as best-effort)
# 1 - Real-time (root only)
# 2 - Best-effort (the effective default)
# 3 - Idle (only when no other I/O)
# Start with idle I/O priority
ionice -c 3 command
# Start with real-time I/O (root)
sudo ionice -c 1 command
# Change running process
ionice -c 3 -p 1234
# Check current
ionice -p 1234

8. CPU Affinity

8.1. taskset - Set CPU Affinity

# Run on specific CPU
taskset -c 0 command        # CPU 0 only
taskset -c 0,2 command      # CPU 0 and 2
taskset -c 0-3 command      # CPUs 0-3
# Using a hexadecimal bitmask (bit N set = CPU N allowed; binary shown in comments)
taskset 0x1 command         # CPU 0 (0001)
taskset 0x3 command         # CPUs 0,1 (0011)
taskset 0xf command         # CPUs 0-3 (1111)
# Get affinity of running process
taskset -p 1234
# Set affinity of running process
taskset -p -c 0,1 1234
taskset -p 0x3 1234

8.2. numactl - NUMA Control

# Run on specific NUMA node
numactl --cpunodebind=0 command
# Prefer memory from specific node
numactl --preferred=0 command
# Show NUMA topology
numactl --hardware

9. Process Limits

9.1. ulimit - Shell Limits

# Show all limits
ulimit -a
# Show hard limits
ulimit -aH
# Show soft limits
ulimit -aS
# Common limits:
ulimit -n      # Max open files
ulimit -u      # Max user processes
ulimit -v      # Max virtual memory
ulimit -m      # Max resident memory (not enforced by modern Linux kernels)
ulimit -s      # Max stack size
ulimit -c      # Max core file size
# Set limits (session only)
ulimit -n 65535    # Open files
ulimit -u 4096     # Processes
# Unlimited
ulimit -c unlimited   # Core dumps
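
Limits set with ulimit apply to the current shell and are inherited by its children, so a subshell is a convenient way to lower a limit for one command only. A minimal sketch; the value 128 is arbitrary and assumes the hard limit is at least that high.

```shell
before=$(ulimit -n)

# The command substitution runs in a subshell: the lowered soft limit
# exists only there and in anything started from it.
inside=$(ulimit -S -n 128 && ulimit -S -n)

after=$(ulimit -n)
echo "before=$before inside=$inside after=$after"
```

The limit of the calling shell is unchanged afterwards.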

9.2. /etc/security/limits.conf

# Permanent limits
# /etc/security/limits.conf

# Format: <domain> <type> <item> <value>

# Examples:
*               soft    nofile          65535
*               hard    nofile          65535
username        soft    nproc           4096
@groupname      hard    memlock         unlimited

9.3. Per-Process Limits (/proc/PID/limits)

# View process limits
cat /proc/1234/limits
# Example output:
# Limit                     Soft Limit     Hard Limit     Units
# Max cpu time              unlimited      unlimited      seconds
# Max file size             unlimited      unlimited      bytes
# Max data size             unlimited      unlimited      bytes
# Max stack size            8388608        unlimited      bytes
# Max core file size        0              unlimited      bytes
# Max resident set          unlimited      unlimited      bytes
# Max processes             30785          30785          processes
# Max open files            1024           1048576        files
# Max locked memory         8388608        8388608        bytes

10. /proc/PID Deep Dive

10.1. Key Files

# Command line
cat /proc/1234/cmdline | tr '\0' ' '
# Environment
cat /proc/1234/environ | tr '\0' '\n'
# Current working directory
readlink /proc/1234/cwd
# Executable path
readlink /proc/1234/exe
# File descriptors
ls -la /proc/1234/fd/
# Memory maps
cat /proc/1234/maps
# Status (readable format)
cat /proc/1234/status
# Statistics (one line)
cat /proc/1234/stat
# I/O statistics
cat /proc/1234/io
# OOM score
cat /proc/1234/oom_score
# OOM adjustment (-1000 to 1000)
cat /proc/1234/oom_score_adj
# Limits
cat /proc/1234/limits
# Interface stats for the process's network namespace (not per-process traffic)
cat /proc/1234/net/dev
# Threads
ls /proc/1234/task/

10.2. Memory Information

# Memory map
cat /proc/1234/maps
# Summarized memory
cat /proc/1234/smaps_rollup
# Detailed per-mapping
cat /proc/1234/smaps
# Status fields
cat /proc/1234/status | grep -E "VmPeak|VmSize|VmRSS|VmSwap"
# VmPeak  - Peak virtual memory
# VmSize  - Current virtual memory
# VmRSS   - Resident set size (physical)
# VmSwap  - Swapped out memory

10.3. File Descriptors

# List open files
ls -la /proc/1234/fd/
# Types:
# 0 -> /dev/pts/0        (stdin)
# 1 -> /dev/pts/0        (stdout)
# 2 -> /dev/pts/0        (stderr)
# 3 -> /path/to/file     (file)
# 4 -> socket:[12345]    (socket)
# 5 -> pipe:[12346]      (pipe)
# 6 -> anon_inode:[...]  (anonymous)
# Detailed fd info
cat /proc/1234/fdinfo/3
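
Each entry in /proc/PID/fd is a symbolic link to whatever the descriptor refers to. The sketch below opens a descriptor in the current shell and resolves it; descriptor number 9 is an arbitrary choice.

```shell
tmp=$(mktemp)
exec 9>"$tmp"                          # open descriptor 9 for writing

fd_target=$(readlink "/proc/$$/fd/9")  # resolve the link for this shell
echo "fd 9 -> $fd_target"

exec 9>&-                              # close the descriptor again
rm -f "$tmp"
```

If the file were deleted while still open, the link target would carry a " (deleted)" suffix.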

11. Systemd Process Control

11.1. Managing Services

# Start/stop/restart
systemctl start nginx
systemctl stop nginx
systemctl restart nginx
systemctl reload nginx    # Reload config only
# Enable/disable at boot
systemctl enable nginx
systemctl disable nginx
# Status
systemctl status nginx
systemctl is-active nginx
systemctl is-enabled nginx
# List all services
systemctl list-units --type=service
systemctl list-units --type=service --state=running
# Show process tree for service
systemctl status nginx --no-pager -l

11.2. Viewing Service Processes

# Show main PID
systemctl show nginx --property=MainPID
# Show all PIDs in service cgroup
systemctl status nginx | grep -E "├|└|CGroup"
# Using cgroups v2
cat /sys/fs/cgroup/system.slice/nginx.service/cgroup.procs

11.3. Controlling Service Resources

# Limit CPU
systemctl set-property nginx.service CPUQuota=50%
# Limit memory
systemctl set-property nginx.service MemoryMax=500M
# Limit I/O
systemctl set-property nginx.service IOWeight=50
# View current limits
systemctl show nginx.service | grep -E "CPU|Memory|IO"

11.4. Service Logs

# View logs
journalctl -u nginx
# Follow logs
journalctl -u nginx -f
# Recent logs
journalctl -u nginx --since "1 hour ago"
# With process details
journalctl -u nginx -o verbose

12. Cgroups

12.1. What are Cgroups?

Control groups (cgroups) limit, account for, and isolate the resource usage of groups of processes:

  • CPU time

  • Memory

  • I/O bandwidth

  • Network

  • Device access

12.2. Cgroups v2 Basics

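# Check cgroup version (prints cgroup2fs on a v2 unified hierarchy, tmpfs on v1)
stat -fc %T /sys/fs/cgroup
# List cgroup filesystems the kernel supports (does not show which one is mounted)
grep cgroup /proc/filesystems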
# Mount point
mount | grep cgroup
# List controllers
cat /sys/fs/cgroup/cgroup.controllers
# List enabled controllers
cat /sys/fs/cgroup/cgroup.subtree_control

12.3. Viewing Process Cgroups

# Process cgroup
cat /proc/1234/cgroup
# List processes in cgroup
cat /sys/fs/cgroup/user.slice/user-1000.slice/cgroup.procs
# Cgroup resource usage
cat /sys/fs/cgroup/user.slice/user-1000.slice/cpu.stat
cat /sys/fs/cgroup/user.slice/user-1000.slice/memory.current

12.4. Limiting Resources with Cgroups

# Create cgroup
sudo mkdir /sys/fs/cgroup/my_group
# Enable controllers
echo "+cpu +memory" | sudo tee /sys/fs/cgroup/cgroup.subtree_control
# Set CPU limit: "<quota> <period>" in microseconds (50000 of 100000 = 50% of one CPU)
echo "50000 100000" | sudo tee /sys/fs/cgroup/my_group/cpu.max
# Set memory limit (500MB)
echo "500M" | sudo tee /sys/fs/cgroup/my_group/memory.max
# Add process to cgroup
echo 1234 | sudo tee /sys/fs/cgroup/my_group/cgroup.procs
# Run command in cgroup (cgexec is part of the libcgroup tools, which may need installing)
sudo cgexec -g cpu,memory:my_group command
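
The cpu.max value is a quota and a period in microseconds, and the ratio of the two is the share of one CPU. The sketch below converts a percentage into that pair; cpu_max_line is a hypothetical helper and the default period of 100000 is an assumption matching the example above.

```shell
# cpu_max_line (hypothetical helper): turn a CPU percentage into a
# "<quota> <period>" string for cpu.max.
# 100 means one full CPU, 200 means two CPUs.
cpu_max_line() {
  percent=$1
  period=${2:-100000}
  echo "$(( percent * period / 100 )) $period"
}

cpu_max_line 50          # 50000 100000
cpu_max_line 200         # 200000 100000
cpu_max_line 25 50000    # 12500 50000
```

The output could be piped to sudo tee /sys/fs/cgroup/my_group/cpu.max in place of the literal string.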

13. Process Debugging

13.1. strace - System Call Trace

# Trace running process
strace -p 1234
# Trace command
strace command
# Follow forks
strace -f command
# Trace specific syscalls (modern libc opens files with openat rather than open)
strace -e trace=openat,read,write command
strace -e file command      # File operations
strace -e network command   # Network operations
strace -e process command   # Process operations
# With timing
strace -t command           # Time of day
strace -tt command          # Microseconds
strace -T command           # Time spent in syscall
# Count syscalls
strace -c command
# Output to file
strace -o trace.log command

13.2. ltrace - Library Call Trace

# Trace library calls
ltrace command
# Trace running process
ltrace -p 1234
# Count calls
ltrace -c command

13.3. gdb - Debug Processes

# Attach to running process
gdb -p 1234
# Commands:
# bt         - Backtrace
# info threads - List threads
# thread N   - Switch to thread
# continue   - Resume
# detach     - Detach
# quit       - Exit

13.4. /proc Debug Info

# Kernel stack trace (requires root)
sudo cat /proc/1234/stack
# Wait channel (kernel function)
cat /proc/1234/wchan
# System call being executed
cat /proc/1234/syscall

13.5. Common Debugging Scenarios

# Why is process using 100% CPU?
strace -p PID -c   # Count syscalls
top -H -p PID      # Show threads
# Why is process stuck?
sudo cat /proc/PID/stack   # Kernel stack (root only)
cat /proc/PID/wchan        # Kernel function it sleeps in
strace -p PID              # See what it's waiting for
# What files is process accessing?
lsof -p PID
strace -e file -p PID
# What network connections? (match the pid= field so other numbers don't match)
ss -tnp | grep "pid=PID,"
lsof -i -p PID

14. Common Patterns

14.1. Kill Process on Port

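# Find and kill the listener (-sTCP:LISTEN skips clients that are merely connected to the port)
kill $(lsof -t -i:8080 -sTCP:LISTEN)
# Force kill
kill -9 $(lsof -t -i:8080 -sTCP:LISTEN)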
# Using fuser
fuser -k 8080/tcp

14.2. Wait for Process to Exit

# Wait for specific PID
while kill -0 $PID 2>/dev/null; do
  sleep 1
done
echo "Process $PID has exited"
# Or use wait (for child processes)
command &
wait $!
echo "Done"
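
Besides blocking, wait also returns the exit status of the child, which a polling loop with kill -0 cannot provide. A minimal sketch; the exit code 3 is arbitrary.

```shell
sh -c 'sleep 0.2; exit 3' &

job_status=0
wait "$!" || job_status=$?   # wait returns the child's exit status
echo "background job exited with status $job_status"
```

This prints status 3, the value passed to exit in the background job.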

14.3. Run with Timeout

# Using timeout command
timeout 30 command
# Kill after timeout
timeout --signal=KILL 30 command
# With warning before kill
timeout --signal=TERM --kill-after=10 30 command
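
Scripts can tell a timeout apart from an ordinary failure by the exit status. The sketch below assumes GNU coreutils timeout, which exits with 124 when the time limit was reached and the default signal was used.

```shell
status=0
timeout 1 sleep 5 || status=$?

if [ "$status" -eq 124 ]; then
  echo "command timed out"
else
  echo "command finished with status $status"
fi
```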

14.4. Process Watchdog

#!/bin/bash
# restart-if-dead.sh
COMMAND="my-daemon"
while true; do
  if ! pgrep -x "$COMMAND" > /dev/null; then
    echo "Process died, restarting..."
    $COMMAND &
  fi
  sleep 60
done

14.5. Resource Monitor One-Liner

# Top CPU consumers
ps aux --sort=-%cpu | head -5
# Top memory consumers
ps aux --sort=-%mem | head -5
# Processes by state
ps aux | awk 'NR>1 {count[$8]++} END {for (s in count) print s, count[s]}'

14.6. Find Zombie Processes

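# Find zombies (STAT may be 'Z', 'Z+', 'Zs', ...)
ps aux | awk '$8 ~ /^Z/'
# Find parents of zombies (ps aux has no PPID column, so use -eo)
ps -eo ppid=,stat= | awk '$2 ~ /^Z/ {print $1}' | sort -u
# Signal the parents (once a parent exits, init reaps its zombies)
# xargs -r skips kill when there are no zombies
ps -eo ppid=,stat= | awk '$2 ~ /^Z/ {print $1}' | sort -u | xargs -r kill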

15. See Also

  • /proc filesystem details

  • Systems Administration Mastery

  • man ps

  • man kill

  • man 7 signal

  • man cgroups