
Linux Commands Every DevOps Engineer Should Know

Kunle · 9 min read

The Linux commands you need for DevOps fall into seven categories: file management, process control, networking, text processing, disk and storage, user management, and package management. You do not need to memorise hundreds of commands. You need working fluency in roughly 50 that you will use daily, plus the knowledge to look up the rest when needed.

This guide covers the essential commands in each category with real-world DevOps examples -- not textbook syntax, but the actual ways you will use these commands when managing servers, debugging deployments, and writing automation scripts.

File management

File operations are the most basic and most frequent commands you will run. Every deployment, every configuration change, and every debugging session starts with navigating the filesystem and manipulating files.

# List files with details (permissions, size, modification date)
ls -la

# List files sorted by modification time (newest first)
ls -lt

# Show current directory
pwd

# Change directory
cd /var/log

# Go back to previous directory
cd -

DevOps use case: When you SSH into a production server to check why a deployment failed, ls -lt /var/log/ shows you the most recently modified log files -- the ones most likely to contain your error.

Copying, moving, and deleting

# Copy a file
cp config.yaml config.yaml.backup

# Copy a directory recursively
cp -r /app/config /app/config-backup

# Move or rename a file
mv old-name.conf new-name.conf

# Delete a file
rm unwanted-file.log

# Delete a directory and its contents
rm -rf /tmp/build-artifacts

DevOps use case: Before editing a production configuration file, always create a backup: cp nginx.conf nginx.conf.bak. If the change breaks something, you can restore immediately with mv nginx.conf.bak nginx.conf.
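The backup-and-restore pattern is easy to rehearse safely before you rely on it in production. A minimal sketch using a throwaway file in /tmp (the file name and contents are invented for illustration):

```shell
# Create a stand-in config file (contents invented for this demo)
echo "worker_processes 2;" > /tmp/nginx.conf

# Back it up before editing
cp /tmp/nginx.conf /tmp/nginx.conf.bak

# Simulate a bad edit
echo "broken directive" > /tmp/nginx.conf

# Roll back instantly
mv /tmp/nginx.conf.bak /tmp/nginx.conf
cat /tmp/nginx.conf
# -> worker_processes 2;
```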

Finding files

# Find all YAML files in the current directory tree
find . -name "*.yaml"

# Find files modified in the last 24 hours
find /var/log -mtime -1

# Find files larger than 100MB (useful for disk space issues)
find / -size +100M -type f 2>/dev/null

# Find and delete old log files (older than 30 days)
find /var/log -name "*.log" -mtime +30 -delete

DevOps use case: A server is running out of disk space. find / -size +100M -type f 2>/dev/null immediately shows you the largest files consuming storage. This is often the first command you run during a disk space incident.
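You can rehearse the disk-space hunt without filling a real disk. This sketch fabricates files in a scratch directory (names are invented); truncate creates sparse files, so they have a large apparent size but consume almost no actual space:

```shell
mkdir -p /tmp/find-demo
truncate -s 150M /tmp/find-demo/big.dat    # sparse: 150M apparent size, ~0 on disk
truncate -s 1M   /tmp/find-demo/small.dat

# Only the file whose size exceeds 100M is reported
find /tmp/find-demo -size +100M -type f
# -> /tmp/find-demo/big.dat
```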

Permissions

# Make a script executable
chmod +x deploy.sh

# Set specific permissions (owner: read/write/execute, group: read/execute, others: read/execute)
chmod 755 deploy.sh

# Change file ownership
chown appuser:appgroup /app/config.yaml

# Recursively change ownership of a directory
chown -R www-data:www-data /var/www/html

DevOps use case: A CI/CD pipeline deploys your application, but the web server cannot read the files because the ownership is wrong. chown -R www-data:www-data /var/www/html fixes it. Permission issues are one of the most common causes of deployment failures.
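chown needs root, but you can rehearse the diagnose-and-fix loop with chmod and stat on a scratch file (the file name and contents are invented for this demo):

```shell
echo "db_host: localhost" > /tmp/config.yaml
chmod 600 /tmp/config.yaml           # only the owner can read or write it

# Inspect the octal permission mode (GNU stat)
stat -c '%a' /tmp/config.yaml
# -> 600

chmod 644 /tmp/config.yaml           # owner read/write, group and others read
stat -c '%a' /tmp/config.yaml
# -> 644
```

This is the same loop you run on a real server: inspect the mode with stat (or ls -l), then correct it with chmod or chown.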

Process management

Knowing how to inspect and control running processes is essential for troubleshooting server issues and managing services.

# Show all running processes
ps aux

# Filter processes by name
ps aux | grep nginx

# Real-time process monitor (CPU, memory usage)
top

# Better alternative to top (install with apt install htop)
htop

# Kill a process by PID
kill 12345

# Force kill an unresponsive process
kill -9 12345

# Kill all processes matching a name
pkill -f "node server.js"

Service management with systemctl

Modern Linux uses systemd to manage services. These commands are essential:

# Check if a service is running
systemctl status nginx

# Start / stop / restart a service
systemctl start nginx
systemctl stop nginx
systemctl restart nginx

# Enable a service to start on boot
systemctl enable nginx

# View service logs
journalctl -u nginx --since "1 hour ago"

# Follow logs in real time
journalctl -u nginx -f

DevOps use case: After deploying a new version of your application, you restart the service: systemctl restart myapp. Then you immediately check the logs: journalctl -u myapp -f to verify it started cleanly. If it crashes, the logs tell you exactly why.

Networking

Networking commands are critical for debugging connectivity issues, testing APIs, and verifying DNS resolution -- problems you will encounter weekly in DevOps.

Testing connectivity and APIs

# Test if a host is reachable
ping -c 4 google.com

# Make an HTTP request (test an API endpoint)
curl -s https://api.example.com/health

# Make a POST request with JSON data
curl -X POST -H "Content-Type: application/json" -d '{"key": "value"}' https://api.example.com/data

# Download a file
wget https://example.com/install.sh

# Download a file with curl
curl -O https://example.com/install.sh

DNS and port inspection

# Look up DNS records
dig example.com

# Simpler DNS lookup
nslookup example.com

# Show active network connections and listening ports
ss -tlnp

# Legacy equivalent (still on many systems)
netstat -tlnp

# Test if a specific port is open on a remote host
nc -zv example.com 443

# Trace the network path to a host
traceroute example.com

DevOps use case: Your application cannot connect to a database. First, check DNS: dig db.internal.example.com. Then check if the port is reachable: nc -zv db.internal.example.com 5432. Then check if anything is listening locally: ss -tlnp | grep 5432. These three commands diagnose 90% of connectivity problems.

Transferring files

# Copy a file to a remote server
scp deploy.tar.gz user@server:/tmp/

# Copy a file from a remote server
scp user@server:/var/log/app.log ./

# Sync a directory to a remote server (only transfers changed files)
rsync -avz ./build/ user@server:/var/www/html/

DevOps use case: rsync is preferred over scp for deployments because it only transfers files that have changed. Deploying a 500MB application directory where only 2 files changed means rsync transfers kilobytes instead of the full 500MB.

Text processing

Log analysis and configuration file manipulation are daily tasks. These commands let you search, filter, and transform text efficiently.

Viewing files

# View a file
cat config.yaml

# View a large file with pagination
less /var/log/syslog

# View the last 50 lines of a file
tail -50 /var/log/app.log

# Follow a log file in real time (new lines appear as they are written)
tail -f /var/log/app.log

# View the first 20 lines
head -20 /var/log/app.log

Searching with grep

# Search for a pattern in a file
grep "ERROR" /var/log/app.log

# Search recursively in a directory
grep -r "database_url" /app/config/

# Case-insensitive search
grep -i "timeout" /var/log/app.log

# Show line numbers with results
grep -n "404" /var/log/nginx/access.log

# Count occurrences
grep -c "ERROR" /var/log/app.log

# Show lines that do NOT match
grep -v "DEBUG" /var/log/app.log

DevOps use case: Your application is throwing 500 errors. grep "500" /var/log/nginx/access.log | tail -20 shows the most recent 500 errors. grep -c "500" /var/log/nginx/access.log tells you how many total. Adding grep -v "healthcheck" filters out noise from load balancer health checks.
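Here is that workflow end to end on a fabricated log (the IPs, paths, and status codes are invented for illustration):

```shell
cat > /tmp/access.log <<'EOF'
10.0.0.1 - - [01/Jan/2025:10:00:00] "GET /api/users HTTP/1.1" 500 123
10.0.0.2 - - [01/Jan/2025:10:00:01] "GET /healthcheck HTTP/1.1" 500 99
10.0.0.3 - - [01/Jan/2025:10:00:02] "GET /api/orders HTTP/1.1" 200 456
EOF

# Count all 500s, then list them with health-check noise filtered out
grep -c '" 500 ' /tmp/access.log
# -> 2
grep '" 500 ' /tmp/access.log | grep -v "healthcheck"
```

Note the pattern '" 500 ' anchors on the status-code field; a bare grep "500" would also match any URL or byte count containing 500.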

Advanced text processing with awk and sed

# Print the 5th column of output (useful for parsing ps, df, etc.)
df -h | awk '{print $1, $5}'

# Sum values in a column
awk '{sum += $1} END {print sum}' numbers.txt

# Replace text in a file
sed -i 's/old-value/new-value/g' config.yaml

# Delete lines matching a pattern
sed -i '/^#/d' config.yaml

# Extract fields from structured logs
awk '{print $1}' access.log | sort | uniq -c | sort -rn | head -10

DevOps use case: The last example is a classic -- it extracts IP addresses from an Nginx access log, counts how many requests each IP made, and shows the top 10. This is how you quickly identify traffic patterns or potential abuse during an incident.
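To watch the pipeline work, run it against a tiny fabricated log (the IPs and requests are invented):

```shell
printf '%s\n' '1.2.3.4 GET /' '1.2.3.4 GET /' '1.2.3.4 GET /' '5.6.7.8 GET /' > /tmp/access.log

awk '{print $1}' /tmp/access.log | sort | uniq -c | sort -rn | head -10
```

The most talkative IP (1.2.3.4, three requests) appears first. sort before uniq -c matters: uniq only collapses adjacent duplicate lines.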

Disk and storage

Running out of disk space is one of the most common production incidents. These commands help you monitor and manage storage.

# Show disk usage for all mounted filesystems
df -h

# Show directory sizes (sorted, human-readable)
du -sh /var/log/*

# Find the largest directories from root
du -h --max-depth=1 / 2>/dev/null | sort -hr | head -20

# Show disk I/O statistics
iostat -x 1

# Mount a new volume
mount /dev/xvdf /data

# Check filesystem for errors
fsck /dev/xvdf

| Command | Purpose | Typical DevOps use |
| --- | --- | --- |
| df -h | Filesystem disk space | Alert when disk exceeds 80% |
| du -sh * | Directory sizes | Find which directory is consuming space |
| lsblk | List block devices | Verify attached EBS volumes |
| mount | Mount filesystems | Attach new storage volumes |
| iostat | I/O statistics | Diagnose slow disk performance |

DevOps use case: Monitoring alerts you that a server is at 95% disk usage. You SSH in, run df -h to confirm which filesystem is full, then du -sh /var/* to find the culprit directory. It is almost always /var/log. You then rotate or clear old logs and set up log rotation to prevent recurrence.
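The same triage works on fabricated directories (names and sizes invented). dd writes real bytes here, so du reports them:

```shell
mkdir -p /tmp/du-demo/logs /tmp/du-demo/cache
dd if=/dev/zero of=/tmp/du-demo/logs/app.log bs=1M count=5 status=none
dd if=/dev/zero of=/tmp/du-demo/cache/item  bs=1M count=1 status=none

# Largest directory first -- here, logs/ at ~5M
du -sh /tmp/du-demo/* | sort -hr
```

sort -hr understands the human-readable suffixes (K, M, G) that du -h emits, so the biggest consumer always lands at the top.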

User management

Managing users and access is part of server administration and security hardening.

# Add a new user
useradd -m -s /bin/bash deploy

# Set a password
passwd deploy

# Add user to a group (e.g., sudo, docker)
usermod -aG docker deploy

# Switch to another user
su deploy

# View current user
whoami

# View user groups
groups deploy

# Delete a user
userdel -r olduser

DevOps use case: Your CI/CD pipeline needs to run Docker commands on a server. You create a dedicated deploy user, add it to the docker group with usermod -aG docker deploy, and configure the pipeline to SSH as that user. This keeps the pipeline off the root account, but note that docker group membership is effectively root-equivalent (a container can mount the host filesystem), so protect the deploy user's credentials just as carefully.

Package management

Installing and updating software is different depending on the Linux distribution. The two most common package managers in DevOps are apt (Debian/Ubuntu) and yum/dnf (RHEL/CentOS/Amazon Linux).

apt (Debian/Ubuntu)

# Update package lists
apt update

# Install a package
apt install -y nginx

# Remove a package
apt remove nginx

# Upgrade all packages
apt upgrade -y

# Search for a package
apt search docker

yum/dnf (RHEL/CentOS/Amazon Linux)

# Install a package
yum install -y nginx

# Remove a package
yum remove nginx

# Update all packages
yum update -y

# Search for a package
yum search docker

# On newer RHEL-based systems, dnf replaces yum
dnf install -y nginx

DevOps use case: Provisioning a new server often starts with installing packages: apt update && apt install -y nginx certbot python3-certbot-nginx. In practice, you automate this with configuration management tools or bake it into a machine image (AMI), but understanding the underlying commands is essential for debugging.

Pro tips for daily DevOps use

These patterns combine multiple commands and are the kind of practical knowledge that separates efficient operators from beginners.

1. Chain commands for quick diagnostics

# Check disk, memory, and CPU in one line
df -h / && free -h && uptime

2. Use watch for real-time monitoring

# Re-run a command every 2 seconds
watch -n 2 'kubectl get pods'

3. Search command history

# Search your command history
history | grep "docker"

# Interactive reverse search (press Ctrl+R, then type)
# This is the fastest way to find and re-run previous commands

4. Use tee to save and display output simultaneously

# Run a command and save its output to a file while still displaying it
terraform plan | tee plan-output.txt

5. Background long-running tasks

# Run a process in the background that survives SSH disconnection
nohup ./long-running-script.sh > output.log 2>&1 &
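A safe way to see the pattern work: a short stand-in script (name and contents invented) launched with nohup. Its output lands in the log file even though the process is detached from the terminal:

```shell
cat > /tmp/long-running.sh <<'EOF'
#!/bin/bash
sleep 1
echo "work finished"
EOF
chmod +x /tmp/long-running.sh

nohup /tmp/long-running.sh > /tmp/output.log 2>&1 &
wait $!          # in a real session you would disconnect instead of waiting

grep "work finished" /tmp/output.log
# -> work finished
```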

6. Create command aliases for frequent operations

# Add to ~/.bashrc
alias k='kubectl'
alias tf='terraform'
alias dc='docker compose'
alias gs='git status'

These aliases save thousands of keystrokes per week. Every experienced DevOps engineer has a curated set.

Where Linux fits in the DevOps toolchain

Linux is not just one skill on a list -- it is the foundation that every other DevOps tool depends on. Docker containers run Linux. Kubernetes nodes run Linux. Terraform provisions Linux servers. CI/CD pipelines execute Linux shell commands. Prometheus and Grafana run on Linux.

Without Linux fluency, every other tool becomes harder to learn, use, and debug. This is why the CloudPros curriculum starts with Linux fundamentals in the first two weeks before moving to Git, Docker, CI/CD, and cloud platforms.

The recommended learning sequence:

  1. Linux fundamentals (commands in this article) -- 2 weeks
  2. Bash scripting -- combine commands into automation scripts -- 1 week
  3. Git version control -- 1 week
  4. Docker -- containers run Linux under the hood -- 2 weeks
  5. Cloud platforms (AWS) -- every EC2 instance is a Linux server -- 3 weeks
  6. Infrastructure as Code (Terraform) -- provisions the Linux servers -- 2 weeks
  7. Kubernetes -- orchestrates containers on Linux nodes -- 3 weeks

Every step builds on Linux. Master these commands first, and the rest of the DevOps toolchain becomes dramatically easier.


Ola
Founder, CloudPros

Building the most hands-on DevOps bootcamp for the AI era. 16 weeks of real infrastructure, real projects, real career outcomes.