awk and sed: Processing Logs Without Opening Files
When a log file is 10 GB, opening it in an ordinary editor can exhaust memory and freeze your session. This is where the "old guard" of Unix utilities comes in: sed and awk. Both read the file as a stream, one line at a time, instead of loading it whole.
🛠 1. What's the difference?
- sed is about lines and text. Ideal for replacing words or deleting lines.
- awk is about columns and data. It's a small programming language that splits each line into fields, so it handles table-like data naturally.
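To make the contrast concrete, here is a minimal sketch that runs both tools on the same line. The log line and the variable names are made up for illustration; a real log may use a different layout.

```shell
# Hypothetical access-log line, used only for illustration.
line='203.0.113.7 - - [10/Oct/2024:13:55:36 +0000] "GET /index.html HTTP/1.1" 200 512'

# sed treats the line as text: replace one string with another.
sed_out=$(printf '%s\n' "$line" | sed 's/GET/POST/')

# awk treats the line as fields: pick the 1st and 9th whitespace-separated columns.
awk_out=$(printf '%s\n' "$line" | awk '{print $1, $9}')

printf '%s\n' "$sed_out"
printf '%s\n' "$awk_out"
```

The sed call returns the whole line with one word changed, while the awk call returns only the two columns it was asked for.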
📝 2. sed: Stream Editor
Replace data on the fly (the result goes to standard output; the file itself is not modified)
bash
sed 's/session_id=[0-9]*/session_id=HIDDEN/g' access.log
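sed can also delete lines, and several commands can be combined in one pass with -e. The sketch below assumes a tiny made-up log (the paths and session IDs are hypothetical) and masks session IDs while dropping health-check lines.

```shell
# Hypothetical sample data; real logs will look different.
log=$(mktemp)
printf '%s\n' \
  'GET /home session_id=12345 200' \
  'GET /health 200' \
  'GET /cart session_id=67890 200' > "$log"

# Mask session IDs and delete lines containing /health in a single pass.
# sed writes the result to stdout; the file on disk stays unchanged.
masked=$(sed -e 's/session_id=[0-9]*/session_id=HIDDEN/g' -e '/\/health/d' "$log")

printf '%s\n' "$masked"
rm -f "$log"
```

Because the output is a stream, it can be piped straight into another tool or redirected to a new file.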
💎 3. awk: Powerful Reporting
awk automatically splits each line into whitespace-separated fields: $1 (first field), $2 (second), and so on. $0 holds the whole line.
Print specific fields (in the common Apache/Nginx access log format, $1 is the client IP and $7 is the request path)
bash
awk '{print $1, $7}' access.log
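A condition in front of the braces turns the same command into a filter. The sketch below assumes the common access log layout, where $9 is the HTTP status code; the sample lines are invented for illustration.

```shell
# Hypothetical sample in the common access log layout.
log=$(mktemp)
printf '%s\n' \
  '203.0.113.7 - - [10/Oct/2024:13:55:36 +0000] "GET /index.html HTTP/1.1" 200 512' \
  '198.51.100.4 - - [10/Oct/2024:13:55:40 +0000] "GET /missing HTTP/1.1" 404 128' \
  '203.0.113.7 - - [10/Oct/2024:13:56:02 +0000] "POST /login HTTP/1.1" 500 64' > "$log"

# Print client IP, path and status only for responses with status 400 or higher.
errors=$(awk '$9 >= 400 {print $1, $7, $9}' "$log")

printf '%s\n' "$errors"
rm -f "$log"
```

If your log format differs, check which field number holds the status before relying on $9.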
📊 4. Practical Recipes
Top 10 IP addresses by request count
bash
awk '{print $1}' access.log | sort | uniq -c | sort -nr | head -n 10
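To see what each stage of this pipeline contributes, here is a small sketch on made-up data (the addresses and paths are hypothetical). It also shows a variant that returns the number of distinct addresses rather than a ranking.

```shell
# Hypothetical sample: three distinct client IPs, six requests.
log=$(mktemp)
printf '%s\n' \
  '203.0.113.7 - - "GET /a"' \
  '198.51.100.4 - - "GET /b"' \
  '203.0.113.7 - - "GET /c"' \
  '192.0.2.1 - - "GET /d"' \
  '203.0.113.7 - - "GET /e"' \
  '198.51.100.4 - - "GET /f"' > "$log"

# Requests per IP, busiest first.
# sort groups identical IPs so that uniq -c can count each run;
# the final awk swaps the columns and drops uniq's leading padding.
top=$(awk '{print $1}' "$log" | sort | uniq -c | sort -nr | awk '{print $2, $1}')

# Number of distinct IPs (tr strips padding that some wc versions add).
distinct=$(awk '{print $1}' "$log" | sort -u | wc -l | tr -d ' ')

printf '%s\n' "$top"
printf 'distinct IPs: %s\n' "$distinct"
rm -f "$log"
```

The first sort is needed because uniq only collapses adjacent duplicate lines.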
🏁 Summary
Using awk and sed saves hours of routine work. Instead of downloading logs, analyze them right where they live: on the server.