Overview
This series teaches awk through 85 heavily annotated, self-contained code examples. Each example
focuses on a single concept and includes inline annotations explaining what each line does, why
it matters, and what value or state results from it. All examples take their input from echo or
a heredoc piped into awk, so they run directly in a POSIX shell without additional setup; the
examples marked (gawk) additionally require GNU awk.
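As a minimal sketch of the two input styles mentioned above, assuming a POSIX shell with an awk on the PATH: the sample data and the variable names from_echo and from_heredoc are made up for illustration and are not taken from the series.

```shell
# Input style 1: a single line supplied by echo.
# awk splits records on whitespace by default, so $2 is the second word.
from_echo=$(echo "alpha beta gamma" | awk '{ print $2 }')
echo "$from_echo"

# Input style 2: several lines supplied by a heredoc.
# NR holds the number of the record (line) currently being processed.
from_heredoc=$(awk '{ print NR ": " $1 }' <<'EOF'
apple 3
banana 5
EOF
)
echo "$from_heredoc"
```

Quoting the heredoc delimiter ('EOF') keeps the shell from expanding anything inside the sample input, so awk sees the text exactly as written.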
Series Structure
The examples are organized into three levels based on complexity:
- Beginner — Field printing, separators, built-in variables, pattern matching, BEGIN/END blocks, arithmetic, string operations, and basic output formatting (Examples 1–28)
- Intermediate — Associative arrays, user-defined functions, string functions, getline, multiple file processing, record/field separators, and environment variables (Examples 29–56)
- Advanced — Multidimensional arrays, coprocesses, CSV parsing, report generation, state machines, real-world data analysis pipelines, and gawk-specific extensions (Examples 57–85)
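To give a rough feel for the step between levels, the sketch below contrasts a Beginner-style field print with an Intermediate-style associative array. The sales data and variable names are hypothetical, and only POSIX awk features are used.

```shell
# Hypothetical input: region and amount, one pair per line.
sales='east 10
west 4
east 7'

# Beginner-style: print one field from each record.
regions=$(printf '%s\n' "$sales" | awk '{ print $1 }')

# Intermediate-style: sum the second field per region in an associative array.
# The order of for..in is unspecified in awk, so the result is piped through sort.
totals=$(printf '%s\n' "$sales" |
  awk '{ sum[$1] += $2 } END { for (r in sum) print r, sum[r] }' | sort)

echo "$regions"
echo "$totals"
```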
Structure of Each Example
Every example follows a consistent five-part format:
- Brief Explanation — what the example demonstrates and why it matters (2-3 sentences)
- Mermaid Diagram — visual representation of data flow or concept relationships (when appropriate)
- Heavily Annotated Code — self-contained awk program with # => comments showing values, states, and output at each step
- Key Takeaway — the core insight to retain from the example (1-2 sentences)
- Why It Matters — production relevance and real-world application (50-100 words)
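The annotation style described in the list above might look like the following sketch. It is not one of the 85 examples; the input data and the report variable are assumptions made for illustration.

```shell
# Hypothetical input: name and score, one pair per line.
report=$(printf 'ann 90\nbob 72\n' | awk '
  $2 >= 80 {                  # => pattern is true only for the record "ann 90"
    passed++                  # => passed is 1 after the first record
    print $1, "pass"          # => prints: ann pass
  }
  END {
    print "passed:", passed   # => prints: passed: 1
  }
')
echo "$report"
```

Because awk treats everything after # as a comment, the annotations can sit inside the program itself without changing what it does.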
How to Use This Series
Each example is a complete, runnable shell snippet. The # => annotations show expected output
and intermediate values inline — read them alongside the code rather than running each example
independently. Examples within each level build on each other, so reading sequentially within
a level provides the fullest understanding. Readers already familiar with awk basics can jump
directly to Intermediate or Advanced.
Examples by Level
Beginner (Examples 1–28)
- Example 1: Print Every Line
- Example 2: Print a Specific Field
- Example 3: Print Multiple Fields
- Example 4: Print the Last Field with $NF
- Example 5: Print the Entire Record with $0
- Example 6: Custom Input Field Separator with -F
- Example 7: Tab-Separated Input
- Example 8: Setting OFS — Output Field Separator
- Example 9: Multi-Character Field Separator
- Example 10: NR — Record Number
- Example 11: NF — Number of Fields
- Example 12: Combining NR and NF
- Example 13: Regex Pattern Matching
- Example 14: Numeric Comparison Pattern
- Example 15: String Comparison Pattern
- Example 16: Compound Patterns with && and ||
- Example 17: Negated Pattern with Exclamation Mark
- Example 18: Range Pattern
- Example 19: BEGIN Block
- Example 20: END Block
- Example 21: BEGIN and END Together
- Example 22: Arithmetic Operations
- Example 23: Increment, Decrement, and Assignment Operators
- Example 24: String Concatenation
- Example 25: printf for Formatted Output
- Example 26: Redirect Output to a File
- Example 27: Pipe Output to a Command
- Example 28: Multiple Statements and Comments
Intermediate (Examples 29–56)
- Example 29: Basic Associative Array
- Example 30: Array Iteration with for..in
- Example 31: Testing if a Key Exists
- Example 32: Deleting Array Elements
- Example 33: Array as a Set for Deduplication
- Example 34: Defining and Calling a Function
- Example 35: Local Variables in Functions
- Example 36: length() — String and Array Length
- Example 37: substr() — Substring Extraction
- Example 38: index() — Finding a Substring
- Example 39: split() — Split String into Array
- Example 40: sub() and gsub() — Substitution
- Example 41: match() — Regex Match with Position
- Example 42: tolower() and toupper()
- Example 43: sprintf() — Format String Without Printing
- Example 44: getline from Standard Input
- Example 45: getline from a File
- Example 46: getline from a Pipe
- Example 47: FILENAME Variable
- Example 48: Using FNR and NR Together for Two-File Join
- Example 49: ARGC and ARGV
- Example 50: RS — Custom Record Separator
- Example 51: ORS — Output Record Separator
- Example 52: Multiline Records with RS=""
- Example 53: ~ and !~ Operators
- Example 54: OFMT and CONVFMT
- Example 55: ENVIRON Array
- Example 56: Ternary Operator
Advanced (Examples 57–85)
- Example 57: Multidimensional Arrays with SUBSEP
- Example 58: Checking Multi-Key Existence
- Example 59: Recursive Functions
- Example 60: Functions Modifying Arrays by Reference
- Example 61: systime() and strftime() — Date and Time
- Example 62: system() — Running Shell Commands
- Example 63: Coprocesses with |& (gawk)
- Example 64: FPAT for CSV Parsing (gawk)
- Example 65: Log File Analysis — Access Log Parser
- Example 66: Report Generation with Headers and Footers
- Example 67: Word Frequency Counter
- Example 68: State Machine Pattern
- Example 69: Histogram Generation
- Example 70: Transposing Rows and Columns
- Example 71: Cross-Referencing Two Files
- Example 72: Pivot Table Generation
- Example 73: Deduplication by Field Value
- Example 74: Generating JSON-Like Output
- Example 75: awk Script as a File with Shebang
- Example 76: Command-Line Variable Passing with -v
- Example 77: @include — Including Other awk Files (gawk)
- Example 78: gawk Profiling with --profile
- Example 79: Network Programming with /inet/ (gawk)
- Example 80: Real-World Pipeline — Nginx Log to Alert
- Example 81: Frequency Table with Percentages
- Example 82: Running Average and Standard Deviation
- Example 83: AWK Automation Pipeline — CSV to SQL INSERT
- Example 84: In-Place File Editing Pattern
- Example 85: Complete Data Pipeline — Sales Analysis
Last updated March 31, 2026