What Is Regex?
Regular expressions (regex) are patterns used to match and manipulate text. Think of them as a super-powered search function that finds patterns in strings, such as “all email addresses,” “phone numbers,” or “words starting with ‘cat’,” instead of just exact matches. They’re everywhere: text editors, command-line tools, programming languages, and even your IDE’s search-and-replace feature.
Why Should You Care?
Regex can save you hours of manual text processing. Whether you’re parsing log files, validating user input, extracting data from documents, or refactoring code, a single well-chosen pattern can often replace dozens of lines of string manipulation. It’s like having a Swiss Army knife for text: once you learn it, you’ll wonder how you ever lived without it.
The Building Blocks
Literals and Character Classes
The simplest regex is just plain text: cat matches “cat”. But the magic begins with character classes:
[abc] matches any single character: a, b, or c
[a-z] matches any lowercase letter
[0-9] matches any digit
[^abc] matches anything EXCEPT a, b, or c (the ^ inside brackets means “not”)
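To see these classes in action, here is a minimal sketch using Python's built-in re module. Python is an assumption here (this guide practices with rg later on), and the sample string is made up for illustration.

```python
import re

# Hypothetical sample text, chosen only for illustration.
text = "cab-42 xyz"

# [abc] matches one character that is a, b, or c.
abc_hits = re.findall(r"[abc]", text)

# [a-z] matches any lowercase ASCII letter.
lower_hits = re.findall(r"[a-z]", text)

# [0-9] matches any ASCII digit.
digit_hits = re.findall(r"[0-9]", text)

# [^abc] matches any single character that is NOT a, b, or c.
not_abc_hits = re.findall(r"[^abc]", text)

print(abc_hits)      # ['c', 'a', 'b']
print(lower_hits)    # ['c', 'a', 'b', 'x', 'y', 'z']
print(digit_hits)    # ['4', '2']
print(not_abc_hits)  # ['-', '4', '2', ' ', 'x', 'y', 'z']
```

Note that the negated class also matches the hyphen and the space: “anything except” really does mean any other character.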
The Dot and Escapes
. matches any single character except newline
\. matches a literal period (backslash escapes special characters)
\d matches any digit (shorthand for [0-9])
\w matches any word character (letters, digits, underscore)
\s matches any whitespace (spaces, tabs, newlines)
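The difference between a bare dot and an escaped dot is easy to miss, so here is a small sketch in Python's re module (an assumed choice of language; the test strings are invented for illustration).

```python
import re

# Hypothetical strings, chosen only for illustration.
candidates = "1.2 1x2 1\n2"
text = "v1.2 a_b\tend"

# An unescaped dot matches any character except a newline,
# so "1x2" matches but "1\n2" does not.
dot_hits = re.findall(r"1.2", candidates)

# An escaped dot matches only a literal period.
literal_hits = re.findall(r"1\.2", candidates)

# Shorthand classes: digits, runs of word characters, whitespace.
digits = re.findall(r"\d", text)
words = re.findall(r"\w+", text)
spaces = re.findall(r"\s", text)

print(dot_hits)      # ['1.2', '1x2']
print(literal_hits)  # ['1.2']
print(digits)        # ['1', '2']
print(words)         # ['v1', '2', 'a_b', 'end']
print(spaces)        # [' ', '\t']
```

The underscore in a_b counts as a word character, which is why it stays in one piece.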
Quantifiers (The Power Multipliers)
Quantifiers specify how many times the preceding element may repeat:
* means zero or more times (e.g., a* matches “”, “a”, “aa”, “aaa”)
+ means one or more times (e.g., a+ matches “a”, “aa”, but not “”)
? means zero or one time (makes something optional)
{3} means exactly 3 times
{3,} means 3 or more times
{3,6} means between 3 and 6 times
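One way to check each quantifier is to test whole strings against it. The sketch below assumes Python's re module; the helper fully_matches and the test strings are hypothetical names introduced only for this illustration.

```python
import re

def fully_matches(pattern, text):
    """Return True when the whole of `text` matches `pattern`."""
    return re.fullmatch(pattern, text) is not None

# (pattern, text) pairs; the expected result is noted beside each one.
checks = [
    (r"a*", ""),              # True: zero repetitions are allowed
    (r"a*", "aaa"),           # True
    (r"a+", ""),              # False: at least one "a" is required
    (r"a+", "aa"),            # True
    (r"ab?", "a"),            # True: the "b" is optional
    (r"ab?", "abb"),          # False: at most one "b"
    (r"\d{3}", "123"),        # True: exactly three digits
    (r"\d{3}", "1234"),       # False: four digits is too many
    (r"\d{3,}", "12345"),     # True: three or more
    (r"\d{3,6}", "1234567"),  # False: seven digits exceeds the maximum
]

results = [fully_matches(pattern, text) for pattern, text in checks]

for (pattern, text), ok in zip(checks, results):
    print(f"{pattern!r} vs {text!r}: {ok}")
```

Using re.fullmatch rather than re.search matters here: a search for \d{3} would happily find three digits inside “1234”.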
Anchors
Anchors don’t match characters—they match positions:
^ matches the start of a line
$ matches the end of a line
\b matches a word boundary (the edge between a word and non-word character)
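A short sketch of the anchors, assuming Python's re module and an invented three-line string. One caveat: in Python, ^ and $ only anchor to each line when re.MULTILINE is set, whereas line-oriented tools such as rg look at one line at a time by default.

```python
import re

# Hypothetical multi-line text, chosen only for illustration.
text = "Cat sat\nconcat\nthe cat"

# re.MULTILINE makes ^ and $ match at every line boundary,
# not just at the start and end of the whole string.
first_words = re.findall(r"^\w+", text, flags=re.MULTILINE)
last_words = re.findall(r"\w+$", text, flags=re.MULTILINE)

# Without \b, "cat" is also found inside "concat".
loose = re.findall(r"cat", text)

# With \b on both sides, only the standalone word matches.
whole = re.findall(r"\bcat\b", text)

print(first_words)  # ['Cat', 'concat', 'the']
print(last_words)   # ['sat', 'concat', 'cat']
print(loose)        # ['cat', 'cat']
print(whole)        # ['cat']
```

The capitalized “Cat” is never matched because these searches are case-sensitive.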
Groups and Alternation
(pattern) creates a capture group
| means “or” (e.g., cat|dog matches “cat” or “dog”)
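Capture groups become more concrete once you pull values out of them. This is a minimal sketch assuming Python's re module; the sample string and the gr(a|e)y pattern are illustrative additions, not part of the exercises.

```python
import re

# Hypothetical sample text, chosen only for illustration.
text = "cat 2024-11-08 dog"

# Alternation: match either word.
animals = re.findall(r"cat|dog", text)

# Three capture groups pull the parts of the date out separately.
date_match = re.search(r"(\d{4})-(\d{2})-(\d{2})", text)
full_date = date_match.group(0)  # the entire match
year, month, day = date_match.groups()

# Parentheses also limit how far an alternation reaches:
# gr(a|e)y means "gr", then "a" or "e", then "y".
grey_ok = re.fullmatch(r"gr(a|e)y", "grey") is not None
gray_ok = re.fullmatch(r"gr(a|e)y", "gray") is not None

print(animals)           # ['cat', 'dog']
print(full_date)         # 2024-11-08
print(year, month, day)  # 2024 11 08
print(grey_ok, gray_ok)  # True True
```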
Quick Examples
^\d{3}-\d{4}$ matches phone numbers like “123-4567”
\w+@\w+\.\w+ matches simple email addresses
^[A-Z] matches lines starting with an uppercase letter
colou?r matches both “color” and “colour”
\d{2,4} matches 2 to 4 consecutive digits
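The quick examples above can be tried out directly. The sketch below assumes Python's re module, and every input string is invented for illustration.

```python
import re

# ^ and $ force the whole string to be the phone number.
phone_ok = re.search(r"^\d{3}-\d{4}$", "123-4567") is not None
phone_embedded = re.search(r"^\d{3}-\d{4}$", "call 123-4567") is not None

# Without anchors, the email is found anywhere in the text.
email = re.search(r"\w+@\w+\.\w+", "mail john@example.com now").group(0)

# ^[A-Z] only cares about the first character.
upper_ok = re.search(r"^[A-Z]", "Hello") is not None
lower_ok = re.search(r"^[A-Z]", "hello") is not None

spellings = re.findall(r"colou?r", "color colour")

# \d{2,4} takes up to 4 digits at a time, so a longer run is split.
digit_runs = re.findall(r"\d{2,4}", "7 42 2024 123456")

print(phone_ok, phone_embedded)  # True False
print(email)                     # john@example.com
print(upper_ok, lower_ok)        # True False
print(spellings)                 # ['color', 'colour']
print(digit_runs)                # ['42', '2024', '1234', '56']
```

The last result is worth a second look: the six-digit run is not rejected, it is matched in two pieces. Add \b on both sides if you want only standalone 2-to-4-digit numbers.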
Practice Time: Meet rg (ripgrep)
Time to get your hands dirty! rg (ripgrep) is a blazing-fast command-line search tool that uses regex. Let’s install it and practice.
Installing ripgrep
Linux (Ubuntu/Debian):
sudo apt update
sudo apt install ripgrep
macOS:
brew install ripgrep
Windows:
Using Scoop (no admin required):
# Install Scoop first
Set-ExecutionPolicy -ExecutionPolicy RemoteSigned -Scope CurrentUser
Invoke-RestMethod -Uri https://get.scoop.sh | Invoke-Expression
# Then install ripgrep
scoop install ripgrep
Verify installation:
rg --version
The Challenge
Create a sample file to practice on:
cat > sample.txt << 'EOF'
My email is john@example.com
Call me at 321-555-1234 or 543-555-5678
The year 2024 was great!
Colors: color, colour, favorite, favourite
Error: Failed to connect on 2024-11-08
192.168.1.1 is the router IP
IMPORTANT: Read this carefully
The quick brown fox jumps
http://example.com and https://secure.com
EOF
Now try these exercises. First, try writing the regex yourself, then test it:
Exercise 1: Find all lines containing email addresses
Hint
Think about the pattern: word characters, @, word characters, dot, word characters
Solution
rg '\w+@\w+\.\w+' sample.txt
Exercise 2: Find all phone numbers (format: 321-555-1234)
Hint
Use \d for digits. The hyphen needs no escaping here: it is literal outside brackets.
Solution
rg '\d{3}-\d{3}-\d{4}' sample.txt
Exercise 3: Find lines starting with an uppercase letter
Hint
Use ^ for line start and character classes for uppercase letters
Solution
rg '^[A-Z]' sample.txt
Exercise 4: Match both “color” and “colour” (and their variants)
Hint
Make the 'u' optional using ?
Solution
rg 'colou?r' sample.txt
Exercise 5: Find IP addresses (simple version)
Hint
Pattern: digits, dot, digits, dot, digits, dot, digits
Solution
rg '\d+\.\d+\.\d+\.\d+' sample.txt
Exercise 6: Find URLs (starting with http or https)
Hint
Make the 's' optional with ? (alternation with | also works: http|https), then use \S+ to match everything up to the next whitespace
Solution
rg 'https?://\S+' sample.txt
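If rg is not installed yet, the six solutions can be sanity-checked with a rough Python stand-in. This is only a sketch: Python's re engine is not the one rg uses (the two agree on these simple patterns), and matching_lines is a hypothetical helper that imitates rg's print-the-matching-lines behavior.

```python
import re

# The same text as sample.txt above.
SAMPLE = """\
My email is john@example.com
Call me at 321-555-1234 or 543-555-5678
The year 2024 was great!
Colors: color, colour, favorite, favourite
Error: Failed to connect on 2024-11-08
192.168.1.1 is the router IP
IMPORTANT: Read this carefully
The quick brown fox jumps
http://example.com and https://secure.com
"""

# One pattern per exercise, copied from the solutions.
PATTERNS = {
    "emails": r"\w+@\w+\.\w+",
    "phones": r"\d{3}-\d{3}-\d{4}",
    "uppercase_start": r"^[A-Z]",
    "color": r"colou?r",
    "ips": r"\d+\.\d+\.\d+\.\d+",
    "urls": r"https?://\S+",
}

def matching_lines(pattern, text):
    """Return the lines of `text` in which `pattern` matches somewhere."""
    regex = re.compile(pattern)
    return [line for line in text.splitlines() if regex.search(line)]

results = {name: matching_lines(p, SAMPLE) for name, p in PATTERNS.items()}

for name, lines in results.items():
    print(f"{name}: {len(lines)} line(s)")

# Individual matches rather than whole lines:
phones = re.findall(PATTERNS["phones"], SAMPLE)
urls = re.findall(PATTERNS["urls"], SAMPLE)
print(phones)  # ['321-555-1234', '543-555-5678']
print(urls)    # ['http://example.com', 'https://secure.com']
```

Seven of the nine lines start with an uppercase letter; the IP address line and the URL line do not, so Exercise 3 should leave those two out.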
Bonus Tips for rg
rg -i pattern for case-insensitive search
rg -w pattern to match whole words only
rg -n pattern to show line numbers
rg -c pattern to count matching lines per file
rg -A 2 pattern to show 2 lines after each match
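To build intuition for what these flags change, here are rough Python analogues of four of them. These are approximations written for illustration, not a description of how rg works internally, and the log lines are invented.

```python
import re

# Hypothetical lines, chosen only for illustration.
lines = ["Error: disk full", "error: retry", "terror level", "all good"]

# Roughly like `rg -i error`: ignore case.
insensitive = [l for l in lines if re.search(r"error", l, flags=re.IGNORECASE)]

# Roughly like `rg -w error`: require word boundaries around the match.
whole_word = [l for l in lines if re.search(r"\berror\b", l)]

# Roughly like `rg -c error`: count the lines that match.
count = sum(1 for l in lines if re.search(r"error", l))

# Roughly like `rg -n error`: pair each matching line with its 1-based number.
numbered = [(i, l) for i, l in enumerate(lines, start=1) if re.search(r"error", l)]

print(insensitive)  # ['Error: disk full', 'error: retry', 'terror level']
print(whole_word)   # ['error: retry']
print(count)        # 2
print(numbered)     # [(2, 'error: retry'), (3, 'terror level')]
```

The plain search counts “terror level” as a hit, which is exactly the kind of false positive that whole-word matching removes.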
You’re Now a Regex Practitioner
With these fundamentals, you can handle most everyday regex tasks. The key to mastery? Practice! Start using regex in your daily workflow: in your code editor’s search, command-line tools, or scripts. You’ll be pattern-matching like a pro in no time.
Remember: regex can get complex fast, but you don’t need to memorize everything. Keep this guide handy, and don’t be afraid to test your patterns incrementally. Happy pattern matching!
