Regular Expressions (Regex) for Beginners: A Practical Guide

Learn regex from scratch: syntax, character classes, quantifiers, anchors, and 10 practical patterns for validating emails, URLs, phone numbers, and more.

ToolNest Team

September 10, 2025

What Is Regex?

A regular expression (regex or regexp) is a sequence of characters that defines a search pattern. Regex is built into almost every programming language and text editor. It's used for:

  • Validation: Does this string match the expected format?
  • Search and replace: Find all occurrences of a pattern and replace them
  • Extraction: Pull specific data out of a larger string
  • Splitting: Split a string on complex patterns
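
As a rough illustration, here is how each of these four uses might look in JavaScript. The sample strings and variable names are made up for this sketch.

```javascript
// Sample strings are hypothetical, chosen only to illustrate each use.
const isZip = /^\d{5}$/.test("90210");              // validation
const slashed = "2025-09-10".replace(/-/g, "/");    // search and replace
const year = "Released in 2025".match(/\d{4}/)[0];  // extraction
const parts = "a, b;c".split(/[,;]\s*/);            // splitting

console.log(isZip);    // true
console.log(slashed);  // 2025/09/10
console.log(year);     // 2025
console.log(parts);    // [ 'a', 'b', 'c' ]
```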

Regex looks intimidating at first, but a handful of concepts covers most everyday use cases.

Basic Syntax

Literal characters match themselves exactly:

Pattern: cat
Matches: "cat" in "the cat sat on the mat"

The dot (.) matches any single character (except newline by default):

Pattern: c.t
Matches: "cat", "cut", "cot", "c@t"
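
To see the dot in action, a quick sketch in JavaScript. The test strings are illustrative, and the `s` (dotAll) flag is shown only as one way to change the newline default.

```javascript
const pattern = /c.t/;

console.log(pattern.test("cat"));   // true
console.log(pattern.test("c@t"));   // true: the dot accepts any character
console.log(pattern.test("ct"));    // false: the dot must consume one character
console.log(pattern.test("c\nt"));  // false: newline is excluded by default
console.log(/c.t/s.test("c\nt"));   // true: the s (dotAll) flag includes newlines
```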

Character Classes

Square brackets define a set of characters to match:

[abc]     - matches a, b, or c
[a-z]     - matches any lowercase letter
[A-Z]     - matches any uppercase letter
[0-9]     - matches any digit
[a-zA-Z]  - matches any letter
[^abc]    - matches anything EXCEPT a, b, or c (negation)

Shorthand character classes:

\d   - any digit [0-9]
\D   - any non-digit [^0-9]
\w   - word character [a-zA-Z0-9_]
\W   - non-word character
\s   - whitespace (space, tab, newline)
\S   - non-whitespace
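
Here is a small sketch of a few of these classes applied to one made-up string. The `g` flag used here asks for every match rather than just the first.

```javascript
const sample = "Room 42B, floor_3"; // hypothetical input

const digits = sample.match(/\d/g);        // every digit
const capitals = sample.match(/[A-Z]/g);   // every uppercase letter
const nonWord = sample.match(/\W/g);       // the underscore counts as a word character
const others = sample.match(/[^a-z\s]/g);  // anything except lowercase letters and whitespace

console.log(digits);    // [ '4', '2', '3' ]
console.log(capitals);  // [ 'R', 'B' ]
console.log(nonWord);   // [ ' ', ',', ' ' ]
console.log(others);    // [ 'R', '4', '2', 'B', ',', '_', '3' ]
```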

Quantifiers

Quantifiers control how many times a character or group must appear:

*     - 0 or more
+     - 1 or more
?     - 0 or 1 (optional)
{3}   - exactly 3
{2,5} - 2 to 5
{3,}  - 3 or more

Examples:

\d+       - one or more digits: "42", "1000"
[a-z]{3}  - exactly 3 lowercase letters: "cat", "dog"
colou?r   - "color" or "colour" (u is optional)
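
The sketch below runs these quantifiers against a few made-up strings. `firstMatch` is a hypothetical helper written for this example, not a built-in.

```javascript
// Hypothetical helper: return the first matched text, or null if nothing matches.
function firstMatch(re, text) {
  const m = text.match(re);
  return m ? m[0] : null;
}

console.log(firstMatch(/\d+/, "order 1000 shipped"));       // 1000
console.log(firstMatch(/[a-z]{3}/, "DOG cat"));             // cat
console.log(firstMatch(/colou?r/, "my favourite colour"));  // colour
console.log(firstMatch(/\d{2,5}/, "id 1234567"));           // 12345 (stops at the upper bound)
console.log(firstMatch(/\d{3,}/, "a 12 b"));                // null (only two digits in a row)
```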

Anchors

Anchors don't match characters; they match positions:

^  - start of string (or line with m flag)
$  - end of string (or line with m flag)
\b - word boundary

Examples:

^hello    - "hello" at the start of the string
world$    - "world" at the end
^hello$   - string is exactly "hello", nothing else
\bcat\b   - "cat" as a whole word, not "category" or "catfish"
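
A short sketch of how anchors change what matches, including the effect of the m flag on a multi-line string. The inputs are illustrative.

```javascript
console.log(/^hello$/.test("hello"));        // true
console.log(/^hello$/.test("hello world"));  // false: extra text after "hello"
console.log(/\bcat\b/.test("the cat sat"));  // true
console.log(/\bcat\b/.test("category"));     // false: no word boundary after "cat"

const lines = "one\ntwo\nthree"; // hypothetical three-line input
console.log(/^two$/.test(lines));      // false: ^ and $ refer to the whole string
console.log(/^two$/m.test(lines));     // true: with m they also match at line breaks
console.log(lines.match(/^\w+$/gm));   // [ 'one', 'two', 'three' ]
```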

Groups and Alternation

Parentheses create a capturing group:

(ab)+     - one or more repetitions of "ab": "ab", "abab", "ababab"
(cat|dog) - matches "cat" OR "dog"

Non-capturing group (for grouping without capturing):

(?:cat|dog)+s    - "cats", "dogs", "catcats"
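
To make the difference between capturing and non-capturing groups concrete, here is a sketch using a made-up date string. In JavaScript, captured groups show up as numbered entries on the match result.

```javascript
const withGroups = "2025-09-10".match(/(\d{4})-(\d{2})-(\d{2})/);
console.log(withGroups[0]);  // 2025-09-10 (the whole match)
console.log(withGroups[1]);  // 2025
console.log(withGroups[3]);  // 10

// (?:...) groups without capturing, so the month becomes group 1.
const monthOnly = "2025-09-10".match(/(?:\d{4})-(\d{2})/);
console.log(monthOnly[1]);   // 09

// Alternation inside a group, repeated by the + quantifier.
const pets = /^(?:cat|dog)+s$/;
console.log(pets.test("catdogs"));  // true
console.log(pets.test("cows"));     // false
```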

Flags

Flags modify regex behavior:

/pattern/g   - global: find all matches (not just first)
/pattern/i   - case-insensitive: "Cat" matches /cat/i
/pattern/m   - multiline: ^ and $ match line boundaries
/pattern/gi  - global + case-insensitive
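
A quick sketch of how the g and i flags change the result of `String.prototype.match` on the same made-up input:

```javascript
const text = "Cat, cat, CAT"; // hypothetical input

console.log(text.match(/cat/)[0]);   // cat (first case-sensitive match only)
console.log(text.match(/cat/i)[0]);  // Cat (first match, ignoring case)
console.log(text.match(/cat/g));     // [ 'cat' ]
console.log(text.match(/cat/gi));    // [ 'Cat', 'cat', 'CAT' ]
```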

10 Practical Regex Patterns

1. Email address (simplified):

/^[a-zA-Z0-9._%+-]+@[a-zA-Z0-9.-]+\.[a-zA-Z]{2,}$/

2. US phone number:

/^\+?1?[-.\s]?\(?[0-9]{3}\)?[-.\s]?[0-9]{3}[-.\s]?[0-9]{4}$/

Matches: (555) 123-4567, 555-123-4567, 5551234567, +1 555 123 4567
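
One way to try the phone pattern against the formats listed above is sketched below. `toDigits` is a hypothetical helper added for illustration; it strips formatting only from inputs the pattern accepts.

```javascript
const US_PHONE = /^\+?1?[-.\s]?\(?[0-9]{3}\)?[-.\s]?[0-9]{3}[-.\s]?[0-9]{4}$/;

const samples = ["(555) 123-4567", "555-123-4567", "5551234567", "+1 555 123 4567", "555-1234"];
for (const s of samples) {
  console.log(s, US_PHONE.test(s)); // only the last sample (7 digits) is rejected
}

// Hypothetical helper: keep just the digits of a valid number, or return null.
function toDigits(input) {
  return US_PHONE.test(input) ? input.replace(/\D/g, "") : null;
}

console.log(toDigits("(555) 123-4567"));  // 5551234567
console.log(toDigits("555-1234"));        // null
```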

3. URL:

/https?:\/\/(www\.)?[-a-zA-Z0-9@:%._+~#=]{2,256}\.[a-z]{2,6}\b([-a-zA-Z0-9@:%_+.~#?&/=]*)/

4. Date (YYYY-MM-DD):

/^\d{4}-(0[1-9]|1[0-2])-(0[1-9]|[12]\d|3[01])$/
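
Note that this pattern checks only the shape of the date, so "2025-02-31" passes. A minimal sketch of one way to add a calendar check on top of it follows; `isRealDate` is a hypothetical helper, and it assumes the built-in `Date` object is an acceptable source of truth.

```javascript
const ISO_DATE = /^\d{4}-(0[1-9]|1[0-2])-(0[1-9]|[12]\d|3[01])$/;

// Hypothetical helper: the regex checks the format, Date checks the calendar.
function isRealDate(text) {
  if (!ISO_DATE.test(text)) return false;
  const [y, m, d] = text.split("-").map(Number);
  const date = new Date(0);
  date.setUTCFullYear(y, m - 1, d); // impossible days roll over into the next month
  return date.getUTCFullYear() === y &&
         date.getUTCMonth() === m - 1 &&
         date.getUTCDate() === d;
}

console.log(ISO_DATE.test("2025-02-31"));  // true: the format is fine
console.log(isRealDate("2025-02-31"));     // false: February has no 31st
console.log(isRealDate("2024-02-29"));     // true: 2024 is a leap year
```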

5. IP address (IPv4):

/^((25[0-5]|2[0-4]\d|[01]?\d\d?)\.){3}(25[0-5]|2[0-4]\d|[01]?\d\d?)$/

6. Hex color code:

/^#([0-9A-Fa-f]{6}|[0-9A-Fa-f]{3})$/

7. Postal code (US ZIP):

/^\d{5}(-\d{4})?$/

8. Strong password:

/^(?=.*[a-z])(?=.*[A-Z])(?=.*\d)(?=.*[!@#$%]).{12,}$/

(The lookaheads require at least one lowercase letter, one uppercase letter, one digit, and one special character from !@#$%. The final .{12,} requires at least 12 characters.)
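
A single pattern answers only yes or no. If you want to tell the user which rule failed, one option is to split the same checks into separate patterns, as in this sketch. The `rules` table and `missingRules` helper are hypothetical, and the sketch assumes single-line passwords.

```javascript
const STRONG = /^(?=.*[a-z])(?=.*[A-Z])(?=.*\d)(?=.*[!@#$%]).{12,}$/;

// Hypothetical rule table mirroring the lookaheads above, one check per rule.
const rules = [
  ["lowercase letter", /[a-z]/],
  ["uppercase letter", /[A-Z]/],
  ["digit", /\d/],
  ["special character (!@#$%)", /[!@#$%]/],
  ["at least 12 characters", /^.{12,}$/],
];

// Return the names of the rules a password does not satisfy.
function missingRules(password) {
  return rules.filter(([, re]) => !re.test(password)).map(([name]) => name);
}

console.log(STRONG.test("Abcdefgh1234!"));  // true
console.log(missingRules("abcdefgh1234"));  // [ 'uppercase letter', 'special character (!@#$%)' ]
```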

9. Extract content between tags:

/<h1>(.*?)<\/h1>/g
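
To collect the captured text from every match, one option in JavaScript is `String.prototype.matchAll`, sketched here on a made-up snippet. This works for simple, well-formed fragments; it is not a substitute for an HTML parser.

```javascript
const html = "<h1>First</h1><p>text</p><h1>Second</h1>"; // hypothetical input

// matchAll yields one result per match; index 1 holds the captured group.
const titles = [...html.matchAll(/<h1>(.*?)<\/h1>/g)].map((m) => m[1]);

console.log(titles);  // [ 'First', 'Second' ]
```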

10. Remove extra whitespace:

/\s+/g → replace with " "
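
In JavaScript this replacement might look like the sketch below. The `trim()` call is an extra step, not part of the pattern: it removes the single space that would otherwise remain at each end.

```javascript
const messy = "  too    many\t\tspaces \n here  "; // hypothetical input

const tidy = messy.replace(/\s+/g, " ").trim();

console.log(tidy);  // too many spaces here
```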

Greedy vs Lazy Matching

By default, quantifiers are greedy: they match as much as possible.

Pattern: <.+>
String:  <div>Hello</div>
Match:   <div>Hello</div>  (the whole thing!)

Adding ? after a quantifier makes it lazy, so it matches as little as possible:

Pattern: <.+?>
String:  <div>Hello</div>
Matches: <div> and </div> separately
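
The same comparison as runnable JavaScript. The last line shows a negated character class, a common alternative to a lazy quantifier when the closing delimiter is a single character.

```javascript
const html = "<div>Hello</div>";

console.log(html.match(/<.+>/)[0]);    // <div>Hello</div> (greedy)
console.log(html.match(/<.+?>/g));     // [ '<div>', '</div>' ] (lazy)
console.log(html.match(/<[^>]+>/g));   // [ '<div>', '</div>' ] (negated class)
```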

Common Pitfalls

  1. Forgetting to escape special characters: ., *, +, ?, (, ), [, ], {, }, \, ^, $, | all have special meanings. Use \. to match a literal dot.

  2. Greedy vs lazy: If your match is too long, add ? after the quantifier.

  3. Catastrophic backtracking: Some regex patterns on certain inputs can take exponential time. Avoid nested quantifiers like (a+)+.

  4. Forgetting anchors for full-string match: \d+ matches any string that contains digits, including "abc123xyz". Use ^\d+$ to ensure the entire string is digits.
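
Pitfalls 1, 3, and 4 can be sketched in a few lines of JavaScript. `escapeRegExp` is a hypothetical helper, not a built-in function; it backslash-escapes the special characters listed in pitfall 1 so that arbitrary text can be used as a literal pattern.

```javascript
// Pitfall 1: an unescaped dot matches any character.
console.log(/^3.50$/.test("3x50"));   // true (probably not intended)
console.log(/^3\.50$/.test("3x50"));  // false

// Hypothetical helper for building a pattern from literal text.
function escapeRegExp(text) {
  return text.replace(/[.*+?^${}()|[\]\\]/g, "\\$&");
}
const literal = new RegExp("^" + escapeRegExp("3.50") + "$");
console.log(literal.test("3.50"));  // true
console.log(literal.test("3x50"));  // false

// Pitfall 3: (a+)+ and a+ accept the same strings, but the nested form
// can backtrack heavily on long inputs that almost match.
console.log(/^(a+)+$/.test("aaa"), /^a+$/.test("aaa"));  // true true

// Pitfall 4: without anchors, a partial match is enough.
console.log(/\d+/.test("abc123xyz"));    // true
console.log(/^\d+$/.test("abc123xyz"));  // false
```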

Use our free Regex Tester to test and debug your regular expressions with live match highlighting.
