The first lesson showed you what regex is for. This one gives you the actual building blocks. By the end, you will be able to read and write the patterns that cover 80% of everyday use cases. Everything here applies directly to JavaScript, but the same syntax works in Python, Ruby, Java, and most other languages.
Literal characters
The simplest possible regex is a sequence of ordinary letters or digits. They match themselves, exactly, wherever they appear in the string.
/cat/.test('cat'); // true - exact match
/cat/.test('concatenate'); // true - contains "cat"
/cat/.test('dog'); // false - no "cat" anywhere
Notice that /cat/ does not require the entire string to be "cat". It just checks whether the string contains that sequence somewhere. If you want an exact full-string match, you need anchors; more on those shortly.
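To make the contains-check behaviour concrete, here is a small sketch that compares a literal regex with `String.prototype.includes` and uses `search` to show where the match begins. The sample strings are illustrative only; the `'Cat'` entry is included to show that literal characters are matched case-sensitively.

```javascript
// A literal pattern behaves like a substring search.
const samples = ['cat', 'concatenate', 'dog', 'Cat'];

const results = samples.map((text) => ({
  text,
  regexHit: /cat/.test(text),        // does "cat" occur anywhere in the string?
  includesHit: text.includes('cat'), // plain substring check gives the same answer
  index: text.search(/cat/),         // position of the first match, or -1 if none
}));

console.log(results);
```

For a fixed piece of text like this, `includes` is usually the simpler tool. The regex form starts to pay off once the pattern needs anything beyond literal characters.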
Escaping special characters
The following characters have special meaning in regex: . * + ? ^ $ { } [ ] \ | ( ). To match any of them literally, prefix it with a backslash.
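When the literal text is only known at runtime (user input, a config value), the same backslash rule can be applied programmatically. The sketch below is one way to do that; `escapeRegExp` is a hypothetical helper name, not something built into the language or defined elsewhere in this lesson.

```javascript
// Hypothetical helper (an assumption, not part of the lesson):
// put a backslash in front of every character regex treats as special.
function escapeRegExp(text) {
  // '$&' in the replacement stands for "the character that was matched".
  return text.replace(/[.*+?^${}()|[\]\\]/g, '\\$&');
}

const price = '$5.00';
const literalPrice = new RegExp(escapeRegExp(price));

console.log(escapeRegExp(price));        // \$5\.00
console.log(literalPrice.test('$5.00')); // true
console.log(literalPrice.test('$5X00')); // false - the escaped . only matches a real dot
```

Hand-written patterns follow the same rule, only with the backslashes typed directly into the regex literal.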
/\$5\.00/.test('$5.00'); // true - matches literal $ and .
/\$5\.00/.test('X5Y00'); // false - escaped $ and . match only themselves
Forgetting to escape . is one of the most common regex bugs. Unescaped, . matches any character, so /3.14/ would match "3X14" just as happily as "3.14".
The dot wildcard
The . metacharacter matches any single character except a newline. Think of it as a one-character blank that accepts anything.
/c.t/.test('cat'); // true - 'a' fills the blank
/c.t/.test('cut'); // true - 'u' fills the blank
/c.t/.test('ct'); // false - nothing between c and t
/c.t/.test('cart'); // false - two characters between, not one
Character classes
Predefined shorthand classes
Rather than listing every digit individually, regex provides shorthand classes that cover common groups.
| Shorthand | Matches | Opposite | Matches |
|---|---|---|---|
| \d | Any digit (0-9) | \D | Any non-digit |
| \w | Word character (a-z, A-Z, 0-9, _) | \W | Any non-word character |
| \s | Whitespace (space, tab, newline) | \S | Any non-whitespace |
// \d - any digit (0–9)
/\d/.test('5'); // true
/\d/.test('a'); // false
// \w - any word character (a–z, A–Z, 0–9, underscore)
/\w/.test('_'); // true
/\w/.test('-'); // false
// \s - any whitespace (space, tab, newline)
/\s/.test('\t'); // true
/\s/.test('a'); // false
// Uppercase = the opposite
/\D/.test('a'); // true - NOT a digit
/\W/.test('@'); // true - NOT a word character
/\S/.test('a'); // true - NOT whitespace
Custom character classes with [ ]
Square brackets let you define your own set of allowed characters. The pattern matches if the input character is any one of the listed options.
/[aeiou]/.test('hello'); // true - 'e' and 'o' are vowels
/[aeiou]/.test('rhythm'); // false - no vowels at all
/[a-z]/.test('A'); // false - uppercase not in the range
/[a-zA-Z]/.test('A'); // true - either case is fine
Negate a class by starting it with ^ inside the brackets:
/[^aeiou]/.test('x'); // true - 'x' is not a vowel
/[^0-9]/.test('5'); // false - '5' is in 0–9
Anchors
Anchors do not match characters. They match positions within the string. This is the key to requiring an exact match rather than just a substring match.
// ^ - match must start at the beginning of the string
/^hello/.test('hello world'); // true
/^hello/.test('say hello'); // false - 'hello' is not at the start
// $ - match must end at the end of the string
/world$/.test('hello world'); // true
/world$/.test('worldwide'); // false - string continues after 'world'
// Together: the entire string must equal 'hello'
/^hello$/.test('hello'); // true
/^hello$/.test('hello world'); // false
Combining ^ and $ is what turns a contains-check into an exact-match check. Forgetting them is a classic validation bug: /\d{4}/.test('abc1234xyz') returns true even though the string is clearly not a four-digit number.
Quantifiers
Quantifiers attach to the token immediately before them and control how many times it must appear.
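The phrase "the token immediately before" is worth seeing in action, because it explains most quantifier surprises. A minimal sketch, with the `checks` object and its property names chosen purely for illustration:

```javascript
// A quantifier binds to the single token directly before it.
const checks = {
  // '+' repeats only 'b', so "a" followed by several b's passes.
  repeatsOnlyB: /^ab+$/.test('abbb'),
  // "ab" is not treated as a unit, so a repeated "ab" fails.
  repeatsAbPair: /^ab+$/.test('abab'),
  // A character class counts as one token, so the whole set repeats.
  repeatsClass: /^[ab]+$/.test('abab'),
  // Shorthand classes are single tokens too.
  repeatsDigits: /^\d+$/.test('2024'),
};

console.log(checks);
// { repeatsOnlyB: true, repeatsAbPair: false, repeatsClass: true, repeatsDigits: true }
```

The patterns are anchored with ^ and $ so that each result reflects the whole string rather than a matching fragment inside it.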
// * - zero or more
/a*/.test(''); // true - zero 'a's is fine
/a*/.test('aaaa'); // true
// + - one or more
/a+/.test(''); // false - needs at least one
/a+/.test('aaaa'); // true
// ? - zero or one (makes something optional)
/colou?r/.test('color'); // true - 'u' is absent
/colou?r/.test('colour'); // true - 'u' is present
// {n} - exactly n times
/\d{4}/.test('2024'); // true
/\d{4}/.test('202'); // false - only 3 digits
// {n,} - n or more
/\d{2,}/.test('1'); // false - only 1 digit
/\d{2,}/.test('12345'); // true
// {n,m} - between n and m times
/\d{2,4}/.test('1'); // false - below minimum
/\d{2,4}/.test('12'); // true
/\d{2,4}/.test('1234'); // true
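/\d{2,4}/.test('12345'); // true - unanchored, so the first four digits still match
/^\d{2,4}$/.test('12345'); // false - above maximum once the pattern is anchored
Quick reference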
| Token | Meaning | Example |
|---|---|---|
| . | Any character except newline | /c.t/ matches "cat", "cut" |
| \d | Digit (0–9) | /\d{3}/ matches "123" |
| \w | Word character (a-z, A-Z, 0-9, _) | /\w+/ matches "hello_world" |
| \s | Whitespace | /\s+/ matches spaces and tabs |
| [abc] | Any of a, b, or c | /[aeiou]/ matches any vowel |
| [^abc] | Anything except a, b, or c | /[^0-9]/ matches non-digits |
| ^ | Start of string | /^hello/ requires start |
| $ | End of string | /world$/ requires end |
| * | Zero or more | /a*/ matches "", "a", "aaa" |
| + | One or more | /a+/ matches "a", "aaa" |
| ? | Zero or one | /colou?r/ matches both spellings |
| {n,m} | Between n and m times | /\d{2,4}/ matches 2 to 4 digits |
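The tokens in the table combine into small validators. The sketch below uses only anchors, classes, and quantifiers from this lesson. The function names and the rules they enforce (a four-digit PIN, a six-digit hex colour, a 3 to 16 character username) are assumptions made up for illustration, not requirements from any real system.

```javascript
// Hypothetical validators built only from tokens in the quick reference.
const isFourDigitPin = (text) => /^\d{4}$/.test(text);        // exactly 4 digits
const isHexColor = (text) => /^#[0-9a-fA-F]{6}$/.test(text);  // '#' plus 6 hex digits
const isUsername = (text) => /^\w{3,16}$/.test(text);         // 3 to 16 word characters

console.log(isFourDigitPin('1234'));       // true
console.log(isFourDigitPin('abc1234xyz')); // false - anchors reject the extra text
console.log(isHexColor('#1a2B3c'));        // true
console.log(isHexColor('#12345g'));        // false - 'g' is outside the class
console.log(isUsername('ada_99'));         // true
console.log(isUsername('has space'));      // false - space is not a word character
```

Each pattern is wrapped in ^ and $, so the whole input has to fit the rule; a valid-looking fragment inside a longer string is not enough.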
// Testing basic regex patterns
const tests = [
  {
    // Phone number shape: 3 digits, 3 digits, 4 digits, separated by hyphens
    pattern: /^\d{3}-\d{3}-\d{4}$/,
    valid: ['555-123-4567', '000-000-0000'],
    invalid: ['55-123-4567', '5551234567', '555-12-4567']
  },
  {
    // Simplified email shape check, not a full validator.
    // Note: | inside [ ] is a literal pipe, so the final class is [A-Za-z].
    pattern: /^[A-Za-z0-9._%+-]+@[A-Za-z0-9.-]+\.[A-Za-z]{2,}$/,
    valid: ['user@example.com', 'test.user@sub.domain.org'],
    invalid: ['invalid', '@example.com', 'user@']
  },
  {
    // Exactly five word characters; anchored so longer strings are rejected
    pattern: /^\w{5}$/,
    valid: ['hello', 'world'],
    invalid: ['hi', 'hello world', 'helloworld']
  }
];

tests.forEach(({ pattern, valid, invalid }) => {
  console.log(`\nPattern: /${pattern.source}/`);
  // Every string in `valid` should match the pattern
  valid.forEach(str => {
    const match = pattern.test(str);
    console.log(`  ${match ? '✅' : '❌'} "${str}" should match`);
  });
  // Every string in `invalid` should be rejected
  invalid.forEach(str => {
    const match = pattern.test(str);
    console.log(`  ${!match ? '✅' : '❌'} "${str}" should NOT match`);
  });
});
// Extracting with groups
const datePattern = /(\d{4})-(\d{2})-(\d{2})/;
const match = '2024-03-15'.match(datePattern);
if (match) {
  console.log('\nDate extraction:');
  console.log('  Full match:', match[0]); // '2024-03-15'
  console.log('  Year:', match[1]);       // '2024'
  console.log('  Month:', match[2]);      // '03'
  console.log('  Day:', match[3]);        // '15'
}
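The captured groups arrive as strings, so code that needs numbers has to convert them. One possible way to package the extraction above is sketched here; `parseIsoDate` is a hypothetical helper name, and the anchored pattern is an assumption added so that surrounding text is rejected.

```javascript
// Hypothetical helper: extract year, month, and day as numbers.
function parseIsoDate(text) {
  const match = /^(\d{4})-(\d{2})-(\d{2})$/.exec(text);
  if (!match) return null; // the input does not have the YYYY-MM-DD shape

  // Skip match[0] (the full match) and take the three captured groups.
  const [, year, month, day] = match;
  return { year: Number(year), month: Number(month), day: Number(day) };
}

console.log(parseIsoDate('2024-03-15')); // { year: 2024, month: 3, day: 15 }
console.log(parseIsoDate('15/03/2024')); // null
```

The pattern checks only the shape of the text. A string such as '2024-99-99' still passes, so range checks on the month and day would have to be done separately.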