mattvr/Regex_Guide.md

## Regex_Guide.md

      
Regex


In computing, regular expressions (regex) are a powerful search tool that helps you find, replace, or match text in a document based on defined patterns.
Think of regex as a Swiss Army knife for working with text, most often used for pattern matching and flexible searches over text and documents.
It looks a bit like an alien language, but there are only about 10-20 "words" or concepts you need to understand to be effective. See the Table of Elements below for most of the language.

Examples


Phone number regex
/\d{3}-\d{3}-\d{4}/

The above is a quick example that matches phone numbers in the XXX-XXX-XXXX format.
Reading from left to right:

The `/ /` slashes surround a regex by convention; they are delimiters, not part of the pattern.
`\d` matches a digit.
`{3}` means the preceding element (a digit in this case) must be repeated exactly 3 times.
`-` matches the literal dash character.
The digit blocks then repeat for another 3 and 4 digits.
This pattern matches any text in that format, anywhere in the input. All other text is ignored.
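
The walkthrough above can be tried out with a minimal Python sketch. The helper name `find_phone_numbers` and the sample text are made up for illustration, and Python writes the pattern as a raw string without the surrounding slashes.

```python
import re

# The guide's pattern as a Python raw string; the / / delimiters are dropped.
PHONE = re.compile(r"\d{3}-\d{3}-\d{4}")

def find_phone_numbers(text: str) -> list[str]:
    """Return every XXX-XXX-XXXX substring in text (hypothetical helper)."""
    return PHONE.findall(text)

# Made-up sample text for illustration.
sample = "Call 555-867-5309 or 212-555-0123 before 5pm. Ref 12-34."
print(find_phone_numbers(sample))  # ['555-867-5309', '212-555-0123']
```

Because the pattern has no anchors or word boundaries, it also matches inside a longer run of digits: `find_phone_numbers("1234-567-89012")` returns `['234-567-8901']`. Wrapping the pattern in `\b` on both sides is one way to avoid that.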


Match "gray" or "grey"
/gr[ae]y/


`[ae]` matches exactly one character from the set in the brackets, here either "a" or "e".
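
As a small sketch of the character set in Python (the word list is made up), `fullmatch` checks whole words while `search` also accepts the pattern inside a longer word:

```python
import re

GRAY = re.compile(r"gr[ae]y")

# Made-up sample words.
words = ["gray", "grey", "gry", "graey", "greyhound"]

# fullmatch requires the whole word to fit the pattern.
exact = [w for w in words if GRAY.fullmatch(w)]
# search accepts the pattern anywhere inside the word.
partial = [w for w in words if GRAY.search(w)]

print(exact)    # ['gray', 'grey']
print(partial)  # ['gray', 'grey', 'greyhound']
```

"gry" and "graey" fail both checks because the set must consume exactly one character.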


Match "earth", "wind", OR "fire"
/(earth|wind|fire)/


The `|` bars inside the parentheses separate the alternatives; the pattern matches any one of them.
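
A short Python sketch of the alternation, using a made-up sentence. The second call is a variant that is not part of the guide's pattern: it adds `\b` word boundaries and case-insensitive matching to show how the results change.

```python
import re

# Made-up sample sentence.
text = "Earth, wind and fire; firewood and windmill."

# Pattern from the guide: case-sensitive, and it matches inside longer words too.
print(re.findall(r"(earth|wind|fire)", text))
# ['wind', 'fire', 'fire', 'wind']

# Variant (an addition, not from the guide): whole words only, any letter case.
whole_words = re.findall(r"\b(earth|wind|fire)\b", text, flags=re.IGNORECASE)
print(whole_words)
# ['Earth', 'wind', 'fire']
```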


Match everything up to "The End."
/^.*The End\./


`^` anchors the match at the start of the line.
`.*` matches any character (except a newline), zero or more times.
Because `.*` is greedy, the match runs up to the last instance of "The End." on the line.
`\.` matches a literal dot rather than the "match any character" dot, so the pattern does not match "The End!", for example.
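
To see the greedy behavior and the escaped dot in action, here is a minimal Python sketch. The helper `up_to_the_end` and the sample line are hypothetical.

```python
import re
from typing import Optional

def up_to_the_end(line: str) -> Optional[str]:
    """Return the text from the start of line through the last 'The End.', if any."""
    match = re.search(r"^.*The End\.", line)
    return match.group() if match else None

# Made-up sample line containing "The End." twice.
sample = "Part one. The End. Epilogue. The End. Credits roll"
print(up_to_the_end(sample))      # Part one. The End. Epilogue. The End.
print(up_to_the_end("The End!"))  # None
```

The first result stops at the second "The End." because `.*` consumes as much as it can before backing off.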

Tips


| Tip | Name | Details |
| --- | --- | --- |
| Use Regex101 for testing & development. | Regex Testing | Use a tool like Regex101 that allows for immediate feedback while learning and experimenting with regex patterns. |
| Test iteratively | Iterative Development | Build and validate your regex incrementally, testing each section before adding more complexity. |


Table of Elements


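Characters

| Element | Description | Example |
| --- | --- | --- |
| `\d` | Digit (0-9) | `\d{2}` matches "12" |
| `\w` | Word character (letter, digit, or underscore) | `\w+` matches "Hello" |
| `\s` | Whitespace (space, tab, newline) | `\s` matches " " |
| `[abc]` | Match a, b, or c | `[AaBb]+` matches "Baba" |
| `[A-Z]` | Any uppercase letter from A to Z | `[A-Z]+` matches "HELLO" |
| `[^A-Z]` | Anything but an uppercase letter | `[^A-Z]+` matches "hello" |
| `.` | Any character (except a newline, by default) | `.` matches "a" |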
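
The character elements can be compared side by side with a short Python sketch. The sample string is made up, and `findall` is used so that every match of each pattern is visible.

```python
import re

# Made-up sample string.
sample = "Order 66 shipped to Baba at HQ"

patterns = [r"\d{2}", r"\w+", r"\s", r"[AaBb]+", r"[A-Z]+", r"[^A-Z]+"]
for pattern in patterns:
    # Print each pattern next to the list of substrings it matches.
    print(f"{pattern:10} {re.findall(pattern, sample)}")

# By default the dot matches any character except a newline.
print(re.findall(r".", "a\nb"))  # ['a', 'b']
```

For example, `[A-Z]+` yields `['O', 'B', 'HQ']` here, while `[^A-Z]+` yields the two runs of text between those uppercase letters.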


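Quantifiers

| Element | Description | Example |
| --- | --- | --- |
| `.*` | Match any character zero or more times | `.*` matches "Hello 123" |
| `.?` | Match 0 or 1 times | `a?` matches "a" or "" |
| `.+` | Match 1 or more times | `a+` matches "aaa" |
| `.{3}` | Match exactly 3 times | `a.{3}` matches "abbb" |
| `.{2,4}` | Match between 2 and 4 times | `a.{2,4}` matches "azzzz" |
| `.{2,}` | Match 2 or more times | `a.{2,}` matches "azxzxz" |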
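
One way to check the quantifier rows is to test whole strings against each pattern. This is a sketch in Python; `fully_matches` is a hypothetical helper, and the extra failing cases are added for contrast.

```python
import re

def fully_matches(pattern: str, text: str) -> bool:
    """True when the whole of text fits pattern (hypothetical helper)."""
    return re.fullmatch(pattern, text) is not None

cases = [
    (r".*", "Hello 123"),       # any characters, any count
    (r"a?", ""),                # zero occurrences is allowed
    (r"a?", "aa"),              # two is too many for ?
    (r"a+", "aaa"),
    (r"a.{3}", "abbb"),
    (r"a.{3}", "abb"),          # only 2 characters after the a
    (r"a.{2,4}", "azzzz"),
    (r"a.{2,4}", "azzzzz"),     # 5 characters after the a
    (r"a.{2,}", "azxzxz"),
]
for pattern, text in cases:
    print(f"{pattern:10} {text!r:12} {fully_matches(pattern, text)}")
```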


Capture Groups

| Element | Description | Example |
| --- | --- | --- |
| `(\w+)` | Save text into a numbered capture group | `(\w+)` matches "word" and captures "word" |
| `(?<name>\w+)` | Save text into a capture group called "name" | `(?<name>\w+)` captures "word" as "name" |
| `(a\|b)` | Match either a or b (and capture) | `(abc\|def)` matches "abc" or "def" and captures it |
| `(?:...)` | Group without saving into a capture group | `(?:a\|b)` matches "a" or "b" but doesn't capture |
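
Here is a hedged Python sketch of the four kinds of group. One flavor difference to be aware of: Python's `re` module spells named groups `(?P<name>...)` rather than `(?<name>...)`. The sample line and the group names `key` and `value` are made up.

```python
import re

# Made-up log-style line.
line = "user=alice action=login"

# Numbered groups: group(0) is the whole match, group(1) and group(2) the captures.
m = re.search(r"(\w+)=(\w+)", line)
print(m.group(0), m.group(1), m.group(2))  # user=alice user alice

# Named groups, written (?P<name>...) in Python's re module.
named = re.search(r"(?P<key>\w+)=(?P<value>\w+)", line)
print(named.groupdict())  # {'key': 'user', 'value': 'alice'}

# Alternation inside a capturing group: findall returns what the group captured.
print(re.findall(r"(abc|def)", "abcdefabc"))  # ['abc', 'def', 'abc']

# A non-capturing group takes part in matching but is left out of groups().
print(re.search(r"(?:a|b)(\d)", "b7").groups())  # ('7',)
```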


Lookarounds

| Element | Description | Example |
| --- | --- | --- |
| `(?=...)` | Positive lookahead | `x(?=y)` matches "x" only if "x" is followed by "y" |
| `(?!...)` | Negative lookahead | `x(?!y)` matches "x" only if "x" is not followed by "y" |
| `(?<=...)` | Positive lookbehind | `(?<=x)y` matches "y" only if "y" is preceded by "x" |
| `(?<!...)` | Negative lookbehind | `(?<!x)y` matches "y" only if "y" is not preceded by "x" |
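
Lookarounds check the surrounding text without including it in the match. The sketch below prints match positions to make that visible; `spans` is a hypothetical helper and both sample strings are made up.

```python
import re

def spans(pattern: str, text: str) -> list[tuple[int, int]]:
    """Return the (start, end) position of every match (hypothetical helper)."""
    return [m.span() for m in re.finditer(pattern, text)]

# Index:  01234567
text = "xy xz yy"

print(spans(r"x(?=y)", text))   # [(0, 1)]          x followed by y
print(spans(r"x(?!y)", text))   # [(3, 4)]          x not followed by y
print(spans(r"(?<=x)y", text))  # [(1, 2)]          y preceded by x
print(spans(r"(?<!x)y", text))  # [(6, 7), (7, 8)]  y not preceded by x

# A more practical use: digits that directly follow a dollar sign.
print(re.findall(r"(?<=\$)\d+", "cost $42, qty 7"))  # ['42']
```

Every span is one character wide because the lookaround itself consumes nothing.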


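Predefined Classes

| Element | Description | Example |
| --- | --- | --- |
| `\D` | Not a digit | `\D+` matches "abc" |
| `\W` | Not a word character | `\W+` matches "@" |
| `\S` | Not a whitespace character | `\S+` matches "hello" |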
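
The negated classes are handy for cleanup tasks. A small Python sketch with made-up inputs, using `re.sub`, `re.split`, and `re.findall` from the standard library:

```python
import re

# \D: delete everything that is not a digit.
digits_only = re.sub(r"\D", "", "(555) 867-5309")
print(digits_only)  # 5558675309

# \W+: split on runs of non-word characters.
words = re.split(r"\W+", "hello, world! ok")
print(words)  # ['hello', 'world', 'ok']

# \S+: collect runs of non-whitespace characters.
tokens = re.findall(r"\S+", " a  b\tc\n")
print(tokens)  # ['a', 'b', 'c']
```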


Special Characters

| Element | Description | Example |
| --- | --- | --- |
| `\t` | Tab | `\t` matches a tab character |
| `\r` | Carriage return | `\r` matches a carriage return |
| `\n` | Newline | `\n` matches a newline character |
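
As an illustration of these escapes, the sketch below splits made-up tab-separated text into rows and cells. The pattern `\r?\n` is an assumption added here so that both Windows (`\r\n`) and Unix (`\n`) line endings are handled.

```python
import re

# Made-up tab-separated data with mixed line endings.
raw = "name\tage\r\nAda\t36\nLinus\t54"

# Split into lines on an optional carriage return plus a newline, then into cells on tabs.
rows = [re.split(r"\t", line) for line in re.split(r"\r?\n", raw)]
print(rows)  # [['name', 'age'], ['Ada', '36'], ['Linus', '54']]
```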


More

More tips


| Tip | Name | Details |
| --- | --- | --- |
| `.*?` instead of `.*` | Non-Greedy Matching | Add `?` after `+`, `*`, or `{}` to make the quantifier non-greedy (lazy), so it stops at the first possible match rather than the last. |
| `\` (before `.`, `( )`, `?`, etc.) | Escape Special Characters | Put `\` before a special character when you want to match it literally, for more precise matches. |
| `^` and `$` | Anchoring | Use `^` to match the start and `$` to match the end of the input (or of each line in multiline mode), preventing unexpected matches elsewhere. |
| `(?#comment)` | Comments | Add inline comments within your regex to explain complex sections and improve readability. Not every flavor supports this syntax; JavaScript, for example, does not. |
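
Each of these tips can be checked with a few lines of Python. This is a sketch with made-up inputs; the behavior shown for `^` is that of Python's `re` module, where per-line anchoring needs the `re.MULTILINE` flag.

```python
import re

# Greedy vs. non-greedy on a made-up HTML snippet.
html = "<b>bold</b> and <i>italic</i>"
print(re.findall(r"<.+>", html))   # ['<b>bold</b> and <i>italic</i>']
print(re.findall(r"<.+?>", html))  # ['<b>', '</b>', '<i>', '</i>']

# Escaping: an unescaped dot matches any character, an escaped one only ".".
print(re.findall(r"3.14", "3.14 3514"))   # ['3.14', '3514']
print(re.findall(r"3\.14", "3.14 3514"))  # ['3.14']

# Anchoring: ^ matches at the start of each line only with re.MULTILINE.
lines = "one two\nthree four"
print(re.findall(r"^\w+", lines))                      # ['one']
print(re.findall(r"^\w+", lines, flags=re.MULTILINE))  # ['one', 'three']

# Inline comments: the (?#...) text is ignored by the engine.
commented = re.search(r"\d{3}(?#prefix)-\d{4}(?#line number)", "555-1234")
print(commented.group())  # 555-1234
```

When the text to match comes from user input, `re.escape` escapes its special characters so it can be embedded in a pattern literally.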


More examples


Email Matching:
/^[a-zA-Z0-9_.+-]+@[a-zA-Z0-9-]+\.[a-zA-Z0-9-.]+$/
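
A minimal Python sketch of how this pattern might be used for validation. The helper name `looks_like_email` and the sample addresses are made up; the pattern is copied unchanged apart from dropping the slashes.

```python
import re

EMAIL = re.compile(r"^[a-zA-Z0-9_.+-]+@[a-zA-Z0-9-]+\.[a-zA-Z0-9-.]+$")

def looks_like_email(candidate: str) -> bool:
    """True when candidate fits the guide's email pattern (hypothetical helper)."""
    return EMAIL.match(candidate) is not None

# Made-up sample addresses.
for address in ["ada.lovelace+news@example.co.uk", "no-at-sign.example.com", "a@b"]:
    print(address, looks_like_email(address))
# ada.lovelace+news@example.co.uk True
# no-at-sign.example.com False
# a@b False
```

The pattern is permissive rather than a full address validator. For instance, it accepts `user@example..com` because the final character class allows consecutive dots.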


URL Matching:
/https?:\/\/(www\.)?[-a-zA-Z0-9@:%._\+~#=]{2,256}\.[a-z]{2,6}\b([-a-zA-Z0-9@:%_\+.~#?&//=]*)/
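
The URL pattern contains capture groups, so in Python `findall` would return tuples of groups rather than whole URLs. The sketch below uses `finditer` and `group(0)` instead; `extract_urls` and the sample text are hypothetical.

```python
import re

URL = re.compile(
    r"https?:\/\/(www\.)?[-a-zA-Z0-9@:%._\+~#=]{2,256}\.[a-z]{2,6}\b"
    r"([-a-zA-Z0-9@:%_\+.~#?&//=]*)"
)

def extract_urls(text: str) -> list[str]:
    """Return each full URL match; group(0) is the whole match (hypothetical helper)."""
    return [m.group(0) for m in URL.finditer(text)]

# Made-up sample text.
sample = "Docs: https://www.example.com/path?q=1 and http://sub.example.org, not ftp://example.com."
print(extract_urls(sample))
# ['https://www.example.com/path?q=1', 'http://sub.example.org']
```

One caveat: the trailing character class includes `.`, so a URL at the end of a sentence keeps the sentence's period. `extract_urls("Visit http://example.com.")` returns `['http://example.com.']`.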


Date in YYYY-MM-DD (years 1900-2019 only):
/\b(19[0-9]{2}|200[0-9]|201[0-9])-(0[1-9]|1[0-2])-(0[1-9]|[12][0-9]|3[01])\b/
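
The date pattern checks digit ranges only, so it accepts impossible dates such as 2015-02-31. A hedged Python sketch that pairs it with `datetime.date` to filter those out; `extract_dates` and the sample text are made up.

```python
import re
from datetime import date

DATE = re.compile(
    r"\b(19[0-9]{2}|200[0-9]|201[0-9])-(0[1-9]|1[0-2])-(0[1-9]|[12][0-9]|3[01])\b"
)

def extract_dates(text: str) -> list[date]:
    """Return real calendar dates that fit the pattern (hypothetical helper)."""
    found = []
    for year, month, day in DATE.findall(text):
        try:
            found.append(date(int(year), int(month), int(day)))
        except ValueError:
            # The regex passed it, but the calendar says no (e.g. February 31).
            continue
    return found

# Made-up sample: month 13, February 31, and year 2023 are all rejected.
sample = "2015-07-29 2016-13-01 2015-02-31 2023-01-15 1999-12-31"
print(extract_dates(sample))
# [datetime.date(2015, 7, 29), datetime.date(1999, 12, 31)]
```

2016-13-01 and 2023-01-15 are rejected by the regex itself, while 2015-02-31 is rejected by `date`.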


Extracting Hashtags:
/#\w+/
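
A quick Python sketch of hashtag extraction. The helper name `extract_hashtags` and the sample post are made up.

```python
import re

def extract_hashtags(text: str) -> list[str]:
    """Return every '#' followed by one or more word characters (hypothetical helper)."""
    return re.findall(r"#\w+", text)

# Made-up sample post.
post = "Loving #regex and #Python3! Issue #42, C# rocks."
print(extract_hashtags(post))  # ['#regex', '#Python3', '#42']
```

The `#` in "C#" is skipped because no word character follows it, while "#42" is picked up because digits count as word characters.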


Resources


RegExr - Another powerful tool for learning and testing regex.
Regular-Expressions.info - Comprehensive resource for learning regex.
Mozilla Developer Network (MDN) - For a solid understanding of regex in JavaScript.