Regex Tester Online
Test and debug regular expressions online. Match, replace, and validate regex patterns with real-time results.
Regex Tester: Free Online Regular Expression Debugger, Validator, and Reference
Regular expressions are simultaneously the most powerful string-manipulation tool in a developer's toolkit and the most likely to cause silent bugs, catastrophic performance failures, and "it works on my machine" mysteries. A regex that looks correct in your head can match far too much, far too little, or take exponential time on certain inputs, and none of this becomes obvious until you run it against real data.
Our free online regex tester gives you a live, interactive environment to write, test, and debug regular expressions with real-time match highlighting, capture group visualization, flag toggles, a named group inspector, match count display, and detailed per-match metadata including start/end indices. Every keystroke updates the results instantly. No need to modify source code and rerun a test suite just to see if your pattern works: paste your test string, type your regex, and see exactly what it matches in milliseconds.
Whether you are parsing log files, validating form inputs, extracting data from API responses, writing a search-and-replace, building a tokenizer, or studying regex syntax for the first time, this tool makes the invisible visible.
What Are Regular Expressions? A Complete Foundation
A regular expression (commonly abbreviated regex or regexp) is a sequence of characters that defines a pattern used for searching, matching, extracting, and transforming text. The theoretical foundations trace to the 1950s work of mathematician Stephen Kleene, who developed the concept of regular languages and regular sets. Ken Thompson implemented the first practical regex engine in the QED editor and later in Unix tools like grep (Global Regular Expression Print), bringing regex into everyday software development.
Today, regular expressions are built into every major programming language and are used in:
- Form validation (email, phone, postal code, credit card numbers)
- Log parsing and structured data extraction from unstructured text
- Search-and-replace operations in text editors and IDEs
- Lexical analysis and tokenization in compilers and interpreters
- URL routing and pattern matching in web frameworks
- Data cleaning and transformation pipelines
- Network security tools: intrusion detection patterns, WAF rules
- Text mining and natural language preprocessing
- Code analysis and linting rules
- Database text search (REGEXP_LIKE in MySQL, ~ operator in PostgreSQL)
The same core syntax, with minor variations, works across Python, JavaScript, Java, Go, Ruby, Perl, PHP, C++, Rust, .NET, Bash, and countless other environments. Learning regex is one of the highest-return technical investments a developer can make because it applies everywhere.
Regex Engines: How Pattern Matching Actually Works
Understanding the engine behind regex matching is essential for writing patterns that are both correct and performant. There are two main engine types:
NFA: Nondeterministic Finite Automaton
Most modern regex engines, including those in Python, Java, JavaScript, Perl, PHP, .NET, and Ruby, use NFA-based engines. NFA engines use backtracking: when the engine encounters a choice (like whether to match 'a' once or twice for a*), it tries one path. If that path fails, it backtracks and tries the other path. NFA engines support powerful features like backreferences, lookaheads, lookbehinds, and possessive quantifiers.
The power of backtracking comes with a cost: in the worst case, NFA engines can exhibit exponential time complexity on patterns with ambiguous quantifiers. This is the source of ReDoS (Regular Expression Denial of Service) vulnerabilities.
DFA: Deterministic Finite Automaton
DFA engines compile the pattern into a state machine that processes each character exactly once, guaranteeing linear time matching regardless of pattern complexity. The tradeoff: DFA engines cannot support backreferences or most lookaround assertions because these features require memory of what was matched, which DFAs do not have.
Google's RE2 engine is the most prominent implementation of this approach. It is used in Google's internal infrastructure, Go's regexp package follows the same design and syntax, and it is available as a library for Python, Java, and other languages. RE2 is safe to use with user-supplied patterns because it guarantees linear time.
POSIX-compliant tools like grep (without PCRE flags) and awk use DFA-based matching as well.
Regex Metacharacters: The Building Blocks
Fourteen characters have special meaning in most regex flavors and must be escaped with a backslash to match literally: . ^ $ * + ? { } [ ] \ | ( )
The Dot: Any Character
A dot (.) matches any single character except a newline by default. This is one of the most commonly misused regex constructs: using .* to match "anything" in the middle of a pattern often matches far more than intended due to greedy expansion. To match newlines as well, enable the dotAll (s) flag or use [\s\S] as a cross-engine alternative.
When you actually want to match a literal dot (e.g., in a filename pattern or version number), you must escape it: \. matches a literal dot while . matches any character. The pattern version 1.0 would match "version 100" (since . matches '0') while version 1\.0 matches only "version 1.0".
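The difference between the escaped and unescaped dot can be checked directly. This is a minimal sketch; the variable names (looseVersion, strictVersion, dotResults) are illustrative only.

```javascript
// Unescaped dot: "." matches any character, so "1.0" also matches "100".
const looseVersion = /version 1.0/;
// Escaped dot: "\." matches only a literal period.
const strictVersion = /version 1\.0/;

const versionSamples = ["version 1.0", "version 100", "version 1x0"];
const dotResults = versionSamples.map((input) => ({
  input,
  loose: looseVersion.test(input),
  strict: strictVersion.test(input),
}));

console.log(dotResults);
```

Only the first sample passes the strict pattern, while all three pass the loose one.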
Anchors: Position Matching
Anchors match positions in the string, not characters. They are zero-width, consuming no characters from the input:
- ^ matches the start of the string (or the start of each line with the m multiline flag)
- $ matches the end of the string (or the end of each line with m)
- \b matches a word boundary: the position between a word character (\w) and a non-word character (\W) or a string boundary
- \B matches a non-word-boundary position
- \A matches only at the start of the string (Python, .NET; not available in JavaScript)
- \Z matches at the end of the string or before a final newline in .NET, PCRE, and Java; in Python it matches only at the very end of the string
Without anchors, a pattern can match anywhere in the string. The pattern cat matches "concatenate", "catalog", "education". The pattern ^cat$ matches only the exact string "cat".
Word boundaries are particularly useful for whole-word matching. \bcat\b matches "cat" in "the cat sat" but not in "concatenate" or "category". This avoids false positives that are common when searching for short words that appear as substrings of longer words.
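To see how anchors and word boundaries change what counts as a match, the three variants discussed above can be run side by side. This is a small sketch with made-up variable names.

```javascript
const catInputs = ["cat", "the cat sat", "concatenate", "category"];

const anywhere = /cat/;       // no anchors: matches "cat" as a substring
const exact = /^cat$/;        // the whole string must be "cat"
const wholeWord = /\bcat\b/;  // "cat" must stand alone as a word

const anchorResults = catInputs.map((input) => ({
  input,
  anywhere: anywhere.test(input),
  exact: exact.test(input),
  wholeWord: wholeWord.test(input),
}));

console.log(anchorResults);
```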
Character Classes: Matching Sets of Characters
Shorthand Character Classes
These widely supported shorthands cover the most common character sets:
- \d: digit, equivalent to [0-9]
- \D: non-digit, equivalent to [^0-9]
- \w: word character, [a-zA-Z0-9_]
- \W: non-word character, [^a-zA-Z0-9_]
- \s: whitespace (space, tab, newline, carriage return, form feed, vertical tab)
- \S: non-whitespace
- \h: horizontal whitespace (PCRE, not JavaScript)
- \v: vertical whitespace (PCRE) or vertical tab character (JavaScript)
Note: with the Unicode (u) flag in JavaScript, \d still only matches ASCII digits 0-9, not Unicode digit characters from other scripts. For full Unicode digit matching, use \p{Decimal_Number} with the u flag and Unicode property escapes.
Custom Character Classes
Square brackets define a custom set: [aeiou] matches any vowel. [a-z] matches any lowercase ASCII letter. [a-zA-Z0-9] matches any alphanumeric character. [^aeiou] (negated with ^ at start) matches any non-vowel.
Inside a character class, most metacharacters lose their special meaning. [.] matches a literal dot, not any character. Exceptions that retain special meaning inside brackets: ] (closes the class), \ (escape), ^ at the start (negation), and - between characters (range). To include a literal - in a class, put it first, last, or escape it: [-aeiou] or [aeiou-].
Unicode Property Escapes (ES2018+)
With the u flag in JavaScript, Unicode property escapes let you match characters by their Unicode properties. Python's built-in re module does not support \p{...}; the third-party regex module does.
- \p{Letter}: any Unicode letter in any script
- \p{Decimal_Number}: any Unicode decimal digit
- \p{Script=Greek}: Greek script characters
- \p{Emoji}: emoji characters (note that this property also includes the ASCII digits, # and *)
These are invaluable for internationalized applications that need to validate or parse text in non-Latin scripts.
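A short sketch of how \d and the property escapes behave under the u flag in JavaScript. The test strings are written with escape sequences so the example does not depend on file encoding; all variable names are illustrative.

```javascript
// Arabic-Indic digits one, two, three (U+0661..U+0663).
const arabicIndic = "\u0661\u0662\u0663";
// Greek word made of U+03A9 U+03BC U+03AD U+03B3 U+03B1.
const greekWord = "\u03a9\u03bc\u03ad\u03b3\u03b1";

const asciiDigitsOnly = /^\d+$/u;                 // \d stays ASCII-only, even with u
const anyDecimalDigits = /^\p{Decimal_Number}+$/u; // any script's decimal digits
const anyLetters = /^\p{Letter}+$/u;
const greekOnly = /^\p{Script=Greek}+$/u;

const unicodeChecks = {
  asciiOnArabicIndic: asciiDigitsOnly.test(arabicIndic),
  propertyOnArabicIndic: anyDecimalDigits.test(arabicIndic),
  lettersOnAccented: anyLetters.test("h\u00e9llo"),
  greekOnGreek: greekOnly.test(greekWord),
  greekOnLatin: greekOnly.test("Omega"),
};

console.log(unicodeChecks);
```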
Quantifiers: Controlling Repetition in Depth
Basic Quantifiers
- *: zero or more occurrences
- +: one or more occurrences
- ?: zero or one occurrence (also makes quantifiers lazy when appended)
- {n}: exactly n occurrences
- {n,}: n or more occurrences
- {n,m}: between n and m occurrences (inclusive)
Greedy Quantifiers: The Default
By default, all quantifiers are greedy: they match as many characters as possible while still allowing the overall pattern to succeed. Consider the pattern <.+> applied to <b>bold text</b> and <i>italic</i>. A greedy .+ expands to match everything from the first < to the last >, capturing the entire string. The engine has to backtrack from the end of the string until it finds a position where > matches.
Lazy (Non-Greedy) Quantifiers
Adding ? after any quantifier makes it lazy, so it matches as few characters as possible: *?, +?, ??, {n,m}?. The pattern <.+?> matches <b> then stops, rather than continuing to the end of the string. Lazy quantifiers are useful when you need to match the shortest possible sequence between delimiters.
However, lazy quantifiers are not inherently faster than greedy ones; in many cases they are slower because the engine has to try many small expansions before finding one that lets the overall pattern match. For optimal performance, be specific: <[^>]+> (match anything that is not a closing angle bracket) is both more correct and more efficient than <.+?>.
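The three approaches can be compared on the same sample string. This is a minimal sketch, with variable names chosen for illustration.

```javascript
const sampleHtml = "<b>bold text</b> and <i>italic</i>";

// Greedy: .+ runs to the end, then backtracks to the last ">".
const greedyMatch = sampleHtml.match(/<.+>/)[0];

// Lazy: .+? stops at the first ">" that lets the pattern succeed.
const lazyMatch = sampleHtml.match(/<.+?>/)[0];

// Negated class: each tag is matched without relying on backtracking across ">".
const allTags = sampleHtml.match(/<[^>]+>/g);

console.log({ greedyMatch, lazyMatch, allTags });
```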
Possessive Quantifiers and Atomic Groups
Possessive quantifiers (*+, ++, ?+; supported in PCRE and Java, but not JavaScript) never backtrack: once they consume characters, those characters are committed. Atomic groups (?>...) (PCRE, Java, .NET) achieve the same effect. These constructs can prevent catastrophic backtracking in the parts of a pattern where they are applied, at the cost of sometimes not finding matches that would require backtracking. They are advanced optimizations for high-performance pattern matching.
Groups: Capturing, Non-Capturing, and Named
Capturing Groups
Parentheses create capturing groups that extract matched substrings. Groups are numbered 1, 2, 3 from left to right by their opening parenthesis. In JavaScript:
const match = '2024-03-15'.match(/(\d{4})-(\d{2})-(\d{2})/);
// match[1] = '2024', match[2] = '03', match[3] = '15'
Capturing groups also create backreference targets: \1 later in the pattern matches the same text that group 1 captured. This enables patterns like \b(\w+)\s+\1\b to detect doubled words ("the the", "is is").
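As a quick illustration of the doubled-word pattern, the sketch below collects every repeated word in a sentence with matchAll(). The sample sentence and variable names are made up for the example.

```javascript
// \1 must repeat exactly what group 1 captured, so "the the" matches
// but "the then" does not (the trailing \b blocks it).
const doubledWord = /\b(\w+)\s+\1\b/g;
const sentence = "Paris in the the spring is is lovely";

const doubles = [...sentence.matchAll(doubledWord)].map((m) => ({
  match: m[0],   // full matched text
  word: m[1],    // text captured by group 1
  index: m.index // where the match starts
}));

console.log(doubles);
```

The pattern is case-sensitive as written; adding the i flag would also catch "The the".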
Non-Capturing Groups
(?:...) groups elements for quantification or alternation without capturing. This matters when you want to apply a quantifier to a multi-character sequence without extracting it: (?:https?://)?www\. makes the protocol optional without creating a capture group for it. Non-capturing groups are also marginally more efficient than capturing ones for large patterns with many groups.
Named Capturing Groups
Named groups significantly improve readability of complex patterns:
- JavaScript (ES2018+): (?<year>\d{4}), accessed as match.groups.year
- Python: (?P<year>\d{4}), accessed as match.group('year')
- PCRE/.NET/Java: (?<year>\d{4})
Named backreferences: \k<year> (JavaScript, PCRE, .NET) or (?P=year) (Python). Named groups make patterns self-documenting and resilient to reordering: accessing match.groups.year works correctly even if you add more groups before the year group, unlike numbered backreferences, which would break.
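A brief sketch of named groups in JavaScript, covering access through match.groups, the $<name> syntax in replacement strings, and a \k<name> backreference. The group and variable names are illustrative.

```javascript
const isoDateNamed = /(?<year>\d{4})-(?<month>\d{2})-(?<day>\d{2})/;

// Access by name instead of by position.
const dateMatch = isoDateNamed.exec("2024-03-15");
const { year, month, day } = dateMatch.groups;

// Named groups can be referenced in replacement strings as $<name>.
const reordered = "2024-03-15".replace(isoDateNamed, "$<day>/$<month>/$<year>");

// \k<name> repeats whatever the named group captured.
const sameTwoDigits = /(?<yy>\d{2})-\k<yy>/;

console.log(year, month, day, reordered, sameTwoDigits.test("24-24"));
```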
Lookaround Assertions: Context Without Consumption
Lookaround assertions are zero-width assertions that match based on surrounding context without consuming characters. They are one of the most powerful and frequently misunderstood regex features.
Lookaheads
Positive lookahead (?=pattern): asserts the engine is at a position where pattern follows. \d+(?= USD) matches numbers only when followed by " USD". The " USD" part is not included in the match because the lookahead is zero-width.
Negative lookahead (?!pattern): asserts pattern does NOT follow. \bcat(?!nap\b) matches "cat" but not "catnap".
Lookbehinds
Positive lookbehind (?<=pattern): asserts pattern precedes the current position. (?<=\$)\d+(?:\.\d{2})? matches dollar amounts after a $ sign without including the $ in the match.
Negative lookbehind (?<!pattern): asserts pattern does NOT precede. (?<!\d)\d{3}(?!\d) matches exactly 3 consecutive digits not surrounded by other digits.
JavaScript added lookbehind support in ES2018. Before that, JavaScript only supported lookaheads. Python, PCRE, .NET, and Java have supported both for much longer. RE2 (and therefore Go's regexp package) supports neither lookahead nor lookbehind.
.NET supports variable-length lookbehinds, and so does JavaScript since ES2018. Python's re module requires fixed-width lookbehinds (though the third-party regex module lifts this restriction). PCRE has traditionally required each lookbehind alternative to have a fixed length.
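The four lookaround forms can be exercised in JavaScript (ES2018 or later) as follows. The sample strings are invented for the sketch.

```javascript
// Positive lookahead: digits only when " USD" follows; " USD" is not consumed.
const usdAmounts = "Price: 42 USD, 17 EUR".match(/\d+(?= USD)/g);

// Negative lookahead: "cat" unless it is the start of the word "catnap".
const notCatnap = /\bcat(?!nap\b)/;

// Positive lookbehind: amounts preceded by "$"; the "$" is not part of the match.
const dollarAmounts = "Total $19.99 and $5".match(/(?<=\$)\d+(?:\.\d{2})?/g);

// Negative lookbehind plus negative lookahead: exactly three digits in a row.
const threeDigitRuns = "12 345 6789".match(/(?<!\d)\d{3}(?!\d)/g);

console.log({ usdAmounts, dollarAmounts, threeDigitRuns });
console.log(notCatnap.test("cat"), notCatnap.test("catnap"));
```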
Regex Flags and Modifiers
Case Insensitive (i)
Makes the pattern match regardless of letter case. /hello/i matches "Hello", "HELLO", "hElLo". With the u flag, JavaScript uses Unicode simple case folding, which maps characters one-to-one; multi-character foldings are not applied, so /ß/iu does not match "SS".
Global (g)
Finds all matches in the string rather than stopping at the first. Required for String.prototype.matchAll() and for replacing all occurrences in String.prototype.replace(). Note: the g flag causes the RegExp object to maintain state (the lastIndex property), which can cause surprising behavior when reusing the same regex object. Prefer matchAll(), which works on a copy of the regex and leaves the original's lastIndex untouched.
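The lastIndex behavior is easiest to understand by calling test() repeatedly on one global regex object. A minimal sketch, with illustrative names:

```javascript
const stateful = /a/g;

// Each successful test() advances lastIndex; a failed one resets it to 0.
const firstCall = stateful.test("a");  // search starts at index 0
const secondCall = stateful.test("a"); // search starts at index 1, past the end
const thirdCall = stateful.test("a");  // lastIndex was reset, so this starts at 0 again

// matchAll() iterates over a copy, so it is safe to reuse the pattern afterwards.
const allCodes = [..."a1 a2 a3".matchAll(/a\d/g)].map((m) => m[0]);

// Without g, replace() changes only the first occurrence.
const replacedFirst = "a-a-a".replace(/a/, "b");
const replacedAll = "a-a-a".replace(/a/g, "b");

console.log({ firstCall, secondCall, thirdCall, allCodes, replacedFirst, replacedAll });
```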
Multiline (m)
Makes ^ and $ match at line boundaries (start/end of each line) rather than only at the start and end of the entire string. Essential for processing multi-line text where you need to anchor patterns to individual lines.
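A small sketch of the effect on a multi-line string; the log content is invented for illustration.

```javascript
const logText = "Error: disk full\nInfo: retrying\nError: timeout";

// Without m, ^ matches only at the very start of the string.
const withoutM = logText.match(/^Error.*/g);

// With m, ^ also matches right after each newline.
const withM = logText.match(/^Error.*/gm);

// The same applies to $: "full" ends a line, but not the whole string.
const endWithoutM = /full$/.test(logText);
const endWithM = /full$/m.test(logText);

console.log({ withoutM, withM, endWithoutM, endWithM });
```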
DotAll (s)
Makes . match newline characters as well. Added in JavaScript ES2018. Without this flag (the default), . does not match \n or \r. Use [\s\S] as a cross-engine alternative.
Unicode (u)
Enables full Unicode mode in JavaScript. Without u, the regex engine treats strings as sequences of UTF-16 code units; with u, it treats them as sequences of Unicode code points (correctly handling surrogate pairs for characters outside the BMP). The u flag also enables Unicode property escapes \p{...}. Always use u when working with non-ASCII text.
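The code-unit versus code-point distinction shows up as soon as a character outside the BMP is involved. A minimal sketch using the emoji U+1F600:

```javascript
// U+1F600 is stored as a surrogate pair, i.e. two UTF-16 code units.
const grinning = "\u{1F600}";

const codeUnits = grinning.length;        // counts UTF-16 code units
const codePoints = [...grinning].length;  // string iteration is by code point

// Without u, "." matches one code unit, so a single "." cannot span the pair.
const dotWithoutU = /^.$/.test(grinning);
// With u, "." matches one full code point.
const dotWithU = /^.$/u.test(grinning);

console.log({ codeUnits, codePoints, dotWithoutU, dotWithU });
```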
Sticky (y)
Makes the pattern match only at the current lastIndex position, not anywhere in the string. Used for streaming tokenizers that process input position by position. Less common than other flags but powerful for parsing.
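As a rough sketch of the tokenizer use case, the function below tries a list of sticky patterns at the current position and advances only when one matches there. The tokenize function, the rule names, and the tiny token grammar are all hypothetical, chosen only to illustrate the y flag.

```javascript
// Hypothetical token rules; each pattern is sticky, so it can match
// only at the exact position stored in its lastIndex.
const tokenRules = [
  ["number", /\d+/y],
  ["ident", /[A-Za-z_]\w*/y],
  ["op", /[+\-*\/=()]/y],
  ["space", /\s+/y],
];

function tokenize(input) {
  const tokens = [];
  let pos = 0;
  while (pos < input.length) {
    let matched = false;
    for (const [type, re] of tokenRules) {
      re.lastIndex = pos;           // anchor the attempt at the current position
      const m = re.exec(input);
      if (m) {
        if (type !== "space") tokens.push({ type, text: m[0] });
        pos = re.lastIndex;         // continue right after the token
        matched = true;
        break;
      }
    }
    if (!matched) {
      throw new SyntaxError(`Unexpected character at index ${pos}`);
    }
  }
  return tokens;
}

console.log(tokenize("x = 42 + y1"));
```

Because a sticky pattern never scans ahead, an unrecognized character is reported at its exact position instead of being silently skipped.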
Essential Regex Patterns for Common Tasks
Email Address Validation
The practical, widely used pattern (not full RFC 5321 compliance, which would require a regex of roughly 6 KB):
/^[a-zA-Z0-9.!#$%&'*+/=?^_`{|}~-]+@[a-zA-Z0-9](?:[a-zA-Z0-9-]{0,61}[a-zA-Z0-9])?(?:\.[a-zA-Z0-9](?:[a-zA-Z0-9-]{0,61}[a-zA-Z0-9])?)*$/
Important: regex can only validate format. Whether the email address actually exists and accepts mail requires sending a confirmation email. Many technically valid email addresses fail in practice (e.g., those with quoted local parts containing spaces).
HTTP/HTTPS URL
/^https?:\/\/(www\.)?[-a-zA-Z0-9@:%._+~#=]{1,256}\.[a-zA-Z0-9()]{1,6}\b([-a-zA-Z0-9()@:%_+.~#?&/=]*)$/i
For JavaScript, prefer the URL constructor for parsing and validation: it handles edge cases regex cannot, like internationalized domain names (IDN) and complex query strings.
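One possible shape for a URL-constructor check is sketched below. The helper name isHttpUrl is hypothetical, and the protocol check is an assumption about what "valid" means here (HTTP and HTTPS only).

```javascript
// Hypothetical helper: true only for strings that parse as http(s) URLs.
function isHttpUrl(candidate) {
  try {
    const url = new URL(candidate); // throws a TypeError on unparseable input
    return url.protocol === "http:" || url.protocol === "https:";
  } catch {
    return false;
  }
}

console.log(isHttpUrl("https://example.com/path?q=1"));
console.log(isHttpUrl("ftp://example.com"));
console.log(isHttpUrl("not a url"));
```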
Password Complexity
Minimum 8 characters, at least one uppercase letter, one lowercase letter, one digit, one special character:
/^(?=.*[a-z])(?=.*[A-Z])(?=.*\d)(?=.*[@$!%*?&])[A-Za-z\d@$!%*?&]{8,}$/
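A few sample inputs make the behavior of this pattern concrete. Note that the final character class also limits which characters are allowed at all, so a password containing # or a space is rejected even if it meets the other rules. The variable names are illustrative.

```javascript
const strongPassword =
  /^(?=.*[a-z])(?=.*[A-Z])(?=.*\d)(?=.*[@$!%*?&])[A-Za-z\d@$!%*?&]{8,}$/;

const passwordChecks = {
  valid: strongPassword.test("Passw0rd!"),          // meets every rule
  noUpperDigitSpecial: strongPassword.test("password"),
  tooShort: strongPassword.test("Pw0!"),
  unlistedSpecial: strongPassword.test("Passw0rd#"), // "#" is not in the allowed set
  containsSpace: strongPassword.test("Pass w0rd!"),  // space is not in the allowed set
};

console.log(passwordChecks);
```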
IPv4 Address
/^((25[0-5]|2[0-4]\d|[01]?\d\d?)\.){3}(25[0-5]|2[0-4]\d|[01]?\d\d?)$/
The alternation handles all valid ranges: 0-9, 10-99, 100-199, 200-249, 250-255.
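A quick check of an IPv4 pattern in which {3} repeats the whole "octet plus dot" group, followed by a final octet. This is a sketch; the sample addresses are arbitrary.

```javascript
// (octet\.){3} then a final octet; each octet is limited to 0-255.
const ipv4 = /^((25[0-5]|2[0-4]\d|[01]?\d\d?)\.){3}(25[0-5]|2[0-4]\d|[01]?\d\d?)$/;

const ipv4Checks = {
  typical: ipv4.test("192.168.1.1"),
  maximum: ipv4.test("255.255.255.255"),
  octetTooLarge: ipv4.test("256.1.1.1"),
  tooFewOctets: ipv4.test("1.2.3"),
  dotsOnly: ipv4.test("1...1"),
  leadingZeros: ipv4.test("01.02.03.04"), // accepted: [01]? allows a leading zero
};

console.log(ipv4Checks);
```

If leading zeros should be rejected, the octet alternation needs to be tightened accordingly.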
IPv6 Address (simplified)
Full IPv6 validation with all abbreviation forms (consecutive zeros as ::, etc.) requires a complex pattern. For input validation purposes:
/^([0-9a-fA-F]{1,4}:){7}[0-9a-fA-F]{1,4}$/ (full form only). For all forms including ::, use a dedicated IP validation library.
ISO 8601 Date
/^\d{4}-(0[1-9]|1[0-2])-(0[1-9]|[12]\d|3[01])$/ validates the YYYY-MM-DD format with month range 01-12 and day range 01-31. Note this does not validate that the day is valid for the specific month (February 30 would pass). Combine with date parsing for full validation.
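One way to combine the format check with date parsing is to build a UTC date from the captured parts and confirm that it round-trips. The helper name isValidIsoDate is hypothetical, and the sketch assumes years of 0100 or later, because Date.UTC treats years 0-99 as 1900-1999.

```javascript
const isoDateShape = /^(\d{4})-(0[1-9]|1[0-2])-(0[1-9]|[12]\d|3[01])$/;

// Hypothetical helper: format check first, then a calendar round-trip.
function isValidIsoDate(text) {
  const m = isoDateShape.exec(text);
  if (!m) return false;
  const [y, mo, d] = m.slice(1).map(Number);
  // Out-of-range days roll over (Feb 30 becomes Mar 1), so the parts stop matching.
  const date = new Date(Date.UTC(y, mo - 1, d));
  return (
    date.getUTCFullYear() === y &&
    date.getUTCMonth() === mo - 1 &&
    date.getUTCDate() === d
  );
}

console.log(isValidIsoDate("2024-02-29"), isValidIsoDate("2024-02-30"));
```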
US Phone Number
/^[+]?1?\s*\(?(\d{3})\)?[\s.-]?(\d{3})[\s.-]?(\d{4})$/ accepts (555) 123-4567, 555-123-4567, +1 555 123 4567, and other common formats.
HTML Tag Extraction
/<([a-zA-Z][a-zA-Z0-9-]*)(?:\s[^>]*)?>(.*?)<\/\1>/gs captures the tag name and inner content for simple HTML. For production HTML parsing, use a dedicated HTML parser: regex cannot correctly handle malformed HTML, self-closing tags, or nested identical tags.
Hex Color Code
/^#([0-9a-fA-F]{3}|[0-9a-fA-F]{4}|[0-9a-fA-F]{6}|[0-9a-fA-F]{8})$/ matches 3, 4, 6, or 8-digit hex color codes (including alpha variants).
Credit Card Number (format only)
/^(?:4[0-9]{12}(?:[0-9]{3})?|5[1-5][0-9]{14}|3[47][0-9]{13}|3(?:0[0-5]|[68][0-9])[0-9]{11}|6(?:011|5[0-9][0-9])[0-9]{12})$/ covers Visa (4xxx), Mastercard (51-55xxx), Amex (34/37xxx), Diners (300-305/36/38xxx), Discover (6011/65xxx). It does not match Mastercard's newer 2221-2720 range. Always use Luhn algorithm validation in addition to format matching.
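A minimal sketch of the Luhn check mentioned above, intended to run after the format regex has passed. The function name luhnCheck is illustrative, and the sample number is the widely used Visa test number, not a real card.

```javascript
// Luhn checksum: starting from the rightmost digit, double every second digit,
// subtract 9 from any result above 9, and require the total to be divisible by 10.
function luhnCheck(digits) {
  if (!/^\d+$/.test(digits)) return false;
  let sum = 0;
  let doubleNext = false;
  for (let i = digits.length - 1; i >= 0; i--) {
    let value = digits.charCodeAt(i) - 48; // "0" is char code 48
    if (doubleNext) {
      value *= 2;
      if (value > 9) value -= 9;
    }
    sum += value;
    doubleNext = !doubleNext;
  }
  return sum % 10 === 0;
}

console.log(luhnCheck("4111111111111111"), luhnCheck("4111111111111112"));
```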
ReDoS: Regular Expression Denial of Service
ReDoS is a class of denial-of-service attack that exploits catastrophic backtracking in NFA-based regex engines. By crafting an input that forces the engine into exponential backtracking, an attacker can make a server spend minutes or hours on a single regex evaluation.
Classic vulnerable patterns: (a+)+, (a*)*, ([a-zA-Z]+)*, (a|aa)+. Applied to a string of many 'a' characters followed by a character that does not match, these can force the engine to try on the order of 2^n combinations.
Real-world ReDoS vulnerabilities have affected widely-used npm packages (moment.js, email-validator, ua-parser-js), server frameworks, and WAF rules. OWASP includes ReDoS in its list of security concerns for web applications.
Defenses: (1) use specific, unambiguous patterns that avoid nested quantifiers; (2) use RE2-based engines for user-supplied patterns; (3) set regex timeouts in your framework; (4) use our tester's catastrophic backtracking detector before deploying patterns to production.
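The first defense can be illustrated by rewriting a nested quantifier into an unambiguous equivalent. In the sketch below, both patterns accept exactly the same strings, but only the nested one leaves the engine many ways to split a run of 'a' characters. The timeTest helper is hypothetical, timings vary by machine and engine, and the input length is kept small on purpose.

```javascript
// Nested quantifier: many ways to divide a run of "a" between inner and outer "+".
const nestedPattern = /^(a+)+$/;
// Same language, but only one way to match any given input.
const flatPattern = /^a+$/;

// Hypothetical helper: time a single test() call in milliseconds.
function timeTest(re, input) {
  const start = Date.now();
  const result = re.test(input);
  return { result, ms: Date.now() - start };
}

const redosInputs = ["a", "aaaa", "", "aaaa!", "a".repeat(20) + "!"];
const sameAnswers = redosInputs.every(
  (s) => nestedPattern.test(s) === flatPattern.test(s)
);
console.log("same answers:", sameAnswers);

// Work for the nested pattern grows roughly exponentially with n on a
// backtracking engine, so increase n cautiously when experimenting.
for (const n of [12, 16, 20]) {
  const almostMatching = "a".repeat(n) + "!";
  console.log(n, timeTest(nestedPattern, almostMatching).ms, timeTest(flatPattern, almostMatching).ms);
}
```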
Regex in Major Languages: Key Differences
JavaScript
Regex literals: /pattern/flags. Constructor: new RegExp(pattern, flags). ES2018 added: lookbehind assertions, named capturing groups, Unicode property escapes, s (dotAll) flag. The g flag makes RegExp objects stateful (tracks lastIndex). Use String.prototype.matchAll() for safe iteration over all matches. No native possessive quantifiers or atomic groups.
Python
Import the re standard library module. Use raw strings (r"\d+") to avoid double-escaping backslashes. Key distinction: re.match() anchors to the start of the string; re.search() finds the first match anywhere. Both return None on failure (always check before accessing the match object). Use re.compile() for patterns used multiple times: it gives you a reusable compiled pattern object. The third-party regex module adds possessive quantifiers, atomic groups, overlapping matches, and variable-width lookbehinds (possessive quantifiers and atomic groups are also available in the standard re module since Python 3.11).
Java
java.util.regex.Pattern and Matcher. Patterns must be compiled: Pattern.compile(pattern, flags). Java regex supports possessive quantifiers (*+, ++) and atomic groups natively, which gives you tools to limit backtracking and reduce ReDoS risk. The Matcher.group(name) method accesses named groups.
Go
Go's regexp package uses RE2 semantics: guaranteed linear time, no backreferences, no lookaheads or lookbehinds. This is a deliberate safety-first choice. The regexp/syntax package exposes the parser for building regex-based tools. For PCRE features in Go, use the github.com/dlclark/regexp2 package (at the cost of the linear-time guarantee).
PCRE / PHP
PCRE (Perl-Compatible Regular Expressions) is the gold standard for feature richness. PHP's preg_ functions use PCRE. Notable PCRE features: recursive patterns ((?R)), conditional patterns, callouts, Unicode grapheme clusters, and PCRE2's extended Unicode support. PCRE is the engine used in Nginx, Apache, and many security tools.
Debugging Strategies with Our Regex Tester
Build Incrementally
Start with a simple literal pattern and add complexity one piece at a time. Test each addition against both matching and non-matching examples before proceeding. This "build-and-verify" approach catches mistakes immediately at the point of introduction.
Use Named Groups for Readability
Complex patterns with many capture groups become hard to understand. Named groups like (?<year>\d{4}) make patterns self-documenting and make our tester's group visualization much more useful: you see "year: 2024" instead of "group 1: 2024".
Test Edge Cases Explicitly
Always test: empty string, single character, maximum length input, input with only special characters, Unicode characters, strings that almost-but-not-quite match, and inputs at the exact boundaries of quantifier ranges. These edge cases reveal subtle pattern bugs that happy-path testing misses.
Use Comments for Complex Patterns
Many regex flavors support verbose/extended mode (x flag in Python, PCRE) that ignores whitespace and allows # comments. Breaking a complex pattern across multiple lines with inline comments makes it maintainable. Our tester shows the full pattern while allowing you to paste commented versions for development.
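JavaScript has no x flag, so one workaround (an assumption about workflow, not a feature of the tester) is to assemble the pattern from commented fragments with the RegExp constructor. String.raw keeps backslashes intact, which avoids double-escaping. The names below are illustrative.

```javascript
// Each fragment carries its own comment; String.raw preserves the backslashes.
const isoDateParts = [
  String.raw`^(?<year>\d{4})`,               // four-digit year
  String.raw`-(?<month>0[1-9]|1[0-2])`,      // month 01-12
  String.raw`-(?<day>0[1-9]|[12]\d|3[01])$`, // day 01-31
];

// Join the fragments into one pattern and compile it once.
const isoDateVerbose = new RegExp(isoDateParts.join(""));

const parsedDate = isoDateVerbose.exec("2024-03-15");
console.log(isoDateVerbose.source);
console.log(parsedDate.groups.year, parsedDate.groups.month, parsedDate.groups.day);
```

The joined source string is what you would paste into a tester that expects a single-line pattern.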
Performance and Privacy
All regex matching in our tester runs in your browser using a Web Worker so the UI never freezes. The worker has a configurable timeout to catch patterns that could run indefinitely on long inputs. No text, patterns, or results are sent to our servers, so your code and data stay private. The tool works offline once the page is loaded and is safe to use with sensitive log data, PII, or proprietary code.
Frequently Asked Questions
Common questions about the Regex Tester Online.
General
1. What is a regex tester and why do I need one?
A regex tester is an interactive tool that shows you what your regular expression matches in real time: it highlights matches, displays capture groups, and counts results. Without one, you must modify code and rerun tests to see if a pattern works. A tester gives instant feedback, dramatically speeding up regex development and debugging.
2. What regex flavor does this tester use?
Our tester uses the JavaScript RegExp engine, which includes all ES2018+ features: named capturing groups, lookbehind assertions, Unicode property escapes, and the s (dotAll) flag. Results directly represent what you will get in JavaScript code and are broadly applicable to other Perl-style flavors, although syntax details and supported features differ between engines.
3. Is my text safe to paste into this regex tester?
Yes. All regex matching runs locally in your browser; no text, patterns, or results are sent to any server. The tool is safe to use with log files, PII, proprietary code, or any sensitive content.
Syntax
4. What does the dot (.) match and when should I escape it?
A dot matches any single character except newline by default. Enable the s (dotAll) flag to match newlines too. To match a literal dot (e.g., in a filename or version number), escape it: \. Otherwise .json matches "ajson" and version.1 matches "version 1" (treating . as any character).
5. What is the difference between * and + quantifiers?
* means "zero or more", so the element is optional. + means "one or more", so at least one occurrence is required. Use + when the element must be present at least once. \d+ requires at least one digit; \d* matches even an empty string.
6. What is the difference between greedy and lazy quantifiers?
Greedy quantifiers (default) match as many characters as possible. Lazy quantifiers (add ? after: *?, +?, ??) match as few as possible. Example: <.+> applied to "<b>text</b>" matches the entire string; <.+?> matches only "<b>". Lazy is not always faster; use specific character classes like [^>]+ for best performance.
7. How do I match special characters like ( ) . * + ? literally?
Escape them with a backslash: \( matches a literal parenthesis, \. matches a literal dot, \* matches a literal asterisk. The full list of metacharacters to escape: . ^ $ * + ? { } [ ] \ | ( ). Inside character classes [brackets], most metacharacters lose their special meaning.
8. How do I match the start and end of a line vs the whole string?
Without the m (multiline) flag, ^ matches only the very start of the string and $ matches only the very end. With the m flag, ^ and $ match at the start and end of each line. In Python and .NET, \A anchors to the start of the string regardless of multiline mode; for the absolute end, use \Z in Python or \z in .NET (where \Z also matches before a final newline).
Flags
9. What does the global (g) flag do and when is it required?
The g flag makes the regex find all matches in the string rather than stopping at the first. It is required for String.replace() to replace all occurrences (without g, only the first match is replaced) and for String.matchAll() to work.
10. When should I use the m (multiline) flag?
Use m when you need ^ and $ to match line boundaries in multi-line text. For example, to find lines starting with "Error" in a log file: /^Error.*/gm. Without m, ^ only matches the very beginning of the entire string.
11. What does the s (dotAll) flag do?
The s flag makes the dot (.) match newline characters (\n, \r). Without it, . does not match newlines. This is useful for matching content that spans multiple lines, like HTML blocks or multi-line strings. Alternative: use [\s\S] for cross-engine compatibility.
Groups
12. What is a capturing group and how do I access the captured text?
Parentheses create capturing groups. In JavaScript: const m = "2024-03-15".match(/(\d{4})-(\d{2})-(\d{2})/); gives m[1]="2024", m[2]="03", m[3]="15". Named groups: /(?<year>\d{4})-/ gives m.groups.year="2024". Our tester displays all captured groups for each match.
13. What is a non-capturing group (?:...) and why use it?
Non-capturing groups (?:...) group elements for quantifiers or alternation without creating a capture reference. Use them when you need grouping but don't need to extract the matched text. They are more efficient than capturing groups and don't pollute your match result with unwanted groups.
14. How do backreferences work in regex?
Backreferences (\1, \2, or \k<name> for named groups) match the same text that a capturing group matched earlier in the pattern. \b(\w+)\s+\1\b matches doubled words ("the the"). In replacement strings, $1 and $2 reference captured groups.
Lookaround
15. What is a lookahead and how is it different from a regular match?
A lookahead (?=pattern) asserts the pattern follows at the current position without consuming those characters. \d+(?= USD) matches numbers only before " USD", and the " USD" is not part of the match result. This is called zero-width matching and lets you match based on context without including context in the result.
16. What is a lookbehind assertion?
A positive lookbehind (?<=pattern) asserts the pattern precedes the current position. (?<=\$)\d+ matches digits preceded by a dollar sign without including the $ in the match. JavaScript added lookbehind support in ES2018. Note: Go's RE2 engine and POSIX tools do not support lookbehinds.
Performance
17. What is catastrophic backtracking?
Catastrophic backtracking occurs when a pattern like (a+)+ or (a|aa)+ is applied to a string that almost matches. The engine tries exponentially many combinations of how to split the input between quantifier levels, taking seconds or hours instead of milliseconds. Avoid nested quantifiers and use specific character classes.
18. What is ReDoS and how can regex cause denial of service?
ReDoS (Regular Expression DoS) exploits catastrophic backtracking. An attacker submits carefully crafted input to a form or API that uses a vulnerable regex, causing the server's regex engine to spend exponential time. Real-world ReDoS attacks have taken down services. Mitigate with RE2 engines for user-supplied patterns, regex timeouts, and pattern review.
Languages
19. How does Python regex differ from JavaScript regex?
Key differences: Python has re.match() (anchored at start) vs re.search() (anywhere); JavaScript has no direct equivalent. Python uses r"raw strings" to avoid double-escaping. Python named groups use (?P<name>...) while JavaScript uses (?<name>...). Python's re.compile() caches patterns; JavaScript should use regex literals for the same effect.
20. Why does Go's regex not support lookaheads or backreferences?
Go uses the RE2 engine which guarantees linear-time matching by restricting features that require backtracking. RE2 does not support lookaheads, lookbehinds, or backreferences. This is a deliberate safety tradeoff: RE2 cannot have catastrophic backtracking. Use github.com/dlclark/regexp2 for PCRE features in Go.
Common Patterns
21. What regex validates an email address?
A practical email validation regex: /^[a-zA-Z0-9.!#$%&'*+/=?^_`{|}~-]+@[a-zA-Z0-9](?:[a-zA-Z0-9-]{0,61}[a-zA-Z0-9])?(?:\.[a-zA-Z0-9](?:[a-zA-Z0-9-]{0,61}[a-zA-Z0-9])?)*$/. This catches common formatting errors but cannot verify deliverability. Always use email confirmation for final validation.
22. What regex validates a US phone number?
/^[+]?1?\s*\(?([2-9]\d{2})\)?[\s.-]?([2-9]\d{2})[\s.-]?(\d{4})$/ accepts (555) 234-5678, 555-234-5678, +1 555 234 5678, and similar formats. The [2-9] constraint excludes area codes and exchanges starting with 0 or 1 (invalid in NANP), so a number such as 555-123-4567 is rejected.
23. How do I write a regex to match an entire line?
With the m (multiline) flag: /^.*your pattern.*$/m matches lines containing your pattern. Without m, ^ and $ refer to the entire string. Use /^.*your pattern.*$/gm to find all matching lines in multi-line text.
Debugging
24. Why does my regex match too much?
Most likely cause: greedy quantifiers expanding beyond the intended boundary. Solutions: (1) switch to lazy quantifiers (*? instead of *); (2) use a negated character class ([^>]* instead of .*); (3) add anchors (^ and $) if you need a full-string match; (4) add boundary assertions (\b for word boundaries).
25. Why does my regex match nothing even though it looks correct?
Common causes: (1) unescaped special characters (\. needed for literal dot); (2) case mismatch without i flag; (3) missing m flag for multi-line ^ and $ anchors; (4) invisible characters or different newline styles (\r\n vs \n) in the test string; (5) wrong quantifier, such as a missing + or * that makes a character required exactly once.