Regex Tester Online
The ultimate tool to test, debug, and validate regular expressions. Features real-time highlighting, detailed explanations, and support for JavaScript, PHP, Python, and Java regex flavors.
How to Use the Regex Tester
Our online regex tester is designed to be intuitive and powerful. Follow these simple steps to test your regular expressions:
- Enter Your Pattern: Type your regular expression in the "Regular Expression Pattern" input field. You don't need to include the forward slashes (/) as they are provided.
- Set Flags: Add any necessary flags (like g for global, i for case-insensitive) in the flags input box.
- Input Test String: Paste or type the text you want to test against in the "Test String" area.
- Select Flavor: Choose your target programming language (JavaScript, PCRE/PHP, Python, or Java) to ensure accurate behavior.
- Analyze Results:
  - Matches: See highlighted matches in your test string instantly.
  - Match Info: View detailed information about each match, including index and content.
  - Capture Groups: Inspect captured groups in a structured table format.
  - Explanation: Read the auto-generated explanation to understand exactly how your pattern works.
Deep Dive: What is a Regular Expression?
A regular expression (often abbreviated as regex or regexp) is a sequence of characters that specifies a search pattern in text. Usually, such patterns are used by string-searching algorithms for "find" or "find and replace" operations on strings, or for input validation.
Regex is a fundamental tool for developers, data scientists, and system administrators. It allows you to:
- Validate Input: Ensure user inputs like email addresses, phone numbers, and passwords meet specific criteria.
- Search and Replace: Find complex patterns in code or text files and replace them efficiently.
- Parse Data: Extract specific information (like dates, prices, or IDs) from unstructured text logs.
- Web Scraping: Identify and extract relevant data from HTML or XML content.
Regex Syntax Guide
Understanding regex syntax is key to mastering pattern matching. Here are the core components:
1. Anchors
Anchors do not match any character but assert a position in the string.
- ^: Matches the start of the string (or line in multiline mode).
- $: Matches the end of the string (or line in multiline mode).
- \b: Matches a word boundary (position between a word character and a non-word character).
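As a rough illustration of the anchors above, the following Python sketch uses the standard `re` module. The sample text and variable names are made up for this example.

```python
import re

text = "cat concatenate cat"

# Every occurrence of "cat", including the one inside "concatenate".
anywhere = [m.start() for m in re.finditer(r"cat", text)]

# ^ pins the match to the start of the string.
at_start = [m.start() for m in re.finditer(r"^cat", text)]

# $ pins the match to the end of the string.
at_end = re.search(r"cat$", text).start()

# \b on both sides keeps only whole-word matches.
whole_words = [m.start() for m in re.finditer(r"\bcat\b", text)]

print(anywhere, at_start, at_end, whole_words)
```

Anchors consume no characters, so each reported position is where the literal "cat" begins.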
2. Character Classes
Character classes match a single character from a specific set.
- .: Matches any single character except newline.
- [abc]: Matches any one of the characters a, b, or c.
- [^abc]: Matches any character except a, b, or c.
- \d: Matches any digit (equivalent to [0-9] in ASCII-only matching; some flavors, such as Python 3 on text strings, also match other Unicode digits).
- \w: Matches any word character (alphanumeric plus underscore).
- \s: Matches any whitespace character (such as space, tab, or newline).
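A small Python sketch of these character classes, using the standard `re` module. The sample string is invented for illustration.

```python
import re

sample = "Room 42_b, floor 7\t(east)"

# \d picks out each digit on its own.
digits = re.findall(r"\d", sample)

# A bracketed set matches one character from the set.
vowels = re.findall(r"[aeiou]", sample)

# A negated set: anything that is neither a word character nor whitespace.
punctuation = re.findall(r"[^\w\s]", sample)

# The dot matches any character except a newline.
dot_matches = re.findall(r"a.c", "abc a\nc")

print(digits, vowels, punctuation, dot_matches)
```

Note that the underscore counts as a word character, so it does not appear in the punctuation list.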
3. Quantifiers
Quantifiers specify how many instances of a character, group, or character class must be present in the input for a match to be found.
- *: Matches 0 or more times.
- +: Matches 1 or more times.
- ?: Matches 0 or 1 time.
- {n}: Matches exactly n times.
- {n,}: Matches n or more times.
- {n,m}: Matches between n and m times.
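The quantifiers can be checked one by one with a short Python sketch. The helper name `full` is hypothetical and simply wraps `re.fullmatch`, which requires the whole string to match.

```python
import re

def full(pattern: str, s: str) -> bool:
    """Return True if the entire string matches the pattern."""
    return re.fullmatch(pattern, s) is not None

checks = {
    "star allows zero b's": full(r"ab*", "a"),
    "plus needs at least one b": full(r"ab+", "a"),
    "question mark makes u optional": full(r"colou?r", "color"),
    "exactly four digits": full(r"\d{4}", "2024"),
    "two or more digits": full(r"\d{2,}", "7"),
    "between two and four digits": full(r"\d{2,4}", "12345"),
}

for label, result in checks.items():
    print(f"{label}: {result}")
```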
4. Groups and Lookarounds
- (abc): Capturing group. Matches "abc" and remembers the match.
- (?:abc): Non-capturing group. Matches "abc" but does not remember it.
- (?=abc): Positive lookahead. Asserts that "abc" follows the current position, without consuming it or including it in the result.
- (?!abc): Negative lookahead. Asserts that "abc" does not follow the current position.
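The difference between the group types is easiest to see in the returned match data. This is a minimal Python sketch with made-up sample strings.

```python
import re

# The capturing group keeps the number; the non-capturing group only groups the unit.
m = re.search(r"(\d+)(?:px|em)", "width: 120px")
whole_match = m.group(0)
captured = m.groups()

# Positive lookahead: digits that are followed by "px"; "px" is not part of the match.
px_values = re.findall(r"\d+(?=px)", "10px 20em 30px")

# Negative lookahead: "foo" only where it is not followed by "bar".
foo_positions = [m.start() for m in re.finditer(r"foo(?!bar)", "foobar foobaz")]

print(whole_match, captured, px_values, foo_positions)
```

Lookarounds are zero-width: they check the surrounding text but never add characters to the match.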
Common Regex Patterns Explained
Here are some of the most frequently used regex patterns with detailed breakdowns:
Email Address Validation
^[a-zA-Z0-9._%+-]+@[a-zA-Z0-9.-]+\.[a-zA-Z]{2,}$
- ^: Start of the string.
- [a-zA-Z0-9._%+-]+: One or more alphanumeric characters, dots, underscores, percent signs, plus signs, or hyphens (the username part).
- @: The literal "@" symbol.
- [a-zA-Z0-9.-]+: One or more alphanumeric characters, dots, or hyphens (the domain name).
- \.: A literal dot.
- [a-zA-Z]{2,}: Two or more letters (the top-level domain, e.g., .com, .org).
- $: End of the string.
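Here is one way the email pattern above might be used in Python. The function name `looks_like_email` is hypothetical, and `fullmatch` is used so that a trailing newline is not silently accepted.

```python
import re

EMAIL_RE = re.compile(r"^[a-zA-Z0-9._%+-]+@[a-zA-Z0-9.-]+\.[a-zA-Z]{2,}$")

def looks_like_email(candidate: str) -> bool:
    """Shape check only; it does not prove the address exists."""
    return EMAIL_RE.fullmatch(candidate) is not None

for candidate in ["user.name+tag@example.co.uk", "user@localhost",
                  "user@@example.com", "a@b.c", "a@..com"]:
    print(candidate, looks_like_email(candidate))
```

The pattern is permissive in places: the last candidate, with consecutive dots in the domain, still passes. Treat it as a first filter rather than full validation.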
Strong Password
^(?=.*[A-Za-z])(?=.*\d)[A-Za-z\d]{8,}$
- ^: Start of string.
- (?=.*[A-Za-z]): Positive lookahead ensuring at least one letter exists.
- (?=.*\d): Positive lookahead ensuring at least one digit exists.
- [A-Za-z\d]{8,}: Matches 8 or more alphanumeric characters. Any other character, such as a symbol or space, makes the match fail.
- $: End of string.
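A quick Python sketch of how the lookaheads in this password pattern interact. The helper name `is_acceptable` is an assumption made for the example.

```python
import re

PASSWORD_RE = re.compile(r"^(?=.*[A-Za-z])(?=.*\d)[A-Za-z\d]{8,}$")

def is_acceptable(password: str) -> bool:
    """True if the password has a letter, a digit, and 8+ alphanumeric characters."""
    return PASSWORD_RE.fullmatch(password) is not None

for password in ["abc12345", "abcdefgh", "12345678", "abc123", "abc12345!"]:
    print(password, is_acceptable(password))
```

Both lookaheads run from the start of the string, so the letter and the digit may appear in any order.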
Date Validation (YYYY-MM-DD)
^(19|20)\d{2}-(0[1-9]|1[0-2])-(0[1-9]|[12]\d|3[01])$
- ^: Start of string.
- (19|20): Matches either 19 or 20 (for the century).
- \d{2}: Matches exactly 2 digits (year).
- -: Literal hyphen.
- (0[1-9]|1[0-2]): Matches 01-09 OR 10-12 (month).
- -: Literal hyphen.
- (0[1-9]|[12]\d|3[01]): Matches 01-09 OR 10-29 OR 30-31 (day).
- $: End of string.
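The date pattern checks the shape of the string but not the calendar, so a date such as 2023-02-31 passes it. A possible approach in Python is to pair the regex with `datetime.date`; the function name `is_real_date` is hypothetical.

```python
import re
from datetime import date

DATE_RE = re.compile(r"^(19|20)\d{2}-(0[1-9]|1[0-2])-(0[1-9]|[12]\d|3[01])$")

def is_real_date(text: str) -> bool:
    """Regex checks the format; datetime.date checks the calendar."""
    if DATE_RE.fullmatch(text) is None:
        return False
    year, month, day = (int(part) for part in text.split("-"))
    try:
        date(year, month, day)
    except ValueError:
        return False
    return True

for candidate in ["2024-02-29", "2023-02-31", "2024-13-01", "1899-01-01"]:
    print(candidate, DATE_RE.fullmatch(candidate) is not None, is_real_date(candidate))
```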
IPv4 Address
^(?:(?:25[0-5]|2[0-4][0-9]|[01]?[0-9][0-9]?)\.){3}(?:25[0-5]|2[0-4][0-9]|[01]?[0-9][0-9]?)$
- ^: Start of string.
- (?:...){3}: Non-capturing group repeated 3 times for the first 3 octets.
- 25[0-5]: Matches 250-255.
- 2[0-4][0-9]: Matches 200-249.
- [01]?[0-9][0-9]?: Matches 0-199 (with optional leading zeros).
- \.: Literal dot.
- The last part matches the final octet without a trailing dot.
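To see the octet alternation at work, here is a small Python sketch. The helper name `is_ipv4` is an assumption, and the octet sub-pattern is factored into a variable purely for readability.

```python
import re

OCTET = r"(?:25[0-5]|2[0-4][0-9]|[01]?[0-9][0-9]?)"
IPV4_RE = re.compile(rf"^(?:{OCTET}\.){{3}}{OCTET}$")

def is_ipv4(text: str) -> bool:
    """True if the text is four dot-separated octets in the range 0-255."""
    return IPV4_RE.fullmatch(text) is not None

for candidate in ["192.168.1.1", "255.255.255.255", "256.1.1.1", "1.2.3", "01.002.3.4"]:
    print(candidate, is_ipv4(candidate))
```

As the breakdown notes, leading zeros are accepted, so 01.002.3.4 passes. Add a separate check if that matters for your use case.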
Regex in Different Languages
While the core concepts of regex are universal, different programming languages have slight variations in their implementation (flavors).
| Language | Flavor | Key Characteristics |
|---|---|---|
| JavaScript | ECMAScript | Standard for web browsers. Supports lookaheads, but lookbehinds are a recent addition (ES2018). |
| PHP | PCRE | Perl Compatible Regular Expressions. Very powerful and feature-rich, supporting recursion and atomic grouping. |
| Python | Python `re` | Similar to PCRE but with some differences in syntax and behavior. Supports named groups and conditional matching. |
| Java | java.util.regex | Strict and verbose. Requires double escaping for backslashes in string literals (e.g., \\d). |
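As an illustration of the Python row, this sketch shows named groups and a conditional pattern in the standard `re` module. The sample strings and the `TAG_RE` name are invented for the example.

```python
import re

# Named groups use the (?P<name>...) syntax in Python.
m = re.search(r"(?P<year>\d{4})-(?P<month>\d{2})", "released 2018-06")
named = m.groupdict()

# Conditional matching: require ">" only if the optional "<" (group 1) matched.
TAG_RE = re.compile(r"(<)?\w+(?(1)>)")
balanced = TAG_RE.fullmatch("<tag>") is not None
bare = TAG_RE.fullmatch("tag") is not None
unclosed = TAG_RE.fullmatch("<tag") is not None

print(named, balanced, bare, unclosed)
```

Raw string literals (the `r"..."` prefix) avoid the double escaping that Java string literals require.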
Best Practices for Writing Regex
- Be Specific: Avoid using .* unless absolutely necessary. It can be slow and match more than intended (greedy matching).
- Use Anchors: Always use ^ and $ when validating entire strings to prevent partial matches.
- Comment Your Regex: In languages that support it (like PCRE with the x flag), use comments to explain complex patterns.
- Test Extensively: Use tools like this Regex Tester to verify your patterns against various test cases, including edge cases.
- Keep it Simple: If a regex becomes too complex, consider breaking it down or using string manipulation functions instead.
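The commenting advice can be sketched in Python, where `re.VERBOSE` plays the role of the x flag. The date pattern from earlier in this guide is reused here; the name `DATE_RE` is just a label for the example.

```python
import re

DATE_RE = re.compile(
    r"""
    ^(19|20)\d{2}            # year: 1900-2099
    -                        # separator
    (0[1-9]|1[0-2])          # month: 01-12
    -                        # separator
    (0[1-9]|[12]\d|3[01])    # day: 01-31
    $
    """,
    re.VERBOSE,
)

parts = DATE_RE.fullmatch("2024-05-17").groups()
print(parts)
```

In verbose mode, whitespace in the pattern is ignored and `#` starts a comment, so a literal space or `#` has to be escaped or placed in a character class.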
Troubleshooting Common Regex Issues
Even experienced developers run into issues with regular expressions. Here are some common problems and how to solve them:
1. Catastrophic Backtracking
This occurs when a regex engine takes an exponential amount of time to find a match (or confirm a non-match). It often happens with nested quantifiers like (a+)+. To avoid this:
- Avoid nested quantifiers where possible.
- Use atomic groups or possessive quantifiers (if supported by your flavor).
- Be specific about what you match instead of using .*.
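A hedged Python sketch of the first suggestion: the nested pattern `(a+)+b` can often be flattened to `a+b`, which accepts the same strings without the nested repetition. The nested version is deliberately run only on short inputs, because long non-matching inputs are exactly what triggers the slowdown.

```python
import re

nested = re.compile(r"(a+)+b")   # prone to catastrophic backtracking
flat = re.compile(r"a+b")        # same language, no nested quantifier

# On short inputs both patterns agree.
short_inputs = ["ab", "aaab", "aaa", "b", ""]
agreement = all(
    (nested.fullmatch(s) is None) == (flat.fullmatch(s) is None)
    for s in short_inputs
)

# The flat pattern rejects a long non-matching input after a single linear pass.
long_input = "a" * 5000 + "c"
flat_result = flat.fullmatch(long_input)

print(agreement, flat_result)
```

Atomic groups and possessive quantifiers are another option where the flavor supports them. They are not used here so the sketch does not depend on a specific engine version.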
2. Forgetting to Escape Special Characters
Characters like ., *, ?, +, [, ], (, ), {, }, ^, $, and | have special meanings. If you want to match them literally, you must escape them with a backslash (e.g., \. for a dot).
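When the text to match comes from a variable, escaping by hand is error-prone. In Python, `re.escape` does it for you; the sample strings below are made up.

```python
import re

# An unescaped dot matches any character, which is often a silent bug.
unescaped_hit = re.fullmatch(r"1.5", "1x5") is not None
escaped_hit = re.fullmatch(r"1\.5", "1x5") is not None

# re.escape backslash-escapes the metacharacters in an arbitrary literal.
literal = "$9.99 (sale)"
pattern = re.escape(literal)
found = re.search(pattern, "Now $9.99 (sale) only") is not None

print(unescaped_hit, escaped_hit, found)
```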
3. Multiline vs. Singleline Mode
Confusion often arises around how the dot (.) and anchors (^, $) behave:
- Multiline Mode (m flag): ^ and $ match the start/end of each line, not just the string.
- Singleline (Dotall) Mode (s flag): The dot (.) matches every character, including newlines (which it normally doesn't).
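The two modes are independent, which a short Python sketch can show. In Python the m and s flags are spelled `re.MULTILINE` and `re.DOTALL`; the sample text is invented.

```python
import re

text = "first line\nsecond line"

# Without flags, ^ only matches at the start of the whole string.
default_starts = re.findall(r"^\w+", text)

# MULTILINE lets ^ match at the start of every line.
multiline_starts = re.findall(r"^\w+", text, re.MULTILINE)

# Without DOTALL, the dot stops at the newline, so this cannot span both lines.
default_span = re.search(r"first.*second", text)

# DOTALL lets the dot cross the newline.
dotall_span = re.search(r"first.*second", text, re.DOTALL)

print(default_starts, multiline_starts, default_span, dotall_span.group(0))
```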
4. Greedy vs. Lazy Matching
By default, quantifiers are greedy (match as much as possible). If your regex is matching too much text (e.g., from the first opening tag to the last closing tag), switch to lazy matching by adding a ? (e.g., .*? instead of .*).
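The tag example can be reproduced in a few lines of Python. The third pattern, a negated character class, is an alternative added here for comparison and is often preferable to a lazy quantifier.

```python
import re

html = "<div>content</div>"

greedy = re.findall(r"<.*>", html)      # runs to the last ">"
lazy = re.findall(r"<.*?>", html)       # stops at the first ">"
negated = re.findall(r"<[^>]*>", html)  # never crosses a ">" at all

print(greedy, lazy, negated)
```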
Frequently Asked Questions (FAQ)
Greedy matching consumes as much text as possible, so <.*> on <div>content</div> will match the entire string. Non-greedy (lazy) matching, indicated by adding a ? after the quantifier (e.g., <.*?>), matches as little as possible, so it would match just <div>.
To match a special character literally, escape it with a backslash (\). For example, to match a dot, use \.. To match an asterisk, use \*.
- g (Global): Find all matches, not just the first one.
- i (Case-insensitive): Match letters regardless of case.
- m (Multiline): Treat the string as multiple lines (affects ^ and $).
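Flag spelling differs between flavors. As a sketch for Python: there is no g flag, so the choice between the first match and all matches is made by the function you call, while case-insensitivity is requested with `re.IGNORECASE`. The sample text is made up.

```python
import re

text = "Regex regex REGEX"

# First match only, case-sensitive: similar to running without g or i.
first = re.search(r"regex", text)

# All matches, case-insensitive: similar to the g and i flags together.
all_matches = re.findall(r"regex", text, re.IGNORECASE)

print(first.group(0), first.start(), all_matches)
```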