Regex Tester Online

The ultimate tool to test, debug, and validate regular expressions. Features real-time highlighting, detailed explanations, and support for JavaScript, PHP, Python, and Java regex flavors.


How to Use the Regex Tester

Our online regex tester is designed to be intuitive and powerful. Follow these simple steps to test your regular expressions:

  1. Enter Your Pattern: Type your regular expression in the "Regular Expression Pattern" input field. You don't need to include the forward slashes (/) as they are provided.
  2. Set Flags: Add any necessary flags (like g for global, i for case-insensitive) in the flags input box.
  3. Input Test String: Paste or type the text you want to test against in the "Test String" area.
  4. Select Flavor: Choose your target programming language (JavaScript, PCRE/PHP, Python, or Java) to ensure accurate behavior.
  5. Analyze Results:
    • Matches: See highlighted matches in your test string instantly.
    • Match Info: View detailed information about each match, including index and content.
    • Capture Groups: Inspect captured groups in a structured table format.
    • Explanation: Read the auto-generated explanation to understand exactly how your pattern works.
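The same match information (index, content, capture groups) can be reproduced outside the tool. The sketch below uses Python's `re` module; the helper name `match_report` and the sample text are illustrative assumptions, not part of the tester itself.

```python
import re

def match_report(pattern, text, flags=0):
    """Return one dict per match: start index, end index, text, and groups.

    Hypothetical helper that mirrors the Match Info and Capture Groups panels.
    """
    report = []
    for m in re.finditer(pattern, text, flags):
        report.append({
            "start": m.start(),
            "end": m.end(),
            "match": m.group(0),
            "groups": m.groups(),
        })
    return report

rows = match_report(r"(\w+)@(\w+)\.com", "mail ann@example.com or bob@test.com")
for row in rows:
    print(row)
```

Each row corresponds to one highlighted match; `groups` holds the captured sub-strings in order.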

Deep Dive: What is a Regular Expression?

A regular expression (often abbreviated as regex or regexp) is a sequence of characters that specifies a search pattern in text. Usually, such patterns are used by string-searching algorithms for "find" or "find and replace" operations on strings, or for input validation.

Regex is a fundamental tool for developers, data scientists, and system administrators. It allows you to:

  • Validate Input: Ensure user inputs like email addresses, phone numbers, and passwords meet specific criteria.
  • Search and Replace: Find complex patterns in code or text files and replace them efficiently.
  • Parse Data: Extract specific information (like dates, prices, or IDs) from unstructured text logs.
  • Web Scraping: Identify and extract relevant data from HTML or XML content.

Regex Syntax Guide

Understanding regex syntax is key to mastering pattern matching. Here are the core components:

1. Anchors

Anchors do not match any character but assert a position in the string.

  • ^: Matches the start of the string (or line in multiline mode).
  • $: Matches the end of the string (or line in multiline mode).
  • \b: Matches a word boundary (the position between a word character and a non-word character, or between a word character and the start or end of the string).
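A short Python sketch of how the three anchors behave. The sample text is an assumption chosen so that "cat" appears at a line start and inside another word.

```python
import re

text = "cat concat\ncat"

# Without re.MULTILINE, ^ matches only at the very start of the string.
starts = [m.start() for m in re.finditer(r"^cat", text)]

# With re.MULTILINE, ^ also matches right after each newline.
line_starts = [m.start() for m in re.finditer(r"^cat", text, re.MULTILINE)]

# \b keeps "cat" from matching inside "concat".
whole_words = [m.start() for m in re.finditer(r"\bcat\b", text)]

print(starts, line_starts, whole_words)
```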

2. Character Classes

Character classes match a single character from a specific set.

  • .: Matches any single character except newline.
  • [abc]: Matches any one of the characters a, b, or c.
  • [^abc]: Matches any character except a, b, or c.
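  • \d: Matches any digit. This is equivalent to [0-9] in JavaScript; in Python 3, \d also matches other Unicode decimal digits unless the ASCII flag is set.
  • \w: Matches any word character (letters, digits, and underscore). Whether non-ASCII letters count depends on the flavor and its Unicode settings.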
  • \s: Matches any whitespace character (space, tab, newline).
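The classes above can be tried quickly in Python. This is a minimal sketch; the sample string is made up for illustration.

```python
import re

sample = "Room 42_b, floor 7\tEast"

digits = re.findall(r"\d", sample)              # individual digit characters
words = re.findall(r"\w+", sample)              # runs of letters, digits, underscore
vowels = re.findall(r"[aeiou]", "regex")        # any one character from the set
non_vowels = re.findall(r"[^aeiou]", "regex")   # any one character not in the set
parts = re.split(r"\s+", sample)                # split on runs of whitespace

print(digits, words, vowels, non_vowels, parts)
```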

3. Quantifiers

Quantifiers specify how many instances of a character, group, or character class must be present in the input for a match to be found.

  • *: Matches 0 or more times.
  • +: Matches 1 or more times.
  • ?: Matches 0 or 1 time.
  • {n}: Matches exactly n times.
  • {n,}: Matches n or more times.
  • {n,m}: Matches between n and m times.
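A small Python sketch that exercises each quantifier. The helper `whole` is a hypothetical convenience wrapper around `re.fullmatch`, so every check applies to the entire input.

```python
import re

def whole(pattern, text):
    """True if the pattern matches the entire text."""
    return re.fullmatch(pattern, text) is not None

star_ok = whole(r"ab*", "a") and whole(r"ab*", "abbb")              # * allows zero b's
plus_needs_one = not whole(r"ab+", "a") and whole(r"ab+", "ab")     # + needs at least one
optional_u = whole(r"colou?r", "color") and whole(r"colou?r", "colour")
exactly_four = whole(r"\d{4}", "2024") and not whole(r"\d{4}", "202")
two_or_more = whole(r"\d{2,}", "123") and not whole(r"\d{2,}", "1")
two_to_three = whole(r"a{2,3}", "aaa") and not whole(r"a{2,3}", "aaaa")
```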

4. Groups and Lookarounds

  • (abc): Capturing group. Matches "abc" and remembers the match.
  • (?:abc): Non-capturing group. Matches "abc" but does not remember it.
  • (?=abc): Positive lookahead. Asserts that "abc" follows the current position, without consuming it or including it in the match.
  • (?!abc): Negative lookahead. Asserts that "abc" does not follow the current position.
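The following Python sketch shows the four constructs side by side. The sample strings are assumptions chosen to keep each result easy to check by hand.

```python
import re

# Capturing groups: each parenthesized part is available by number.
m = re.search(r"(\d{4})-(\d{2})", "released 2024-05")
year, month = m.group(1), m.group(2)

# Non-capturing group: (?:Mr|Ms) groups the alternatives but is not captured.
m2 = re.search(r"(?:Mr|Ms)\. (\w+)", "Ms. Lovelace")
captured = m2.groups()

# Positive lookahead: digits only when followed by "px"; "px" is not in the match.
pixel_values = re.findall(r"\d+(?=px)", "10px 20em 30px")

# Negative lookahead: whole numbers that are not followed by a percent sign.
plain_numbers = re.findall(r"\b\d+\b(?!%)", "10% 20 30%")

print(year, month, captured, pixel_values, plain_numbers)
```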

Common Regex Patterns Explained

Here are some of the most frequently used regex patterns with detailed breakdowns:

Email Address Validation

^[a-zA-Z0-9._%+-]+@[a-zA-Z0-9.-]+\.[a-zA-Z]{2,}$
  • ^: Start of the string.
  • [a-zA-Z0-9._%+-]+: One or more alphanumeric characters, dots, underscores, percent signs, plus signs, or hyphens (the username part).
  • @: The literal "@" symbol.
  • [a-zA-Z0-9.-]+: One or more alphanumeric characters, dots, or hyphens (the domain name).
  • \.: A literal dot.
  • [a-zA-Z]{2,}: Two or more letters (the top-level domain, e.g., .com, .org).
  • $: End of the string.
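A minimal Python sketch of this pattern in use. The function name `looks_like_email` is an assumption. The pattern is deliberately loose: it checks the overall shape only, so some malformed addresses still pass.

```python
import re

EMAIL_RE = re.compile(r"^[a-zA-Z0-9._%+-]+@[a-zA-Z0-9.-]+\.[a-zA-Z]{2,}$")

def looks_like_email(value):
    """Shape check only; it does not prove the address exists."""
    # fullmatch also rejects a trailing newline, which $ alone would tolerate.
    return EMAIL_RE.fullmatch(value) is not None

accepted = looks_like_email("user.name+tag@example.co.uk")
no_tld = looks_like_email("user@localhost")          # no dot + TLD after the domain
trailing_newline = looks_like_email("user@example.com\n")
double_dot = looks_like_email("a@b..com")            # passes, although it is not a valid domain
```

For stricter checks, a common approach is to keep the regex as a first filter and confirm the address by sending a verification message.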

Strong Password

^(?=.*[A-Za-z])(?=.*\d)[A-Za-z\d]{8,}$
  • ^: Start of string.
  • (?=.*[A-Za-z]): Positive lookahead ensuring at least one letter exists.
  • (?=.*\d): Positive lookahead ensuring at least one digit exists.
  • [A-Za-z\d]{8,}: Matches 8 or more characters, letters and digits only. Passwords containing symbols such as ! or # are rejected by this pattern.
  • $: End of string.
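A short Python sketch of the password pattern. The function name `password_ok` is hypothetical. The last case shows that a symbol makes the password fail, because the final character class allows letters and digits only.

```python
import re

PASSWORD_RE = re.compile(r"^(?=.*[A-Za-z])(?=.*\d)[A-Za-z\d]{8,}$")

def password_ok(candidate):
    """True if the candidate satisfies the pattern above."""
    return PASSWORD_RE.fullmatch(candidate) is not None

results = {
    "abc12345": password_ok("abc12345"),     # letters and digits, 8 chars
    "abcdefgh": password_ok("abcdefgh"),     # no digit
    "12345678": password_ok("12345678"),     # no letter
    "abc123": password_ok("abc123"),         # too short
    "abc12345!": password_ok("abc12345!"),   # symbol not allowed by the class
}
print(results)
```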

Date Validation (YYYY-MM-DD)

^(19|20)\d{2}-(0[1-9]|1[0-2])-(0[1-9]|[12]\d|3[01])$
  • ^: Start of string.
  • (19|20): Matches either 19 or 20 (for the century).
  • \d{2}: Matches exactly 2 digits (year).
  • -: Literal hyphen.
  • (0[1-9]|1[0-2]): Matches 01-09 OR 10-12 (month).
  • -: Literal hyphen.
  • (0[1-9]|[12]\d|3[01]): Matches 01-09 OR 10-29 OR 30-31 (day). The pattern does not check month lengths, so an impossible date such as 2023-02-31 still matches.
  • $: End of string.
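One way to cover calendar rules that the pattern cannot express is to pair it with a date library. The sketch below assumes Python and its standard `datetime` module; `is_real_date` is a hypothetical helper name.

```python
import re
from datetime import date

DATE_RE = re.compile(r"^(19|20)\d{2}-(0[1-9]|1[0-2])-(0[1-9]|[12]\d|3[01])$")

def is_real_date(value):
    """Format check with the regex, then a calendar check with datetime.date."""
    if DATE_RE.fullmatch(value) is None:
        return False
    year, month, day = (int(part) for part in value.split("-"))
    try:
        date(year, month, day)   # raises ValueError for dates like Feb 31
    except ValueError:
        return False
    return True

leap_day = is_real_date("2024-02-29")
feb_31 = is_real_date("2023-02-31")
bad_month = is_real_date("2023-13-01")
```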

IPv4 Address

^(?:(?:25[0-5]|2[0-4][0-9]|[01]?[0-9][0-9]?)\.){3}(?:25[0-5]|2[0-4][0-9]|[01]?[0-9][0-9]?)$
  • ^: Start of string.
  • (?:...){3}: Non-capturing group repeated 3 times for the first 3 octets.
  • 25[0-5]: Matches 250-255.
  • 2[0-4][0-9]: Matches 200-249.
  • [01]?[0-9][0-9]?: Matches 0-199 (with optional leading zeros).
  • \.: Literal dot.
  • The last part matches the final octet without a trailing dot.
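A minimal Python sketch that applies the IPv4 pattern. The function name `is_ipv4` is an assumption. Note the last case: the pattern accepts leading zeros, which some systems interpret differently or reject.

```python
import re

OCTET = r"(?:25[0-5]|2[0-4][0-9]|[01]?[0-9][0-9]?)"
IPV4_RE = re.compile(r"^(?:" + OCTET + r"\.){3}" + OCTET + r"$")

def is_ipv4(value):
    """True if value is four dot-separated numbers, each in the range 0-255."""
    return IPV4_RE.fullmatch(value) is not None

typical = is_ipv4("192.168.1.1")
too_large = is_ipv4("256.1.1.1")          # 256 is out of range
too_short = is_ipv4("1.2.3")              # only three octets
leading_zeros = is_ipv4("01.02.03.004")   # accepted by this pattern
```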

Regex in Different Languages

While the core concepts of regex are universal, different programming languages have slight variations in their implementation (flavors).

| Language | Flavor | Key Characteristics |
| --- | --- | --- |
| JavaScript | ECMAScript | Standard for web browsers. Supports lookaheads; lookbehinds were added in ES2018. |
| PHP | PCRE | Perl Compatible Regular Expressions. Very powerful and feature-rich, supporting recursion and atomic grouping. |
| Python | Python `re` | Similar to PCRE but with some differences in syntax and behavior. Supports named groups and conditional matching. |
| Java | java.util.regex | Strict and verbose. Requires double escaping for backslashes in string literals (e.g., \\d). |
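Two of these differences are easy to see from Python, sketched below with made-up sample text: raw strings avoid the double escaping that ordinary string literals need, and Python's `re` module writes named groups as (?P<name>...), whereas JavaScript and Java write them as (?<name>...).

```python
import re

# A raw string keeps the backslash as-is, so r"\d+" and "\\d+" are the same pattern.
same_pattern = r"\d+" == "\\d+"

# Named groups in Python's re module use the (?P<name>...) spelling.
m = re.search(r"(?P<year>\d{4})-(?P<month>\d{2})", "due 2024-05")
year = m.group("year")
month = m.group("month")

print(same_pattern, year, month)
```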

Best Practices for Writing Regex

  • Be Specific: Avoid using .* unless absolutely necessary. It can be slow and match more than intended (greedy matching).
  • Use Anchors: Always use ^ and $ when validating entire strings to prevent partial matches.
  • Comment Your Regex: In languages that support it (like PCRE with x flag), use comments to explain complex patterns.
  • Test Extensively: Use tools like this Regex Tester to verify your patterns against various test cases, including edge cases.
  • Keep it Simple: If a regex becomes too complex, consider breaking it down or using string manipulation functions instead.
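Two of these practices, sketched in Python under the assumption that the target flavor is Python's `re`: the `re.VERBOSE` flag plays the role of the x flag and allows comments inside the pattern, and `re.fullmatch` validates the whole string rather than any part of it.

```python
import re

# re.VERBOSE ignores whitespace in the pattern and allows # comments.
DATE_RE = re.compile(r"""
    (?P<year>\d{4})    # four-digit year
    -
    (?P<month>\d{2})   # two-digit month
    -
    (?P<day>\d{2})     # two-digit day
""", re.VERBOSE)

parsed = DATE_RE.fullmatch("2024-05-17")
year = parsed.group("year")

# Without anchoring, a pattern can succeed on just part of the input.
partial = re.search(r"\d{4}", "abc12345") is not None
entire = re.fullmatch(r"\d{4}", "abc12345") is not None
```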

Troubleshooting Common Regex Issues

Even experienced developers run into issues with regular expressions. Here are some common problems and how to solve them:

1. Catastrophic Backtracking

This occurs when a regex engine takes an exponential amount of time to find a match (or confirm a non-match). It often happens with nested quantifiers like (a+)+. To avoid this:

  • Avoid nested quantifiers where possible.
  • Use atomic groups or possessive quantifiers (if supported by your flavor).
  • Be specific about what you match instead of using .*.
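As a sketch of the first tip, the nested pattern (a+)+b can be flattened to a+b, which accepts exactly the same strings but leaves the engine only one way to match each run of a's. The Python example below checks that the two agree on a few short inputs; it intentionally does not run the nested pattern on a long input.

```python
import re

risky = re.compile(r"(a+)+b")   # nested quantifiers: many ways to split a run of a's
safe = re.compile(r"a+b")       # same set of matching strings, no nesting

# On short inputs the two patterns give the same accept/reject answer.
samples = ["ab", "aaab", "aaaa", "b", ""]
agree = all(
    (risky.fullmatch(s) is None) == (safe.fullmatch(s) is None)
    for s in samples
)

# The flat pattern rejects a long non-matching input without trouble.
# Running `risky` on this input is deliberately skipped: a backtracking
# engine may need an enormous number of steps before giving up.
long_input = "a" * 5000
safe_result = safe.fullmatch(long_input)
```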

2. Forgetting to Escape Special Characters

Characters like ., *, ?, +, [, ], (, ), {, }, ^, $, |, and \ have special meanings. If you want to match them literally, you must escape them with a backslash (e.g., \. for a dot, or \\ for a backslash).
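When the literal text comes from a variable, escaping by hand is error-prone. In Python, `re.escape` does it for you. The sample strings below are illustrative assumptions.

```python
import re

literal = "1+1=2 (approx.)"
haystack = "so 1+1=2 (approx.) holds"

# Used as-is, the special characters change the meaning and the search fails.
unescaped_hit = re.search(literal, haystack) is not None

# re.escape backslash-escapes the special characters so the text matches literally.
escaped_hit = re.search(re.escape(literal), haystack) is not None

# An unescaped dot matches any character; an escaped dot matches only a dot.
dot_any = re.fullmatch(r"a.c", "abc") is not None
dot_literal = re.fullmatch(r"a\.c", "abc") is not None
```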

3. Multiline vs. Singleline Mode

Confusion often arises around how the dot (.) and anchors (^, $) behave:

  • Multiline Mode (m flag): ^ and $ match the start/end of each line, not just the string.
  • Singleline (Dotall) Mode (s flag): The dot (.) matches every character, including newlines (which it normally doesn't).
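The two modes are independent, as this Python sketch shows. `re.MULTILINE` and `re.DOTALL` are Python's names for the m and s flags; the two-line sample text is an assumption.

```python
import re

text = "first line\nsecond line"

# $ normally matches only at the end of the string.
end_only = re.findall(r"line$", text)

# With MULTILINE, $ also matches at the end of each line.
each_line = re.findall(r"line$", text, re.MULTILINE)

# The dot normally stops at a newline, so this cannot span both lines.
across_default = re.search(r"first.*second", text)

# With DOTALL, the dot matches the newline too.
across_dotall = re.search(r"first.*second", text, re.DOTALL)
```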

4. Greedy vs. Lazy Matching

By default, quantifiers are greedy (match as much as possible). If your regex is matching too much text (e.g., from the first opening tag to the last closing tag), switch to lazy matching by adding a ? (e.g., .*? instead of .*).
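A short Python sketch of the difference, using a made-up HTML snippet. The third variant, a negated character class, is an additional common alternative that avoids the question of greediness altogether.

```python
import re

html = "<div>content</div>"

greedy = re.findall(r"<.*>", html)       # runs to the last ">"
lazy = re.findall(r"<.*?>", html)        # stops at the first ">"
negated = re.findall(r"<[^>]*>", html)   # cannot cross a ">" at all

print(greedy, lazy, negated)
```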


Frequently Asked Questions (FAQ)

What is the difference between greedy and non-greedy matching?

Greedy matching (default) tries to match as much text as possible. For example, <.*> on <div>content</div> will match the entire string. Non-greedy (lazy) matching, indicated by adding a ? after the quantifier (e.g., <.*?>), matches as little as possible, so its first match is just <div>.

How do I match a special character literally?

To match a special character literally, you need to escape it using a backslash (\). For example, to match a dot, use \.. To match an asterisk, use \*.

What are regex flags?

Flags are optional parameters that modify the behavior of the regex engine. Common flags include:
  • g (Global): Find all matches, not just the first one.
  • i (Case-insensitive): Match letters regardless of case.
  • m (Multiline): Treat the string as multiple lines (affects ^ and $).
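Flag spelling varies by language. As a sketch for Python: there is no g flag, so the choice between the first match and all matches is made by the function you call (`re.search` versus `re.findall` or `re.finditer`), while i and m map to `re.IGNORECASE` and `re.MULTILINE`. The sample text is an assumption.

```python
import re

text = "Cat cat CAT"

# First match only (like omitting g).
first = re.search(r"cat", text, re.IGNORECASE).group(0)

# All matches (like adding g).
every = re.findall(r"cat", text, re.IGNORECASE)

# The same flag can be written inline at the start of the pattern.
inline = re.findall(r"(?i)cat", text)

# Several flags are combined with the | operator.
line_starts = re.findall(r"^cat", "cat\nCAT", re.IGNORECASE | re.MULTILINE)
```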

Why does my regex work in one language but not in another?

This is likely due to differences in regex flavors. While the basics are consistent, advanced features like lookbehinds, named groups, and recursion support vary between engines (e.g., JS vs. PCRE vs. Python). Always check the documentation for your specific environment.

Can I use regex to parse HTML?

While possible for simple cases, it is generally not recommended to parse HTML with regex because HTML is not a regular language. It's better to use a dedicated HTML parser (like DOMParser in JS or BeautifulSoup in Python) for reliable results.
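As a sketch of the parser approach, the example below collects link targets with Python's standard-library `html.parser`, used here instead of BeautifulSoup only to stay dependency-free. The class name `LinkCollector` and the sample markup are assumptions.

```python
from html.parser import HTMLParser

class LinkCollector(HTMLParser):
    """Collects the href value of every <a> tag it sees."""

    def __init__(self):
        super().__init__()
        self.links = []

    def handle_starttag(self, tag, attrs):
        # attrs is a list of (name, value) pairs for the tag.
        if tag == "a":
            for name, value in attrs:
                if name == "href" and value is not None:
                    self.links.append(value)

collector = LinkCollector()
collector.feed('<p>See <a href="/docs">docs</a> and <a class="x" href="/faq">FAQ</a>.</p>')
collector.close()
links = collector.links
print(links)
```

The parser handles attribute order and quoting for you, which a pattern such as <a href="(.*?)"> would not.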