CXX. Regular Expression Functions (Perl-Compatible)

Introduction

The syntax for patterns used in these functions closely resembles that of Perl. The expression must be enclosed in delimiters, for example forward slashes (/). Any character can be used as a delimiter as long as it is not alphanumeric or a backslash (\). If the delimiter character has to be used in the expression itself, it must be escaped with a backslash. Since PHP 4.0.4, you can also use Perl-style (), {}, [], and <> matching delimiters. See Pattern Syntax for a detailed explanation.
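The delimiter rules can be illustrated with a minimal sketch. The subject string and the example.com URL are made up for illustration; the three patterns are meant to be equivalent and differ only in the delimiter chosen.

```php
<?php
// Hypothetical subject string, used for illustration only.
$subject = 'see http://example.com/docs for details';

// Slash delimiter: every literal "/" inside the pattern must be escaped.
$a = preg_match('/http:\/\/([^\/]+)\//', $subject, $m1);

// Hash delimiter: the slashes no longer need escaping.
$b = preg_match('#http://([^/]+)/#', $subject, $m2);

// Perl-style matching delimiters: the opening and closing characters differ.
$c = preg_match('{http://([^/]+)/}', $subject, $m3);

// Each variant captures the same host name in group 1.
echo $m1[1], "\n";
echo $m2[1], "\n";
echo $m3[1], "\n";
```

Choosing a delimiter that does not occur in the pattern usually keeps the expression easier to read than escaping every occurrence.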

The ending delimiter may be followed by various modifiers that affect the matching. See Pattern Modifiers.
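As a rough illustration of how modifiers change matching, the sketch below runs one pattern with no modifier, with i (case-insensitive), and with i and m (multiline) combined. The subject string is invented for the example.

```php
<?php
// Two lines of text; only the second starts with lowercase "php".
$subject = "PHP 5\nphp 4";

// No modifier: case-sensitive, and ^ matches only at the very start.
$none = preg_match_all('/^php/', $subject, $m0);

// "i" makes the match case-insensitive, so the first line matches.
$caseless = preg_match_all('/^php/i', $subject, $m1);

// "m" additionally lets ^ match at the start of every line.
$multiline = preg_match_all('/^php/im', $subject, $m2);

echo "$none $caseless $multiline\n";
```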

PHP also supports regular expressions in POSIX-extended syntax through the POSIX-extended regex functions.

Note: This extension maintains a global per-thread cache of compiled regular expressions (up to 4096).

Warning

You should be aware of some limitations of PCRE. Read http://www.pcre.org/pcre.txt for more information.

Requirements

No external libraries are needed to build this extension.

Installation

Beginning with PHP 4.2.0, these functions are enabled by default. You can disable the PCRE functions with --without-pcre-regex. If you are not using the bundled library, use --with-pcre-regex=DIR to specify the directory DIR where PCRE's include and library files are located. For older versions you have to configure and compile PHP with --with-pcre-regex[=DIR] in order to use these functions.

The Windows version of PHP has built-in support for this extension. You do not need to load any additional extension in order to use these functions.

Runtime Configuration

The behaviour of these functions is affected by settings in php.ini.

Table 1. PCRE Configuration Options

Name                  Default  Changeable   Changelog
pcre.backtrack_limit  100000   PHP_INI_ALL  Available since PHP 5.2.0.
pcre.recursion_limit  100000   PHP_INI_ALL  Available since PHP 5.2.0.

For further details and definitions of the PHP_INI_* constants, see Appendix I.

Here's a short explanation of the configuration directives.

pcre.backtrack_limit integer

PCRE's backtracking limit.

pcre.recursion_limit integer

PCRE's recursion limit. Please note that if you set this value to a high number you may consume all the available process stack and eventually crash PHP (due to reaching the stack size limit imposed by the Operating System).
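Because both directives are PHP_INI_ALL, they can be changed at runtime with ini_set(). The following is a minimal sketch rather than a recommended setting: the helper pcre_error_name() is hypothetical, the limit of 100 is deliberately tiny, and the exact error reported for the first match may depend on the PHP and PCRE versions in use.

```php
<?php
// Hypothetical helper: map a preg_last_error() code to its constant name.
function pcre_error_name($code)
{
    $names = array(
        PREG_NO_ERROR              => 'PREG_NO_ERROR',
        PREG_INTERNAL_ERROR        => 'PREG_INTERNAL_ERROR',
        PREG_BACKTRACK_LIMIT_ERROR => 'PREG_BACKTRACK_LIMIT_ERROR',
        PREG_RECURSION_LIMIT_ERROR => 'PREG_RECURSION_LIMIT_ERROR',
        PREG_BAD_UTF8_ERROR        => 'PREG_BAD_UTF8_ERROR',
    );
    return isset($names[$code]) ? $names[$code] : "unknown ($code)";
}

// Remember the current limit, then lower it drastically.
$old = ini_get('pcre.backtrack_limit');
ini_set('pcre.backtrack_limit', '100');

// A pattern that backtracks heavily on a subject it cannot match.
$result  = @preg_match('/(?:\D+|<\d+>)*[!?]/', 'foobar foobar foobar');
$limited = pcre_error_name(preg_last_error());

// Restore the previous limit; an ordinary match then reports no error.
ini_set('pcre.backtrack_limit', $old);
$ok       = preg_match('/foo/', 'foobar');
$restored = pcre_error_name(preg_last_error());

echo "with low limit: $limited\n";
echo "after restore:  $restored\n";
```

Checking preg_last_error() after a call that returns false or an unexpected result is the only way to tell a genuine non-match from a match that was aborted by one of these limits.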

Resource Types

This extension has no resource types defined.

Predefined Constants

The constants below are defined by this extension, and will only be available when the extension has either been compiled into PHP or dynamically loaded at runtime.

Table 2. PREG constants

Constant  Description
PREG_PATTERN_ORDER Orders results so that $matches[0] is an array of full pattern matches, $matches[1] is an array of strings matched by the first parenthesized subpattern, and so on. This flag is only used with preg_match_all().
PREG_SET_ORDER Orders results so that $matches[0] is an array of the first set of matches, $matches[1] is an array of the second set of matches, and so on. This flag is only used with preg_match_all().
PREG_OFFSET_CAPTURE See the description of PREG_SPLIT_OFFSET_CAPTURE. This flag is available since PHP 4.3.0.
PREG_SPLIT_NO_EMPTY This flag tells preg_split() to return only non-empty pieces.
PREG_SPLIT_DELIM_CAPTURE This flag tells preg_split() to capture and return the parenthesized expressions in the delimiter pattern as well. This flag is available since PHP 4.0.5.
PREG_SPLIT_OFFSET_CAPTURE If this flag is set, the string offset of every match is returned along with the match itself. Note that this changes the return value into an array where every element is itself an array consisting of the matched string at index 0 and its offset within the subject at index 1. This flag is available since PHP 4.3.0 and is only used with preg_split().
PREG_NO_ERROR Returned by preg_last_error() if there were no errors. Available since PHP 5.2.0.
PREG_INTERNAL_ERROR Returned by preg_last_error() if there was an internal PCRE error. Available since PHP 5.2.0.
PREG_BACKTRACK_LIMIT_ERROR Returned by preg_last_error() if the backtrack limit was exhausted. Available since PHP 5.2.0.
PREG_RECURSION_LIMIT_ERROR Returned by preg_last_error() if the recursion limit was exhausted. Available since PHP 5.2.0.
PREG_BAD_UTF8_ERROR Returned by preg_last_error() if the last error was caused by malformed UTF-8 data (only when running a regex in UTF-8 mode). Available since PHP 5.2.0.
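The effect of the ordering, offset, and split flags in Table 2 is easiest to see on small inputs. The sketch below uses invented subject strings; the comments describe the shape of each result.

```php
<?php
$html    = '<b>bold</b> and <i>italic</i>';
$pattern = '/<(\w+)>(.*?)<\/\1>/';

// PREG_PATTERN_ORDER: one array per capture group.
// $byPattern[1] holds every tag name, $byPattern[2] every tag body.
preg_match_all($pattern, $html, $byPattern, PREG_PATTERN_ORDER);

// PREG_SET_ORDER: one array per match.
// $bySet[0] holds the full text, tag name and body of the first match.
preg_match_all($pattern, $html, $bySet, PREG_SET_ORDER);

// PREG_OFFSET_CAPTURE: each entry becomes array(string, offset).
preg_match('/cd/', 'ab cd', $withOffset, PREG_OFFSET_CAPTURE);

// PREG_SPLIT_NO_EMPTY drops the empty pieces produced by ",," and
// by the trailing space.
$noEmpty = preg_split('/[\s,]+/', 'a, b,,c ', -1, PREG_SPLIT_NO_EMPTY);

// PREG_SPLIT_DELIM_CAPTURE keeps the parenthesized delimiter.
$delims = preg_split('/(-)/', '1-2', -1, PREG_SPLIT_DELIM_CAPTURE);

// PREG_SPLIT_OFFSET_CAPTURE pairs each piece with its offset.
$offsets = preg_split('/ /', 'ab cd', -1, PREG_SPLIT_OFFSET_CAPTURE);

print_r($byPattern);
print_r($bySet);
print_r($withOffset);
print_r($noEmpty);
print_r($delims);
print_r($offsets);
```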

Examples

Example 1. Examples of valid patterns

  • /<\/\w+>/

  • |(\d{3})-\d+|Sm

  • /^(?i)php[34]/

  • {^\s+(\s+)?$}
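One way to confirm that the patterns above compile is to run each of them through preg_match(): a return value of 0 or 1 means the pattern was accepted, whereas false would indicate an error. This is only a sketch; the subject string is invented for the example.

```php
<?php
// The four patterns from Example 1, written as single-quoted strings so
// that the backslashes reach PCRE unchanged.
$valid = array(
    '/<\/\w+>/',
    '|(\d{3})-\d+|Sm',
    '/^(?i)php[34]/',
    '{^\s+(\s+)?$}',
);

// Hypothetical subject string, used for illustration only.
$subject = 'PHP4 </b> 555-1234';

$results = array();
foreach ($valid as $pattern) {
    // 1 = matched, 0 = compiled but did not match, false = error.
    $results[] = preg_match($pattern, $subject);
}

var_dump($results);
```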

Example 2. Examples of invalid patterns

  • /href='(.*)' - missing ending delimiter

  • /\w+\s*\w+/J - unknown modifier 'J'

  • 1-\d3-\d3-\d4| - missing starting delimiter
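Delimiter mistakes like these can be detected at runtime, because preg_match() raises a warning and returns false when a pattern cannot be compiled. The sketch below wraps that check in a hypothetical helper, is_valid_pattern(); the @ operator is used only to keep the warning out of the output. The unknown-modifier case is left out, on the assumption that the set of recognised modifiers may differ between PHP versions.

```php
<?php
// Hypothetical helper: true if the pattern compiles, false otherwise.
function is_valid_pattern($pattern)
{
    // Matching against an empty subject is enough to force compilation.
    return @preg_match($pattern, '') !== false;
}

$missingEnd   = is_valid_pattern("/href='(.*)'");    // no ending delimiter
$missingStart = is_valid_pattern('1-\d3-\d3-\d4|');  // "1" cannot be a delimiter
$corrected    = is_valid_pattern("/href='(.*)'/");   // ending delimiter added

var_dump($missingEnd, $missingStart, $corrected);
```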

Table of Contents
Pattern Modifiers -- Describes possible modifiers in regex patterns
Pattern Syntax -- Describes PCRE regex syntax
preg_grep -- Return array entries that match the pattern
preg_last_error -- Returns the error code of the last PCRE regex execution
preg_match_all -- Perform a global regular expression match
preg_match -- Perform a regular expression match
preg_quote -- Quote regular expression characters
preg_replace_callback -- Perform a regular expression search and replace using a callback
preg_replace -- Perform a regular expression search and replace
preg_split -- Split string by a regular expression