Oracle® Database SQL Reference 10g Release 2 (10.2) Part Number B14200-02
Syntax
REGEXP_SUBSTR(source_char, pattern [, position [, occurrence [, match_parameter ] ] ])
Purpose
REGEXP_SUBSTR extends the functionality of the SUBSTR function by letting you search a string for a regular expression pattern. It is also similar to REGEXP_INSTR, but instead of returning the position of the substring, it returns the substring itself. This function is useful if you need the contents of a match string but not its position in the source string. The function returns the string as VARCHAR2 or CLOB data in the same character set as source_char.
This function complies with the POSIX regular expression standard and the Unicode Regular Expression Guidelines. For more information, please refer to Appendix C, "Oracle Regular Expression Support".
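As a rough illustration of how REGEXP_SUBSTR relates to REGEXP_INSTR, the following sketch passes the same source string and pattern to both functions. The column aliases match_position and match_text are made up for this illustration.

```sql
-- Same source string and pattern passed to both functions.
-- match_position and match_text are illustrative aliases only.
SELECT REGEXP_INSTR('500 Oracle Parkway, Redwood Shores, CA', ',[^,]+,')  AS match_position,
       REGEXP_SUBSTR('500 Oracle Parkway, Redwood Shores, CA', ',[^,]+,') AS match_text
  FROM DUAL;
-- match_position: 19 (the position of the first comma)
-- match_text:     ', Redwood Shores,'
```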
source_char is a character expression that serves as the search value. It is commonly a character column and can be of any of the datatypes CHAR, VARCHAR2, NCHAR, NVARCHAR2, CLOB, or NCLOB.
pattern is the regular expression. It is usually a text literal and can be of any of the datatypes CHAR, VARCHAR2, NCHAR, or NVARCHAR2. It can contain up to 512 bytes. If the datatype of pattern is different from the datatype of source_char, Oracle Database converts pattern to the datatype of source_char. For a listing of the operators you can specify in pattern, please refer to Appendix C, "Oracle Regular Expression Support".
position is a positive integer indicating the character of source_char where Oracle should begin the search. The default is 1, meaning that Oracle begins the search at the first character of source_char.
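A minimal sketch of how position shifts the start of the search. The sample string and the column aliases are invented for illustration.

```sql
-- In 'one two three', characters 1-4 are 'one ', so character 5 is the 't' of 'two'.
SELECT REGEXP_SUBSTR('one two three', '[[:alpha:]]+')    AS from_start,  -- 'one'
       REGEXP_SUBSTR('one two three', '[[:alpha:]]+', 2) AS from_pos_2,  -- 'ne'
       REGEXP_SUBSTR('one two three', '[[:alpha:]]+', 5) AS from_pos_5   -- 'two'
  FROM DUAL;
```

The second column shows that the search starts exactly at the given character, even when that falls in the middle of a word.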
occurrence is a positive integer indicating which occurrence of pattern in source_char Oracle should search for. The default is 1, meaning that Oracle searches for the first occurrence of pattern.
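The following sketch shows occurrence selecting successive matches. The sample string and aliases are assumptions made for illustration. Because the arguments are positional, position has to be supplied (here as 1) in order to specify occurrence.

```sql
-- Each call starts at character 1 and asks for a different occurrence of the pattern.
SELECT REGEXP_SUBSTR('one two three', '[[:alpha:]]+', 1, 1) AS first_word,   -- 'one'
       REGEXP_SUBSTR('one two three', '[[:alpha:]]+', 1, 3) AS third_word,   -- 'three'
       REGEXP_SUBSTR('one two three', '[[:alpha:]]+', 1, 4) AS fourth_word   -- NULL, there is no fourth match
  FROM DUAL;
```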
match_parameter is a text literal that lets you change the default matching behavior of the function. You can specify one or more of the following values for match_parameter:
'i' specifies case-insensitive matching.
'c' specifies case-sensitive matching.
'n' allows the period (.), which is the match-any-character character, to match the newline character. If you omit this parameter, the period does not match the newline character.
'm' treats the source string as multiple lines. Oracle interprets ^ and $ as the start and end, respectively, of any line anywhere in the source string, rather than only at the start or end of the entire source string. If you omit this parameter, Oracle treats the source string as a single line.
'x' ignores whitespace characters. By default, whitespace characters match themselves.
If you specify multiple contradictory values, Oracle uses the last value. For example, if you specify 'ic', then Oracle uses case-sensitive matching. If you specify a character other than those shown above, then Oracle returns an error.
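The match_parameter values can be tried side by side, as in this sketch. The sample strings and column aliases are invented for illustration, and the 'x' case assumes that the ignored whitespace is the whitespace inside the pattern.

```sql
-- One column per match_parameter value; CHR(10) is the newline character.
SELECT
  REGEXP_SUBSTR('Oracle Database', 'database', 1, 1, 'i')  AS case_insensitive, -- 'Database'
  REGEXP_SUBSTR('Oracle Database', 'database', 1, 1, 'c')  AS case_sensitive,   -- NULL
  REGEXP_SUBSTR('Oracle Database', 'database', 1, 1, 'ic') AS last_value_wins,  -- NULL, 'c' overrides 'i'
  REGEXP_SUBSTR('a' || CHR(10) || 'b', 'a.b', 1, 1, 'n')   AS dot_newline,      -- 'a', newline, 'b'
  REGEXP_SUBSTR('first' || CHR(10) || 'second',
                '^s[[:alpha:]]+', 1, 1, 'm')               AS line_start,       -- 'second'
  REGEXP_SUBSTR('abc', 'a b c', 1, 1, 'x')                 AS ignore_space      -- 'abc' (spaces in the pattern ignored)
FROM DUAL;
```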
If you omit match_parameter, then:
The default case sensitivity is determined by the value of the NLS_SORT parameter.
A period (.) does not match the newline character.
The source string is treated as a single line.
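A sketch of the default behavior when match_parameter is omitted. The sample strings and aliases are illustrative. The session settings BINARY and BINARY_CI are assumed here as examples of a case-sensitive and a case-insensitive NLS_SORT value, so the comments on those queries describe the expected result rather than a verified one.

```sql
-- Defaults for the period and for line handling (no match_parameter given).
SELECT REGEXP_SUBSTR('a' || CHR(10) || 'b', 'a.b')                     AS dot_default,  -- NULL, '.' does not match newline
       REGEXP_SUBSTR('first' || CHR(10) || 'second', '^s[[:alpha:]]+') AS line_default  -- NULL, '^' only matches at the start of the string
  FROM DUAL;

-- Default case sensitivity follows NLS_SORT (assumed session settings).
ALTER SESSION SET NLS_SORT = BINARY;
SELECT REGEXP_SUBSTR('Oracle', 'oracle') AS binary_sort FROM DUAL;     -- expected: NULL (case-sensitive)

ALTER SESSION SET NLS_SORT = BINARY_CI;
SELECT REGEXP_SUBSTR('Oracle', 'oracle') AS binary_ci_sort FROM DUAL;  -- expected: 'Oracle' (case-insensitive)
```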
Examples
The following example examines the string, looking for the first substring bounded by commas. Oracle Database searches for a comma followed by one or more occurrences of non-comma characters followed by a comma. Oracle returns the substring, including the leading and trailing commas.
SELECT REGEXP_SUBSTR('500 Oracle Parkway, Redwood Shores, CA',
                     ',[^,]+,') "REGEXP_SUBSTR"
  FROM DUAL;

REGEXP_SUBSTR
-----------------
, Redwood Shores,
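If the delimiters themselves are not wanted, one possible variation (not part of the original example) is to match runs of non-comma characters and pick the second one with occurrence, then trim the surrounding spaces. The alias city is made up for this sketch.

```sql
-- '[^,]+' matches each comma-separated field in turn; occurrence 2 is ' Redwood Shores'.
-- TRIM removes the leading space left over from the ', ' separator.
SELECT TRIM(REGEXP_SUBSTR('500 Oracle Parkway, Redwood Shores, CA', '[^,]+', 1, 2)) AS city
  FROM DUAL;
-- city: 'Redwood Shores'
```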
The following example examines the string, looking for http:// followed by a substring of one or more alphanumeric characters and optionally, a period (.). Oracle searches for a minimum of three and a maximum of four occurrences of this substring between http:// and either a slash (/) or the end of the string.
SELECT REGEXP_SUBSTR('http://www.oracle.com/products',
                     'http://([[:alnum:]]+\.?){3,4}/?') "REGEXP_SUBSTR"
  FROM DUAL;

REGEXP_SUBSTR
----------------------
http://www.oracle.com/
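The example above covers the case where a slash follows the host name. As a small complementary sketch, the same pattern applied to a URL that ends right after the host name (an input invented for this illustration) returns the match without a trailing slash, because the final /? is optional.

```sql
-- Same pattern as above; the source string ends after 'com', so '/?' matches nothing.
SELECT REGEXP_SUBSTR('http://www.oracle.com',
                     'http://([[:alnum:]]+\.?){3,4}/?') AS url_prefix
  FROM DUAL;
-- url_prefix: 'http://www.oracle.com'
```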