match-regexp

$Revision: 5.0.2.2 $

Function

Package: EXCL

Arguments: (string-or-regexp string-to-match &key newlines-special case-fold return start end shortest)

The string-or-regexp argument is a regular expression object (the result of compile-regexp) or it is a string (in which case it will be compiled into a regular expression object). The string-to-match is a string to match against the regular expression. This function will attempt to match the regular expression against the string-to-match starting at the first character of the string-to-match, and if that fails it will look inside the string-to-match for a match (unless the regular expression begins with a caret).

The keyword arguments are:

newlines-special If true then a newline will not match the . regular expression. This is useful to prevent multiline matches.
case-fold If true then the string-to-match is effectively mapped to lower case before doing the match. Thus lower case characters in the regular expression match either case and upper case characters match only upper case characters.
return The return value from a failed match is nil. If the value of return is :string, then a successful match returns multiple values. The first value is t. The second value is the substring of the string-to-match that matched the regular expression. The third value (if any) is the substring that matched group 1. The fourth value is the substring that matched group 2. And so on. If you use the \| form, then some groups may have no associated match, in which case nil will be returned as that value. In highly nested \| forms, a group may return a match string even though that group had no match in the final match.

If the value of return is :index, then the behavior is just like :string except that, instead of each string, a cons is returned giving the start and end indices of that match in the original string-to-match. The end index is one greater than the index of the last character in the substring.

If the value of return is nil then the one value t is returned when the match succeeds.

start The index of the first character in the string-to-match to match against.
end One past the index of the last character in the string-to-match to match against.
shortest If true, this makes match-regexp return the shortest rather than the longest match. One motivation for this is parsing HTML. Suppose you want to search for the next item in italics, which in HTML looks like <i>foo</i>. If the string is <i>foo</i> and <i>bar</i>, then (match-regexp "<i>.*</i>" string) will match the whole string, including the non-italic part. However, if you use the shortest keyword, then you'll only match the <i>foo</i> part.
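
The keyword arguments above can be sketched in use as follows. This is a minimal illustration, not part of the original documentation: it assumes an Allegro CL image in which match-regexp is exported from the EXCL package, and it assumes that groups are written \( ... \) as described in regexp.htm. The helper names split-key-value, match-span, and first-italic-item are hypothetical.

```lisp
;; Sketch only: requires Allegro CL (match-regexp is in the EXCL package).
;; Helper names and sample patterns are illustrative assumptions.

(defun split-key-value (input)
  "Return the substrings matched by group 1 and group 2, or nil on failure."
  ;; With :return :string the values are t, the whole match, then one
  ;; value per group.  Backslashes are doubled inside a Lisp string.
  (multiple-value-bind (matched whole key value)
      (excl:match-regexp "\\(.*\\)=\\(.*\\)" input :return :string)
    (declare (ignore whole))
    (when matched
      (values key value))))

(defun match-span (regexp input)
  "Return the (start . end) cons for the whole match, or nil on failure."
  ;; With :return :index each string is replaced by a cons of indices;
  ;; the end index is one past the last matched character.
  (multiple-value-bind (matched span)
      (excl:match-regexp regexp input :return :index)
    (when matched
      span)))

(defun first-italic-item (html)
  "Return the first <i>...</i> element in HTML, or nil if there is none."
  ;; :shortest t stops at the first closing tag instead of the last one.
  (multiple-value-bind (matched item)
      (excl:match-regexp "<i>.*</i>" html :shortest t :return :string)
    (when matched
      item)))
```

For instance, (first-italic-item "<i>foo</i> and <i>bar</i>") is intended to return only the <i>foo</i> part, as described under shortest, while the same call without :shortest t would return the whole string.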

Compilation note: there is a compiler macro defined for match-regexp that specially handles match-regexp calls where the first argument is a constant string. That is, the form (match-regexp "foo" x) will compile to code that arranges to call compile-regexp on the string once, when the compiled (fasl) file is loaded, rather than each time the form is evaluated. Since the cost of compile-regexp is high, this saves a lot of time.
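
When the first argument is not a constant string, the compiler macro described above cannot help, so one way to avoid repeated compilation is to call compile-regexp yourself and reuse the resulting object. The following is a sketch under that assumption; the variable *italic-regexp* and the function count-matching-lines are hypothetical names, and the code assumes Allegro CL with compile-regexp and match-regexp in the EXCL package.

```lisp
;; Sketch only: requires Allegro CL.  Names are illustrative assumptions.

(defparameter *italic-regexp* (excl:compile-regexp "<i>.*</i>")
  "Regular expression object, compiled once when this form is loaded.")

(defun count-matching-lines (lines &optional (regexp *italic-regexp*))
  "Count the strings in LINES that contain a match for REGEXP."
  ;; REGEXP is already a regular expression object, so match-regexp does
  ;; not need to compile anything on each call.  :return nil asks only
  ;; for a success flag (t or nil).
  (count-if (lambda (line)
              (excl:match-regexp regexp line :return nil))
            lines))
```

Passing the compiled object rather than the pattern string keeps the cost of compile-regexp out of the loop over lines.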

See regexp.htm for more information.

The general documentation description is in introduction.htm. The index is in index.htm.

Copyright (C) 1998-1999, Franz Inc., Berkeley, CA. All Rights Reserved.