RegularExpression["regex"]
represents the generalized regular expression specified by the string "regex".
RegularExpression
Details
- RegularExpression can be used to represent classes of strings in functions like StringMatchQ, StringReplace, StringCases, and StringSplit.
- RegularExpression supports standard regular expression syntax of the kind used in typical string manipulation languages.
- The following basic elements can be used in regular expression strings:
    c              the literal character c
    .              any character except newline
    [c1c2…]        any of the characters ci
    [c1-c2]        any character in the range c1–c2
    [^c1c2…]       any character except the ci
    p*             p repeated zero or more times
    p+             p repeated one or more times
    p?             zero or one occurrence of p
    p{m,n}         p repeated between m and n times
    p*?,p+?,p??    the shortest consistent strings that match
    (p1p2…)        strings matching the sequence p1, p2, …
    p1|p2          strings matching p1 or p2
- The following represent classes of characters:
    \\d             digit 0–9
    \\D             nondigit
    \\s             space, newline, tab, or other whitespace character
    \\S             non-whitespace character
    \\w             word character (letter, digit, or _)
    \\W             nonword character
    [[:class:]]     characters in a named class
    [^[:class:]]    characters not in a named class
- The following named classes can be used: alnum, alpha, ascii, blank, cntrl, digit, graph, lower, print, punct, space, upper, word, xdigit.
- The following represent positions in strings:
    ^      the beginning of the string (or line)
    $      the end of the string (or line)
    \\b    word boundary
    \\B    anywhere except a word boundary
- The following set options for all regular expression elements that follow them:
    (?i)     treat uppercase and lowercase as equivalent (ignore case)
    (?m)     make ^ and $ match start and end of lines (multiline mode)
    (?s)     allow . to match newline
    (?-c)    unset options
- \\., \\[, etc. represent literal characters ., [, etc.
- Analogs of named Wolfram Language patterns such as x:expr can be set up in regular expression strings using (regex).
- Within a regular expression string, \\gn represents the substring matched by the nth parenthesized regular expression object (regex). The shorter \\n is often equivalent to \\gn.
- For the purpose of functions such as StringReplace and StringCases, any $n appearing in the right-hand side of a rule RegularExpression["regex"]->rhs is taken to correspond to the substring matched by the nth parenthesized regular expression object in regex. $0 represents the whole matched string.
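The option settings listed above can be illustrated with a short sketch. The test strings here are made up for illustration; the results shown follow from the descriptions of (?i) and (?s) and from the rule that . matches any character except newline.

```wolfram
(* (?i) makes the rest of the pattern ignore case *)
StringCases["Apple apple APPLE", RegularExpression["(?i)apple"]]
(* {"Apple", "apple", "APPLE"} *)

(* By default . does not match a newline, so the whole string does not match *)
StringMatchQ["ab\ncd", RegularExpression["ab.cd"]]
(* False *)

(* (?s) allows . to match the newline as well *)
StringMatchQ["ab\ncd", RegularExpression["(?s)ab.cd"]]
(* True *)
```

Because an option applies to all elements that follow it, placing it at the start of the regular expression string affects the whole pattern.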
Examples
Basic Examples (2)
Find words involving the characters a, b, c, d, e:
    StringCases["adefgh12c34", RegularExpression["[a-e]+"]]
Equivalent form using string patterns:
    StringCases["adefgh12c34", CharacterRange["a", "e"]..]
Decide whether the string consists of words and whitespace:
    StringMatchQ["abcd\nefgh\n1234", RegularExpression["(\\w|\\s)*"]]
Equivalent form using string patterns:
    StringMatchQ["abcd\nefgh\n1234", (WordCharacter | Whitespace)...]
Scope (22)
Basic Constructs (17)
Extract any character except newline:
    StringCases["a23b42c63d80, 123", RegularExpression["."]]
    StringCases["a23b42c63d80, 123", Except["\n", _]]
Either of the characters "a" and "b":
    StringCases["a13b12c1da32efg", RegularExpression["[ab]"]]
    StringCases["a13b12c1da32efg", "a" | "b"]
Any character between "a" and "e", including "a" and "e":
    StringCases["adefgh12c34", RegularExpression["[a-e]"]]
    StringCases["adefgh12c34", CharacterRange["a", "e"]]
Any character except "a" and "1":
    StringCases["a13b12c17a32", RegularExpression["[^a1]"]]
    StringCases["a13b12c17a32", Except["a" | "1", _]]
Any digit repeated one or more times:
    StringCases["a23b4222c63333d80", RegularExpression["\\d+"]]
    StringCases["a23b4222c63333d80", NumberString]
The character "a" repeated 2 or 3 times:
    StringCases["aabc1aaaagh2ade", RegularExpression["a{2,3}"]]
    StringCases["aabc1aaaagh2ade", w : (x_ ...) /; (2 ≤ StringLength[w] ≤ 3)]
Any digit:
    StringCases["a2322c63333d80", RegularExpression["\\d"]]
    StringCases["a2322c63333d80", DigitCharacter]
Any nondigit:
    StringCases["a2322c63333d80", RegularExpression["\\D"]]
    StringCases["a2322c63333d80", Except[DigitCharacter]]
Space, newline, tab, or other whitespace character:
    StringCases["13\na22 bbb", RegularExpression["\\s"]]//InputForm
    StringCases["13\na22 bbb", WhitespaceCharacter]//InputForm
Any non-whitespace character:
    StringCases["13\na22 bbb", RegularExpression["\\S"]]
    StringCases["13\na22 bbb", Except[WhitespaceCharacter]]
Any word character (letter, digit, or _):
    StringCases["a23b42c63,d80", RegularExpression["\\w"]]
    StringCases["a23b42c63,d80", WordCharacter]
Any nonword character:
    StringCases["a23b:42c63;d80", RegularExpression["\\W"]]
    StringCases["a23b:42c63;d80", Except[WordCharacter]]
Characters in the named class upper, repeated one or more times:
    StringCases["AaBBccDDeefG", RegularExpression["[[:upper:]]+"]]
    StringCases["AaBBccDDeefG", CharacterRange["A", "Z"]..]
Split a string at the beginning of each line:
    StringSplit["line1\nline2\nline3", RegularExpression["(?m)^"]]//InputForm
    StringSplit["line1\nline2\nline3", StartOfLine]//InputForm
Split a string at the end of each line:
    StringSplit["line1\nline2\nline3", RegularExpression["(?m)$"]]//InputForm
    StringSplit["line1\nline2\nline3", EndOfLine]//InputForm
Insert a character at the boundary of each word:
    StringReplace["123 45 6 789", RegularExpression["\\b"] :> "X"]
    StringReplace["123 45 6 789", WordBoundary :> "X"]
Split a string at every character except at the boundary of a word:
    StringSplit["12X X5X X89", RegularExpression["\\B"]]
    StringSplit["12X X5X X89", Except[WordBoundary]]
Compound Constructs (5)
StringExpression can contain RegularExpression objects:
    StringCases["a13b12c17a32", "a" ~~ x : RegularExpression["\\d+"] -> x]
    StringCases["a13b12c17a32", "a" ~~ x : DigitCharacter.. -> x]
A named RegularExpression object can be combined with a condition:
    StringCases["a23b42c63d80, 123", x : RegularExpression["\\d+"] /; Mod[ToExpression[x], 2] == 0]
    StringCases["a23b42c63d80, 123", x : DigitCharacter.. /; Mod[ToExpression[x], 2] == 0]
Use alternatives to match one or more line breaks:
    StringMatchQ["abcd\nefgh\n1234", RegularExpression["(.*|\\s*)*"]]
    StringMatchQ["abcd\nefgh\n1234", (WordCharacter... | Whitespace)...]
Non-greedy matches are done by appending a question mark "?" to the quantifiers:
    StringCases["abc1agh2cde", RegularExpression["a.+?\\d"]]
    StringCases["abc1agh2cde", Shortest["a" ~~ __ ~~ DigitCharacter]]
Here $1 refers to the character matched by (.):
    StringCases["aaabcccabbaacba", RegularExpression["(.)\\g1"] -> "$1"]
    StringCases["aaabcccabbaacba", x_ ~~ x_ -> x]
    StringCases["a1b6a3b3a3c3a8b8", RegularExpression["(a(\\d))b\\g2"] -> {"$0", "$1", "$2"}]
    StringCases["a1b6a3b3a3c3a8b8", g0 : ((g1 : ("a" ~~ g2 : DigitCharacter)) ~~ "b" ~~ g2_) :> {g0, g1, g2}]
Properties & Relations (3)
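RegularExpression can also be used with StringReplace, as noted in the Details. The following is a minimal sketch, using a made-up date string, of how $n on the right-hand side of a rule picks out the substrings matched by the parenthesized groups:

```wolfram
(* $1, $2 and $3 stand for the substrings matched by the three parenthesized groups *)
StringReplace["2004-06-12",
  RegularExpression["(\\d+)-(\\d+)-(\\d+)"] -> "$3/$2/$1"]
(* "12/06/2004" *)
```

The groups are numbered by the position of their opening parenthesis, so reordering the $n in the replacement string reorders the matched pieces.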
Use StringMatchQ to determine string pattern matches:
    StringMatchQ["12345", RegularExpression["\\d+"]]
Use StringCases to find matching substrings:
    StringCases["aaaa bbbb 1234", RegularExpression["[a-z]+"]]
Use StringSplit to split a string into substrings using a delimiter pattern:
    StringSplit["1.23, 4.56 7.89", RegularExpression["(\\s|,)+"]]
See Also
StringExpression StringCases StringReplace SearchQueryString
Function Repository: ToRegularExpression BioSequenceToRegularExpression
History
Introduced in 2004 (5.1)
Wolfram Research (2004), RegularExpression, Wolfram Language function, https://reference.wolfram.com/language/ref/RegularExpression.html.