A regular expression (often shortened to "regex" or "regexp") is a sequence of characters that specifies a search pattern. It's a powerful tool for searching, matching, and manipulating text. Think of it as a highly advanced "find and replace" that can match complex patterns instead of just simple, literal text. This page provides a basic introduction to regular expressions.
Regular expressions are versatile and are supported by many programming languages and text editors, where they are used for a wide variety of text-processing tasks.
Here are some of the basic concepts you need to know to get started with regular expressions:
/hello/ will match the string "hello".
The dot (.) matches any character except a newline. To match a literal metacharacter, you need to escape it with a backslash (e.g., \. matches a literal dot).
[aeiou] will match any vowel. You can also define a range of characters, like [a-z] for any lowercase letter.
* means "zero or more times", while + means "one or more times".
Parentheses, written (), are used to create groups of characters. This allows you to apply quantifiers to a whole group or to capture the matched text for later use.
Let's say you want to find all the email addresses in a text. A simple regex for this could be:
/\w+@\w+\.\w+/
Here's how it breaks down:
\w+: Matches one or more word characters (letters, numbers, or underscore). This represents the username part of the email.
@: Matches the literal "@" symbol.
\w+: Matches one or more word characters again. This represents the domain name.
\.: Matches a literal dot.
\w+: Matches one or more word characters for the top-level domain (like .com, .org, etc.).
While this is a simple example, it illustrates the power of combining different regex components to create a pattern that can match a wide range of text.
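As a rough illustration of the building blocks and the email pattern described above, here is a minimal sketch using Python's standard re module. The sample text, the variable names, and the find_emails helper are made up for this example. The /.../ delimiters are left out because Python takes the pattern as a plain (raw) string.

```python
import re

# Building blocks from the text, checked one at a time.
assert re.search(r"hello", "say hello")               # literal text
assert re.fullmatch(r"h.t", "hat")                    # . matches any one character
assert re.fullmatch(r"h.t", "h\nt") is None           # ...except a newline
assert re.fullmatch(r"a\.b", "a.b")                   # \. matches a literal dot
assert re.fullmatch(r"a\.b", "axb") is None           # ...and nothing else
assert re.findall(r"[aeiou]", "regex") == ["e", "e"]  # character class
assert re.fullmatch(r"[a-z]+", "abc")                 # range, one or more times
assert re.fullmatch(r"ab*", "a")                      # * allows zero repetitions
assert re.fullmatch(r"ab+", "a") is None              # + needs at least one
assert re.fullmatch(r"(ab)+", "ababab")               # quantifier on a whole group

# The simple email pattern from the text.
EMAIL_PATTERN = re.compile(r"\w+@\w+\.\w+")


def find_emails(text):
    """Return every non-overlapping match of the simple pattern (hypothetical helper)."""
    return EMAIL_PATTERN.findall(text)


# Hypothetical sample text, invented for this illustration.
sample_text = "Contact alice@example.com or bob_smith@test.org for details."
addresses = find_emails(sample_text)
print(addresses)  # ['alice@example.com', 'bob_smith@test.org']

# The same pattern with parentheses added, so each part is captured separately.
GROUPED_PATTERN = re.compile(r"(\w+)@(\w+)\.(\w+)")
match = GROUPED_PATTERN.search(sample_text)
if match:
    username, domain, tld = match.groups()
    print(username, domain, tld)  # alice example com
```

Because \w does not match a dot, this simple pattern cannot handle dots in the username and stops after the first dot in the domain. For example, find_emails("first.last@mail.example.com") returns only "last@mail.example", so a real application would need a stricter pattern or a dedicated validation library.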