”;
The “regexp” command is used to match a regular expression in Tcl. A regular expression is a sequence of characters that contains a search pattern. It consists of multiple rules and the following table explains these rules and corresponding use.
Sr.No. | Rule & Description |
---|---|
1 |
x Exact match. |
2 |
[a-z] Any lowercase letter from a-z. |
3 |
. Any character. |
4 |
^ Beginning string should match. |
5 |
$ Ending string should match. |
6 |
^ Backlash sequence to match special character ^.Similarly you can use for other characters. |
7 |
() Add the above sequences inside parenthesis to make a regular expression. |
8 |
x* Should match 0 or more occurrences of the preceding x. |
9 |
x+ Should match 1 or more occurrences of the preceding x. |
10 |
[a-z]? Should match 0 or 1 occurrence of the preceding x. |
11 |
{digit} Matches exactly digit occurrences of previous regex expression. Digit that contains 0-9. |
12 |
{digit,} Matches 3 or more digit occurrences of previous regex expression. Digit that contains 0-9. |
13 |
{digit1,digit2} Occurrences matches the range between digit1 and digit2 occurrences of previous regex expression. |
Syntax
The syntax for regex is given below −
regexp optionalSwitches patterns searchString fullMatch subMatch1 ... subMatchn
Here, regex is the command. We will see about optional switches later. Patterns are the rules as mentioned earlier. Search string is the actual string on which the regex is performed. Full match is any variable to hold the result of matched regex result. Submatch1 to SubMatchn are optional subMatch variable that holds the result of sub match patterns.
Let”s look at some simple examples before diving into complex ones. A simple example for a string with any alphabets. When any other character is encountered the regex, search will be stopped and returned.
#!/usr/bin/tclsh regexp {([A-Za-z]*)} "Tcl Tutorial" a b puts "Full Match: $a" puts "Sub Match1: $b"
When the above code is executed, it produces the following result −
Full Match: Tcl Sub Match1: Tcl
Multiple Patterns
The following example shows how to search for multiple patterns. This is example pattern for any alphabets followed by any character followed by any alphabets.
#!/usr/bin/tclsh regexp {([A-Za-z]*).([A-Za-z]*)} "Tcl Tutorial" a b c puts "Full Match: $a" puts "Sub Match1: $b" puts "Sub Match2: $c"
When the above code is executed, it produces the following result −
Full Match: Tcl Tutorial Sub Match1: Tcl Sub Match2: Tutorial
A modified version of the above code to show that a sub pattern can contain multiple patterns is shown below −
#!/usr/bin/tclsh regexp {([A-Za-z]*.([A-Za-z]*))} "Tcl Tutorial" a b c puts "Full Match: $a" puts "Sub Match1: $b" puts "Sub Match2: $c"
When the above code is executed, it produces the following result −
Full Match: Tcl Tutorial Sub Match1: Tcl Tutorial Sub Match2: Tutorial
Switches for Regex Command
The list of switches available in Tcl are,
-
nocase − Used to ignore case.
-
indices − Store location of matched sub patterns instead of matched characters.
-
line − New line sensitive matching. Ignores the characters after newline.
-
start index − Sets the offset of start of search pattern.
-
Marks the end of switches
In the above examples, I have deliberately used [A-Z, a-z] for all alphabets, you can easily use -nocase instead of as shown below −
#!/usr/bin/tclsh regexp -nocase {([A-Z]*.([A-Z]*))} "Tcl Tutorial" a b c puts "Full Match: $a" puts "Sub Match1: $b" puts "Sub Match2: $c"
When the above code is executed, it produces the following result −
Full Match: Tcl Tutorial Sub Match1: Tcl Tutorial Sub Match2: Tutorial
Another example using switches is shown below −
#!/usr/bin/tclsh regexp -nocase -line -- {([A-Z]*.([A-Z]*))} "Tcl nTutorial" a b puts "Full Match: $a" puts "Sub Match1: $b" regexp -nocase -start 4 -line -- {([A-Z]*.([A-Z]*))} "Tcl nTutorial" a b puts "Full Match: $a" puts "Sub Match1: $b"
When the above code is executed, it produces the following result −
Full Match: Tcl Sub Match1: Tcl Full Match: Tutorial Sub Match1: Tutorial
”;