Regular expression help and examples for grepWin
grepWin uses the Boost regex engine with the Perl regular expression syntax.
Introduction
I'll only explain the very basics of using regular expressions, plus some special variables you can use in grepWin that aren't part of the official regular expression syntax.
For a much more detailed tutorial on regular expressions, please go to this site - it also explains a lot on how regex engines work internally.
search basics
- . (dot)
- A dot matches any character. Searching for t.t will match tat as well as tut.
- +
- Matches the previous expression one or more times. Searching for spel+ing will find words like speling or spelling, but not speing, since the l must be matched at least once.
- *
- Matches the previous expression zero or more times. Searching for spel*ing will find words like speling or spelling, and also speing, since the l can be matched zero times, which means it doesn't have to be there.
- \
- The backslash escapes characters that would otherwise be treated specially. Searching for a double dot in your text with .. would not work, since each dot matches any character. To search for a double dot you have to escape the dot chars like this: \.\.
- \Q..\E
- In case you need to search for a literal string that has a lot of special characters in it, you can use the \Q..\E sequence. Searching for *.* would not find the literal text *.* unless you escape every special char like this: \*\.\* For such search strings it's easier to just put them inside the \Q..\E sequence like this: \Q*.*\E
- []
- With square brackets you can specify so-called character classes. Such a class matches any one of the chars specified between the brackets. Searching for [-+0-9]+ will find any string that consists only of the chars '-', '+' and the digits 0 to 9. It will match -123, +123 or 123, but not testword. There are a few default character classes defined so you don't have to create one yourself. You can find a list of those classes here. The most used ones are \d which matches all digits, \w which matches all word chars and \s which matches all whitespace chars.
- ^, $
- The caret ^ matches the beginning of a line, and the dollar sign $ matches the end of a line. Searching for ^title$ will only find lines that consist of nothing but the word title, not places where the word title is inside a line. Searching for ^// will find all lines that start with two slashes, but not lines where the two slashes are somewhere else. Searching for goodbye\.$ will find lines that end with goodbye. but not lines where goodbye. is somewhere inside the line.
- \b
- \b matches word boundaries. Searching for \bword\b finds word, but not subwords or words.
- ()
- Parentheses define a group. Grouping is useful for more advanced regex searching, but also when replacing text. Each group that matches part of the full matching string can be referenced later in the replace string.
- |
- The | char is used as an OR operator. Searching for cat|dog will match either cat or dog. Note that the OR operator applies to everything to its left and right. If you want to limit the reach of the operator, you have to use parentheses to group the alternatives. Searching for (cat|dog)food finds catfood and dogfood.
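The search examples above can be checked with a short script. This is only a sketch: it uses Python's re module as a stand-in for grepWin's Boost engine, which is an assumption made for illustration. The constructs shown behave the same in both, except that Python has no \Q..\E, so re.escape is used as the nearest equivalent. The helper names whole and found are made up for this example.

```python
import re

# Assumption: Python's re module stands in for grepWin's Boost engine.


def whole(pattern, text):
    """True if the pattern matches the entire text."""
    return re.fullmatch(pattern, text) is not None


def found(pattern, text):
    """Text of every match; re.MULTILINE makes ^ and $ work per line."""
    return [m.group(0) for m in re.finditer(pattern, text, re.MULTILINE)]


# . matches any single character
dot_ok = whole(r"t.t", "tat") and whole(r"t.t", "tut")

# + needs at least one l, * also accepts none
plus_ok = whole(r"spel+ing", "spelling") and not whole(r"spel+ing", "speing")
star_ok = whole(r"spel*ing", "speing")

# Escaped dots match only literal dots
escaped = found(r"\.\.", "a..b a.b")

# Python has no \Q..\E; re.escape builds the escaped form instead
literal = re.escape("*.*")

# Character class, anchors, word boundaries, alternation
class_hits = found(r"[-+0-9]+", "-123 +123 123 testword")
title_lines = found(r"^title$", "title\nsubtitle\nmy title")
words = found(r"\bword\b", "word subwords words")
foods = found(r"(cat|dog)food", "catfood dogfood birdfood")

print(escaped, literal, class_hits, title_lines, words, foods)
```

Each variable holds the result for one of the patterns described in the list, so a pattern can be changed and rerun to see how the matches differ.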
replacing
Replacing strings is no more complicated than searching. Whatever the search finds is replaced with the replace string. Searching for cat and replacing it with dog is the most basic example and works just like you'd expect.
In replace strings, you can also use references.
- $1..$9
- Matched groups from the search string are referred to with $1..$9. For example, if you search for (cats) and (dogs) and replace it with $2 and $1, the string cats and dogs gets replaced with dogs and cats. $1 refers to the first matching group, which is cats, and $2 refers to the second matching group, which is dogs.
- ${filepath}, ${filename}, ${fileext}
- The ${filepath} reference gets replaced with the full path of the current file. ${filename} gets replaced with the filename without the file extension, and ${fileext} gets replaced with the file extension of the current file. These references are special to grepWin.
- ${count0N}, ${count0N(AA)}, ${count0N(AA,BB)}
- grepWin also offers a special replace reference for counting. ${count0N} is replaced with numbers starting from 1 and incremented by 1. The 0 and N are optional and used for formatting the number. N is a number that specifies how many chars the number should use; the number is padded with spaces to fill that width. If 0 is specified, the number is padded with leading zeros instead. You can also specify the start count using the AA number, and the increment using the BB number.
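To make the group and counter references more concrete, here is a rough Python sketch. It is not how grepWin implements them. Python's re module writes group references as \1 and \2 instead of $1 and $2, and it has no counter reference, so the hypothetical helper count_sub below only imitates the behaviour described for ${count0N(AA,BB)}. The parameter names width, zero_pad, start and step are likewise assumptions that stand for N, 0, AA and BB.

```python
import re
from itertools import count

# Group references: Python uses \2 and \1 where grepWin uses $2 and $1.
swapped = re.sub(r"(cats) and (dogs)", r"\2 and \1", "cats and dogs")


def count_sub(pattern, text, width=0, zero_pad=False, start=1, step=1):
    """Hypothetical helper: replace each match with a running number.

    width    plays the role of N  (minimum number of chars)
    zero_pad plays the role of 0  (pad with zeros instead of spaces)
    start    plays the role of AA (first number)
    step     plays the role of BB (increment)
    """
    counter = count(start, step)
    fill = "0" if zero_pad else " "

    def repl(_match):
        # Right-align the next number within the requested width.
        return str(next(counter)).rjust(width, fill)

    return re.sub(pattern, repl, text)


numbered = count_sub(r"X", "X X X", width=3, zero_pad=True, start=10, step=5)

print(swapped)
print(numbered)
```

With these arguments the helper mimics what the text describes for ${count03(10,5)}: zero-padded numbers three chars wide, starting at 10 and increasing by 5.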
replacing examples
- Insert line numbers at the start of each line
Search string: ^
Replace string: ${count04}
Results in:
0001 line 1
0002 line 2
0003 line 3
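The same result can be reproduced outside grepWin with a few lines of Python, shown here only as an approximation. The function name number_lines is made up, and the trailing space after the number is an assumption taken from the output shown above. re.MULTILINE is needed so that ^ matches at the start of every line rather than only at the start of the text.

```python
import re
from itertools import count


def number_lines(text, width=4):
    """Prefix every line with a zero-padded running number (illustrative only)."""
    counter = count(1)

    def repl(_match):
        # Assumption: a single space separates the number from the line text.
        return f"{next(counter):0{width}d} "

    # ^ with re.MULTILINE matches the empty position at the start of each line.
    return re.sub(r"^", repl, text, flags=re.MULTILINE)


print(number_lines("line 1\nline 2\nline 3"))
```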