I'd rather write regular expressions that escape regular expressions than write escaped regular expressions by hand.
Code:
$ cat testfile
the quick brown fox jumps over the lazy dog
<?php @include("http://".$_SERVER['HTTP_HOST']."/linkingblogv.php"); ?>
$ cat escape-regex
#!/bin/sh
# -- Use here-document with single quoted end-of-document marker
# -- This prevents the shell from messing with any character
pattern=$(cat <<'END'
<?php @include("http://".$_SERVER['HTTP_HOST']."/linkingblogv.php"); ?>
END
)
# printf is used instead of echo because not every shell's echo expands '\n'
printf 'This is the pattern:\n%s\n' "$pattern"
# ---- For BRE (basic regular expressions) like in sed(1)
# escape everything except:
# * ( and ) because '\(' and '\)' capture text in sed(1)
# * { and } because '\{' and '\}' are used to specify minimum and/or max
# * 'n' because '\n' stands for <newline> in a sed(1) regular expression
# * alphabetic characters
# * whitespace
# * digits
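# Successive refinements: each assignment overwrites the previous one,
# so only the last definition is actually used below.
pattern_esc=$(echo "$pattern" | sed -e 's!\([^(){}n]\)!\\\1!g')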
pattern_esc=$(echo "$pattern" | sed -e 's!\([^(){}[:alpha:]]\)!\\\1!g')
pattern_esc=$(echo "$pattern" | sed -e 's!\([^(){}[:alpha:][:blank:]]\)!\\\1!g')
pattern_esc=$(echo "$pattern" | sed -e 's!\([^(){}[:alpha:][:blank:][:digit:]]\)!\\\1!g')
#
# s : the substitute command; the search pattern follows the delimiter
# ! : our custom delimiter
#
# \( : start capture in container '\1'
#
# [ : start of character class
# ^ : negate characters in this class
# () : '(' and ')'
# {} : '{' and '}'
# [:alpha:] : alphabetic character class
# [:blank:] : blank character class (space and tab)
# [:digit:] : numeric character class
#
# \) : end of capture in container '\1'
#
# ! : end of search pattern, start of replacement
#
# \\ : a literal '\' escaped with itself
# \1 : contents of container '\1'
#
# ! : end of replacement
# g : do a 'g'lobal search and replace, not only first match
#
printf '\n===========The escaped pattern=====================\n'
printf '%s\n' "$pattern_esc"
printf "\nDoing a grep on 'testfile'\n"
grep -n "$pattern_esc" testfile
printf "\nUsing sed(1) to replace the pattern with 'GORILLA'\n"
sed -e "s/${pattern_esc}/GORILLA/" testfile
$ ./escape-regex
This is the pattern:
<?php @include("http://".$_SERVER['HTTP_HOST']."/linkingblogv.php"); ?>
===========The escaped pattern=====================
\<\?php \@include(\"http\:\/\/\"\.\$\_SERVER\[\'HTTP\_HOST\'\]\.\"\/linkingblogv\.php\")\; \?\>
Doing a grep on 'testfile'
2:<?php @include("http://".$_SERVER['HTTP_HOST']."/linkingblogv.php"); ?>
Using sed(1) to replace the pattern with 'GORILLA'
the quick brown fox jumps over the lazy dog
GORILLA
My first attempt was a brute-force approach: just escape everything. That succeeded for grep but failed for sed, probably because every "n" became "\n", which sed reads as the newline symbol.
Then I refined the pattern bit by bit, as you can see from the successive definitions of "pattern_esc".
Another approach would be to escape only the characters that are special in a regular expression. But that is left as an exercise for the reader.
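The "n" problem can be reproduced in isolation. The following is a minimal sketch, not part of the original script; the variable names "plain" and "escaped" are made up for illustration.

```shell
#!/bin/sh
# An unescaped 'n' is an ordinary character and matches itself.
plain=$(printf 'banana\n' | sed -e 's/n/N/g')

# '\n' in a sed(1) regex means <newline>. The pattern space holds one line
# without its trailing newline, so nothing matches and the line is unchanged.
escaped=$(printf 'banana\n' | sed -e 's/\n/N/g')

printf '%s\n%s\n' "$plain" "$escaped"
```

The first line printed is "baNaNa", the second is the untouched "banana", which is why escaping every letter breaks the sed replacement.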
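For what it's worth, here is one way that exercise might be solved. It is only a sketch under a few assumptions: the helper name "escape_bre" is made up, the target is a BRE as used by grep(1) and sed(1), and "/" is escaped as well because it is the delimiter in the final sed command. It may also be more portable than escaping nearly everything, since the meaning of a backslash before an ordinary character is not defined by POSIX, and GNU tools, for example, give "\<" and "\?" a special meaning.

```shell
#!/bin/sh
# Hypothetical helper: escape only the characters that are special in a BRE
# ( \ . * [ ^ $ ) plus '/', the delimiter used by the sed command below.
# '!' is the delimiter here so that '/' can sit inside the bracket expression.
escape_bre() {
    printf '%s\n' "$1" | sed -e 's![[\/.*^$]!\\&!g'
}

pattern=$(cat <<'END'
<?php @include("http://".$_SERVER['HTTP_HOST']."/linkingblogv.php"); ?>
END
)

pattern_esc=$(escape_bre "$pattern")
printf '%s\n' "$pattern_esc"

# Feed the original line back in to check that the escaped pattern matches it.
printf '%s\n' "$pattern" | sed -e "s/${pattern_esc}/GORILLA/"
```

With the sample line this should print the pattern with only "/", ".", "$" and "[" escaped, followed by GORILLA. A closing "]" is left alone on purpose: once the opening "[" is escaped, it is an ordinary character.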