DaemonForums  

  #1   (View Single Post)  
Old 2nd December 2008
bigb89 bigb89 is offline
Fdisk Soldier
 
Join Date: May 2008
Posts: 69
Thanked 1 Time in 1 Post
Default Searching and replacing weird patterns on a file.

Hi guys,

Here's what I'm trying to accomplish:

I'm trying to replace the following pattern/line in a file:
Code:
<?php @include("http://".$_SERVER['HTTP_HOST']."/linkingblogv.php"); ?>
As you can see, the above pattern/line contains characters that are treated as "special" if you use sed, perl, etc. to search for and replace it. So in order to search for that line, I would have to use a lot of escape characters ("\"). That is time-consuming and often inaccurate.

So, is there any way I can search for and replace special patterns like that in a file without using escape characters?

Regards,
--Bigb89
  #2   (View Single Post)  
Old 2nd December 2008
TerryP's Avatar
TerryP TerryP is offline
Arp Constable
 
Join Date: May 2008
Location: USofA
Posts: 1,547
Thanked 112 Times in 104 Posts
Default

Short of getting jiggy with the translate command (tr(1)) or some comparable tool, I would think you would have to find something with a totally different regular expression syntax, or live with escaping the things you want to manipulate.
  #3   (View Single Post)  
Old 2nd December 2008
ephemera's Avatar
ephemera ephemera is offline
Knuth's homeboy
 
Join Date: Apr 2008
Posts: 537
Thanked 49 Times in 43 Posts
Default

Quote:
Originally Posted by bigb89 View Post
Code:
<?php @include("http://".$_SERVER['HTTP_HOST']."/linkingblogv.php"); ?>
If you are trying to match (and replace) this string literal, then you don't need to use a regex at all.
Though you will still need to take care of quoting and variable interpolation.

Try:
Code:
perl -i.bak -pe '$_="yabba dabba doo\n" if $_ eq qq(<?php \@include("http://".\$_SERVER[\047HTTP_HOST\047]."/linkingblogv.php"); ?>\n)' file
  #4   (View Single Post)  
Old 2nd December 2008
vermaden's Avatar
vermaden vermaden is offline
Administrator
 
Join Date: Apr 2008
Location: pl_PL.lodz
Posts: 1,052
Thanked 118 Times in 93 Posts
Default

Like this, for example:
Code:
% sed -E s/".*php.*include.*http.*SERVER.*HTTP.*HOST.*linkingblogv.*php.*"/NEW/g yourfile
  #5   (View Single Post)  
Old 2nd December 2008
J65nko J65nko is offline
Administrator
 
Join Date: May 2008
Location: Budel - the Netherlands
Posts: 3,190
Thanked 182 Times in 149 Posts
Default

I would rather write regular expressions that escape regular expressions than write escaped regular expressions.
Code:
$ cat testfile                                                           
the quick brown fox jumps over the lazy dog
<?php @include("http://".$_SERVER['HTTP_HOST']."/linkingblogv.php"); ?>

$ cat escape-regex                                                       
#!/bin/sh

# -- Use here-document with single quoted end-of-document marker
# -- This prevents the shell from messing with any character

pattern=$(cat <<'END'
<?php @include("http://".$_SERVER['HTTP_HOST']."/linkingblogv.php"); ?>
END
)

echo "This is the pattern:\n$pattern"

# ---- For BRE (basic regular expressions) like in sed(1)
# escape everything except:
#   * ( and ) because '\(' and '\)' capture text in sed(1)
#   * { and } because '\{' and '\}' are used to specify minimum and/or max
#   * 'n' because sed(1) reads '\n' as <newline>
#   * alphabetic characters
#   * whitespace
#   * digits

pattern_esc=$(echo "$pattern" | sed -e 's!\([^(){}n]\)!\\\1!g')
pattern_esc=$(echo "$pattern" | sed -e 's!\([^(){}[:alpha:]]\)!\\\1!g')
pattern_esc=$(echo "$pattern" | sed -e 's!\([^(){}[:alpha:][:blank:]]\)!\\\1!g')
pattern_esc=$(echo "$pattern" | sed -e 's!\([^(){}[:alpha:][:blank:][:digit:]]\)!\\\1!g')

#       
#       s               : start search pattern
#       !               : our custom delimiter
#       
#       \(              : start capture in container '\1'
#       
#       [               : start of character class
#       ^               : negate characters in this class
#       ()              : '(' and ')'
#       {}              : '{' and '}'
#       [:alpha:]       : alphabetic character class
#       [:blank:]       : whitespace character class
#       [:digit:]       : numeric character class
#       
#       \)              : end of capture in container '\1'
#       
#       !               : end of search pattern, start of replacement
#       
#       \\              : a literal '\' escaped with itself
#       \1              : contents of container '\1'
#       
#       !               : end of replacement
#       g               : do a 'g'lobal search and replace, not only first match
#       

echo "\n===========The escaped pattern====================="
echo "$pattern_esc"

echo "\nDoing a grep on 'testfile'"
grep -n "$pattern_esc" testfile

echo "\nUsing sed(1) to replace the pattern with 'GORILLA'"
sed -e "s/${pattern_esc}/GORILLA/" testfile
$ ./escape-regex    
This is the pattern:
<?php @include("http://".$_SERVER['HTTP_HOST']."/linkingblogv.php"); ?>

===========The escaped pattern=====================
\<\?php \@include(\"http\:\/\/\"\.\$\_SERVER\[\'HTTP\_HOST\'\]\.\"\/linkingblogv\.php\")\; \?\>

Doing a grep on 'testfile'
2:<?php @include("http://".$_SERVER['HTTP_HOST']."/linkingblogv.php"); ?>

Using sed(1) to replace the pattern with 'GORILLA'
the quick brown fox jumps over the lazy dog
GORILLA
My first attempt was a brute-force approach that simply escaped everything. That succeeded for grep but failed for sed, probably because an "n" became "\n", the newline symbol.

Then I refined the pattern bit by bit, as you can see from the successive definitions of "pattern_esc".

Another approach would be to escape only the characters that are special in regular expressions. But that is left as an exercise for the reader.
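One possible take on that exercise, as a minimal sketch rather than the contents of the attached script: escape only the characters POSIX lists as special in a BRE (. [ \ * ^ $) plus the "/" used as the s command delimiter. The function name escape_bre is invented here, and the sketch assumes a single-line pattern.

```shell
#!/bin/sh
# escape_bre is a hypothetical helper name.
# It puts a backslash in front of the BRE special characters . [ \ * ^ $
# and in front of '/', because '/' is the delimiter of the s command below.
# A ']' outside a bracket expression is an ordinary character, so it is
# left alone. '|' is used as delimiter here so '/' can sit in the bracket.
escape_bre() {
    printf '%s\n' "$1" | sed -e 's|[[\.*^$/]|\\&|g'
}

pattern=$(cat <<'END'
<?php @include("http://".$_SERVER['HTTP_HOST']."/linkingblogv.php"); ?>
END
)

pattern_esc=$(escape_bre "$pattern")

# Feed two sample lines through sed, as in the transcript above.
result=$(printf '%s\n' 'the quick brown fox' "$pattern" |
    sed -e "s/${pattern_esc}/GORILLA/")

printf '%s\n' "$result"
```

Letters, digits, parentheses and braces are never touched, so the "n" versus "\n" problem does not come up. The replacement text would need its own escaping of "\", "&" and "/" if it contained any of them.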
Attached Files
File Type: sh escape-regex.sh (1.7 KB, 16 views)
  #6   (View Single Post)  
Old 3rd December 2008
bigb89 bigb89 is offline
Fdisk Soldier
 
Join Date: May 2008
Posts: 69
Thanked 1 Time in 1 Post
Default

Thanks for all the replies guys.

This sure did help a lot.
  #7   (View Single Post)  
Old 6th December 2008
drl's Avatar
drl drl is offline
Port Guard
 
Join Date: May 2008
Posts: 18
Thanked 3 Times in 3 Posts
Default

Hi.

Interesting question. There is an fgrep, so we might wonder why there is no fsed. Rather than escape all the special characters, if we could use features that did not involve regular expressions, we'd be set.

As it turns out, the Perl function index does a plain string search, and substr allows replacement.

I put together a perl script -- sfs -- to do just that. Here is the brief help:
Code:
$ ./sfs -h

sfs - Substitute fixed strings.  This is a simple string
replacement utility. No regular expressions are involved, so no
escaping is required, except to shield strings from the shell if
the command-line options are used.

usage: sfs [options] -- [files]

options:
--all (or -a)
  Process entire line, otherwise only left-most bad string gets
  replaced.

--bad="string-to-be-replaced"
  Define the string that is not desired.

--good="replacement-string"
  Define the string that will replace the bad string.

  The -b and -g pairs may be repeated as necessary.

--file=pathname
  Specify a file containing pairs of lines, a bad instance
  followed by a good instance.

  The -b,-g pairs and the content of pathname are collected
  together and each data line is subjected to the search and
  replace operation for each pair.  The -b,-g feature is designed
  to allow short strings to be presented on the command line,
  whereas longer strings and many pairs may be placed in the
  pathname file.

--help (or -h)
  print this message and quit.

--version
  print the version and quit.
A sample usage script:
Code:
#!/usr/bin/env sh

# @(#) s1       Demonstrate sfs - substitute fixed strings.

sfs=sfs
sfs=./sfs

echo
echo "(Versions displayed with local utility \"version\")"
version >/dev/null 2>&1 && version "=o" $(_eat $0 $1) $sfs
set -o nounset

# Stage data files.

cat >data1 <<'EOF'
Non-alpha ?*[stuff] can be easily changed.
the quick brown fox jumps over the lazy dog
<?php @include("http://".$_SERVER['HTTP_HOST']."/linkingblogv.php"); ?>
EOF

cat >bad-good <<'EOF'
?*[stuff]
( cool again, "a[3] = 7 / 2" )
<?php @include("http://".$_SERVER['HTTP_HOST']."/linkingblogv.php"); ?>
GORILLA
EOF

echo
echo " Data file:"
cat data1

echo
echo " Results from command line:"
$sfs -b='*' -g=" star " data1

echo
echo " Results from strings file:"
$sfs -f=bad-good data1

exit 0
Producing:
Code:
$ ./s1

(Versions displayed with local utility "version")
FreeBSD 4.11-STABLE
sh - ( /bin/sh Mar 7 2007 )
sfs (local) 1.3

 Data file:
Non-alpha ?*[stuff] can be easily changed.
the quick brown fox jumps over the lazy dog
<?php @include("http://".$_SERVER['HTTP_HOST']."/linkingblogv.php"); ?>

 Results from command line:
Non-alpha ? star [stuff] can be easily changed.
the quick brown fox jumps over the lazy dog
<?php @include("http://".$_SERVER['HTTP_HOST']."/linkingblogv.php"); ?>
 ( Lines read: 3 )

 Results from strings file:
Non-alpha ( cool again, "a[3] = 7 / 2" ) can be easily changed.
the quick brown fox jumps over the lazy dog
GORILLA
 ( Lines read: 3 )
The code is almost 200 lines long (about 10% debug-instrumentation), so I'm not posting it here.

If anyone is interested, I can post the code as an attachment.
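For readers who only want the core idea, here is a small sketch of the index/substr technique. It is not the sfs script: it swaps in awk, whose index() and substr() functions play the same role as the Perl ones named above, so it needs nothing beyond the usual POSIX tools. The function name replace_fixed and the variables BAD and GOOD are invented for this illustration.

```shell
#!/bin/sh
# replace_fixed is a hypothetical helper: it reads stdin and replaces every
# occurrence of the fixed string $1 with $2. No regular expressions are
# involved, so nothing has to be escaped.
# The strings travel through the environment (BAD and GOOD are made-up
# names) because awk -v would interpret backslash sequences in them.
replace_fixed() {
    BAD="$1" GOOD="$2" awk '
        BEGIN { bad = ENVIRON["BAD"]; good = ENVIRON["GOOD"] }
        {
            rest = $0; out = ""
            # index() gives the 1-based position of a plain substring, 0 if absent
            while (bad != "" && (i = index(rest, bad)) > 0) {
                out  = out substr(rest, 1, i - 1) good
                rest = substr(rest, i + length(bad))
            }
            print out rest
        }'
}

bad=$(cat <<'END'
<?php @include("http://".$_SERVER['HTTP_HOST']."/linkingblogv.php"); ?>
END
)

result=$(printf '%s\n' 'Non-alpha ?*[stuff] can be easily changed.' "$bad" |
    replace_fixed "$bad" 'GORILLA')

printf '%s\n' "$result"
```

Unlike sfs, this sketch always replaces every occurrence on a line and handles one bad/good pair per call.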

I don't visit every day, and the email notifications don't seem to work for me, so I'll respond when I can ... cheers, drl
  #8   (View Single Post)  
Old 6th December 2008
ephemera's Avatar
ephemera ephemera is offline
Knuth's homeboy
 
Join Date: Apr 2008
Posts: 537
Thanked 49 Times in 43 Posts
Default

Quote:
Originally Posted by drl View Post
If anyone is interested, I can post the code as an attachment.
I am interested.
  #9   (View Single Post)  
Old 6th December 2008
drl's Avatar
drl drl is offline
Port Guard
 
Join Date: May 2008
Posts: 18
Thanked 3 Times in 3 Posts
Default

Hi.

See attached. No warranties, but I am interested in problems found ... cheers, drl
Attached Files
File Type: pl sfs.pl (4.4 KB, 24 views)
Copyright ©2000 - 2014, Jelsoft Enterprises Ltd.
Content copyright © 2007-2010, the authors
Daemon image copyright ©1988, Marshall Kirk McKusick