12th December 2008
J65nko
Regular expressions: renaming files with 'sed'

I had a bunch of directories with files like this:
Code:
$ ls NOW/latest-pkg*
NOW/latest-pkg            NOW/latest-pkg-erlangen   NOW/latest-pkg-plig
NOW/latest-pkg-calyx      NOW/latest-pkg-esat       NOW/latest-pkg-stacken
The '-' between 'latest' and 'pkg' had to be replaced with an underscore '_'.
The plan was to save the file names in a temporary file and use sed to create sh 'mv' commands to rename the files.

First save the file names in a file called 'tmp'.
Code:
$ ls NOW/latest-pkg* >tmp
$ cat tmp

NOW/latest-pkg
NOW/latest-pkg-calyx
NOW/latest-pkg-erlangen
NOW/latest-pkg-esat
NOW/latest-pkg-plig
NOW/latest-pkg-stacken
Now enter a sed(1) one-liner to munge each file name into a 'mv' command like this:
Code:
mv NOW/latest-pkg NOW/latest_pkg
Plan of attack:
  • Save the text up to the first '-', and save the text that follows it.
  • Prepend 'mv' followed by a blank.
  • Reconstruct the original file name.
  • Insert a blank.
  • Retrieve the text leading up to the first '-'.
  • Insert the desired underscore character '_'.
  • Retrieve the text after the original '-'.

Code:
$ sed -e 's/^\([^-]*\)-\(pkg.*\)/mv \1-\2 \1_\2/' tmp
mv NOW/latest-pkg NOW/latest_pkg
mv NOW/latest-pkg-calyx NOW/latest_pkg-calyx
mv NOW/latest-pkg-erlangen NOW/latest_pkg-erlangen
mv NOW/latest-pkg-esat NOW/latest_pkg-esat
mv NOW/latest-pkg-plig NOW/latest_pkg-plig
mv NOW/latest-pkg-stacken NOW/latest_pkg-stacken
That looks good. (I have to admit this was my second try. I had to break out of the first attempt with Ctrl-C.)
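One way to avoid surprises like that is to try the expression on a single name before running it on the whole list. A minimal sketch, assuming a sample name fed in with printf instead of the 'tmp' file; the variable names 'name' and 'cmd' are only illustrative:

```shell
# Try the substitution on one sample name before running it on the whole list.
name='NOW/latest-pkg-calyx'

# printf supplies the name on stdin, so no temporary file is needed here.
cmd=$(printf '%s\n' "$name" |
    sed -e 's/^\([^-]*\)-\(pkg.*\)/mv \1-\2 \1_\2/')

# Show the generated command; nothing is executed.
printf '%s\n' "$cmd"
```

If the printed line is not the 'mv' command you expect, fix the expression before piping anything to sh.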

We now have two options to execute this:
  1. Save the sed output to a file and feed it to the shell for execution:
    Code:
    $ sed -e 's/^\([^-]*\)-\(pkg.*\)/mv \1-\2 \1_\2/' tmp >tmp.sh
    $ sh tmp.sh
  2. Feed the sed output to the shell directly through a pipeline:

    Code:
    $ sed -e 's/^\([^-]*\)-\(pkg.*\)/mv \1-\2 \1_\2/' tmp | sh

I went for the last option. The result:
Code:
 $ ls -l NOW/latest_*
-rw-r--r--  1 j65nko  j65nko  253269 Dec 12 02:41 NOW/latest_pkg
-rw-r--r--  1 j65nko  j65nko  293597 Dec 12 02:41 NOW/latest_pkg-calyx
-rw-r--r--  1 j65nko  j65nko  253269 Dec 12 02:41 NOW/latest_pkg-erlangen
-rw-r--r--  1 j65nko  j65nko  253269 Dec 12 02:42 NOW/latest_pkg-esat
-rw-r--r--  1 j65nko  j65nko  252528 Dec 12 02:41 NOW/latest_pkg-plig
-rw-r--r--  1 j65nko  j65nko  253269 Dec 12 02:41 NOW/latest_pkg-stacken
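To rehearse the whole procedure without touching real files, the same steps can be run in a throwaway directory. This is only a sketch: the scratch directory made with mktemp and the three empty sample files are assumptions for illustration, not part of the original setup.

```shell
# Rehearse the rename in a throwaway directory (hypothetical scratch setup).
scratch=$(mktemp -d) || exit 1
cd "$scratch" || exit 1
mkdir NOW
touch NOW/latest-pkg NOW/latest-pkg-calyx NOW/latest-pkg-esat

# Same steps as in the post: list the names, build the mv commands, run them.
ls NOW/latest-pkg* >tmp
sed -e 's/^\([^-]*\)-\(pkg.*\)/mv \1-\2 \1_\2/' tmp | sh

# List the directory to confirm that only the first '-' became '_'.
result=$(ls NOW)
printf '%s\n' "$result"
```

Saved as a script and run with sh, this leaves only the renamed files in the scratch NOW directory.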
An explanation of the sed search and replace command:
Code:
s/^\([^-]*\)-\(pkg.*\)/mv \1-\2 \1_\2/
The search pattern:
Code:
s/^\([^-]*\)-\(pkg.*\)/

s		: substitute (search and replace)
/		: delimiter to mark start of search pattern
^		: beginning of line

\(		: start saving text for replay in container \1
[^-]		: any character, as long as it is not a '-'
*		: zero or more of the preceding
\)		: stop saving in container \1 
		  we now have saved or stored the text 'NOW/latest'

-		: a '-', which we didn't save because we need to replace
		  it with an underscore '_'

\(		: start saving text for replay in container \2
pkg		: a 'p', followed by a 'k', followed by a 'g'
.		: followed by whatever character
*		: zero or more instances of the preceding atom '.'
\)		: stop saving in container \2
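To see what each container actually holds, the same search pattern can be paired with a replacement that just prints the two saved pieces. A small sketch; the '|' separator and the variable name 'split' are arbitrary choices for illustration:

```shell
# Print what each container holds for one sample name.
# The '|' in the replacement is a literal character, used only to make
# the boundary between \1 and \2 visible.
split=$(printf '%s\n' 'NOW/latest-pkg-stacken' |
    sed -e 's/^\([^-]*\)-\(pkg.*\)/\1|\2/')

printf '%s\n' "$split"
```

Everything left of the '|' came from container \1, everything right of it from container \2. The first '-' is gone because it sits outside both containers.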
The replacement:
Code:
/mv \1-\2 \1_\2/

/		: end of search pattern, start of replacement
mv 		: a 'm' and a 'v' followed by a space

		: now we reconstruct our original file name:

\1		: replay the text 'NOW/latest' from container \1
-		: a '-'
\2		: the text from container \2

		: a space to separate the original name from the new one

\1		: fetch the text 'NOW/latest' from container \1
_		: the underscore we want
\2		: the remainder of the file name from container \2
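One caveat when piping to sh: a line that does not match the search pattern is printed by sed unchanged, so the shell would try to run the bare file name as a command. A hedged sketch of one way to guard against that, using sed's -n option together with the 'p' flag of the s command; the extra name 'NOW/README' is a made-up example:

```shell
# 'NOW/README' has no '-pkg' part, so the search pattern does not match it.
names=$(printf '%s\n' 'NOW/latest-pkg' 'NOW/README')

# -n suppresses sed's default output; the trailing 'p' prints only the
# lines on which a substitution was actually made.
cmds=$(printf '%s\n' "$names" |
    sed -n -e 's/^\([^-]*\)-\(pkg.*\)/mv \1-\2 \1_\2/p')

printf '%s\n' "$cmds"
```

Only the generated 'mv' line survives, so unrelated names never reach the shell.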