DaemonForums  

Posted 17th December 2012 by J65nko
Deleting whitespace from otherwise blank lines

Recently I had to convert several text documents to XML.
To make sure that no seemingly empty line still contained spaces and/or tabs, I wrote the following small Perl script, called 'xlblanks'.

Code:
#!/usr/bin/perl

use warnings;
use strict;
use diagnostics;

# --- delete spaces and tabs from otherwise empty lines

my $total = 0;
my $line_nr;
my @nrs;

while (<>) {
    ++$line_nr;
    if (
        s/
        ^         # at begin of line
        [\t\ ]+   # one or more tabs or blanks
        $         # followed by END OF LINE
        //x       # by nothing
        ) {
        ++$total;
        push @nrs, $line_nr;
    }
    print;
}

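# report on STDERR, so that STDOUT carries only the cleaned text
print STDERR "\n$0: Number of lines found with only tabs or blanks: $total\n";
$, = '-' ;    # output field separator: printed between the list items below
print STDERR "$0: The line numbers: ", @nrs , "\n\n";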
In a small sample file, the blanks and tabs on the otherwise empty lines are not visible:
Code:
FreeBSD
 
DragonFlyBSD
 	   
NetBSD  
	
OpenBSD
Running the script:
Code:
$ ./xlblanks blanklines.txt

FreeBSD

DragonFlyBSD

NetBSD  

OpenBSD


./xlblanks: Number of lines found with only tabs or blanks: 3
./xlblanks: The line numbers: -3-5-7-
Displaying the file with 'cat -net' (number the lines, mark each end of line with '$', show tabs as '^I') confirms these results:
Code:
$ cat -net blanklines.txt                                                         
     1  $
     2  FreeBSD$
     3   $
     4  DragonFlyBSD$
     5   ^I   $
     6  NetBSD  $
     7  ^I$
     8  OpenBSD$
     9  $
    10  $
The two lines reporting the results are sent to STDERR, which makes it possible to create a 'clean' version by redirecting standard output to a file:

Code:
$ ./xlblanks blanklines.txt >clean.txt 

./xlblanks: Number of lines found with only tabs or blanks: 3
./xlblanks: The line numbers: -3-5-7-

$ cat -net clean.txt
     1  $
     2  FreeBSD$
     3  $
     4  DragonFlyBSD$
     5  $
     6  NetBSD  $
     7  $
     8  OpenBSD$
     9  $
    10  $
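One caveat with this redirect: pointing '>' at the input file itself does not work, because the shell truncates that file before the script gets a chance to read it. A minimal sketch of a workaround, assuming a POSIX shell and 'mktemp'. The function name 'filter_in_place' is made up for this example and is not part of the original script:

```shell
# filter_in_place CMD FILE...
# Hypothetical helper: run CMD on each FILE and overwrite the file with
# CMD's standard output, but only if CMD exited successfully.
filter_in_place() {
    fip_cmd=$1
    shift
    fip_status=0
    for fip_file in "$@"; do
        fip_tmp=$(mktemp) || return 1
        if "$fip_cmd" "$fip_file" > "$fip_tmp"; then
            # copy the contents back; this keeps the owner and
            # permissions of the original file
            cat "$fip_tmp" > "$fip_file"
        else
            fip_status=1        # leave the original file untouched
        fi
        rm -f "$fip_tmp"
    done
    return "$fip_status"
}
```

It could be called as 'filter_in_place ./xlblanks chapter1.txt chapter2.txt' (the file names are just placeholders). The report on STDERR still shows up on the terminal.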
The report sent to 'stderr' (file descriptor 2) can be redirected to a file as well:
Code:
$ ./xlblanks blanklines.txt >clean.txt 2> culprits.txt  
$ cat culprits.txt

./xlblanks: Number of lines found with only tabs or blanks: 3
./xlblanks: The line numbers: -3-5-7-
In case you wonder why the line numbers needed to be reported:
The original master files are maintained in MS Word format, so knowing the line numbers made it easy to eliminate those irritating, useless blanks and tabs at the source.
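If only the line numbers matter, for instance to fix the Word masters by hand, the file does not have to be rewritten at all. A small sketch with 'grep' and 'cut'. The function name 'blank_line_numbers' is hypothetical:

```shell
# blank_line_numbers FILE
# Hypothetical helper: print the number of every line in FILE that
# consists solely of blanks and/or tabs. FILE itself is not modified.
blank_line_numbers() {
    # grep -n prefixes each matching line with "NUMBER:";
    # cut keeps only the part before the first colon
    grep -n -E '^[[:blank:]]+$' "$1" | cut -d: -f1
}
```

For the sample file above, 'blank_line_numbers blanklines.txt' should print 3, 5 and 7, one number per line.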

An equivalent 'sed' one-liner, without the line number report:

Code:
$ sed -Ee 's/^[[:blank:]]+$//g' blanklines.txt | cat -net
     1  $
     2  FreeBSD$
     3  $
     4  DragonFlyBSD$
     5  $
     6  NetBSD  $
     7  $
     8  OpenBSD$
     9  $
    10  $
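For those who want the report but not Perl, the same idea can be sketched in 'awk'. This is not the original script, just an approximation under a few assumptions: the function name 'xlblanks_awk' is invented here, and writing to "/dev/stderr" relies on the awk implementation or the operating system providing that name (gawk, mawk and BSD awk do).

```shell
# xlblanks_awk FILE...
# Hypothetical awk counterpart of xlblanks: the cleaned text goes to
# stdout, the report goes to stderr.
xlblanks_awk() {
    awk '
        /^[ \t]+$/ {            # line consists solely of blanks and/or tabs
            $0 = ""             # empty the line
            nrs = nrs sep NR    # remember its line number
            sep = "-"           # separator for every following number
            total++
        }
        { print }               # print every line, cleaned or not
        END {
            printf("Number of lines found with only tabs or blanks: %d\n", total) > "/dev/stderr"
            printf("The line numbers: %s\n", nrs) > "/dev/stderr"
        }
    ' "$@"
}
```

Used as 'xlblanks_awk blanklines.txt >clean.txt 2>culprits.txt', it should report 3 lines and list them as 3-5-7 for the sample file. Unlike the Perl version, it prints no leading or trailing '-' and does not prefix the report with the script name.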