DaemonForums  

#1
18th July 2010
backrow (Real Name: Anthony J. Bentley)
Can I make this shell script faster?

Hi guys,

I use nmh for my mail; plain text mails are displayed with less(1). Unfortunately, since my xterm uses UTF-8 by default, any ISO8859 mails will tend to have �’s everywhere.

I thought I’d fix this by piping output through a simple script first:
Code:
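#!/bin/sh

# Check every line separately: one file(1) and one grep(1) per input line,
# plus one iconv(1) for each line that looks like ISO-8859.
while IFS= read -r input
do
        if printf '%s\n' "$input" | file - | grep -q 8859; then
                printf '%s\n' "$input" | iconv -f ISO-8859-1 -t UTF-8
        else
                printf '%s\n' "$input"
        fi
done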
This works, but makes reading my mail noticeably slower:
Code:
$ time u8conv.sh  < /etc/hosts >/dev/null
    0m0.22s real     0m0.11s user     0m0.11s system
$ time cat /etc/hosts >/dev/null         
    0m0.00s real     0m0.00s user     0m0.00s system
Is there an obvious way to make the script faster? Perhaps there’s a better way to solve my problem that I’ve missed.
__________________
Many thanks to the forum regulars who put time and effort into helping others solve their problems.

#2
18th July 2010
IdOp

I'm not familiar with the conversion issues, and this is just a quick idea off the top of my head, but your script is probably slow because it reads and processes the input one line at a time, starting several processes for every line. Can you work with the input as a whole file instead?

E.g., copy input to a temp file, apply the file command to that to determine what type it is, then cat the tempfile through the conversion filter or not, as you've done with lines. Then delete the tempfile.

There's probably a neater version of this, which might be worth looking for if this method works at all and gives a good speedup. Hope that helps.
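
The temp-file idea above might look roughly like this for a script that reads from a pipe. It is only a sketch: the function name to_utf8 is made up for illustration, and it assumes mktemp(1) is available (it is widespread but not required by POSIX). The input is buffered once, classified once with file(1), and then either converted or passed through unchanged.

```shell
#!/bin/sh

# to_utf8 (hypothetical name): read stdin, write UTF-8 to stdout.
to_utf8() {
        # Assumption: mktemp(1) exists on this system.
        tmp=$(mktemp) || return 1

        # Buffer the whole input so it can be inspected before it is printed.
        cat > "$tmp"

        # Classify once. Reading via stdin keeps the temp file's name out of
        # file(1)'s output, so the name itself can never match "8859".
        if file - < "$tmp" | grep -q 8859; then
                iconv -f ISO-8859-1 -t UTF-8 < "$tmp"
        else
                cat "$tmp"
        fi

        rm -f "$tmp"
}

# Example use: some_command | to_utf8 | less
```

This starts a handful of processes per message instead of several per line, which is where the per-line loop spends its time.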

#3
18th July 2010
backrow

Tried this, it was much faster, thanks. The mail is already saved to a temporary file, so this is easy.
Code:
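#!/bin/sh

# Classify the whole file once, then convert it or pass it through unchanged.
# Reading via stdin keeps the file name out of file(1)'s output.
if file - < "$1" | grep -q 8859; then
        iconv -f ISO-8859-1 -t UTF-8 < "$1"
else
        cat "$1"
fi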

#4
18th July 2010
TerryP

Since your case is already solved, I'll just leave a note for the next person to find it.

If you're processing a stream line by line with a while read loop in an sh script, you're probably doing it wrong.
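
To illustrate the note above with the conversion from this thread, here is a rough sketch that contrasts the two styles. The function names are invented for the example, and both functions convert unconditionally from ISO-8859-1, which is a simplification of the original script.

```shell
#!/bin/sh

# Per-line style: starts one iconv process for every line of input.
convert_per_line() {
        while IFS= read -r line; do
                printf '%s\n' "$line" | iconv -f ISO-8859-1 -t UTF-8
        done
}

# Whole-stream style: starts a single iconv process for the entire input.
convert_whole_stream() {
        iconv -f ISO-8859-1 -t UTF-8
}

# Example use: convert_whole_stream < message.txt | less
```

For newline-terminated input both produce the same bytes, but the loop pays for a fork and exec on every line, while the filter pays once. The loop also silently drops a final line that lacks a trailing newline, which is one more reason to let the tool read the stream itself.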
__________________
Thou shalt check the array bounds of all strings (indeed, all arrays), for surely where thou typest ``foo'' someone someday shall type ``supercalifragilisticexpialidocious''.
Content copyright © 2007-2010, the authors