Thread: cat and UTF-8
15th October 2008
IIMarckus

EDIT: A simple "man -k unicode" brought up uxterm. Sorry for the useless thread, guys.

I use UTF-8 text files a lot, generally with a byte-order mark. When I use cat, tail, more, or less to display these files, I get a lot of mojibake. For instance, here's what happens with a file that consists of a byte-order mark and two characters (a©):
Code:
$ file utf8.txt
utf8.txt: UTF-8 Unicode text, with no line terminators
$ hexdump utf8.txt
00000000 bbef 61bf a9c2
00000006
$ cat utf8.txt
ḯ»¿a©
$ more utf8.txt
ḯ»¿a©
$ tail utf8.txt
ḯ»¿a©$ 
Is this a problem with the programs themselves, or is it the shell, or Eterm? Is there a way (by changing settings or by installing programs) that I can make this display correctly?
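
For anyone trying to reproduce this, here is a minimal sketch that recreates the test file and dumps it one byte at a time. It assumes a POSIX shell with printf, od, tr, and sed available; the variable name "bytes" is only illustrative. The hexdump output above groups bytes into 16-bit words, which on a little-endian machine shows each pair swapped ("bbef" for the bytes ef bb), so a per-byte dump is easier to read:

```shell
# Recreate the six-byte test file: UTF-8 BOM (EF BB BF), "a" (61), "©" (C2 A9).
# Octal escapes are used because POSIX printf does not guarantee \x escapes.
printf '\357\273\277a\302\251' > utf8.txt

# Dump one byte per column so no 16-bit word swapping is applied,
# then normalise the whitespace that od inserts between columns.
bytes=$(od -An -tx1 utf8.txt | tr -s ' \n' ' ' | sed 's/^ //; s/ $//')
echo "$bytes"   # prints: ef bb bf 61 c2 a9
```

If the bytes come out as above, the file itself is valid UTF-8, and cat, more, and tail are simply passing those bytes through unchanged. That would point at whatever interprets the bytes for display (the terminal emulator and its locale settings) rather than at the programs, which is consistent with uxterm resolving it.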

Last edited by IIMarckus; 15th October 2008 at 03:51 AM.