|
Programming C, bash, Python, Perl, PHP, Java, you name it. |
There are several items to check.
First, ISO8859-15 should be your locale for accented characters (and the Euro sign).
Second, UTF-8, although somewhat compatible, will still have to be told that you use a fr_FR* locale.
Third, your system locale can be superseded by your applications' settings: for instance, I can use a system en_US.ISO8859-1 locale, but have my xterm set to UTF-8 (uxterm), or emacs use whatever locale, or gnome set to use UTF-8 by default. I can be on en_US in my terminal, but see Nautilus save my files as UTF-8.
Fourth: Perl is going UTF-8 by default. Currently there is a utf8 pragma (man utf8) for enabling/disabling UTF-8 in the script source.
Fifth: when you launch a UTF-8 capable application, that application might call a subshell which will by default run in your system's locale, which can be, as in my case, en_US. This subshell could eventually bork out with an error message such as "malformed UTF-8 character string". Most of the errors will be spat out by badly written applications that GNUishly presume your only possible setup is UTF-8 (if not UNICODE, which is yet another non-standard standard).
My 2 cents: check both your applications and your DM (gnome, KDE) for proper encoding and a common locale output. Check the encoding choice when saving files. Portable mails should always use 8859-1 text format. You can read 8859-1 mails or webpages correctly, including diacritics, from your most visited links by setting UTF-8 in your local browser settings. Local means that people using other locales will not read those pages the same way you do.
Fwiw, when switching from one code page to another, include a setenv or an export of the environment, or just change LC_* in your routine. Remember that when you re-read the output, the output screen should also be set to the proper environment/locale. For instance, re-read your UTF-8 formatted mail with Thunderbird also set to use UTF-8.
Note that UTF-8 will not be read correctly by applications set to 8859-15: "sécurité" shows up as "sÃ©curitÃ©", and that is what you get.
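To see the mismatch described above without touching any locale settings, here is a minimal Python sketch. The helper name misread and the sample strings are only illustrative assumptions; the codecs are the standard ones shipped with Python.

```python
# Minimal sketch: what happens when bytes written in one charset
# are read back in another. "misread" is a hypothetical helper name.

def misread(text, written_as="utf-8", read_as="iso8859-15"):
    """Encode text in one charset, then decode the same bytes in another."""
    return text.encode(written_as).decode(read_as, errors="replace")

if __name__ == "__main__":
    # UTF-8 bytes shown by an application set to ISO8859-15:
    # each accented letter becomes two unrelated characters.
    print(misread("sécurité"))

    # The opposite direction: ISO8859-15 bytes are not valid UTF-8,
    # so a tolerant reader substitutes replacement characters.
    print(misread("sécurité", written_as="iso8859-15", read_as="utf-8"))

    # The Euro sign exists in ISO8859-15 (single byte 0xA4) but not in ISO8859-1.
    print("€".encode("iso8859-15"))
    try:
        "€".encode("iso8859-1")
    except UnicodeEncodeError as exc:
        print("no Euro sign in ISO8859-1:", exc.reason)
```

The same check works for any pair of charsets, which can help when deciding whether it is the writer or the reader that is on the wrong locale.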
__________________
da more I know I know I know nuttin' |
|
||||
Thanks for the reply lvlamb. It is much clearer now.
What confuses me is that I sent a message from me to me via Thunderbird. A look at the full headers shows me that the encoding is ISO8859-1 (my accents were displayed properly):
Code:
text/plain; charset=ISO-8859-1; format=flowed
Code:
multipart/mixed; boundary="_----------=_1210097604302960"; charset="ISO-8859-15"
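For comparing headers like the two above, a small Python sketch using the standard email package can pull out the declared charset. The function name charset_of is a hypothetical helper, and the header values are simply the ones quoted in this post.

```python
from email.message import Message

def charset_of(content_type):
    """Return the charset declared in a Content-Type value (lowercased), or None."""
    msg = Message()
    msg["Content-Type"] = content_type
    return msg.get_content_charset()

if __name__ == "__main__":
    headers = [
        'text/plain; charset=ISO-8859-1; format=flowed',
        'multipart/mixed; boundary="_----------=_1210097604302960"; charset="ISO-8859-15"',
    ]
    for value in headers:
        print(charset_of(value), "<-", value)
```

This only reports what the header claims; whether the body bytes really match that charset still has to be checked separately.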
__________________
"Any intelligent fool can make things bigger, more complex, and more violent. It takes a touch of genius -- and a lot of courage -- to move in the opposite direction." |
|
||||
Assumption is the mother of all f*ck-ups. So I'll be cautious.
I did not check every version or revision of the installed Perl on every OS, but I am pretty sure that every maintainer has faced encoding problems and made attempts to correct them. Basically, check the resulting code for its encodings and don't google for answers.
In theory, Perl works in UTF-8 unless told not to (or unless defaulted otherwise at compile time). Perl will default to UTF-8 in a nearby revision, unless too many users complain. So, I assume Perl is UTF-8 by default.
Note that the defaults are UTF-8 on WinXP (NTFS) for one, on Linux since April last year, and on OpenSolaris, late to the party. I assume *BSD users are smart enough to modify their applications to display UTF-8 encodings when needed.
I would use UTF-8 defaults and only translate to 8859-1 for mails sent to mailing lists, as you will always be called names when not writing pure 7-bit ASCII and in English.
As part of the base xorg, you now have luit (man luit) to play with filters. Fwiw, here is a Sun doc that gives some tips on locale/UTF-8 conversions: http://docs.sun.com/app/docs/doc/819...3aglffe?a=view
This does not answer your question. IMVHO, using UTF-8 throughout will be read correctly by most applications (hence users). It is up to the application to correctly translate the encodings, i.e. use MIME flags. Most files don't have MIME flags. There, IMHO, is the error: flagging the encoding would be much simpler to implement than making the whole OS UTF-8 compliant, which is trading bad for worse. UTF-8 is not a universal encoding either.
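As a rough illustration of "use MIME flags" and "only translate to 8859-1 for mails", here is a Python sketch that declares the charset explicitly when building a mail. The function name build_mail and the fallback to UTF-8 are my own assumptions for the example, not something any mailer is known to do.

```python
from email.message import EmailMessage

def build_mail(body, preferred="iso-8859-1"):
    """Build a text mail whose Content-Type states the charset actually used.

    Assumption for this sketch: use the preferred charset when the body
    fits in it, otherwise fall back to UTF-8 instead of losing characters.
    """
    try:
        body.encode(preferred)
        charset = preferred
    except UnicodeEncodeError:
        charset = "utf-8"
    msg = EmailMessage()
    msg.set_content(body, charset=charset)
    return msg

if __name__ == "__main__":
    for text in ("sécurité", "prix: 10 €"):
        mail = build_mail(text)
        # The reader can now decode the body from the declared charset.
        print(mail["Content-Type"], "->", mail.get_content().strip())
```

The point is only that the charset travels with the message, so the reading side does not have to guess from its own locale.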
|
|||
Yes, we are.
|
|
||||
I wish it would also be true for every other OS...
Tags |
perl, utf-8 |