I know, you want me to study and work it out by myself.
Actually, I've finally found a good tutorial page on this; I simply hadn't searched Google with the right keywords before. Here's the link:
http://interglacial.com/~sburke/tpj/as_html/tpj14.html
In my case, I have two extra letters to sort: š and ū.
I've made this test file:
Code:
abc
aab
bbc
mmn
lmn
aaa
ššš
sss
zzz
ccc
ggg
uuu
šas
saš
cab
uuū
ūuu
ūūū
Here's the code:
Code:
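use strict;
use warnings;
use utf8;                              # the script itself contains š and ū
binmode(STDOUT, ':encoding(UTF-8)');   # print the sorted words as UTF-8

# Assumes the test file is saved as UTF-8.
open(my $fh, '<:encoding(UTF-8)', 'absolute-path-to-file')
    or die "Failed to read file: $!";
my @not_sorted = <$fh>;
close($fh);

# Build a sort key: fold ū onto u, then map each letter to its alphabet position.
sub normalize {
    my ($in) = @_;
    $in = lc($in);
    $in =~ tr<aeiouū>
             <aeiouu>;                 # ū sorts as a variant of u
    $in =~ tr<abcdefghijklmnopqrsštuvwxyz>
             <\x01-\x1B>;              # 27 letters -> codes 1..27, š right after s
    return $in;
}

# Compare by key first; fall back to the raw string to break ties (u before ū).
my @sorted = sort { normalize($a) cmp normalize($b) or $a cmp $b } @not_sorted;
print @sorted;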
I still don't completely understand why ū sorts in the proper order without being treated as an extra letter, the way š has to be, but I will eventually. Anyway, it gives the expected result:
Code:
aaa
aab
abc
bbc
cab
ccc
ggg
lmn
mmn
saš
sss
šas
ššš
uuu
uuū
ūuu
ūūū
zzz
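One way to see why ū works without a slot of its own is to print the keys the sort actually compares. This is only a sketch: normalize is restated in a shortened form so the snippet runs on its own, show_key is a helper name I made up, and it assumes the script is saved as UTF-8.

```perl
use strict;
use warnings;
use utf8;                                # literal š and ū in the source
binmode(STDOUT, ':encoding(UTF-8)');

# Shortened restatement of normalize() from the script above.
sub normalize {
    my ($in) = @_;
    $in = lc $in;
    $in =~ tr/ū/u/;                                     # ū shares u's slot
    $in =~ tr/abcdefghijklmnopqrsštuvwxyz/\x01-\x1B/;   # š gets its own slot after s
    return $in;
}

# Hypothetical helper: render a key as readable slot numbers.
sub show_key {
    my ($word) = @_;
    return join ' ', map { ord($_) } split //, normalize($word);
}

printf "%-4s => %s\n", $_, show_key($_) for qw(sss šas uuu uuū ūuu);
# sss  => 19 19 19
# šas  => 20 1 19
# uuu  => 22 22 22
# uuū  => 22 22 22
# ūuu  => 22 22 22
```

The three u-words get identical keys, so the first cmp returns 0 and the `or $a cmp $b` fallback decides. Plain u has a lower code point than ū, which is why uuu comes before uuū and ūuu. š, on the other hand, has its own number (20), so it always lands between s and t.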
Hope it will be helpful to someone.