NAME
    mb - Can easy script in Big5, Big5-HKSCS, GBK, Sjis, UHC, UTF-8, ...

SYNOPSIS
    $ perl mb.pm              MBCS_Perl_script.pl (auto detect encoding of script)
    $ perl mb.pm -e big5      MBCS_Perl_script.pl
    $ perl mb.pm -e big5hkscs MBCS_Perl_script.pl
    $ perl mb.pm -e eucjp     MBCS_Perl_script.pl
    $ perl mb.pm -e gb18030   MBCS_Perl_script.pl
    $ perl mb.pm -e gbk       MBCS_Perl_script.pl
    $ perl mb.pm -e sjis      MBCS_Perl_script.pl
    $ perl mb.pm -e uhc       MBCS_Perl_script.pl
    $ perl mb.pm -e utf8      MBCS_Perl_script.pl
    $ perl mb.pm -e wtf8      MBCS_Perl_script.pl

    C:\WINDOWS> perl mb.pm script.pl ??-DOS-like *wildcard* available

    MBCS quotes:
        qq/ DAMEMOJI 功声乗ソ /
        q/ DAMEMOJI 功声乗ソ /
        m/ DAMEMOJI 功声乗ソ /
        s/ DAMEMOJI 功声乗ソ / DAMEMOJI 功声乗ソ /
        split / DAMEMOJI 功声乗ソ /
        tr/ DAMEMOJI 功声乗ソ / DAMEMOJI 功声乗ソ /
        y/ DAMEMOJI 功声乗ソ / DAMEMOJI 功声乗ソ /
        qr/ DAMEMOJI 功声乗ソ /

    MBCS subroutines:
        mb::chop(...);
        mb::chr(...);
        mb::do 'file';
        mb::dosglob(...);
        mb::eval 'string';
        mb::getc(...);
        mb::index(...);
        mb::index_byte(...);
        mb::length(...);
        mb::ord(...);
        mb::require 'file';
        mb::reverse(...);
        mb::rindex(...);
        mb::rindex_byte(...);
        mb::substr(...);
        mb::use Module;
        mb::no Module;

    MBCS special variables:
        $mb::PERL
        $mb::ORIG_PROGRAM_NAME

    supported encodings:
        Big5, Big5-HKSCS, EUC-JP, GB18030, GBK, Sjis, UHC, UTF-8, WTF-8

    supported operating systems:
        Apple Inc. OS X,
        Hewlett-Packard Development Company, L.P. HP-UX,
        International Business Machines Corporation AIX,
        Microsoft Corporation Windows,
        Oracle Corporation Solaris,
        and Other Systems

    supported perl versions:
        perl version 5.005_03 to newest perl

DESCRIPTION
    This software is a source code filter, a transpiler-modulino.

    Perl is said to have been able to handle Unicode since version 5.8.
    However, unlike with JPerl, the "easy jobs easy" principle was lost.
    (But now we have it back :-D)

    Shift_JIS and similar encodings (Big5, Big5-HKSCS, GB18030, GBK, Sjis,
    UHC) contain DAMEMOJI: multibyte characters whose second octet is a
    metacharacter.
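The second-octet collision can be seen with plain byte strings. The following is a minimal sketch, not a description of how mb.pm works internally: it does not load mb.pm, the variable names are made up for illustration, and it assumes only the well-known Shift_JIS encoding of katakana SO (ソ) as the octets 0x83 0x5C. The hand escaping in the last step is an illustrative remedy, shown only to make the problem concrete.

```perl
use strict;

# Katakana SO in Shift_JIS is the two octets 0x83 0x5C; the trail octet
# 0x5C is the same byte as the ASCII backslash.
my $so = "\x83\x5C";

my @octets = unpack('C*', $so);
printf "octets: %s\n", join(' ', map { sprintf('0x%02X', $_) } @octets);

# A byte-oriented regex looking for a backslash reports a hit, although
# the text contains no backslash character.
my $false_hit = ($so =~ /\\/) ? 1 : 0;
print "false backslash match: $false_hit\n";

# Source text for a double-quoted literal holding the raw octets.
# The trail octet escapes the closing quote, so the literal never ends
# and the eval fails to compile.
my $broken_src = '"' . $so . '"';
my $broken     = eval $broken_src;
print "raw literal compiles: ", (defined $broken ? 'yes' : 'no'), "\n";

# Doubling the trail octet (illustrative hand escaping) restores the value.
my $escaped_src = '"' . "\x83\x5C\x5C" . '"';
my $escaped     = eval $escaped_src;
print "escaped literal round-trips: ",
    ((defined $escaped && $escaped eq $so) ? 'yes' : 'no'), "\n";
```

Run as written, the sketch should report a false backslash match and a compile failure for the raw literal, while the hand-escaped literal yields the original two octets.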
    Which characters are DAMEMOJI depends on whether the enclosing
    delimiter is a single quote or a double quote.

    This software escapes the DAMEMOJI in your script, generates a new
    script, and runs it.

Larry Wall san's Style
    If you're using the utf8 pragma and you have a big headache, you are
    probably on the wrong path. Go back, once, to Larry Street, where there
    is a sign that says ver.5.00503. There is another path there. Follow
    that path. Soon, your headache will improve.

    In a script, "length()" always behaves as "bytes::length()", and
    "substr()" always behaves as "bytes::substr()".

    To count the code points of the multibyte characters contained in a
    scalar value, write "mb::length()". To execute "substr()" in code point
    context, write "mb::substr()".

    Larry Wall san once said: "Easy jobs must be easy."

    Welcome to the world of Larry Wall san's Style!!

SEE ALSO
    https://metacpan.org/author/INA
    http://backpan.cpantesters.org/authors/id/I/IN/INA/
    https://metacpan.org/release/Jacode4e-RoundTrip
    https://metacpan.org/release/Jacode4e
    https://metacpan.org/release/Jacode