NAME
     iconv_ja - code set conversions in ja locale

DESCRIPTION
     The following code set conversions are supported:

      _____________________________________
     | Code Set Conversions Supported      |
     | Source Code  | Target Code          |
     |______________|______________________|
     | eucJP        | PCK                  |
     | eucJP        | ISO-2022-JP          |
     | eucJP        | ISO-2022-JP.RFC1468  |
     | eucJP        | JIS7                 |
     | eucJP        | SJIS                 |
     | eucJP        | UTF-8                |
     | eucJP        | UTF-8-Java           |
     | eucJP        | jis                  |
     | eucJP        | ibmj                 |
     | eucJP        | ibmj-EBCDIK          |
     | SJIS         | eucJP                |
     | SJIS         | ISO-2022-JP          |
     | SJIS         | UTF-8                |
     | SJIS         | jis                  |
     | SJIS         | ibmj                 |
     | PCK          | eucJP                |
     | PCK          | UTF-8                |
     | PCK          | UTF-8-Java           |
     | PCK          | ISO-2022-JP          |
     | PCK          | ISO-2022-JP.RFC1468  |
     | PCK          | jis                  |
     | PCK          | ibmj                 |
     | PCK          | ibmj-EBCDIK          |
     | ISO-2022-JP  | eucJP                |
     | ISO-2022-JP  | PCK                  |
     | ISO-2022-JP  | SJIS                 |
     | ISO-2022-JP  | UTF-8                |
     | UTF-8        | eucJP                |
     | UTF-8        | SJIS                 |
     | UTF-8        | PCK                  |
     | UTF-8        | ISO-2022-JP          |
     | UTF-8        | ISO-2022-JP.RFC1468  |
     | UTF-8-Java   | eucJP                |
     | UTF-8-Java   | PCK                  |
     | JIS7         | eucJP                |
     | jis          | eucJP                |
     | jis          | PCK                  |
     | jis          | SJIS                 |
     | ibmj         | eucJP                |
     | ibmj         | PCK                  |
     | ibmj         | SJIS                 |
     | ibmj-EBCDIK  | eucJP                |
     | ibmj-EBCDIK  | PCK                  |
     |______________|______________________|

      _____________________________________
     | Code Set Conversions Supported      |
     | Source Code  | Target Code          |
     |______________|______________________|
     | eucJP        | ibm930               |
     | eucJP        | ibm931               |
     | eucJP        | ibm939               |
     | eucJP        | ibm5026              |
     | eucJP        | ibm5035              |
     | PCK          | ibm930               |
     | PCK          | ibm931               |
     | PCK          | ibm939               |
     | PCK          | ibm5026              |
     | PCK          | ibm5035              |
     | UTF-8        | ibm930               |
     | UTF-8        | ibm931               |
     | UTF-8        | ibm939               |
     | UTF-8        | ibm5026              |
     | UTF-8        | ibm5035              |
     | UTF-8        | ms932                |
     | UTF-8        | UTF-8-ms932          |
     | UTF-8-ms932  | UTF-8                |
     | ibm930       | eucJP                |
     | ibm930       | PCK                  |
     | ibm930       | UTF-8                |
     | ibm931       | eucJP                |
     | ibm931       | PCK                  |
     | ibm931       | UTF-8                |
     | ibm939       | eucJP                |
     | ibm939       | PCK                  |
     | ibm939       | UTF-8                |
     | ibm5026      | eucJP                |
     | ibm5026      | PCK                  |
     | ibm5026      | UTF-8                |
     | ibm5035      | eucJP                |
     | ibm5035      | PCK                  |
     | ibm5035      | UTF-8                |
     | ms932        | UTF-8                |
     |______________|______________________|

     The descriptions
     of each code set in the tables above are as follows:

     ____________________________________________________________
     Description of Supported Code Sets
     Codeset              Description
     ____________________________________________________________
     eucJP                Japanese EUC
     PCK                  PC kanji
     SJIS                 The same as PC kanji (EOL in a future
                          release)
     ISO-2022-JP          Coded representation of the character
                          sets ISO 646 IRV or JIS X 0201, JIS X
                          0208, and JIS X 0212 according to the
                          UI/OSF Application Platform Profile
                          for Japanese Environment Version 1.1,
                          item 7.1, using the designation
                          sequence to G0 specified by ISO 2022
     JIS7                 Same as ISO-2022-JP
     ISO-2022-JP.RFC1468  Coded representation of the character
                          sets ISO 646 IRV or JIS X 0201-1976
                          (except for the figure character set
                          for katakana), and JIS X 0208-1983
                          according to RFC 1468 (Request for
                          Comments: 1468, Japanese Character
                          Encoding for Internet Messages), using
                          the designation sequence to G0
                          specified by ISO 2022
     jis                  JIS 7-bit code used in JLE, JFP 2.4, and
                          the preceding releases
     ibmj                 IBM Kanji code
     ibmj-EBCDIK          Maps the single-byte code set (SBCS) of
                          the IBM host code to the character set
                          that is generally called the EBCDIK
                          code set. The character set comprises
                          IBM code page 290 and three more
                          characters: '`' (0x79), '{' (0xc0), and
                          '}' (0xd0). Japanese katakana
                          characters are included, but lowercase
                          alphabetic letters are not. For the
                          double-byte code set (DBCS), the
                          description is the same as for the
                          code set "ibmj".
     UTF-8                Unicode
     UTF-8-Java           Unicode as implemented in Java
     ____________________________________________________________

     ____________________________________________________________
     Description of Supported Code Sets
     Codeset              Description
     ____________________________________________________________
     ibm930               IBM CCSID 930: SBCS code page 290
                          (extended), character set 1172; DBCS
                          code page 300, character set 1001;
                          4370 user-defined characters
     ibm931               IBM CCSID 931: SBCS code page 37,
                          character set 101; DBCS code page 300,
                          character set 1001; 4370 user-defined
                          characters
     ibm939               IBM CCSID 939: SBCS code page 1027,
                          character set 1172; DBCS code page 300,
                          character set 1001; 4370 user-defined
                          characters
     ibm5026              IBM CCSID 5026: same as ibm930, except
                          that this code set supports 1880
                          user-defined characters
     ibm5035              IBM CCSID 5035: same as ibm939, except
                          that this code set supports 1880
                          user-defined characters
     ms932                Shift JIS code set supported by Windows
                          NT 3.51. Conversion between this code
                          set and UTF-8 is done in the same way
                          as Windows NT 3.51 does it.
     UTF-8-ms932          UTF-8 encoded Unicode that was
                          converted from ms932
     ____________________________________________________________

     Conversions are performed as described below. For all
     conversions, if the source code set includes characters that
     are not included in the target code set, each such character
     is converted and output as a substitute character.

  eucJP to PCK (SJIS) and PCK (SJIS) to eucJP
     Conversion between eucJP and PCK (SJIS) can be used to convert
     JIS X 0201, JIS X 0208, JIS X 0212, and user-defined and
     vendor-defined characters, based on the TOG Japanese Vendors
     Council (TOG/JVC) Recommended Code Set Conversion
     Specification between Japanese EUC and Shift-JIS. If input
     data that does not belong to the source code set is
     encountered, iconv(3C) fails and sets errno to EILSEQ.
     iconv(1) stops at the last point of successful conversion.

  eucJP to ISO-2022-JP (JIS7) and ISO-2022-JP (JIS7) to eucJP
     Conversion between eucJP and ISO-2022-JP (JIS7) can be used to
     convert JIS X 0201, JIS X 0208, and JIS X 0212. If input data
     that does not belong to the source code set is encountered,
     iconv(3C) fails and sets errno to EILSEQ. iconv(1) stops at
     the last point of successful conversion.

  eucJP to ISO-2022-JP.RFC1468
     Conversion from eucJP to ISO-2022-JP.RFC1468 can be used to
     convert JIS X 0201 (except for the figure character set for
     katakana) and JIS X 0208. If a JIS X 0201 (figure character
     set for katakana), JIS X 0212, user-defined, or vendor-defined
     character is encountered in the input data, it is replaced
     with the substitute character '?' (0x3f). If input data that
     does not belong to these code sets is encountered, iconv(3C)
     fails and sets errno to EILSEQ. iconv(1) stops at the last
     point of successful conversion.

  eucJP to jis and jis to eucJP
     Conversion between eucJP and jis is provided for compatibility
     with the ujtojis7() and jis7touj() library functions and the
     euctojis and jistoeuc utilities. It is extended to handle
     JIS X 0212. See jisconv(3X) and jistoeuc(1).

  eucJP to UTF-8 and UTF-8 to eucJP
     Conversion between eucJP and UTF-8 can be used to convert
     JIS X 0201, JIS X 0208, JIS X 0212, and user-defined and
     vendor-defined characters. If input data that has no
     corresponding character in the target code set is encountered,
     it is replaced with the substitute character (eucJP: '?'
     (0x3f), UTF-8: U+FFFD (0xefbfbd)). If input data that does not
     belong to these code sets is encountered, iconv(3C) fails and
     sets errno to EILSEQ. iconv(1) stops at the last point of
     successful conversion.

  eucJP to UTF-8-Java and UTF-8-Java to eucJP
     Conversion between eucJP and UTF-8-Java can be used to convert
     JIS X 0201, JIS X 0208, and JIS X 0212.
     If a user-defined or vendor-defined character is encountered
     in the input data, it is replaced with the substitute
     character (eucJP: '?' (0x3f), UTF-8: U+FFFD (0xefbfbd)). If
     input data that does not belong to these code sets is
     encountered, iconv(3C) fails and sets errno to EILSEQ.
     iconv(1) stops at the last point of successful conversion.

  eucJP to ibmj and ibmj to eucJP
     Conversion between eucJP and ibmj is based on the IBM Kanji
     codebook (4th edition - September 1987), JIS X 0201, and
     JIS X 0208-1983. If you convert eucJP to ibmj, JISX 0201 and
     JIS X 0201 are all converted to the substitute character. See
     ibmjcode(3X).

  eucJP to ibmj-EBCDIK and ibmj-EBCDIK to eucJP
     Conversion between eucJP and ibmj-EBCDIK is based on the IBM
     Kanji codebook (4th edition - September 1987), JIS X 0201, and
     JIS X 0208-1983. If you convert eucJP to ibmj-EBCDIK, JISX
     0201 and JIS X 0201 characters that have no corresponding
     character in ibmj-EBCDIK are all converted to the substitute
     character.

  PCK (SJIS) to ISO-2022-JP and ISO-2022-JP to PCK (SJIS)
     Conversion between PCK (SJIS) and ISO-2022-JP can be used to
     convert JIS X 0201, JIS X 0208, JIS X 0212, and user-defined
     and vendor-defined characters, based on the TOG Japanese
     Vendors Council (TOG/JVC) Recommended Code Set Conversion
     Specification between Japanese EUC and Shift-JIS. If input
     data that does not belong to the source code set is
     encountered, iconv(3C) fails and sets errno to EILSEQ.
     iconv(1) stops at the last point of successful conversion.

  PCK (SJIS) to ISO-2022-JP.RFC1468
     Conversion from PCK (SJIS) to ISO-2022-JP.RFC1468 can be used
     to convert JIS X 0201 (except for the figure character set for
     katakana) and JIS X 0208. If a JIS X 0201 (figure character
     set for katakana), user-defined, or vendor-defined character
     is encountered in the input data, it is replaced with the
     substitute character '?' (0x3f). If input data that does not
     belong to these code sets is encountered, iconv(3C) fails and
     sets errno to EILSEQ.
     iconv(1) stops at the last point of successful conversion.

  PCK (SJIS) to UTF-8 and UTF-8 to PCK (SJIS)
     Conversion between PCK (SJIS) and UTF-8 can be used to convert
     JIS X 0201, JIS X 0208, and user-defined and vendor-defined
     characters. If input data that has no corresponding character
     in the target code set is encountered, it is replaced with the
     substitute character (PCK: '?' (0x3f), UTF-8: U+FFFD
     (0xefbfbd)). If input data that does not belong to these code
     sets is encountered, iconv(3C) fails and sets errno to EILSEQ.
     iconv(1) stops at the last point of successful conversion.

  PCK (SJIS) to UTF-8-Java and UTF-8-Java to PCK (SJIS)
     Conversion between PCK (SJIS) and UTF-8-Java can be used to
     convert JIS X 0201 and JIS X 0208. If a user-defined or
     vendor-defined character is encountered in the input data, it
     is replaced with the substitute character (PCK: '?' (0x3f),
     UTF-8: U+FFFD (0xefbfbd)). If input data that does not belong
     to these code sets is encountered, iconv(3C) fails and sets
     errno to EILSEQ. iconv(1) stops at the last point of
     successful conversion.

  PCK (SJIS) to jis and jis to PCK (SJIS)
     Conversion between PCK (SJIS) and jis is provided for
     compatibility with the sjtojis7() and jis7tosj() library
     functions and the sjtojis and jistosj utilities. It is
     extended based on the TOG Japanese Vendors Council (TOG/JVC)
     Recommended Code Set Conversion Specification between Japanese
     EUC and Shift-JIS. See jisconv(3X) and jistosj(1).

  PCK (SJIS) to ibmj and ibmj to PCK (SJIS)
     Conversion between PCK (SJIS) and ibmj is based on the IBM
     Kanji codebook (4th edition - September 1987), JIS X 0201, and
     JIS X 0208-1983. If you convert PCK (SJIS) to ibmj, kana
     characters (0xa1 to 0xdf) and all characters that would be
     converted to JIS X 0212 under the TOG Japanese Vendors Council
     (TOG/JVC) Recommended Code Set Conversion Specification
     between Japanese EUC and Shift-JIS are converted to the
     substitute character. See ibmjcode(3X).
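
     The substitute-character and EILSEQ behavior described above
     for conversions between PCK (SJIS) and UTF-8 can be
     illustrated with a rough Python sketch. It does not call
     iconv(3C). Python's built-in "shift_jis" codec stands in for
     PCK, which is an assumption, and its mapping tables may differ
     from those of the converters documented here. The helper name
     convert() is hypothetical.

```python
from typing import Tuple


def convert(data: bytes, source: str, target: str) -> Tuple[bytes, int]:
    """Convert data from the source codec to the target codec.

    Returns (output, consumed), where consumed is the number of input
    bytes that were converted before any invalid input was met.
    """
    try:
        # Strict decoding: input outside the source code set raises.
        text = data.decode(source)
        consumed = len(data)
    except UnicodeDecodeError as exc:
        # Rough analogue of EILSEQ: keep only the part that converted
        # cleanly, up to the offset of the offending byte.
        consumed = exc.start
        text = data[:consumed].decode(source)
    # Characters with no counterpart in the target become '?' (0x3f).
    return text.encode(target, errors="replace"), consumed


if __name__ == "__main__":
    # Three kanji followed by U+20AC, which shift_jis cannot encode.
    unmappable = "\u65e5\u672c\u8a9e\u20ac".encode("utf-8")
    print(convert(unmappable, "utf-8", "shift_jis"))

    # Two kanji, then 0xff, which is never valid in UTF-8.
    invalid = "\u65e5\u672c".encode("utf-8") + b"\xff" + b"abc"
    print(convert(invalid, "utf-8", "shift_jis"))
```

     In the second call the returned count marks the last point of
     successful conversion, which is where iconv(1) is described as
     stopping. In the opposite direction, the UTF-8 substitute
     U+FFFD corresponds to the bytes 0xefbfbd.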

  PCK to ibmj-EBCDIK and ibmj-EBCDIK to PCK
     Conversion between PCK and ibmj-EBCDIK is based on the IBM
     Kanji codebook (4th edition - September 1987), JIS X 0201, and
     JIS X 0208-1983. If you convert PCK to ibmj-EBCDIK, all
     characters that would be converted to JIS X 0212 under the TOG
     Japanese Vendors Council (TOG/JVC) Recommended Code Set
     Conversion Specification between Japanese EUC and Shift-JIS
     are converted to the substitute character.

  ISO-2022-JP to UTF-8 and UTF-8 to ISO-2022-JP
     Conversion between ISO-2022-JP and UTF-8 can be used to
     convert JIS X 0201, JIS X 0208, JIS X 0212, and user-defined
     and vendor-defined characters. If input data that has no
     corresponding character in the target code set is encountered,
     it is replaced with the substitute character (ISO-2022-JP: '?'
     (0x3f), UTF-8: U+FFFD (0xefbfbd)). If input data that does not
     belong to these code sets is encountered, iconv(3C) fails and
     sets errno to EILSEQ. iconv(1) stops at the last point of
     successful conversion.

  UTF-8 to ISO-2022-JP.RFC1468
     Conversion from UTF-8 to ISO-2022-JP.RFC1468 can be used to
     convert JIS X 0201 (except for the figure character set for
     katakana) and JIS X 0208. If a JIS X 0201 (figure character
     set for katakana), JIS X 0212, user-defined, or vendor-defined
     character is encountered in the input data, it is replaced
     with the substitute character '?' (0x3f). If input data that
     does not belong to these code sets is encountered, iconv(3C)
     fails and sets errno to EILSEQ. iconv(1) stops at the last
     point of successful conversion.

  eucJP, PCK, UTF-8 to ibm930, ibm931, ibm939, ibm5026, ibm5035
     Conversion from eucJP, PCK, or UTF-8 to ibm930, ibm931,
     ibm939, ibm5026, or ibm5035 can be used to convert JIS X 0201,
     JIS X 0208, JIS X 0212, IBM extension characters, and
     user-defined characters. Input data that has no corresponding
     character in the target code set is replaced with the
     substitute character.
     Since ibm931 does not support kana characters in its
     single-byte code set (SBCS), JIS X 0201 kana characters are
     replaced with substitute characters in conversion to ibm931.

  ibm930, ibm931, ibm939, ibm5026, ibm5035 to eucJP, PCK, UTF-8
     Conversion from ibm930, ibm931, ibm939, ibm5026, or ibm5035 to
     eucJP, PCK, or UTF-8 can be used to convert the SBCS and DBCS
     characters defined in the input code set. Input data that has
     no corresponding character in the target code set is replaced
     with the substitute character.

  ms932 to UTF-8 and UTF-8 to ms932
     Conversion between ms932 and UTF-8 maps characters between the
     two code sets in the same way as Windows NT 3.51 does.

  UTF-8 to UTF-8-ms932 and UTF-8-ms932 to UTF-8
     This converts between "UTF-8", which is UTF-8 encoded Unicode
     converted from PCK, and "UTF-8-ms932", which is UTF-8 encoded
     Unicode converted from ms932.

SEE ALSO
     iconv(1), jistoeuc(1), jistosj(1), iconv(3C), jisconv(3X),
     ibmjcode(3X), iconv(5), iconv_en_US.UTF-8(5), iconv_unicode(5)
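
NOTES
     The distinction between "UTF-8" and "UTF-8-ms932" described
     under DESCRIPTION can be seen in a small Python sketch. It
     uses Python's built-in codecs, not iconv(3C). Treating
     "shift_jis" as a stand-in for PCK and "cp932" as a stand-in
     for ms932 is an assumption, and the tables used by the
     converters documented here may differ in detail. The helper
     name code_points() is hypothetical.

```python
def code_points(raw: bytes, codec: str) -> str:
    """Decode raw with codec and list the resulting code points."""
    return " ".join(f"U+{ord(ch):04X}" for ch in raw.decode(codec))


# One double-byte code that the two Python codecs map differently.
WAVE = b"\x81\x60"

if __name__ == "__main__":
    for codec in ("shift_jis", "cp932"):
        utf8 = WAVE.decode(codec).encode("utf-8")
        print(codec, code_points(WAVE, codec), utf8.hex())
```

     Under these assumptions the same two input bytes become
     U+301C with "shift_jis" and U+FF5E with "cp932", so the UTF-8
     output differs according to the mapping that produced it.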