kiconv_open (9)

NAME

kiconv_open - code conversion descriptor allocation function

SYNOPSIS

#include <sys/sunddi.h>



kiconv_t kiconv_open(const char *tocode, const char *fromcode);

INTERFACE LEVEL

Solaris DDI specific (Solaris DDI).

PARAMETERS

tocode

: Points to a target codeset name string.

fromcode

: Points to a source codeset name string.

DESCRIPTION

The kiconv_open() function returns a code conversion descriptor that describes a conversion from the codeset specified by fromcode to the codeset specified by tocode. For state-dependent encodings, the conversion descriptor is in a codeset-dependent initial state (ready for immediate use with the kiconv() function).

Supported code conversions are between UTF-8 and the following:

Name                    Description

Big5                    Traditional Chinese Big5
Big5-HKSCS              Traditional Chinese Big5-Hong Kong
                        Supplementary Character Set
CP720                   DOS Arabic                                          
CP737                   DOS Greek                                           
CP850                   DOS Latin-1 (Western European)                      
CP852                   DOS Latin-2 (Eastern European)                      
CP857                   DOS Latin-5 (Turkish)                               
CP862                   DOS Hebrew                                          
CP866                   DOS Cyrillic Russian                                
CP932                   Japanese Shift JIS (Windows)                       
CP950-HKSCS             Traditional Chinese HKSCS-2001 (Windows)           
CP1250                  Central Europe
CP1251                  Cyrillic
CP1252                  Western Europe
CP1253                  Greek
CP1254                  Turkish
CP1255                  Hebrew
CP1256                  Arabic
CP1257                  Baltic
EUC-CN                  Simplified Chinese EUC
EUC-JP                  Japanese EUC
EUC-JP-MS               Japanese EUC MS
EUC-KR                  Korean EUC
EUC-TW                  Traditional Chinese EUC
GB18030                 Simplified Chinese GB18030
GBK                     Simplified Chinese GBK                              
ISO-8859-1              Latin-1 (Western European)
ISO-8859-2              Latin-2 (Eastern European) 
ISO-8859-3              Latin-3 (Southern European)                         
ISO-8859-4              Latin-4 (Northern European)                         
ISO-8859-5              Cyrillic
ISO-8859-6              Arabic
ISO-8859-7              Greek
ISO-8859-8              Hebrew
ISO-8859-9              Latin-5 (Turkish)
ISO-8859-10             Latin-6 (Nordic)                                    
ISO-8859-13             Latin-7 (Baltic)
ISO-8859-15             Latin-9 (Western European with euro sign)
KOI8-R                  Cyrillic
Shift_JIS               Japanese Shift JIS (JIS)                            
TIS_620                 Thai (a.k.a. ISO 8859-11)
Unified-Hangul          Korean Unified Hangul

UTF-8 and the names above can be passed as tocode and fromcode to specify the desired code conversion. The following aliases are also accepted as alternative names:

Aliases                 Original Name

720                     CP720
737                     CP737
850                     CP850
852                     CP852
857                     CP857
862                     CP862
866                     CP866
932                     CP932
936, CP936              GBK
949, CP949              Unified-Hangul
950, CP950              Big5
1250                    CP1250
1251                    CP1251
1252                    CP1252
1253                    CP1253
1254                    CP1254
1255                    CP1255
1256                    CP1256
1257                    CP1257
ISO-8859-11             TIS_620
PCK, SJIS               Shift_JIS
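
The alias listing can be read as a simple lookup from an alias to its original codeset name. The following is a minimal sketch of that mapping in plain C. The structure, the table layout, and the function name canonical_codeset() are hypothetical and are not part of the DDI. Exact, case-sensitive matching is an assumption of this sketch, and kiconv_open() itself does not require callers to resolve aliases first.

```c
#include <stddef.h>
#include <string.h>

/* One row of the alias listing: alias on the left, original name on the right. */
struct codeset_alias {
        const char *alias;
        const char *name;
};

/* Entries copied from the alias listing in this manual page. */
static const struct codeset_alias alias_table[] = {
        { "720", "CP720" },     { "737", "CP737" },
        { "850", "CP850" },     { "852", "CP852" },
        { "857", "CP857" },     { "862", "CP862" },
        { "866", "CP866" },     { "932", "CP932" },
        { "936", "GBK" },       { "CP936", "GBK" },
        { "949", "Unified-Hangul" }, { "CP949", "Unified-Hangul" },
        { "950", "Big5" },      { "CP950", "Big5" },
        { "1250", "CP1250" },   { "1251", "CP1251" },
        { "1252", "CP1252" },   { "1253", "CP1253" },
        { "1254", "CP1254" },   { "1255", "CP1255" },
        { "1256", "CP1256" },   { "1257", "CP1257" },
        { "ISO-8859-11", "TIS_620" },
        { "PCK", "Shift_JIS" }, { "SJIS", "Shift_JIS" }
};

/*
 * Hypothetical helper: return the original codeset name for an alias,
 * or the input string unchanged if it is not a listed alias.
 * Matching is exact and case-sensitive (an assumption of this sketch).
 */
const char *
canonical_codeset(const char *name)
{
        size_t i;

        for (i = 0; i < sizeof (alias_table) / sizeof (alias_table[0]); i++) {
                if (strcmp(name, alias_table[i].alias) == 0)
                        return (alias_table[i].name);
        }
        return (name);
}
```

With this table, canonical_codeset("936") and canonical_codeset("CP936") both yield "GBK", while a name that is not an alias, such as "UTF-8", is returned as is.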

A conversion descriptor remains valid until it is closed by using kiconv_close().

RETURN VALUES

Upon successful completion, kiconv_open() returns a code conversion descriptor for use in subsequent calls to kiconv(). Otherwise, if the conversion specified by fromcode and tocode is not supported, or if the descriptor cannot be allocated for any other reason, kiconv_open() returns (kiconv_t)-1 to indicate the error.

CONTEXT

kiconv_open() can be called from user context only.

EXAMPLES

Example 1 Opening a Code Conversion

The following example shows how to open a code conversion from ISO 8859-15 to UTF-8:

#include <sys/sunddi.h>

kiconv_t cd;

cd = kiconv_open("UTF-8", "ISO-8859-15");
if (cd == (kiconv_t)-1) {
        /* Cannot open up the code conversion. */
        return (-1);
}

ATTRIBUTES

See attributes(5) for descriptions of the following attributes:

ATTRIBUTE TYPE          ATTRIBUTE VALUE

Interface Stability     Committed

NOTES

Code conversions are available only between UTF-8 and the codesets listed above; two non-UTF-8 codesets cannot be converted directly. For example, to convert from EUC-JP to Shift_JIS, first convert EUC-JP to UTF-8 and then convert UTF-8 to Shift_JIS.
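
The two-step rule can be sketched as a small planning helper that decides which kiconv_open() calls a caller would need. This is an illustration in plain C, not kernel code. The structure conv_step and the function plan_conversion() are hypothetical names. The sketch assumes both names are already in their original (non-alias) form and compares them exactly.

```c
#include <string.h>

/* Argument pair for one kiconv_open(tocode, fromcode) call. */
struct conv_step {
        const char *tocode;
        const char *fromcode;
};

/*
 * Hypothetical helper: fill steps[] with the kiconv_open() argument
 * pairs needed to get from fromcode to tocode and return how many
 * entries were filled in (0, 1, or 2).
 */
int
plan_conversion(const char *fromcode, const char *tocode,
    struct conv_step steps[2])
{
        int from_utf8 = (strcmp(fromcode, "UTF-8") == 0);
        int to_utf8 = (strcmp(tocode, "UTF-8") == 0);

        /* Same codeset on both sides: nothing to open. */
        if (strcmp(fromcode, tocode) == 0)
                return (0);

        /* One side is UTF-8: a single descriptor is enough. */
        if (from_utf8 || to_utf8) {
                steps[0].tocode = tocode;
                steps[0].fromcode = fromcode;
                return (1);
        }

        /* Neither side is UTF-8: pivot through UTF-8. */
        steps[0].tocode = "UTF-8";
        steps[0].fromcode = fromcode;
        steps[1].tocode = tocode;
        steps[1].fromcode = "UTF-8";
        return (2);
}
```

For EUC-JP to Shift_JIS the helper returns two steps, corresponding to kiconv_open("UTF-8", "EUC-JP") followed by kiconv_open("Shift_JIS", "UTF-8"). The caller would run the output of the first conversion through the second and close both descriptors with kiconv_close() when finished.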

The code conversions supported are based on simple one-to-one mappings. There is no special treatment or processing done during code conversions such as case conversion, Unicode Normalization, or mapping between combining or conjoining sequences of UTF-8 and pre-composed characters in non-UTF-8 codesets.

All supported non-UTF-8 codesets use pre-composed characters only, whereas UTF-8 also allows combining and conjoining characters. For this reason, it might be necessary to apply a Unicode Normalization form to the UTF-8 text with u8_textprep_str() before or after the code conversion.