MULTIBYTE(3)             BSD Library Functions Manual             MULTIBYTE(3)

NAME
     multibyte -- multibyte and wide character manipulation functions

LIBRARY
     Standard C Library (libc, -lc)

SYNOPSIS
     #include <limits.h>
     #include <stdlib.h>
     #include <wchar.h>

DESCRIPTION
     The basic elements of some written natural languages, such as Chinese, cannot be represented uniquely
     with single C chars.  The C standard supports two different ways of dealing with extended natural
     language encodings: wide characters and multibyte characters.  Wide characters are an internal
     representation which allows each basic element to map to a single object of type wchar_t.  Multibyte characters
     are used for input and output and code each basic element as a sequence of C chars.  Individual basic
     elements may map into one or more (up to MB_LEN_MAX) bytes in a multibyte character.
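
     As an illustration of the two representations, the following sketch converts a multibyte string
     to a wide character string with mbstowcs(3).  The helper names mb_char_count and mb_to_wide are
     hypothetical and not part of the C library.  The sketch assumes that mbstowcs(3) accepts a null
     destination pointer and then only counts characters, which is a POSIX extension rather than a
     C99 guarantee.

```c
#include <assert.h>
#include <stdlib.h>
#include <wchar.h>

/* Hypothetical helper: number of wide characters that the multibyte
 * string s encodes in the current locale, or (size_t)-1 if s contains
 * an invalid sequence.  Assumes the POSIX behavior of mbstowcs() with
 * a null destination (count only, nothing stored). */
size_t mb_char_count(const char *s)
{
    assert(s != NULL);
    return mbstowcs(NULL, s, 0);
}

/* Hypothetical helper: convert s to a newly allocated wide string.
 * Returns NULL on an invalid sequence or allocation failure.
 * The caller frees the result with free(). */
wchar_t *mb_to_wide(const char *s)
{
    size_t n = mb_char_count(s);
    wchar_t *w;

    if (n == (size_t)-1)
        return NULL;
    w = malloc((n + 1) * sizeof *w);   /* +1 for the null wide character */
    if (w == NULL)
        return NULL;
    if (mbstowcs(w, s, n + 1) == (size_t)-1) {
        free(w);
        return NULL;
    }
    return w;
}
```

     The number of bytes in s and the number of elements in the result differ whenever the locale's
     encoding uses more than one byte for some character.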

     The current locale (setlocale(3)) governs the interpretation of wide and multibyte characters.  The
     locale category LC_CTYPE specifically controls this interpretation.  The wchar_t type is wide enough to
     hold the largest value in the wide character representations for all locales.
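
     A minimal sketch of how the LC_CTYPE category affects multibyte handling: the hypothetical
     helper below selects a locale and reports MB_CUR_MAX, the maximum number of bytes in a multibyte
     character for that locale.  The helper name is an assumption made for illustration only.

```c
#include <assert.h>
#include <limits.h>
#include <locale.h>
#include <stdlib.h>

/* Hypothetical helper: select the LC_CTYPE locale called name and
 * return the maximum number of bytes per multibyte character in it.
 * Returns 0 if the locale cannot be selected. */
size_t max_bytes_in_locale(const char *name)
{
    assert(name != NULL);
    if (setlocale(LC_CTYPE, name) == NULL)
        return 0;
    /* MB_CUR_MAX depends on the current locale; it never exceeds
     * the compile-time constant MB_LEN_MAX. */
    return (size_t)MB_CUR_MAX;
}
```

     Passing "" as the name selects the locale from the environment.  Because shift states are
     undefined after the LC_CTYPE locale changes, a call like this belongs before any conversion
     begins.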

     Multibyte strings may contain `shift' indicators to switch to and from particular modes within the
     given representation.  If explicit bytes are used to signal shifting, these are not recognized as
     separate characters but are lumped with a neighboring character.  There is always a distinguished `initial'
     shift state.  Some functions (e.g., mblen(3), mbtowc(3) and wctomb(3)) maintain static shift state
     internally, whereas others store it in an mbstate_t object passed by the caller.  Shift states are
     undefined after a call to setlocale(3) with the LC_CTYPE or LC_ALL categories.
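
     The caller-supplied state can be sketched as follows.  The hypothetical helper
     mb_count_restartable walks a buffer with mbrtowc(3), keeping the shift state in a local
     mbstate_t that is zeroed to represent the initial state.  Only the helper name is invented;
     the return values tested are those documented for mbrtowc(3).

```c
#include <assert.h>
#include <string.h>
#include <wchar.h>

/* Hypothetical helper: count the characters in the first len bytes of
 * s, stopping early at a null character.  Returns (size_t)-1 if an
 * invalid or incomplete multibyte sequence is found. */
size_t mb_count_restartable(const char *s, size_t len)
{
    mbstate_t st;
    size_t count = 0;

    assert(s != NULL);
    memset(&st, 0, sizeof st);          /* zeroed object = initial shift state */

    while (len > 0) {
        wchar_t wc;
        size_t n = mbrtowc(&wc, s, len, &st);

        if (n == (size_t)-1 || n == (size_t)-2)
            return (size_t)-1;          /* invalid or truncated sequence */
        if (n == 0)
            break;                      /* null character terminates the string */
        s += n;
        len -= n;
        count++;
    }
    return count;
}
```

     Because the state lives in the caller's object rather than in a static variable, separate
     strings can be processed independently, each with its own mbstate_t.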

     For convenience in processing, the wide character with value 0 (the null wide character) is recognized
     as the wide character string terminator, and the character with value 0 (the null byte) is recognized
     as the multibyte character string terminator.  Null bytes are not permitted within multibyte
     characters.

     The C library provides the following functions for dealing with multibyte characters:

     Function       Description
     mblen(3)       get number of bytes in a character
     mbrlen(3)      get number of bytes in a character (restartable)
     mbrtowc(3)     convert a character to a wide-character code (restartable)
     mbsrtowcs(3)   convert a character string to a wide-character string (restartable)
     mbstowcs(3)    convert a character string to a wide-character string
     mbtowc(3)      convert a character to a wide-character code
     wcrtomb(3)     convert a wide-character code to a character (restartable)
     wcstombs(3)    convert a wide-character string to a character string
     wcsrtombs(3)   convert a wide-character string to a character string (restartable)
     wctomb(3)      convert a wide-character code to a character
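
     For the opposite direction, the following sketch encodes a wide character string one character
     at a time with wcrtomb(3).  The helper name wide_to_mb and its buffer-capacity convention are
     assumptions made for illustration; wcsrtombs(3) performs a similar conversion in a single call.

```c
#include <assert.h>
#include <limits.h>
#include <string.h>
#include <wchar.h>

/* Hypothetical helper: encode the wide string ws into dst, which has
 * room for cap bytes including the terminating null byte.  Returns
 * the number of bytes written, not counting the terminator, or
 * (size_t)-1 if a character cannot be encoded or dst is too small. */
size_t wide_to_mb(char *dst, size_t cap, const wchar_t *ws)
{
    mbstate_t st;
    size_t used = 0;

    assert(dst != NULL && ws != NULL);
    memset(&st, 0, sizeof st);          /* start in the initial shift state */

    for (;;) {
        char buf[MB_LEN_MAX];
        size_t n = wcrtomb(buf, *ws, &st);

        if (n == (size_t)-1)
            return (size_t)-1;          /* character not representable */
        if (n > cap - used)
            return (size_t)-1;          /* not enough room in dst */
        memcpy(dst + used, buf, n);
        if (*ws == L'\0')
            return used + n - 1;        /* n includes the null byte */
        used += n;
        ws++;
    }
}
```

     Converting through a small scratch buffer means that dst is never written past cap bytes, even
     when the last character would only partly fit.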

SEE ALSO
     mklocale(1), setlocale(3), stdio(3), big5(5), euc(5), gb18030(5), gb2312(5), gbk(5), mskanji(5),
     utf8(5)

STANDARDS
     These functions conform to ISO/IEC 9899:1999 (``ISO C99'').

BSD                              April 8, 2004                             BSD
