.\" Copyright (c) 1993
.\"    The Regents of the University of California.  All rights reserved.
.\"
.\" This code is derived from software contributed to Berkeley by
.\" Donn Seeley of BSDI.
.\"
.\" Redistribution and use in source and binary forms, with or without
.\" modification, are permitted provided that the following conditions
.\" are met:
.\" 1. Redistributions of source code must retain the above copyright
.\"    notice, this list of conditions and the following disclaimer.
.\" 2. Redistributions in binary form must reproduce the above copyright
.\"    notice, this list of conditions and the following disclaimer in the
.\"    documentation and/or other materials provided with the distribution.
.\" 3. All advertising materials mentioning features or use of this software
.\"    must display the following acknowledgement:
.\"      This product includes software developed by the University of
.\"      California, Berkeley and its contributors.
.\" 4. Neither the name of the University nor the names of its contributors
.\"    may be used to endorse or promote products derived from this software
.\"    without specific prior written permission.
.\"
.\" THIS SOFTWARE IS PROVIDED BY THE REGENTS AND CONTRIBUTORS ``AS IS'' AND
.\" ANY EXPRESS OR IMPLIED WARRANTIES, INCLUDING, BUT NOT LIMITED TO, THE
.\" IMPLIED WARRANTIES OF MERCHANTABILITY AND FITNESS FOR A PARTICULAR PURPOSE
.\" ARE DISCLAIMED.  IN NO EVENT SHALL THE REGENTS OR CONTRIBUTORS BE LIABLE
.\" FOR ANY DIRECT, INDIRECT, INCIDENTAL, SPECIAL, EXEMPLARY, OR CONSEQUENTIAL
.\" DAMAGES (INCLUDING, BUT NOT LIMITED TO, PROCUREMENT OF SUBSTITUTE GOODS
.\" OR SERVICES; LOSS OF USE, DATA, OR PROFITS; OR BUSINESS INTERRUPTION)
.\" HOWEVER CAUSED AND ON ANY THEORY OF LIABILITY, WHETHER IN CONTRACT, STRICT
.\" LIABILITY, OR TORT (INCLUDING NEGLIGENCE OR OTHERWISE) ARISING IN ANY WAY
.\" OUT OF THE USE OF THIS SOFTWARE, EVEN IF ADVISED OF THE POSSIBILITY OF
.\" SUCH DAMAGE.
.\"
.\"     @(#)multibyte.3 8.1 (Berkeley) 6/4/93
.\" $FreeBSD: src/lib/libc/locale/multibyte.3,v 1.6.2.5 2001/12/14 18:33:54 ru Exp $
.\" $DragonFly: src/lib/libc/locale/Attic/multibyte.3,v 1.3 2005/03/24 12:48:04 swildner Exp $
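.\"
.Dd June 4, 1993
.Dt MULTIBYTE 3
.Os
.Sh NAME
.Nm mblen ,
.Nm mbstowcs ,
.Nm mbtowc ,
.Nm wcstombs ,
.Nm wctomb
.Nd multibyte character support for C
.Sh LIBRARY
.Lb libc
.Sh SYNOPSIS
.In stdlib.h
.Ft int
.Fn mblen "const char *mbchar" "size_t nbytes"
.Ft size_t
.Fn mbstowcs "wchar_t *wcstring" "const char *mbstring" "size_t nwchars"
.Ft int
.Fn mbtowc "wchar_t *wcharp" "const char *mbchar" "size_t nbytes"
.Ft size_t
.Fn wcstombs "char *mbstring" "const wchar_t *wcstring" "size_t nbytes"
.Ft int
.Fn wctomb "char *mbchar" "wchar_t wchar"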
.Sh DESCRIPTION
The basic elements of some written natural languages such as Chinese
cannot be represented uniquely with single C
.Va char Ns s .
The C standard supports two different ways of dealing with
extended natural language encodings,
.Em wide
characters and
.Em multibyte
characters.
Wide characters are an internal representation
which allows each basic element to map
to a single object of type
.Va wchar_t .
Multibyte characters are used for input and output
and code each basic element as a sequence of C
.Va char Ns s .
Individual basic elements may map into one or more
(up to
.Dv MB_LEN_MAX )
bytes in a multibyte character.
.Pp
The current locale
.Pq Xr setlocale 3
governs the interpretation of wide and multibyte characters.
The locale category
.Dv LC_CTYPE
specifically controls this interpretation.
The
.Va wchar_t
type is wide enough to hold the largest value
in the wide character representations for all locales.
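.Pp
As an illustration of how the locale governs the encoding, the following
minimal sketch selects an LC_CTYPE locale and reports the longest multibyte
character that locale can produce.
The helper name max_mb_len is hypothetical, and the sketch assumes the
standard MB_CUR_MAX macro from stdlib.h, which this page does not describe.

```c
#include <locale.h>
#include <stdlib.h>

/*
 * Hypothetical helper: make locale_name the current LC_CTYPE locale and
 * return the maximum number of bytes in one multibyte character for it.
 * Returns 0 if the locale could not be selected.
 */
size_t
max_mb_len(const char *locale_name)
{
    if (setlocale(LC_CTYPE, locale_name) == NULL)
        return 0;
    /* MB_CUR_MAX depends on the LC_CTYPE locale selected above. */
    return (size_t)MB_CUR_MAX;
}
```

Passing "" as the locale name would select the locale named by the
environment; note that the call changes process-wide state.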
.Pp
Multibyte strings may contain
.Sq shift
indicators to switch to and from
particular modes within the given representation.
If explicit bytes are used to signal shifting,
these are not recognized as separate characters
but are lumped with a neighboring character.
There is always a distinguished
.Sq initial
shift state.
The
.Fn mbstowcs
and
.Fn wcstombs
functions assume that multibyte strings are interpreted
starting from the initial shift state.
The
.Fn mblen ,
.Fn mbtowc
and
.Fn wctomb
functions maintain static shift state internally.
A call with a null
.Fa mbchar
pointer returns nonzero if the current locale requires shift states,
zero otherwise;
if shift states are required, the shift state is reset to the initial state.
The internal shift states are undefined after a call to
.Fn setlocale
with the
.Dv LC_CTYPE
or
.Dv LC_ALL
categories.
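.Pp
The null-pointer calls described above can be sketched as follows.
The helper name reset_shift_state is hypothetical; a program might call
something like it after changing the locale, since the internal shift
states are undefined at that point.

```c
#include <stdlib.h>

/*
 * Hypothetical helper: return the three stateful functions to the
 * initial shift state.  The result is nonzero if the current locale
 * uses shift states, zero otherwise.
 */
int
reset_shift_state(void)
{
    /* A null mbchar pointer queries and resets the internal state. */
    int stateful = mblen(NULL, 0);

    (void)mbtowc(NULL, NULL, 0);
    (void)wctomb(NULL, 0);
    return stateful != 0;
}
```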
.Pp
For convenience in processing,
the wide character with value 0
(the null wide character)
is recognized as the wide character string terminator,
and the character with value 0
(the null byte)
is recognized as the multibyte character string terminator.
Null bytes are not permitted within multibyte characters.
.Pp
The
.Fn mblen
function computes the length in bytes
of a multibyte character
.Fa mbchar .
Up to
.Fa nbytes
bytes are examined.
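.Pp
A common use of the length computation is to step through a string one
character at a time.
The sketch below counts the characters in a multibyte string; the name
mb_count is hypothetical and the string is assumed to be null terminated.

```c
#include <stdlib.h>
#include <string.h>

/*
 * Hypothetical helper: count the multibyte characters in s.
 * Returns -1 if an invalid or incomplete character is found.
 */
long
mb_count(const char *s)
{
    size_t left = strlen(s);
    long count = 0;

    (void)mblen(NULL, 0);       /* begin in the initial shift state */
    while (left > 0) {
        int n = mblen(s, left);

        if (n < 0)
            return -1;          /* no character could be recognized */
        if (n == 0)
            break;              /* null byte: end of string */
        s += n;
        left -= (size_t)n;
        count++;
    }
    return count;
}
```

The result equals the byte count only when every character in the string
is encoded as a single byte.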
.Pp
The
.Fn mbtowc
function converts a multibyte character
.Fa mbchar
into a wide character and stores the result
in the object pointed to by
.Fa wcharp .
Up to
.Fa nbytes
bytes are examined.
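.Pp
The following sketch decodes the first character of a string.
The name first_wchar and its convention of returning the value as a long
are illustrative choices, not part of the library.

```c
#include <stdlib.h>
#include <string.h>

/*
 * Hypothetical helper: return the value of the first wide character
 * encoded in s, 0 if s is empty, or -1 if the bytes are not a valid
 * multibyte character.
 */
long
first_wchar(const char *s)
{
    wchar_t wc;
    int n;

    (void)mbtowc(NULL, NULL, 0);    /* begin in the initial shift state */
    /* Let the terminating null byte be examined as well. */
    n = mbtowc(&wc, s, strlen(s) + 1);
    if (n < 0)
        return -1;
    if (n == 0)
        return 0;
    return (long)wc;
}
```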
.Pp
The
.Fn wctomb
function converts a wide character
.Fa wchar
into a multibyte character and stores
the result in
.Fa mbchar .
The object pointed to by
.Fa mbchar
must be large enough to accommodate the multibyte character.
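.Pp
One way to satisfy the size requirement is a buffer of MB_LEN_MAX bytes
(from limits.h), which is large enough for any single character in any
locale.
A minimal sketch, with the hypothetical name wchar_mb_len:

```c
#include <limits.h>
#include <stdlib.h>

/*
 * Hypothetical helper: return the number of bytes needed to encode wc
 * in the current locale, or -1 if wc cannot be converted.
 */
int
wchar_mb_len(wchar_t wc)
{
    char buf[MB_LEN_MAX];       /* large enough for any one character */

    (void)wctomb(NULL, 0);      /* begin in the initial shift state */
    return wctomb(buf, wc);
}
```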
.Pp
The
.Fn mbstowcs
function converts a multibyte character string
.Fa mbstring
into a wide character string
.Fa wcstring .
No more than
.Fa nwchars
wide characters are stored.
A terminating null wide character is appended if there is room.
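.Pp
Because the terminator is appended only if there is room, callers usually
size the destination themselves.
The sketch below relies on the fact that a string never holds more
characters than bytes; the name mb_to_wide is hypothetical.

```c
#include <stdlib.h>
#include <string.h>

/*
 * Hypothetical helper: convert mbs to a newly allocated wide character
 * string.  Returns NULL on allocation failure or invalid input.  The
 * caller frees the result.
 */
wchar_t *
mb_to_wide(const char *mbs)
{
    /* One wide character per byte is always enough, plus a terminator. */
    size_t cap = strlen(mbs) + 1;
    wchar_t *ws = malloc(cap * sizeof(*ws));
    size_t n;

    if (ws == NULL)
        return NULL;
    n = mbstowcs(ws, mbs, cap);
    if (n == (size_t)-1) {
        free(ws);
        return NULL;
    }
    ws[n] = 0;                  /* n < cap, so this stays in bounds */
    return ws;
}
```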
.Pp
The
.Fn wcstombs
function converts a wide character string
.Fa wcstring
into a multibyte character string
.Fa mbstring .
Up to
.Fa nbytes
bytes are stored in
.Fa mbstring .
Partial multibyte characters at the end of the string are not stored.
The multibyte character string is null terminated if there is room.
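.Pp
Since the result is null terminated only if there is room, a caller with
a fixed buffer can reserve one byte and terminate the string itself.
A minimal sketch, with the hypothetical name wide_to_mb:

```c
#include <stdlib.h>

/*
 * Hypothetical helper: convert ws into buf, which holds bufsize bytes,
 * and always null terminate the result.  Returns the number of bytes
 * stored, not counting the terminator, or (size_t)-1 on failure.
 */
size_t
wide_to_mb(char *buf, size_t bufsize, const wchar_t *ws)
{
    size_t n;

    if (bufsize == 0)
        return (size_t)-1;
    /* Reserve the last byte for the terminator. */
    n = wcstombs(buf, ws, bufsize - 1);
    if (n == (size_t)-1)
        return n;
    buf[n] = 0;
    return n;
}
```

If the converted string does not fit, the output is truncated at a
character boundary, because partial multibyte characters are not stored.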
.Sh RETURN VALUES
If multibyte characters are not supported in the current locale,
all of these functions will return \-1 if characters cannot be processed,
otherwise 0.
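.Pp
If
.Fa mbchar
is
.Dv NULL ,
the
.Fn mblen ,
.Fn mbtowc
and
.Fn wctomb
functions return nonzero if shift states are supported,
zero otherwise.
If
.Fa mbchar
is valid,
then these functions return
the number of bytes processed in
.Fa mbchar ,
or \-1 if no multibyte character
could be recognized or converted.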
.Pp
The
.Fn mbstowcs
function returns the number of wide characters converted,
not counting any terminating null wide character.
The
.Fn wcstombs
function returns the number of bytes converted,
not counting any terminating null byte.
If any invalid multibyte characters are encountered,
both functions return \-1.
.Sh BUGS
The current implementation does not support shift states.