UFO: Alien Invasion
utf8.h File Reference
#include <stddef.h>


Macros

#define UTF8_CONTINUATION_BYTE(c)   (((c) & 0xc0) == 0x80)
 

Functions

int UTF8_delete_char_at (char *s, int pos)
 Delete a whole (possibly multibyte) character from a string.
 
int UTF8_insert_char_at (char *s, int n, int pos, int codepoint)
 Insert a (possibly multibyte) UTF-8 character into a string.
 
int UTF8_char_len (unsigned char c)
 Length of the UTF-8 character starting with this byte.
 
int UTF8_next (const char **str)
 Get the next UTF-8 character from the given string.
 
int UTF8_encoded_len (int codepoint)
 
size_t UTF8_strlen (const char *str)
 Count the number of characters (not the number of bytes) in a zero-terminated string.
 
int UTF8_char_offset_to_byte_offset (char *str, int pos)
 Convert UTF-8 character offset to a byte offset in the given string.
 
char * UTF8_strncpyz (char *dest, const char *src, size_t limit)
 UTF-8 capable string copy function.
 

Macro Definition Documentation

#define UTF8_CONTINUATION_BYTE (   c)    (((c) & 0xc0) == 0x80)

Is this the second or later byte of a multibyte UTF-8 character?

Definition at line 35 of file utf8.h.

Referenced by Com_sprintf(), R_FontFindFit(), R_FontFindTruncFit(), R_FontMakeChunks(), UTF8_delete_char_at(), UTF8_next(), and UTF8_strncpyz().
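
The macro tests the top two bits of a byte: continuation bytes have the bit pattern 10xxxxxx. The following is a minimal sketch of one way the macro might be used, stepping back from an arbitrary byte offset to the first byte of the character that contains it. The helper name char_start is hypothetical and is not part of utf8.h.

```c
#include <stddef.h>

/* Macro as documented above. */
#define UTF8_CONTINUATION_BYTE(c)   (((c) & 0xc0) == 0x80)

/* Hypothetical helper (not part of utf8.h): step back from byte offset i
 * to the first byte of the character that contains it. */
static size_t char_start(const char *s, size_t i)
{
    while (i > 0 && UTF8_CONTINUATION_BYTE((unsigned char)s[i]))
        i--;
    return i;
}
```

For the four-byte string "a", 0xC3, 0xA4, "b", offsets 1 and 2 both map back to offset 1, the lead byte of the two-byte character.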

Function Documentation

int UTF8_char_len ( unsigned char  c)

Length of the UTF-8 character starting with this byte.

Returns
Length of the character encoding in bytes, or 0 if the byte is not the start of a UTF-8 sequence
Todo:
Using this does not solve the truncation problem for decomposed characters. For example, given a code for "a" followed by a code for "put dots above previous character", the "a" will be reported as a character of length 1 by this function, even though the code that follows is part of its visual appearance and should not be cut off separately. Fortunately, decomposed characters are rarely used.

Definition at line 109 of file utf8.cpp.

Referenced by Com_sprintf(), Con_Print(), R_FontMakeChunks(), UTF8_char_offset_to_byte_offset(), UTF8_strlen(), and UTF8_strncpyz().
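
As an illustration of the return values described above, here is a minimal sketch of how a lead byte can be classified by its high bits. It is an assumption about one possible approach, not the code at utf8.cpp line 109; the name sketch_char_len is hypothetical.

```c
/* Hypothetical stand-in for UTF8_char_len(): classify a byte by its high bits.
 * Returns 0 for continuation bytes and for bytes that cannot start a
 * sequence (0xF8 and above). */
static int sketch_char_len(unsigned char c)
{
    if (c < 0x80)
        return 1;   /* 0xxxxxxx: single-byte (ASCII) character */
    if (c < 0xC0)
        return 0;   /* 10xxxxxx: continuation byte, not a start */
    if (c < 0xE0)
        return 2;   /* 110xxxxx: lead byte of a 2-byte sequence */
    if (c < 0xF0)
        return 3;   /* 1110xxxx: lead byte of a 3-byte sequence */
    if (c < 0xF8)
        return 4;   /* 11110xxx: lead byte of a 4-byte sequence */
    return 0;       /* not a valid UTF-8 lead byte */
}
```

The Todo above can be seen with the decomposed form of "a" with dots: the byte "a" yields 1 and the following combining mark U+0308 (bytes 0xCC 0x88) yields 2, so the two are treated as separate characters.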

int UTF8_char_offset_to_byte_offset ( char *  str,
int  pos 
)

Convert UTF-8 character offset to a byte offset in the given string.

Parameters
[in] str  Start of the string
[in] pos  UTF-8 character offset from the start
Returns
offset of the first byte of the UTF-8 character at that offset
Note
If there aren't enough UTF-8 characters, returns the offset of the NUL terminator.
See also
UTF8_char_len

Definition at line 227 of file utf8.cpp.

References UTF8_char_len().

Referenced by TEST_F(), UTF8_delete_char_at(), and UTF8_insert_char_at().
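
The conversion described above can be sketched as a walk over lead bytes. This is a hedged illustration, not the implementation in utf8.cpp: the names sketch_char_len and sketch_char_offset_to_byte_offset are hypothetical, and counting a stray byte as one character is an assumption.

```c
/* Hypothetical stand-in for UTF8_char_len(). */
static int sketch_char_len(unsigned char c)
{
    if (c < 0x80) return 1;
    if (c < 0xC0) return 0;
    if (c < 0xE0) return 2;
    if (c < 0xF0) return 3;
    if (c < 0xF8) return 4;
    return 0;
}

/* Walk pos characters forward and return the byte offset reached.
 * Stops at the NUL terminator if the string has fewer characters. */
static int sketch_char_offset_to_byte_offset(const char *str, int pos)
{
    int offset = 0;

    while (pos > 0 && str[offset] != '\0') {
        int len = sketch_char_len((unsigned char)str[offset]);
        if (len == 0)
            len = 1;    /* assumption: count a stray byte as one character */
        /* Advance over the character, but never past the terminator,
         * even if the sequence is truncated. */
        while (len > 0 && str[offset] != '\0') {
            offset++;
            len--;
        }
        pos--;
    }
    return offset;
}
```

For the string "a", 0xC3, 0xA4, "b", character offset 2 maps to byte offset 3, and any offset of 3 or more maps to byte offset 4, the terminator.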

int UTF8_delete_char_at ( char *  s,
int  pos 
)

Delete a whole (possibly multibyte) character from a string.

Parameters
[in] s    Start of the string
[in] pos  UTF-8 char offset from the start (not the byte offset)
Returns
Number of bytes deleted

Definition at line 35 of file utf8.cpp.

References UTF8_char_offset_to_byte_offset(), and UTF8_CONTINUATION_BYTE.

Referenced by TEST_F(), and UI_TextEntryNodeEdit().
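
A minimal sketch of deleting one whole character in place follows. It is an illustration under stated assumptions, not the code at utf8.cpp line 35: the function names are hypothetical, and returning 0 when pos is at or past the end of the string is an assumption.

```c
#include <string.h>

#define UTF8_CONTINUATION_BYTE(c)   (((c) & 0xc0) == 0x80)

/* Hypothetical helper: byte offset of the pos-th character, found by
 * counting bytes that are not continuation bytes. */
static int sketch_byte_offset(const char *s, int pos)
{
    int offset = 0;

    while (s[offset] != '\0') {
        if (!UTF8_CONTINUATION_BYTE((unsigned char)s[offset])) {
            if (pos == 0)
                break;
            pos--;
        }
        offset++;
    }
    return offset;
}

/* Remove the character at character offset pos; returns bytes removed. */
static int sketch_delete_char_at(char *s, int pos)
{
    int start = sketch_byte_offset(s, pos);
    int next = start;

    if (s[start] == '\0')
        return 0;   /* assumption: nothing to delete at or past the end */

    next++;         /* skip the lead byte */
    while (UTF8_CONTINUATION_BYTE((unsigned char)s[next]))
        next++;     /* skip its continuation bytes */

    /* Shift the tail, including the terminating NUL, over the gap. */
    memmove(s + start, s + next, strlen(s + next) + 1);
    return next - start;
}
```

Deleting character 1 from "a", 0xC3, 0xA4, "b" removes both bytes of the middle character and leaves "ab".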

int UTF8_encoded_len ( int  c)

Calculate how long a Unicode code point (such as returned by SDL key events in unicode mode) would be in UTF-8 encoding.

Definition at line 188 of file utf8.cpp.

Referenced by IN_TranslateKey(), UI_TextEntryNodeEdit(), and UTF8_insert_char_at().
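
The encoded length follows directly from the code point ranges that UTF-8 defines. The sketch below illustrates that mapping; the name sketch_encoded_len is hypothetical, and returning 0 for values that cannot be encoded (negative values, surrogates, anything above U+10FFFF) is an assumption rather than documented behavior of UTF8_encoded_len().

```c
/* Hypothetical stand-in for UTF8_encoded_len(): number of bytes needed to
 * encode a code point in UTF-8, or 0 if it cannot be encoded (assumption). */
static int sketch_encoded_len(int cp)
{
    if (cp < 0)
        return 0;
    if (cp <= 0x7F)
        return 1;   /* U+0000..U+007F */
    if (cp <= 0x7FF)
        return 2;   /* U+0080..U+07FF */
    if (cp >= 0xD800 && cp <= 0xDFFF)
        return 0;   /* UTF-16 surrogates are not valid code points */
    if (cp <= 0xFFFF)
        return 3;   /* U+0800..U+FFFF */
    if (cp <= 0x10FFFF)
        return 4;   /* U+10000..U+10FFFF */
    return 0;
}
```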

int UTF8_insert_char_at ( char *  s,
int  n,
int  pos,
int  c 
)

Insert a (possibly multibyte) UTF-8 character into a string.

Parameters
[in] s    Start of the string
[in] n    Size in bytes of the buffer holding the string
[in] pos  UTF-8 char offset from the start (not the byte offset)
[in] c    Unicode code point as a 32-bit integer
Returns
Number of bytes added

Definition at line 63 of file utf8.cpp.

References UTF8_char_offset_to_byte_offset(), and UTF8_encoded_len().

Referenced by uiTextEntryNode::draw(), TEST_F(), and UI_TextEntryNodeEdit().
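
The following sketch shows one way an insertion like this can work: encode the code point, check that the result still fits in the buffer, shift the tail, and copy the new bytes in. All names are hypothetical, and the choice to return 0 and leave the string unchanged when there is no room is an assumption; the source does not state what UTF8_insert_char_at() does in that case.

```c
#include <string.h>

#define UTF8_CONTINUATION_BYTE(c)   (((c) & 0xc0) == 0x80)

/* Hypothetical helper: encode cp into out (at least 4 bytes). Returns the
 * number of bytes written, or 0 if cp is outside the Unicode range. */
static int sketch_encode(int cp, unsigned char *out)
{
    if (cp < 0 || cp > 0x10FFFF)
        return 0;
    if (cp <= 0x7F) {
        out[0] = (unsigned char)cp;
        return 1;
    }
    if (cp <= 0x7FF) {
        out[0] = (unsigned char)(0xC0 | (cp >> 6));
        out[1] = (unsigned char)(0x80 | (cp & 0x3F));
        return 2;
    }
    if (cp <= 0xFFFF) {
        out[0] = (unsigned char)(0xE0 | (cp >> 12));
        out[1] = (unsigned char)(0x80 | ((cp >> 6) & 0x3F));
        out[2] = (unsigned char)(0x80 | (cp & 0x3F));
        return 3;
    }
    out[0] = (unsigned char)(0xF0 | (cp >> 18));
    out[1] = (unsigned char)(0x80 | ((cp >> 12) & 0x3F));
    out[2] = (unsigned char)(0x80 | ((cp >> 6) & 0x3F));
    out[3] = (unsigned char)(0x80 | (cp & 0x3F));
    return 4;
}

/* Insert cp before the character at character offset pos in a buffer of
 * n bytes; returns the number of bytes added. */
static int sketch_insert_char_at(char *s, int n, int pos, int cp)
{
    unsigned char enc[4];
    int len = sketch_encode(cp, enc);
    int used = (int)strlen(s);
    int offset = 0;

    /* Assumption: refuse if the result plus its NUL would not fit. */
    if (len == 0 || used + len + 1 > n)
        return 0;

    /* Walk to the first byte of the pos-th character (or the end). */
    while (s[offset] != '\0') {
        if (!UTF8_CONTINUATION_BYTE((unsigned char)s[offset])) {
            if (pos == 0)
                break;
            pos--;
        }
        offset++;
    }

    /* Open a gap of len bytes (moving the NUL too), then fill it. */
    memmove(s + offset + len, s + offset, (size_t)(used - offset) + 1);
    memcpy(s + offset, enc, (size_t)len);
    return len;
}
```

Inserting U+00E4 at character offset 1 of "ab" in an 8-byte buffer adds two bytes, giving "a", 0xC3, 0xA4, "b".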

int UTF8_next ( const char **  str)

Get the next UTF-8 character from the given string.

Parameters
[in] str  The source string to get the UTF-8 character from. The string is not modified, but the pointer is advanced by the length of the UTF-8 character.
Returns
The decoded character (its Unicode code point), or -1 on error

Definition at line 132 of file utf8.cpp.

References i, len, and UTF8_CONTINUATION_BYTE.

Referenced by IN_Frame(), and TEST_F().
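
To illustrate the decode-and-advance behavior described above, here is a minimal decoder sketch. It is not the code at utf8.cpp line 132: the name sketch_next is hypothetical, leaving the pointer unchanged on error is an assumption, and this sketch does not reject overlong encodings or surrogates, which a stricter decoder might.

```c
#define UTF8_CONTINUATION_BYTE(c)   (((c) & 0xc0) == 0x80)

/* Hypothetical stand-in for UTF8_next(): decode one character at *str and
 * advance *str past it. Returns the code point, or -1 at the end of the
 * string or on a malformed sequence (pointer left unchanged, by assumption). */
static int sketch_next(const char **str)
{
    const unsigned char *s = (const unsigned char *)*str;
    int cp, len, i;

    if (s[0] == '\0')
        return -1;                      /* end of string */

    if (s[0] < 0x80) {
        cp = s[0];
        len = 1;
    } else if (s[0] < 0xC0) {
        return -1;                      /* stray continuation byte */
    } else if (s[0] < 0xE0) {
        cp = s[0] & 0x1F;
        len = 2;
    } else if (s[0] < 0xF0) {
        cp = s[0] & 0x0F;
        len = 3;
    } else if (s[0] < 0xF8) {
        cp = s[0] & 0x07;
        len = 4;
    } else {
        return -1;                      /* not a valid lead byte */
    }

    /* Each continuation byte contributes its low six bits. */
    for (i = 1; i < len; i++) {
        if (!UTF8_CONTINUATION_BYTE(s[i]))
            return -1;                  /* truncated sequence (covers NUL) */
        cp = (cp << 6) | (s[i] & 0x3F);
    }

    *str += len;
    return cp;
}
```

Calling it repeatedly on "a", 0xC3, 0xA4, 0xE2, 0x82, 0xAC yields 0x61, 0xE4, 0x20AC, and then -1 at the terminator.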

size_t UTF8_strlen ( const char *  str)

Count the number of characters (not the number of bytes) in a zero-terminated string.

Note
The \0 terminator is not counted.
To count the number of bytes, use strlen.
See also
strlen

Definition at line 207 of file utf8.cpp.

References UTF8_char_len().

Referenced by uiTextEntryNode::draw(), uiTextEntryNode::onFocusGained(), uiTextEntryNode::onKeyPressed(), TEST_F(), and UI_TextEntryNodeEdit().
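
The difference between character count and byte count can be shown with a short sketch. Note that this uses a different technique from the documented function, which references UTF8_char_len(): it simply counts every byte that is not a continuation byte, which gives the same result for well-formed input. The name sketch_strlen is hypothetical.

```c
#include <stddef.h>

#define UTF8_CONTINUATION_BYTE(c)   (((c) & 0xc0) == 0x80)

/* Hypothetical stand-in for UTF8_strlen(): count characters by counting
 * the bytes that start a character (everything except 10xxxxxx). */
static size_t sketch_strlen(const char *str)
{
    size_t count = 0;

    for (; *str != '\0'; str++) {
        if (!UTF8_CONTINUATION_BYTE((unsigned char)*str))
            count++;
    }
    return count;
}
```

For the string "a", 0xC3, 0xA4, "b" this returns 3, while strlen returns 4.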

char* UTF8_strncpyz ( char *  dest,
const char *  src,
size_t  limit 
)

UTF-8 capable string copy function.

Parameters
[out] dest   Pointer to the output string
[in]  src    Pointer to the input string
[in]  limit  Maximum number of bytes to copy
Returns
dest pointer

Definition at line 247 of file utf8.cpp.

References dest, i, length, UTF8_char_len(), and UTF8_CONTINUATION_BYTE.

Referenced by uiTextEntryNode::draw(), Q_strncpyz(), and TEST_F().
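
What makes a copy "UTF-8 capable" is that truncation never splits a multibyte character. The sketch below shows one way to do that. It is an illustration only: the name sketch_strncpyz is hypothetical, and it assumes that limit is the size of dest including the terminating NUL, which the parameter description above does not state explicitly.

```c
#include <stddef.h>
#include <string.h>

#define UTF8_CONTINUATION_BYTE(c)   (((c) & 0xc0) == 0x80)

/* Hypothetical stand-in for UTF8_strncpyz(): copy src into dest, writing at
 * most limit bytes including the NUL (assumption), and never cutting a
 * multibyte character in half. */
static char *sketch_strncpyz(char *dest, const char *src, size_t limit)
{
    size_t len;

    if (limit == 0)
        return dest;

    len = strlen(src);
    if (len >= limit) {
        len = limit - 1;
        /* src[len] is the first byte that gets dropped. If it is a
         * continuation byte, the cut falls inside a character, so back up
         * to that character's lead byte and drop the whole character. */
        while (len > 0 && UTF8_CONTINUATION_BYTE((unsigned char)src[len]))
            len--;
    }

    memcpy(dest, src, len);
    dest[len] = '\0';
    return dest;
}
```

Copying "a", 0xC3, 0xA4, "b" with a limit of 3 yields just "a": the two-byte character would not fit completely alongside the terminator, so it is omitted rather than split.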