#include <stddef.h>

Macros
#define	UTF8_CONTINUATION_BYTE(c) (((c) & 0xc0) == 0x80)

Functions
int	UTF8_delete_char_at (char *s, int pos)
	Delete a whole (possibly multibyte) character from a string.

int	UTF8_insert_char_at (char *s, int n, int pos, int codepoint)
	Insert a (possibly multibyte) UTF-8 character into a string.

int	UTF8_char_len (unsigned char c)
	Length of the UTF-8 character starting with this byte.

int	UTF8_next (const char **str)
	Get the next UTF-8 character from the given string.

int	UTF8_encoded_len (int codepoint)

size_t	UTF8_strlen (const char *str)
	Count the number of characters (not the number of bytes) in a zero-terminated string.

int	UTF8_char_offset_to_byte_offset (char *str, int pos)
	Convert UTF-8 character offset to a byte offset in the given string.

char *	UTF8_strncpyz (char *dest, const char *src, size_t limit)
	UTF8 capable string copy function.

Macro Definition Documentation

#define UTF8_CONTINUATION_BYTE ( c ) (((c) & 0xc0) == 0x80)

Is this the second or later byte of a multibyte UTF-8 character?

Definition at line 35 of file utf8.h.

Referenced by Com_sprintf(), R_FontFindFit(), R_FontFindTruncFit(), R_FontMakeChunks(), UTF8_delete_char_at(), UTF8_next(), and UTF8_strncpyz().
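
One way such a test is typically used is to step back from an arbitrary byte offset to the first byte of the character that contains it. The following is a minimal sketch under that assumption; the macro is copied from the definition above, while `sketch_char_start` is a hypothetical helper that is not part of utf8.h.

```c
#include <assert.h>
#include <stddef.h>

/* Copied from the macro documented above. */
#define UTF8_CONTINUATION_BYTE(c) (((c) & 0xc0) == 0x80)

/* Hypothetical helper (not part of utf8.h): move a byte offset back to the
 * lead byte of the character that contains it. */
static size_t sketch_char_start(const char *s, size_t byte_pos)
{
    assert(s != NULL);
    while (byte_pos > 0 && UTF8_CONTINUATION_BYTE((unsigned char)s[byte_pos]))
        byte_pos--;
    return byte_pos;
}
```

For the bytes `'a', 0xc3, 0xa4, 'b'`, offset 2 lies inside the two-byte character and is moved back to offset 1; offsets that already point at a lead byte are returned unchanged.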

Function Documentation

int UTF8_char_len ( unsigned char c )

Length of the UTF-8 character starting with this byte.

Returns: length of the character encoding in bytes, or 0 if the byte is not the start of a UTF-8 sequence

Todo: Using this does not solve the truncation problem in the case of decomposed characters. For example, given a code for "a" followed by a code for "put dots above previous character", the "a" will be reported as a character of length 1 by this function, even though the code that follows is part of its visual appearance and should not be cut off separately. Fortunately, decomposed characters are rarely used.

Definition at line 109 of file utf8.cpp.

Referenced by Com_sprintf(), Con_Print(), R_FontMakeChunks(), UTF8_char_offset_to_byte_offset(), UTF8_strlen(), and UTF8_strncpyz().
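
The mapping from a lead byte to a sequence length can be sketched as below. This follows the standard UTF-8 lead-byte layout and the documented return values; it is not taken from utf8.cpp, and the names `sketch_char_len` and `sketch_char_len_demo` are hypothetical.

```c
#include <assert.h>

/* Hypothetical sketch: length in bytes of the sequence that starts with
 * this byte, or 0 if the byte cannot start a sequence. */
static int sketch_char_len(unsigned char c)
{
    if (c < 0x80)
        return 1;   /* ASCII */
    if (c < 0xc0)
        return 0;   /* continuation byte, not a start */
    if (c < 0xe0)
        return 2;
    if (c < 0xf0)
        return 3;
    if (c < 0xf8)
        return 4;
    return 0;       /* 5- and 6-byte forms are not valid UTF-8 */
}

/* Illustrates the Todo above: "a" followed by U+0308 (combining diaeresis,
 * bytes 0xcc 0x88) looks like two independent characters at this level. */
void sketch_char_len_demo(void)
{
    const unsigned char decomposed[] = { 'a', 0xcc, 0x88, 0 };
    assert(sketch_char_len(decomposed[0]) == 1);
    assert(sketch_char_len(decomposed[1]) == 2);
    assert(sketch_char_len(decomposed[2]) == 0);
}
```

A caller that truncates after the first reported character would keep the "a" and drop the combining mark, which is the problem the Todo describes.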

int UTF8_char_offset_to_byte_offset	(	char *	str,
		int	pos
	)

Convert UTF-8 character offset to a byte offset in the given string.

Parameters

[in]	str	Start of the string
[in]	pos	UTF-8 character offset from the start

Returns: offset of the first byte of the UTF-8 character at that offset

Note: If there aren't enough UTF-8 characters, returns the offset of the NUL terminator.

See also: UTF8_char_len

Definition at line 227 of file utf8.cpp.

References UTF8_char_len().

Referenced by TEST_F(), UTF8_delete_char_at(), and UTF8_insert_char_at().
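
The conversion described above can be sketched as a loop that skips one character length per step. This is an illustration only: `sketch_char_len` and `sketch_char_to_byte_offset` are hypothetical names, and treating an invalid lead byte as a one-byte character (and never stepping past the terminator) is an assumption of this sketch, not documented behaviour.

```c
#include <assert.h>
#include <stddef.h>

/* Hypothetical stand-in for UTF8_char_len(). */
static int sketch_char_len(unsigned char c)
{
    if (c < 0x80) return 1;
    if (c < 0xc0) return 0;
    if (c < 0xe0) return 2;
    if (c < 0xf0) return 3;
    if (c < 0xf8) return 4;
    return 0;
}

/* Hypothetical sketch: byte offset of the character at character offset
 * pos, or the offset of the terminator if the string is too short. */
static int sketch_char_to_byte_offset(const char *str, int pos)
{
    int offset = 0;

    assert(str != NULL);
    while (pos > 0 && str[offset] != '\0') {
        int len = sketch_char_len((unsigned char)str[offset]);
        if (len < 1)
            len = 1;    /* assumption: skip a stray byte as one character */
        /* Advance byte by byte so a truncated sequence cannot skip the
         * terminator. */
        while (len-- > 0 && str[offset] != '\0')
            offset++;
        pos--;
    }
    return offset;
}
```

With the bytes `'h', 0xc3, 0xa9, 'l', 'l', 'o'`, character offset 2 maps to byte offset 3, and any character offset of 5 or more maps to byte offset 6, the terminator.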

int UTF8_delete_char_at	(	char *	s,
		int	pos
	)

Delete a whole (possibly multibyte) character from a string.

Parameters

[in]	s	Start of the string
[in]	pos	UTF-8 char offset from the start (not the byte offset)

Returns: Number of bytes deleted

Definition at line 35 of file utf8.cpp.

References UTF8_char_offset_to_byte_offset(), and UTF8_CONTINUATION_BYTE.

Referenced by TEST_F(), and UI_TextEntryNodeEdit().
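
A deletion of this kind can be sketched as: find the byte range of the character, then move the tail of the string over it. The code below is a hedged illustration with hypothetical names (`sketch_offset`, `sketch_delete_char_at`); returning 0 when `pos` is past the end of the string is an assumption of the sketch.

```c
#include <assert.h>
#include <string.h>

/* Copied from the macro documented above. */
#define UTF8_CONTINUATION_BYTE(c) (((c) & 0xc0) == 0x80)

/* Hypothetical helper: character offset to byte offset, implemented by
 * skipping continuation bytes. */
static int sketch_offset(const char *s, int pos)
{
    int off = 0;
    while (pos > 0 && s[off] != '\0') {
        off++;
        while (UTF8_CONTINUATION_BYTE((unsigned char)s[off]))
            off++;
        pos--;
    }
    return off;
}

/* Hypothetical sketch: remove the character at character offset pos and
 * return the number of bytes removed. */
static int sketch_delete_char_at(char *s, int pos)
{
    int start, next;

    assert(s != NULL);
    start = sketch_offset(s, pos);
    next = start;
    if (s[next] != '\0')
        next++;                         /* lead byte */
    while (UTF8_CONTINUATION_BYTE((unsigned char)s[next]))
        next++;                         /* trailing bytes */
    /* memmove is used because source and destination overlap. */
    memmove(s + start, s + next, strlen(s + next) + 1);
    return next - start;
}
```

Deleting character 1 from the bytes `'a', 0xc3, 0xa4, 'b'` removes two bytes and leaves "ab".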

int UTF8_encoded_len ( int c )

Calculate how long a Unicode code point (such as returned by SDL key events in unicode mode) would be in UTF-8 encoding.

Definition at line 188 of file utf8.cpp.

Referenced by IN_TranslateKey(), UI_TextEntryNodeEdit(), and UTF8_insert_char_at().
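
The calculation follows from the standard UTF-8 code point ranges. A minimal sketch, with hypothetical names and with the return value 0 for out-of-range input chosen as an assumption (the documentation above does not state what is returned in that case):

```c
#include <assert.h>

/* Hypothetical sketch: number of bytes needed to encode a code point in
 * UTF-8, or 0 if the value is outside the Unicode range (assumption). */
static int sketch_encoded_len(int c)
{
    if (c < 0)
        return 0;
    if (c <= 0x7f)
        return 1;
    if (c <= 0x7ff)
        return 2;
    if (c <= 0xffff)
        return 3;
    if (c <= 0x10ffff)
        return 4;
    return 0;
}

/* Usage example with one code point from each range. */
void sketch_encoded_len_demo(void)
{
    assert(sketch_encoded_len('A') == 1);      /* U+0041 */
    assert(sketch_encoded_len(0xe4) == 2);     /* U+00E4 */
    assert(sketch_encoded_len(0x20ac) == 3);   /* U+20AC */
    assert(sketch_encoded_len(0x1f600) == 4);  /* U+1F600 */
    assert(sketch_encoded_len(0x110000) == 0); /* beyond Unicode */
}
```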

int UTF8_insert_char_at	(	char *	s,
		int	n,
		int	pos,
		int	c
	)

Insert a (possibly multibyte) UTF-8 character into a string.

Parameters

[in]	s	Start of the string
[in]	n	Buffer size of the string
[in]	pos	UTF-8 char offset from the start (not the byte offset)
[in]	c	Unicode code as 32-bit integer

Returns: Number of bytes added

Definition at line 63 of file utf8.cpp.

References UTF8_char_offset_to_byte_offset(), and UTF8_encoded_len().

Referenced by uiTextEntryNode::draw(), TEST_F(), and UI_TextEntryNodeEdit().
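
An insertion with these parameters can be sketched as: compute the encoded length, check that the result still fits in the buffer, move the tail, then write the encoded bytes. Everything named `sketch_*` below is hypothetical, and returning 0 without modifying the string when the buffer is too small is an assumption of the sketch.

```c
#include <assert.h>
#include <string.h>

#define UTF8_CONTINUATION_BYTE(c) (((c) & 0xc0) == 0x80)

/* Hypothetical stand-in for UTF8_encoded_len(). */
static int sketch_encoded_len(int c)
{
    if (c < 0) return 0;
    if (c <= 0x7f) return 1;
    if (c <= 0x7ff) return 2;
    if (c <= 0xffff) return 3;
    if (c <= 0x10ffff) return 4;
    return 0;
}

/* Hypothetical stand-in for UTF8_char_offset_to_byte_offset(). */
static int sketch_offset(const char *s, int pos)
{
    int off = 0;
    while (pos > 0 && s[off] != '\0') {
        off++;
        while (UTF8_CONTINUATION_BYTE((unsigned char)s[off]))
            off++;
        pos--;
    }
    return off;
}

/* Hypothetical sketch: insert code point c at character offset pos into
 * the buffer s of size n; returns the number of bytes added. */
static int sketch_insert_char_at(char *s, int n, int pos, int c)
{
    static const unsigned char lead_mark[] = { 0, 0, 0xc0, 0xe0, 0xf0 };
    int len, start, tail, i;

    assert(s != NULL);
    len = sketch_encoded_len(c);
    start = sketch_offset(s, pos);
    tail = (int)strlen(s + start) + 1;      /* includes the terminator */

    if (len == 0 || start + tail + len > n)
        return 0;                           /* assumption: refuse, no change */

    memmove(s + start + len, s + start, (size_t)tail);
    if (len == 1) {
        s[start] = (char)c;
    } else {
        /* Lead byte carries the top payload bits, each following byte
         * carries six more. */
        s[start] = (char)(lead_mark[len] | (c >> (6 * (len - 1))));
        for (i = 1; i < len; i++)
            s[start + i] = (char)(0x80 | ((c >> (6 * (len - 1 - i))) & 0x3f));
    }
    return len;
}
```

Inserting U+00E4 at character offset 1 of "ab" in an 8-byte buffer adds the two bytes 0xc3 0xa4 between "a" and "b".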

int UTF8_next ( const char ** str )

Get the next UTF-8 character from the given string.

Parameters

[in] str The source string to get the UTF-8 character from. The string is not touched, but the pointer is advanced by the length of the UTF-8 character.

Returns: The UTF-8 character, or -1 on error

Definition at line 132 of file utf8.cpp.

References i, len, and UTF8_CONTINUATION_BYTE.

Referenced by IN_Frame(), and TEST_F().
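
A decoder with this interface can be sketched as follows. The sketch assumes that the returned value is the decoded Unicode code point, that the end of the string counts as an error, and that the pointer is left unchanged on error; none of these details are stated above. `sketch_next` is a hypothetical name.

```c
#include <assert.h>
#include <stddef.h>

#define UTF8_CONTINUATION_BYTE(c) (((c) & 0xc0) == 0x80)

/* Hypothetical sketch: decode one character at *str, advance *str past
 * it, and return its code point, or -1 on error. */
static int sketch_next(const char **str)
{
    const unsigned char *s;
    int len, min, cp, i;

    assert(str != NULL && *str != NULL);
    s = (const unsigned char *)*str;

    if (s[0] == '\0')
        return -1;                      /* assumption: end of string is an error */
    if (s[0] < 0x80) {
        len = 1; min = 0; cp = s[0];
    } else if (s[0] < 0xc0) {
        return -1;                      /* stray continuation byte */
    } else if (s[0] < 0xe0) {
        len = 2; min = 0x80; cp = s[0] & 0x1f;
    } else if (s[0] < 0xf0) {
        len = 3; min = 0x800; cp = s[0] & 0x0f;
    } else if (s[0] < 0xf8) {
        len = 4; min = 0x10000; cp = s[0] & 0x07;
    } else {
        return -1;
    }

    for (i = 1; i < len; i++) {
        if (!UTF8_CONTINUATION_BYTE(s[i]))
            return -1;                  /* truncated or malformed sequence */
        cp = (cp << 6) | (s[i] & 0x3f);
    }

    /* Reject overlong forms, surrogates, and values beyond Unicode. */
    if (cp < min || (cp >= 0xd800 && cp <= 0xdfff) || cp > 0x10ffff)
        return -1;

    *str += len;
    return cp;
}
```

Calling it repeatedly on the bytes `'a', 0xe2, 0x82, 0xac` yields 0x61, then 0x20ac, then -1 once the terminator is reached.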

size_t UTF8_strlen ( const char * str )

Count the number of characters (not the number of bytes) in a zero-terminated string.

Note: The terminating \0 character is not counted. To count the number of bytes, use strlen.

See also: strlen

Definition at line 207 of file utf8.cpp.

References UTF8_char_len().

Referenced by uiTextEntryNode::draw(), uiTextEntryNode::onFocusGained(), uiTextEntryNode::onKeyPressed(), TEST_F(), and UI_TextEntryNodeEdit().
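
The difference between the character count and the byte count can be shown with a short sketch. Note that this sketch counts every byte that is not a continuation byte, which is a different technique from calling UTF8_char_len() as the documented function does; on well-formed input both give the same count. `sketch_utf8_strlen` is a hypothetical name.

```c
#include <assert.h>
#include <stddef.h>
#include <string.h>

#define UTF8_CONTINUATION_BYTE(c) (((c) & 0xc0) == 0x80)

/* Hypothetical sketch: count characters by counting the bytes that start
 * a character, i.e. all bytes that are not continuation bytes. */
static size_t sketch_utf8_strlen(const char *str)
{
    size_t count = 0;

    assert(str != NULL);
    for (; *str != '\0'; str++) {
        if (!UTF8_CONTINUATION_BYTE((unsigned char)*str))
            count++;
    }
    return count;
}
```

For the bytes `'h', 0xc3, 0xa9, 'l', 'l', 'o'` the sketch reports 5 characters, while strlen reports 6 bytes.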

char* UTF8_strncpyz	(	char *	dest,
		const char *	src,
		size_t	limit
	)

UTF8 capable string copy function.

Parameters

[out]	dest	Pointer to the output string
[in]	src	Pointer to the input string
[in]	limit	Maximum number of bytes to copy

Returns: dest pointer

Definition at line 247 of file utf8.cpp.

References dest, i, length, UTF8_char_len(), and UTF8_CONTINUATION_BYTE.

Referenced by uiTextEntryNode::draw(), Q_strncpyz(), and TEST_F().
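
A copy that never cuts a multibyte character in half can be sketched as below. The sketch assumes that `limit` is the size of the destination buffer including the terminating zero byte, and that a `limit` of 0 leaves `dest` untouched; both are assumptions rather than documented behaviour. `sketch_strncpyz` is a hypothetical name.

```c
#include <assert.h>
#include <stddef.h>
#include <string.h>

#define UTF8_CONTINUATION_BYTE(c) (((c) & 0xc0) == 0x80)

/* Hypothetical sketch: copy at most limit - 1 bytes of src into dest and
 * terminate it, backing up so that no character is split. */
static char *sketch_strncpyz(char *dest, const char *src, size_t limit)
{
    size_t length;

    assert(dest != NULL && src != NULL);
    if (limit == 0)
        return dest;                    /* assumption: nothing can be written */

    length = strlen(src);
    if (length > limit - 1) {
        length = limit - 1;
        /* If the first dropped byte is a continuation byte, the cut is
         * inside a character: back up to that character's lead byte. */
        while (length > 0 && UTF8_CONTINUATION_BYTE((unsigned char)src[length]))
            length--;
    }
    memcpy(dest, src, length);
    dest[length] = '\0';
    return dest;
}
```

Copying the bytes `'a', 'b', 0xc3, 0xa4` with a limit of 4 yields "ab" rather than "ab" followed by a dangling 0xc3.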
