Author Topic: wrap chinese text  (Read 4327 times)

Offline sunsolzn

  • Rookie
  • ***
  • Posts: 18
    • View Profile
wrap chinese text
« on: November 24, 2008, 09:02:23 am »
I first try to find the maximum length that fits on one line, then search backward for a '-' or ' '.
Code: [Select]
static int R_FontFindFit (const font_t *f, char *text, int maxlen, int maxWidth, int *widthp)
{
    int bestbreak = 0;
    int width;
    int len;

    *widthp = 0;
    /* If the whole chunk fits, no break is needed. */
    width = R_FontChunkLength(f, text, maxlen);
    if (width <= maxWidth) {
        *widthp = width;
        return maxlen - 1;
    }

    /* Break first word anywhere: start from a guessed length and grow
     * until the text no longer fits. */
    for (len = max(6, maxlen * maxWidth / width) - 5; len < maxlen; len++) {
        /* Never break in the middle of a UTF-8 character. */
        if (UTF8_CONTINUATION_BYTE(text[len]))
            continue;
        width = R_FontChunkLength(f, text, len);
        if (width > maxWidth)
            break;
        bestbreak = len;
        *widthp = width;
    }

    /** @todo Smart breaking of Chinese text */
    /* Fit whole words or hyphenated word parts: search backward through
     * the last 1/6 of the line for a space or hyphen. */
    for (len = bestbreak; len > (bestbreak - bestbreak / 6); len--) {
        if (text[len] == ' ' || text[len] == '-') {
            width = R_FontChunkLength(f, text, len);
            bestbreak = len;
            *widthp = width;
            break;
        }
    }

    return bestbreak;
}

In the UFOpaedia this works, but the 3D Geoscape can't display base names or nation names.

[attachment deleted by admin]

Amtep

  • Guest
Re: wrap chinese text
« Reply #1 on: November 24, 2008, 10:30:46 pm »
Hmm, I don't really see how this is supposed to work. Remember that the game engine doesn't know when it's dealing with Chinese text, so it has to work on any text.

It might be that the best thing to do is to drop the assumption that a hyphen is a good place to break a line. I guess UFO:AI has a lot of "TR-20" and not very much "forward-reverse shocks".  Getting rid of that loop would fix the weird line breaks that are visible on the example jpg.

Then the next improvement would be to actually implement the @todo. There could be a loop that detects the Chinese comma and period and breaks after those. Are there others? What are the Unicode codes for them?
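As a rough sketch of what such a loop could look like: the code below looks for three common Chinese punctuation marks, U+3001 (ideographic comma), U+3002 (ideographic full stop) and U+FF0C (fullwidth comma), by their UTF-8 byte sequences. The list is certainly not complete, and the names cjk_punct_len and break_after_cjk_punct are made up for this example; they are not existing engine functions. It assumes limit is no larger than the length of the string.

```c
#include <stddef.h>
#include <string.h>

/* Hypothetical helper: returns 3 if the bytes at s encode one of the
 * listed punctuation marks (all are 3 bytes in UTF-8), otherwise 0. */
static size_t cjk_punct_len(const char *s)
{
    static const char *const marks[] = {
        "\xE3\x80\x81", /* U+3001 ideographic comma */
        "\xE3\x80\x82", /* U+3002 ideographic full stop */
        "\xEF\xBC\x8C"  /* U+FF0C fullwidth comma */
    };
    size_t i;

    for (i = 0; i < sizeof(marks) / sizeof(marks[0]); i++) {
        /* strncmp stops at the terminating NUL, so this never reads past the string. */
        if (strncmp(s, marks[i], 3) == 0)
            return 3;
    }
    return 0;
}

/* Hypothetical helper: returns the byte offset just after the last listed
 * mark that ends at or before limit, or 0 if there is none. */
static size_t break_after_cjk_punct(const char *text, size_t limit)
{
    size_t i, best = 0;

    for (i = 0; text[i] != '\0' && i + 3 <= limit; i++) {
        if (cjk_punct_len(text + i) != 0)
            best = i + 3;
    }
    return best;
}
```

Scanning byte by byte is safe here because the lead bytes 0xE3 and 0xEF can only appear at the start of a UTF-8 character, so a match is never in the middle of another character.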

Offline sunsolzn

  • Rookie
  • ***
  • Posts: 18
    • View Profile
Re: wrap chinese text
« Reply #2 on: November 25, 2008, 06:08:58 am »
My English is very poor.
Code: [Select]
#-------width-------##-------width-------##-------width-------#
xxxx xxxxx xxxxx xxxx xxxxx xxxxx xxxx xxxxx xxxxx xxxx
^-------------------------------------------------->
The original procedure scans from front to back.
I think we could first guess the width:
#-------width-------##-------width-------##-------width-------#
xxxx xxxxx xxxxx xxxx xxxxx xxxxx xxxx xxxxx xxxxx xxxx
                    ^
                    | this is the guess
           ^<-------| move back some number of characters
           +---> now search for the break
I hope you can understand what I mean.

Offline bayo

  • Professional loser
  • Project Coder
  • Captain
  • ***
  • Posts: 733
    • View Profile
Re: wrap chinese text
« Reply #3 on: November 25, 2008, 03:03:16 pm »
Maybe sunsolzn wants to break words at ' ' and '-':

Code: [Select]
xxxxxxxxx xxxxxxxxxx xxxx xxxxxxx xxxxx xxxxxx
xxxxxxxxx xxxxxxxxxx xxxx xxxxxxxxx xxxxxxxxxx
xxxx xxxxxxxx xxxxxxxxxxx xxxxx xxxx xxxxx-
xxxxxx xxx xxxxxxxxxx xxxxxxxxxx xxxxxxx

instead of

Code: [Select]
xxxxxxxxx xxxxxxxxxx xxxx xxxxxxx xxxxx xxxxxx
xxxxxxxxx xxxxxxxxxx xxxx xxxxxxxxx xxxxxxxxxx
xxxx xxxxxxxx xxxxxxxxxxx xxxxx xxxx
xxxxx-xxxxxx xxx xxxxxxxxxx xxxxxxxxxx xxxxxxx

I checked: Unicode provides several hyphens, at least:
* U+00AD: I don't know whether it is breakable
* U+2010: breakable
* U+2011: non-breakable

I haven't checked the code, but maybe the loop can break words at spaces and at least at U+2010 (if we don't use U+00AD, it's not really a problem)... For special typography we only need to put the right character, U+2010 or U+2011, into the text. Is there anything here that is a special case for Chinese?

Amtep

  • Guest
Re: wrap chinese text
« Reply #4 on: November 26, 2008, 01:37:08 pm »
Guessing the length will not work well with English text. For example, "ammo" is 40 pixels wide and "it's" is only 16 pixels wide, but both are 4 bytes.
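To put numbers on that, here is a small sketch. The per-character widths are hypothetical, picked only so that the two words add up to the 40 and 16 pixels above; they are not taken from the real font. glyph_width, text_width and guess_fit are names invented for this example.

```c
#include <string.h>

/* Hypothetical per-character widths in pixels (not from a real font). */
static int glyph_width(char c)
{
    switch (c) {
    case 'm': return 12;
    case 'a': case 'o': return 8;
    case 't': case 's': return 5;
    case 'i': return 4;
    case '\'': return 2;
    default: return 8;
    }
}

/* Sum of the glyph widths of a plain ASCII string. */
static int text_width(const char *s)
{
    int w = 0;

    for (; *s != '\0'; s++)
        w += glyph_width(*s);
    return w;
}

/* Proportional guess: how many bytes "should" fit into maxWidth pixels
 * if all text had the same bytes-per-pixel ratio as the sample. */
static int guess_fit(const char *sample, int maxWidth)
{
    return (int)strlen(sample) * maxWidth / text_width(sample);
}
```

With these made-up widths, guess_fit("ammo", 80) gives 8 bytes and guess_fit("it's", 80) gives 20 bytes for the same 80-pixel line, so a byte-based guess can be far off in either direction and the real width still has to be measured.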

Don't worry about speed in R_FontFindFit. Text is wrapped only one time.

bayo, thanks for looking up the hyphens.
I think it would be good to break only at U+2010 hyphens, and change the translations to use U+2010 where possible.
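A minimal sketch of that rule, assuming UTF-8 text where U+2010 is encoded as the bytes E2 80 90: search backward from the longest length that fits, break before a space or after a U+2010, and ignore the ASCII '-' so that "TR-20" stays together. find_break is a made-up name for this example, not an engine function, and limit is assumed to be no larger than the length of the string.

```c
#include <stddef.h>

/* Hypothetical helper: returns how many bytes to keep on the current line,
 * searching backward from limit. Returns 0 if no break point is found. */
static size_t find_break(const char *text, size_t limit)
{
    size_t i;

    for (i = limit; i > 0; i--) {
        /* Space: the line ends just before it. */
        if (text[i - 1] == ' ')
            return i - 1;
        /* U+2010 HYPHEN (E2 80 90): keep the hyphen on this line. */
        if (i >= 3
         && (unsigned char)text[i - 3] == 0xE2
         && (unsigned char)text[i - 2] == 0x80
         && (unsigned char)text[i - 1] == 0x90)
            return i;
    }
    return 0;
}
```

For the text "TR-20 forward‐reverse" (with a U+2010 between the last two words) and a limit of 20 bytes, this keeps 16 bytes, ending the line right after the hyphen. With a limit of 4 it finds no break, because the '-' in "TR-20" is a plain ASCII hyphen.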